Sparse Representation Learning of Data by Autoencoders with L_1/2 Regularization

Feng Li, Jacek M. Zurada, Wei Wu


Autoencoder networks have been demonstrated to be efficient for unsupervised learning of representations of images, documents, and time series. Sparse representation can improve the interpretability of the input data and the generalization of a model by eliminating redundant features and extracting the latent structure of the data. In this paper, we use the L_1/2 regularization method to enforce sparsity on the hidden representation of an autoencoder, thereby achieving a sparse representation of the data. The performance of our approach in terms of unsupervised feature learning and supervised classification is assessed on the MNIST digit data set, the ORL face database, and the Reuters-21578 text corpus. The results demonstrate that the proposed autoencoder produces sparser representations and better reconstruction performance than the Sparse Autoencoder and the L_1-regularized Autoencoder. The new representation is also shown to be useful for improving the classification performance of a deep network.
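The idea described above can be sketched as a reconstruction loss plus an L_1/2 penalty on the hidden activations. The following is a minimal illustrative sketch, not the authors' implementation: it assumes a one-hidden-layer sigmoid autoencoder, a small smoothing constant `eps` to keep the penalty differentiable at zero (in the spirit of smoothed L_1/2 regularization), and illustrative names throughout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l_half_penalty(h, eps=1e-4):
    # L_1/2 penalty: sum of |h_j|^(1/2) over the hidden activations.
    # The small eps smooths the non-differentiability at h_j = 0.
    return np.sum(np.sqrt(np.abs(h) + eps))

def autoencoder_loss(x, W1, b1, W2, b2, lam=0.1):
    # Forward pass of a one-hidden-layer autoencoder.
    h = sigmoid(x @ W1 + b1)        # hidden (sparse) representation
    x_hat = sigmoid(h @ W2 + b2)    # reconstruction of the input
    recon = 0.5 * np.sum((x_hat - x) ** 2)
    # Total objective: reconstruction error + lam * L_1/2 sparsity term.
    return recon + lam * l_half_penalty(h), h
```

Minimizing this objective (e.g. by gradient descent on the weights) drives many hidden activations toward zero, which is what yields the sparser codes compared with an L_1 penalty.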


Autoencoder; Sparse representation; Unsupervised feature learning; Deep network; L_1/2 regularization


