Outlier detection in self-organizing maps and their quality estimation

Pavel Stefanovič, Olga Kurasova

Abstract


This paper proposes an algorithm for detecting and rejecting outliers in a self-organizing map (SOM). A SOM is used for both data clustering and dimensionality reduction, and its results are presented in a special graphical form. To detect outliers in the SOM, a genetic-algorithm-based Travelling Salesman approach is applied. After the outliers are detected and removed, the quality of the SOM has to be estimated. For this purpose, a measure is proposed that evaluates how well the data classes coincide with the clusters obtained in the SOM. A larger value of the measure means that the centers of different classes lie farther apart in the SOM and that the clusters corresponding to the data classes are better separated. The proposed algorithm is illustrated on two datasets, one numerical and one textual.
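The abstract does not give the formula for the quality measure, so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes that each data item has already been mapped to the grid coordinates of its winning SOM node, and it takes the mean pairwise Euclidean distance between class centers on the grid as a stand-in score. The function names, the toy coordinates, and the choice of a mean are all hypothetical.

```python
from itertools import combinations
from math import dist


def class_centers(bmu_coords, labels):
    """Average the SOM grid coordinates of the winning nodes for each class."""
    sums = {}
    for (row, col), label in zip(bmu_coords, labels):
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += row
        total[1] += col
        total[2] += 1
    return {label: (r / n, c / n) for label, (r, c, n) in sums.items()}


def mean_center_distance(bmu_coords, labels):
    """Mean pairwise Euclidean distance between class centers on the grid.

    Assumption: this is an illustrative stand-in, not the measure
    defined in the paper.
    """
    centers = list(class_centers(bmu_coords, labels).values())
    pairs = list(combinations(centers, 2))
    if not pairs:
        raise ValueError("need at least two classes")
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: grid positions of winning nodes and class labels.
separated = [(0, 0), (0, 1), (4, 4), (4, 3)]  # classes occupy opposite corners
mixed = [(0, 0), (4, 4), (0, 1), (4, 3)]      # classes are interleaved
labels = ["a", "a", "b", "b"]

score_separated = mean_center_distance(separated, labels)
score_mixed = mean_center_distance(mixed, labels)
```

Under these assumptions the well-separated layout scores higher than the interleaved one, which matches the abstract's statement that a larger value indicates better-separated classes.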

Keywords


outlier detection and rejection; self-organizing map; SOM quality estimation


DOI: http://dx.doi.org/10.14311/NNW.2018.%25x
