A Survey on Mixed-Attribute Outlier Detection Methods

Authors

  • Nur Rokhman Universitas Gadjah Mada

DOI:

https://doi.org/10.21512/commit.v13i1.5558

Keywords:

Outlier Detection, Categorical Data, Numerical Data, Mixed-Attribute Data

Abstract

In the data era, outlier detection methods play an important role. Outliers can provide clues to new discoveries, irregularities in a system, or illegal intruders. Based on the type of data they handle, outlier detection methods can be classified as methods for numerical, categorical, or mixed-attribute data. However, studies of outlier detection methods have generally focused on numerical data, whereas many real-life facts are recorded as mixed-attribute data. This paper presents a survey of outlier detection methods for mixed-attribute data. The methods are classified into four types: categorized, enumerated, combined, and mixed outlier detection methods. This classification makes the methods easier to analyze and to improve by applying appropriate functions.
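
The abstract does not define the four types, so the following is only an illustration of what scoring mixed-attribute records can involve, not the method of any surveyed paper. It assumes one simple strategy: discretize each numerical attribute into equal-width bins so that every attribute becomes categorical, then score each record by how frequent its attribute values are. The function names, the bin count, and the toy records are hypothetical.

```python
from collections import Counter


def equal_width_bin(value, lo, hi, n_bins):
    """Map a numeric value to a bin index in [0, n_bins - 1]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # the maximum value falls in the last bin


def frequency_scores(records, numeric_keys, n_bins=3):
    """Score each record by the mean relative frequency of its values.

    Numeric attributes are discretized first, so all attributes are
    treated as categorical. Lower scores mean rarer values, i.e. more
    outlier-like records.
    """
    # Value range of each numeric attribute, used for binning.
    ranges = {
        k: (min(r[k] for r in records), max(r[k] for r in records))
        for k in numeric_keys
    }

    # Convert every record to purely categorical form.
    categorized = []
    for r in records:
        row = {}
        for k, v in r.items():
            if k in ranges:
                lo, hi = ranges[k]
                row[k] = equal_width_bin(v, lo, hi, n_bins)
            else:
                row[k] = v
        categorized.append(row)

    # Count how often each value occurs within each attribute.
    counts = {k: Counter(row[k] for row in categorized) for k in categorized[0]}
    n = len(categorized)
    return [
        sum(counts[k][row[k]] / n for k in row) / len(row)
        for row in categorized
    ]


# Toy data (hypothetical): one categorical and one numerical attribute.
records = [
    {"protocol": "tcp", "bytes": 120},
    {"protocol": "tcp", "bytes": 130},
    {"protocol": "tcp", "bytes": 110},
    {"protocol": "tcp", "bytes": 125},
    {"protocol": "udp", "bytes": 9000},
]
scores = frequency_scores(records, numeric_keys=["bytes"])
most_outlying = min(range(len(records)), key=scores.__getitem__)
```

In this toy data the last record has both a rare protocol and a byte count in a sparsely populated bin, so it receives the lowest score. Equal-width binning is sensitive to extreme values; other discretization schemes could be substituted without changing the scoring step.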

Author Biography

Nur Rokhman, Universitas Gadjah Mada

Department of Computer Sciences and Electronics

Published

2019-05-31