An Improved Weighted Median Algorithm for Spatial Outliers Detection

Authors

  • Zerlita Fahdha Pusdiktasari, University of Brawijaya
  • Rahma Fitriani, University of Brawijaya
  • Eni Sumarminingsih, University of Brawijaya

DOI:

https://doi.org/10.21512/comtech.v13i2.7821

Keywords:

Weighted Median Algorithm (WMA), spatial outliers, average difference algorithm

Abstract

A spatial outlier is an object that deviates significantly from its surrounding neighbors. The median algorithm is a robust method for detecting spatial outliers, but it assumes that all spatial objects have the same characteristics. The Average Difference Algorithm (AvgDiff) accommodates differences in spatial characteristics, but it does not use a statistical test to decide whether an object is an outlier. The research develops the Weighted Median Algorithm (WMA), an improved method that combines the advantages of both. From the median algorithm, WMA adopts the median and the statistical test. From AvgDiff, WMA adopts the use of differences in objects' spatial characteristics as weights. The two are combined by calculating WMA's neighborhood score as a weighted median. A simulation was conducted to analyze the accuracy of the method. The results confirm that when objects have heterogeneous spatial characteristics, WMA performs better than the median algorithm. The accuracy of WMA is not much higher than that of AvgDiff, but WMA can prevent a serious false detection problem. The methods can be applied to COVID-19 incidence rate data in East Java.
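
The idea of a weighted-median neighborhood score followed by a statistical test can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting scheme, the test statistic, or the threshold, so the function names (`weighted_median`, `detect_outliers`), the use of a lower weighted median, the standardized difference, and the cutoff `theta=2.0` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative
    weight reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("values must be non-empty")


def detect_outliers(values, neighbors, weights, theta=2.0):
    """Flag objects whose standardized difference from the weighted
    median of their neighbors exceeds theta (an assumed cutoff).

    neighbors[i] lists the indices of object i's neighbors;
    weights[i] lists the matching (hypothetical) weights.
    """
    diffs = []
    for i, x in enumerate(values):
        score = weighted_median([values[j] for j in neighbors[i]], weights[i])
        diffs.append(x - score)  # object value minus neighborhood score
    mu, sigma = mean(diffs), pstdev(diffs)
    if sigma == 0:
        return []  # all differences identical, nothing stands out
    return [i for i, d in enumerate(diffs) if abs((d - mu) / sigma) > theta]


# Toy data: ten objects on a line, each neighboring the adjacent objects.
values = [10, 11, 10, 50, 11, 10, 12, 11, 10, 11]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < len(values)]
             for i in range(len(values))]
weights = [[1.0] * len(n) for n in neighbors]  # equal weights for simplicity
flagged = detect_outliers(values, neighbors, weights)
```

With equal weights the score reduces to an ordinary (lower) median of the neighbors. Replacing `weights` with values that reflect differences in spatial characteristics, for example inverse distances, is where a weighted variant would depart from the plain median algorithm.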

Author Biographies

Zerlita Fahdha Pusdiktasari, University of Brawijaya

Department of Statistics, Faculty of Mathematics and Natural Sciences

Rahma Fitriani, University of Brawijaya

Department of Statistics, Faculty of Mathematics and Natural Sciences

Eni Sumarminingsih, University of Brawijaya

Department of Statistics, Faculty of Mathematics and Natural Sciences

Published

2022-11-25

Issue

Section

Articles