Observing Pre-trained CNN Layer as Hand-crafted Features for Detecting Bias in Image Classification Data
Keywords: computer vision, pre-trained CNN layer, hand-crafted features, data bias
Detecting bias in data is important because undetected bias can cause serious problems when developing an AI algorithm. This research proposes a novel study design for detecting bias in image classification data by using pre-trained CNN layers as hand-crafted features. Three datasets of varying complexity are used: MNIST Digits, Batik collections, and CIFAR-10. The research observes the effect of pre-trained CNN layers on feature extraction, with particular attention to individual kernels. Observing kernels one at a time makes it easier to understand what is happening inside a CNN layer. The research found that the color of an image is an important factor when working with CNNs. Furthermore, the proposed study design is able to detect bias in image classification data when that bias is related to image color. Detecting this bias early helps developers improve AI algorithms.
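The idea of treating a single kernel as a hand-crafted feature extractor can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the kernel below is a hypothetical stand-in for one kernel taken from a pre-trained first convolutional layer, the two images are synthetic, and the helper names `apply_kernel` and `kernel_feature` are invented for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def apply_kernel(image, kernel):
    """Valid cross-correlation of an H x W x C image with one k x k x C kernel."""
    k = kernel.shape[0]
    # Windows over the two spatial axes: shape (H-k+1, W-k+1, C, k, k).
    windows = sliding_window_view(image, (k, k), axis=(0, 1))
    # Sum over window rows, window columns, and channels for each position.
    return np.einsum("ijckl,klc->ij", windows, kernel)


def kernel_feature(image, kernel):
    """Reduce one kernel's feature map to a single scalar feature (its mean)."""
    return float(apply_kernel(image, kernel).mean())


# Hypothetical kernel that only has weights on the red channel (assumption:
# in practice the weights would be copied from a pre-trained CNN layer).
kernel = np.zeros((3, 3, 3))
kernel[:, :, 0] = 1.0

# Two synthetic 8 x 8 RGB images with identical shape but different color.
red = np.zeros((8, 8, 3))
red[:, :, 0] = 1.0
blue = np.zeros((8, 8, 3))
blue[:, :, 2] = 1.0

red_feature = kernel_feature(red, kernel)
blue_feature = kernel_feature(blue, kernel)
print(red_feature, blue_feature)
```

In this toy setting the kernel responds strongly to the red image and not at all to the blue one, although the two images have the same spatial content. A per-kernel feature that separates classes mainly by color in this way is the kind of signal that could point to color-related bias in a dataset.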
Copyright (c) 2022 Amadea Claire Isabel Ardison, Mikhaya Josheba Rumondang Hutagalung, Reynaldi Chernando, Tjeng Wawan Cenggoro
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.