A Random Forest Approach For Gender Classification Based on Keystroke Dynamics

Authors

  • Ayu Maulina, Bina Nusantara University
  • Rifqi Alfinnur Charisma, Bina Nusantara University

DOI:

https://doi.org/10.21512/emacsjournal.v7i3.13445

Keywords:

Classification, Gender Classification, Keystroke Dynamics, Random Forest

Abstract

This study aims to improve gender classification from keystroke dynamics, using data standardization and careful feature selection to enhance model performance. The dataset contains demographic attributes (gender, age, handedness, language, and education) together with typing-behavior metrics such as mean_latency, std_latency, and frequency. Features were selected on the basis of correlation analysis, and the selected features were normalized to ensure consistency across them. The Random Forest classifier was chosen for its stability and its capacity to handle complex data. The results show that the Random Forest model outperformed benchmark models, such as SVM, in terms of accuracy, precision, recall, and F1-score, and they underline the importance of choosing appropriate features and standardizing the data to increase predictive accuracy. By demonstrating the potential of keystroke dynamics for gender classification, this study contributes to the field and opens opportunities for further research in user experience improvement, digital service customization, and online behavioral analysis. Overall, the study highlights how crucial feature engineering and model tuning are to achieving better classification results.
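
The preprocessing and evaluation steps named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy values, the 0/1 encoding of gender, the correlation threshold of 0.5, and the choice of z-score standardization are all hypothetical, and only the column names mean_latency, std_latency, and frequency come from the abstract. The classifier itself is omitted; in practice it would typically come from a library.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, target, threshold):
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

def standardize(values):
    """Z-score standardization: zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std if std else 0.0 for v in values]

def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical toy data; these numbers are invented for illustration only.
columns = {
    "mean_latency": [120.0, 135.0, 150.0, 165.0, 180.0, 195.0],
    "std_latency": [30.0, 28.0, 31.0, 29.0, 30.0, 32.0],
    "frequency": [5.0, 4.8, 4.1, 3.9, 3.2, 3.0],
}
gender = [0, 0, 0, 1, 1, 1]  # assumed binary encoding of the label

# Correlation-based selection (threshold of 0.5 is an assumption).
selected = select_features(columns, gender, threshold=0.5)

# Standardize only the selected features before training a classifier.
scaled = {name: standardize(columns[name]) for name in selected}

print("selected features:", selected)
print("metrics:", binary_metrics([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 0]))
```

Selecting features before scaling keeps the correlation step independent of the scaling choice, since Pearson correlation is unaffected by linear rescaling. The same metrics function could be applied to the predictions of each model being compared, for example a Random Forest and an SVM.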

Author Biographies

Ayu Maulina, Bina Nusantara University

Computer Science Department, School of Computer Science

Rifqi Alfinnur Charisma, Bina Nusantara University

Computer Science Department, School of Computer Science

Published

2025-09-29

How to Cite

Maulina, A., & Charisma, R. A. (2025). A Random Forest Approach For Gender Classification Based on Keystroke Dynamics. Engineering, MAthematics and Computer Science Journal (EMACS), 7(3), 269–274. https://doi.org/10.21512/emacsjournal.v7i3.13445