A Random Forest Approach For Gender Classification Based on Keystroke Dynamics
DOI:
https://doi.org/10.21512/emacsjournal.v7i3.13445
Keywords:
Classification, Gender Classification, Keystroke Dynamics, Random Forest
Abstract
This study aims to improve gender classification from keystroke dynamics, raising model performance through data standardization and appropriate feature selection. The dataset contains the demographic attributes gender, age, handedness, language, and education, together with typing-behavior metrics such as mean_latency, std_latency, and frequency. Features were selected on the basis of correlation analysis, and the selected features were normalized to ensure consistent scaling. The Random Forest classifier was chosen for its stability and its capacity to handle complex data. The results show that the Random Forest model outperformed benchmark models such as SVM in accuracy, precision, recall, and F1-score, and they underline the importance of selecting appropriate features and standardizing the data to increase predictive accuracy. By demonstrating the potential of keystroke dynamics for gender classification, the study contributes to the field and opens opportunities for further research in user experience improvement, digital service customization, and online behavioral analysis. Overall, it highlights how crucial feature engineering and model tuning are to achieving better classification results.
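The preprocessing described in the abstract (correlation-based feature selection followed by standardization) can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say which correlation measure, target, or cutoff was used, so the sketch assumes Pearson correlation against a binary gender label and a hypothetical threshold of 0.3. The feature names come from the abstract; the numeric values are made-up toy data, and the helper names `pearson`, `select_features`, and `standardize` are illustrative only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, labels, threshold):
    """Keep features whose absolute correlation with the label meets the threshold."""
    scores = {name: pearson(vals, labels) for name, vals in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores

def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std if std else 0.0 for v in values]

# Toy data (assumption, NOT from the paper): one list per feature, one value per typist.
columns = {
    "mean_latency": [120.0, 135.0, 110.0, 150.0, 125.0, 160.0],
    "std_latency": [30.0, 31.0, 28.0, 29.0, 31.0, 30.0],
    "frequency": [5.1, 4.2, 5.5, 3.9, 4.9, 3.5],
}
labels = [0, 1, 0, 1, 0, 1]  # hypothetical binary encoding of gender

# Threshold 0.3 is an illustrative choice, not a value reported in the study.
kept, scores = select_features(columns, labels, threshold=0.3)
scaled = {name: standardize(columns[name]) for name in kept}

for name in columns:
    status = "kept" if name in kept else "dropped"
    print(f"{name}: r = {scores[name]:+.3f} ({status})")
```

In a real experiment the standardized columns would then be passed to a classifier such as a Random Forest, and the mean and standard deviation would normally be estimated on the training split only, so that no information from the test data leaks into preprocessing.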
License
Copyright (c) 2025 Ayu Maulina, Rifqi Alfinnur Charisma

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.