Use of classification techniques in a dataset on financial inclusion: a study based on Latin American countries
DOI:
https://doi.org/10.47456/bjpe.v8i1.37019
Keywords:
Data mining, Classification, Financial inclusion, Latin America
Abstract
Financial inclusion is important for reducing poverty and fostering inclusive economic growth, especially when comparing groups separated by high social inequality. This article used the World Bank Group's Global Financial Inclusion (Global Findex) survey to compare machine learning techniques for classifying men and women according to their use of financial services. The classifiers used were Decision Tree, k-Nearest Neighbors, Naïve Bayes, and Random Forest, evaluated on accuracy, precision, sensitivity (recall), F1-score, and area under the Receiver Operating Characteristic (ROC) curve. All techniques except Naïve Bayes achieved accuracy close to 70%, sensitivity close to 88%, and precision above 72% for most of the parameters investigated. For the area under the ROC curve, Random Forest reached 0.77, outperforming the other techniques on this measure.
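As a minimal sketch of how the evaluation metrics named in the abstract are defined, the following pure-Python example computes accuracy, precision, sensitivity, F1-score, and area under the ROC curve for a binary classifier. The function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not taken from the article or from the Global Findex data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall) and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive example receives the higher score (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): true classes and classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)  # every metric equals 0.75 on this toy data
print(auc)      # 0.875
```

Accuracy, precision, sensitivity, and F1 depend on the chosen threshold, while the area under the ROC curve summarizes ranking quality across all thresholds, which is why the two kinds of measure can rank classifiers differently.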
License
Copyright (c) 2022 Brazilian Journal of Production Engineering - BJPE
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.