Publication in the Journal of Information and Data Management (JIDM 2021)

Wed, 26 May 2021 23:57:14 -0300

The paper “Analysis of Distinct Feature Groups in the Credit Scoring Problem” was accepted at the Journal of Information and Data Management (JIDM). It extends the paper published at KDMiLe last year. More information about the paper:

Authors: L. F. Vercosa, R. Lira, R. Monteiro, K. Silva, J. Magalhaes, A. Maciel, B. Leite, C. Bastos-Filho

Abstract: “Registration and financial data have been traditionally used for the credit scoring problem. However, slight improvements in the reliability of the scores positively impacts financial companies. Therefore, exploring new features is a strategic task. This work analyzes the importance of new feature groups not commonly employed for the credit scoring task and others already used. We categorized features from open credit scoring datasets, such as German and Australian and compared their groups with the ones of a company dataset used in this work. Our dataset contains unusual feature groups, such as historical, geolocation, web behavior, and demographic data. In our analyzes, we first conducted bivariate tests with each feature-pair to assess their individual importance. Secondly, we ran XGBoost machine learning model with each feature group to evaluate each group importance. We also applied feature selection with binary Particle Swarm Optimization to assess the groups importance when combined. Next, we employed correlation tests to find inner and inter-correlation among the features groups. Finally, we used the company dataset and employed AdaBoost, Multilayer Perceptron, and XGBoost algorithms to find the best model for the task. Some of our main findings were that the unusual features added a slight improvement to registration features. We also detected reasonable inner correlation among some feature groups and found that all groups were relevant for the task with the Historical Group as the most promising. Lastly, XGBoost obtained the best performance over AdaBoost and Multilayer-perceptron for the task.”

Publication in the Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2020)

Thu, 24 Sep 2020 23:57:14 -0300

The paper “Impact of Unusual Features in Credit Scoring Problem” was accepted at the Symposium on Knowledge Discovery, Mining and Learning (KDMiLe 2020). The work grew out of the course project for the Data Mining class at PPGEC/Ecomp. More information about the paper:

Authors: L. F. Vercosa, R. Lira, R. Monteiro, K. Silva, J. Magalhaes, A. Maciel, B. Leite, C. Bastos-Filho

Abstract: “Standard features used for Credit Scoring includes mainly registration and financial data from customers. However, exploring new features is of great interest for financial companies, since slight improvements in the person score directly impact the company revenue. In this work, we categorize features from open credit scoring datasets and compare them with the features found in a real company dataset. The company dataset contains unusual feature groups such as historical, geolocation, web behavior, and demographic data. We performed bivariate tests using the Kolmogorov-Smirnov metric and features to assess the performance of the particular feature groups. We also generated a score of good payer by using AdaBoost, Multilayer Perceptron, and XGBoost algorithms. Then, we analyzed the results with different metrics and compared them with the real company results. Our main finding was that these features added a small improvement to current datasets. We also identified the most promising feature groups and noticed that the tuned XGBoost performed better than the company solution in three out of four deployed metrics.”
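For readers unfamiliar with the Kolmogorov-Smirnov metric mentioned in the abstract, here is a minimal sketch of the standard two-sample KS statistic: the largest gap between the empirical distributions of a feature for good and bad payers. It is not the paper's code. The function name `ks_statistic` and the toy values are hypothetical and serve only as illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking them suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical toy values of one feature, split by payment outcome.
good_payers = [0.9, 0.8, 0.7, 0.6]
bad_payers = [0.1, 0.2, 0.3, 0.4]

separating = ks_statistic(good_payers, bad_payers)      # fully separated groups
overlapping = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])  # identical groups
partial = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])      # partially overlapping
```

A value close to 1 means the feature separates the two groups well, while a value close to 0 means their distributions are nearly indistinguishable.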
