eISSN: 2574-9927

Material Science & Engineering International Journal

Research Article Volume 8 Issue 1

Development of a credit risk evaluation system using multilayer neural networks

Frank Edward Tadeo Espinoza, Marco Antonio Coral Ygnacio

Department of Systems Engineering, Universidad Católica Sedes Sapientiae, Peru

Correspondence: Frank Edward Tadeo Espinoza, Department of Systems Engineering, Universidad Católica Sedes Sapientiae, Peru

Received: January 05, 2024 | Published: January 19, 2024

Citation: Espinoza FET, Ygnacio MAC. Development of a credit risk evaluation system using multilayer neural networks. Material Sci & Eng. 2024;8(1):1-7. DOI: 10.15406/mseij.2024.08.00228

Abstract

This paper deals with the development of a credit risk assessment system using multilayer neural networks. The main objective of this work is to provide a decision support tool for risk assessment, considering relevant variables in the process. To achieve this objective, the backpropagation algorithm and the Adam optimizer were used to train the model. In terms of materials and methods, a training and validation data set including relevant financial information of credit applicants was used. A multilayer neural network was implemented that made predictions and calculated the loss using the categorical cross-entropy function. The results obtained during the development of the system showed a favorable performance and a satisfactory level of accuracy in identifying and classifying different levels of credit risk. However, it is emphasized that the system does not provide absolute results; human intervention is recommended as a last resort for decision making.

Keywords: credit risk, ANN, credit rating, financial assessment

Introduction

Currently, credit risk refers to the evaluation performed by a banking entity to classify a person applying for credit. These systems evaluate the credit behavior or credit history of the applicant, considering variables such as delinquent accounts, balance of other loans, monthly income, demographic data, among others.1 With the Covid-19 pandemic, the problem of credit risk assessment has increased due to the large number of credit applications, some of which are from people with bad credit history or who request amounts greater than their payment capacity, which has generated deficiencies in the assessment of default risk and losses for financial institutions.2

The seriousness of this problem lies in the double impact it has both for financial institutions, which face greater losses due to the high delinquency rates that could result from poor assessment, and for potential borrowers, who may face unjustified rejections due to insufficiently accurate assessment systems.3 This problem not only affects the financial stability of institutions, but can also hinder access to credit, a crucial tool for welfare and economic development.

Faced with this problem, the adoption of new cost-effective technologies could enable financial institutions to have decision support systems for credit granting. Credit risk evaluation systems based on artificial intelligence and machine learning have emerged as a viable alternative to improve evaluation processes and reduce the percentage of loss probability in loans; studies and research on the subject have shown that the adoption of such technologies for evaluation significantly improves the results of evaluations.4 However, these solutions depend on factors such as data availability, data processing standards, and variables used, which can affect the results of the systems and their efficiency.5

The motivation for designing a solution lies in the possibility of overcoming these problems and providing a more efficient and accurate credit risk assessment tool. Such a tool not only benefits financial institutions by reducing their risk exposure, but could also expand access to credit for individuals and businesses, which could ultimately stimulate economic growth. The purpose of this article is to propose a credit evaluation model that combines the necessary variables with the opportunities offered by new technologies to improve these evaluation processes. The proposal addresses the implementation of a credit risk assessment system based on multilayer neural networks and explains how this system can overcome current challenges and improve the accuracy and efficiency of credit risk assessment.

During the development of the proposal, the results obtained showed a promising performance and a satisfactory level of accuracy. However, it is important to note that these results are not 100% accurate in all situations. Instead, the importance of human intervention as the ultimate decision maker is recognized.

The article is organized as follows. The Methodology section details the state of the art of the subject, the problem, and the proposed solution. The Results section describes and analyzes the results obtained from the proposal. The final section presents the conclusions and proposes recommendations for future research.

Methodology

This section reviews the existing theory and research on credit risk assessment in order to frame the proposed solution.

Credit risk assessment is a field of study used to determine the probability that a borrower will default on payment obligations. It aims to minimize the risk of default and protect the lender's interests. For this purpose, various factors are analyzed, such as the borrower's credit history, repayment capacity, current financial situation and other relevant factors.6 On this basis, research has produced a range of credit risk assessment systems that use various techniques and methods to determine the level of risk and allow the timely identification of potential risk factors associated with the applicant.

The classification of these systems can be done in different ways, considering the statistical model used, the interaction with the environment and its functionality. In terms of the statistical model, there are systems that use logistic regression to estimate the probability of credit default based on predictor variables, and systems that use discriminant analysis to estimate the probabilities of belonging to different credit risk categories.7 In terms of interaction with the environment, systems can be classified as autonomous, which operate without direct interaction with the external environment and use predefined statistical models and parameters; adaptive, which constantly adjust to changes in the environment by updating their models and parameters; benchmarking-based, which compare credit performance with predefined external standards; and human-interactive, which combine statistical models with human intervention in the credit risk assessment process. In terms of functionality, systems can be classified as unidimensional, which focus on a single dimension of credit risk; multidimensional, which consider multiple dimensions of credit risk; predictive, which use statistical models and predictive analytics to estimate future credit risk; and rule-based, which use predefined rules and criteria to assess credit risk and generate feedback on the outcome.8
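As a minimal sketch of the logistic-regression approach mentioned above, the probability of default can be written as a logistic function of a weighted sum of predictor variables. The two predictors and all coefficient values below are hypothetical and chosen only for illustration; a real system would estimate them from historical data.

```python
import math

def default_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(c_k * x_k))))."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical predictors: debt-to-income ratio and number of late payments.
# Hypothetical coefficients; not fitted to any data set.
coefficients = [2.0, 0.5]
intercept = -3.0

p = default_probability([0.4, 1], coefficients, intercept)
print(f"Estimated probability of default: {p:.3f}")
```

The output is always strictly between 0 and 1, which is what allows it to be read as a probability.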

The creation of credit risk assessment systems can have multiple benefits. From an economic point of view, these systems can help financial institutions make informed decisions on lending and interest rate setting, which can reduce the risk of default and protect the lender's interests. By minimizing the risk of default, they can help financial institutions maintain financial stability and solvency, which in turn can contribute to overall economic stability. From a social point of view, these systems can help protect borrowers from excessive financial burden, since financial institutions can make more informed decisions about lending and setting interest rates. By protecting borrowers from excessive financial burden, they can help reduce the risk of poverty and social exclusion. From a research point of view, the creation of credit risk assessment systems can itself be the subject of research and development, which can contribute to improving the techniques and methods used in credit risk assessment. Such research can also contribute to understanding the factors that influence the risk of default and to identifying new ways to minimize that risk.

Several machine learning libraries, such as Scikit-learn, TensorFlow and Keras, are used to develop these systems. Scikit-learn is a Python library that offers a wide range of classification and regression algorithms, including SVMs, and provides tools for data preprocessing and model performance evaluation.9 TensorFlow, on the other hand, enables the construction of highly complex and accurate neural network and deep learning models, and is capable of running models on a variety of devices and platforms.10 Keras focuses on speed of experimentation and ease of use, offering a large number of neural network layers, activation functions, and optimizers.11

From this analysis, it can be inferred that a credit risk assessment system is composed of processes, tools, and criteria used to analyze and measure the probability of default by a borrower on a loan or credit. These systems are based on the collection and analysis of financial, credit and other relevant variables to determine the level of risk associated with a specific loan applicant.

The development of credit risk assessment systems involves, beyond software implementation technologies, a method for each aspect of system development. The first is the software development methodology, such as agile methodologies. These provide a flexible, feedback-focused approach to software development that allows for rapid changes in system design and functionality during the development process. Agile methodologies, unlike traditional methods, emphasize collaboration, adaptability, and continuous delivery of value to the end user.12 The second is the credit analysis method, which emulates the evaluation processes performed by credit experts. One approach frequently used in this context is the Financial Statement Analysis Method. This method consists of examining in detail the applicant's financial statements, including the balance sheet, income statement and cash flow. Various financial techniques and ratios are used to assess the applicant's financial health and, therefore, its creditworthiness.

Continuing with the process of implementing credit risk assessment systems, a crucial consideration is the selection of algorithms for credit evaluation and rating. Within these algorithms, Support Vector Machines (SVM) are a relevant option. These algorithms are used to categorize samples into two or more classes, which is useful in the context of credit assessment to predict whether an applicant represents a risk to the organization. SVMs operate by finding the hyperplane that best divides the samples in the feature space, which maximizes the distance between categories.13
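The hyperplane idea can be sketched in a few lines. The example below only shows how an already-known hyperplane classifies a sample and how wide its margin is; finding the weights `w` and offset `b` (the actual SVM training step) is not shown, and the numeric values are hypothetical.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin_width(w):
    """Distance between the two margin hyperplanes: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane for two scaled features; not fitted to any data.
w, b = [3.0, 4.0], -5.0
label = classify(w, b, [2.0, 1.0])  # 3*2 + 4*1 - 5 = 5, so class +1
width = margin_width(w)             # ||w|| = 5, so the margin is 0.4
```

A smaller norm of `w` means a wider margin, which is why SVM training minimizes that norm subject to classifying the training samples correctly.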

In addition, neural networks are used. These algorithms are inspired by the workings of the human brain and can be employed in credit evaluation to make predictions based on complex data patterns. They use multiple layers of interconnected nodes to process data and make predictions.14 Similarly, the Random Forest algorithm plays an essential role in these systems. This algorithm combines multiple decision trees into an ensemble for sample classification. Each tree in the Random Forest is trained with a random sample from the data set, and the predictions from all the trees are combined to give a final classification.15
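The way a Random Forest combines its trees can be illustrated with a simple majority vote. In the sketch below the "trees" are hypothetical hand-written rules standing in for trained decision trees, and the field names of the applicant record are assumptions made for the example.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Return the class predicted by most trees."""
    votes = [tree(sample) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stand-ins for trained decision trees: each is a single rule
# on an applicant record. Real trees would be learned from random samples
# of the training data.
trees = [
    lambda a: "high" if a["late_payments"] > 2 else "low",
    lambda a: "high" if a["debt_ratio"] > 0.5 else "low",
    lambda a: "high" if a["income"] < 1000 else "low",
]

applicant = {"late_payments": 3, "debt_ratio": 0.3, "income": 2500}
prediction = forest_predict(trees, applicant)  # two of three trees vote "low"
```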

Finally, XGBoost can be used to predict the credit risk probability of applicants. This algorithm, based on predictor variables such as income, credit history and age, requires historical training data to adjust its parameters and tune the model to the specific characteristics of credit applicants.5

The credit risk assessment process is a crucial element in the operation of financial institutions, as it makes it possible to determine the credit applicants' ability to pay and minimize the risks associated with lending.16 However, this process is complex and presents significant challenges that require careful attention by credit assessment experts. One point to consider is that many processes for verifying the level of risk associated with a loan focus the evaluation on the financial category, such as credit history and level of indebtedness. Although these indicators provide a frame of reference during the evaluation, other categories can improve the outcome and reduce the level of risk associated with the evaluations.2

From a theoretical perspective, credit risk assessment is based on financial models and theories that seek to quantify and predict default risk. These models take into account variables such as credit history, level of indebtedness and other financial indicators, which can provide a clear view of the creditworthiness and payment capacity of the credit applicant.16 However, although these theoretical models are useful, they do not always consider non-financial factors that can influence credit risk, such as consumer behavior, the general economic context and market conditions. In addition, theoretical analysis can sometimes be limited by the lack of historical data and the difficulty of modeling uncertainties and unforeseen events.2

From a practical perspective, the development of credit risk assessment systems based on AI and deep learning presents specific challenges. On the software side, the design and implementation of machine learning and deep learning algorithms can be complex. Developers must choose the right algorithm for each specific task, and it must be able to handle the unique characteristics of credit data, which can include a high degree of variability and noise.2 In addition, algorithms must be able to learn from large volumes of data and make accurate predictions in real time, which may require advanced optimization and model fitting techniques.17

From a hardware point of view, AI-based credit risk assessment systems may require significant computational resources. Deep learning model training can be processor and memory intensive, and may require specialized hardware, such as graphics processing units (GPUs) or distributed computing systems. The hardware must be able to support the workload without compromising system performance or user experience.18

Given the problem described above, the need arises to develop an evaluation model that incorporates new categories in order to obtain improved results. This implies searching for additional variables and more complete criteria for a more accurate and effective credit evaluation. By expanding the evaluation categories, we seek to address the applicant evaluation process in a more complete manner. The proposed model consists of 3 categories:

  1. Financial Category: This category considers variables directly related to the applicant's economic and credit situation within the financial system. Some variables are provided by the applicant, such as monthly income and monthly expenses; others are provided by financial institutions, such as credit rating and debt history.
  2. Socioeconomic Category: This category considers variables from the applicant's social and economic context in order to gauge payment capacity in relation to the loans requested.
  3. Personal Category: This category considers variables that give the specialist personal information about the applicant, such as age and marital status, in order to assess individual payment capacity.

Based on this, Table 1 shows the evaluation model described along with the categories and associated variables:

 

Evaluation model

| Category | Variable | Description |
| --- | --- | --- |
| Financial | Monthly income | Records compliance with previous payments and debts of the applicant. |
| | Monthly Expenditures | The amount of money the applicant spends monthly on basic needs and financial commitments. |
| | Requested Amount | The amount of money that the applicant wishes to obtain as a loan. |
| | Loan Time | The time in which the applicant agrees to repay the loan, including interest. |
| | Weighted Rating in the last year | The weighted rating granted by the SBS. |
| | Unpaid debts, punished or in legal dispute | Whether the applicant has active debts in the financial system that are reported as unpaid. |
| | Negative information reported | Whether the applicant has information registered in SICOM. |
| | Closed Checking Accounts | Whether the applicant has checking accounts closed by any financial institution. |
| | Canceled Credit Cards | Whether the applicant has credit cards canceled by any financial institution. |
| Socioeconomic | Job type | The nature of the applicant's employment (for example, salaried, self-employed, unemployed). |
| | Time spent in employment | The length of time the applicant has been employed in their current job. |
| | Education level | The level of education achieved by the applicant (for example, high school, college, graduate). |
| Personal | Civil status | The applicant's marital status (for example, single, married, divorced). |
| | Number of dependents | The number of people who are financially dependent on the applicant. |
| | Age | The age of the applicant in years. |
| | Housing History | The applicant's residency history, including whether they are a homeowner, renter, or living with relatives. |
| | Purpose of the loan | The reason the applicant is applying for the loan (for example, debt consolidation, home purchase, education, business). |

Table 1 Evaluation model with categories and associated variables

The categories mentioned above describe a series of variables within the evaluation model that would improve the credit risk evaluation process. These variables provide a more complete and detailed perspective of the repayment capacity and risk profile of the applicants. However, it is important to note that, although these variables contribute significantly to the evaluation, they are not a guarantee of a completely effective evaluation and the role of the expert remains fundamental in the final decision-making process. Despite the incorporation of the aforementioned categories and variables, it is necessary for the expert to analyze and evaluate each case individually, considering additional factors and exercising professional judgment.

In order to improve the accuracy and efficiency of credit risk assessment, the implementation of a system based on multilayer artificial neural networks is proposed. Based on this, a conceptual model of the solution is shown in Figure 1. This proposal seeks to take advantage of the power of machine learning and the information processing capacity of neural networks to make more accurate and reliable predictions about the payment capacity of applicants.

Figure 1 Conceptual solution model.

Multilayer artificial neural networks are computational models that simulate the functioning of neural networks in the human brain. These networks are composed of interconnected layers of artificial neurons, where each neuron performs mathematical computations and transmits information through weighted connections. As the system is trained with training data, the connections between neurons are adjusted to optimize the prediction process.

The choice of multilayer artificial neural networks as the basis of our system is due to their ability to recognize complex, nonlinear patterns in data sets. By training the system with a historical data set that includes financial and credit information, the neural network can learn to identify correlations and hidden patterns that can be used to predict the ability to pay of new applicants. The proposed system will leverage information gathered from the financial, socioeconomic, and personal variables mentioned above. These variables will serve as input data for the multilayer neural network system, which will process the information and generate a prediction of the applicant's ability to pay and level of credit risk.

Figure 2 illustrates the proposed multilayer neural network model on which the classification model is built:

Figure 2 Proposed multilayer neural network model

Input Layer: The input variables are related to the variables presented in the evaluation model and are shown in Table 2.

 

Input variables

| Category | Variable | Input (xn) |
| --- | --- | --- |
| Financial | Monthly income | x1 |
| | Monthly Expenditures | x2 |
| | Requested Amount | x3 |
| | Loan Time | x4 |
| | Weighted Rating in the last year | x5 |
| | Unpaid debts, punished or in legal dispute | x6 |
| | Negative information reported | x7 |
| | Closed Checking Accounts | x8 |
| | Canceled Credit Cards | x9 |
| Socioeconomic | Job type | x10 |
| | Time spent in employment | x11 |
| | Education level | x12 |
| Personal | Civil status | x13 |
| | Number of dependents | x14 |
| | Age | x15 |
| | Housing History | x16 |
| | Purpose of the loan | x17 |

Table 2 Definition of input variables
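One way to carry this mapping into code is a small lookup structure that assigns the labels x1 to x17 in table order. This is only an illustrative sketch; the names `INPUTS_BY_CATEGORY`, `index_inputs` and `INPUT_INDEX` are assumptions introduced here and do not come from the original system.

```python
# Input variables grouped by category, in the order of the input table.
INPUTS_BY_CATEGORY = {
    "Financial": [
        "Monthly income", "Monthly Expenditures", "Requested Amount",
        "Loan Time", "Weighted Rating in the last year",
        "Unpaid debts, punished or in legal dispute",
        "Negative information reported", "Closed Checking Accounts",
        "Canceled Credit Cards",
    ],
    "Socioeconomic": ["Job type", "Time spent in employment", "Education level"],
    "Personal": ["Civil status", "Number of dependents", "Age",
                 "Housing History", "Purpose of the loan"],
}

def index_inputs(groups):
    """Assign x1, x2, ... in table order; return {label: (category, name)}."""
    index, n = {}, 1
    for category, names in groups.items():
        for name in names:
            index[f"x{n}"] = (category, name)
            n += 1
    return index

INPUT_INDEX = index_inputs(INPUTS_BY_CATEGORY)
print(INPUT_INDEX["x10"])  # first socioeconomic variable
```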

Hidden Layers: These layers perform two main functions. The first computes, for each category, the sum of the products of the weight assigned to each evaluation criterion and the corresponding input value, plus a bias term b (Table 3).

Product of the variables

| Category | Variable (xn) | Weight (wn) | Calculation |
| --- | --- | --- | --- |
| Financial | x1 | 0.11 | $\sum_{n=1}^{9} w_n x_n + b$ |
| | x2 | 0.11 | |
| | x3 | 0.11 | |
| | x4 | 0.11 | |
| | x5 | 0.12 | |
| | x6 | 0.11 | |
| | x7 | 0.11 | |
| | x8 | 0.11 | |
| | x9 | 0.11 | |
| Socioeconomic | x10 | 0.34 | $\sum_{n=10}^{12} w_n x_n + b$ |
| | x11 | 0.33 | |
| | x12 | 0.33 | |
| Personal | x13 | 0.2 | $\sum_{n=13}^{17} w_n x_n + b$ |
| | x14 | 0.2 | |
| | x15 | 0.2 | |
| | x16 | 0.2 | |
| | x17 | 0.2 | |

Table 3 Calculation in hidden layers
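The per-category calculation can be sketched as a single helper that multiplies each weight by its input and adds the bias. The weights are the ones listed for the hidden layers; the default bias of 0 and the names `WEIGHTS` and `weighted_sum` are assumptions made for this illustration.

```python
# Weights per category as listed for the hidden layers.
WEIGHTS = {
    "Financial": [0.11, 0.11, 0.11, 0.11, 0.12, 0.11, 0.11, 0.11, 0.11],
    "Socioeconomic": [0.34, 0.33, 0.33],
    "Personal": [0.2, 0.2, 0.2, 0.2, 0.2],
}

def weighted_sum(weights, inputs, bias=0.0):
    """Return sum(w_n * x_n) + b for one category."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# With every input set to 1 and b = 0 (an assumption), each category's sum
# equals the total of its weights, which is 1 in all three cases.
unit_sums = {name: weighted_sum(w, [1] * len(w)) for name, w in WEIGHTS.items()}
```

The check with unit inputs shows that the weights inside each category are normalized to add up to 1.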

The second function is the activation function. The proposal uses the ReLU function, which sets negative neuron outputs to zero and passes positive outputs unchanged, as given by the following formula (Table 4).

Variable activation function

| Category | Variable (xn) | Weight (wn) | Calculation | ReLU function |
| --- | --- | --- | --- | --- |
| Financial | x1 | 0.11 | $\sum_{n=1}^{9} w_n x_n + b$ | $f(x) = \max(0, x)$ |
| | x2 | 0.11 | | |
| | x3 | 0.11 | | |
| | x4 | 0.11 | | |
| | x5 | 0.12 | | |
| | x6 | 0.11 | | |
| | x7 | 0.11 | | |
| | x8 | 0.11 | | |
| | x9 | 0.11 | | |
| Socioeconomic | x10 | 0.34 | $\sum_{n=10}^{12} w_n x_n + b$ | $f(y) = \max(0, y)$ |
| | x11 | 0.33 | | |
| | x12 | 0.33 | | |
| Personal | x13 | 0.2 | $\sum_{n=13}^{17} w_n x_n + b$ | $f(z) = \max(0, z)$ |
| | x14 | 0.2 | | |
| | x15 | 0.2 | | |
| | x16 | 0.2 | | |
| | x17 | 0.2 | | |

Here x, y and z in the ReLU column denote the result of the calculation for the Financial, Socioeconomic and Personal categories, respectively.

Table 4 Calculation of activation in the hidden layers
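A minimal sketch of the ReLU activation follows. The sample pre-activation values are hypothetical and only serve to show that negative values are clipped to zero while positive values pass through unchanged.

```python
def relu(value):
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, value)

# Illustrative pre-activation values (hypothetical).
samples = [-0.7, 0.0, 1.58]
activated = [relu(v) for v in samples]
print(activated)
```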

Output Layer: This layer applies the Softmax function to the set of activation values to convert them into probabilities by means of the following formula (Table 5).

Variable output function

| Category | f(x) | Softmax function |
| --- | --- | --- |
| Financial | 1.58 | $R = \frac{e^{1.58}}{e^{1.58} + e^{1.34} + e^{1.8}}$ |
| Socioeconomic | 1.34 | $R = \frac{e^{1.34}}{e^{1.58} + e^{1.34} + e^{1.8}}$ |
| Personal | 1.8 | $R = \frac{e^{1.8}}{e^{1.58} + e^{1.34} + e^{1.8}}$ |

Table 5 Calculation of the Softmax function at the output layer

Based on the calculations defined for the algorithm, a table of ranges is defined in order to analyze the result and grant a credit rating to the applicant (Table 6).

Evaluation criteria

| Classification range | Result |
| --- | --- |
| $0.00 \le R(\text{score}) \le 0.35$ | Low risk |
| $0.35 < R(\text{score}) \le 0.70$ | Medium risk |
| $0.70 < R(\text{score}) \le 1$ | High risk |

Table 6 Performance evaluation table
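The Softmax calculation and the classification ranges above can be sketched as two small functions. How the three category probabilities are combined into the single score R(score) is not specified in this section, so the sketch keeps the two steps separate; the function names are assumptions introduced for illustration.

```python
import math

def softmax(values):
    """Convert a list of scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the maximum avoids overflow
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def risk_level(score):
    """Map a score in [0, 1] to the ranges of the evaluation criteria."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score <= 0.35:
        return "Low risk"
    if score <= 0.70:
        return "Medium risk"
    return "High risk"

# Activations for the Financial, Socioeconomic and Personal categories.
activations = [1.58, 1.34, 1.8]
probabilities = softmax(activations)
print([round(p, 4) for p in probabilities])
```

Subtracting the maximum before exponentiating does not change the result, because the common factor cancels in the ratio; it only keeps the exponentials in a safe numeric range.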

The model is trained using the backpropagation algorithm and the Adam optimizer. During training, predictions are made on the training set and the error of these predictions is calculated using the categorical cross-entropy function. The Adam optimizer adjusts the weights and biases in the direction opposite to the gradient of the loss, using running estimates of the gradient and of its square to adapt the step size of each parameter.

The training process consists of several stages. First, feedforward is performed by passing the input data through the network to obtain a prediction. Then, the error is calculated using the categorical cross-entropy loss. Next, backpropagation is performed by propagating the error backward through the network, calculating the partial derivatives of the error with respect to the weights of each layer. Finally, the network weights are updated from these partial derivatives, with the size of each update controlled by the learning rate defined in the Adam optimizer.
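The loss and update steps can be sketched in plain Python. This is an illustration under stated assumptions, not the system's training code: the hyperparameter defaults (`beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are the commonly cited Adam defaults rather than values reported here, and the quadratic objective used to exercise the update is a hypothetical toy problem.

```python
import math

def categorical_cross_entropy(target, predicted, eps=1e-12):
    """Loss = -sum(t_k * log(p_k)) for a one-hot target."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m and v are running averages of the gradient and of its square;
    t is the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem (hypothetical): minimize (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for step in range(1, 2001):
    w, m, v = adam_step(w, 2 * (w - 3.0), m, v, step, lr=0.05)

# Loss for a sample whose true class is the second of three.
loss = categorical_cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])
```

In a full network the same update is applied to every weight and bias, with the gradients supplied by backpropagation.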

Results

To validate the proposal, we present the results of simulating an evaluation with synthetic data, for which a low credit risk rating is the expected output. Table 7 shows the values entered in the input layer.

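Input data

| Category | Variable (xn) | Input value |
| --- | --- | --- |
| Financial | x1 | 2 |
| | x2 | 2 |
| | x3 | 1 |
| | x4 | 1 |
| | x5 | 4 |
| | x6 | 1 |
| | x7 | 1 |
| | x8 | 1 |
| | x9 | 1 |
| Socioeconomic | x10 | 2 |
| | x11 | 1 |
| | x12 | 1 |
| Personal | x13 | 2 |
| | x14 | 2 |
| | x15 | 2 |
| | x16 | 2 |
| | x17 | 1 |

Table 7 Input data by variable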
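As a quick arithmetic check, the input values of Table 7 can be combined with the hidden-layer weights. The bias value is not stated in this section; the sketch assumes b = 0 because that choice reproduces the sums 1.58, 1.34 and 1.8 reported for the three categories. The names `WEIGHTS`, `INPUTS`, `BIAS` and `category_sums` are introduced here for illustration.

```python
# Hidden-layer weights and the synthetic inputs of Table 7, by category.
WEIGHTS = {
    "Financial": [0.11, 0.11, 0.11, 0.11, 0.12, 0.11, 0.11, 0.11, 0.11],
    "Socioeconomic": [0.34, 0.33, 0.33],
    "Personal": [0.2, 0.2, 0.2, 0.2, 0.2],
}
INPUTS = {
    "Financial": [2, 2, 1, 1, 4, 1, 1, 1, 1],
    "Socioeconomic": [2, 1, 1],
    "Personal": [2, 2, 2, 2, 1],
}
BIAS = 0.0  # assumption: not stated explicitly, but matches the reported sums

def category_sums(weights, inputs, bias):
    """Compute sum(w_n * x_n) + b for every category."""
    return {
        name: sum(w * x for w, x in zip(weights[name], inputs[name])) + bias
        for name in weights
    }

sums = {k: round(val, 2) for k, val in category_sums(WEIGHTS, INPUTS, BIAS).items()}
print(sums)
```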

Hidden Layers: Table 8 shows the weighted-sum and activation calculations obtained with the formulas described above.

Calculation in the hidden layer

| Categories | Variables (Xn) | Weights (Wn) | Calculation | Activation function |
|---|---|---|---|---|
| Financial | X1 | 0.11 | $\sum_{n=1}^{9} (w_n \cdot X_n) + b = 1.58$ | $f(x) = \max(0, 1.58)$ |
| | X2 | 0.11 | | |
| | X3 | 0.11 | | |
| | X4 | 0.11 | | |
| | X5 | 0.12 | | |
| | X6 | 0.11 | | |
| | X7 | 0.11 | | |
| | X8 | 0.11 | | |
| | X9 | 0.11 | | |
| Socioeconomic | X10 | 0.34 | $\sum_{n=10}^{12} (w_n \cdot X_n) + b = 1.34$ | $f(x) = \max(0, 1.34)$ |
| | X11 | 0.33 | | |
| | X12 | 0.33 | | |
| Personal | X13 | 0.2 | $\sum_{n=13}^{17} (w_n \cdot X_n) + b = 1.8$ | $f(x) = \max(0, 1.8)$ |
| | X14 | 0.2 | | |
| | X15 | 0.2 | | |
| | X16 | 0.2 | | |
| | X17 | 0.2 | | |

Table 8 Processing of the variables in the hidden layer
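The hidden-layer arithmetic can be checked with a short sketch. The input values and weights are those listed in Tables 7 and 8. The dictionary layout and the function name `hidden_activation` are hypothetical, and the bias is assumed to be zero because the tabulated sums 1.58, 1.34 and 1.8 are reproduced exactly with b = 0.

```python
# Input values per category (Table 7) and their weights (Table 8).
inputs = {
    "financial": [2, 2, 1, 1, 4, 1, 1, 1, 1],
    "socioeconomic": [2, 1, 1],
    "personal": [2, 2, 2, 2, 1],
}
weights = {
    "financial": [0.11, 0.11, 0.11, 0.11, 0.12, 0.11, 0.11, 0.11, 0.11],
    "socioeconomic": [0.34, 0.33, 0.33],
    "personal": [0.2, 0.2, 0.2, 0.2, 0.2],
}

def hidden_activation(x, w, b=0.0):
    """Weighted sum of one category followed by ReLU (b = 0 is an assumption)."""
    z = sum(w_n * x_n for w_n, x_n in zip(w, x)) + b
    return max(0.0, z)

hidden = {name: hidden_activation(inputs[name], weights[name]) for name in inputs}
for name, value in hidden.items():
    print(name, round(value, 2))
```

Because every weighted sum in this example is positive, ReLU passes each value through unchanged. The activation would only alter the result for a category whose sum fell below zero.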

Output layer: Table 9 shows the calculation in the output layer. The Softmax function is applied to the ReLU outputs of the hidden layer, each Softmax value is multiplied by the weight of its category, and the products are summed to obtain the final score.

Output layer results

| Categories | f(x) | Softmax function | Weight by category | Result |
|---|---|---|---|---|
| Financial | 1.58 | $R = \frac{e^{1.58}}{e^{1.58} + e^{1.34} + e^{1.8}} = 0.3297$ | 0.4 | 0.13189547 |
| Socioeconomic | 1.34 | $R = \frac{e^{1.34}}{e^{1.58} + e^{1.34} + e^{1.8}} = 0.2593$ | 0.3 | 0.07781449 |
| Personal | 1.8 | $R = \frac{e^{1.8}}{e^{1.58} + e^{1.34} + e^{1.8}} = 0.4108\ldots$ | 0.3 | 0.12326391 |
| Result | | | | 0.33297387 |

Table 9 Processing in the output layer
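The output-layer calculation can be reproduced with the following sketch, which applies Softmax to the three hidden-layer outputs and then forms the weighted sum. The activations and category weights are those in Table 9. The variable and function names are hypothetical.

```python
import math

# Hidden-layer outputs f(x) and the category weights from Table 9.
hidden_outputs = {"financial": 1.58, "socioeconomic": 1.34, "personal": 1.8}
category_weights = {"financial": 0.4, "socioeconomic": 0.3, "personal": 0.3}

def softmax(values):
    """Softmax over a dict of activations; the results sum to 1."""
    exps = {name: math.exp(v) for name, v in values.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

probs = softmax(hidden_outputs)
# Weight each Softmax value by its category and add the products.
contributions = {name: probs[name] * category_weights[name] for name in probs}
score = sum(contributions.values())

for name in probs:
    print(name, round(probs[name], 4), round(contributions[name], 8))
print("score", round(score, 8))
```

Since the Softmax values sum to 1 and the category weights also sum to 1, the score is always bounded by the smallest and largest category weight, here 0.3 and 0.4. This is worth keeping in mind when reading the score against thresholds defined on the full range from 0 to 1.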

The value obtained is then compared with the evaluation criteria presented in Table 10.

Comparison of the result

| Classification range | Result | Obtained value | Classification |
|---|---|---|---|
| 0.00 ≤ R(score) ≤ 0.35 | Low risk | 0.33297387 | Low risk |
| 0.35 < R(score) ≤ 0.70 | Medium risk | | |
| 0.70 < R(score) ≤ 1 | High risk | | |

Table 10 Test results
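The comparison against the classification ranges amounts to a simple threshold lookup, sketched below. The function name `classify_risk` is hypothetical, and rejecting scores outside the interval from 0 to 1 is an assumption, since the source does not say how such values should be handled.

```python
def classify_risk(score):
    """Map a score R to the risk classes defined in Table 10."""
    if not 0.0 <= score <= 1.0:
        # Assumption: values outside [0, 1] are treated as invalid input.
        raise ValueError("score must lie between 0 and 1")
    if score <= 0.35:
        return "Low risk"
    if score <= 0.70:
        return "Medium risk"
    return "High risk"

print(classify_risk(0.33297387))
```

Both boundaries are inclusive on the lower class, so a score of exactly 0.35 is rated Low risk and a score of exactly 0.70 is rated Medium risk, as the ranges in the table specify.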

Discussion

From the above result it can be seen that, as expected, the classification obtained was Low Risk. This rating indicates to the credit officer that, based on the data provided, the applicant has a low probability of incurring payment problems. Although the result is as expected, the results provided by the system should not be considered absolute, but rather a support tool for decision making in risk assessment. The credit risk assessment system has shown promise in identifying and classifying different levels of risk associated with credit applicants. However, credit risk assessment is a complex process involving multiple variables and external factors that can influence the financial solvency of individuals or companies.

Conclusion

The credit risk assessment system based on multilayer neural networks demonstrated acceptable efficiency in identifying and classifying the risks associated with credit applicants, and the results obtained support the usefulness of this approach as a support tool in the decision-making process. Using multilayer neural networks with the ReLU and Softmax activation functions, the system achieved an accurate representation and prediction of credit risk, which demonstrates the efficiency of the model in cases where the volume of data and the processing capacity are limited. For larger or more complex data volumes, it would be necessary to combine different prediction models that can support that level of processing. In addition, modules were created for data entry, validation, and processing, which allowed the proposal to represent a correct flow of information for the evaluation. It is recommended to consider further modules that improve the collection, flow, or functionality of the system.

On the other hand, there are areas for improvement that could further enhance the effectiveness of the system. First, it is recommended to consider a model that includes a larger number of variables relevant to risk assessment. By expanding the set of variables considered, the system can capture a more complete and accurate picture of the financial situation of loan applicants. In addition, it is suggested to improve and expand the training datasets used in the system. A more complete and up-to-date dataset can increase the system's ability to recognize relevant patterns and improve the accuracy of predictions. Attention should also be paid to the quality of the data, and a thorough cleaning and validation process should be performed. Another aspect to consider in future research is the hardware used for data processing. Suitable hardware with greater processing capacity would allow the system to run faster and more efficiently, which could lead to faster and even more accurate results. Such a change would require a new study and analysis to choose the prediction models.

Acknowledgments

This research is not financially supported by any public or private institution.

Conflicts of interest

The authors declare that there is no conflict of interest.

References

  1. Brkić S, Hodžić M, Džanić E. Soft-hard data fusion using uncertainty balance principle-corporate credit risk in commercial banking. Periodicals of Engineering and Natural Sciences. 2019;7(3):1138–1151.
  2. Çallı BA, Coşkun E. A longitudinal systematic review of credit risk assessment and credit default predictors. SAGE Open. 2021;11(4).
  3. Dattachaudhuri A, Biswas S, Sarkar S, et al. Transparent decision support system for credit risk evaluation: An automated credit approval system. Proceedings of 2020 IEEE-HYDCON International Conference on Engineering in the 4th Industrial Revolution, HYDCON 2020. 2020.
  4. Ferreira FAF, Meidute-Kavaliauskiene I, Zavadskas EK, et al. A judgment-based risk assessment framework for consumer loans. International Journal of Information Technology and Decision Making. 2019;18(1):147–170.
  5. Gao L, Xiao J. Big data credit report in credit risk management of consumer finance. Wireless Communications and Mobile Computing. 2021.
  6. Golbayani P, Florescu I, Chatterjee R. A comparative study of forecasting corporate credit ratings using neural networks, support vector machines, and decision trees. North American Journal of Economics and Finance. 2020;54.
  7. Kvamme H, Sellereite N, Aas K, et al. Predicting mortgage default using convolutional neural networks. Expert Systems with Applications. 2018;102:207–217.
  8. Liu CY, Dong TY, Meng LX. The prevention of financial legal risks of B2B E-commerce supply chain. Wireless Communications and Mobile Computing. 2022.
  9. Liu W, Fan H, Xia M. Step wise multi grained augmented gradient boosting decision trees for credit scoring. Engineering Applications of Artificial Intelligence. 2021;97:104036.
  10. Mhlanga D. Financial inclusion in emerging economies: The application of machine learning and artificial intelligence in credit risk assessment. International Journal of Financial Studies. 2021;9(3).
  11. Olivera CD, Manuel L, Computacional I, et al. Software development models. Cuban Magazine of Computer Sciences. 2021;15(1):37–51.
  12. Pławiak P, Abdar M, Rajendra Acharya U. Application of new deep genetic cascade ensemble of SVM classifiers to predict the Australian credit scoring. Applied Soft Computing Journal. 2019;84:105740.
  13. Shen F, Zhao X, Kou G, et al. A new deep learning ensemble credit risk evaluation model with an improved synthetic minority oversampling technique. Applied Soft Computing. 2021;98:106852.
  14. Wang F, Ding L, Yu H, et al. Big data analytics on enterprise credit risk evaluation of e-Business platform. Information Systems and e-Business Management. 2020;18(3):350.
  15. Wen S, Zeng B, Liao W, et al. Research and design of credit risk assessment system based on big data and machine learning. 2021 IEEE 6th International Conference on Big Data Analytics. ICBDA 2021. 2021;9–13.
  16. Wu F, Su X, Ock YS, et al. Personal credit risk evaluation model of P2P online lending based on AHP. Symmetry. 2021;13(1):1–20.
  17. Xi Y, Li Q. Improved AHP model and neural network for consumer finance credit risk assessment. Advances in Multimedia. 2022.
  18. Yangyudongnanxin G. Financial credit risk control strategy based on weighted random forest algorithm. Scientific Programming. 2021.
©2024 Espinoza, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.