To balance the trade-off between the drop in revenue and the reduction in cost, an optimization problem needs to be solved by adjusting the classification threshold and searching for the optimum.

Then, using the layout of the confusion matrix plotted in Figure 6, the four regions are divided as True Positive (TP), False Positive (FP), False Negative (FN), and True Negative (TN), where “Settled” is defined as positive and “Past Due” is defined as negative. Aligned with the confusion matrices plotted in Figure 5, TP is the good loans hit, and FP is the defaults missed. We are most interested in these two regions. To normalize the values, two commonly used mathematical terms are defined: True Positive Rate (TPR) and False Positive Rate (FPR). Their equations are shown below:

TPR = TP / (TP + FN)
FPR = FP / (FP + TN)

In this application, TPR is the hit rate of good loans, and it represents the capability of making money from loan interest; FPR is the missing rate of defaults, and it represents the chance of losing money.
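As a rough illustration, the two rates can be computed directly from true and predicted labels. This is a minimal sketch, assuming “Settled” is encoded as 1 and “Past Due” as 0; the function name `tpr_fpr` and the toy labels are hypothetical and not taken from the original analysis.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels where 1 = Settled and 0 = Past Due."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # good loans hit
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # good loans rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # defaults missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # defaults caught
    # Guard against an empty class to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy data (assumption): 4 settled loans and 4 past-due loans.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

On this toy data, three of the four good loans are hit and one of the four defaults is missed.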

The Receiver Operating Characteristic (ROC) curve is the most commonly used plot to visualize the performance of a classification model at all thresholds. In Figure 7 left, the ROC curve of the Random Forest model is plotted. This plot essentially shows the relationship between TPR and FPR, where one always moves in the same direction as the other, from 0 to 1. A good classification model always has its ROC curve above the red baseline, which corresponds to the “random classifier”. The Area Under Curve (AUC) is also a metric for evaluating the classification model besides accuracy. The AUC of the Random Forest model is 0.82 out of 1, which is decent.
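To make the construction of the ROC curve concrete, the sketch below sweeps the threshold over the predicted scores and integrates the resulting points with the trapezoidal rule. It is only an illustration: the helper names `roc_points` and `auc` and the toy scores are assumptions, and in practice a library routine such as scikit-learn's `roc_curve` and `roc_auc_score` would normally be used.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct score used as threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; predict Settled when score >= threshold.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve through the given points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (assumption): 1 = Settled, 0 = Past Due; scores = P(Settled).
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.3, 0.1]
points = roc_points(y_true, scores)
roc_auc = auc(points)
print(f"AUC = {roc_auc:.4f}")
```

Both TPR and FPR can only grow as the threshold is lowered, which is why the curve runs monotonically from (0, 0) to (1, 1).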

Although the ROC curve clearly shows the relationship between TPR and FPR, the threshold is an implicit variable. The optimization task cannot be done purely with the ROC curve. Therefore, another dimension is introduced to include the threshold variable, as plotted in Figure 7 right. Since the orange TPR curve represents the capability of making money and the FPR curve represents the chance of losing it, the intuition is to find the threshold that widens the gap between the two curves as much as possible. In this case, the sweet spot is around 0.7.
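The gap-widening intuition can be sketched as a simple sweep over candidate thresholds, keeping the one with the largest TPR minus FPR. The helper names, the threshold grid, and the toy data are hypothetical, and the result on this toy data is unrelated to the 0.7 read from Figure 7.

```python
def rates_at(y_true, scores, thr):
    """Return (TPR, FPR) when a loan is predicted Settled if score >= thr."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
    return tp / pos, fp / neg


def best_gap_threshold(y_true, scores, thresholds):
    """Return (threshold, gap) maximizing TPR - FPR; first winner on ties."""
    best_thr, best_gap = None, float("-inf")
    for thr in thresholds:
        tpr, fpr = rates_at(y_true, scores, thr)
        gap = tpr - fpr
        if gap > best_gap:
            best_thr, best_gap = thr, gap
    return best_thr, best_gap


# Toy data (assumption): 1 = Settled, 0 = Past Due; scores = P(Settled).
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.3, 0.1]
thresholds = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
best_thr, best_gap = best_gap_threshold(y_true, scores, thresholds)
print(f"best threshold = {best_thr:.2f}, gap = {best_gap:.2f}")
```

Several thresholds can tie for the largest gap, so on real data it is worth inspecting the whole gap curve and not only the single maximizer.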

There are limitations to this approach, however: FPR and TPR are ratios. Even though they are good at visualizing the impact of the classification threshold on the predictions, we still cannot infer the exact profit that different thresholds lead to. In addition, the FPR, TPR vs. Threshold approach assumes that all loans are equal (loan amount, interest due, etc.), but they actually are not. People who default on loans might have a higher loan amount and more interest to be paid back, which adds uncertainty to the modeling results.

Luckily, the detailed loan amount and interest due are available from the dataset itself.

The only thing remaining is to find a way to connect them with the threshold and the model predictions. It is not difficult to define an expression for profit. Assuming that the revenue comes solely from the interest collected on settled loans and that the cost comes solely from the total loan amount that customers default on, these two terms can be calculated using 5 known variables, as shown below in Table 2.
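A minimal sketch of such a profit expression is given here under the assumptions just stated: a loan is approved when its predicted probability of being settled reaches the threshold, revenue is the interest on approved loans that are settled, and cost is the loan amount of approved loans that default. The field names (`amount`, `interest`, `settled`, `score`) and the toy numbers are hypothetical and are not the variables listed in Table 2.

```python
def profit_at(loans, thr):
    """Profit = interest on approved settled loans - principal of approved defaults."""
    approved = [loan for loan in loans if loan["score"] >= thr]
    revenue = sum(loan["interest"] for loan in approved if loan["settled"])
    cost = sum(loan["amount"] for loan in approved if not loan["settled"])
    return revenue - cost


def best_profit_threshold(loans, thresholds):
    """Return the first threshold in the grid that maximizes profit."""
    return max(thresholds, key=lambda thr: profit_at(loans, thr))


# Toy loan book (assumption): score = predicted probability of being settled.
loans = [
    {"amount": 500, "interest": 150, "settled": True, "score": 0.9},
    {"amount": 300, "interest": 90, "settled": True, "score": 0.8},
    {"amount": 1000, "interest": 300, "settled": False, "score": 0.7},
    {"amount": 400, "interest": 120, "settled": True, "score": 0.6},
    {"amount": 200, "interest": 60, "settled": False, "score": 0.4},
]
thresholds = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
best_thr = best_profit_threshold(loans, thresholds)
best_profit = profit_at(loans, best_thr)
print(f"best threshold = {best_thr:.2f}, profit = {best_profit}")
```

Unlike the ratio-based view, this weights every loan by its own amount and interest, so a single large default can outweigh several correctly approved small loans.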