P2P Loan acceptance and default prediction with Artificial Intelligence
Pith reviewed 2026-05-25 09:56 UTC · model grok-4.3
The pith
A two-phase AI system using logistic regression and deep neural networks can reduce default risk on P2P loans by up to 70 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
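The authors propose a two-phase model in which logistic regression predicts loan rejection with 77.4 percent macro recall on the test set and deep neural networks predict default with 72 percent recall on the validation set, and they claim that such AI techniques can reduce the default risk of issued loans by up to 70 percent. When the models are applied to small-business loans alone, the first phase performs better when trained on the whole dataset while the second performs better when trained on the small-business subset, suggesting a discrepancy between how these loans are screened and how their default risk should be analysed.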
What carries the argument
The two-phase model that separates prediction of loan acceptance from default risk assessment on approved loans.
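The control flow of such a two-phase screen can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `TwoPhaseScreen`, the 0.5 thresholds, the feature names `dti` and `fico`, and the toy scorers are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Features = Mapping[str, float]
Scorer = Callable[[Features], float]  # returns a probability in [0, 1]


@dataclass
class TwoPhaseScreen:
    """Hypothetical wrapper around two already-trained scorers."""
    p_reject: Scorer                # phase 1: probability the lender would reject
    p_default: Scorer               # phase 2: probability an accepted loan defaults
    reject_threshold: float = 0.5   # illustrative operating points,
    default_threshold: float = 0.5  # not values taken from the paper

    def decide(self, x: Features) -> str:
        # Phase 1: replicate the lender's accept/reject screening.
        if self.p_reject(x) >= self.reject_threshold:
            return "rejected"
        # Phase 2: only loans that pass phase 1 are scored for default risk.
        if self.p_default(x) >= self.default_threshold:
            return "flagged_default_risk"
        return "approved"


# Toy stand-in scorers (assumptions for illustration only).
screen = TwoPhaseScreen(
    p_reject=lambda x: 0.9 if x["dti"] > 40 else 0.1,
    p_default=lambda x: 0.8 if x["fico"] < 660 else 0.2,
)
```

In practice the two scorers would be the fitted logistic regression and neural network; keeping them behind a common callable type makes it easy to swap in models trained on different subsets for each phase.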
If this is right
- Logistic regression outperforms other tested methods for predicting loan rejection.
- Deep neural networks outperform other tested methods for predicting default on approved loans.
- The acceptance phase improves when trained on the full dataset while the default phase improves when trained only on small-business loans.
- Current screening of small-business loans may differ from the analysis that would best predict their defaults.
Where Pith is reading between the lines
- Lenders might run separate models for different loan categories rather than a single general model.
- The approach could be tested on live application streams to measure actual default reduction beyond historical data.
- Platforms could adjust approval thresholds based on the combined phase outputs to target a desired risk level.
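The threshold adjustment in the last point could look roughly like the sketch below. It assumes a history of loans with a predicted default probability and an observed outcome; the function name `largest_approval_cutoff` and the data layout are hypothetical, and the paper describes no such procedure.

```python
from typing import Optional, Sequence, Tuple


def largest_approval_cutoff(scored: Sequence[Tuple[float, bool]],
                            target_default_rate: float) -> Optional[float]:
    """Find the score cutoff that approves the most loans while keeping the
    observed default rate among approved loans at or below the target.

    scored: (predicted default probability, actually defaulted) pairs.
    Loans with score <= cutoff count as approved.
    Returns None when no cutoff meets the target.
    """
    ranked = sorted(scored, key=lambda pair: pair[0])
    defaults = 0
    best_cutoff = None
    for i, (score, defaulted) in enumerate(ranked):
        defaults += int(defaulted)
        # Evaluate only at the end of a run of tied scores, since a cutoff
        # at this score would approve every loan in the tie.
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_tie and defaults / (i + 1) <= target_default_rate:
            best_cutoff = score
    return best_cutoff


history = [(0.05, False), (0.10, False), (0.20, True),
           (0.30, False), (0.60, True), (0.90, True)]
cutoff = largest_approval_cutoff(history, target_default_rate=0.25)
```

A cutoff chosen this way is tuned to historical outcomes, so it inherits the same representativeness assumption as the models themselves.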
Load-bearing premise
Test-set recall scores can be converted directly into realized reductions in default rates when the models are applied to new loan applications.
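A short worked calculation shows why this premise is fragile: recall alone does not fix the default rate among the loans that remain after screening. The sketch below assumes a single default classifier applied to issued loans; the baseline prevalence of 20 percent and the false-positive rates are illustrative assumptions, not figures from the paper.

```python
def default_rate_after_screening(prevalence: float, recall: float,
                                 false_positive_rate: float) -> float:
    """Default rate among loans the classifier lets through.

    prevalence: baseline share of defaulting loans among issued loans
    recall: share of true defaults the classifier flags
    false_positive_rate: share of non-defaulting loans it wrongly flags
    """
    missed_defaults = prevalence * (1.0 - recall)
    kept_good_loans = (1.0 - prevalence) * (1.0 - false_positive_rate)
    return missed_defaults / (missed_defaults + kept_good_loans)


def relative_reduction(prevalence: float, recall: float,
                       false_positive_rate: float) -> float:
    """Relative drop in default rate compared with the unscreened baseline."""
    after = default_rate_after_screening(prevalence, recall, false_positive_rate)
    return 1.0 - after / prevalence


# Same 72% recall, two assumed false-positive rates.
low_fpr = relative_reduction(prevalence=0.20, recall=0.72, false_positive_rate=0.10)
high_fpr = relative_reduction(prevalence=0.20, recall=0.72, false_positive_rate=0.50)
```

Under these assumptions the same 72 percent recall gives a relative reduction of roughly 64 percent at a 10 percent false-positive rate but only about 39 percent at a 50 percent false-positive rate, so the size of the reduction cannot be read off recall alone.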
What would settle it
A side-by-side comparison of default rates on new loans screened with versus without the two-phase model over a multi-year period.
Original abstract
Logistic Regression and Support Vector Machine algorithms, together with Linear and Non-Linear Deep Neural Networks, are applied to lending data in order to replicate lender acceptance of loans and predict the likelihood of default of issued loans. A two-phase model is proposed; the first phase predicts loan rejection, while the second one predicts default risk for approved loans. Logistic Regression was found to be the best performer for the first phase, with a test set macro recall score of $77.4 \%$. Deep Neural Networks were applied to the second phase only, where they achieved the best performance, with a validation set recall score of $72 \%$ for defaults. This shows that AI can improve current credit risk models, reducing the default risk of issued loans by as much as $70 \%$. The models were also applied to loans taken for small businesses alone. The first phase of the model performs significantly better when trained on the whole dataset. Instead, the second phase performs significantly better when trained on the small business subset. This suggests a potential discrepancy between how these loans are screened and how they should be analysed in terms of default prediction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript applies logistic regression, SVM, and deep neural networks to P2P lending data in a two-phase framework: phase 1 predicts loan rejection (best: logistic regression, 77.4% test-set macro recall) and phase 2 predicts default among accepted loans (best: DNN, 72% validation recall on defaults). It further examines performance on the small-business subset and asserts that the approach can reduce realized default risk on issued loans by as much as 70%.
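The two headline figures are different metrics: macro recall averages the per-class recalls, whereas the 72% figure is recall on the default class alone. A minimal sketch of the distinction, using made-up labels rather than the paper's data:

```python
from typing import Hashable, Sequence


def recall_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                     label: Hashable) -> float:
    """Share of examples whose true class is `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)


def macro_recall(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class recall over the classes present in y_true."""
    labels = sorted(set(y_true))
    return sum(recall_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
positive_recall = recall_for_class(y_true, y_pred, 1)  # 3 of 4 positives found
overall_macro = macro_recall(y_true, y_pred)           # mean of 0.75 and 0.5
```

Because macro recall weights each class equally regardless of its frequency, it is not directly comparable to a single-class recall.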
Significance. A rigorously derived and externally validated demonstration that a two-phase model materially lowers portfolio default rates would be of direct interest to credit-risk practitioners. The manuscript supplies no such derivation or validation; the headline 70% figure therefore cannot be assessed as a contribution.
major comments (2)
- [Abstract] Abstract: the assertion that the reported recall figures imply a 70% reduction in default risk among issued loans is unsupported. No equation, table, or supplementary calculation converts the 77.4% and 72% recall values into a change in realized default rate; such a conversion requires baseline prevalence, operating-point precision, and an explicit simulation of the loans filtered by the two-phase rule, none of which appear.
- [Model evaluation / results] Model-evaluation sections: performance is reported on a single held-out split drawn from the same dataset used for training and hyper-parameter selection. Because the 70% claim rests on these in-sample metrics being treated as proxies for out-of-sample default reduction, the absence of out-of-time or external validation directly undermines the central empirical claim.
minor comments (1)
- [Small-business subset] The small-business analysis is presented without a pre-specified analysis plan; if retained, it should be clearly labeled as exploratory.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will make revisions where the concerns are valid.
- Referee: [Abstract] Abstract: the assertion that the reported recall figures imply a 70% reduction in default risk among issued loans is unsupported. No equation, table, or supplementary calculation converts the 77.4% and 72% recall values into a change in realized default rate; such a conversion requires baseline prevalence, operating-point precision, and an explicit simulation of the loans filtered by the two-phase rule, none of which appear.
  Authors: We agree that the manuscript provides no explicit derivation, equation, or simulation converting the recall values into a quantified reduction in realized default rate. The 70% figure was an informal illustrative statement rather than a rigorously computed result. In revision we will remove the claim from the abstract and main text.
  Revision: yes
- Referee: [Model evaluation / results] Model-evaluation sections: performance is reported on a single held-out split drawn from the same dataset used for training and hyper-parameter selection. Because the 70% claim rests on these in-sample metrics being treated as proxies for out-of-sample default reduction, the absence of out-of-time or external validation directly undermines the central empirical claim.
  Authors: The evaluation uses a single random held-out split. This is a genuine limitation for any claim about realized default reduction on future loans. We will revise the manuscript to explicitly note this limitation, moderate the language around out-of-sample performance, and remove reliance on the unsupported 70% figure.
  Revision: partial
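The out-of-time validation raised in the second point could be set up along the following lines. This is a sketch under assumptions: the `issue_date` field name and the cutoff date are hypothetical, and the paper reports no such split.

```python
from datetime import date
from typing import Any, List, Mapping, Sequence, Tuple

Loan = Mapping[str, Any]


def out_of_time_split(loans: Sequence[Loan],
                      cutoff: date) -> Tuple[List[Loan], List[Loan]]:
    """Train on loans issued before the cutoff, evaluate on loans issued after.

    Unlike a random split, the test loans are strictly later than the
    training loans, which mimics scoring future applications.
    """
    train = [loan for loan in loans if loan["issue_date"] < cutoff]
    test = [loan for loan in loans if loan["issue_date"] >= cutoff]
    return train, test


# Toy records with a hypothetical schema.
loans = [
    {"id": 1, "issue_date": date(2015, 3, 1), "defaulted": False},
    {"id": 2, "issue_date": date(2016, 7, 15), "defaulted": True},
    {"id": 3, "issue_date": date(2017, 1, 10), "defaulted": False},
]
train, test = out_of_time_split(loans, cutoff=date(2017, 1, 1))
```

One caveat with any such split is that loans issued shortly before the evaluation date may not yet have reached a final repaid or defaulted status, so the test window needs enough elapsed time for outcomes to be observed.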
Circularity Check
No circularity; performance metrics reported on held-out sets without reduction to inputs by construction.
The paper trains standard classifiers (LR, SVM, DNN) on lending data, reports recall on test/validation splits, and asserts a 70% default-risk reduction. No equation, parameter fit, or self-citation chain equates the reported recall values to the 70% figure; the reduction claim is an unsupported interpretive statement rather than a derived result. The two-phase modeling pipeline uses independent training and evaluation splits, satisfying the self-contained benchmark criterion. No self-definitional, fitted-input-as-prediction, or uniqueness-imported patterns appear in the derivation.
Axiom & Free-Parameter Ledger
free parameters (2)
- neural network architecture and hyperparameters
- train/test split and feature preprocessing
axioms (2)
- domain assumption: The historical P2P loan records are representative of future loan applications.
- ad hoc to paper: Recall on the held-out set is a sufficient proxy for reduction in realized defaults.