P2P Loan acceptance and default prediction with Artificial Intelligence

Jeremy D. Turiel; Tomaso Aste

arxiv: 1907.01800 · v1 · pith:M53YJEYHnew · submitted 2019-07-03 · 💱 q-fin.RM · q-fin.GN

Pith reviewed 2026-05-25 09:56 UTC · model grok-4.3

keywords: P2P lending, loan default prediction, loan acceptance, machine learning, deep neural networks, logistic regression, credit risk

The pith

A two-phase AI system using logistic regression and deep neural networks can reduce default risk on P2P loans by up to 70 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops and tests machine learning models on lending data to first predict which loan applications will be rejected and then estimate default probability for those that are approved. Logistic regression performs best on the acceptance stage, with a test-set macro recall of 77.4 percent, while deep neural networks lead on default prediction, with a validation-set recall of 72 percent on defaults. If the test-set performance carries over to new applications, the combined approach would allow lenders to issue fewer loans that later default. A separate analysis of small-business loans shows that the two phases benefit from different training data: the acceptance phase does better when trained on the whole dataset, the default phase when trained on the small-business subset. This points to a possible mismatch between current screening and optimal default analysis.

Core claim

The authors propose a two-phase model in which logistic regression predicts loan rejection with 77.4 percent test-set macro recall and deep neural networks predict default with 72 percent validation-set recall. From these figures they conclude that such AI techniques can reduce the default risk of issued loans by up to 70 percent. When the models are applied to small-business loans alone, the first phase performs better when trained on the whole dataset and the second when trained on the small-business subset, suggesting a discrepancy between how these loans are screened and how their defaults should be analysed.

What carries the argument

The two-phase model that separates prediction of loan acceptance from default risk assessment on approved loans.
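
A minimal sketch of how the two phases could be chained, assuming each phase is exposed as a function that returns a probability. The function names, feature names, thresholds, and toy scoring rules are hypothetical illustrations, not the paper's fitted classifiers.

```python
from typing import Callable, Dict, List

Application = Dict[str, float]
Scorer = Callable[[Application], float]


def two_phase_decision(
    app: Application,
    p_reject: Scorer,
    p_default: Scorer,
    reject_threshold: float = 0.5,
    default_threshold: float = 0.5,
) -> str:
    """Phase 1 screens the application; phase 2 scores default risk only if it passes."""
    if p_reject(app) >= reject_threshold:
        return "rejected"
    if p_default(app) >= default_threshold:
        return "flagged"
    return "issued"


# Hypothetical stand-ins for the two fitted models (assumptions for illustration).
def toy_p_reject(app: Application) -> float:
    return 0.9 if app["debt_to_income"] > 0.4 else 0.1


def toy_p_default(app: Application) -> float:
    return 0.8 if app["loan_amount"] > 20_000 else 0.2


applications: List[Application] = [
    {"debt_to_income": 0.5, "loan_amount": 5_000},
    {"debt_to_income": 0.2, "loan_amount": 30_000},
    {"debt_to_income": 0.2, "loan_amount": 5_000},
]
decisions = [two_phase_decision(a, toy_p_reject, toy_p_default) for a in applications]
print(decisions)
```

Keeping the two scorers as separate callables mirrors the separation the paper relies on: each phase can be trained on a different dataset without changing the decision logic.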

If this is right

Logistic regression outperforms other tested methods for predicting loan rejection.
Deep neural networks outperform other tested methods for predicting default on approved loans.
For small-business loans, the acceptance phase performs better when trained on the full dataset, while the default phase performs better when trained only on the small-business subset.
Current screening of small-business loans may differ from the analysis that would best predict their defaults.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Lenders might run separate models for different loan categories rather than a single general model.
The approach could be tested on live application streams to measure actual default reduction beyond historical data.
Platforms could adjust approval thresholds based on the combined phase outputs to target a desired risk level.
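
One way a platform might pick such a threshold is sketched below, assuming a set of historical loans with a predicted default score and a known outcome. The function name, the scores, and the 25 percent target are made up for illustration, and the sketch uses a single default score rather than a combination of both phase outputs.

```python
from typing import List, Optional


def pick_cutoff(scores: List[float], defaulted: List[int], target_rate: float) -> Optional[float]:
    """Return the largest cutoff c such that loans with score < c
    default at a rate no higher than target_rate (None if no cutoff works)."""
    best = None
    # Candidate cutoffs: every distinct score, plus "approve everything".
    for cutoff in sorted(set(scores)) + [float("inf")]:
        kept = [d for s, d in zip(scores, defaulted) if s < cutoff]
        if kept and sum(kept) / len(kept) <= target_rate:
            best = cutoff
    return best


# Hypothetical scored loans: 1 means the loan defaulted.
scores = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
defaulted = [0, 0, 0, 1, 0, 1]

cutoff = pick_cutoff(scores, defaulted, target_rate=0.25)
approved = [s for s in scores if cutoff is not None and s < cutoff]
print(cutoff, len(approved))
```

A cutoff chosen this way is only as good as the historical sample it was tuned on, so it would need re-checking on later loans.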

Load-bearing premise

Test-set recall scores can be converted directly into realized reductions in default rates when the models are applied to new loan applications.

What would settle it

A side-by-side comparison of default rates on new loans screened with versus without the two-phase model over a multi-year period.
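
A minimal sketch of how the two default rates could be compared, assuming a standard pooled two-proportion z-test is an acceptable summary. The loan and default counts are invented for illustration and do not come from the paper.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    defaults_a: int, loans_a: int, defaults_b: int, loans_b: int
) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    rate_a = defaults_a / loans_a
    rate_b = defaults_b / loans_b
    pooled = (defaults_a + defaults_b) / (loans_a + loans_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / loans_a + 1 / loans_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: arm A screened without the model, arm B with it.
z, p_value = two_proportion_z_test(defaults_a=150, loans_a=1000, defaults_b=100, loans_b=1000)
print(round(z, 2), p_value)
```

The test only addresses sampling noise. Unless loans are assigned to the two arms at random, differences in borrower mix could still explain a gap in default rates.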

Figures

Figures reproduced from arXiv: 1907.01800 by Jeremy D. Turiel, Tomaso Aste.

**Figure 1.** Time series plots of the dataset [11]. Three plots are presented: the number of defaulted loans as a fraction of the total number of accepted loans (blue), the number of rejected loans as a fraction of the total number of loans requested (green) and the total number of requested loans (red). The black lines represent the raw time series, with statistics (fractions and total number) computed per calendar mo…

**Figure 2.** Neural network representation with node size and colour representing total outgoing weight and edge width

Original abstract

Logistic Regression and Support Vector Machine algorithms, together with Linear and Non-Linear Deep Neural Networks, are applied to lending data in order to replicate lender acceptance of loans and predict the likelihood of default of issued loans. A two-phase model is proposed; the first phase predicts loan rejection, while the second one predicts default risk for approved loans. Logistic Regression was found to be the best performer for the first phase, with test set recall macro score of $77.4 \%$. Deep Neural Networks were applied to the second phase only, where they achieved best performance, with validation set recall score of $72 \%$, for defaults. This shows that AI can improve current credit risk models reducing the default risk of issued loans by as much as $70 \%$. The models were also applied to loans taken for small businesses alone. The first phase of the model performs significantly better when trained on the whole dataset. Instead, the second phase performs significantly better when trained on the small business subset. This suggests a potential discrepancy between how these loans are screened and how they should be analysed in terms of default prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper runs standard classifiers on P2P data in a two-phase setup and notes a difference for small-business loans, but the 70% default-risk reduction is not derived from the reported recall numbers.

The main takeaway is that this is a straightforward application of logistic regression, SVM, and neural nets to a lending dataset, split into an acceptance stage and a default stage. The 70% reduction claim does not follow from the 72% or 77.4% recall figures they give, because recall alone does not tell you the change in realized default rate on the accepted loans without the base rate, precision, and a portfolio simulation. None of that appears in the abstract or the stress-test note, so the headline result stays unsupported on the evidence provided here.
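
The dependence on base rate and false positives can be made concrete with a short calculation. This is a sketch under stated assumptions: the 20 percent base default rate and both false-positive rates are hypothetical, and only the 72 percent recall comes from the paper.

```python
def post_filter_default_rate(base_rate: float, recall: float, false_positive_rate: float) -> float:
    """Default rate among loans that remain after the model's flagged loans are dropped."""
    defaulters_kept = base_rate * (1 - recall)
    good_loans_kept = (1 - base_rate) * (1 - false_positive_rate)
    return defaulters_kept / (defaulters_kept + good_loans_kept)


def relative_reduction(base_rate: float, recall: float, false_positive_rate: float) -> float:
    """Fractional drop in default rate relative to issuing every loan."""
    return 1 - post_filter_default_rate(base_rate, recall, false_positive_rate) / base_rate


BASE_RATE = 0.20   # hypothetical share of issued loans that default
RECALL = 0.72      # default recall reported in the abstract

# Same recall, two hypothetical false-positive rates.
reduction_low_fpr = relative_reduction(BASE_RATE, RECALL, false_positive_rate=0.10)
reduction_high_fpr = relative_reduction(BASE_RATE, RECALL, false_positive_rate=0.50)
print(round(reduction_low_fpr, 3), round(reduction_high_fpr, 3))
```

With recall held fixed, the two false-positive rates give clearly different reductions, which is why a single recall figure cannot pin down a 70 percent claim.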

Referee Report

2 major / 1 minor

Summary. The manuscript applies logistic regression, SVM, and deep neural networks to P2P lending data in a two-phase framework: phase 1 predicts loan rejection (best: logistic regression, 77.4% test-set macro recall) and phase 2 predicts default among accepted loans (best: DNN, 72% validation recall on defaults). It further examines performance on the small-business subset and asserts that the approach can reduce realized default risk on issued loans by as much as 70%.
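
For readers unfamiliar with the metric, macro recall is the unweighted mean of per-class recall, which is what scikit-learn computes with `recall_score(..., average="macro")`. The sketch below computes it by hand on invented labels; the label vectors are illustrative assumptions only.

```python
from typing import Dict, Hashable, List


def recall_per_class(y_true: List[Hashable], y_pred: List[Hashable]) -> Dict[Hashable, float]:
    """Recall for each class that occurs in y_true."""
    result = {}
    for cls in set(y_true):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        result[cls] = hits / total
    return result


def macro_recall(y_true: List[Hashable], y_pred: List[Hashable]) -> float:
    """Unweighted mean of per-class recall, so rare classes count as much as common ones."""
    per_class = recall_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Hypothetical labels: 1 = rejected, 0 = accepted.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

per_class = recall_per_class(y_true, y_pred)
score = macro_recall(y_true, y_pred)
print(per_class, score)
```

The phase 2 figure is a recall on the default class alone, so the 77.4 percent and 72 percent numbers are not directly comparable.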

Significance. A rigorously derived and externally validated demonstration that a two-phase model materially lowers portfolio default rates would be of direct interest to credit-risk practitioners. The manuscript supplies no such derivation or validation; the headline 70% figure therefore cannot be assessed as a contribution.

major comments (2)

[Abstract] Abstract: the assertion that the reported recall figures imply a 70% reduction in default risk among issued loans is unsupported. No equation, table, or supplementary calculation converts the 77.4% and 72% recall values into a change in realized default rate; such a conversion requires baseline prevalence, operating-point precision, and an explicit simulation of the loans filtered by the two-phase rule, none of which appear.
[Model evaluation / results] Model-evaluation sections: performance is reported on a single held-out split drawn from the same dataset used for training and hyper-parameter selection. Because the 70% claim rests on these in-sample metrics being treated as proxies for out-of-sample default reduction, the absence of out-of-time or external validation directly undermines the central empirical claim.
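
An out-of-time split of the kind the second comment asks for can be sketched as follows. The record layout, the `issued` field name, and the cutoff date are hypothetical and not taken from the paper or the dataset schema.

```python
from datetime import date
from typing import Dict, List, Tuple

Loan = Dict[str, object]


def out_of_time_split(loans: List[Loan], cutoff: date) -> Tuple[List[Loan], List[Loan]]:
    """Train on loans issued before the cutoff, evaluate on loans issued on or after it."""
    train = [loan for loan in loans if loan["issued"] < cutoff]
    test = [loan for loan in loans if loan["issued"] >= cutoff]
    return train, test


# Hypothetical loan records.
loans: List[Loan] = [
    {"issued": date(2015, 3, 1), "defaulted": 0},
    {"issued": date(2015, 9, 1), "defaulted": 1},
    {"issued": date(2016, 2, 1), "defaulted": 0},
    {"issued": date(2017, 1, 1), "defaulted": 1},
    {"issued": date(2017, 6, 1), "defaulted": 0},
]

train, test = out_of_time_split(loans, cutoff=date(2017, 1, 1))
print(len(train), len(test))
```

Unlike a random split, this keeps every evaluation loan later than every training loan, so the reported metric reflects performance on a period the model has not seen.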

minor comments (1)

[Small-business subset] The small-business analysis is presented without a pre-specified analysis plan; if retained, it should be clearly labeled as exploratory.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will make revisions where the concerns are valid.

Referee: [Abstract] Abstract: the assertion that the reported recall figures imply a 70% reduction in default risk among issued loans is unsupported. No equation, table, or supplementary calculation converts the 77.4% and 72% recall values into a change in realized default rate; such a conversion requires baseline prevalence, operating-point precision, and an explicit simulation of the loans filtered by the two-phase rule, none of which appear.

Authors: We agree that the manuscript provides no explicit derivation, equation, or simulation converting the recall values into a quantified reduction in realized default rate. The 70% figure was an informal illustrative statement rather than a rigorously computed result. In revision we will remove the claim from the abstract and main text.
revision: yes

Referee: [Model evaluation / results] Model-evaluation sections: performance is reported on a single held-out split drawn from the same dataset used for training and hyper-parameter selection. Because the 70% claim rests on these in-sample metrics being treated as proxies for out-of-sample default reduction, the absence of out-of-time or external validation directly undermines the central empirical claim.

Authors: The evaluation uses a single random held-out split. This is a genuine limitation for any claim about realized default reduction on future loans. We will revise the manuscript to explicitly note this limitation, moderate the language around out-of-sample performance, and remove reliance on the unsupported 70% figure.
revision: partial

Circularity Check

0 steps flagged

No circularity; performance metrics reported on held-out sets without reduction to inputs by construction.

The paper trains standard classifiers (LR, SVM, DNN) on lending data, reports recall on test/validation splits, and asserts a 70% default-risk reduction. No equation, parameter fit, or self-citation chain equates the reported recall values to the 70% figure; the reduction claim is an unsupported interpretive statement rather than a derived result. The two-phase modeling pipeline uses independent training and evaluation splits, satisfying the self-contained benchmark criterion. No self-definitional, fitted-input-as-prediction, or uniqueness-imported patterns appear in the derivation.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that fitted model performance on historical data will translate to future risk reduction, plus the unstated mapping from recall to the 70% figure.

free parameters (2)

neural network architecture and hyperparameters
Depth, width, learning rate, and regularization chosen to maximize validation recall on the given data.
train/test split and feature preprocessing
Choices that affect the reported 77.4% and 72% recall scores.

axioms (2)

domain assumption: The historical P2P loan records are representative of future loan applications.
Required for any generalization from test-set recall to deployed risk reduction.
ad hoc to paper: Recall on the held-out set is a sufficient proxy for reduction in realized defaults.
Used to convert the 72% default recall into the headline 70% risk-reduction claim.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

[1] Deloitte Reports. Beyond fintech: Disruptive innovation in lending. 2017
[2] Kate Beioley. Fca sets out crackdown on peer-to-peer lending. Financial Times, 2018
[3] Financial Conduct Authority. Cp18/20: Loan-based (’peer-to-peer’) and investment-based crowdfunding platforms: Feedback on our post-implementation review and proposed changes to the regulatory framework. 2018
[4] Nilas Möllenkamp. Determinants of loan performance in p2p lending. B.S. thesis, University of Twente, 2017
[5] Riza Emekter, Yanbin Tu, Benjamas Jirasakuldech, and Min Lu. Evaluating credit risk and loan performance in online peer-to-peer (p2p) lending. Applied Economics, 47(1):54–70, 2015
[6] Carlos Eduardo Canfield. Determinants of default in p2p lending: the mexican case. Independent Journal of Management & Production, 9(1):1–24, 2018
[7] Mingfeng Lin, Nagpurnanand R Prabhala, and Siva Viswanathan. Judging borrowers by the company they keep: Friendship networks and information asymmetry in online peer-to-peer lending. Management Science, 59(1):17–35, 2013
[8] Joseph E Stiglitz and Andrew Weiss. Credit rationing in markets with imperfect information. The American economic review, 71(3):393–410, 1981
[9] Chris Anderson. The long tail: Why the future of business is selling less of more. Hachette Books, 2006
[10] Carlos Serrano-Cinca and Begoña Gutiérrez-Nieto. Microfinance, the long tail and mission drift. International Business Review, 23(1):181–194, 2014
[11] Nathan George. All Lending Club loan data version 6, february 2018. https://www.kaggle.com/wordsforthewise/lending-club. Accessed: 2018-10-1
[12] Carlos Serrano-Cinca, Begoña Gutiérrez-Nieto, and Luz López-Palacios. Determinants of default in p2p lending. PloS one, 10(10):e0139427, 2015
[13] David W Hosmer Jr, Stanley Lemeshow, and Rodney X Sturdivant. Applied logistic regression, volume 398. John Wiley & Sons, 2013
[14] Marti A. Hearst, Susan T Dumais, Edgar Osuna, John Platt, and Bernhard Scholkopf. Support vector machines. IEEE Intelligent Systems and their applications, 13(4):18–28, 1998
[15] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015
[16] Andrew Y Ng. Feature selection, l1 vs. l2 regularization, and rotational invariance. In Proceedings of the twenty-first international conference on Machine learning, page 78. ACM, 2004
[17] Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the royal statistical society: series B (statistical methodology), 67(2):301–320, 2005
[18] Andrew P Bradley. The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern recognition, 30(7):1145–1159, 1997
[19] David Martin Powers. Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation. 2011
[20] M. Tumminello, T. Aste, T. Di Matteo, and R. N. Mantegna. A tool for filtering information in complex systems. Proceedings of the National Academy of Sciences of the United States of America, 102(30):10421–10426, 2005
[21] Riccardo Marcaccioli and Giacomo Livan. A pólya urn approach to information filtering in complex networks. Nature communications, 10(1):745, 2019
[22] Guido Previde Massara, Tiziana Di Matteo, and Tomaso Aste. Network filtering for big data: Triangulated maximally filtered graph. Journal of complex Networks, 5(2):161–178, 2016
[23] Rosario N Mantegna. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems, 11(1):193–197, 1999
