arxiv: 1907.01800 · v1 · pith:M53YJEYHnew · submitted 2019-07-03 · 💱 q-fin.RM · q-fin.GN

P2P Loan acceptance and default prediction with Artificial Intelligence

Pith reviewed 2026-05-25 09:56 UTC · model grok-4.3

classification 💱 q-fin.RM q-fin.GN
keywords P2P lending · loan default prediction · loan acceptance · machine learning · deep neural networks · logistic regression · credit risk

The pith

A two-phase AI system using logistic regression and deep neural networks can reduce default risk on P2P loans by up to 70 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops and tests machine learning models on lending data to first predict which loan applications will be rejected and then estimate default probability for those that are approved. Logistic regression performs best on the acceptance stage with 77.4 percent recall while deep neural networks lead on default prediction with 72 percent recall. If the test-set performance carries over, the combined approach would allow lenders to issue fewer loans that later default. Separate analysis of small-business loans shows the phases benefit from different training data, pointing to possible mismatches between current screening and optimal default analysis.

Core claim

The authors propose a two-phase model where logistic regression predicts loan rejection with 77.4 percent recall and deep neural networks predict default with 72 percent recall, demonstrating that such AI techniques can reduce default risk of issued loans by up to 70 percent. When applied to small business loans alone, the phases show different optimal training datasets, suggesting a discrepancy in screening practices.

What carries the argument

The two-phase model that separates prediction of loan acceptance from default risk assessment on approved loans.
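
The separation of the two phases can be sketched as a small decision rule. This is a minimal illustration only, not the authors' implementation: the class `TwoPhaseScreen`, the 0.5 thresholds, the feature names `dti` and `loan_amount`, and the toy scoring functions are all hypothetical stand-ins for the fitted logistic regression (phase 1) and deep neural network (phase 2).

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Features = Mapping[str, float]


@dataclass
class TwoPhaseScreen:
    # Phase 1: estimated probability that the lender rejects the application.
    p_reject: Callable[[Features], float]
    # Phase 2: estimated probability that an approved loan later defaults.
    p_default: Callable[[Features], float]
    reject_threshold: float = 0.5   # hypothetical operating point
    default_threshold: float = 0.5  # hypothetical operating point

    def decide(self, x: Features) -> str:
        # Phase 2 is only consulted for applications that pass phase 1.
        if self.p_reject(x) >= self.reject_threshold:
            return "rejected"
        if self.p_default(x) >= self.default_threshold:
            return "flagged_default_risk"
        return "issued"


# Toy scorers standing in for trained models (assumptions, not from the paper).
def toy_reject(x: Features) -> float:
    return 0.9 if x["dti"] > 40 else 0.1


def toy_default(x: Features) -> float:
    return 0.8 if x["loan_amount"] > 30000 else 0.2


screen = TwoPhaseScreen(p_reject=toy_reject, p_default=toy_default)
print(screen.decide({"dti": 20, "loan_amount": 5000}))
```

Keeping the two scorers as separate callables mirrors the paper's finding that each phase may be best served by a different model and a different training set.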

If this is right

  • Logistic regression outperforms other tested methods for predicting loan rejection.
  • Deep neural networks outperform other tested methods for predicting default on approved loans.
  • The acceptance phase improves when trained on the full dataset while the default phase improves when trained only on small-business loans.
  • Current screening of small-business loans may differ from the analysis that would best predict their defaults.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Lenders might run separate models for different loan categories rather than a single general model.
  • The approach could be tested on live application streams to measure actual default reduction beyond historical data.
  • Platforms could adjust approval thresholds based on the combined phase outputs to target a desired risk level.
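
The last extension above, choosing a threshold to hit a target risk level, might look like the following sketch. It assumes a list of historical loans that already carry a phase-2 default score and a known outcome; the function name `cutoff_for_target` and the example numbers are hypothetical, and a cutoff that meets the target on past data is not guaranteed to meet it on future applications.

```python
def cutoff_for_target(scores, defaulted, target_rate):
    """Highest score s such that issuing every loan with score <= s keeps the
    historical default rate at or below target_rate. Returns None if no
    cutoff satisfies the target."""
    ranked = sorted(zip(scores, defaulted))
    best = None
    issued = 0
    defaults = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Add every loan that shares this score before checking the rate.
        while i < len(ranked) and ranked[i][0] == score:
            issued += 1
            defaults += int(ranked[i][1])
            i += 1
        if defaults / issued <= target_rate:
            best = score
    return best


# Illustrative data only: five scored loans, two of which defaulted.
example_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
example_defaulted = [0, 0, 1, 0, 1]
print(cutoff_for_target(example_scores, example_defaulted, target_rate=0.25))
```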

Load-bearing premise

Test-set recall scores can be converted directly into realized reductions in default rates when the models are applied to new loan applications.
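
To see why this premise is load-bearing, consider how recall would have to be combined with other quantities to yield a default rate. The sketch below is a standard back-of-the-envelope calculation, not something taken from the paper: it assumes every loan the model flags is simply not issued, and the prevalence and false-positive-rate values are illustrative placeholders. Only the 72 percent recall comes from the source.

```python
def residual_default_rate(prevalence, recall, false_positive_rate):
    """Default rate among loans still issued after flagged loans are dropped."""
    missed_defaults = prevalence * (1.0 - recall)
    kept_good_loans = (1.0 - prevalence) * (1.0 - false_positive_rate)
    issued_share = missed_defaults + kept_good_loans
    if issued_share == 0.0:
        raise ValueError("no loans would be issued under these inputs")
    return missed_defaults / issued_share


def relative_reduction(prevalence, recall, false_positive_rate):
    """Fractional drop in default rate relative to the unfiltered baseline."""
    after = residual_default_rate(prevalence, recall, false_positive_rate)
    return 1.0 - after / prevalence


# Recall of 0.72 is from the paper; 0.20 prevalence and 0.30 false positive
# rate are made-up inputs chosen only to illustrate the arithmetic.
print(relative_reduction(prevalence=0.20, recall=0.72, false_positive_rate=0.30))
```

The same recall produces a different reduction for every choice of prevalence and false positive rate, which is why recall alone cannot pin down a single headline figure.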

What would settle it

A side-by-side comparison of default rates on new loans screened with versus without the two-phase model over a multi-year period.

Figures

Figures reproduced from arXiv: 1907.01800 by Jeremy D. Turiel, Tomaso Aste.

Figure 1: Time series plots of the dataset [11]. Three plots are presented: the number of defaulted loans as a fraction of the total number of accepted loans (blue), the number of rejected loans as a fraction of the total number of loans requested (green) and the total number of requested loans (red). The black lines represent the raw time series, with statistics (fractions and total number) computed per calendar mo…
Figure 2: Neural network representation with node size and colour representing total outgoing weight and edge width

Original abstract

Logistic Regression and Support Vector Machine algorithms, together with Linear and Non-Linear Deep Neural Networks, are applied to lending data in order to replicate lender acceptance of loans and predict the likelihood of default of issued loans. A two-phase model is proposed; the first phase predicts loan rejection, while the second one predicts default risk for approved loans. Logistic Regression was found to be the best performer for the first phase, with test set recall macro score of $77.4 \%$. Deep Neural Networks were applied to the second phase only, where they achieved best performance, with validation set recall score of $72 \%$, for defaults. This shows that AI can improve current credit risk models reducing the default risk of issued loans by as much as $70 \%$. The models were also applied to loans taken for small businesses alone. The first phase of the model performs significantly better when trained on the whole dataset. Instead, the second phase performs significantly better when trained on the small business subset. This suggests a potential discrepancy between how these loans are screened and how they should be analysed in terms of default prediction.
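
The abstract reports a "recall macro" score for the first phase. As a reminder of what that metric measures, here is a minimal pure-Python sketch of macro-averaged recall (the unweighted mean of per-class recall). The function name and the toy labels are illustrative assumptions and are not drawn from the paper's data.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Toy labels: 1 = rejected, 0 = accepted (illustrative only).
toy_true = [1, 1, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 1]
print(macro_recall(toy_true, toy_pred))
```

Because each class contributes equally regardless of its size, macro recall does not reward a model for simply predicting the majority class on an imbalanced dataset.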

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript applies logistic regression, SVM, and deep neural networks to P2P lending data in a two-phase framework: phase 1 predicts loan rejection (best: logistic regression, 77.4% test-set macro recall) and phase 2 predicts default among accepted loans (best: DNN, 72% validation recall on defaults). It further examines performance on the small-business subset and asserts that the approach can reduce realized default risk on issued loans by as much as 70%.

Significance. A rigorously derived and externally validated demonstration that a two-phase model materially lowers portfolio default rates would be of direct interest to credit-risk practitioners. The manuscript supplies no such derivation or validation; the headline 70% figure therefore cannot be assessed as a contribution.

major comments (2)
  1. [Abstract] Abstract: the assertion that the reported recall figures imply a 70% reduction in default risk among issued loans is unsupported. No equation, table, or supplementary calculation converts the 77.4% and 72% recall values into a change in realized default rate; such a conversion requires baseline prevalence, operating-point precision, and an explicit simulation of the loans filtered by the two-phase rule, none of which appear.
  2. [Model evaluation / results] Model-evaluation sections: performance is reported on a single held-out split drawn from the same dataset used for training and hyper-parameter selection. Because the 70% claim rests on these in-sample metrics being treated as proxies for out-of-sample default reduction, the absence of out-of-time or external validation directly undermines the central empirical claim.
minor comments (1)
  1. [Small-business subset] The small-business analysis is presented without a pre-specified analysis plan; if retained, it should be clearly labeled as exploratory.
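
The out-of-time validation requested in major comment 2 could be set up along the following lines. This is a sketch under stated assumptions: each loan is represented as a dictionary with a hypothetical `issue_date` field, and the cutoff date is arbitrary. It is not a description of how the authors split their data.

```python
from datetime import date


def out_of_time_split(loans, cutoff):
    """Train on loans issued before `cutoff`; test on loans issued on or after it."""
    train = [loan for loan in loans if loan["issue_date"] < cutoff]
    test = [loan for loan in loans if loan["issue_date"] >= cutoff]
    return train, test


# Illustrative records only; the field names are assumptions.
toy_loans = [
    {"id": 1, "issue_date": date(2014, 3, 1)},
    {"id": 2, "issue_date": date(2015, 7, 1)},
    {"id": 3, "issue_date": date(2016, 1, 1)},
    {"id": 4, "issue_date": date(2017, 5, 1)},
]
train_set, test_set = out_of_time_split(toy_loans, cutoff=date(2016, 1, 1))
print(len(train_set), len(test_set))
```

Unlike a random split, this keeps every test loan strictly later than the training period, so the evaluation cannot benefit from information about the future.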

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will make revisions where the concerns are valid.

  1. Referee: [Abstract] Abstract: the assertion that the reported recall figures imply a 70% reduction in default risk among issued loans is unsupported. No equation, table, or supplementary calculation converts the 77.4% and 72% recall values into a change in realized default rate; such a conversion requires baseline prevalence, operating-point precision, and an explicit simulation of the loans filtered by the two-phase rule, none of which appear.

    Authors: We agree that the manuscript provides no explicit derivation, equation, or simulation converting the recall values into a quantified reduction in realized default rate. The 70% figure was an informal illustrative statement rather than a rigorously computed result. In revision we will remove the claim from the abstract and main text. revision: yes

  2. Referee: [Model evaluation / results] Model-evaluation sections: performance is reported on a single held-out split drawn from the same dataset used for training and hyper-parameter selection. Because the 70% claim rests on these in-sample metrics being treated as proxies for out-of-sample default reduction, the absence of out-of-time or external validation directly undermines the central empirical claim.

    Authors: The evaluation uses a single random held-out split. This is a genuine limitation for any claim about realized default reduction on future loans. We will revise the manuscript to explicitly note this limitation, moderate the language around out-of-sample performance, and remove reliance on the unsupported 70% figure. revision: partial

Circularity Check

0 steps flagged

No circularity; performance metrics reported on held-out sets without reduction to inputs by construction.

The paper trains standard classifiers (LR, SVM, DNN) on lending data, reports recall on test/validation splits, and asserts a 70% default-risk reduction. No equation, parameter fit, or self-citation chain equates the reported recall values to the 70% figure; the reduction claim is an unsupported interpretive statement rather than a derived result. The two-phase modeling pipeline uses independent training and evaluation splits, satisfying the self-contained benchmark criterion. No self-definitional, fitted-input-as-prediction, or uniqueness-imported patterns appear in the derivation.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that fitted model performance on historical data will translate to future risk reduction, plus the unstated mapping from recall to the 70% figure.

free parameters (2)
  • neural network architecture and hyperparameters
    Depth, width, learning rate, and regularization chosen to maximize validation recall on the given data.
  • train/test split and feature preprocessing
    Choices that affect the reported 77.4% and 72% recall scores.
axioms (2)
  • domain assumption The historical P2P loan records are representative of future loan applications.
    Required for any generalization from test-set recall to deployed risk reduction.
  • ad hoc to paper Recall on the held-out set is a sufficient proxy for reduction in realized defaults.
    Used to convert the 72% default recall into the headline 70% risk-reduction claim.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

  1. [1]

    Beyond fintech: Disruptive innovation in lending

    Deloitte Reports. Beyond fintech: Disruptive innovation in lending. 2017

  2. [2]

    FCA sets out crackdown on peer-to-peer lending

    Kate Beioley. FCA sets out crackdown on peer-to-peer lending. Financial Times, 2018

  3. [3]

    Cp18/20: Loan-based (’peer-to-peer’) and investment-based crowdfunding platforms: Feedback on our post-implementation review and proposed changes to the regulatory framework

    Financial Conduct Authority. Cp18/20: Loan-based (’peer-to-peer’) and investment-based crowdfunding platforms: Feedback on our post-implementation review and proposed changes to the regulatory framework. 2018

  4. [4]

    Determinants of loan performance in p2p lending

    Nilas Möllenkamp. Determinants of loan performance in p2p lending. B.S. thesis, University of Twente, 2017

  5. [5]

    Evaluating credit risk and loan performance in online peer-to-peer (p2p) lending

    Riza Emekter, Yanbin Tu, Benjamas Jirasakuldech, and Min Lu. Evaluating credit risk and loan performance in online peer-to-peer (p2p) lending. Applied Economics, 47(1):54–70, 2015

  6. [6]

    Determinants of default in p2p lending: the mexican case

    Carlos Eduardo Canfield. Determinants of default in p2p lending: the mexican case. Independent Journal of Management & Production, 9(1):1–24, 2018

  7. [7]

    Judging borrowers by the company they keep: Friendship networks and information asymmetry in online peer-to-peer lending

    Mingfeng Lin, Nagpurnanand R Prabhala, and Siva Viswanathan. Judging borrowers by the company they keep: Friendship networks and information asymmetry in online peer-to-peer lending. Management Science, 59(1):17–35, 2013

  8. [8]

    Credit rationing in markets with imperfect information

    Joseph E Stiglitz and Andrew Weiss. Credit rationing in markets with imperfect information. The American economic review, 71(3):393–410, 1981

  9. [9]

    The long tail: Why the future of business is selling less of more

    Chris Anderson. The long tail: Why the future of business is selling less of more. Hachette Books, 2006

  10. [10]

    Microfinance, the long tail and mission drift

    Carlos Serrano-Cinca and Begoña Gutiérrez-Nieto. Microfinance, the long tail and mission drift. International Business Review, 23(1):181–194, 2014

  11. [11]

    All Lending Club loan data version 6, february 2018

    Nathan George. All Lending Club loan data version 6, february 2018. https://www.kaggle.com/wordsforthewise/lending-club. Accessed: 2018-10-1

  12. [12]

    Determinants of default in p2p lending

    Carlos Serrano-Cinca, Begoña Gutiérrez-Nieto, and Luz López-Palacios. Determinants of default in p2p lending. PloS one, 10(10):e0139427, 2015

  13. [13]

    Applied logistic regression, volume 398

    David W Hosmer Jr, Stanley Lemeshow, and Rodney X Sturdivant. Applied logistic regression, volume 398. John Wiley & Sons, 2013

  14. [14]

    Support vector machines

    Marti A. Hearst, Susan T Dumais, Edgar Osuna, John Platt, and Bernhard Scholkopf. Support vector machines. IEEE Intelligent Systems and their applications, 13(4):18–28, 1998

  15. [15]

    Deep learning in neural networks: An overview

    Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85 – 117, 2015

  16. [16]

    Feature selection, L1 vs. L2 regularization, and rotational invariance

    Andrew Y Ng. Feature selection, L1 vs. L2 regularization, and rotational invariance. In Proceedings of the twenty-first international conference on Machine learning, page 78. ACM, 2004

  17. [17]

    Regularization and variable selection via the elastic net

    Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the royal statistical society: series B (statistical methodology), 67(2):301–320, 2005

  18. [18]

    The use of the area under the roc curve in the evaluation of machine learning algorithms

    Andrew P Bradley. The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern recognition, 30(7):1145–1159, 1997

  19. [19]

    Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation

    David Martin Powers. Evaluation: from precision, recall and f-measure to roc, informedness, markedness and correlation. 2011

  20. [20]

    A tool for filtering information in complex systems

    M. Tumminello, T. Aste, T. Di Matteo, and R. N. Mantegna. A tool for filtering information in complex systems. Proceedings of the National Academy of Sciences of the United States of America, 102(30):10421–10426, 2005

  21. [21]

    A pólya urn approach to information filtering in complex networks

    Riccardo Marcaccioli and Giacomo Livan. A pólya urn approach to information filtering in complex networks. Nature communications, 10(1):745, 2019

  22. [22]

    Network filtering for big data: Triangulated maximally filtered graph

    Guido Previde Massara, Tiziana Di Matteo, and Tomaso Aste. Network filtering for big data: Triangulated maximally filtered graph. Journal of complex Networks, 5(2):161–178, 2016

  23. [23]

    Hierarchical structure in financial markets

    Rosario N Mantegna. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems, 11(1):193–197, 1999