pith. machine review for the scientific record.

arxiv: 2605.14467 · v1 · submitted 2026-05-14 · 💻 cs.LG

Recognition: 2 theorem links · Lean Theorem

Focused PU learning from imbalanced data

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 01:44 UTC · model grok-4.3

classification 💻 cs.LG
keywords positive-unlabeled learning · imbalanced data · empirical risk estimator · binary classification · SCAR labeling · SAR labeling · financial misstatement detection

The pith

A focused empirical risk estimator enables effective training of binary classifiers from positive and unlabeled examples in highly imbalanced data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that standard positive-unlabeled learning breaks down when negatives heavily outnumber positives and when positives look like negatives. It introduces a focused empirical risk estimator that folds both the labeled positives and the large unlabeled pool into the training objective. This estimator is evaluated on controlled imbalanced benchmarks under the SCAR and SAR labeling models and then applied to the task of spotting financial misstatements. A reader would care because many real detection problems, from fraud to gene identification, naturally produce exactly this kind of partial, skewed data.

Core claim

The authors claim that a focused empirical risk estimator, by directly incorporating both positive and unlabeled instances, produces binary classifiers that reach state-of-the-art accuracy on imbalanced positive-unlabeled datasets under the SCAR and SAR labeling mechanisms and that the same estimator delivers practical gains when used for financial misstatement detection.

What carries the argument

The focused empirical risk estimator, which re-weights the contribution of labeled positives and unlabeled examples to the overall risk so that imbalance does not dominate the optimization.
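The paper's own estimator is not spelled out in the text above. As a reference point, the widely used non-negative PU risk of Kiryo et al. already re-weights labeled positives and the unlabeled pool through the class prior; a "focused" variant would presumably modify the loss terms below, which is an assumption here, not the paper's formula. A minimal NumPy sketch:

```python
import numpy as np

def sigmoid_loss(z):
    # Smooth surrogate loss in (0, 1); z = y * f(x), so the loss is
    # large when the score disagrees strongly with the label.
    return 1.0 / (1.0 + np.exp(z))

def nn_pu_risk(f_pos, f_unl, pi):
    """Non-negative PU risk (in the style of Kiryo et al., 2017) from
    classifier scores on labeled positives (f_pos) and on the unlabeled
    pool (f_unl); pi is the class prior P(y = +1), assumed known."""
    risk_pos = pi * np.mean(sigmoid_loss(f_pos))
    # Treat every unlabeled point as negative, then subtract the
    # contribution the hidden positives would have made.
    risk_neg = np.mean(sigmoid_loss(-f_unl)) - pi * np.mean(sigmoid_loss(-f_pos))
    return risk_pos + max(risk_neg, 0.0)  # clipping keeps the estimate non-negative

scores_pos = np.array([2.0, 1.5, 3.0])          # scores on labeled positives
scores_unl = np.array([-2.0, -1.0, 0.5, -3.0])  # scores on the unlabeled pool
risk = nn_pu_risk(scores_pos, scores_unl, pi=0.05)
```

The clipping step is what keeps severe imbalance (small pi, huge unlabeled pool) from driving the corrected negative risk below zero during optimization.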

If this is right

  • Classifiers trained with the estimator outperform prior PU methods on imbalanced data under both SCAR and SAR labeling.
  • The same estimator yields measurable improvement on a concrete financial misstatement detection task.
  • Performance gains hold when the unlabeled pool contains a realistic mixture of negatives and hidden positives.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The estimator could be combined with modern representation learners to handle high-dimensional imbalanced data without changing the risk formulation.
  • If real labeling processes deviate from SCAR and SAR, the method may still serve as a strong baseline that requires only modest adaptation.
  • The approach suggests a route for extending other risk-based PU algorithms to extreme imbalance without introducing new hyperparameters.

Load-bearing premise

The estimator keeps working even when positive examples closely resemble negative ones and when the labeling process matches the SCAR or SAR models used in the experiments.

What would settle it

Running the method on a dataset with extreme imbalance and near-identical positive and negative distributions, then observing that it matches or underperforms standard PU baselines, would falsify the central claim.
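The data-generating side of that falsification test is easy to sketch. All numbers below (imbalance level, mean shift, labeling frequency) are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Extreme imbalance (0.5% positives) with nearly identical
# class-conditional distributions (a small mean shift only).
n, pi, shift = 200_000, 0.005, 0.1
y = rng.random(n) < pi                    # hidden ground-truth labels
x = rng.normal(loc=shift * y, scale=1.0)  # positives barely separated

# SCAR-style labeling of a fraction c of the positives; everything
# else, including the hidden positives, joins the unlabeled pool.
c = 0.5
labeled = y & (rng.random(n) < c)
unlabeled = ~labeled
```

Training the focused estimator and standard PU baselines on `(x[labeled], x[unlabeled])` and comparing them against the hidden `y` would then probe exactly the regime the claim covers.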

read the original abstract

We propose a new method of learning from positive and unlabeled (PU) examples in highly imbalanced datasets. Many real-world problems, such as disease gene identification, targeted marketing, fraud detection, and recommender systems, are hard to address with machine learning methods, due to limited labeled data. Often, training data comprises positive and unlabeled instances, the latter typically being dominated by negative, but including also several positive instances. While PU learning is well-studied, few methods address imbalanced settings or hard-to-detect positive examples that resemble negative ones. Our approach uses a focused empirical risk estimator, incorporating both positive and unlabeled examples to train binary classifiers. Empirical evaluations demonstrate state-of-the-art performance on imbalanced datasets under two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR). Beyond these controlled experiments, we demonstrate the value of the proposed method in the real-world application of financial misstatement detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a focused empirical risk estimator for positive-unlabeled (PU) learning on highly imbalanced datasets. It incorporates both labeled positives and unlabeled examples (mostly negatives) to train binary classifiers, evaluates the method under SCAR and SAR labeling mechanisms, claims state-of-the-art performance on imbalanced data, and demonstrates utility on a real-world financial misstatement detection task.

Significance. If the focused estimator can be shown to deliver reliable gains precisely when positives closely resemble negatives under severe imbalance, the work would address a practically relevant gap in PU learning for applications such as fraud detection and gene identification. The inclusion of a real-world case study strengthens potential impact, but the absence of methodological derivations and targeted robustness checks limits the current assessment of significance.

major comments (2)
  1. [Abstract] The central claim of state-of-the-art performance is asserted without equations, a derivation of the focused empirical risk estimator, a description of the baselines, error bars, or statistical tests, so the support for the claim cannot be evaluated from the provided text.
  2. [Empirical evaluations] The abstract identifies hard-to-detect positives that resemble negatives as a key challenge, yet the reported results consist only of aggregate performance on imbalanced datasets under SCAR/SAR, without controlled overlap experiments, feature-separation ablations, or similarity metrics between the classes. This leaves the advantage over prior PU baselines unproven in the very regime the method claims to address.
minor comments (1)
  1. [Abstract] The abstract refers to 'two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR)' without clarifying the precise difference or citing the original definitions.
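For reference, the usual distinction is that SCAR labels each positive with one constant probability, independent of its features, while SAR lets the labeling probability vary with the instance through a propensity score. A minimal simulation (the logistic propensity below is an illustrative choice, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

# True positives with a single feature x.
x = rng.normal(loc=1.0, scale=1.0, size=10_000)

# SCAR: every positive is labeled with the same constant probability c.
c = 0.3
scar_labeled = rng.random(x.size) < c

# SAR: the labeling probability depends on x through a propensity
# score e(x); here a logistic function, purely for illustration.
def propensity(x):
    return 1.0 / (1.0 + np.exp(-2.0 * (x - 1.0)))

sar_labeled = rng.random(x.size) < propensity(x)
```

Under SCAR the labeled positives are an unbiased sample of all positives; under SAR they are skewed toward high-propensity regions of feature space, which is why methods tuned to SCAR can break.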

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review of our manuscript. We address each major comment in detail below and have made revisions to strengthen the presentation of our contributions.

read point-by-point responses
  1. Referee: [Abstract] The central claim of state-of-the-art performance is asserted without equations, a derivation of the focused empirical risk estimator, a description of the baselines, error bars, or statistical tests, so the support for the claim cannot be evaluated from the provided text.

    Authors: The abstract is necessarily concise and serves as a high-level overview; the full derivation of the focused empirical risk estimator appears in Section 3, baseline methods are detailed in Section 4.1, and all reported results in Section 5 include error bars together with statistical significance tests (paired t-tests at p<0.05). We agree that a brief reference to these elements would improve the abstract and have revised it to mention the estimator's key form and the evaluation protocol with error bars and tests. revision: partial

  2. Referee: [Empirical evaluations] The abstract identifies hard-to-detect positives that resemble negatives as a key challenge, yet the reported results consist only of aggregate performance on imbalanced datasets under SCAR/SAR, without controlled overlap experiments, feature-separation ablations, or similarity metrics between the classes. This leaves the advantage over prior PU baselines unproven in the very regime the method claims to address.

    Authors: We acknowledge that the current experiments report aggregate performance under SCAR and SAR on imbalanced data and do not contain explicit controlled overlap studies or similarity metrics. While the SAR mechanism already induces varying degrees of positive-negative resemblance, we agree that targeted ablations would better isolate the regime of interest. We have added a new subsection (5.4) containing synthetic overlap experiments with adjustable class similarity, feature-separation ablations, and cosine-similarity metrics between class centroids, which demonstrate the focused estimator's gains precisely when positives closely resemble negatives. revision: yes

Circularity Check

0 steps flagged

No circularity: new focused estimator introduced with independent empirical validation

full rationale

The paper introduces a focused empirical risk estimator for PU learning on imbalanced data as a novel construction that incorporates both positive and unlabeled examples. No equations, fitted parameters, or self-citations are shown that reduce any claimed prediction or result to the inputs by definition or construction. The method is presented as a distinct estimator rather than a re-expression of prior quantities, with performance claims resting on empirical evaluations under SCAR and SAR labeling mechanisms plus a real-world application. No self-definitional steps, fitted-input predictions, load-bearing self-citations, or ansatz smuggling are identifiable from the provided text. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The approach implicitly relies on standard PU learning assumptions (e.g., that unlabeled data contains a mixture of positives and negatives) but these are not enumerated.

pith-pipeline@v0.9.0 · 5450 in / 1019 out tokens · 33715 ms · 2026-05-15T01:44:57.609669+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages · 1 internal anchor

  1. Frenay, B., Verleysen, M.: Classification in the presence of label noise: A survey. IEEE Transactions on Neural Networks and Learning Systems 25(5), 845–869 (2014). https://doi.org/10.1109/TNNLS.2013.2292894
  2. Zhou, Z.-H.: A brief introduction to weakly supervised learning. National Science Review 5(1), 44–53 (2017). https://doi.org/10.1093/nsr/nwx106
  3. Bekker, J., Davis, J.: Learning from positive and unlabeled data: a survey. Machine Learning 109(4), 719–760 (2020). https://doi.org/10.1007/s10994-020-05877-5
  4. Claesen, M., De Smet, F., Suykens, J., De Moor, B.: A robust ensemble approach to learn from positive and unlabeled data using SVM base models. Neurocomputing 160, 73–84 (2015). https://doi.org/10.1016/j.neucom.2014.10.081
  5. Claesen, M., De Smet, F., Gillard, P., Mathieu, C., De Moor, B.: Building classifiers to predict the start of glucose-lowering pharmacotherapy using Belgian health expenditure data. arXiv abs/1504.07389 (2015)
  6. Stripling, E., Baesens, B., Chizi, B., vanden Broucke, S.: Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers' compensation fraud. Decision Support Systems 111, 13–26 (2018). https://doi.org/10.1016/j.dss.2018.04.001
  7. Bao, Y., Ke, B., Li, B., Yu, Y.J., Zhang, J.: Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research 58(1), 199–235 (2020)
  8. Bertomeu, J., Cheynel, E., Floyd, E., Pan, W.: Using machine learning to detect misstatements. Review of Accounting Studies 26, 468–519 (2021)
  9. Zavitsanos, E., Mavroeidis, D., Bougiatiotis, K., Spyropoulou, E., Loukas, L., Paliouras, G.: Financial misstatement detection: A realistic evaluation. In: Proceedings of the Second ACM International Conference on AI in Finance. ACM, Virtual event (2021). https://doi.org/10.1145/3490354.3494453
  10. Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: Fast rule mining in ontological knowledge bases with AMIE+. The VLDB Journal 24(6), 707–730 (2015). https://doi.org/10.1007/s00778-015-0394-1
  11. Vasighizaker, A., Jalili, S.: C-PUGP: A cluster-based positive unlabeled learning method for disease gene prediction and prioritization. Computational Biology and Chemistry 76, 23–31 (2018). https://doi.org/10.1016/j.compbiolchem.2018.05.022
  12. Zupanc, K., Davis, J.: Estimating rule quality for knowledge base completion with the relationship between coverage assumption. In: Proceedings of the 2018 World Wide Web Conference, pp. 1073–1081. International World Wide Web Conferences Steering Committee, Lyon, France (2018). https://doi.org/10.1145/3178876.3186006
  13. Mignone, P., Pio, G., Džeroski, S., Ceci, M.: Multi-task learning for the simultaneous reconstruction of the human and mouse gene regulatory networks. Scientific Reports 10(1), 22295 (2020). https://doi.org/10.1038/s41598-020-78033-7
  14. Mignone, P., Pio, G., D'Elia, D., Ceci, M.: Exploiting transfer learning for the reconstruction of the human gene regulatory network. Bioinformatics 36(5), 1553–1561 (2019). https://doi.org/10.1093/bioinformatics/btz781
  15. Japkowicz, N.: The class imbalance problem: Significance and strategies. In: Proceedings of the 2000 International Conference on Artificial Intelligence (ICAI). Computer Science Research, Education and Applications Press, Athens, Georgia, USA (2000)
  16. Krawczyk, B.: Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence 5(4), 221–232 (2016). https://doi.org/10.1007/s13748-016-0094-0
  17. Elkan, C., Noto, K.: Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 213–220. ACM, Las Vegas, Nevada, USA (2008). https://doi.org/10.1145/1401890.1401920
  18. Bekker, J., Robberechts, P., Davis, J.: Beyond the selected completely at random assumption for learning from positive and unlabeled data. In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD. Springer, Würzburg, Germany (2019). https://doi.org/10.1007/978-3-030-46147-8_5
  19. Fung, G.P.C., Yu, J.X., Lu, H., Yu, P.S.: Text classification without negative examples revisit. IEEE Transactions on Knowledge and Data Engineering 18(1), 6–20 (2006). https://doi.org/10.1109/TKDE.2006.16
  20. Liu, B., Yu, P., Li, X.: Partially supervised classification of text documents. In: International Conference on Machine Learning (ICML), p. 8. AAAI Press, Washington, DC, USA (2003). https://doi.org/10.1385/1-59259-358-5:387
  21. Li, X.-L., Liu, B.: Learning from positive and unlabeled examples with different data distributions. In: Machine Learning: ECML 2005, pp. 218–229. Springer, Porto, Portugal (2005)
  22. Li, X., Liu, B., Ng, S.-K.: Learning to identify unexpected instances in the test set. In: International Joint Conference on Artificial Intelligence, pp. 2802–
  23. (continues entry 22) Morgan Kaufmann Publishers Inc., Hyderabad, India (2007). https://api.semanticscholar.org/CorpusID:14296672
  24. Yu, S., Li, C.: PE-PUC: A graph based PU-learning approach for text classification. In: Machine Learning and Data Mining in Pattern Recognition, pp. 574–584. Springer, Leipzig, Germany (2007)
  25. Yu, H., Han, J., Chang, K.C.-C.: PEBL: Positive example based learning for web page classification using SVM. In: 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 239–248. ACM, Edmonton, Alberta, Canada (2002). https://doi.org/10.1145/775047.775083
  26. Yu, H., Han, J., Chang, K.: PEBL: Web page classification without negative examples. IEEE Transactions on Knowledge and Data Engineering 16 (2003). https://doi.org/10.1109/TKDE.2004.1264823
  27. Peng, T., Zuo, W., He, F.: SVM based adaptive learning method for text classification from positive and unlabeled documents. Knowledge and Information Systems 16(3), 281–301 (2008). https://doi.org/10.1007/s10115-007-0107-1
  28. Li, X., Liu, B.: Learning to classify texts using positive and unlabeled data. In: 18th International Joint Conference on Artificial Intelligence, pp. 587–594. Morgan Kaufmann Publishers Inc., Acapulco, Mexico (2003)
  29. Li, X.-L., Liu, B., Ng, S.-K.: Negative training data can be harmful to text classification. In: Li, H., Màrquez, L. (eds.) Empirical Methods in Natural Language Processing, pp. 218–228. ACM, Cambridge, Massachusetts (2010). https://aclanthology.org/D10-1022
  30. Chaudhari, S., Shevade, S.: Learning from positive and unlabelled examples using maximum margin clustering. In: 19th International Conference on Neural Information Processing, pp. 465–473. ACM, Doha, Qatar (2012). https://doi.org/10.1007/978-3-642-34487-9_56
  31. Uden, M.A.: Rocchio: Relevance Feedback in Learning Classification Algorithms (2007). https://api.semanticscholar.org/CorpusID:14424311
  32. Ortega Vázquez, C., Broucke, S., De Weerdt, J.: A two-step anomaly detection based method for PU classification in imbalanced data sets. Data Mining and Knowledge Discovery 37(3), 1301–1325 (2023). https://doi.org/10.1007/s10618-023-00925-9
  33. Liu, B., Dai, Y., Li, X., Lee, W.S., Yu, P.S.: Building text classifiers using positive and unlabeled examples. In: Third IEEE International Conference on Data Mining, pp. 179–186. IEEE, Melbourne, Florida (2003). https://doi.org/10.1109/ICDM.2003.1250918
  34. Ke, T., Yang, B., Zhen, L., Tan, J., Li, Y., Jing, L.: Building high-performance classifiers using positive and unlabeled examples for text classification. In: Wang, J., Yen, G.G., Polycarpou, M.M. (eds.) Advances in Neural Networks – ISNN 2012, pp. 187–195. Springer, Shenyang, China (2012)
  35. Liu, Z., Shi, W., Li, D., Qin, Q.: Partially supervised classification – based on weighted unlabeled samples support vector machine. In: Advanced Data Mining and Applications, pp. 118–129. Springer, Wuhan, China (2005)
  36. Lee, W.S., Liu, B.: Learning with positive and unlabeled examples using weighted logistic regression. In: 20th International Conference on Machine Learning, pp. 448–455. AAAI Press, Washington, DC, USA (2003)
  37. Mordelet, F., Vert, J.-P.: Supervised inference of gene regulatory networks from positive and unlabeled examples. Methods in Molecular Biology 939, 47–58 (2013)
  38. Suykens, J.A.K., Vandewalle, J.: Least squares support vector machine classifiers. Neural Processing Letters 9(3), 293–300 (1999). https://doi.org/10.1023/A:1018628609742
  39. Ke, T., Jing, L., Lv, H., Zhang, L., Hu, Y.: Global and local learning from positive and unlabeled examples. Applied Intelligence 48(8), 2373–2392 (2018). https://doi.org/10.1007/s10489-017-1076-z
  40. Ke, T., Lv, H., Sun, M., Zhang, L.: A biased least squares support vector machine based on Mahalanobis distance for PU learning. Physica A: Statistical Mechanics and its Applications 509, 422–438 (2018). https://doi.org/10.1016/j.physa.2018.05.128
  41. Bekker, J., Davis, J.: Estimating the class prior in positive and unlabeled data through decision tree induction. Proceedings of the AAAI Conference on Artificial Intelligence 32(1), 2712–2719 (2018). https://doi.org/10.1609/aaai.v32i1.11715
  42. Ramaswamy, H., Scott, C., Tewari, A.: Mixture proportion estimation via kernel embeddings of distributions. In: Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 2052–2060. JMLR, New York, New York, USA (2016). https://proceedings.mlr.press/v48/ramaswamy16.html
  43. Plessis, M.C., Niu, G., Sugiyama, M.: Class-prior estimation for learning from positive and unlabeled data. Machine Learning 106(4), 463–492 (2017). https://doi.org/10.1007/s10994-016-5604-6
  44. Zeiberg, D., Jain, S., Radivojac, P.: Fast nonparametric estimation of class proportions in the positive-unlabeled classification setting. Proceedings of the AAAI Conference on Artificial Intelligence 34(04), 6729–6736 (2020). https://doi.org/10.1609/aaai.v34i04.6151
  45. Plessis, M.C., Niu, G., Sugiyama, M.: Analysis of learning from positive and unlabeled data. In: Advances in Neural Information Processing Systems, vol. 27, p. 9. Curran Associates, Inc., Montreal, Canada (2014)
  46. Plessis, M.D., Niu, G., Sugiyama, M.: Convex formulation for learning from positive and unlabeled data. In: 32nd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 1386–1394. JMLR, Lille, France (2015)
  47. Kiryo, R., Niu, G., Plessis, M., Sugiyama, M.: Positive-unlabeled learning with non-negative risk estimator. In: Advances in Neural Information Processing Systems. Curran Associates Inc., Long Beach, CA, USA (2017)
  48. Su, G., Chen, W., Xu, M.: Positive-unlabeled learning from imbalanced data. In: 30th International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2995–3001. Curran Associates, Inc., Virtual conference (2021). https://doi.org/10.24963/ijcai.2021/412
  49. Ortega Vázquez, C., Broucke, S., De Weerdt, J.: Hellinger distance decision trees for PU learning in imbalanced data sets. Machine Learning 113(7), 4547–4578 (2024). https://doi.org/10.1007/s10994-023-06323-y
  50. Niu, G., Plessis, M.C., Sakai, T., Ma, Y., Sugiyama, M.: Theoretical comparisons of positive-unlabeled learning against positive-negative learning. In: NIPS, p. 9. Curran Associates Inc., Barcelona, Spain (2016)
  51. Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(2), 318–327 (2020). https://doi.org/10.1109/TPAMI.2018.2858826
  52. Chou, Y.-T., Niu, G., Lin, H.-T., Sugiyama, M.: Unbiased risk estimators can mislead: a case study of learning with complementary labels. In: Proceedings of the 37th International Conference on Machine Learning, p. 10. JMLR, Virtual conference (2020)
  53. Charoenphakdee, N., Vongkulbhisal, J., Chairatanakul, N., Sugiyama, M.: On focal loss for class-posterior probability estimation: A theoretical perspective. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Nashville, TN, USA (2020)
  54. Gong, C., Wang, Q., Liu, T., Han, B., You, J., Yang, J., Tao, D.: Instance-dependent positive and unlabeled learning with labeling bias estimation. IEEE Transactions on Pattern Analysis & Machine Intelligence 44(08), 4163–4177 (2022). https://doi.org/10.1109/TPAMI.2021.3061456
  55. Yoo, J., Kim, J., Yoon, H., Kim, G., Jang, C., Kang, U.: Accurate graph-based PU learning without class prior. In: 2021 IEEE International Conference on Data Mining (ICDM), pp. 827–836. IEEE, Virtual conference (2021). https://doi.org/10.1109/ICDM51629.2021.00094
  56. Saerens, M., Latinne, P., Decaestecker, C.: Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure. Neural Computation 14(1), 21–41 (2002)
  57. du Plessis, M.C., Sugiyama, M.: Semi-supervised learning of class balance under class-prior change by distribution matching. Neural Networks 50, 110–119 (2014). https://doi.org/10.1016/j.neunet.2013.11.010
  58. Zavitsanos, E., Kelesis, D., Paliouras, G.: Calibrating TabTransformer for financial misstatement detection. Applied Intelligence 55(1), 15 (2025)
  59. Dechow, P.M., Ge, W., Larson, C.R., Sloan, R.G.: Predicting material accounting misstatements. Contemporary Accounting Research 28(1), 17–82 (2011)
  60. Cecchini, M., Aytug, H., Koehler, G.J., Pathak, P.: Detecting management fraud in public companies. Management Science 56(7), 1146–1160 (2010)
  61. Dyck, A., Morse, A., Zingales, L.: Who blows the whistle on corporate fraud? The Journal of Finance 65(6), 2213–2253 (2010)
  62. Huang, X., Khetan, A., Cvitkovic, M., Karnin, Z.: TabTransformer: Tabular data modeling using contextual embeddings. arXiv (2020). https://doi.org/10.48550/ARXIV.2012.06678
  63. Liu, H., Dai, Z., So, D., Le, Q.V.: Pay attention to MLPs. In: Advances in Neural Information Processing Systems, vol. 34, pp. 9204–9215. Curran Associates, Inc., Virtual conference (2021)