pith. machine review for the scientific record.

arxiv: 2605.14467 · v1 · submitted 2026-05-14 · 💻 cs.LG

Recognition: 2 theorem links · Lean Theorem

Focused PU learning from imbalanced data

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 01:44 UTC · model grok-4.3

classification 💻 cs.LG
keywords positive-unlabeled learning · imbalanced data · empirical risk estimator · binary classification · SCAR labeling · SAR labeling · financial misstatement detection

The pith

A focused empirical risk estimator enables effective training of binary classifiers from positive and unlabeled examples in highly imbalanced data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that standard positive-unlabeled learning breaks down when negatives heavily outnumber positives and when positives look like negatives. It introduces a focused empirical risk estimator that folds both the labeled positives and the large unlabeled pool into the training objective. This estimator is evaluated on controlled imbalanced benchmarks under the SCAR and SAR labeling models and then applied to the task of spotting financial misstatements. A reader would care because many real detection problems, from fraud to gene identification, naturally produce exactly this kind of partial, skewed data.

Core claim

The authors claim that a focused empirical risk estimator, by directly incorporating both positive and unlabeled instances, produces binary classifiers that reach state-of-the-art accuracy on imbalanced positive-unlabeled datasets under the SCAR and SAR labeling mechanisms and that the same estimator delivers practical gains when used for financial misstatement detection.

What carries the argument

The focused empirical risk estimator, which re-weights the contribution of labeled positives and unlabeled examples to the overall risk so that imbalance does not dominate the optimization.
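The paper's own estimator is not spelled out in the text above. As a reference point, the widely used non-negative PU risk of Kiryo et al. already re-weights labeled positives and the unlabeled pool through the class prior; a "focused" variant would presumably modify the loss terms below, which is an assumption here, not the paper's formula. A minimal NumPy sketch:

```python
import numpy as np

def sigmoid_loss(z):
    # Smooth surrogate loss in (0, 1); z = y * f(x), so the loss is
    # large when the score disagrees strongly with the label.
    return 1.0 / (1.0 + np.exp(z))

def nn_pu_risk(f_pos, f_unl, pi):
    """Non-negative PU risk (in the style of Kiryo et al., 2017) from
    classifier scores on labeled positives (f_pos) and on the unlabeled
    pool (f_unl); pi is the class prior P(y = +1), assumed known."""
    risk_pos = pi * np.mean(sigmoid_loss(f_pos))
    # Treat every unlabeled point as negative, then subtract the
    # contribution the hidden positives would have made.
    risk_neg = np.mean(sigmoid_loss(-f_unl)) - pi * np.mean(sigmoid_loss(-f_pos))
    return risk_pos + max(risk_neg, 0.0)  # clipping keeps the estimate non-negative

scores_pos = np.array([2.0, 1.5, 3.0])          # scores on labeled positives
scores_unl = np.array([-2.0, -1.0, 0.5, -3.0])  # scores on the unlabeled pool
risk = nn_pu_risk(scores_pos, scores_unl, pi=0.05)
```

The clipping step is what keeps severe imbalance (small pi, huge unlabeled pool) from driving the corrected negative risk below zero during optimization.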

If this is right

  • Classifiers trained with the estimator outperform prior PU methods on imbalanced data under both SCAR and SAR labeling.
  • The same estimator yields measurable improvement on a concrete financial misstatement detection task.
  • Performance gains hold when the unlabeled pool contains a realistic mixture of negatives and hidden positives.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The estimator could be combined with modern representation learners to handle high-dimensional imbalanced data without changing the risk formulation.
  • If real labeling processes deviate from SCAR and SAR, the method may still serve as a strong baseline that requires only modest adaptation.
  • The approach suggests a route for extending other risk-based PU algorithms to extreme imbalance without introducing new hyperparameters.

Load-bearing premise

The estimator keeps working even when positive examples closely resemble negative ones and when the labeling process matches the SCAR or SAR models used in the experiments.

What would settle it

Running the method on a dataset with extreme imbalance and near-identical positive and negative distributions, then observing that it matches or underperforms standard PU baselines, would falsify the central claim.
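The data-generating side of that falsification test is easy to sketch. All numbers below (imbalance level, mean shift, labeling frequency) are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Extreme imbalance (0.5% positives) with nearly identical
# class-conditional distributions (a small mean shift only).
n, pi, shift = 200_000, 0.005, 0.1
y = rng.random(n) < pi                    # hidden ground-truth labels
x = rng.normal(loc=shift * y, scale=1.0)  # positives barely separated

# SCAR-style labeling of a fraction c of the positives; everything
# else, including the hidden positives, joins the unlabeled pool.
c = 0.5
labeled = y & (rng.random(n) < c)
unlabeled = ~labeled
```

Training the focused estimator and standard PU baselines on `(x[labeled], x[unlabeled])` and comparing them against the hidden `y` would then probe exactly the regime the claim covers.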

read the original abstract

We propose a new method of learning from positive and unlabeled (PU) examples in highly imbalanced datasets. Many real-world problems, such as disease gene identification, targeted marketing, fraud detection, and recommender systems, are hard to address with machine learning methods, due to limited labeled data. Often, training data comprises positive and unlabeled instances, the latter typically being dominated by negative, but including also several positive instances. While PU learning is well-studied, few methods address imbalanced settings or hard-to-detect positive examples that resemble negative ones. Our approach uses a focused empirical risk estimator, incorporating both positive and unlabeled examples to train binary classifiers. Empirical evaluations demonstrate state-of-the-art performance on imbalanced datasets under two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR). Beyond these controlled experiments, we demonstrate the value of the proposed method in the real-world application of financial misstatement detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a focused empirical risk estimator for positive-unlabeled (PU) learning on highly imbalanced datasets. It incorporates both labeled positives and unlabeled examples (mostly negatives) to train binary classifiers, evaluates the method under SCAR and SAR labeling mechanisms, claims state-of-the-art performance on imbalanced data, and demonstrates utility on a real-world financial misstatement detection task.

Significance. If the focused estimator can be shown to deliver reliable gains precisely when positives closely resemble negatives under severe imbalance, the work would address a practically relevant gap in PU learning for applications such as fraud detection and gene identification. The inclusion of a real-world case study strengthens potential impact, but the absence of methodological derivations and targeted robustness checks limits the current assessment of significance.

major comments (2)
  1. [Abstract] The central claim of state-of-the-art performance is asserted without equations, a derivation of the focused empirical risk estimator, a description of the baselines, error bars, or statistical tests, so the support for the claim cannot be evaluated from the provided text.
  2. [Empirical evaluations] The abstract identifies hard-to-detect positives that resemble negatives as a key challenge, yet the reported results consist only of aggregate performance on imbalanced datasets under SCAR/SAR, without controlled overlap experiments, feature-separation ablations, or similarity metrics between the classes. This leaves the advantage over prior PU baselines unproven in the very regime the method claims to address.
minor comments (1)
  1. [Abstract] The abstract refers to 'two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR)' without clarifying the precise difference or citing the original definitions.
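For reference, the usual distinction is that SCAR labels each positive with one constant probability, independent of its features, while SAR lets the labeling probability vary with the instance through a propensity score. A minimal simulation (the logistic propensity below is an illustrative choice, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

# True positives with a single feature x.
x = rng.normal(loc=1.0, scale=1.0, size=10_000)

# SCAR: every positive is labeled with the same constant probability c.
c = 0.3
scar_labeled = rng.random(x.size) < c

# SAR: the labeling probability depends on x through a propensity
# score e(x); here a logistic function, purely for illustration.
def propensity(x):
    return 1.0 / (1.0 + np.exp(-2.0 * (x - 1.0)))

sar_labeled = rng.random(x.size) < propensity(x)
```

Under SCAR the labeled positives are an unbiased sample of all positives; under SAR they are skewed toward high-propensity regions of feature space, which is why methods tuned to SCAR can break.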

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review of our manuscript. We address each major comment in detail below and have made revisions to strengthen the presentation of our contributions.

read point-by-point responses
  1. Referee: [Abstract] The central claim of state-of-the-art performance is asserted without equations, a derivation of the focused empirical risk estimator, a description of the baselines, error bars, or statistical tests, so the support for the claim cannot be evaluated from the provided text.

    Authors: The abstract is necessarily concise and serves as a high-level overview; the full derivation of the focused empirical risk estimator appears in Section 3, baseline methods are detailed in Section 4.1, and all reported results in Section 5 include error bars together with statistical significance tests (paired t-tests at p<0.05). We agree that a brief reference to these elements would improve the abstract and have revised it to mention the estimator's key form and the evaluation protocol with error bars and tests. revision: partial

  2. Referee: [Empirical evaluations] The abstract identifies hard-to-detect positives that resemble negatives as a key challenge, yet the reported results consist only of aggregate performance on imbalanced datasets under SCAR/SAR, without controlled overlap experiments, feature-separation ablations, or similarity metrics between the classes. This leaves the advantage over prior PU baselines unproven in the very regime the method claims to address.

    Authors: We acknowledge that the current experiments report aggregate performance under SCAR and SAR on imbalanced data and do not contain explicit controlled overlap studies or similarity metrics. While the SAR mechanism already induces varying degrees of positive-negative resemblance, we agree that targeted ablations would better isolate the regime of interest. We have added a new subsection (5.4) containing synthetic overlap experiments with adjustable class similarity, feature-separation ablations, and cosine-similarity metrics between class centroids, which demonstrate the focused estimator's gains precisely when positives closely resemble negatives. revision: yes

Circularity Check

0 steps flagged

No circularity: new focused estimator introduced with independent empirical validation

full rationale

The paper introduces a focused empirical risk estimator for PU learning on imbalanced data as a novel construction that incorporates both positive and unlabeled examples. No equations, fitted parameters, or self-citations are shown that reduce any claimed prediction or result to the inputs by definition or construction. The method is presented as a distinct estimator rather than a re-expression of prior quantities, with performance claims resting on empirical evaluations under SCAR and SAR labeling mechanisms plus a real-world application. No self-definitional steps, fitted-input predictions, load-bearing self-citations, or ansatz smuggling are identifiable from the provided text. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The approach implicitly relies on standard PU learning assumptions (e.g., that unlabeled data contains a mixture of positives and negatives) but these are not enumerated.

pith-pipeline@v0.9.0 · 5450 in / 1019 out tokens · 33715 ms · 2026-05-15T01:44:57.609669+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages · 1 internal anchor

  1. Frenay, B., Verleysen, M.: Classification in the presence of label noise: A survey. IEEE Transactions on Neural Networks and Learning Systems 25(5), 845–869 (2014). https://doi.org/10.1109/TNNLS.2013.2292894
  2. Zhou, Z.-H.: A brief introduction to weakly supervised learning. National Science Review 5(1), 44–53 (2017). https://doi.org/10.1093/nsr/nwx106
  3. Bekker, J., Davis, J.: Learning from positive and unlabeled data: a survey. Machine Learning 109(4), 719–760 (2020). https://doi.org/10.1007/s10994-020-05877-5
  4. Claesen, M., De Smet, F., Suykens, J., De Moor, B.: A robust ensemble approach to learn from positive and unlabeled data using SVM base models. Neurocomputing 160, 73–84 (2015). https://doi.org/10.1016/j.neucom.2014.10.081
  5. Claesen, M., De Smet, F., Gillard, P., Mathieu, C., De Moor, B.: Building classifiers to predict the start of glucose-lowering pharmacotherapy using Belgian health expenditure data. arXiv abs/1504.07389 (2015)
  6. Stripling, E., Baesens, B., Chizi, B., vanden Broucke, S.: Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers' compensation fraud. Decision Support Systems 111, 13–26 (2018). https://doi.org/10.1016/j.dss.2018.04.001
  7. Bao, Y., Ke, B., Li, B., Yu, Y.J., Zhang, J.: Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research 58(1), 199–235 (2020)
  8. Bertomeu, J., Cheynel, E., Floyd, E., Pan, W.: Using machine learning to detect misstatements. Review of Accounting Studies 26, 468–519 (2021)
  9. Zavitsanos, E., Mavroeidis, D., Bougiatiotis, K., Spyropoulou, E., Loukas, L., Paliouras, G.: Financial misstatement detection: A realistic evaluation. In: Proceedings of the Second ACM International Conference on AI in Finance. ACM, Virtual event (2021). https://doi.org/10.1145/3490354.3494453
  10. Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: Fast rule mining in ontological knowledge bases with AMIE+. The VLDB Journal 24(6), 707–730 (2015). https://doi.org/10.1007/s00778-015-0394-1
  11. Vasighizaker, A., Jalili, S.: C-PUGP: A cluster-based positive unlabeled learning method for disease gene prediction and prioritization. Computational Biology and Chemistry 76, 23–31 (2018). https://doi.org/10.1016/j.compbiolchem.2018.05.022
  12. Zupanc, K., Davis, J.: Estimating rule quality for knowledge base completion with the relationship between coverage assumption. In: Proceedings of the 2018 World Wide Web Conference, pp. 1073–1081. International World Wide Web Conferences Steering Committee, Lyon, France (2018). https://doi.org/10.1145/3178876.3186006
  13. Mignone, P., Pio, G., Džeroski, S., Ceci, M.: Multi-task learning for the simultaneous reconstruction of the human and mouse gene regulatory networks. Scientific Reports 10(1), 22295 (2020). https://doi.org/10.1038/s41598-020-78033-7
  14. Mignone, P., Pio, G., D'Elia, D., Ceci, M.: Exploiting transfer learning for the reconstruction of the human gene regulatory network. Bioinformatics 36(5), 1553–1561 (2019). https://doi.org/10.1093/bioinformatics/btz781
  15. Japkowicz, N.: The class imbalance problem: Significance and strategies. In: Proceedings of the 2000 International Conference on Artificial Intelligence (ICAI). Computer Science Research, Education and Applications Press, Athens, Georgia, USA (2000)
  16. Krawczyk, B.: Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence 5(4), 221–232 (2016). https://doi.org/10.1007/s13748-016-0094-0
  17. Elkan, C., Noto, K.: Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 213–220. ACM, Las Vegas, Nevada, USA (2008). https://doi.org/10.1145/1401890.1401920
  18. Bekker, J., Robberechts, P., Davis, J.: Beyond the selected completely at random assumption for learning from positive and unlabeled data. In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD. Springer, Würzburg, Germany (2019). https://doi.org/10.1007/978-3-030-46147-8_5
  19. Fung, G.P.C., Yu, J.X., Lu, H., Yu, P.S.: Text classification without negative examples revisit. IEEE Transactions on Knowledge and Data Engineering 18(1), 6–20 (2006). https://doi.org/10.1109/TKDE.2006.16
  20. Liu, B., Yu, P., Li, X.: Partially supervised classification of text documents. In: International Conference on Machine Learning (ICML), p. 8. AAAI Press, Washington, DC, USA (2003). https://doi.org/10.1385/1-59259-358-5:387
  21. Li, X.-L., Liu, B.: Learning from positive and unlabeled examples with different data distributions. In: Machine Learning: ECML 2005, pp. 218–229. Springer, Porto, Portugal (2005)
  22. Li, X., Liu, B., Ng, S.-K.: Learning to identify unexpected instances in the test set. In: International Joint Conference on Artificial Intelligence, pp. 2802–
  23. (continues entry 22) Morgan Kaufmann Publishers Inc., Hyderabad, India (2007). https://api.semanticscholar.org/CorpusID:14296672
  24. Yu, S., Li, C.: PE-PUC: A graph based PU-learning approach for text classification. In: Machine Learning and Data Mining in Pattern Recognition, pp. 574–584. Springer, Leipzig, Germany (2007)
  25. Yu, H., Han, J., Chang, K.C.-C.: PEBL: Positive example based learning for web page classification using SVM. In: 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 239–248. ACM, Edmonton, Alberta, Canada (2002). https://doi.org/10.1145/775047.775083
  26. Yu, H., Han, J., Chang, K.: PEBL: Web page classification without negative examples. IEEE Transactions on Knowledge and Data Engineering 16 (2003). https://doi.org/10.1109/TKDE.2004.1264823
  27. Peng, T., Zuo, W., He, F.: SVM based adaptive learning method for text classification from positive and unlabeled documents. Knowledge and Information Systems 16(3), 281–301 (2008). https://doi.org/10.1007/s10115-007-0107-1
  28. Li, X., Liu, B.: Learning to classify texts using positive and unlabeled data. In: 18th International Joint Conference on Artificial Intelligence, pp. 587–594. Morgan Kaufmann Publishers Inc., Acapulco, Mexico (2003)
  29. Li, X.-L., Liu, B., Ng, S.-K.: Negative training data can be harmful to text classification. In: Li, H., Màrquez, L. (eds.) Empirical Methods in Natural Language Processing, pp. 218–228. ACM, Cambridge, Massachusetts (2010). https://aclanthology.org/D10-1022
  30. Chaudhari, S., Shevade, S.: Learning from positive and unlabelled examples using maximum margin clustering. In: 19th International Conference on Neural Information Processing, pp. 465–473. ACM, Doha, Qatar (2012). https://doi.org/10.1007/978-3-642-34487-9_56
  31. Uden, M.A.: Rocchio: Relevance Feedback in Learning Classification Algorithms (2007). https://api.semanticscholar.org/CorpusID:14424311
  32. Ortega Vázquez, C., Broucke, S., De Weerdt, J.: A two-step anomaly detection based method for PU classification in imbalanced data sets. Data Mining and Knowledge Discovery 37(3), 1301–1325 (2023). https://doi.org/10.1007/s10618-023-00925-9
  33. Liu, B., Dai, Y., Li, X., Lee, W.S., Yu, P.S.: Building text classifiers using positive and unlabeled examples. In: Third IEEE International Conference on Data Mining, pp. 179–186. IEEE, Melbourne, Florida (2003). https://doi.org/10.1109/ICDM.2003.1250918
  34. Ke, T., Yang, B., Zhen, L., Tan, J., Li, Y., Jing, L.: Building high-performance classifiers using positive and unlabeled examples for text classification. In: Wang, J., Yen, G.G., Polycarpou, M.M. (eds.) Advances in Neural Networks – ISNN 2012, pp. 187–195. Springer, Shenyang, China (2012)
  35. Liu, Z., Shi, W., Li, D., Qin, Q.: Partially supervised classification – based on weighted unlabeled samples support vector machine. In: Advanced Data Mining and Applications, pp. 118–129. Springer, Wuhan, China (2005)
  36. Lee, W.S., Liu, B.: Learning with positive and unlabeled examples using weighted logistic regression. In: 20th International Conference on Machine Learning, pp. 448–455. AAAI Press, Washington, DC, USA (2003)
  37. Mordelet, F., Vert, J.-P.: Supervised inference of gene regulatory networks from positive and unlabeled examples. Methods in Molecular Biology 939, 47–58 (2013)
  38. Suykens, J.A.K., Vandewalle, J.: Least squares support vector machine classifiers. Neural Processing Letters 9(3), 293–300 (1999). https://doi.org/10.1023/A:1018628609742
  39. Ke, T., Jing, L., Lv, H., Zhang, L., Hu, Y.: Global and local learning from positive and unlabeled examples. Applied Intelligence 48(8), 2373–2392 (2018). https://doi.org/10.1007/s10489-017-1076-z
  40. Ke, T., Lv, H., Sun, M., Zhang, L.: A biased least squares support vector machine based on Mahalanobis distance for PU learning. Physica A: Statistical Mechanics and its Applications 509, 422–438 (2018). https://doi.org/10.1016/j.physa.2018.05.128
  41. Bekker, J., Davis, J.: Estimating the class prior in positive and unlabeled data through decision tree induction. Proceedings of the AAAI Conference on Artificial Intelligence 32(1), 2712–2719 (2018). https://doi.org/10.1609/aaai.v32i1.11715
  42. Ramaswamy, H., Scott, C., Tewari, A.: Mixture proportion estimation via kernel embeddings of distributions. In: Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 2052–2060. JMLR, New York, New York, USA (2016). https://proceedings.mlr.press/v48/ramaswamy16.html
  43. Plessis, M.C., Niu, G., Sugiyama, M.: Class-prior estimation for learning from positive and unlabeled data. Machine Learning 106(4), 463–492 (2017). https://doi.org/10.1007/s10994-016-5604-6
  44. Zeiberg, D., Jain, S., Radivojac, P.: Fast nonparametric estimation of class proportions in the positive-unlabeled classification setting. Proceedings of the AAAI Conference on Artificial Intelligence 34(04), 6729–6736 (2020). https://doi.org/10.1609/aaai.v34i04.6151
  45. Plessis, M.C., Niu, G., Sugiyama, M.: Analysis of learning from positive and unlabeled data. In: Advances in Neural Information Processing Systems, vol. 27, p. 9. Curran Associates, Inc., Montreal, Canada (2014)
  46. Plessis, M.D., Niu, G., Sugiyama, M.: Convex formulation for learning from positive and unlabeled data. In: 32nd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 1386–1394. JMLR, Lille, France (2015)
  47. Kiryo, R., Niu, G., Plessis, M., Sugiyama, M.: Positive-unlabeled learning with non-negative risk estimator. In: Advances in Neural Information Processing Systems. Curran Associates Inc., Long Beach, CA, USA (2017)
  48. Su, G., Chen, W., Xu, M.: Positive-unlabeled learning from imbalanced data. In: 30th International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2995–3001. Curran Associates, Inc., Virtual conference (2021). https://doi.org/10.24963/ijcai.2021/412
  49. Ortega Vázquez, C., Broucke, S., De Weerdt, J.: Hellinger distance decision trees for PU learning in imbalanced data sets. Machine Learning 113(7), 4547–4578 (2024). https://doi.org/10.1007/s10994-023-06323-y
  50. Niu, G., Plessis, M.C., Sakai, T., Ma, Y., Sugiyama, M.: Theoretical comparisons of positive-unlabeled learning against positive-negative learning. In: NIPS, p. 9. Curran Associates Inc., Barcelona, Spain (2016)
  51. Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(2), 318–327 (2020). https://doi.org/10.1109/TPAMI.2018.2858826
  52. Chou, Y.-T., Niu, G., Lin, H.-T., Sugiyama, M.: Unbiased risk estimators can mislead: a case study of learning with complementary labels. In: Proceedings of the 37th International Conference on Machine Learning, p. 10. JMLR, Virtual conference (2020)
  53. Charoenphakdee, N., Vongkulbhisal, J., Chairatanakul, N., Sugiyama, M.: On focal loss for class-posterior probability estimation: A theoretical perspective. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Nashville, TN, USA (2020)
  54. Gong, C., Wang, Q., Liu, T., Han, B., You, J., Yang, J., Tao, D.: Instance-dependent positive and unlabeled learning with labeling bias estimation. IEEE Transactions on Pattern Analysis & Machine Intelligence 44(08), 4163–4177 (2022). https://doi.org/10.1109/TPAMI.2021.3061456
  55. Yoo, J., Kim, J., Yoon, H., Kim, G., Jang, C., Kang, U.: Accurate graph-based PU learning without class prior. In: 2021 IEEE International Conference on Data Mining (ICDM), pp. 827–836. IEEE, Virtual conference (2021). https://doi.org/10.1109/ICDM51629.2021.00094
  56. Saerens, M., Latinne, P., Decaestecker, C.: Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure. Neural Computation 14(1), 21–41 (2002)
  57. du Plessis, M.C., Sugiyama, M.: Semi-supervised learning of class balance under class-prior change by distribution matching. Neural Networks 50, 110–119 (2014). https://doi.org/10.1016/j.neunet.2013.11.010
  58. Zavitsanos, E., Kelesis, D., Paliouras, G.: Calibrating TabTransformer for financial misstatement detection. Applied Intelligence 55(1), 15 (2025)
  59. Dechow, P.M., Ge, W., Larson, C.R., Sloan, R.G.: Predicting material accounting misstatements. Contemporary Accounting Research 28(1), 17–82 (2011)
  60. Cecchini, M., Aytug, H., Koehler, G.J., Pathak, P.: Detecting management fraud in public companies. Management Science 56(7), 1146–1160 (2010)
  61. Dyck, A., Morse, A., Zingales, L.: Who blows the whistle on corporate fraud? The Journal of Finance 65(6), 2213–2253 (2010)
  62. Huang, X., Khetan, A., Cvitkovic, M., Karnin, Z.: TabTransformer: Tabular data modeling using contextual embeddings. arXiv (2020). https://doi.org/10.48550/ARXIV.2012.06678
  63. Liu, H., Dai, Z., So, D., Le, Q.V.: Pay attention to MLPs. In: Advances in Neural Information Processing Systems, vol. 34, pp. 9204–9215. Curran Associates, Inc., Virtual conference (2021)