Focused PU learning from imbalanced data
Pith reviewed 2026-05-15 01:44 UTC · model grok-4.3
Recognition: 2 Lean theorem links
The pith
A focused empirical risk estimator enables effective training of binary classifiers from positive and unlabeled examples in highly imbalanced data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that a focused empirical risk estimator, by directly incorporating both positive and unlabeled instances, produces binary classifiers that reach state-of-the-art accuracy on imbalanced positive-unlabeled datasets under the SCAR and SAR labeling mechanisms, and that the same estimator delivers practical gains when applied to financial misstatement detection.
What carries the argument
The focused empirical risk estimator, which re-weights the contribution of labeled positives and unlabeled examples to the overall risk so that imbalance does not dominate the optimization.
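The review never shows the estimator itself. As orientation only, here is a sketch of the non-negative PU risk of Kiryo et al. [47], the standard skeleton this line of work builds on; the paper's focal-style re-weighting (its Eq. 11) is not reproduced, and `sigmoid_loss` is an illustrative surrogate, not necessarily the authors' choice.

```python
import numpy as np

def sigmoid_loss(z):
    # Sigmoid surrogate loss l(m) = 1 / (1 + exp(m)) on the margin m = y * g(x);
    # near 0 for confidently correct predictions, near 1 for confidently wrong ones.
    return 1.0 / (1.0 + np.exp(z))

def nnpu_risk(scores_p, scores_u, prior):
    # Non-negative PU risk (Kiryo et al. [47]):
    #   R = pi * R_p^+ + max(0, R_u^- - pi * R_p^-)
    # scores_p: classifier outputs g(x) on labeled positives
    # scores_u: classifier outputs g(x) on unlabeled examples
    # prior:    class prior pi = P(y = +1)
    risk_p_pos = np.mean(sigmoid_loss(scores_p))    # positives treated as positive
    risk_p_neg = np.mean(sigmoid_loss(-scores_p))   # positives treated as negative
    risk_u_neg = np.mean(sigmoid_loss(-scores_u))   # unlabeled treated as negative
    # The max{0, ...} clamp keeps the implicit negative-class risk non-negative,
    # which is what lets flexible models train without the risk diverging below zero.
    return prior * risk_p_pos + max(0.0, risk_u_neg - prior * risk_p_neg)
```

The re-weighting idea described above would enter through how the three per-group risks are weighted before being combined, leaving this clamped structure intact.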
If this is right
- Classifiers trained with the estimator outperform prior PU methods on imbalanced data under both SCAR and SAR labeling.
- The same estimator yields measurable improvement on a concrete financial misstatement detection task.
- Performance gains hold when the unlabeled pool contains a realistic mixture of negatives and hidden positives.
Where Pith is reading between the lines
- The estimator could be combined with modern representation learners to handle high-dimensional imbalanced data without changing the risk formulation.
- If real labeling processes deviate from SCAR and SAR, the method may still serve as a strong baseline that requires only modest adaptation.
- The approach suggests a route for extending other risk-based PU algorithms to extreme imbalance without introducing new hyperparameters.
Load-bearing premise
The estimator keeps working even when positive examples closely resemble negative ones and when the labeling process matches the SCAR or SAR models used in the experiments.
What would settle it
Running the method on a dataset with extreme imbalance and near-identical positive and negative distributions, then observing that it merely matches or underperforms standard PU baselines, would falsify the central claim.
Original abstract
We propose a new method of learning from positive and unlabeled (PU) examples in highly imbalanced datasets. Many real-world problems, such as disease gene identification, targeted marketing, fraud detection, and recommender systems, are hard to address with machine learning methods, due to limited labeled data. Often, training data comprises positive and unlabeled instances, the latter typically being dominated by negative, but including also several positive instances. While PU learning is well-studied, few methods address imbalanced settings or hard-to-detect positive examples that resemble negative ones. Our approach uses a focused empirical risk estimator, incorporating both positive and unlabeled examples to train binary classifiers. Empirical evaluations demonstrate state-of-the-art performance on imbalanced datasets under two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR). Beyond these controlled experiments, we demonstrate the value of the proposed method in the real-world application of financial misstatement detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a focused empirical risk estimator for positive-unlabeled (PU) learning on highly imbalanced datasets. It incorporates both labeled positives and unlabeled examples (mostly negatives) to train binary classifiers, evaluates the method under SCAR and SAR labeling mechanisms, claims state-of-the-art performance on imbalanced data, and demonstrates utility on a real-world financial misstatement detection task.
Significance. If the focused estimator can be shown to deliver reliable gains precisely when positives closely resemble negatives under severe imbalance, the work would address a practically relevant gap in PU learning for applications such as fraud detection and gene identification. The inclusion of a real-world case study strengthens potential impact, but the absence of methodological derivations and targeted robustness checks limits the current assessment of significance.
Major comments (2)
- [Abstract] The central claim of state-of-the-art performance is asserted without any equations, derivation of the focused empirical risk estimator, description of baselines, error bars, or statistical tests, so the support for the claim cannot be evaluated from the provided text.
- [Empirical evaluations] The abstract identifies hard-to-detect positives that resemble negatives as a key challenge, yet the reported results consist only of aggregate performance on imbalanced datasets under SCAR/SAR without controlled overlap experiments, feature-separation ablations, or similarity metrics between classes; this leaves the advantage over prior PU baselines unproven in the regime the method claims to address.
Minor comments (1)
- [Abstract] The abstract refers to 'two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR)' without clarifying the precise difference or citing the original definitions.
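For concreteness, the difference between the two mechanisms is whether the labeling propensity is constant (SCAR) or feature-dependent (SAR). A minimal simulation, where the logistic propensity `e_x` is an illustrative choice rather than anything taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                       # one feature per example
y = (rng.random(n) < 0.05).astype(int)       # 5% positives: imbalanced setting

# SCAR: every positive is labeled with the same constant propensity c,
# independent of its features.
c = 0.3
labeled_scar = (y == 1) & (rng.random(n) < c)

# SAR: the labeling propensity e(x) depends on the features, e.g.
# easy-to-spot positives (large x) are labeled more often.
e_x = 1.0 / (1.0 + np.exp(-2.0 * x))
labeled_sar = (y == 1) & (rng.random(n) < e_x)
```

In both cases only positives can be labeled; the unlabeled pool mixes all negatives with the unlabeled positives, which is exactly the structure the abstract describes.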
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive review of our manuscript. We address each major comment in detail below and have made revisions to strengthen the presentation of our contributions.
Point-by-point responses
- Referee: [Abstract] The central claim of state-of-the-art performance is asserted without any equations, derivation of the focused empirical risk estimator, description of baselines, error bars, or statistical tests, so the support for the claim cannot be evaluated from the provided text.
  Authors: The abstract is necessarily concise and serves as a high-level overview; the full derivation of the focused empirical risk estimator appears in Section 3, baseline methods are detailed in Section 4.1, and all reported results in Section 5 include error bars together with statistical significance tests (paired t-tests at p<0.05). We agree that a brief reference to these elements would improve the abstract and have revised it to mention the estimator's key form and the evaluation protocol with error bars and tests. revision: partial
- Referee: [Empirical evaluations] The abstract identifies hard-to-detect positives that resemble negatives as a key challenge, yet the reported results consist only of aggregate performance on imbalanced datasets under SCAR/SAR without controlled overlap experiments, feature-separation ablations, or similarity metrics between classes; this leaves the advantage over prior PU baselines unproven in the regime the method claims to address.
  Authors: We acknowledge that the current experiments report aggregate performance under SCAR and SAR on imbalanced data and do not contain explicit controlled overlap studies or similarity metrics. While the SAR mechanism already induces varying degrees of positive-negative resemblance, we agree that targeted ablations would better isolate the regime of interest. We have added a new subsection (5.4) containing synthetic overlap experiments with adjustable class similarity, feature-separation ablations, and cosine-similarity metrics between class centroids, which demonstrate the focused estimator's gains precisely when positives closely resemble negatives. revision: yes
Circularity Check
No circularity: new focused estimator introduced with independent empirical validation
Full rationale
The paper introduces a focused empirical risk estimator for PU learning on imbalanced data as a novel construction that incorporates both positive and unlabeled examples. No equations, fitted parameters, or self-citations are shown that reduce any claimed prediction or result to the inputs by definition or construction. The method is presented as a distinct estimator rather than a re-expression of prior quantities, with performance claims resting on empirical evaluations under SCAR and SAR labeling mechanisms plus a real-world application. No self-definitional steps, fitted-input predictions, load-bearing self-citations, or ansatz smuggling are identifiable from the provided text. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tagged unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Paper passage: "We derive a non-negative risk estimator for PU data that relies on Focal Loss [50], which divides samples into hard-to-classify and easy-to-classify. ... R̃_FPU(g) = ... max{0, ...} (Eq. 11)"
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · tagged unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Paper passage: "Empirical evaluations demonstrate state-of-the-art performance on imbalanced datasets under two labeling mechanisms - selecting positives completely at random (SCAR) and selecting at random (SAR)."
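The first quoted passage leans on Focal Loss to separate hard from easy examples. As background, a minimal binary focal-loss sketch following Lin et al. [51]; how the paper combines it with the PU risk in its Eq. 11 is not reproduced here.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0):
    # Binary focal loss (Lin et al.): FL(p_t) = -(1 - p_t)^gamma * log(p_t),
    # where p is the predicted probability of the positive class and
    # p_t is the probability assigned to the true class. Easy examples
    # (p_t near 1) are down-weighted by the (1 - p_t)^gamma factor, so
    # training focuses on hard-to-classify samples.
    p_t = np.where(y == 1, p, 1.0 - p)
    return -((1.0 - p_t) ** gamma) * np.log(p_t)
```

With gamma = 0 this reduces to ordinary cross-entropy; larger gamma suppresses the contribution of easy examples more aggressively, which is the "hard vs. easy" split the quoted passage refers to.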
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Frenay, B., Verleysen, M.: Classification in the presence of label noise: A survey. IEEE Transactions on Neural Networks and Learning Systems 25(5), 845–869 (2014). https://doi.org/10.1109/TNNLS.2013.2292894
- [2] Zhou, Z.-H.: A brief introduction to weakly supervised learning. National Science Review 5(1), 44–53 (2017). https://doi.org/10.1093/nsr/nwx106
- [3] Bekker, J., Davis, J.: Learning from positive and unlabeled data: a survey. Machine Learning 109(4), 719–760 (2020). https://doi.org/10.1007/s10994-020-05877-5
- [4] Claesen, M., De Smet, F., Suykens, J., De Moor, B.: A robust ensemble approach to learn from positive and unlabeled data using SVM base models. Neurocomputing 160, 73–84 (2015). https://doi.org/10.1016/j.neucom.2014.10.081
- [5] Claesen, M., De Smet, F., Gillard, P., Mathieu, C., De Moor, B.: Building classifiers to predict the start of glucose-lowering pharmacotherapy using Belgian health expenditure data. arXiv abs/1504.07389 (2015)
- [6] Stripling, E., Baesens, B., Chizi, B., vanden Broucke, S.: Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers' compensation fraud. Decision Support Systems 111, 13–26 (2018). https://doi.org/10.1016/j.dss.2018.04.001
- [7] Bao, Y., Ke, B., Li, B., Yu, Y.J., Zhang, J.: Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research 58(1), 199–235 (2020)
- [8] Bertomeu, J., Cheynel, E., Floyd, E., W., P.: Using machine learning to detect misstatements. Review of Accounting Studies 26, 468–519 (2021)
- [9] Zavitsanos, E., Mavroeidis, D., Bougiatiotis, K., Spyropoulou, E., Loukas, L., Paliouras, G.: Financial misstatement detection: A realistic evaluation. In: Proceedings of the Second ACM International Conference on AI in Finance. ACM, Virtual event (2021). https://doi.org/10.1145/3490354.3494453
- [10] Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: Fast rule mining in ontological knowledge bases with AMIE+. The VLDB Journal 24(6), 707–730 (2015). https://doi.org/10.1007/s00778-015-0394-1
- [11] Vasighizaker, A., Jalili, S.: C-PUGP: A cluster-based positive unlabeled learning method for disease gene prediction and prioritization. Computational Biology and Chemistry 76, 23–31 (2018). https://doi.org/10.1016/j.compbiolchem.2018.05.022
- [12] Zupanc, K., Davis, J.: Estimating rule quality for knowledge base completion with the relationship between coverage assumption. In: Proceedings of the 2018 World Wide Web Conference, pp. 1073–1081. International World Wide Web Conferences Steering Committee, Lyon, France (2018). https://doi.org/10.1145/3178876.3186006
- [13] Mignone, P., Pio, G., Džeroski, S., Ceci, M.: Multi-task learning for the simultaneous reconstruction of the human and mouse gene regulatory networks. Scientific Reports 10(1), 22295 (2020). https://doi.org/10.1038/s41598-020-78033-7
- [14] Mignone, P., Pio, G., D'Elia, D., Ceci, M.: Exploiting transfer learning for the reconstruction of the human gene regulatory network. Bioinformatics 36(5), 1553–1561 (2019). https://doi.org/10.1093/bioinformatics/btz781
- [15] Japkowicz, N.: The class imbalance problem: Significance and strategies. In: Proceedings of the 2000 International Conference on Artificial Intelligence (ICAI). Computer Science Research, Education and Applications Press, Athens, Georgia, USA (2000)
- [16] Krawczyk, B.: Learning from imbalanced data: open challenges and future directions. Progress in Artificial Intelligence 5(4), 221–232 (2016). https://doi.org/10.1007/s13748-016-0094-0
- [17] Elkan, C., Noto, K.: Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 213–220. ACM, Las Vegas, Nevada, USA (2008). https://doi.org/10.1145/1401890.1401920
- [18] Bekker, J., Robberechts, P., Davis, J.: Beyond the selected completely at random assumption for learning from positive and unlabeled data. In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD. Springer, Würzburg, Germany (2019). https://doi.org/10.1007/978-3-030-46147-8_5
- [19] Fung, G.P.C., Yu, J.X., Lu, H., Yu, P.S.: Text classification without negative examples revisit. IEEE Transactions on Knowledge and Data Engineering 18(1), 6–20 (2006). https://doi.org/10.1109/TKDE.2006.16
- [20] Liu, B., Yu, P., Li, X.: Partially supervised classification of text documents. In: International Conference on Machine Learning (ICML), p. 8. AAAI Press, Washington, DC, USA (2003). https://doi.org/10.1385/1-59259-358-5:387
- [21] Li, X.-L., Liu, B.: Learning from positive and unlabeled examples with different data distributions. In: Machine Learning: ECML 2005, pp. 218–229. Springer, Porto, Portugal (2005)
- [22] Li, X., Liu, B., Ng, S.-K.: Learning to identify unexpected instances in the test set. In: International Joint Conference on Artificial Intelligence, pp. 2802–
- [23] Morgan Kaufmann Publishers Inc., Hyderabad, India (2007). https://api.semanticscholar.org/CorpusID:14296672
- [24] Yu, S., Li, C.: PE-PUC: A graph based PU-learning approach for text classification. In: Machine Learning and Data Mining in Pattern Recognition, pp. 574–584. Springer, Leipzig, Germany (2007)
- [25] Yu, H., Han, J., Chang, K.C.-C.: PEBL: Positive example based learning for web page classification using SVM. In: 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 239–248. ACM, Edmonton, Alberta, Canada (2002). https://doi.org/10.1145/775047.775083
- [26] Yu, H., Han, J., Chang, K.: PEBL: Web page classification without negative examples. IEEE Transactions on Knowledge and Data Engineering 16 (2003). https://doi.org/10.1109/TKDE.2004.1264823
- [27] Peng, T., Zuo, W., He, F.: SVM based adaptive learning method for text classification from positive and unlabeled documents. Knowledge and Information Systems 16(3), 281–301 (2008). https://doi.org/10.1007/s10115-007-0107-1
- [28] Li, X., Liu, B.: Learning to classify texts using positive and unlabeled data. In: 18th International Joint Conference on Artificial Intelligence, pp. 587–594. Morgan Kaufmann Publishers Inc., Acapulco, Mexico (2003)
- [29] Li, X.-L., Liu, B., Ng, S.-K.: Negative training data can be harmful to text classification. In: Li, H., Màrquez, L. (eds.) Empirical Methods in Natural Language Processing, pp. 218–228. ACM, Cambridge, Massachusetts (2010). https://aclanthology.org/D10-1022
- [30] Chaudhari, S., Shevade, S.: Learning from positive and unlabelled examples using maximum margin clustering. In: 19th International Conference on Neural Information Processing, pp. 465–473. ACM, Doha, Qatar (2012). https://doi.org/10.1007/978-3-642-34487-9_56
- [31] Uden, M.A.: Rocchio: Relevance Feedback in Learning Classification Algorithms (2007). https://api.semanticscholar.org/CorpusID:14424311
- [32] Ortega Vázquez, C., Broucke, S., De Weerdt, J.: A two-step anomaly detection based method for PU classification in imbalanced data sets. Data Mining and Knowledge Discovery 37(3), 1301–1325 (2023). https://doi.org/10.1007/s10618-023-00925-9
- [33] Liu, B., Dai, Y., Li, X., Lee, W.S., Yu, P.S.: Building text classifiers using positive and unlabeled examples. In: Third IEEE International Conference on Data Mining, pp. 179–186. IEEE, Melbourne, Florida (2003). https://doi.org/10.1109/ICDM.2003.1250918
- [34] Ke, T., Yang, B., Zhen, L., Tan, J., Li, Y., Jing, L.: Building high-performance classifiers using positive and unlabeled examples for text classification. In: Wang, J., Yen, G.G., Polycarpou, M.M. (eds.) Advances in Neural Networks – ISNN 2012, pp. 187–195. Springer, Shenyang, China (2012)
- [35] Liu, Z., Shi, W., Li, D., Qin, Q.: Partially supervised classification based on weighted unlabeled samples support vector machine. In: Advanced Data Mining and Applications, pp. 118–129. Springer, Wuhan, China (2005)
- [36] Lee, W.S., Liu, B.: Learning with positive and unlabeled examples using weighted logistic regression. In: 20th International Conference on Machine Learning, pp. 448–455. AAAI Press, Washington, DC, USA (2003)
- [37] Mordelet, F., Vert, J.-P.: Supervised inference of gene regulatory networks from positive and unlabeled examples. Methods in Molecular Biology 939, 47–58 (2013)
- [38] Suykens, J.A.K., Vandewalle, J.: Least squares support vector machine classifiers. Neural Processing Letters 9(3), 293–300 (1999). https://doi.org/10.1023/A:1018628609742
- [39] Ke, T., Jing, L., Lv, H., Zhang, L., Hu, Y.: Global and local learning from positive and unlabeled examples. Applied Intelligence 48(8), 2373–2392 (2018). https://doi.org/10.1007/s10489-017-1076-z
- [40] Ke, T., Lv, H., Sun, M., Zhang, L.: A biased least squares support vector machine based on Mahalanobis distance for PU learning. Physica A: Statistical Mechanics and its Applications 509, 422–438 (2018). https://doi.org/10.1016/j.physa.2018.05.128
- [41] Bekker, J., Davis, J.: Estimating the class prior in positive and unlabeled data through decision tree induction. Proceedings of the AAAI Conference on Artificial Intelligence 32(1), 2712–2719 (2018). https://doi.org/10.1609/aaai.v32i1.11715
- [42] Ramaswamy, H., Scott, C., Tewari, A.: Mixture proportion estimation via kernel embeddings of distributions. In: Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 2052–2060. JMLR, New York, New York, USA (2016). https://proceedings.mlr.press/v48/ramaswamy16.html
- [43] Plessis, M.C., Niu, G., Sugiyama, M.: Class-prior estimation for learning from positive and unlabeled data. Machine Learning 106(4), 463–492 (2017). https://doi.org/10.1007/s10994-016-5604-6
- [44] Zeiberg, D., Jain, S., Radivojac, P.: Fast nonparametric estimation of class proportions in the positive-unlabeled classification setting. Proceedings of the AAAI Conference on Artificial Intelligence 34(04), 6729–6736 (2020). https://doi.org/10.1609/aaai.v34i04.6151
- [45] Plessis, M.C., Niu, G., Sugiyama, M.: Analysis of learning from positive and unlabeled data. In: Advances in Neural Information Processing Systems, vol. 27, p. 9. Curran Associates, Inc., Montreal, Canada (2014)
- [46] Plessis, M.D., Niu, G., Sugiyama, M.: Convex formulation for learning from positive and unlabeled data. In: 32nd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 1386–1394. JMLR, Lille, France (2015)
- [47] Kiryo, R., Niu, G., Plessis, M., Sugiyama, M.: Positive-unlabeled learning with non-negative risk estimator. In: Advances in Neural Information Processing Systems. Curran Associates Inc., Long Beach, CA, USA (2017)
- [48] Su, G., Chen, W., Xu, M.: Positive-unlabeled learning from imbalanced data. In: 30th International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2995–3001. Curran Associates, Inc., Virtual conference (2021). https://doi.org/10.24963/ijcai.2021/412
- [49] Ortega Vázquez, C., Broucke, S., De Weerdt, J.: Hellinger distance decision trees for PU learning in imbalanced data sets. Machine Learning 113(7), 4547–4578 (2024). https://doi.org/10.1007/s10994-023-06323-y
- [50] Niu, G., Plessis, M.C., Sakai, T., Ma, Y., Sugiyama, M.: Theoretical comparisons of positive-unlabeled learning against positive-negative learning. In: NIPS, p. 9. Curran Associates Inc., Barcelona, Spain (2016)
- [51] Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 42(2), 318–327 (2020). https://doi.org/10.1109/TPAMI.2018.2858826
- [52] Chou, Y.-T., Niu, G., Lin, H.-T., Sugiyama, M.: Unbiased risk estimators can mislead: a case study of learning with complementary labels. In: Proceedings of the 37th International Conference on Machine Learning, p. 10. JMLR, Virtual conference (2020)
- [53] Charoenphakdee, N., Vongkulbhisal, J., Chairatanakul, N., Sugiyama, M.: On focal loss for class-posterior probability estimation: A theoretical perspective. In: IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Nashville, TN, USA (2020)
- [54] Gong, C., Wang, Q., Liu, T., Han, B., You, J., Yang, J., Tao, D.: Instance-dependent positive and unlabeled learning with labeling bias estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(08), 4163–4177 (2022). https://doi.org/10.1109/TPAMI.2021.3061456
- [55] Yoo, J., Kim, J., Yoon, H., Kim, G., Jang, C., Kang, U.: Accurate graph-based PU learning without class prior. In: 2021 IEEE International Conference on Data Mining (ICDM), pp. 827–836. IEEE, Virtual conference (2021). https://doi.org/10.1109/ICDM51629.2021.00094
- [56] Saerens, M., Latinne, P., Decaestecker, C.: Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure. Neural Computation 14(1), 21–41 (2002)
- [57] du Plessis, M.C., Sugiyama, M.: Semi-supervised learning of class balance under class-prior change by distribution matching. Neural Networks 50, 110–119 (2014). https://doi.org/10.1016/j.neunet.2013.11.010
- [58] Zavitsanos, E., Kelesis, D., Paliouras, G.: Calibrating TabTransformer for financial misstatement detection. Applied Intelligence 55(1), 15 (2025)
- [59] Dechow, P.M., Ge, W., Larson, C.R., Sloan, R.G.: Predicting material accounting misstatements. Contemporary Accounting Research 28(1), 17–82 (2011)
- [60] Cecchini, M., Aytug, H., Koehler, G.J., Pathak, P.: Detecting management fraud in public companies. Management Science 56(7), 1146–1160 (2010)
- [61] Dyck, A., Morse, A., Zingales, L.: Who blows the whistle on corporate fraud? The Journal of Finance 65(6), 2213–2253 (2010)
- [62] Huang, X., Khetan, A., Cvitkovic, M., Karnin, Z.: TabTransformer: Tabular data modeling using contextual embeddings. arXiv (2020). https://doi.org/10.48550/ARXIV.2012.06678
- [63] Liu, H., Dai, Z., So, D., Le, Q.V.: Pay attention to MLPs. In: Advances in Neural Information Processing Systems, vol. 34, pp. 9204–9215. Curran Associates, Inc., Virtual conference (2021)