mega hub Mixed citations

Machine Learning 45(1), 5–32 (Oct 2001)

Leo Breiman · 2001 · Machine Learning · DOI 10.1023/a:1010933404324

Mixed citation behavior. Most common role is background (55%).

84 Pith papers citing it

110k external citations · Crossref

citation-role summary

background 12 · method 7 · baseline 1

citation-polarity summary

background 11 · use method 7 · baseline 1 · unclear 1

authors

Leo Breiman

counterfactual ablation

If this work disappeared, these are the nearest dependency candidates in Pith, weighted toward method, dataset, baseline, and extension contexts where available. This is a structural signal, not a retraction verdict.

representative citing papers

Self-Stigma Is Not a Monolith, but Generic Empathy Is: Persona-Conditioned LLM Support for People Who Use Drugs

cs.CL · 2026-06-22 · unverdicted · novelty 7.0 · 2 refs

Four self-stigma personas identified via LPA on 1,174 Reddit users; persona-conditioned LLMs achieve targeted shifts but experts prefer generic empathy baselines.

Polarisation and Faraday rotation measure imaging at metre wavelengths with sub-arcsecond resolution: a foundational calibration strategy

astro-ph.IM · 2026-06-16 · unverdicted · novelty 7.0

A calibration strategy using full-Jones corrections with an in-field unpolarised calibrator and visibility-based multi-epoch alignment enables sub-arcsecond polarimetric imaging with LOFAR at metre wavelengths.

TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics

cs.OS · 2026-05-18 · unverdicted · novelty 7.0

TIDAL recovers temporal phase signals from LLM-derived semantics of provisioning metadata to enable complementary CVD placement, reducing overload frequency by 79.1% on production traces.

TabPFN-MT: A Natively Multitask In-Context Learner for Tabular Data

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

TabPFN-MT is a multitask in-context learner for tabular data that sets a new state-of-the-art on deep multitask learning for datasets under 1000 samples while reducing inference cost from O(T) to O(1) passes.

The Nova Synthetic Data Base: A Principal Component/AI Analysis of Novae Synoptic Spectra

astro-ph.SR · 2026-05-14 · unverdicted · novelty 7.0

Presents the first public synthetic spectra database for novae and demonstrates a PCA/AI framework for retrieving physical properties from limited spectral data as a proof of concept for future surveys.

MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

MulTaBench is a new collection of 40 image-tabular and text-tabular datasets designed to test target-aware representation tuning in multimodal tabular models.

LLM-guided Semi-Supervised Approaches for Social Media Crisis Data Classification

cs.AI · 2026-05-08 · conditional · novelty 7.0

LG-CoTrain, an LLM-guided co-training method, outperforms classical semi-supervised baselines for crisis tweet classification in low-resource settings with 5-25 labeled examples per class.

Intrinsic effective sample size for manifold-valued Markov chain Monte Carlo via kernel discrepancy

stat.ML · 2026-05-05 · unverdicted · novelty 7.0

An intrinsic effective sample size for manifold MCMC is defined via kernel discrepancy as the number of independent draws yielding equivalent expected squared discrepancy to the target.

Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks

cs.LG · 2026-05-02 · unverdicted · novelty 7.0

EstGraph benchmark evaluates LLMs on estimating properties of very large graphs from random-walk samples that fit in context limits.

Profile Likelihood Inference for Anisotropic Hyperbolic Wrapped Normal Models on Hyperbolic Space

math.ST · 2026-05-01 · unverdicted · novelty 7.0

The profile maximum likelihood estimator for the location in anisotropic hyperbolic wrapped normal models is strongly consistent, asymptotically normal, and attains the Hájek-Le Cam minimax lower bound under squared geodesic loss.

SynQL: A Controllable and Scalable Rule-Based Framework for SQL Workload Synthesis for Performance Benchmarking

cs.DB · 2026-04-09 · unverdicted · novelty 7.0

SynQL synthesizes diverse, execution-ready SQL workloads by deterministically traversing foreign-key graphs to populate ASTs, yielding high topological entropy and cost-model training data with R² ≥ 0.79 on held-out sets.

A Perfect Storm: First-Nature Geography and Economic Development

econ.GN · 2024-08-01 · unverdicted · novelty 7.0

A 1825 storm created a new sea connection in Denmark, producing a 27 percent population increase (elasticity 1.6 to market access) driven by fertility and occupational change toward fishing and manufacturing, with symmetric medieval declines after waterway closure.

A welding penetration prediction model for laser welding process based on self-supervised learning using physics-informed neural networks

cs.CV · 2026-06-24 · unverdicted · novelty 6.0

SimPhysNet achieves 96.06% accuracy classifying laser welding penetration states using self-supervised contrastive learning with a physics-informed neural network and prototypical networks on only 200 labeled images.

Are We Lost in the Woods? Detecting Silent Semantic Faults for Random Forest Classifiers with Data-informed Static Analysis

cs.SE · 2026-06-05 · unverdicted · novelty 6.0

dille detects silent semantic faults in random forest ML pipelines with 91% precision via data-informed static analysis on Kaggle notebooks, finding 12-18% of scripts affected.

Reactivity-Informed Machine Learning for Performance Prediction and Design Space Exploration of Alkali-Activated Slag

cond-mat.mtrl-sci · 2026-06-04 · unverdicted · novelty 6.0

Machine learning on the largest curated alkali-activated slag dataset shows that average metal oxide dissociation energy serves as a compact, physically interpretable reactivity descriptor enabling strength prediction and low-emission design space exploration.

Skew-adaptive conformal prediction

stat.ML · 2026-05-15 · unverdicted · novelty 6.0

Develops a skew-adaptive split conformal prediction method that learns local skewness via a gauge-derived conformity score and an asinh residual model while preserving marginal validity under exchangeability.

Neural Point-Forms

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

Neural point-forms are introduced as permutation-invariant neural layers that output learned form-comparison matrices for point clouds, with a claimed consistency proof under sampling and manifold assumptions and competitive results on synthetic and biological data.

Nonparametric inference for sublevel-set probabilities of conditional average treatment effect functions

stat.ME · 2026-05-14 · unverdicted · novelty 6.0

Develops Grenander-type and debiased machine learning estimators for the sublevel-set probability curve of the CATE function, shown to be non-pathwise differentiable, along with its piecewise linear approximation.

Semantic Feature Segmentation for Interpretable Predictive Maintenance in Complex Systems

cs.AI · 2026-05-14 · unverdicted · novelty 6.0

Semantic segmentation decomposes monitoring features into canonical and residual components that concentrate fault-predictive information while preserving operational meaning in predictive maintenance.

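Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

cs.CL · 2026-05-01 · unverdicted · novelty 6.0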

Machine translation preserves embedding similarity structure for ten languages but distorts it for four in the Manifesto Corpus, via a new non-inferiority testing framework.

Vesselpose: Vessel Graph Reconstruction from Learned Voxel-wise Direction Vectors in 3D Vascular Images

cs.CV · 2026-05-01 · unverdicted · novelty 6.0

Vesselpose predicts voxel-wise direction vectors to extend the TEASAR algorithm for topologically accurate vascular graph reconstruction from 3D images.

RCProb: Probabilistic Rule Extraction for Efficient Simplification of Tree Ensembles

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

RCProb uses Dirichlet-smoothed class priors and Beta-smoothed condition likelihoods in a Naive Bayes formulation to extract rules from tree ensembles approximately 22 times faster than RuleCOSI+ while maintaining competitive accuracy and producing more compact rule sets on 33 benchmark datasets.

StarCLR: Contrastive Learning Representation for Astronomical Light Curves

astro-ph.SR · 2026-04-27 · conditional · novelty 6.0

StarCLR pretrains on TESS light curves via contrastive learning on overlapping subsequences and improves variable star classification F1 scores over scratch-trained models when fine-tuned on TESS, ZTF, and Gaia.

Resource-Lean Lexicon Induction for German Dialects

cs.CL · 2026-04-26 · accept · novelty 6.0

Random forests on string similarity features outperform LLMs for German dialect lexicon induction and boost dialect information retrieval by up to 50% in recall.

citing papers explorer

Showing 6 of 6 citing papers after filters.

Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

cs.CL · 2026-05-01 · unverdicted · none · ref 12

Machine translation preserves embedding similarity structure for ten languages but distorts it for four in the Manifesto Corpus, via a new non-inferiority testing framework.

Data-Efficient Indentation Size Effect Correction in Steels Using Machine Learning and Physics-Guided Augmentation

cond-mat.mtrl-sci · 2026-04-30 · unverdicted · none · ref 49

Physics-guided data augmentation combined with neural networks enables accurate indentation size effect correction in steels from small sets of shallow nanoindentation measurements, outperforming Nix-Gao in the shallow regime.

Is the 'Known' Enough? An Integrated Machine Learning Framework for Eclipsing Binary Classification and Parameter Estimation Based on Well-Characterized Systems

astro-ph.SR · 2026-04-21 · conditional · none · ref 3

An ensemble ML framework achieves 90.7% morphology classification accuracy and R² values of 0.77–0.92 for key parameters on held-out test data, with external validation against OGLE and Kepler catalogs.

The T16 Planet Hunt: 10,000 New Planet Candidates from TESS Cycle 1 and the Confirmation of a Hot Jupiter Around TIC 183374187

astro-ph.EP · 2026-04-20 · conditional · none · ref 10

A transit search on TESS Cycle 1 full-frame images produced 10,091 new planet candidates down to T=16 mag, more than doubling the known TESS total, with one hot Jupiter confirmed by radial velocity.

Impact of Validation Strategy on Machine Learning Performance in EEG-Based Alcoholism Classification

eess.SP · 2026-04-11 · unverdicted · none · ref 17

Nested cross-validation reveals optimistic bias in standard validation for EEG alcoholism classification, with AdaBoost reaching 78.3% accuracy and most model differences not statistically significant per McNemar's test.

An Explainable Unsupervised-to-Supervised Machine Learning Framework for Dietary Pattern Discovery Using UK National Dietary Survey Data

q-bio.QM · 2026-05-07 · unverdicted · none · ref 26

An unsupervised-to-supervised ML pipeline on UK NDNS data discovers four dietary patterns, reproduces them with macro-F1 0.963 using a surrogate classifier, and interprets them via SHAP for potential clinical use.
