Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman

Bancroft, T · 1944 · arXiv aoms/1177731

16 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2 · method 1

citation-polarity summary

background 1 · support 1 · use method 1

representative citing papers

Zero-Shot Active Feature Acquisition via LLM-Elicitation

cs.LG · 2026-06-17 · unverdicted · novelty 7.0

A framework elicits discriminative MRF statistics from an LLM and closes the model via maximum entropy to enable zero-shot active feature acquisition, outperforming baselines on IBD patient data, especially on the hardest cases.

Declarative Outcome-Conformant Synthesis: Exact, Closed-Form Specification Satisfaction and a Conformance Benchmark

cs.LG · 2026-06-07 · unverdicted · novelty 7.0

Defines outcome-conformant synthesis as exact closed-form generation of relational data matching declared aggregates via Gamma conditional-sum sampling, introduces SpecBench for measuring conformance, and shows it is orthogonal to fidelity.

Exactness of the DNN Relaxation for Random Standard Quadratic Programs

math.OC · 2026-05-12 · unverdicted · novelty 7.0

Under independence and tail conditions on random symmetric matrices, the DNN relaxation of the standard quadratic program is exact with probability tending to 1, the optimizer is unique and rank one, and recoverable in O(n^2) time.

Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM

cs.MA · 2026-05-11 · unverdicted · novelty 7.0

Statistical model checking on the K+S model shows macro-financial and structural parameters produce stronger transient effects on unemployment and GDP growth than heuristic-rule parameters under fixed precision policies.

Simultaneous false discovery rate control in location families

stat.ME · 2026-05-10 · unverdicted · novelty 7.0

A generalization of the Benjamini-Hochberg procedure controls the FDR curve below any specified level in location families, and the standard procedure simultaneously controls the entire curve for free.

Instance-Adaptive Online Multicalibration

cs.LG · 2026-05-10 · unverdicted · novelty 7.0 · 2 refs

A single algorithm for online multicalibration achieves instance-adaptive rates by dynamically refining a dyadic prediction grid, recovering the worst-case Õ(T^{2/3}) bound and improving to Õ(√T) in marginal stochastic settings and Õ(√(JT)) for J-piecewise stationary means.

Power one sequential tests exist for weakly compact $\mathscr P$ against $\mathscr P^c$

math.ST · 2026-04-03 · unverdicted · novelty 7.0

Power-one sequential tests exist for testing any weakly compact null set of distributions against its complement.

Revisiting the Behrens-Fisher Problem: Validity-First Optimality

math.ST · 2026-06-05 · unverdicted · novelty 6.0

The IM interval is the shortest valid prior-free procedure for the Behrens-Fisher problem, established via cylindrical predictive random sets, minimaxity, admissibility, and a projection argument.

Decision-Theoretic Stopping Rules for Document Screening

cs.IR · 2026-06-05 · unverdicted · novelty 6.0

Derives three EVPI-based stopping policies for document screening and shows higher net utility than recall-target methods on CLEF-IP and medical review datasets.

Optimal sequential two-stage Bayes Factor Design for two-arm clinical Phase II Trials with binary Endpoints

stat.ME · 2026-06-01 · unverdicted · novelty 6.0

Derives exact operating characteristic corrections and a numerical search over sample sizes to obtain optimal two-stage Bayes factor designs for two-arm binary-endpoint phase II trials that minimize expected sample size under the null.

Comparing Two Categorical Gini Correlations with Applications to Classification Problems

stat.ME · 2026-05-18 · unverdicted · novelty 6.0

Proposes an inferential framework to test differences in categorical Gini correlations for predictor importance in classification, establishing asymptotic normality and consistency while accommodating unequal dimensions and dependence.

The Autocorrelation Blind Spot: Why 42% of Turn-Level Findings in LLM Conversation Analysis May Be Spurious

cs.CL · 2026-04-15 · accept · novelty 6.0

42% of significant turn-level associations in LLM conversation analysis are spurious due to unaccounted autocorrelation, with a validated two-stage correction framework improving replication.

Exploring Statistical Change Point Detection Techniques for Performance Anomaly Detection at Mozilla

cs.SE · 2026-06-16 · unverdicted · novelty 5.0

Ensemble voting strategies for change point detection improve F1-score by 11% over Mozilla's T-test method on a new ground-truth dataset of 174 performance time series annotated by practitioners.

When Should Forecasting Models Be Re-Specified? A Cost-Sensitive Trigger for Adaptive Model-Form Updating

stat.AP · 2026-06-04 · unverdicted · novelty 5.0

A cost-sensitive trigger using specification debt for deciding when to re-specify forecasting model forms, shown on M4 data to match full-update accuracy at 28% of the compute cost.

Statistical Model Checking of the Island Model: An Established Economic Agent-Based Model of Endogenous Growth

cs.MA · 2026-04-06 · unverdicted · novelty 5.0

Statistical model checking reproduces key stylized facts of the Island Model with confidence intervals, confirms moderate exploration rates are optimal, and enables counterfactual sensitivity analysis across parameters.

Scalable Uncertainty Reasoning in Knowledge Graphs

cs.AI · 2026-05-15 · unverdicted · novelty 4.0

The thesis proposes specialized algebraic, logical, and geometric methods to enable scalable reasoning over imprecise attributes, probabilistic triples, and incomplete schemas in knowledge graphs.

citing papers explorer


Zero-Shot Active Feature Acquisition via LLM-Elicitation · cs.LG · 2026-06-17 · unverdicted · none · ref 39
Declarative Outcome-Conformant Synthesis: Exact, Closed-Form Specification Satisfaction and a Conformance Benchmark · cs.LG · 2026-06-07 · unverdicted · none · ref 11
Exactness of the DNN Relaxation for Random Standard Quadratic Programs · math.OC · 2026-05-12 · unverdicted · none · ref 14
Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM · cs.MA · 2026-05-11 · unverdicted · none · ref 46
Simultaneous false discovery rate control in location families · stat.ME · 2026-05-10 · unverdicted · none · ref 27
Instance-Adaptive Online Multicalibration · cs.LG · 2026-05-10 · unverdicted · none · ref 16 · 2 links
Power one sequential tests exist for weakly compact $\mathscr P$ against $\mathscr P^c$ · math.ST · 2026-04-03 · unverdicted · none · ref 37
Revisiting the Behrens-Fisher Problem: Validity-First Optimality · math.ST · 2026-06-05 · unverdicted · none · ref 23
Decision-Theoretic Stopping Rules for Document Screening · cs.IR · 2026-06-05 · unverdicted · none · ref 50
Optimal sequential two-stage Bayes Factor Design for two-arm clinical Phase II Trials with binary Endpoints · stat.ME · 2026-06-01 · unverdicted · none · ref 79
Comparing Two Categorical Gini Correlations with Applications to Classification Problems · stat.ME · 2026-05-18 · unverdicted · none · ref 22
The Autocorrelation Blind Spot: Why 42% of Turn-Level Findings in LLM Conversation Analysis May Be Spurious · cs.CL · 2026-04-15 · accept · none · ref 20
Exploring Statistical Change Point Detection Techniques for Performance Anomaly Detection at Mozilla · cs.SE · 2026-06-16 · unverdicted · none · ref 93
When Should Forecasting Models Be Re-Specified? A Cost-Sensitive Trigger for Adaptive Model-Form Updating · stat.AP · 2026-06-04 · unverdicted · none · ref 23
Statistical Model Checking of the Island Model: An Established Economic Agent-Based Model of Endogenous Growth · cs.MA · 2026-04-06 · unverdicted · none · ref 67
Scalable Uncertainty Reasoning in Knowledge Graphs · cs.AI · 2026-05-15 · unverdicted · none · ref 48
