Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman
16 Pith papers cite this work, all dated 2026.
citing papers explorer
- Zero-Shot Active Feature Acquisition via LLM-Elicitation
  A framework elicits discriminative MRF statistics from an LLM and closes the model via maximum entropy to enable zero-shot active feature acquisition, outperforming baselines on IBD patient data, especially on the hardest cases.
- Declarative Outcome-Conformant Synthesis: Exact, Closed-Form Specification Satisfaction and a Conformance Benchmark
  Defines outcome-conformant synthesis as exact closed-form generation of relational data matching declared aggregates via Gamma conditional-sum sampling, introduces SpecBench for measuring conformance, and shows it is orthogonal to fidelity.
- Exactness of the DNN Relaxation for Random Standard Quadratic Programs
  Under independence and tail conditions on random symmetric matrices, the DNN relaxation of the standard quadratic program is exact with probability tending to 1, the optimizer is unique and rank one, and recoverable in O(n^2) time.
- Statistical Model Checking of the Keynes+Schumpeter Model: A Transient Sensitivity Analysis of a Macroeconomic ABM
  Statistical model checking on the K+S model shows macro-financial and structural parameters produce stronger transient effects on unemployment and GDP growth than heuristic-rule parameters under fixed precision policies.
- Simultaneous false discovery rate control in location families
  A generalization of the Benjamini-Hochberg procedure controls the FDR curve below any specified level in location families, and the standard procedure simultaneously controls the entire curve for free.
- Instance-Adaptive Online Multicalibration
  A single algorithm for online multicalibration achieves instance-adaptive rates by dynamically refining a dyadic prediction grid, recovering the worst-case Õ(T^{2/3}) bound and improving to Õ(√T) in marginal stochastic settings and Õ(√(JT)) for J-piecewise stationary means.
- Power one sequential tests exist for weakly compact $\mathscr P$ against $\mathscr P^c$
  Power-one sequential tests exist for testing any weakly compact null set of distributions against its complement.
- Revisiting the Behrens-Fisher Problem: Validity-First Optimality
  The IM interval is the shortest valid prior-free procedure for the Behrens-Fisher problem, established via cylindrical predictive random sets, minimaxity, admissibility, and a projection argument.
- Decision-Theoretic Stopping Rules for Document Screening
  Derives three EVPI-based stopping policies for document screening and shows higher net utility than recall-target methods on CLEF-IP and medical review datasets.
- Optimal sequential two-stage Bayes Factor Design for two-arm clinical Phase II Trials with binary Endpoints
  Derives exact operating characteristic corrections and a numerical search over sample sizes to obtain optimal two-stage Bayes factor designs for two-arm binary-endpoint phase II trials that minimize expected sample size under the null.
- Comparing Two Categorical Gini Correlations with Applications to Classification Problems
  Proposes an inferential framework to test differences in categorical Gini correlations for predictor importance in classification, establishing asymptotic normality and consistency while accommodating unequal dimensions and dependence.
- The Autocorrelation Blind Spot: Why 42% of Turn-Level Findings in LLM Conversation Analysis May Be Spurious
  42% of significant turn-level associations in LLM conversation analysis are spurious due to unaccounted autocorrelation, with a validated two-stage correction framework improving replication.
- Exploring Statistical Change Point Detection Techniques for Performance Anomaly Detection at Mozilla
  Ensemble voting strategies for change point detection improve F1-score by 11% over Mozilla's T-test method on a new ground-truth dataset of 174 performance time series annotated by practitioners.
- When Should Forecasting Models Be Re-Specified? A Cost-Sensitive Trigger for Adaptive Model-Form Updating
  Proposes a cost-sensitive trigger, based on specification debt, for deciding when to re-specify forecasting model forms; on M4 data it matches full-update accuracy at 28% of the compute cost.
- Statistical Model Checking of the Island Model: An Established Economic Agent-Based Model of Endogenous Growth
  Statistical model checking reproduces key stylized facts of the Island Model with confidence intervals, confirms moderate exploration rates are optimal, and enables counterfactual sensitivity analysis across parameters.
- Scalable Uncertainty Reasoning in Knowledge Graphs
  The thesis proposes specialized algebraic, logical, and geometric methods to enable scalable reasoning over imprecise attributes, probabilistic triples, and incomplete schemas in knowledge graphs.