FLOATBench is a tabular benchmark dataset with 582,120 fatigue labels from 19,404 OpenFAST simulations of three 22 MW FOWT towers, featuring alpha-shape regime partitioning and three evaluation protocols for surrogate models.
Statist. 6, 461. DOI: 10.1214/aos/1176344136
Tool reference. 71% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.
representative citing papers
A new benchmark with cognitive traps shows frontier deep research agents achieve only 13-16% acceptance on expert consulting tasks under combined verifier and rubric criteria.
First use of the learned harmonic mean estimator for Bayesian model selection across circular/eccentric, white-noise/GP, and trend variants in radial velocity exoplanet analyses.
First measurement of ^4_ΛHe yields in 3 GeV Au+Au collisions shows consistency with ^4_ΛH yields and JAM coalescence model while thermal model overpredicts absolute yields.
Direct fixed-weight solver for free-support Wasserstein medians relocates atoms using OT barycentric projections and inverse-distance weights, achieving monotone descent on smoothed objectives with fewer subproblems than nested Weiszfeld baselines.
Nonthermal line broadening at solar flare footpoints is primarily field-aligned, demonstrated by systematic decrease in line widths from disk center to limb across 4,593 Hinode/EIS spectra from 407 flares.
EML-CD recovers causal DAG structure and closed-form mechanisms via gated EML trees, matching PC/GES SHD on Sachs data while recovering 10 of 11 function families in bivariate tests and outperforming SINDy on mechanism f-MSE.
A dual-encoder deepfake detector pairs a frozen specialist with a LoRA-tuned MLLM, trained first via binary alignment then via RL to reward explain-then-classify behavior, yielding improved cross-dataset performance and interpretability.
ProactBench measures LLM conversational proactivity in three phases using 198 multi-agent dialogues and finds recovery behavior hard to predict from existing benchmarks.
Zero-noise extrapolation has a finite-shot help-harm boundary below which it increases local mean-squared error due to variance penalties outweighing bias reduction.
JudgeSense benchmark shows LLM judge consistency does not reliably improve with model scale, with coherence most sensitive to prompt changes and factuality more stable.
Causal Process Models reframe dynamic causal graph discovery as multi-agent reinforcement learning to build sparse time-varying graphs only at active interactions, outperforming dense baselines on physical prediction.
vsOED uses a variational one-point reward and RL policy optimization to provide a lower bound on expected information gain for sequential experimental design, supporting nuisance parameters, implicit likelihoods, and multiple design goals.
Kolmogorov n-width theory plus PRESS statistics yield closed-form optimal spline resolution; KORE estimates bias/noise scales from two pilots and matches CV performance with far fewer fits.
Pre-registered validation of an ML Na-cathode voltage screen yields 0.67 V MAE against experiment; the Materials Project PBE+U references run 0.54 V low and dominate the error.
A timing-based data-driven method to measure single-particle corner-clipping probabilities in segmented detectors, validated on Pierre Auger Underground Muon Detector simulations and parameterized by an analytical model.
Proposes and analyzes a homogeneity test using squared L2 distance of empirical EOT maps to uniform-on-ball reference, with FCLT, Gaussian quadratic null limit, consistency, local power, and weighted multiplier bootstrap.
TypedCSIP applies typed counterfactual selective intervention pretraining on expert revisions to lift macro-F1 by 0.9-1.3 pp on the LCR-CN Chinese legislative conflict classification benchmark under a pre-registered multi-seed test.
RankElastor mitigates embedding collapse via spectrum-robust token mixing and GLU-based P-FFNs, yielding better performance and scaling on industrial recommendation datasets.
COO co-optimizes orbitals with TrimCI to absorb many-body correlations into the basis, cutting determinant count by orders of magnitude for iron-sulfur clusters versus localized bases or DMRG.
Proposes adaptive multiple importance sampling for robust Bayesian model evidence estimation under parameter non-identifiability, shown to outperform deterministic methods on ecological case studies while being cheaper than MCMC.
A moment-based alternative to OLS for fractional polynomials achieves closed-form variance reduction for skewed errors by the factor g2 = 1 - gamma3^2/(2+gamma4) while preserving coverage and reverting to OLS under symmetry.
A Bayesian model for multi-feature contact matrices that uses tensor structures and contingency table theory to satisfy structural constraints and impute missing contact features, validated on simulations and US/German survey data.
Bayesian procedures are derived to compute the posterior probability that a recoverable process is currently in control or that a drifting latent parameter lies in an acceptable region.
citing papers explorer
Density Evolution: A Multiscale View of Density Estimation
A review reframing density estimation as 'density evolution' across scales, linking kernel smoothing to heat flow, mixtures to compression, and topology to level sets, while stating three structural results on modes, Gaussian semigroups, and log-concavity.