TopoFisher optimizes trainable filtrations, vectorizations, and compressors in persistent homology to maximize Fisher information, yielding higher information than fixed cosmological summaries and approaching neural baselines with far fewer parameters while generalizing better under simulator shifts.
38 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 12 cs.CL 6 stat.ML 4 astro-ph.EP 2 astro-ph.GA 2 astro-ph.SR 2 cs.AI 2 astro-ph.HE 1 cs.CV 1 math.OC 1
roles
background 1
polarities
unclear 1
citing papers explorer
-
TopoFisher: Learning Topological Summary Statistics by Maximizing Fisher Information
TopoFisher optimizes trainable filtrations, vectorizations, and compressors in persistent homology to maximize Fisher information, yielding higher information than fixed cosmological summaries and approaching neural baselines with far fewer parameters while generalizing better under simulator shifts.
-
From Circuit Evidence to Mechanistic Theory: An Inductive Logic Approach
Introduces Causal Functional Signatures grounded in causal evidence and ILP-learned architectural signatures to enable explicit, comparable, and portable mechanistic claims across model scales.
-
Witnessing the rapid growth of disk galaxies over cosmic time using JWST and HST
At z=1, disk galaxies exhibit U-shaped stellar age profiles with a turnover at the edge, indicating inside-out growth with an approximately 300% mass increase in the outer regions from z=1 to z=0.
-
Near-IR Weak-lensing (NIRWL) Measurements in the CANDELS Fields. II. Mass Mapping and Overdensity Characterization
First near-IR weak-lensing analysis of CANDELS fields detects 12 shear-selected overdensities with masses 0.2-2.2 x 10^14 solar masses at redshifts 0.22-0.9 and mean z=0.68.
-
Network-Based Interventions for HIV Prevention via Cascade-Aware Suppression of Transmission
Develops CAST, a polynomial-time approximation algorithm for selecting k individuals for HIV treatment in a network to minimize expected transmission cascades, achieving a 2√|P| approximation ratio.
-
LLM-Driven Performance-Space Augmentation for Meta-Learning-Based Algorithm Selection
LLM-generated synthetic datasets steered uniformly across a 2D performance space defined by two landmark algorithms improve meta-learner performance on algorithm selection for regression tasks.
-
Beyond the Independence Assumption: Finite-Sample Guarantees for Deep Q-Learning under $\tau$-Mixing
Finite-sample risk bounds for DQN with ReLU networks are extended to τ-mixing data, showing an extra dimensionality penalty in the convergence rate due to dependence.
-
iGENE: A Differentiable Flux-Tube Gyrokinetic Code in TensorFlow
A fully differentiable TensorFlow gyrokinetic code allows approximate gradients of nonlinear turbulence quantities to be used for outer-loop tasks such as profile prediction despite stochasticity.
-
Soft-MSM: Differentiable Context-Aware Elastic Alignment for Time Series
Soft-MSM is a smooth, gradient-enabled version of the context-aware MSM distance for time series alignment that outperforms Soft-DTW alternatives in clustering and nearest-centroid classification.
-
Eliciting Latent Predictions from Transformers with the Tuned Lens
Training per-layer affine probes on frozen transformers yields more reliable latent predictions than the logit lens and enables detection of malicious inputs from prediction trajectories.
-
Prospects for detecting surface color heterogeneity on asteroid surfaces from sparse multiband photometric survey data
The paper proposes a statistical test for asteroid surface color heterogeneity from sparse multiband photometry and evaluates its performance and sensitivity to model errors through Monte Carlo simulations of synthetic asteroids.
-
How does feature learning reshape the function space?
In the high-dimensional proportional regime, a large gradient step on a two-layer network induces a target-dependent spiked Gaussian covariance on the features, yielding a data-adaptive kernel that amplifies target-aligned eigenvalues and mixes leading eigenfunctions.
-
Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents
A dual hierarchical RL framework with two agents coordinates high-level dialogue strategy and low-level question generation to emulate judicial questioning and extract key information from Supreme Court arguments, outperforming baselines.
-
BoolXLLM: LLM-Assisted Explainability for Boolean Models
BoolXLLM augments an existing Boolean rule learner with LLMs for feature selection, discretization thresholds, and natural-language rule translation to improve interpretability while preserving accuracy.
-
Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation
Dynamic Meta-Metrics learns source-sentence conditioned combinations of MT metrics, with MLP-based and soft-conditioned versions showing gains over linear and GP ensembles on WMT data.
-
More Aligned, Less Diverse? Analyzing the Grammar and Lexicon of Two Generations of LLMs
Newer LLMs exhibit reduced syntactic and lexical diversity in English news text generation compared to older models, as measured by HPSG grammar and diversity metrics from ecology and information theory, while human-authored text shows little change.
-
CAST: Mitigating Object Hallucination in Large Vision-Language Models via Caption-Guided Visual Attention Steering
CAST reduces object hallucination in LVLMs by 6.03% on average across five models and five benchmarks by identifying caption-sensitive attention heads and applying optimized steering directions to their outputs, with negligible added inference cost.
-
Mitigating stellar radial velocity jitter using orthogonal activity indices and a time-aware neural network
A time-aware convolutional attention network trained on StarSim synthetic spectra reduces stellar activity radial velocity jitter to 52.5% and 62.4% of original levels in HARPS and CARMENES data for epsilon Eridani and TZ Arietis.
-
Concentration and Calibration in Predictive Bayesian Inference
Predictive Bayesian inference posteriors concentrate onto a forward-model-dependent quantity and produce miscalibrated credible sets unless the predictive model contains the true data-generating process.
-
Finding Meaning in Embeddings: Concept Separation Curves
Concept Separation Curves provide a classifier-independent method to visualize and quantify how sentence embeddings distinguish conceptual meaning from syntactic variations across languages and domains.
-
Unsupervised Confidence Calibration for Reasoning LLMs from a Single Generation
Unsupervised single-generation confidence calibration for reasoning LLMs via offline self-consistency proxy distillation outperforms baselines on math and QA tasks and improves selective prediction.
-
ParamBoost: Gradient Boosted Piecewise Cubic Polynomials
ParamBoost improves GAMs by fitting piecewise cubic polynomials via gradient boosting and supports constraints for continuity, monotonicity, convexity, and feature interactions.
-
Steering Llama 2 via Contrastive Activation Addition
Contrastive Activation Addition steers Llama 2 Chat by adding averaged residual-stream activation differences from contrastive example pairs to control targeted behaviors at inference time.
-
Reviving Error Correction in Modern Deep Time-Series Forecasting
UEC-STD is an architecture-agnostic corrector that uses seasonal-trend decomposition to mitigate autoregressive error accumulation in deep forecasters and reports gains across 4 backbones and 10 datasets.
-
Precise and Rapid Parameter Inference of Kilonova with Conditional Variational Autoencoder
A conditional variational autoencoder is trained on public kilonova light curves to enable rapid parameter inference for binary neutron star merger models in under three hours total.
-
Explainable AI Isn't Enough! Rethinking Algorithmic Contestability
The paper defines algorithmic contestability as identifying evidence to overturn potentially incorrect decisions and identifies three types of such evidence that make decisions normatively indefensible under the decision maker's standards.
-
Smooth Multi-Policy Causal Effect Estimation in Longitudinal Settings
PEQ-Net uses policy-aware reparameterization of ICE Q-functions and kernel mean embeddings in a shared encoder, followed by LTMLE, to jointly estimate multiple policies while constraining second-order bias for lower variance.
-
bde: A Python Package for Bayesian Deep Ensembles via MILE
bde is a new Python package that implements Bayesian deep ensembles via efficient JAX-based Microcanonical Langevin Ensembles for tabular regression and classification with uncertainty estimates.
-
A Unified Approach for Computing Wasserstein Barycenters of Discrete and Continuous Measures
A mirror descent algorithm computes exact Wasserstein barycenters for mixed discrete and continuous input measures with convergence guarantees.
-
A Mean Curvature Approach to Boundary Detection: Geometric Insights for Unsupervised Learning
MCBP detects boundaries by computing discrete mean curvature from k-nearest neighbor patches on the data manifold, then decomposes data into low-curvature smooth and high-curvature boundary subsets to improve clustering.
-
NeuralSet: A High-Performing Python Package for Neuro-AI
NeuralSet is a scalable Python framework that unifies diverse neural recordings and stimuli with deep learning embeddings via metadata decoupling and lazy data extraction.
-
Predicting the thermodynamics in the chromosphere from the translation of SDO data into the IRIS$^{2}$ inversion results using a visual transformer model
A visual transformer model trained on IRIS inversions predicts chromospheric temperature and density from SDO data with correlations around 0.8 on 80% of test cases.
-
Self-Improving Tabular Language Models via Iterative Reward-Guided Post-Training
TabGRAA applies group-relative advantage alignment in an iterative reward-guided post-training loop to improve tabular language model generators on fidelity, utility, and privacy trade-offs across five benchmarks.
-
Benchmarking Machine Learning Architectures for Antimicrobial Stewardship in Pediatric ICUs
Benchmarking in pediatric ICU antimicrobial stewardship shows performance depends mainly on target prevalence and dataset traits rather than model complexity, with sequence models improving precision-recall at 24-hour resolution but showing poorer calibration than tabular models.
-
Making Uncertainty Visible: Multiverse Analysis for Robust Computational Social Science
Multiverse analysis of three published CSS studies reveals substantial variation in findings across methodological decision combinations and identifies cases of computational failure not reported in originals.
-
Observational Signatures and Constraints on the Intermediate Neutron-Capture Process. The Case of the CEMP star TYC 6044-714-1 (RAVE J094921.8-161722)
High-precision analysis of TYC 6044-714-1 favors s+r nucleosynthesis over i-process models, which require implausible conditions and mismatch Ba isotopes.
-
Trustworthy AI Suffers from Invariance Conflicts and Causality is The Solution
Causality resolves trade-offs in trustworthy AI by treating them as invariance conflicts under different data-generating process changes.
-
Fine-Tuning Pre-Trained Code Models for AI-Generated Code Detection
Fine-tuning CodeBERT, GraphCodeBERT, UniXcoder and CodeT5+ with augmentation, cross-validation and ensembling yields macro-F1 of 0.737 on binary human-vs-AI code detection and 0.422 on 11-class model attribution in SemEval-2026 Task 13.