Communications of the ACM
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- UD-DML: Uniform Design Subsampling for Double Machine Learning over Massive Data
  UD-DML creates balanced representative subsamples via uniform design in PCA space for efficient double machine learning estimation of average treatment effects on large datasets.
- Retrieval with Multiple Query Vectors through Anomalous Pattern Detection
  A retrieval approach identifies anomalous dimensions in a set of query vectors and retrieves database vectors that are anomalous across those dimensions, with performance improving as query set size grows to around 8.
- Multiscale Cochran-Mantel-Haenszel Scanning for Conditional Dependency
  Multiscale CMH scanning generalizes the classic test to continuous spaces, achieving consistency for conditional independence testing by conditioning on marginal order statistics without requiring large stratum sizes.
- Calibrating Model-Based Evaluation Metrics for Summarization
  A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.