Watanabe, Tree-structured Parzen estimator: Understanding its algorithm components and their roles for better empirical performance (2023)
18 Pith papers cite this work.
abstract
Recent scientific advances require complex experiment design, necessitating the meticulous tuning of many experiment parameters. Tree-structured Parzen estimator (TPE) is a widely used Bayesian optimization method in recent parameter tuning frameworks such as Hyperopt and Optuna. Despite its popularity, the roles of each control parameter in TPE and the algorithm intuition have not been discussed so far. The goal of this paper is to identify the roles of each control parameter and their impacts on parameter tuning based on the ablation studies using diverse benchmark datasets. The recommended setting concluded from the ablation studies is demonstrated to improve the performance of TPE. Our TPE implementation used in this paper is available at https://github.com/nabenabe0928/tpe/tree/single-opt. OptunaHub now provides our standalone TPE implementation at https://hub.optuna.org/samplers/tpe_tutorial/.
citing papers explorer
-
ArgBench: Benchmarking LLMs on Computational Argumentation Tasks
ArgBench unifies 33 existing datasets into a standardized benchmark for testing LLMs across 46 argumentation tasks and analyzes the impact of prompting techniques and model factors on performance.
-
Concise and Logically Consistent Conformal Sets for Neuro-Symbolic Concept-Based Models
COCOCO is a conformal framework for NeSy-CBMs that jointly conformalizes concepts and labels, reconciles them via deduction-abduction revision, and satisfies consistency, coverage, and conciseness while retaining distribution-free guarantees.
-
PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts
PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.
-
From Clever Hans to Scientific Discovery: Interpreting EEG Foundational Transformers with LRP
LRP on EEG transformers reveals Clever Hans artifacts in motor imagery tasks and a recurring central electrode cluster as a candidate sensorimotor signature of arousal.
-
Generative Flow Networks for Model Adaptation in Digital Twins of Natural Systems
GFlowNets sample multiple valid mechanistic simulator configurations for digital twin adaptation, recovering main parameter regions and preserving uncertainty in a tomato model case study.
-
FluidFlow: a flow-matching generative model for fluid dynamics surrogates on unstructured meshes
FluidFlow uses conditional flow-matching with U-Net and DiT architectures to predict pressure and friction coefficients on airfoils and 3D aircraft meshes, outperforming MLP baselines with better generalization.
-
PENEX: AdaBoost-Inspired Neural Network Regularization
PENEX is a new formulation of the multi-class exponential loss for neural networks that supports first-order optimization and improves generalization in low-data regimes.
-
A Leaf-Level Dataset for Soybean-Cotton Detection and Segmentation
A new leaf-instance dataset for soybean-cotton detection and segmentation collected across growth stages and conditions from commercial farms is presented and validated with YOLOv11.
-
Towards Autonomous Commissioning of Industrial Drives via Multi-Objective Bayesian Optimization
Multi-objective Bayesian optimization with TPE tunes industrial drive current controllers to expert-level performance in minutes on real hardware without a model or firmware changes.
-
Toto 2.0: Time Series Forecasting Enters the Scaling Era
Time series foundation models scale under a single training recipe, with forecast quality improving from 4M to 2.5B parameters and new SOTA results on BOOM, GIFT-Eval, and TIME benchmarks.
-
Inferring identified hadron production in $pp$ collisions with physics-informed machine learning at the LHC
A physics-informed neural network infers pT spectra of pi, K, p, Lambda, and Ks in unmeasured rapidity regions from PYTHIA8 pp collisions at 13.6 TeV, achieving 1.5-5.83% yield uncertainties while reproducing yield ratios and freeze-out parameters.
-
ORTHOBO: Orthogonal Bayesian Hyperparameter Optimization
OrthoBO introduces an orthogonal acquisition estimator subtracting an optimally weighted score-function control variate to reduce Monte Carlo variance, preserve the acquisition target, and improve ranking stability in Bayesian hyperparameter optimization.
-
Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
PLENA introduces a co-designed system with three optimization pathways for long-context agentic LLM inference, claiming up to 2.23x throughput over A100 and 4.04x energy efficiency.
-
Survival of the Cheapest: Cost-Aware Hardware Adaptation for Adversarial Robustness
A decision-support framework applies AFT models to show Nvidia L4 GPUs yield 20% longer adversarial survival time at 75% lower cost than V100, with inference latency as the strongest robustness predictor.
-
Optimizing Memory Allocation in Distributed Clusters with Predictive Modeling
A quantile-regression ensemble with safety factor reduces under-allocated jobs from 4.17% to 2.89% and average overallocation from 148% to 44.51% on SAP build data.
-
Data-Driven Reduction of Fault Location Errors in Onshore Wind Farm Collectors
A Gated Residual Network correction model reduces fault location error by 76% in simulated onshore wind farm collector networks compared to state-of-the-art methods.
-
Minimal Data, Maximum Clarity: A Heuristic for Explaining Optimization
EZR combines active Naive Bayes sampling and decision-tree distillation to reach over 90% of best-known multi-objective optimization performance on 60 datasets while producing clearer explanations than LIME, SHAP or BreakDown.
-
Transformer-Based Active Learning for Data-Efficient Vaccine Epitope Selection in PRRS
Transformer models under active learning classify high-binding epitopes from a small docking dataset more accurately than random sampling or other architectures in low-data regimes for PRRS.