Optuna: A Next-generation Hyperparameter Optimization Framework

Masanori Koyama; Shotaro Sano; Takeru Ohta; Takuya Akiba; Toshihiko Yanase

arxiv: 1907.10902 · v1 · pith:VDONACX5new · submitted 2019-07-25 · 💻 cs.LG · stat.ML

Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, Masanori Koyama

Pith reviewed 2026-05-24 16:19 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords: hyperparameter optimization · define-by-run · pruning strategies · distributed computing · machine learning · search space · Optuna

The pith

Optuna is the first hyperparameter optimization software designed with a define-by-run principle.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes three design criteria for next-generation hyperparameter optimization software: a define-by-run API that lets users build the search space dynamically, efficient searching and pruning strategies, and a versatile architecture for uses ranging from distributed systems to interactive notebooks. Optuna is introduced as the first framework to follow the define-by-run principle while satisfying these criteria. A sympathetic reader would care because fixed search spaces in earlier tools limit flexibility when hyperparameters depend on one another or on trial outcomes.

Core claim

Optuna implements a define-by-run API that allows users to construct the parameter search space dynamically during the optimization process, paired with efficient implementations of searching and pruning strategies and an easy-to-setup architecture that supports scalable distributed computing as well as lightweight interactive experiments. The paper presents this as the first optimization software designed around the define-by-run principle and shows its effectiveness through experimental results and real-world applications.

What carries the argument

The define-by-run API, which permits dynamic construction of the parameter search space as optimization proceeds rather than requiring it to be fixed in advance.
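A minimal sketch of what define-by-run means in practice, using only the Python standard library. `ToyTrial`, `random_search`, and the placeholder score are hypothetical names invented for illustration and are not Optuna's API. The point is only that the set of parameters is discovered while the objective runs, so the number of `n_units_l{i}` entries depends on the sampled `n_layers`.

```python
import random


class ToyTrial:
    """Hypothetical stand-in for a trial object; not Optuna's class."""

    def __init__(self, rng):
        self._rng = rng
        self.params = {}  # filled in as the objective asks for values

    def suggest_int(self, name, low, high):
        # Sample uniformly and record the parameter the moment it is requested.
        value = self._rng.randint(low, high)
        self.params[name] = value
        return value


def objective(trial):
    # The search space is whatever this function happens to ask for.
    n_layers = trial.suggest_int("n_layers", 1, 4)
    units = [trial.suggest_int(f"n_units_l{i}", 1, 128) for i in range(n_layers)]
    # Placeholder score (an assumption): prefer about 200 hidden units in total.
    return abs(sum(units) - 200)


def random_search(objective, n_trials, seed=0):
    # Plain random search; keeps the best (value, params) pair seen so far.
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        trial = ToyTrial(rng)
        value = objective(trial)
        if best is None or value < best[0]:
            best = (value, trial.params)
    return best


best_value, best_params = random_search(objective, n_trials=50)
print(best_value, best_params)
```

No search space is declared anywhere outside `objective`; a trial with three layers records three unit counts, and a trial with one layer records one.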

If this is right

Users can define conditional hyperparameters whose availability depends on values chosen earlier in a trial.
Pruning strategies can be applied efficiently because the framework knows the full trial structure at runtime.
The same codebase can run unchanged from a single interactive session to a multi-node distributed setup.
Real-world applications become feasible without rewriting the search logic for each deployment scale.
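The pruning point in the list above can be illustrated with a generic median stopping rule. This is a sketch under assumptions: the rule itself, the `MedianRule` and `Pruned` names, and the hard-coded error curves are all hypothetical, and the pruning algorithm used in the paper may differ.

```python
import statistics


class Pruned(Exception):
    """Raised when a trial is stopped early (hypothetical name)."""


class MedianRule:
    """Keeps intermediate values of completed trials and flags laggards."""

    def __init__(self):
        self.history = {}  # step -> intermediate values from completed trials

    def should_prune(self, step, value):
        seen = self.history.get(step, [])
        # With no completed trial to compare against, never prune.
        return bool(seen) and value > statistics.median(seen)

    def record(self, curve):
        for step, value in enumerate(curve):
            self.history.setdefault(step, []).append(value)


def run_trial(rule, learning_curve):
    """Replays a precomputed error curve; lower is better."""
    reported = []
    for step, error in enumerate(learning_curve):
        reported.append(error)
        if rule.should_prune(step, error):
            raise Pruned(f"stopped at step {step}")
    rule.record(reported)  # only trials that finish contribute to the history
    return reported[-1]


rule = MedianRule()
curves = [
    [0.9, 0.6, 0.4],   # first trial: nothing to compare against, runs fully
    [0.8, 0.5, 0.3],   # better at every step, runs fully
    [0.95, 0.9, 0.9],  # worse than the median at step 0, stopped immediately
]
results = []
for curve in curves:
    try:
        results.append(run_trial(rule, curve))
    except Pruned:
        results.append(None)
print(results)  # [0.4, 0.3, None]
```

The third trial costs one step instead of three, which is the kind of saving early stopping is meant to deliver.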

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The dynamic space construction could reduce manual engineering effort when tuning models whose architecture choices affect later hyperparameters.
Similar define-by-run patterns might transfer to automated machine learning pipelines beyond hyperparameter search.
Integration with dynamic-graph frameworks would become more natural because both sides evaluate structure at runtime.

Load-bearing premise

That the three proposed design criteria are the appropriate and sufficient requirements for next-generation hyperparameter optimization software.

What would settle it

A head-to-head comparison in which a conventional fixed-space optimizer matches or exceeds Optuna on both final performance and total compute time when the task requires conditional or outcome-dependent hyperparameters.

Figures

Figures reproduced from arXiv: 1907.10902 by Masanori Koyama, Shotaro Sano, Takeru Ohta, Takuya Akiba, Toshihiko Yanase.

**Figure 1.** An example code of Optuna's define-by-run style API. This code builds a space of hyperparameters for a classifier of the MNIST dataset and optimizes the number of layers and the number of hidden units at each layer.

**Figure 2.** An example code of Hyperopt [1] that has exactly the same functionality as the code in Figure 1. Hyperopt is an example of a define-and-run style API.

**Figure 3.** An example code of Optuna for the construction of a heterogeneous parameter-space. This code simultaneously explores the parameter spaces of both random forest and MLP.

**Figure 6.** Overview of Optuna's system design. Each worker executes one instance of an objective function in each study. The objective function runs its trial using Optuna APIs. When the API is invoked, the objective function accesses the shared storage and obtains the information of the past studies from the storage when necessary. Each worker runs the objective function independently and shares the progress of the …

**Figure 9.** Result of comparing TPE+CMA-ES against other existing methods in terms of best attained objective value. Each algorithm was applied to each study 30 times, and a paired Mann-Whitney U test with α = 0.0005 was used to determine whether TPE+CMA-ES outperforms each rival.

**Figure 8.** Optuna dashboard. This example shows the online transition of objective values, the parallel coordinates plot of sampled parameters, the learning curves, and the tabular descriptions of investigated trials.

**Figure 10.** Computational time spent by different frameworks for each test case.

**Figure 11.** The transition of average test errors of simplified AlexNet for the SVHN dataset. …

**Figure 12.** Distributed hyperparameter optimization process for …
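The shared-storage design described in the Figure 6 caption can be sketched with threads standing in for workers and a SQLite file standing in for the storage. Everything here is an assumption made for illustration: the table layout, the `worker` and `run_study` names, the toy objective, and the naive "sample near the best stored point" strategy. It is not how Optuna implements its storage or samplers.

```python
import os
import random
import sqlite3
import tempfile
import threading


def init_storage(path):
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS trials (worker INTEGER, x REAL, value REAL)")
        conn.commit()
    finally:
        conn.close()


def worker(path, worker_id, n_trials):
    # Each worker opens its own connection; the database file is the only shared state.
    rng = random.Random(worker_id)
    conn = sqlite3.connect(path, timeout=30)
    try:
        for _ in range(n_trials):
            rows = conn.execute("SELECT x FROM trials ORDER BY value LIMIT 1").fetchall()
            # Explore near the best point any worker has stored so far, else at random.
            x = rows[0][0] + rng.gauss(0, 1) if rows else rng.uniform(-10, 10)
            value = (x - 3) ** 2  # toy objective (an assumption), minimum at x = 3
            conn.execute("INSERT INTO trials VALUES (?, ?, ?)", (worker_id, x, value))
            conn.commit()
    finally:
        conn.close()


def run_study(n_workers=4, n_trials=25):
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        init_storage(path)
        threads = [
            threading.Thread(target=worker, args=(path, i, n_trials))
            for i in range(n_workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        conn = sqlite3.connect(path)
        try:
            count, best = conn.execute("SELECT COUNT(*), MIN(value) FROM trials").fetchone()
        finally:
            conn.close()
        return count, best
    finally:
        os.remove(path)


print(run_study())
```

Because the workers communicate only through the storage, adding a worker means starting another copy of the same function with the same path, with no coordinator process in between.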

Original abstract

The purpose of this study is to introduce new design-criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to light-weight experiment conducted via interactive interface. In order to prove our point, we will introduce Optuna, an optimization software which is a culmination of our effort in the development of a next generation optimization software. As an optimization software designed with define-by-run principle, Optuna is particularly the first of its kind. We will present the design-techniques that became necessary in the development of the software that meets the above criteria, and demonstrate the power of our new design through experimental results and real world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Optuna introduces a define-by-run API for dynamic hyperparameter search spaces in a practical open-source tool, but the 'first of its kind' claim lacks needed comparisons to prior frameworks.

Optuna's core idea is a define-by-run API that lets users build the hyperparameter search space inside the objective function itself. This handles conditional and dynamic spaces more naturally than static definitions in older tools. The paper also pushes for efficient search plus pruning and a versatile setup that works from laptops to distributed runs. They deliver an MIT-licensed library that matches those goals on paper, which is a concrete step forward for routine tuning work. The design criteria they list are reasonable and the implementation focus is useful for people who actually run experiments.

The main weakness is the novelty positioning. The abstract states Optuna is the first optimization software built on the define-by-run principle, yet it gives no direct contrast to earlier systems such as Hyperopt that already support conditional parameters. That gap makes the central claim rest on an untested assumption about what came before. The experiments and real-world cases are referenced but the abstract supplies no numbers, baselines, or error bars, so the efficiency claims stay unverified from the text alone.

This work is aimed at machine learning practitioners who spend time on hyperparameter tuning and want something more flexible than grid search or basic Bayesian libraries. Readers building or evaluating HPO tools would find the API discussion and architecture choices worth looking at. It deserves peer review because the software contribution is real and the design points are practical, even though the paper would need added comparisons and clearer results to stand stronger.

Referee Report

2 major / 0 minor

Summary. The paper proposes three design criteria for next-generation hyperparameter optimization software: (1) a define-by-run API enabling dynamic construction of the parameter search space, (2) efficient implementations of searching and pruning strategies, and (3) a versatile, easy-to-deploy architecture supporting distributed and interactive use cases. It introduces Optuna as an open-source (MIT) implementation of these criteria, asserts that it is the first HPO framework designed around the define-by-run principle, and illustrates its utility via experimental results and real-world applications.

Significance. If the design criteria and their implementation in Optuna hold, the work could meaningfully advance practical HPO by supporting more flexible search spaces than static APIs allow. The public GitHub release under an open license is a concrete strength that aids reproducibility and community adoption.

major comments (2)

[Abstract] The claim that Optuna is 'particularly the first of its kind' as a define-by-run HPO framework is presented without any comparison to prior systems that already support conditional or dynamic parameter spaces at runtime (e.g., Hyperopt's conditional parameters). Because this novelty assertion is used to position the entire contribution, the absence of such a comparison is load-bearing for the central claim.
[Experimental results (referenced in abstract)] The manuscript does not indicate whether the experimental results include head-to-head comparisons against existing HPO libraries on standard benchmarks with reported metrics (wall-clock time, final objective value, number of trials). Without such baselines the demonstration that the three proposed criteria yield measurable gains remains unverified.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate the revisions we will make to the manuscript.

Referee: [Abstract] The claim that Optuna is 'particularly the first of its kind' as a define-by-run HPO framework is presented without any comparison to prior systems that already support conditional or dynamic parameter spaces at runtime (e.g., Hyperopt's conditional parameters). Because this novelty assertion is used to position the entire contribution, the absence of such a comparison is load-bearing for the central claim.

Authors: We agree that the abstract's claim would be strengthened by explicit comparison to prior systems supporting conditional parameters. The define-by-run API in Optuna permits fully dynamic search-space construction at runtime via arbitrary Python control flow inside the objective function, which is distinct from the conditional mechanisms in frameworks such as Hyperopt that still require an upfront static specification. To address the concern, we will revise the abstract to qualify or remove the 'first of its kind' phrasing and add a concise comparison to related work in the manuscript body.

Revision: yes

Referee: [Experimental results (referenced in abstract)] The manuscript does not indicate whether the experimental results include head-to-head comparisons against existing HPO libraries on standard benchmarks with reported metrics (wall-clock time, final objective value, number of trials). Without such baselines the demonstration that the three proposed criteria yield measurable gains remains unverified.

Authors: The experiments in the manuscript illustrate the three design criteria through real-world applications and selected benchmarks. We acknowledge that the current presentation does not explicitly report head-to-head comparisons with quantitative metrics against other libraries. We will revise the experimental section to include such comparisons on standard benchmarks, reporting wall-clock time, final objective values, and number of trials.

Revision: yes
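One way the final objective values from such a head-to-head comparison could be summarized is a rank-based test of the kind named in the Figure 9 caption. The sketch below is an assumption-laden illustration: it implements the ordinary unpaired Mann-Whitney U statistic with a normal approximation and no tie or continuity correction, and the two result lists are made-up placeholder numbers, not measurements from the paper or from any library.

```python
import math


def mann_whitney_u(a, b):
    """U statistic for sample a: counts pairs where a beats b (smaller is better)."""
    u = 0.0
    for x in a:
        for y in b:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5  # ties count as half a win
    return u


def one_sided_p(a, b):
    """Normal approximation; a small p suggests values in a tend to be smaller."""
    n, m = len(a), len(b)
    u = mann_whitney_u(a, b)
    mean = n * m / 2
    sd = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal


# Placeholder final objective values from repeated runs of two hypothetical optimizers.
optimizer_a = [0.10, 0.12, 0.09, 0.11, 0.13]
optimizer_b = [0.20, 0.18, 0.22, 0.19, 0.21]

u = mann_whitney_u(optimizer_a, optimizer_b)
p = one_sided_p(optimizer_a, optimizer_b)
print(u, p)
```

With only five runs per method the normal approximation is rough; a real comparison would use more repetitions (the caption mentions 30 per study) and an exact or corrected test.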

Circularity Check

0 steps flagged

No circularity: software introduction paper with no derivation chain

The paper proposes three design criteria for HPO software and presents Optuna as an implementation meeting them, explicitly stating it is the first with define-by-run API. No equations, fitted parameters, predictions, or mathematical derivations exist that could reduce to inputs by construction. The novelty assertion is a direct claim without load-bearing self-citations or self-definitional loops. This matches the expected non-finding for a framework introduction paper that is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper rests on the three stated design criteria as the foundation for what constitutes next-generation software; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: The three proposed design criteria are the key requirements for next-generation hyperparameter optimization software.
Explicitly stated as the purpose of the study in the abstract.

Forward citations

Cited by 22 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Searches for $B^0\to K^+\pi^-\tau^+\tau^-$ and $B_s^0\to K^+K^-\tau^+\tau^-$ decays
hep-ex · 2025-10 · accept · novelty 8.0

LHCb reports the first upper limits on the $B^0\to K^+\pi^-\tau^+\tau^-$ and $B_s^0\to K^+K^-\tau^+\tau^-$ branching fractions, with recast limits of $2.8\times10^{-4}$ on $B^0\to K^*(892)^0\tau^+\tau^-$ at 95% CL that improve prior bounds by an order of magnitude.

Learning Dynamic Stability Landscapes in Synchronization Networks
cs.LG · 2026-05 · unverdicted · novelty 7.0

Introduces graph-to-image prediction of per-node dynamic stability landscapes in oscillator networks from topology, releases two 10k-graph datasets, and shows GNN-CNN models achieve good accuracy with cross-size gener...

Multivariate quantum reservoir computing with discrete and continuous variable systems
quant-ph · 2026-04 · unverdicted · novelty 7.0

Quantum reservoirs handle multivariate time series best with task-specific encodings that leverage non-classical effects.

Polarized Target Nuclear Magnetic Resonance Measurements with Deep Neural Networks
physics.ins-det · 2026-03 · unverdicted · novelty 7.0

Deep neural networks reduce fitting uncertainties in CW-NMR polarization measurements for dynamically polarized targets.

optimize_anything: A Universal API for Optimizing any Text Parameter
cs.CL · 2026-05 · unverdicted · novelty 6.0

A universal LLM optimizer for text artifacts achieves SOTA results on six tasks including tripling ARC-AGI accuracy and cutting cloud costs by 40% via cross-task transfer and side information.

PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts
cs.CL · 2026-05 · unverdicted · novelty 6.0

PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.

Self-Supervised Laplace Approximation for Bayesian Uncertainty Quantification
stat.ML · 2026-05 · unverdicted · novelty 6.0

SSLA approximates the posterior predictive distribution by refitting Bayesian models on self-predicted data, providing a sampling-free method that improves predictive calibration over classical Laplace approximations ...

On Privacy Leakage in Tabular Diffusion Models: Influential Factors, Attacker Knowledge, and Metrics
cs.LG · 2026-05 · unverdicted · novelty 6.0

Tabular diffusion models leak membership information via attacks even with partial attacker knowledge, and common heuristic privacy metrics like distance-to-closest-record are unreliable.

Euclid preparation. CosmoPostProcess: A simulation calibrated framework for weak lensing selection bias in richness-selected galaxy clusters
astro-ph.CO · 2026-05 · unverdicted · novelty 6.0

CosmoPostProcess delivers simulation-calibrated radial corrections for projection-induced selection bias (20-40% amplitude near 1 h^{-1} Mpc) and baryonic effects in Euclid richness-selected cluster weak lensing profiles.

Efficiently emulating distribution functions in gigaparsec volumes for varying cosmological parameters
astro-ph.CO · 2026-04 · conditional · novelty 6.0

A new overdensity-conditioned emulator trained on small subvolumes from Quijote recovers the global halo mass function via integration over the overdensity distribution at 0.026% of the simulation cost.

Natural Language Embeddings of Synthesis and Testing conditions Enhance Glass Dissolution Prediction
cond-mat.mtrl-sci · 2026-04 · unverdicted · novelty 6.0

Natural language embeddings of synthesis and testing conditions improve ML predictions of glass dissolution rates and enable generalization to out-of-distribution compositions with new elements.

Search for the lepton-flavour violating decays $B^+ \to \pi^+ \mu^\pm e^\mp$
hep-ex · 2026-04 · accept · novelty 6.0

No signal observed for $B^+ \to \pi^+ \mu^\pm e^\mp$; branching fraction upper limit set at $1.8 \times 10^{-9}$ at 90% CL.

Efficient Brain Extraction of MRI Scans with Mild to Moderate Neuropathology
eess.IV · 2026-02 · accept · novelty 6.0

A U-net with signed-distance transform loss achieves mean Dice scores of 0.96 on held-out and external MRI data for robust skull stripping in neuropathological cases.

TreeCoder: Systematic Exploration and Optimisation of Decoding and Constraints for LLM Code Generation
cs.LG · 2025-11 · unverdicted · novelty 6.0

TreeCoder improves LLM code generation accuracy by representing decoding as an optimizable tree search over programs with first-class constraints for syntax, style, and execution, outperforming baselines on MBPP and S...

Search for the lepton-flavor-violating $\tau^{-} \rightarrow e^{\mp} \ell^{\pm} \ell^{\mp}$ decays at Belle II
hep-ex · 2025-07 · accept · novelty 6.0

Belle II sets upper limits between 1.3 and 2.5 times 10 to the minus 8 on branching fractions for four tau to e l l decay modes at 90 percent confidence level, the most stringent to date for four modes.

A Leaf-Level Dataset for Soybean-Cotton Detection and Segmentation
cs.CV · 2025-03 · unverdicted · novelty 6.0

A new leaf-instance dataset for soybean-cotton detection and segmentation collected across growth stages and conditions from commercial farms is presented and validated with YOLOv11.

Inferring identified hadron production in $pp$ collisions with physics-informed machine learning at the LHC
hep-ph · 2026-05 · unverdicted · novelty 5.0

A physics-informed neural network infers pT spectra of pi, K, p, Lambda, and Ks in unmeasured rapidity regions from PYTHIA8 pp collisions at 13.6 TeV, achieving 1.5-5.83% yield uncertainties while reproducing yield ra...

Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
cs.AR · 2025-09 · unverdicted · novelty 5.0

PLENA introduces a co-designed system with three optimization pathways for long-context agentic LLM inference, claiming up to 2.23x throughput over A100 and 4.04x energy efficiency.

Improved Chase-Pyndiah Decoding for Product Codes with Scaled Messages
cs.IT · 2026-04 · unverdicted · novelty 4.0

Scaling extrinsic messages by decoder confidence in Chase-Pyndiah decoding for product codes delivers a 0.1 dB gain over the baseline decoder.

VIGILant: an automatic classification pipeline for glitches in the Virgo detector
gr-qc · 2026-04 · unverdicted · novelty 4.0

VIGILant applies tree-based models and a ResNet CNN to classify Virgo O3b glitches with 98% accuracy and has been deployed for daily use with an interactive dashboard.

PR3DICTR: A modular AI framework for medical 3D image-based detection and outcome prediction
cs.CV · 2026-04 · unverdicted · novelty 4.0

PR3DICTR is a new open-access modular framework for 3D medical image classification and outcome prediction that works with as little as two lines of code.

An Automatic Ground Collision Avoidance System with Reinforcement Learning
cs.LG · 2026-04 · unverdicted · novelty 3.0

The paper designs a reinforcement learning-based automatic ground collision avoidance system for jet trainers that uses limited observations and line-of-sight terrain queries to prevent collisions.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · cited by 22 Pith papers · 2 internal anchors

[1] James Bergstra, Brent Komer, Chris Eliasmith, Dan Yamins, and David D Cox. Hyperopt: a Python library for model selection and hyperparameter optimization. Computational Science & Discovery, 8(1):14008, 2015.

[2] Jasper Snoek, Hugo Larochelle, and Ryan P Adams. Practical bayesian optimization of machine learning algorithms. In NIPS, pages 2951–2959, 2012.

[3] Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In LION, pages 507–523, 2011. ISBN 978-3-642-25565-6.

[4] Patrick Koch, Oleg Golovidov, Steven Gardner, Brett Wujek, Joshua Griffin, and Yan Xu. Autotune: A derivative-free optimization framework for hyperparameter tuning. In KDD, pages 443–452, 2018. ISBN 978-1-4503-5552-0.

[5] Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D Sculley. Google Vizier: A service for black-box optimization. In KDD, pages 1487–1495, 2017. ISBN 978-1-4503-4887-4.

[6] James Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. Algorithms for hyper-parameter optimization. In NIPS, pages 2546–2554, 2011.

[7] Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, and Ion Stoica. Tune: A research platform for distributed model selection and training. In ICML Workshop on AutoML, 2018.

[8] Tobias Domhan, Jost Tobias Springenberg, and Frank Hutter. Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In IJCAI, pages 3460–3468, 2015.

[9] Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, and Frank Hutter. Learning curve prediction with Bayesian neural networks. In ICLR, 2017.

[10] Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. Hyperband: A novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research, 18(185):1–52, 2018.

[11] Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, William Paul, Michael I. Jordan, and Ion Stoica. Ray: A distributed framework for emerging AI applications. CoRR, abs/1712.05889, 2017. URL http://arxiv.org/abs/1712.05889.

[12] Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren, editors. Automatic Machine Learning: Methods, Systems, Challenges. Springer, 2018. In press, available at http://automl.org/book.

[13] Seiya Tokui, Kenta Oono, Shohei Hido, and Justin Clayton. Chainer: a next-generation open source framework for deep learning. In NIPS Workshop on Machine Learning Systems, 2015.

[14] Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, ... DyNet: The Dynamic Neural Network Toolkit. arXiv, 2017.

[15] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS Autodiff Workshop, 2017.

[16] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: A system for larg... 2016.

[17] Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195, 2001.

[18] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016.

[19] Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. Massively parallel hyperparameter tuning. In NeurIPS Workshop on Machine Learning Systems, 2018.

[20] Kevin Jamieson and Ameet Talwalkar. Non-stochastic best arm identification and hyperparameter optimization. In Artificial Intelligence and Statistics, pages 240–248, 2016.

[21] Wes McKinney. Pandas: a foundational python library for data analysis and statistics. In SC Workshop on Python for High Performance and Scientific Computing, 2011.

[22] Thomas Kluyver, Benjamin Ragan-Kelley, Fernando Pérez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Jason Grout, Sylvain Corlay, Paul Ivanov, Damián Avila, Safia Abdalla, and Carol Willing. Jupyter notebooks – a publishing format for reproducible computational workflows. In F. Loizides and B. Schmidt, editors, P... 2016.

[23] Michael McCourt. Benchmark suite of test functions suitable for evaluating black-box optimization strategies. https://github.com/sigopt/evalset, 2016.

[24] Ian Dewancker, Michael McCourt, Scott Clark, Patrick Hayes, Alexandra Johnson, and George Ke. A strategy for ranking optimization methods using multiple criteria. In ICML Workshop on AutoML, pages 11–20, 2016.

[25] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097–1105, 2012.

[26] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.

[27] Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper R. R. Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Tom Duerig, and Vittorio Ferrari. The open images dataset V4: unified image classification, object detection, and visual relationship detection at scale. CoRR, abs/1811.00982, 2018.

[28] Takuya Akiba, Tommi Kerola, Yusuke Niitani, Toru Ogawa, Shotaro Sano, and Shuji Suzuki. PFDet: 2nd place solution to open images challenge 2018 object detection track. In ECCV Workshop on Open Images Challenge, 2018.

[29] Siying Dong, Mark Callaghan, Leonidas Galanis, Dhruba Borthakur, Tony Savor, and Michael Strum. Optimizing space amplification in rocksdb. In CIDR, 2017.


work page 2015

[9] [9]

Learning curve prediction with Bayesian neural networks

Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, and Frank Hutter. Learning curve prediction with Bayesian neural networks. In ICLR, 2017

work page 2017

[10] [10]

Hyperband: A novel bandit-based approach to hyperparameter optimization

Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Ros- tamizadeh, and Ameet Talwalkar. Hyperband: A novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research , 18(185):1–52, 2018

work page 2018

[11] [11]

Ray: A Distributed Framework for Emerging AI Applications

Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, William Paul, Michael I. Jordan, and Ion Stoica. Ray: A dis- tributed framework for emerging AI applications. CoRR, abs/1712.05889, 2017. URL http://arxiv.org/abs/ 1712.05889

work page internal anchor Pith review Pith/arXiv arXiv 2017

[12] [12]

Automatic Machine Learning: Methods, Sys- tems, Challenges

Frank Hutter, Lars Kottho ﬀ, and Joaquin Vanschoren, editors. Automatic Machine Learning: Methods, Sys- tems, Challenges . Springer, 2018. In press, available at http://automl.org/book

work page 2018

[13] [13]

Chainer: a next-generation open source framework for deep learning

Seiya Tokui, Kenta Oono, Shohei Hido, and Justin Clayton. Chainer: a next-generation open source framework for deep learning. In NIPS Workshop on Machine Learning Systems, 2015

work page 2015

[14] [14]

DyNet: The Dynamic Neural Network Toolkit

Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kun- coro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra,...

work page internal anchor Pith review Pith/arXiv arXiv 2017

[15] [15]

Automatic diﬀerentiation in PyTorch

Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Al- ban Desmaison, Luca Antiga, and Adam Lerer. Automatic diﬀerentiation in PyTorch. In NIPS Autodiﬀ Workshop, 2017

work page 2017

[16] [16]

Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng

Mart´ın Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeﬀrey Dean, Matthieu Devin, Sanjay Ghe- mawat, Geoﬀrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: A system for larg...

work page 2016

[17] [17]

Completely derandomized self-adaptation in evolution strategies

Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evo- lutionary Computation, 9(2):159–195, 2001

work page 2001

[18] [18]

Taking the human out of the loop: A review of bayesian optimization

Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016

work page 2016

[19] [19]

Massively parallel hyperparameter tuning

Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekate- rina Gonina, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. Massively parallel hyperparameter tuning. In NeurIPS Workshop on Machine Learning Systems, 2018

work page 2018

[20] [20]

Non-stochastic best arm identiﬁcation and hyperparameter optimization

Kevin Jamieson and Ameet Talwalkar. Non-stochastic best arm identiﬁcation and hyperparameter optimization. In Artiﬁcial Intelligence and Statistics, pages 240–248, 2016

work page 2016

[21] [21]

Pandas: a foundational python library for data analysis and statistics

Wes McKinney. Pandas: a foundational python library for data analysis and statistics. In SC Workshop on Python for High Performance and Scientiﬁc Computing, 2011

work page 2011

[22] [22]

Jupyter notebooks – a publishing format for re- producible computational workﬂows

Thomas Kluyver, Benjamin Ragan-Kelley, Fernando P´erez, Brian Granger, Matthias Bussonnier, Jonathan Frederic, Kyle Kelley, Jessica Hamrick, Jason Grout, Sylvain Cor- lay, Paul Ivanov, Dami´an Avila, Saﬁa Abdalla, and Carol Willing. Jupyter notebooks – a publishing format for re- producible computational workﬂows. In F. Loizides and B. Schmidt, editors, P...

work page 2016

[23] [23]

Benchmark suite of test functions suitable for evaluating black-box optimization strategies

Michael McCourt. Benchmark suite of test functions suitable for evaluating black-box optimization strategies. https://github.com/sigopt/evalset, 2016

work page 2016
