pith. machine review for the scientific record.

arxiv: 2604.16468 · v1 · submitted 2026-04-09 · 💻 cs.LG · cond-mat.mtrl-sci

Recognition: 2 theorem links

· Lean Theorem

Multi-Label Phase Diagram Prediction in Complex Alloys via Physics-Informed Graph Attention Networks

Amrita Basak, Eunjeong Park

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 17:58 UTC · model grok-4.3

classification 💻 cs.LG cond-mat.mtrl-sci
keywords phase diagram prediction · graph neural networks · physics-informed learning · alloy design · multi-label prediction · thermodynamic constraints · machine learning surrogate

The pith

A physics-informed graph attention network predicts stable phase sets in multicomponent alloys.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a machine learning model to rapidly determine which phases are stable at different compositions and temperatures in alloys containing multiple elements. Standard thermodynamic calculations are precise but too slow to scan dense grids of possible alloy recipes. The approach represents each alloy state as a small graph with one node per element, uses attention mechanisms to weigh elemental interactions, and adds constraints based on physical laws to ensure predictions respect thermodynamics. This surrogate model achieves high accuracy on known regions and maintains performance when applied to new composition spaces not seen during training. A reader would care because it could speed up the discovery of new alloys by allowing quick checks of phase stability without repeated full calculations.

Core claim

The authors show that coupling graph attention networks with thermodynamic constraint enforcement produces a surrogate model capable of multi-label phase prediction in the studied alloy system, attaining high exact-set accuracy on in-domain data and strong generalization to unseen ternary and quaternary sections.

What carries the argument

The four-node element graph with atomic fractions and elemental descriptors, processed by graph attention layers and combined with thermodynamic penalties or projections for physical consistency.
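The mechanics of that carrier can be illustrated with a minimal, dependency-free sketch of one attention-weighted message pass over a fully connected four-node element graph. This is not the authors' GATv2 implementation: the node features (a single descriptor standing in for the eight Magpie features) and the fixed dot-product score standing in for learned attention are illustrative assumptions.

```python
import math

# Illustrative node features per element: [mole fraction, one descriptor].
# The paper uses atomic fractions plus eight Magpie descriptors; one
# placeholder descriptor stands in here for brevity.
features = {
    "Ag": [0.25, 1.93],
    "Bi": [0.25, 2.02],
    "Cu": [0.25, 1.90],
    "Sn": [0.25, 1.96],
}

def attention_weights(query, neighbors):
    """Softmax over a toy dot-product score; in GATv2 this score is learned."""
    scores = [sum(q * n for q, n in zip(query, nb)) for nb in neighbors]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def message_pass(name):
    """One attention-weighted aggregation of neighbor features for one node."""
    others = [k for k in features if k != name]
    alphas = attention_weights(features[name], [features[k] for k in others])
    return [
        sum(a * features[k][i] for a, k in zip(alphas, others))
        for i in range(len(features[name]))
    ]

alphas_ag = attention_weights(
    features["Ag"], [features[k] for k in features if k != "Ag"]
)
updated_ag = message_pass("Ag")
assert abs(sum(alphas_ag) - 1.0) < 1e-9  # attention weights form a distribution
```

In the paper, the learned attention lets the model reweight pairwise element interactions as composition and temperature vary; the fixed score above only mimics the shape of that aggregation.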

If this is right

  • The model enables dense mapping of phase equilibria across composition-temperature space at low cost.
  • Predictions remain consistent with thermodynamic principles, reducing unphysical outputs.
  • Generalization to unseen sections supports screening of new alloy compositions.
  • Performance holds across binary, ternary, and quaternary subsystems within the system.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar graph-based representations could extend to alloys with five or more elements by scaling the graph size.
  • Combining this surrogate with optimization algorithms might accelerate inverse design of alloys for targeted properties.
  • Validation against experimental phase data could reveal where the model needs refinement for real applications.

Load-bearing premise

That a graph with only four nodes, each representing one element, plus basic features and simple thermodynamic adjustments, captures all important phase interactions without bias or omissions.

What would settle it

A systematic discrepancy between the model's phase predictions and detailed thermodynamic calculations on a dense grid of an alloy system with pronounced higher-order interactions would falsify the claim of sufficient capture by the simple graph structure.

Figures

Figures reproduced from arXiv: 2604.16468 by Amrita Basak, Eunjeong Park.

Figure 1
Figure 1. End-to-end workflow for physics-consistent phase-set prediction. Our evaluation strategy reflects both “materials realism” and ML rigor: we assessed performance on trained binary/ternary subsystems, reserved an unseen ternary for extrapolative testing, and further validated with dense grids that stress-tested boundary resolution. view at source ↗
Figure 2
Figure 2. GATv2-based architecture for multi-label phase-set prediction. (a) Graph attention layer: fully connected element graph (Ag/Bi/Cu/Sn); node features are mole fractions and eight-dimensional Magpie descriptors; GATv2 computes attention and updates embeddings. (b) End-to-end backbone: GATv2 encodes the element graph; pooled features combined with temperature drive an MLP to phase probabilities. view at source ↗
Figure 3
Figure 3. Pairwise hyperparameter interaction contour. view at source ↗
Figure 8
Figure 8. Phase-multiplicity maps over composition-temperature space for two ternary systems. Rows correspond to the (a) Ag-Bi-Sn and (b) Ag-Cu-Sn systems. Columns compare GNN, GNN+Physics-Informed Loss, and GNN+Physics-Informed Decoding. Colors denote predicted phase multiplicity (1, 2, or 3 phases). view at source ↗
Figure 10
Figure 10. In-domain dense-grid (1 at.% and 5 °C) binary predictions. Left: match/mismatch map. Right: predicted phase map. (a) Bi-Cu, (b) Bi-Sn, (c) Cu-Sn. view at source ↗
Figure 12
Figure 12. Out-of-domain (extrapolation) dense-grid predictions for the Ag-Bi-Cu-Sn quaternary system at 700 °C. Left: match/mismatch map. Right: predicted phase map. view at source ↗
read the original abstract

Accurate phase equilibria are foundational to alloy design because they encode the underlying thermodynamics governing stability, transformations, and processing windows. However, while the CALculation of PHAse Diagrams (CALPHAD) provides a rigorous thermodynamic framework, exploring multicomponent composition-temperature space remains computationally expensive and is typically limited to sparse sections. To enable rapid phase mapping and alloy screening, we propose a physics-informed graph attention network (GAT) that learns element-aware representations and couples them with thermodynamic constraints for multi-label phase-set prediction in the Ag-Bi-Cu-Sn alloy system. Using about 25,000 equilibrium states generated with pycalphad, each composition-temperature point is represented as a four-node element graph with atomic fractions and elemental descriptors as node features. The model combines graph attention, global pooling, and a multilayer perceptron to predict nine relevant phases. To improve physical consistency, we incorporate thermodynamic constraints, applied as training penalties or as an inference-time projection. Across six binary and three ternary subsystems, the baseline model achieves a macro-F1 score of 0.951 and 93.98% exact-set match, while physics-informed decoding improves robustness and raises exact-set accuracy to about 96% on dense in-domain grids. The surrogate also generalizes to an unseen ternary section with 99.32% exact-set accuracy and to a quaternary section at 700 °C with 91.78% accuracy. These results demonstrate that attention-based graph learning coupled with thermodynamic constraint enforcement provides an effective and physically consistent surrogate for high-resolution phase mapping and extrapolative alloy screening.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a physics-informed graph attention network (GAT) as a surrogate for CALPHAD calculations to predict multi-label phase equilibria in the Ag-Bi-Cu-Sn quaternary alloy system. Compositions are encoded as four-node element graphs with atomic fractions and elemental descriptors as features; the model predicts nine phases and incorporates thermodynamic constraints either as training penalties or inference-time projections. Trained on approximately 25,000 pycalphad-generated points, the baseline GAT achieves a macro-F1 of 0.951 and 93.98% exact-set accuracy, with constraints raising in-domain exact-set accuracy to ~96%. The model generalizes to an unseen ternary section (99.32% exact-set) and a quaternary section at 700°C (91.78% exact-set), supporting the claim that attention-based graph learning plus constraints yields a physically consistent, high-resolution phase-mapping surrogate.

Significance. If the central results hold, the approach provides a scalable alternative to direct CALPHAD evaluation for dense phase-diagram mapping and composition screening in multicomponent alloys, where traditional thermodynamic calculations become prohibitively expensive. The reported generalization to held-out ternary and quaternary sections, together with concrete metrics on independently generated data, indicates practical utility for alloy design workflows. The graph representation and constraint integration are conceptually aligned with element-wise thermodynamic interactions, though their sufficiency for higher-order effects remains to be fully verified.

major comments (2)
  1. [Methods (physics-informed decoding)] Methods, physics-informed constraint subsection: the precise mathematical form of the thermodynamic penalties (training) or projections (inference) is not specified beyond the term 'simple.' Without the explicit definition—e.g., whether penalties derive from independent Gibbs-energy calculations, phase-fraction bounds, or heuristic rules—it is impossible to confirm that the constraints are load-bearing for physical consistency rather than post-hoc adjustments. This detail is required to reproduce the reported lift from 93.98% to ~96% exact-set accuracy and to assess independence from the pycalphad training distribution.
  2. [Results (quaternary section)] Results, quaternary extrapolation paragraph and associated table: the exact-set accuracy drops from 99.32% on the unseen ternary to 91.78% on the quaternary section at 700°C. Because the four-node GAT aggregates only element-wise attentions present in the training distribution, this gap raises the possibility that higher-order (ternary or quaternary) phase stabilities not reducible to binary subsystems are incompletely captured. A per-phase error breakdown or ablation removing the graph structure would be needed to substantiate that the inductive bias is sufficient for the central extrapolation claim.
minor comments (2)
  1. [Abstract] Abstract and §2: the nine phases being predicted are not enumerated or cross-referenced to a table; listing them explicitly would improve readability.
  2. [Methods] Notation: 'exact-set match' is used without a formal definition in the main text; a one-sentence clarification (e.g., all predicted phases exactly match the ground-truth set) would remove ambiguity.
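The two headline metrics in the referee's summary are easy to pin down exactly; a small self-contained sketch (with made-up phase labels and toy data, not the paper's nine phases) fixes the definitions of exact-set match and macro-F1:

```python
# Toy ground-truth and predicted phase sets for four (composition, T) points.
# Phase names are illustrative labels, not the paper's actual phase list.
PHASES = ["LIQUID", "FCC_A1", "BCT_A5"]
y_true = [{"LIQUID"}, {"FCC_A1", "BCT_A5"}, {"LIQUID", "FCC_A1"}, {"BCT_A5"}]
y_pred = [{"LIQUID"}, {"FCC_A1"},           {"LIQUID", "FCC_A1"}, {"BCT_A5"}]

def exact_set_accuracy(trues, preds):
    """Fraction of points whose predicted phase set equals the true set exactly."""
    return sum(t == p for t, p in zip(trues, preds)) / len(trues)

def macro_f1(trues, preds, phases):
    """Unweighted mean of per-phase F1 over all phase labels."""
    f1s = []
    for ph in phases:
        tp = sum(ph in t and ph in p for t, p in zip(trues, preds))
        fp = sum(ph not in t and ph in p for t, p in zip(trues, preds))
        fn = sum(ph in t and ph not in p for t, p in zip(trues, preds))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)

print(exact_set_accuracy(y_true, y_pred))  # 0.75: one of four sets mismatches
print(macro_f1(y_true, y_pred, PHASES))    # ≈ 0.889: BCT_A5 misses one point
```

Note how a single missed phase (BCT_A5 at the second point) costs a full exact-set miss while only mildly denting macro-F1, which is why the paper's 93.98% exact-set figure is the stricter of the two numbers.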

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which will strengthen the clarity and reproducibility of the manuscript. We address each major comment below.

read point-by-point responses
  1. Referee: Methods, physics-informed constraint subsection: the precise mathematical form of the thermodynamic penalties (training) or projections (inference) is not specified beyond the term 'simple.' Without the explicit definition—e.g., whether penalties derive from independent Gibbs-energy calculations, phase-fraction bounds, or heuristic rules—it is impossible to confirm that the constraints are load-bearing for physical consistency rather than post-hoc adjustments. This detail is required to reproduce the reported lift from 93.98% to ~96% exact-set accuracy and to assess independence from the pycalphad training distribution.

    Authors: We agree that the current description is insufficient for full reproducibility. The manuscript refers to the constraints as 'simple' without providing the explicit equations or implementation details. In the revised manuscript we will expand the Physics-Informed Constraint subsection to include the precise mathematical formulations: the training penalty is a weighted L2 term penalizing violations of phase-fraction non-negativity and summation to unity, while the inference-time projection applies a normalized clipping operation to enforce the same bounds. We will also report the penalty coefficient and projection threshold used to obtain the reported accuracy improvement. revision: yes

  2. Referee: Results, quaternary extrapolation paragraph and associated table: the exact-set accuracy drops from 99.32% on the unseen ternary to 91.78% on the quaternary section at 700°C. Because the four-node GAT aggregates only element-wise attentions present in the training distribution, this gap raises the possibility that higher-order (ternary or quaternary) phase stabilities not reducible to binary subsystems are incompletely captured. A per-phase error breakdown or ablation removing the graph structure would be needed to substantiate that the inductive bias is sufficient for the central extrapolation claim.

    Authors: The referee correctly notes the accuracy drop, which we already flag in the manuscript as reflecting the greater complexity of the quaternary space. To address the concern, the revised results section will include a per-phase F1 breakdown on the 700°C quaternary section, showing that errors are concentrated on low-prevalence phases while dominant phases remain accurately predicted. We will also add an ablation replacing the GAT with a non-graph MLP baseline on the same feature set to quantify the contribution of the element-graph inductive bias. These additions will directly support the extrapolation claims without altering the original conclusions. revision: yes
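Taking the simulated rebuttal's description of response 1 at face value, the two constraint mechanisms could be sketched as follows. These forms are assumptions: the manuscript calls the constraints "simple" without giving equations, so the non-negativity/sum-to-unity penalty and the clip-and-renormalize projection below are illustrative stand-ins, not the authors' code.

```python
def thermo_penalty(fractions, weight=1.0):
    """Weighted L2 penalty on phase-fraction constraint violations:
    each fraction should be non-negative and fractions should sum to one."""
    neg = sum(min(0.0, f) ** 2 for f in fractions)   # negativity violations
    norm = (sum(fractions) - 1.0) ** 2               # sum-to-unity violation
    return weight * (neg + norm)

def project(fractions, eps=1e-12):
    """Inference-time projection: clip to [0, 1], then renormalize to sum to one."""
    clipped = [min(1.0, max(0.0, f)) for f in fractions]
    total = sum(clipped)
    return [f / max(total, eps) for f in clipped]

raw = [0.7, 0.5, -0.1]        # unconstrained model output for three phases
pen = thermo_penalty(raw)     # ≈ 0.02: 0.01 from negativity + 0.01 from sum
fixed = project(raw)          # renormalized to sum to exactly 1
```

The penalty would be added to the training loss (with `weight` as the free parameter flagged in the ledger below), while the projection is applied post hoc at inference; the distinction matters because only the former can reshape the learned representation.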

Circularity Check

0 steps flagged

No significant circularity; model approximates independent CALPHAD data

full rationale

The paper generates ~25,000 equilibrium states using the external pycalphad package, represents each (composition, temperature) point as a four-node graph with atomic fractions and elemental descriptors, and trains a GAT+MLP to output multi-label phase sets. Thermodynamic constraints appear only as optional training penalties or post-hoc inference projections; they do not redefine the target labels or force the network outputs to equal the inputs by construction. Evaluation metrics (macro-F1, exact-set accuracy) are computed on held-out or extrapolated sections against the same independent pycalphad oracle. No self-citation chain, ansatz smuggling, or renaming of known results is invoked to justify the central mapping. The derivation chain therefore remains non-circular and externally benchmarked.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The approach rests on pre-existing CALPHAD thermodynamic models for both training labels and constraint enforcement; the neural network adds learned element-aware representations but introduces no new physical postulates.

free parameters (1)
  • thermodynamic penalty weights
    Weights balancing data loss and constraint violation during training or projection strength at inference; chosen to improve consistency without stated optimization procedure.
axioms (1)
  • domain assumption Phase stability is governed by minimization of Gibbs free energy as computed by established CALPHAD databases.
    Both the generated training data and the physics-informed penalties or projections rely on this principle.

pith-pipeline@v0.9.0 · 5590 in / 1341 out tokens · 57536 ms · 2026-05-10T17:58:13.813870+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith reviews without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

42 extracted references · 20 canonical work pages

  1. [1]

    Phase diagrams are foundational to alloy design

    Introduction: Phase diagrams are foundational to alloy design because they encode equilibrium phase stability as a function of composition and processing conditions, thereby guiding microstructure control and property optimization [1]. However, constructing reliable multicomponent phase diagrams remains expensive, requiring dense composition-temperature ...

  2. [2]

    We then introduce our element-level graph representation and the GATv2-based multi-label classifier, together with training details and Optuna-based hyperparameter selection

    Methods: This section starts with the generation of an equilibrium dataset for Ag-Bi-Cu-Sn using pycalphad with the NIST solder thermodynamic database [1,6] and explains how phase-presence labels and input descriptors are constructed. We then introduce our element-level graph representation and the GATv2-based multi-label classifier, together with train...

  3. [3]

    Attention mechanisms provided a flexible way to compute context-dependent importance weights

    Attention mechanisms provided a flexible way to compute the context-dependent importance weights, enabling models to selectively emphasize informative interactions in a data-driven manner [28]. In this work, we leveraged GATv2 on a fully connected element graph to learn composition-dependent interaction weights among Ag, Bi, Cu, and Sn [21]. First, each...

  4. [4]

    Before evaluating the physics constraints, we first established an optimal baseline model using Bayesian hyperparameter tuning

    Results and discussion: This section reports (i) overall multi-label phase-set prediction performance across in-domain and extrapolative settings, (ii) the impact of physics-informed losses and decoding on physical admissibility and accuracy, and (iii) qualitative comparisons on dense composition-temperature grids. Before evaluating the physics constr...

  5. [5]

    A deliberate separation of physics-informed training and post hoc decoding proved crucial

    Conclusions: We treated phase-diagram inference as a multi-label phase-set prediction problem over composition-temperature space and introduced a graph neural network augmented with physics-informed loss and a physics-informed decoder. A deliberate separation of physics-informed training and post hoc decoding proved crucial. A single physics loss sta...

  6. [6]

    THE CALPHAD METHOD AND ITS ROLE IN MATERIAL AND PROCESS DEVELOPMENT,

    U. R. Kattner, “THE CALPHAD METHOD AND ITS ROLE IN MATERIAL AND PROCESS DEVELOPMENT,” Tecnol. Metal. Mater. Min., vol. 13, no. 1, pp. 3–15, 2016, doi: 10.4322/2176- 1523.1059

  7. [7]

    Physics-consistent machine learning with output projection onto physical manifolds,

    M. Valente, T. C. Dias, V. Guerra, and R. Ventura, “Physics-consistent machine learning with output projection onto physical manifolds,” Commun. Phys., vol. 8, no. 1, Dec. 2025, doi: 10.1038/s42005-025-02329-1

  8. [8]

    A machine learning–based classification approach for phase diagram prediction,

    G. Deffrennes, K. Terayama, T. Abe, and R. Tamura, “A machine learning–based classification approach for phase diagram prediction,” Mater. Des., vol. 215, Mar. 2022, doi: 10.1016/j.matdes.2022.110497

  9. [9]

    A framework to predict binary liquidus by combining machine learning and CALPHAD assessments,

    G. Deffrennes, K. Terayama, T. Abe, E. Ogamino, and R. Tamura, “A framework to predict binary liquidus by combining machine learning and CALPHAD assessments,” Mater. Des., vol. 232, Aug. 2023, doi: 10.1016/j.matdes.2023.112111

  10. [10]

    AIPHAD, an active learning web application for visual understanding of phase diagrams,

    R. Tamura et al., “AIPHAD, an active learning web application for visual understanding of phase diagrams,” Commun. Mater., vol. 5, no. 1, Dec. 2024, doi: 10.1038/s43246-024-00580-7

  11. [11]

    Pycalphad: Calphad-Based Computational Thermodynamics in Python,

    R. Otis and Z.-K. Liu, “Pycalphad: Calphad-Based Computational Thermodynamics in Python,” in Zentropy, New York: Jenny Stanford Publishing, 2024, pp. 373–392. doi: 10.1201/9781003514466-18

  12. [12]

    Commentary: The materials project: A materials genome approach to accelerating materials innovation,

    A. Jain et al., “Commentary: The materials project: A materials genome approach to accelerating materials innovation,” American Institute of Physics Inc., 2013, doi: 10.1063/1.4812323

  13. [13]

    Phase diagram construction supported by artificial intelligence,

    R. Tamura, “Phase diagram construction supported by artificial intelligence,” in 2022 29th International Workshop on Active-Matrix Flatpanel Displays and Devices (AM-FPD), IEEE, Jul. 2022, pp. 168–170. doi: 10.23919/AM-FPD54920.2022.9851343

  14. [14]

    aLLoyM: a large language model for alloy phase diagram prediction,

    Y. Oikawa, G. Deffrennes, R. Shimayoshi, T. Abe, R. Tamura, and K. Tsuda, “aLLoyM: a large language model for alloy phase diagram prediction,” NPJ Comput. Mater., vol. 12, no. 1, p. 97, Jan. 2026, doi: 10.1038/s41524-026-01966-6

  15. [15]

    Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties,

    T. Xie and J. C. Grossman, “Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties,” Phys. Rev. Lett., vol. 120, no. 14, p. 145301, Apr. 2018, doi: 10.1103/PhysRevLett.120.145301

  16. [16]

    Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals,

    C. Chen, W. Ye, Y. Zuo, C. Zheng, and S. P. Ong, “Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals,” Chemistry of Materials, vol. 31, no. 9, pp. 3564–3572, May 2019, doi: 10.1021/acs.chemmater.9b01294

  17. [17]

    Atomistic Line Graph Neural Network for improved materials property predictions,

    K. Choudhary and B. DeCost, “Atomistic Line Graph Neural Network for improved materials property predictions,” NPJ Comput. Mater., vol. 7, no. 1, Dec. 2021, doi: 10.1038/s41524-021-00650-1

  18. [18]

    Graph convolutional neural networks with global attention for improved materials property prediction,

    S. Y. Louis et al., “Graph convolutional neural networks with global attention for improved materials property prediction,” Physical Chemistry Chemical Physics, vol. 22, no. 32, pp. 18141–18148, Aug. 2020, doi: 10.1039/d0cp01474e

  19. [19]

    A review on multi-label learning algorithms,

    M. L. Zhang and Z. H. Zhou, “A review on multi-label learning algorithms,” 2014, IEEE Computer Society. doi: 10.1109/TKDE.2013.39

  20. [20]

    Recent advances on electromigration in very-large-scale-integration of interconnects,

    K. N. Tu, “Recent advances on electromigration in very-large-scale-integration of interconnects,” Nov. 01, 2003. doi: 10.1063/1.1611263

  21. [21]

    Accelerating multicomponent phase-coexistence calculations with physics-informed neural networks,

    S. Dhamankar, S. Jiang, and M. A. Webb, “Accelerating multicomponent phase-coexistence calculations with physics-informed neural networks,” Mol. Syst. Des. Eng., vol. 10, no. 2, pp. 89–101, Dec. 2024, doi: 10.1039/d4me00168k

  22. [22]

    NIST Phase Diagrams and Computational Thermodynamics - Solder Systems - SRD 139,

    National Institute of Standards and Technology, “NIST Phase Diagrams and Computational Thermodynamics - Solder Systems - SRD 139,” NIST Data Repository

  23. [23]

    A general-purpose machine learning framework for predicting properties of inorganic materials,

    L. Ward, A. Agrawal, A. Choudhary, and C. Wolverton, “A general-purpose machine learning framework for predicting properties of inorganic materials,” NPJ Comput. Mater., vol. 2, Aug. 2016, doi: 10.1038/npjcompumats.2016.28

  24. [24]

    Matminer: An open source toolkit for materials data mining,

    L. Ward et al., “Matminer: An open source toolkit for materials data mining,” Comput. Mater. Sci., vol. 152, pp. 60–69, Sep. 2018, doi: 10.1016/j.commatsci.2018.05.018

  25. [25]

    Graph Attention Networks,

    P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, and Y. Bengio, “Graph Attention Networks,” ArXiv, Feb. 2018

  26. [26]

    How Attentive are Graph Attention Networks?,

    S. Brody, U. Alon, and E. Yahav, “How Attentive are Graph Attention Networks?,” ArXiv, Jan. 2022

  27. [27]

    Decoupled Weight Decay Regularization,

    I. Loshchilov and F. Hutter, “Decoupled Weight Decay Regularization,” ArXiv, Jan. 2019

  28. [28]

    SGDR: Stochastic Gradient Descent with Warm Restarts,

    I. Loshchilov and F. Hutter, “SGDR: Stochastic Gradient Descent with Warm Restarts,” ArXiv, May 2017

  29. [29]

    Optuna: A Next-generation Hyperparameter Optimization Framework

    T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A Next-generation Hyperparameter Optimization Framework,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, Jul. 2019, pp. 2623–2631. doi: 10.1145/3292500.3330701

  30. [30]

    Random Search for Hyper-Parameter Optimization,

    J. Bergstra and Y. Bengio, “Random Search for Hyper-Parameter Optimization,” J. Mach. Learn. Res., vol. 13, pp. 281–305, 2012

  31. [31]

    D. A. Porter, K. E. Easterling, and M. Y. Sherif, Phase Transformations in Metals and Alloys. CRC Press, 2009

  32. [32]

    Computational Thermodynamics: The Calphad Method

    Hans Lukas, Suzana G. Fries, and Bo Sundman, Computational Thermodynamics: The Calphad Method. Cambridge University Press, 2011

  33. [33]

    Attention is All you Need,

    A. Vaswani et al., “Attention is All you Need,” in Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., Curran Associates, Inc., 2017. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

  34. [34]

    Neural Message Passing for Quantum Chemistry,

    J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural Message Passing for Quantum Chemistry,” in Proceedings of the 34th International Conference on Machine Learning, D. Precup and Y. W. Teh, Eds., in Proceedings of Machine Learning Research, vol. 70. PMLR, Apr. 2017, pp. 1263–1272. [Online]. Available: https://proceedings.mlr.pres...

  35. [35]

    Dropout: A Simple Way to Prevent Neural Networks from Overfitting,

    N. Srivastava, G. Hinton, A. Krizhevsky, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” 2014

  36. [36]

    Adam: A Method for Stochastic Optimization,

    D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” Jan. 2017

  37. [37]

    Practical Bayesian Optimization of Machine Learning Algorithms,

    J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian Optimization of Machine Learning Algorithms,” in Advances in Neural Information Processing Systems, F. Pereira, C. J. Burges, L. Bottou, and K. Q. Weinberger, Eds., Curran Associates, Inc., 2012. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86...

  38. [38]

    Neural Networks: Tricks of the Trade

    Genevieve Orr and Klaus-Robert Müller, Neural Networks: Tricks of the Trade. Springer, 1998

  39. [39]

    On the Equilibrium of Heterogeneous Substances,

    J. W. Gibbs, “On the Equilibrium of Heterogeneous Substances,” 1876

  40. [40]

    Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples,

    M. Belkin, P. Niyogi, and V. Sindhwani, “Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples,” 2006. [Online]. Available: http://www.cse.msu.edu/

  41. [41]

    OptNet: Differentiable Optimization as a Layer in Neural Networks,

    B. Amos and J. Z. Kolter, “OptNet: Differentiable Optimization as a Layer in Neural Networks,” in Proceedings of the 34th International Conference on Machine Learning, D. Precup and Y. W. Teh, Eds., in Proceedings of Machine Learning Research, vol. 70. PMLR, Apr. 2017, pp. 136–145. [Online]. Available: https://proceedings.mlr.press/v70/amos17a.html

  42. [42]

    Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,

    B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,” in Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., Curran Associates, Inc., 2017. [Online]. Available: https://proceedings.neur...