The design of selection experiments using a model-based approach
Pith reviewed 2026-05-13 01:28 UTC · model grok-4.3
The pith
A model-based approach builds optimal designs for selection experiments by aligning layouts with the linear mixed model used for analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present an approach for constructing designs for selection experiments which are optimal or near optimal against a robust and sensible linear mixed model. This model reflects the models used for analysis. The approach is flexible and introduces an additional step to accommodate efficient resource allocation of replication status to genotypes, which is undertaken prior to the allocation of plots to genotypes.
What carries the argument
A linear mixed model for genotype-by-environment effects and genetic relatedness that is used both to optimize the experimental layout and to perform the subsequent analysis.
If this is right
- Designs constructed to match the analysis model improve selection accuracy compared with designs that ignore that model.
- Allocating replication numbers to genotypes before assigning plots to locations allows more efficient use of limited plot resources.
- The same framework applies to both single-environment and multi-environment selection experiments.
- In-silico simulations show measurable gains in accuracy under realistic variance structures for genotype-by-environment interaction.
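The two-step ordering in the second bullet (replication first, plots second) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the `priority` scores, the `max_reps` cap, and both function names are hypothetical, and the seeded random shuffle in step 2 merely stands in for the model-based search that would optimize the layout against the linear mixed model.

```python
import random


def allocate_replication(genotypes, n_plots, priority, max_reps=2):
    """Step 1: decide how many plots each genotype gets (hypothetical rule)."""
    if n_plots < len(genotypes):
        raise ValueError("need at least one plot per genotype")
    reps = {g: 1 for g in genotypes}
    spare = n_plots - len(genotypes)
    # Hand out spare plots in descending priority order, up to the cap.
    ranked = sorted(genotypes, key=lambda g: priority[g], reverse=True)
    while spare > 0:
        eligible = [g for g in ranked if reps[g] < max_reps]
        if not eligible:
            raise ValueError("n_plots exceeds max_reps for every genotype")
        for g in eligible:
            if spare == 0:
                break
            reps[g] += 1
            spare -= 1
    return reps


def assign_plots(reps, n_rows, n_cols, seed=0):
    """Step 2: place genotypes on a row-by-column grid (random placeholder)."""
    entries = [g for g, r in reps.items() for _ in range(r)]
    if len(entries) != n_rows * n_cols:
        raise ValueError("replication total must equal the number of plots")
    random.Random(seed).shuffle(entries)
    return [entries[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


genotypes = ["g1", "g2", "g3", "g4", "g5", "g6"]
priority = {"g1": 0.9, "g2": 0.1, "g3": 0.7, "g4": 0.2, "g5": 0.4, "g6": 0.3}
reps = allocate_replication(genotypes, n_plots=8, priority=priority)
layout = assign_plots(reps, n_rows=2, n_cols=4)
```

With six genotypes and eight plots, the two spare plots go to the two highest-priority genotypes, and the layout step then only has to place a fixed multiset of entries. Separating the steps this way means the replication decision can be revisited without rerunning the layout search from scratch.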
Where Pith is reading between the lines
- If the assumed model is only approximately correct, the resulting designs may still outperform random or traditional layouts because they incorporate relatedness and interaction structure.
- The replication-allocation step could be extended to incorporate costs or constraints that vary by genotype or environment.
- Integration with genomic relationship matrices instead of pedigree-based relatedness would be a direct next use of the same machinery.
Load-bearing premise
The linear mixed model used for design optimization correctly represents the main sources of variation and relatedness that will appear in the actual trial data.
What would settle it
A field trial or simulation in which the model-based design produces lower selection accuracy than a conventional design when the true variance structure differs from the one assumed during design construction.
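A toy version of this check can be written in closed form. The sketch below assumes independent genotypes, a known zero mean, and independent residuals, none of which comes from the paper; the function names and variance values are illustrative. Under those assumptions the BLUP of a genotype effect is the plot mean shrunk by a factor k, and its true mean squared prediction error can be evaluated when k is built from the wrong variances.

```python
def blup_mse(reps, s2g_assumed, s2e_assumed, s2g_true, s2e_true):
    """True MSE of a BLUP whose shrinkage uses the assumed variances.

    Assumes one genotype with `reps` plots, a known zero mean, and
    independent residuals (a deliberately simplified setting).
    """
    # Shrinkage factor applied to the genotype's plot mean.
    k = reps * s2g_assumed / (reps * s2g_assumed + s2e_assumed)
    # Squared bias from shrinkage plus variance from residual noise.
    return (1.0 - k) ** 2 * s2g_true + k ** 2 * s2e_true / reps


def mean_mse(rep_counts, s2g_assumed, s2e_assumed, s2g_true, s2e_true):
    """Average the per-genotype MSE over a replication allocation."""
    values = [
        blup_mse(r, s2g_assumed, s2e_assumed, s2g_true, s2e_true)
        for r in rep_counts
    ]
    return sum(values) / len(values)


allocation = [2, 2, 1, 1]  # plots per genotype (illustrative)
matched = mean_mse(allocation, 1.0, 1.0, 1.0, 1.0)
misspecified = mean_mse(allocation, 3.0, 1.0, 1.0, 1.0)
loss = misspecified - matched  # accuracy given up by assuming the wrong variances
```

When the assumed and true variances agree, the expression reduces to the usual prediction error variance, so `loss` is never negative in this setting. A fuller test of the paper's claim would compare competing layouts under spatial and genotype-by-environment structure, which this sketch does not attempt.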
Original abstract
Plant breeding programs use data obtained from multi-environment selection experiments to produce improved varieties with the ultimate aim of maintaining high levels of genetic gain. Selection accuracy can be improved with the use of advanced statistical analytical methods that use informative and parsimonious variance models for the set of genotype by environment interaction effects, include information on genetic relatedness and appropriately accommodate non-genetic sources of variation within the framework of a single step estimation and prediction algorithm. Maximal gains from using these advanced techniques are more likely to be achieved if the designs used match the aims of the selection experiment and make full use of the available resources. In this paper we present an approach for constructing designs for selection experiments which are optimal or near optimal against a robust and sensible linear mixed model. This model reflects the models used for analysis. The approach is flexible and introduces an additional step to accommodate efficient resource allocation of replication status to genotypes, which is undertaken prior to the allocation of plots to genotypes. A motivating example is used to illustrate the approach, two illustrative examples are presented one each for single and multiple environment selection experiments and several in-silico simulation studies are used to demonstrate the advantages of these approaches.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a model-based approach to construct optimal or near-optimal designs for plant breeding selection experiments. Designs are optimized against a linear mixed model (LMM) that incorporates genotype-by-environment (GxE) interactions, genetic relatedness, and non-genetic effects, matching the models used in subsequent analysis. The method adds a preliminary step for efficient replication allocation to genotypes before plot assignment. It is illustrated via a motivating example, single- and multi-environment cases, and in-silico simulations that compare the proposed designs to standard alternatives.
Significance. If the central claim holds, the work offers a practical extension of optimal design theory to modern LMMs employed in breeding, with the replication-allocation step providing a useful operational feature. The simulations under the design model provide initial evidence of improved selection accuracy, but the absence of misspecification tests limits the assessed impact on real programs where variance structures are uncertain.
major comments (2)
- [in-silico simulation studies] The in-silico simulation studies (described after the illustrative examples) generate data exclusively from the same LMM used for design optimization, including the assumed GxE covariance and additive relationship matrix. No results are shown for cases where the true data-generating process differs (e.g., altered GxE correlations or misspecified relatedness), which directly undermines the translation from model-based optimality to improved selection accuracy under realistic conditions.
- [illustrative examples] The optimality criterion employed for the multi-environment design (Section on illustrative examples) is not stated explicitly (e.g., A-optimality for prediction error variance of breeding values versus a custom selection-accuracy metric). Without this, it is unclear how the reported designs achieve near-optimality or how sensitive the results are to the chosen criterion.
minor comments (2)
- The abstract would be strengthened by naming the specific optimality criterion and briefly quantifying the simulation gains (e.g., relative improvement in selection accuracy).
- Notation for the LMM variance components (G, R, etc.) should be introduced once in the methods and used consistently; currently some parameters appear only in the examples.
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive comments. We address each major comment below and indicate the revisions made to strengthen the manuscript.
- Referee: The in-silico simulation studies (described after the illustrative examples) generate data exclusively from the same LMM used for design optimization, including the assumed GxE covariance and additive relationship matrix. No results are shown for cases where the true data-generating process differs (e.g., altered GxE correlations or misspecified relatedness), which directly undermines the translation from model-based optimality to improved selection accuracy under realistic conditions.
  Authors: The simulations were conducted under the design model to demonstrate the gains achievable when the model assumptions hold, which aligns with standard practice for assessing model-based optimal designs. We agree that evaluations under misspecification would better inform real-world performance. Due to the high computational demands of the optimization and simulation workflow, such analyses could not be completed for this revision. We have added a discussion paragraph noting this limitation and identifying robustness checks as future work.
  Revision: partial
- Referee: The optimality criterion employed for the multi-environment design (Section on illustrative examples) is not stated explicitly (e.g., A-optimality for prediction error variance of breeding values versus a custom selection-accuracy metric). Without this, it is unclear how the reported designs achieve near-optimality or how sensitive the results are to the chosen criterion.
  Authors: We appreciate the referee highlighting this lack of clarity. The multi-environment designs were constructed using the A-optimality criterion on the average prediction error variance of the breeding values, as defined in the general methodology. We have revised the illustrative examples section to state the criterion explicitly and to clarify its connection to selection accuracy.
  Revision: yes
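To make the criterion named in the second response concrete, here is a minimal sketch of an average prediction error variance (PEV) calculation. It assumes a model with an overall mean as the only fixed effect, genotype effects with covariance G, and independent residuals with variance `sigma2_e`; blocking, spatial, and multi-environment terms are omitted, so this is not the paper's full criterion. The function names and the helper `invert` are hypothetical. With S the projection that sweeps out the mean, the PEV matrix is (Z'SZ / sigma2_e + G^{-1})^{-1}, and the value returned is its average diagonal.

```python
def invert(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(m)
    a = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a]


def mean_pev(plot_genotype, g_inv, sigma2_e=1.0):
    """Average PEV of genotype effects; lower is better under A-optimality.

    plot_genotype[i] is the genotype index grown on plot i, and g_inv is
    the inverse of the genotype covariance matrix G (relatedness enters here).
    """
    m = len(g_inv)
    n = len(plot_genotype)
    reps = [0] * m
    for g in plot_genotype:
        reps[g] += 1
    # Z'SZ = diag(r) - r r' / n, because S removes the overall mean.
    info = [[((reps[i] if i == j else 0.0) - reps[i] * reps[j] / n) / sigma2_e
             + g_inv[i][j] for j in range(m)] for i in range(m)]
    pev = invert(info)
    return sum(pev[i][i] for i in range(m)) / m


identity = [[1.0, 0.0], [0.0, 1.0]]        # two unrelated genotypes, unit variance
unreplicated = mean_pev([0, 1], identity)
replicated = mean_pev([0, 0, 1, 1], identity)
```

Because this toy model has no plot-level structure, the value depends only on how many plots each genotype receives; in the setting the paper addresses, the same quantity would also change with where those plots sit. A search algorithm would call a function like `mean_pev` repeatedly while swapping plot assignments and keep the arrangement with the smallest value.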
Circularity Check
No circularity: model-based design optimization is independent of analysis data
The paper's central contribution is a method to construct experimental designs that are optimal or near-optimal with respect to a pre-specified linear mixed model (including GxE, relatedness, and non-genetic terms) that will later be used for analysis. This is a standard, non-circular workflow in optimal design theory: the design criterion is defined from the model structure before any data are collected, and the motivating examples, illustrative cases, and in-silico simulations simply evaluate performance under that same model. No equation or step reduces a claimed prediction or optimality result to a fitted quantity defined from the same data, nor does any load-bearing premise rest on a self-citation chain or imported uniqueness theorem. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- variance components and covariance parameters of the linear mixed model
axioms (2)
- domain assumption: The linear mixed model used for design optimization accurately represents the true genotype-by-environment interactions, genetic relatedness, and non-genetic sources of variation in the experiments.
- standard math: Standard assumptions of linear mixed models (normality, independence of residuals given the random effects structure) hold sufficiently for the optimality criterion to be meaningful.