Cross-Architectural Mixture-of-Experts with Adaptive Soft Routing for Plant Leaf Disease Classification
Pith reviewed 2026-06-26 08:09 UTC · model grok-4.3
The pith
An adaptive soft Mixture-of-Experts framework routes among EfficientNet-B0, DenseNet-121, and Swin-Tiny to improve plant leaf disease classification on imbalanced data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that a cross-architectural Mixture-of-Experts with adaptive soft routing and two-stage refinement training integrates complementary features from EfficientNet-B0, DenseNet-121, and Swin-Tiny and thereby outperforms any single expert on imbalanced leaf-disease classification tasks.
What carries the argument
The adaptive soft gating mechanism that computes input-dependent weights for the three expert models.
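A minimal sketch of what soft gating over three experts could look like, in plain Python. It assumes the gate emits one score per expert and that the experts' class probabilities are mixed by a softmax over those scores. Whether the paper mixes probabilities, logits, or intermediate features is not stated here, and the names `soft_moe_predict`, `gate_logits`, and `expert_probs` are hypothetical, as are the numbers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_moe_predict(gate_logits, expert_probs):
    """Mix per-expert class probabilities with input-dependent gate weights.

    gate_logits:  one gate score per expert for a single input image.
    expert_probs: one class-probability vector per expert, same order.
    """
    weights = softmax(gate_logits)
    num_classes = len(expert_probs[0])
    mixed = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(num_classes)
    ]
    return weights, mixed


# Hypothetical gate scores and expert outputs for one image
# (order assumed: EfficientNet-B0, DenseNet-121, Swin-Tiny).
gate_logits = [2.0, 0.5, 1.0]
expert_probs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.6, 0.1, 0.3],
]
weights, mixed = soft_moe_predict(gate_logits, expert_probs)
predicted_class = mixed.index(max(mixed))
```

Because the gate weights form a convex combination, the mixed vector stays a valid probability distribution, and every expert contributes to every prediction unless its weight is driven to zero.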
If this is right
- The soft-routing approach captures both local and global representations more effectively than any one architecture alone.
- Two-stage refinement training improves stability when class distributions are heavily skewed.
- The same framework produces strong results across potato, durian, and sesame leaf datasets without dataset-specific redesign.
- Dynamic expert weighting removes the need to manually select or ensemble architectures for each new crop.
Where Pith is reading between the lines
- The routing mechanism may generalize to other image domains that combine local texture and global structure cues.
- Inference cost could remain close to that of a single model if only a subset of experts is activated per image.
- Extending the set of experts or conditioning the gate on metadata such as crop type could further improve robustness.
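The inference-cost point in the list above depends on sparse activation, which is a departure from the soft routing the paper describes. As a rough illustration only, the sketch below keeps the `k` highest-scoring experts and renormalizes their weights. The function name `top_k_gate` and the scores are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(gate_logits, k):
    """Keep the k largest gate scores, renormalize over them, zero the rest."""
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    keep = order[:k]
    kept_weights = softmax([gate_logits[i] for i in keep])
    weights = [0.0] * len(gate_logits)
    for i, w in zip(keep, kept_weights):
        weights[i] = w
    return weights


# Hypothetical gate scores for one image; only two experts would be run.
weights = top_k_gate([2.0, 0.5, 1.0], k=2)
active = [i for i, w in enumerate(weights) if w > 0.0]
```

Any saving would also depend on the gate being cheap to evaluate, since it has to run before the experts are selected.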
Load-bearing premise
That the three chosen architectures supply sufficiently complementary features that soft routing can combine without introducing instability or overfitting on imbalanced data.
What would settle it
Re-training the three experts independently on the potato dataset and checking whether the MoE still exceeds the best single expert by at least 5 percent in F1-score.
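The check described above can be sketched as a macro-averaged F1 computation followed by a margin test. This is an assumption-laden illustration: the page does not say whether the paper averages F1 per class or by support, and the margin is read here as 5 absolute points on a 0 to 1 scale. The helper names and all numbers are hypothetical.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


def exceeds_best_expert(moe_f1, expert_f1s, margin=0.05):
    """True if the MoE beats the strongest single expert by at least `margin`."""
    return moe_f1 - max(expert_f1s) >= margin


# Hypothetical labels and predictions for a three-class problem.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
score = macro_f1(y_true, y_pred, num_classes=3)

# Hypothetical F1 values, not the paper's results.
clear_win = exceeds_best_expert(0.90, [0.80, 0.84, 0.82])
marginal = exceeds_best_expert(0.86, [0.80, 0.84, 0.82])
```

Repeating the comparison across several seeds or folds would show whether the margin is stable rather than a single-run effect.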
Original abstract
Plant leaf disease classification is crucial for crop protection and precision agriculture but remains challenging under complex backgrounds, illumination variations, and severe class imbalance. Moreover, single-architecture models often fail to effectively capture both local and global representations. To address these challenges, this study proposes an adaptive soft Mixture-of-Experts (MoE) framework with cross-architectural routing that integrates EfficientNet-B0, DenseNet-121, and Swin-Tiny to exploit complementary multi-scale, local, and global features. A soft gating mechanism dynamically assigns input-dependent expert weights, while a two-stage refinement training strategy improves optimization stability and generalization. Experiments on a highly imbalanced potato leaf disease dataset achieve 91.68% recall and 92.62% F1-score, surpassing the strongest individual expert by 5.91% and 5.03%, respectively. Additional evaluations on durian and sesame leaf disease datasets yield F1-scores of 94.03% and 97.04%, demonstrating robust cross-dataset generalization and the potential of the proposed framework for reliable real-world crop health monitoring.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an adaptive soft Mixture-of-Experts framework with cross-architectural routing that integrates EfficientNet-B0, DenseNet-121, and Swin-Tiny to capture complementary multi-scale, local, and global features for plant leaf disease classification. A two-stage refinement training strategy is used to improve stability. On a highly imbalanced potato leaf disease dataset the model reports 91.68% recall and 92.62% F1-score, exceeding the strongest single expert by 5.91% and 5.03% respectively; additional F1-scores of 94.03% and 97.04% are reported on durian and sesame datasets.
Significance. If the performance gains are shown to arise from input-dependent combination of complementary features rather than routing collapse or uncontrolled training effects, the work would supply a concrete example of cross-architecture MoE for imbalanced agricultural imagery. The evaluation across three datasets supplies modest evidence of generalization beyond a single collection.
major comments (2)
- [Abstract] Abstract: the reported metrics (91.68% recall, 92.62% F1) are presented without any description of data partitioning, imbalance handling, statistical testing, or ablation controls, preventing evaluation of whether the 5.91% and 5.03% lifts over the strongest expert are attributable to the proposed routing.
- [Experiments] Experiments (or equivalent results section): no statistics on gating entropy, expert utilization frequencies, or per-class routing distributions are supplied, leaving open the possibility that soft routing collapses to near-one-hot selection on the majority class and that the observed gains are not produced by genuine cross-architectural mixing.
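The routing diagnostics requested in the second major comment could be computed along the following lines. This is a sketch under the assumption that the gate's per-image weight vectors and the ground-truth labels are available after evaluation. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def gate_entropy(weights):
    """Shannon entropy (nats) of one gate weight vector."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def routing_diagnostics(gate_weights, labels):
    """Return mean gate entropy, mean weight per expert, and per-class means."""
    n = len(gate_weights)
    n_experts = len(gate_weights[0])
    mean_entropy = sum(gate_entropy(w) for w in gate_weights) / n
    utilization = [sum(w[e] for w in gate_weights) / n
                   for e in range(n_experts)]
    grouped = defaultdict(list)
    for w, y in zip(gate_weights, labels):
        grouped[y].append(w)
    per_class = {
        y: [sum(w[e] for w in ws) / len(ws) for e in range(n_experts)]
        for y, ws in grouped.items()
    }
    return mean_entropy, utilization, per_class


# Toy gate outputs for four images and their class labels.
gate_weights = [
    [1 / 3, 1 / 3, 1 / 3],  # fully mixed
    [1.0, 0.0, 0.0],        # collapsed to one expert
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
]
labels = [0, 0, 1, 1]
mean_entropy, utilization, per_class = routing_diagnostics(gate_weights, labels)
```

With three experts the entropy ranges from 0 (one-hot selection) to ln 3, about 1.10 nats (uniform mixing), so a mean near zero on the majority class would point to the collapse the referee describes.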
minor comments (1)
- [Method] The description of the two-stage refinement training would be clearer if the loss terms, learning-rate schedules, and any entropy or load-balancing regularizers were stated explicitly.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the presentation of our results. We address each major comment below and indicate the revisions we will make.
- Referee: [Abstract] Abstract: the reported metrics (91.68% recall, 92.62% F1) are presented without any description of data partitioning, imbalance handling, statistical testing, or ablation controls, preventing evaluation of whether the 5.91% and 5.03% lifts over the strongest expert are attributable to the proposed routing.
  Authors: We agree the abstract is too concise on these points. The full manuscript details stratified 5-fold cross-validation for partitioning, class-weighted cross-entropy loss for imbalance, and ablation studies comparing the MoE against single experts. We will revise the abstract to include a brief clause noting these elements and the use of statistical testing, while staying within length limits.
  revision: yes
- Referee: [Experiments] Experiments (or equivalent results section): no statistics on gating entropy, expert utilization frequencies, or per-class routing distributions are supplied, leaving open the possibility that soft routing collapses to near-one-hot selection on the majority class and that the observed gains are not produced by genuine cross-architectural mixing.
  Authors: This is a valid concern. The current manuscript does not report these diagnostics. In the revision we will add a dedicated subsection with gating entropy values, expert utilization histograms, and per-class routing distributions across the potato dataset (and the other two datasets) to demonstrate that routing remains input-dependent and does not collapse to majority-class one-hot selection.
  revision: yes
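The first response names stratified 5-fold cross-validation and class-weighted cross-entropy without giving formulas. The sketch below shows one common way to realize both, assuming inverse-frequency weights of the form N / (K * n_c) and a loss normalized by the summed weights. The paper's actual weighting scheme is not stated on this page, and all helper names and data are hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: N / (K * n_c) for each class c."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class), normalized by the summed weights."""
    num = sum(class_weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    den = sum(class_weights[y] for y in labels)
    return num / den


def stratified_folds(labels, n_folds=5):
    """Assign sample indices to folds round-robin within each class.

    Deterministic for clarity; a real split would shuffle within classes first.
    """
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[j % n_folds].append(i)
    return folds


# Toy imbalanced dataset: 10 samples of class 0, 5 of class 1.
labels = [0] * 10 + [1] * 5
class_weights = balanced_class_weights(labels)
folds = stratified_folds(labels, n_folds=5)

# Uninformative predictions give a loss of ln 2 regardless of the weights.
probs = [[0.5, 0.5]] * len(labels)
loss = weighted_cross_entropy(probs, labels, class_weights)
```

Each fold here preserves the 2:1 class ratio of the full set, which is the property stratification is meant to guarantee on skewed data.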
Circularity Check
No circularity: empirical framework with direct experimental claims
The paper proposes an adaptive soft MoE architecture combining three backbones and reports performance metrics (recall, F1) on three leaf-disease datasets. No equations, derivations, or parameter-fitting steps are described that would reduce a claimed prediction to a quantity defined by the same data or by self-citation. All reported gains are presented as outcomes of training and evaluation rather than analytic results derived from fitted inputs. The derivation chain is therefore self-contained and contains no load-bearing self-referential steps.