LEMUR Neural Network Dataset: Towards Seamless AutoML

Arash Torabi Goodarzi, Roman Kochnev, Waleed Khalid, Furui Qin, Tolgay Atinc Uzun, Yashkumar Sanjaybhai Dhameliya, Yash Kanubhai Kathiriya, Zofia Antonina Bentyn, Dmitry Ignatov, Radu Timofte · 2025 · cs.LG · arXiv 2504.10552

11 Pith papers cite this work. Polarity classification is still indexing.

abstract

Neural networks are the backbone of modern artificial intelligence, but designing, evaluating, and comparing them remains labor-intensive. While numerous datasets exist for training, there are few standardized collections of the models themselves. We introduce LEMUR, an open-source dataset and framework that provides a large collection of PyTorch-based neural networks across tasks such as classification, segmentation, detection, and natural language processing. Each model follows a unified template, with configurations and results stored in a structured database to ensure consistency and reproducibility. LEMUR integrates automated hyperparameter optimization via Optuna, includes statistical analysis and visualization tools, and offers an API for seamless access to performance data. The framework is extensible, allowing researchers to add new models, datasets, or metrics without breaking compatibility. By standardizing implementations and unifying evaluation, LEMUR aims to accelerate AutoML research, enable fair benchmarking, and reduce barriers to large-scale neural network experimentation. To support adoption and collaboration, LEMUR and its plugins are released under the MIT license at https://github.com/ABrain-One/nn-dataset, https://github.com/ABrain-One/nn-plots, and https://github.com/ABrain-One/nn-vr.
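
As a rough illustration of the abstract's idea of keeping configurations and results in a structured database behind a small access API, the following minimal sketch uses only Python's built-in `sqlite3` and `json` modules. It is not taken from the LEMUR codebase: the `runs` table, its columns, the helpers `open_store`, `record_run`, and `best_run`, and every sample value are hypothetical and exist only for illustration.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a results store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS runs (
               task        TEXT NOT NULL,
               dataset     TEXT NOT NULL,
               model       TEXT NOT NULL,
               hyperparams TEXT NOT NULL,  -- JSON-encoded dict
               accuracy    REAL NOT NULL
           )"""
    )
    return conn


def record_run(conn, task, dataset, model, hyperparams, accuracy):
    """Store one training result together with the configuration that produced it."""
    conn.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
        (task, dataset, model, json.dumps(hyperparams, sort_keys=True), accuracy),
    )
    conn.commit()


def best_run(conn, task, dataset):
    """Return the highest-accuracy run for a task/dataset pair, or None if there is none."""
    row = conn.execute(
        "SELECT model, hyperparams, accuracy FROM runs "
        "WHERE task = ? AND dataset = ? "
        "ORDER BY accuracy DESC LIMIT 1",
        (task, dataset),
    ).fetchone()
    if row is None:
        return None
    model, hyperparams, accuracy = row
    return {
        "model": model,
        "hyperparams": json.loads(hyperparams),
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    conn = open_store()
    # Made-up placeholder values, not results from the paper.
    record_run(conn, "img-classification", "cifar-10", "ModelA",
               {"lr": 0.01, "batch": 64}, 0.71)
    record_run(conn, "img-classification", "cifar-10", "ModelB",
               {"lr": 0.001, "batch": 32}, 0.78)
    print(best_run(conn, "img-classification", "cifar-10"))
```

Storing the hyperparameters alongside each score in one row means any reported number can be traced back to the exact configuration that produced it, which is the reproducibility property the abstract emphasizes. A hyperparameter optimizer could call a helper like `record_run` once per trial.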

representative citing papers

Convergence Theory for Iterative LLM-Based Neural Architecture Search: A Parametric Cross-Entropy Framework with Closed-Form Proxy Reliability

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

Iterative LLM-NAS is equivalent to a parametric cross-entropy method with proven monotonic quality improvement, geometric convergence of elite probability, and a closed-form proxy reliability rho_S = (6/pi) arcsin(rho_P(SNR)/2), partially confirmed on 3300 architectures.

Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Fine-tuned 7B LLMs generating unified diffs for neural architecture refinement achieve 66-75% valid rates and 64-66% mean first-epoch accuracy, outperforming full-generation baselines by large margins while cutting output length by 75-85%.

Closed-Loop LLM Discovery of Non-Standard Channel Priors in Vision Models

cs.CV · 2026-01-13 · unverdicted · novelty 6.0

Closed-loop LLM search with AST-generated examples discovers non-standard channel widths that improve vision model performance over initial architectures on CIFAR-100.

Enhancing LLM-Based Neural Network Generation: Few-Shot Prompting and Efficient Validation for Automated Architecture Design

cs.CV · 2025-12-30 · conditional · novelty 6.0

Three-example few-shot prompting optimizes LLM-generated vision architectures while a whitespace-normalized hash provides 100x faster duplicate detection than AST parsing across seven benchmarks.

A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks

cs.CV · 2025-12-03 · unverdicted · novelty 6.0

NN-RAG extracts 1,289 candidate neural modules from 19 PyTorch repositories, validates 941 of them, and supplies roughly 72% of the novel structures in the LEMUR dataset while enabling cross-repository migration.

Systematic Exploration of 4-Expert Heterogeneous Mixture-of-Experts via Automated Pipeline Search

cs.LG · 2026-06-21 · unverdicted · novelty 5.0

Automated search of 4463 heterogeneous 4-expert MoE models found enumeration bias anchoring the space to AirNet and ranked ShuffleNet/MobileNetV3 as top performers.

From Code to Prediction: Fine-Tuning LLMs for Neural Network Performance Classification in NNGPT

cs.LG · 2026-05-05 · unverdicted · novelty 5.0 · 2 refs

Fine-tuned LLMs reach 80% accuracy in predicting which dataset a given neural network's code performs better on, outperforming metadata-based prompts (70%).

Real Image Denoising with Knowledge Distillation for High-Performance Mobile NPUs

cs.CV · 2026-05-05 · unverdicted · novelty 5.0

A 1.96M-parameter LiteDenoiseNet student model achieves 37.58 dB PSNR on full-resolution real image denoising benchmarks while running in 34-46 ms on mobile NPUs by leveraging NPU-compatible primitives and high-alpha knowledge distillation.

Towards Robust Training in NNGPT AutoML Pipeline: A Loss-Optimizer Pairing Selection Study

cs.LG · 2026-06-18 · conditional · novelty 4.0

Empirical grid search over 18 loss-optimizer pairs on 33 LEMUR architectures shows cross-entropy with Adam/AdamW is most robust while NGL and SGD-based pairings vary sharply by model family.

Preparation of Fractal-Inspired Computational Architectures for Advanced Large Language Model Analysis

cs.LG · 2025-11-10 · unverdicted · novelty 4.0 · 2 refs

FractalNet automatically generates and tests over 1,200 CNN architectures based on recursive fractal templates, achieving up to 80.18% accuracy on CIFAR-10 after five training epochs.

MobileAgeNet: Lightweight Facial Age Estimation for Mobile Deployment

cs.CV · 2026-04-18 · unverdicted · novelty 3.0

MobileAgeNet uses a MobileNetV3-Large backbone with a regression head to achieve 4.65 years mean absolute error in age estimation and 14.4 ms on-device latency with 3.23 million parameters.

