Can GPT-4 Perform Neural Architecture Search?
We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS), the task of designing effective neural architectures. Our proposed approach, GPT-4 Enhanced Neural archItectUre Search (GENIUS), leverages the generative capabilities of GPT-4 as a black-box optimiser to quickly navigate the architecture search space, pinpoint promising candidates, and iteratively refine these candidates to improve performance. We assess GENIUS across several benchmarks, comparing it with existing state-of-the-art NAS techniques to illustrate its effectiveness. Rather than targeting state-of-the-art performance, our objective is to highlight GPT-4's potential to assist research on a challenging technical problem through a simple prompting scheme that requires relatively limited domain expertise (code available at https://github.com/mingkai-zheng/GENIUS). More broadly, we believe our preliminary results point to future research that harnesses general purpose language models for diverse optimisation tasks. We also highlight important limitations to our study, and note implications for AI safety.
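The abstract describes the language model as a black-box optimiser that proposes candidates and refines them from feedback. The loop below is a minimal sketch of that general idea, not the authors' implementation (see their repository for that). Everything in it is an assumption made for illustration: the tuple encoding of an architecture, the `search` function, the toy scoring function, and the stub proposer that stands in for a GPT-4 call so the example runs offline.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical encoding: one operation index per layer.
Architecture = Tuple[int, ...]
History = List[Tuple[Architecture, float]]


def search(
    propose: Callable[[History], Architecture],
    evaluate: Callable[[Architecture], float],
    iterations: int,
) -> Tuple[Architecture, float]:
    """Propose a candidate, score it, record the result, and repeat."""
    history: History = []
    for _ in range(iterations):
        # In an LLM-driven setup this call would send the history as a prompt.
        candidate = propose(history)
        # In a real setup this would train or look up the candidate's accuracy.
        score = evaluate(candidate)
        history.append((candidate, score))
    # Return the best (architecture, score) pair seen so far.
    return max(history, key=lambda item: item[1])


# --- Illustrative stand-ins (assumptions, not from the paper) ---
NUM_LAYERS, NUM_OPS = 4, 3
TARGET: Architecture = (2, 0, 1, 2)  # arbitrary "ideal" architecture for the toy score


def toy_evaluate(arch: Architecture) -> float:
    """Toy score: fraction of layers that match TARGET."""
    return sum(a == t for a, t in zip(arch, TARGET)) / NUM_LAYERS


def make_stub_proposer(seed: int) -> Callable[[History], Architecture]:
    """Stub for the language model: mutate one layer of the best candidate so far."""
    rng = random.Random(seed)

    def propose(history: History) -> Architecture:
        if not history:
            return tuple(rng.randrange(NUM_OPS) for _ in range(NUM_LAYERS))
        best, _ = max(history, key=lambda item: item[1])
        mutated = list(best)
        mutated[rng.randrange(NUM_LAYERS)] = rng.randrange(NUM_OPS)
        return tuple(mutated)

    return propose


if __name__ == "__main__":
    best_arch, best_score = search(make_stub_proposer(0), toy_evaluate, iterations=20)
    print(best_arch, best_score)
```

Passing the proposer in as a callable keeps the loop independent of any particular model API, so the stub could be swapped for a function that formats the history into a prompt and parses the reply.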
Forward citations
Cited by 10 Pith papers
- Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs
  Fine-tuned 7B LLMs generating unified diffs for neural architecture refinement achieve 66-75% valid rates and 64-66% mean first-epoch accuracy, outperforming full-generation baselines by large margins while cutting ou...
- MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
  MLE-bench evaluates frontier language models as ML engineering agents on 75 Kaggle competitions, with the top setup (o1-preview + AIDE) reaching bronze medal level in 16.9% of tasks.
- EvoPrompt: Connecting LLMs with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  EvoPrompt uses LLMs to run evolutionary operators on populations of prompts, outperforming human-engineered prompts by up to 25% on BIG-Bench Hard tasks across 31 datasets.
- LLM-Guided Neural Architecture Search for Robust Co-Design of Physical Neural Networks
  UH-NAS uses LLMs as evolutionary operators in a swappable-backend NAS to co-optimize neural architectures for accuracy and inference energy on physical hardware such as optical MZIs, producing more diverse and robust ...
- AutoMCU: Feasibility-First MCU Neural Network Customization via LLM-based Multi-Agent Systems
  AutoMCU uses feasibility-first LLM multi-agent coordination to automate MCU-constrained neural network design, delivering competitive accuracy on CIFAR-10/100 in 1-2 hours versus hundreds of GPU hours for prior HW-NAS...
- Structuring Open-Ended NAS: Semi-Automated Design Knowledge Structuring with LLMs for Efficient Neural Architecture Search
  Authors structure architectural design knowledge with LLMs to create an open-ended NAS space and introduce FairNAD, which finds architectures improving 0.84, 2.17, and 2.35 points over SOTA on CIFAR-10, CIFAR-100, and...
- LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search
  LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, ...
- FELA: A Multi-Agent Evolutionary System for Feature Engineering of Industrial Event Log Data
  FELA deploys specialized LLM agents in an evolutionary framework to generate, validate, and refine explainable features from heterogeneous industrial event logs, improving downstream model performance.
- LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers
  LLM-FE is a framework that treats feature engineering as LLM-driven program search with data feedback, reporting consistent gains over baselines on classification and regression tabular tasks.
- CoLLM-NAS: Collaborative Large Language Models for Efficient Knowledge-Guided Neural Architecture Search
  CoLLM-NAS introduces a collaborative two-LLM framework with Navigator, Generator, and Coordinator modules to perform knowledge-guided neural architecture search, reporting state-of-the-art results on ImageNet and NAS-...