{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2023:AP2ERP4WROMF6ADSYSWAL5U4KF","short_pith_number":"pith:AP2ERP4W","schema_version":"1.0","canonical_sha256":"03f448bf968b985f0072c4ac05f69c515c7535edb675e2102f91ffe4f89aa05c","source":{"kind":"arxiv","id":"2309.03409","version":3},"attestation_state":"computed","paper":{"title":"Large Language Models as Optimizers","license":"http://creativecommons.org/publicdomain/zero/1.0/","headline":"Large language models can optimize solutions by iteratively generating new candidates from a prompt that lists all prior attempts together with their scores.","cross_cats":["cs.AI","cs.CL"],"primary_cat":"cs.LG","authors_text":"Chengrun Yang, Denny Zhou, Hanxiao Liu, Quoc V. Le, Xinyun Chen, Xuezhi Wang, Yifeng Lu","submitted_at":"2023-09-07T00:07:15Z","abstract_excerpt":"Optimization is ubiquitous. While derivative-based algorithms have been powerful tools for various problems, the absence of gradient imposes challenges on many real-world applications. In this work, we propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers, where the optimization task is described in natural language. In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values, then the new solutions are evaluated and added to the prompt for th"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2309.03409","kind":"arxiv","version":3},"metadata":{"license":"http://creativecommons.org/publicdomain/zero/1.0/","primary_cat":"cs.LG","submitted_at":"2023-09-07T00:07:15Z","cross_cats_sorted":["cs.AI","cs.CL"],"title_canon_sha256":"c7267a9ba911c0155389c05e1a869f5061975eba6a5996cee9757c2e248dba5a","abstract_canon_sha256":"696a3ad040aa7c54bce5841a41ecbdc0ec1f364eb2796cfc81219edc4c0c3f43"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:39:19.769749Z","signature_b64":"namU8Wke+5UMOCjdCmsifQJq9+/V64eGwK64H3XRZByHkEzNP45HeE6RhsCD5ghwOaMSQkkr/wfryIQu/nLmBw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"03f448bf968b985f0072c4ac05f69c515c7535edb675e2102f91ffe4f89aa05c","last_reissued_at":"2026-05-17T23:39:19.769046Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:39:19.769046Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Large Language Models as Optimizers","license":"http://creativecommons.org/publicdomain/zero/1.0/","headline":"Large language models can optimize solutions by iteratively generating new candidates from a prompt that lists all prior attempts together with their scores.","cross_cats":["cs.AI","cs.CL"],"primary_cat":"cs.LG","authors_text":"Chengrun Yang, Denny Zhou, Hanxiao Liu, Quoc V. Le, Xinyun Chen, Xuezhi Wang, Yifeng Lu","submitted_at":"2023-09-07T00:07:15Z","abstract_excerpt":"Optimization is ubiquitous. While derivative-based algorithms have been powerful tools for various problems, the absence of gradient imposes challenges on many real-world applications. In this work, we propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers, where the optimization task is described in natural language. In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values, then the new solutions are evaluated and added to the prompt for th"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"With a variety of LLMs, we demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That an LLM, when shown a growing list of prior solutions and their numeric scores inside a prompt, will reliably generate new solutions that improve on the best previous score rather than plateau or regress.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Large language models can optimize by being prompted with histories of past solutions and scores to propose better ones, producing prompts that raise accuracy up to 8% on GSM8K and 50% on Big-Bench Hard over human-designed baselines.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Large language models can optimize solutions by iteratively generating new candidates from a prompt that lists all prior attempts together with their scores.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"03a126071ea71762f75dabe3304d110b6f4b17580b8f1f54f2d53aaa5eb20bc9"},"source":{"id":"2309.03409","kind":"arxiv","version":3},"verdict":{"id":"26c1b0a2-2213-4933-8b78-581198dc790b","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T23:59:25.433751Z","strongest_claim":"With a variety of LLMs, we demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.","one_line_summary":"Large language models can optimize by being prompted with histories of past solutions and scores to propose better ones, producing prompts that raise accuracy up to 8% on GSM8K and 50% on Big-Bench Hard over human-designed baselines.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That an LLM, when shown a growing list of prior solutions and their numeric scores inside a prompt, will reliably generate new solutions that improve on the best previous score rather than plateau or regress.","pith_extraction_headline":"Large language models can optimize solutions by iteratively generating new candidates from a prompt that lists all prior attempts together with their scores."},"references":{"count":51,"sample":[{"doi":"","year":null,"title":"PaLM 2 Technical Report","work_id":"905ee9a7-ea61-4a94-bd62-2600cbe3e315","ref_index":1,"cited_arxiv_id":"2305.10403","is_internal_anchor":true},{"doi":"","year":null,"title":"Constitutional AI: Harmlessness from AI Feedback","work_id":"faaaa4e0-2676-4fac-a0b4-99aef10d2095","ref_index":2,"cited_arxiv_id":"2212.08073","is_internal_anchor":true},{"doi":"","year":null,"title":"arXiv preprint arXiv:2305.17126 , year=","work_id":"1447b78e-0a79-4af6-8cd4-93220e680d2b","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Dohan and David R","work_id":"80c6bf1e-aa52-4830-9f36-0616ee2d8ef8","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Teaching Large Language Models to Self-Debug","work_id":"cdfb2680-220c-44eb-9edd-867b75fb821d","ref_index":5,"cited_arxiv_id":"2304.05128","is_internal_anchor":true}],"resolved_work":51,"snapshot_sha256":"1167c17795a898642d17396677c24d26a47a2e13074cb32b68985a20f22a7215","internal_anchors":22},"formal_canon":{"evidence_count":2,"snapshot_sha256":"a9e2c1afcd7f0509ee97c2ab69a703856fc472be4b417d398ee73c015140427c"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2309.03409","created_at":"2026-05-17T23:39:19.769156+00:00"},{"alias_kind":"arxiv_version","alias_value":"2309.03409v3","created_at":"2026-05-17T23:39:19.769156+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2309.03409","created_at":"2026-05-17T23:39:19.769156+00:00"},{"alias_kind":"pith_short_12","alias_value":"AP2ERP4WROMF","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_16","alias_value":"AP2ERP4WROMF6ADS","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_8","alias_value":"AP2ERP4W","created_at":"2026-05-18T12:33:33.725879+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":33,"internal_anchor_count":33,"sample":[{"citing_arxiv_id":"2605.15665","citing_title":"PRISM: Prompt Reliability via Iterative Simulation and Monitoring for Enterprise Conversational AI","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2605.15721","citing_title":"Contexting as Recommendation: Evolutionary Collaborative Filtering for Context Engineering","ref_index":35,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16233","citing_title":"FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast","ref_index":22,"is_internal_anchor":true},{"citing_arxiv_id":"2605.17247","citing_title":"Towards Robust Argumentative Essay Understanding via TIDE: An Interactive Framework with Trial and Debate","ref_index":99,"is_internal_anchor":true},{"citing_arxiv_id":"2605.19102","citing_title":"Prompt Optimization for LLM Code Generation via Reinforcement Learning","ref_index":35,"is_internal_anchor":true},{"citing_arxiv_id":"2605.19633","citing_title":"optimize_anything: A Universal API for Optimizing any Text Parameter","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"2506.08332","citing_title":"ORFS-agent: Tool-Using Agents for Chip Design Optimization","ref_index":61,"is_internal_anchor":true},{"citing_arxiv_id":"2509.12643","citing_title":"Learn to Relax with Large Language Models: Solving Constraint Optimization Problems via Bidirectional Coevolution","ref_index":1,"is_internal_anchor":true},{"citing_arxiv_id":"2511.15408","citing_title":"Chinese Short-Form Creative Content Generation via Explanation-Oriented Multi-Objective Optimization","ref_index":77,"is_internal_anchor":true},{"citing_arxiv_id":"2402.17762","citing_title":"Massive Activations in Large Language Models","ref_index":157,"is_internal_anchor":true},{"citing_arxiv_id":"2605.09018","citing_title":"Evolutionary Ensemble of Agents","ref_index":25,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12484","citing_title":"Learning, Fast and Slow: Towards LLMs That Adapt Continually","ref_index":67,"is_internal_anchor":true},{"citing_arxiv_id":"2605.14443","citing_title":"Prompting Policies for Multi-step Reasoning and Tool-Use in Black-box LLMs with Iterative Distillation of Experience","ref_index":1,"is_internal_anchor":true},{"citing_arxiv_id":"2507.21046","citing_title":"A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence","ref_index":48,"is_internal_anchor":true},{"citing_arxiv_id":"2604.03189","citing_title":"Reflective Context Learning: Studying the Optimization Primitives of Context Space","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2604.02988","citing_title":"Self-Optimizing Multi-Agent Systems for Deep Research","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12484","citing_title":"Learning, Fast and Slow: Towards LLMs That Adapt Continually","ref_index":66,"is_internal_anchor":true},{"citing_arxiv_id":"2402.07927","citing_title":"A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2605.08094","citing_title":"MedThink: Enhancing Diagnostic Accuracy in Small Models via Teacher-Guided Reasoning Correction","ref_index":35,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10877","citing_title":"Neural at ArchEHR-QA 2026: One Method Fits All: Unified Prompt Optimization for Clinical QA over EHRs","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2605.09018","citing_title":"Evolutionary Ensemble of Agents","ref_index":17,"is_internal_anchor":true},{"citing_arxiv_id":"2310.03714","citing_title":"DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines","ref_index":61,"is_internal_anchor":true},{"citing_arxiv_id":"2605.04107","citing_title":"TSCG: Deterministic Tool-Schema Compilation for Agentic LLM Deployments","ref_index":29,"is_internal_anchor":true},{"citing_arxiv_id":"2604.18206","citing_title":"A Control Architecture for Training-Free Memory Use","ref_index":28,"is_internal_anchor":true},{"citing_arxiv_id":"2604.17937","citing_title":"ContraPrompt: Contrastive Prompt Optimization via Dyadic Reasoning Trace Analysis","ref_index":7,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF","json":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF.json","graph_json":"https://pith.science/api/pith-number/AP2ERP4WROMF6ADSYSWAL5U4KF/graph.json","events_json":"https://pith.science/api/pith-number/AP2ERP4WROMF6ADSYSWAL5U4KF/events.json","paper":"https://pith.science/paper/AP2ERP4W"},"agent_actions":{"view_html":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF","download_json":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF.json","view_paper":"https://pith.science/paper/AP2ERP4W","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2309.03409&json=true","fetch_graph":"https://pith.science/api/pith-number/AP2ERP4WROMF6ADSYSWAL5U4KF/graph.json","fetch_events":"https://pith.science/api/pith-number/AP2ERP4WROMF6ADSYSWAL5U4KF/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF/action/timestamp_anchor","attest_storage":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF/action/storage_attestation","attest_author":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF/action/author_attestation","sign_citation":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF/action/citation_signature","submit_replication":"https://pith.science/pith/AP2ERP4WROMF6ADSYSWAL5U4KF/action/replication_record"}},"created_at":"2026-05-17T23:39:19.769156+00:00","updated_at":"2026-05-17T23:39:19.769156+00:00"}