{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:ZSJZXFHZKAXSCGF7S6E4XD3HC7","short_pith_number":"pith:ZSJZXFHZ","schema_version":"1.0","canonical_sha256":"cc939b94f9502f2118bf9789cb8f6717df69436d947181f1da1dd97ab78eacb0","source":{"kind":"arxiv","id":"2603.00729","version":1},"attestation_state":"computed","paper":{"title":"Qwen3-Coder-Next Technical Report","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"An 80-billion-parameter model activates only three billion at inference to reach competitive results on coding agent benchmarks.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Binyuan Hui, Fan Zhou, Jiajun Zhang, Jiawei Chen, Jiaxi Yang, Jinxi Wei, Junyang Lin, Kaixin Li, Kashun Shum, Lei Zhang, Mingze Li, Mouxiang Chen, Ruisheng Cao, Wenting Zhao, Xuwu Wang, Yuheng Jing, Yunlong Feng, Zeyao Ma, Zeyu Cui, Zongmeng Zhang","submitted_at":"2026-02-28T16:25:04Z","abstract_excerpt":"We present Qwen3-Coder-Next, an open-weight language model specialized for coding agents. Qwen3-Coder-Next is an 80-billion-parameter model that activates only 3 billion parameters during inference, enabling strong coding capability with efficient inference. In this work, we explore how far strong training recipes can push the capability limits of models with small parameter footprints. To achieve this, we perform agentic training through large-scale synthesis of verifiable coding tasks paired with executable environments, allowing learning directly from environment feedback via mid-training a"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":true},"canonical_record":{"source":{"id":"2603.00729","kind":"arxiv","version":1},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.CL","submitted_at":"2026-02-28T16:25:04Z","cross_cats_sorted":[],"title_canon_sha256":"688f60c3d31d34179dbd3273b33f3f4353959d060b9876d47f0c75acf0af7a92","abstract_canon_sha256":"bb52bbc009df70c868ff6136ce1cbd69618c2d633c6881c1d6e9ca62c5771629"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:46.558305Z","signature_b64":"JUXwXCGHiZUTlqXyHwPcWf7Fa+OczbeOryJa3s92YS6a17+BRD/hipCCDQ9XdMJWOsaO5mU5L/jQst2Qx08jCQ==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"cc939b94f9502f2118bf9789cb8f6717df69436d947181f1da1dd97ab78eacb0","last_reissued_at":"2026-05-17T23:38:46.557853Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:46.557853Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Qwen3-Coder-Next Technical Report","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"An 80-billion-parameter model activates only three billion at inference to reach competitive results on coding agent benchmarks.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Binyuan Hui, Fan Zhou, Jiajun Zhang, Jiawei Chen, Jiaxi Yang, Jinxi Wei, Junyang Lin, Kaixin Li, Kashun Shum, Lei Zhang, Mingze Li, Mouxiang Chen, Ruisheng Cao, Wenting Zhao, Xuwu Wang, Yuheng Jing, Yunlong Feng, Zeyao Ma, Zeyu Cui, Zongmeng Zhang","submitted_at":"2026-02-28T16:25:04Z","abstract_excerpt":"We present Qwen3-Coder-Next, an open-weight language model specialized for coding agents. Qwen3-Coder-Next is an 80-billion-parameter model that activates only 3 billion parameters during inference, enabling strong coding capability with efficient inference. In this work, we explore how far strong training recipes can push the capability limits of models with small parameter footprints. To achieve this, we perform agentic training through large-scale synthesis of verifiable coding tasks paired with executable environments, allowing learning directly from environment feedback via mid-training a"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Qwen3-Coder-Next achieves competitive performance relative to its active parameter count across agent-centric benchmarks including SWE-Bench and Terminal-Bench.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That large-scale synthesis of verifiable coding tasks paired with executable environments produces training signals that generalize to real-world coding agent use cases without significant distribution shift.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"An 80B model with 3B active parameters achieves competitive coding-agent performance through agentic training on verifiable tasks and releases open weights.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"An 80-billion-parameter model activates only three billion at inference to reach competitive results on coding agent benchmarks.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"793be288c9ce18c00a6e73d3cb52a8ef94139963dc47337eddc9197d5feae86a"},"source":{"id":"2603.00729","kind":"arxiv","version":1},"verdict":{"id":"af112bd7-77f1-4b04-8d13-8fe7ecb7feb4","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T21:09:32.285141Z","strongest_claim":"Qwen3-Coder-Next achieves competitive performance relative to its active parameter count across agent-centric benchmarks including SWE-Bench and Terminal-Bench.","one_line_summary":"An 80B model with 3B active parameters achieves competitive coding-agent performance through agentic training on verifiable tasks and releases open weights.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That large-scale synthesis of verifiable coding tasks paired with executable environments produces training signals that generalize to real-world coding agent use cases without significant distribution shift.","pith_extraction_headline":"An 80-billion-parameter model activates only three billion at inference to reach competitive results on coding agent benchmarks."},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"5967965fa8969dd934ef1e66aa19bba7a09dd18d55fd01f11ded0a7db08ea862"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2603.00729","created_at":"2026-05-17T23:38:46.557923+00:00"},{"alias_kind":"arxiv_version","alias_value":"2603.00729v1","created_at":"2026-05-17T23:38:46.557923+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2603.00729","created_at":"2026-05-17T23:38:46.557923+00:00"},{"alias_kind":"pith_short_12","alias_value":"ZSJZXFHZKAXS","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"ZSJZXFHZKAXSCGF7","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"ZSJZXFHZ","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":22,"internal_anchor_count":22,"sample":[{"citing_arxiv_id":"2605.21384","citing_title":"SpecBench: Measuring Reward Hacking in Long-Horizon Coding Agents","ref_index":7,"is_internal_anchor":true},{"citing_arxiv_id":"2605.17862","citing_title":"$\\boldsymbol{f}$-OPD: Stabilizing Long-Horizon On-Policy Distillation with Freshness-Aware Control","ref_index":49,"is_internal_anchor":true},{"citing_arxiv_id":"2605.19660","citing_title":"OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond","ref_index":9,"is_internal_anchor":true},{"citing_arxiv_id":"2605.19932","citing_title":"PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents","ref_index":9,"is_internal_anchor":true},{"citing_arxiv_id":"2605.20075","citing_title":"CopT: Contrastive On-Policy Thinking with Continuous Spaces for General and Agentic Reasoning","ref_index":6,"is_internal_anchor":true},{"citing_arxiv_id":"2605.14084","citing_title":"CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing","ref_index":1,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12913","citing_title":"Revisiting DAgger in the Era of LLM-Agents","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2604.01496","citing_title":"From SWE-ZERO to SWE-HERO: Execution-free to Execution-based Fine-tuning for Software Engineering Agents","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2605.03195","citing_title":"Terminus-4B: Can a Smaller Model Replace Frontier LLMs at Agentic Execution Tasks?","ref_index":22,"is_internal_anchor":true},{"citing_arxiv_id":"2605.08301","citing_title":"Priming: Hybrid State Space Models From Pre-trained Transformers","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10876","citing_title":"AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents","ref_index":87,"is_internal_anchor":true},{"citing_arxiv_id":"2605.05902","citing_title":"Evaluating Non-English Developer Support in Machine Learning for Software Engineering","ref_index":76,"is_internal_anchor":true},{"citing_arxiv_id":"2604.22601","citing_title":"From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2605.04894","citing_title":"SynConfRoute: Syntax-Aware Routing for Efficient Code Completion with Small CodeLLMs","ref_index":39,"is_internal_anchor":true},{"citing_arxiv_id":"2605.00342","citing_title":"Making Every Verified Token Count: Adaptive Verification for MoE Speculative Decoding","ref_index":15,"is_internal_anchor":true},{"citing_arxiv_id":"2604.12144","citing_title":"VERITAS: Verifiable Epistemic Reasoning for Image-Derived Hypothesis Testing via Agentic Systems","ref_index":64,"is_internal_anchor":true},{"citing_arxiv_id":"2604.06861","citing_title":"REAgent: Requirement-Driven LLM Agents for Software Issue Resolution","ref_index":6,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08000","citing_title":"PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04443","citing_title":"DeonticBench: A Benchmark for Reasoning over Rules","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2604.16198","citing_title":"Bridging the Gap between User Intent and LLM: A Requirement Alignment Approach for Code Generation","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2604.22050","citing_title":"LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs","ref_index":38,"is_internal_anchor":true},{"citing_arxiv_id":"2605.03179","citing_title":"A Validated Prompt Bank for Malicious Code Generation: Separating Executable Weapons from Security Knowledge in 1,554 Consensus-Labeled Prompts","ref_index":53,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7","json":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7.json","graph_json":"https://pith.science/api/pith-number/ZSJZXFHZKAXSCGF7S6E4XD3HC7/graph.json","events_json":"https://pith.science/api/pith-number/ZSJZXFHZKAXSCGF7S6E4XD3HC7/events.json","paper":"https://pith.science/paper/ZSJZXFHZ"},"agent_actions":{"view_html":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7","download_json":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7.json","view_paper":"https://pith.science/paper/ZSJZXFHZ","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2603.00729&json=true","fetch_graph":"https://pith.science/api/pith-number/ZSJZXFHZKAXSCGF7S6E4XD3HC7/graph.json","fetch_events":"https://pith.science/api/pith-number/ZSJZXFHZKAXSCGF7S6E4XD3HC7/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7/action/timestamp_anchor","attest_storage":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7/action/storage_attestation","attest_author":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7/action/author_attestation","sign_citation":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7/action/citation_signature","submit_replication":"https://pith.science/pith/ZSJZXFHZKAXSCGF7S6E4XD3HC7/action/replication_record"}},"created_at":"2026-05-17T23:38:46.557923+00:00","updated_at":"2026-05-17T23:38:46.557923+00:00"}