{"paper":{"title":"RS-Claw: Progressive Active Tool Exploration via Hierarchical Skill Trees for Remote Sensing Agents","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"RS-Claw lets remote sensing agents actively explore tools via hierarchical skill trees, achieving up to 86% token compression while outperforming flat and RAG baselines.","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Chengfu Liu, Cheng Yang, Dongyang Hou, Haifeng Li, Hanwen Yu, Kai Ouyang, Liangtian Liu, Wentao Yang, Zeyuan Wang, Zichao Tang, Ziyu Li","submitted_at":"2026-05-13T11:49:18Z","abstract_excerpt":"The rise of multi-modal large language models (MLLMs) is shifting remote sensing (RS) intelligence from \"see\" to \"action\", as OpenClaw-style frameworks enable agents to autonomously operate massive RS image-processing tools for complex tasks. Existing RS agents adopt a passive selection paradigm for tool invocation, relying on either full tool registration (Flat) or retrieval-augmented generation (RAG). 
However, in the massive and multi-source heterogeneous RS tool ecosystem, such passive mechanisms struggle to dynamically balance \"context load\" and \"toolset completeness\" throughout task reaso"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"RS-Claw's active exploration mechanism effectively filters semantic noise and substantially frees up reasoning space, achieving an input token compression ratio of up to 86%, and comprehensively outperforming existing Flat and RAG baselines across complex reasoning evaluations on the Earth-Bench benchmark.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That hierarchically structuring tool descriptions via skill encapsulation allows the agent to perform on-demand sequential decision-making that accurately hits critical tools without omissions during long-horizon reasoning in the massive heterogeneous RS tool ecosystem.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"RS-Claw enables remote sensing agents to actively explore tools via hierarchical skill trees, achieving up to 86% token compression and outperforming flat registration and RAG baselines on Earth-Bench.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"RS-Claw lets remote sensing agents actively explore tools via hierarchical skill trees, achieving up to 86% token compression while outperforming flat and RAG 
baselines.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"2989d5ee3df2351c8211ebcc349e83c20e91daf1f9ef566661fb6c04a725a8a1"},"source":{"id":"2605.13391","kind":"arxiv","version":1},"verdict":{"id":"07611a89-d405-41c8-a250-fd1ebabb06d6","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T19:29:20.903863Z","strongest_claim":"RS-Claw's active exploration mechanism effectively filters semantic noise and substantially frees up reasoning space, achieving an input token compression ratio of up to 86%, and comprehensively outperforming existing Flat and RAG baselines across complex reasoning evaluations on the Earth-Bench benchmark.","one_line_summary":"RS-Claw enables remote sensing agents to actively explore tools via hierarchical skill trees, achieving up to 86% token compression and outperforming flat registration and RAG baselines on Earth-Bench.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That hierarchically structuring tool descriptions via skill encapsulation allows the agent to perform on-demand sequential decision-making that accurately hits critical tools without omissions during long-horizon reasoning in the massive heterogeneous RS tool ecosystem.","pith_extraction_headline":"RS-Claw lets remote sensing agents actively explore tools via hierarchical skill trees, achieving up to 86% token compression while outperforming flat and RAG baselines."},"references":{"count":45,"sample":[{"doi":"","year":2023,"title":"Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models","work_id":"077c8bd4-de74-49bd-be87-4cb72bac1b73","ref_index":1,"cited_arxiv_id":"2305.04091","is_internal_anchor":true},{"doi":"","year":2023,"title":"Toolformer: Language models can teach themselves to use 
tools","work_id":"ded388d7-62d9-4a3e-8260-791dcd633fda","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face","work_id":"279c79ad-d324-493d-b3cf-5b712e012f7e","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"A survey on large language model based autonomous agents","work_id":"4aa1f101-d987-4096-969f-b7c6e616bc59","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"OpenClaw: Open-source personal AI assistant","work_id":"ebda7aaa-80aa-4ee2-9315-e625f5352316","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":45,"snapshot_sha256":"dca06e5e1c698796e8eeb2ec46265e989c25c579501d82c37d095cb1d90ca3a5","internal_anchors":5},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}