{"state_type":"pith_open_graph_state","state_version":"1.0","pith_number":"pith:2026:FGMIIDEUJ2UNHPZFZEFFYXBVTJ","merge_version":"pith-open-graph-merge-v1","event_count":2,"valid_event_count":2,"invalid_event_count":0,"equivocation_count":0,"current":{"canonical_record":{"metadata":{"abstract_canon_sha256":"f424f81b6896725b2cd07e3f3c675dc0ac60ae47379e05b40bb3e201ed41c12b","cross_cats_sorted":[],"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.CV","submitted_at":"2026-05-14T07:15:46Z","title_canon_sha256":"5f29947e84643c048af1daf5e431c2efd97953599b092800f79034dc76140eb5"},"schema_version":"1.0","source":{"id":"2605.14475","kind":"arxiv","version":1}},"source_aliases":[{"alias_kind":"arxiv","alias_value":"2605.14475","created_at":"2026-05-17T23:39:06Z"},{"alias_kind":"arxiv_version","alias_value":"2605.14475v1","created_at":"2026-05-17T23:39:06Z"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2605.14475","created_at":"2026-05-17T23:39:06Z"},{"alias_kind":"pith_short_12","alias_value":"FGMIIDEUJ2UN","created_at":"2026-05-18T12:33:37Z"},{"alias_kind":"pith_short_16","alias_value":"FGMIIDEUJ2UNHPZF","created_at":"2026-05-18T12:33:37Z"},{"alias_kind":"pith_short_8","alias_value":"FGMIIDEU","created_at":"2026-05-18T12:33:37Z"}],"graph_snapshots":[{"event_id":"sha256:d2d0bc69e2099b1fcafe5f12d07a467abe13cd0b1346f4b2f15770d3ff8fbb18","target":"graph","created_at":"2026-05-17T23:39:06Z","signer":{"key_id":"pith-v1-2026-05","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","signer_id":"pith.science","signer_type":"pith_registry"},"payload":{"graph_snapshot":{"author_claims":{"count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","strong_count":0},"builder_version":"pith-number-builder-2026-05-17-v1","claims":{"count":4,"items":[{"attestation":"unclaimed","claim_id":"C1","kind":"strongest_claim","source":"verdict.strongest_claim","status":"machine_extracted","text":"Experiments on RSHR-Bench, XLRS-Bench, and LRS-VQA show that GeoVista achieves state-of-the-art performance."},{"attestation":"unclaimed","claim_id":"C2","kind":"weakest_assumption","source":"verdict.weakest_assumption","status":"machine_extracted","text":"The assumption that building a global exploration plan followed by branch-wise local inspection with explicit evidence state maintenance will reliably handle sparse tiny evidence across large scenes without losing context or causing duplication, which depends on the effectiveness of the APEX-GRO trajectory corpus and GRPO alignment."},{"attestation":"unclaimed","claim_id":"C3","kind":"one_line_summary","source":"verdict.one_line_summary","status":"machine_extracted","text":"GeoVista introduces a planning-driven active perception framework with global exploration plans, branch-wise local inspection, and explicit evidence tracking to achieve state-of-the-art results on ultra-high-resolution remote sensing benchmarks."},{"attestation":"unclaimed","claim_id":"C4","kind":"headline","source":"verdict.pith_extraction.headline","status":"machine_extracted","text":"GeoVista builds a global exploration plan then performs branch-wise inspections while tracking evidence to interpret ultra-high-resolution remote sensing images."}],"snapshot_sha256":"3b34c75ca82ac71e0824dfee0d681e664e129c86ff5f7adcec8e2b298cf9745b"},"formal_canon":{"evidence_count":2,"snapshot_sha256":"ce2454ff5b095b8320453b2c5f64a18930c18de3f06d660902bc1bea73f48e4f"},"paper":{"abstract_excerpt":"Interpreting ultra-high-resolution (UHR) remote sensing images requires models to search for sparse and tiny visual evidence across large-scale scenes. Existing remote sensing vision-language models can inspect local regions with zooming and cropping tools, but most exploration strategies follow either a one-shot focus or a single sequential trajectory. Such single-path exploration can lose global context, leave scattered regions unvisited, and revisit or count the same evidence multiple times. To this end, we propose GeoVista, a planning-driven active perception framework for UHR remote sensi","authors_text":"Bo Yang, Haoran Liu, Jiasen Hu, Jiashun Zhu, Lang Sun, Nachuan Xing, Ronghao Fu, Weijie Zhang, Weipeng Zhang, Xiao Yang, Xu Na, Zhiheng Xue, Zhiwen Lin","cross_cats":[],"headline":"GeoVista builds a global exploration plan then performs branch-wise inspections while tracking evidence to interpret ultra-high-resolution remote sensing images.","license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.CV","submitted_at":"2026-05-14T07:15:46Z","title":"GeoVista: Visually Grounded Active Perception for Ultra-High-Resolution Remote Sensing Understanding"},"references":{"count":94,"internal_anchors":15,"resolved_work":94,"sample":[{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":1,"title":"Towards large-scale small object detection: Survey and benchmarks.IEEE transactions on pattern analysis and machine intelligence, 45(11):13467–13488, 2023","work_id":"56a8f76b-c3bb-4f96-b5de-91e684ceb560","year":2023},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":2,"title":"Star: A first-ever dataset and a large-scale benchmark for scene graph generation in large-size satellite imagery.IEEE Trans","work_id":"a8e1273e-f8dd-4bf9-8265-cd434c2ad827","year":2025},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":3,"title":"When large vision-language model meets large remote sensing imagery: Coarse- to-fine text-guided token pruning.ArXiv, abs/2503.07588, 2025","work_id":"11ebcf82-8db3-4c59-82f4-8e71cdf69f00","year":2025},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":4,"title":"Geoeyes: On-demand visual focusing for evidence-grounded understanding of ultra-high-resolution remote sensing imagery","work_id":"0d316c52-7eb2-4906-8b3c-d6ff2c4d5135","year":2026},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":5,"title":"GeoLLaVA-8K: Scaling remote-sensing multimodal large language models to 8K resolution","work_id":"e8b770f4-ab0f-4fa1-9b6e-f9bd0e4e8960","year":2025}],"snapshot_sha256":"b1a713a072d529affbb993cbb88539381ca1615e03cbe34a46fcd7619837daa0"},"source":{"id":"2605.14475","kind":"arxiv","version":1},"verdict":{"created_at":"2026-05-15T02:36:57.241165Z","id":"4f804091-a583-4336-b194-38c9f79821a8","model_set":{"reader":"grok-4.3"},"one_line_summary":"GeoVista introduces a planning-driven active perception framework with global exploration plans, branch-wise local inspection, and explicit evidence tracking to achieve state-of-the-art results on ultra-high-resolution remote sensing benchmarks.","pipeline_version":"pith-pipeline@v0.9.0","pith_extraction_headline":"GeoVista builds a global exploration plan then performs branch-wise inspections while tracking evidence to interpret ultra-high-resolution remote sensing images.","strongest_claim":"Experiments on RSHR-Bench, XLRS-Bench, and LRS-VQA show that GeoVista achieves state-of-the-art performance.","weakest_assumption":"The assumption that building a global exploration plan followed by branch-wise local inspection with explicit evidence state maintenance will reliably handle sparse tiny evidence across large scenes without losing context or causing duplication, which depends on the effectiveness of the APEX-GRO trajectory corpus and GRPO alignment."}},"verdict_id":"4f804091-a583-4336-b194-38c9f79821a8"}}],"author_attestations":[],"timestamp_anchors":[],"storage_attestations":[],"citation_signatures":[],"replication_records":[],"corrections":[],"mirror_hints":[],"record_created":{"event_id":"sha256:b270d6983f2999a4d99acf0403988896226ecdedd62d3352e84fbcdbe7c09002","target":"record","created_at":"2026-05-17T23:39:06Z","signer":{"key_id":"pith-v1-2026-05","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","signer_id":"pith.science","signer_type":"pith_registry"},"payload":{"attestation_state":"computed","canonical_record":{"metadata":{"abstract_canon_sha256":"f424f81b6896725b2cd07e3f3c675dc0ac60ae47379e05b40bb3e201ed41c12b","cross_cats_sorted":[],"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.CV","submitted_at":"2026-05-14T07:15:46Z","title_canon_sha256":"5f29947e84643c048af1daf5e431c2efd97953599b092800f79034dc76140eb5"},"schema_version":"1.0","source":{"id":"2605.14475","kind":"arxiv","version":1}},"canonical_sha256":"2998840c944ea8d3bf25c90a5c5c359a4e35b4ba42001f216cd9a39f9611bedd","receipt":{"algorithm":"ed25519","builder_version":"pith-number-builder-2026-05-17-v1","canonical_sha256":"2998840c944ea8d3bf25c90a5c5c359a4e35b4ba42001f216cd9a39f9611bedd","first_computed_at":"2026-05-17T23:39:06.613503Z","key_id":"pith-v1-2026-05","kind":"pith_receipt","last_reissued_at":"2026-05-17T23:39:06.613503Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","receipt_version":"0.3","signature_b64":"ipgRcStzDrM0cUJqB1I0tldbjqvEdzjSYdG6smsnLf2CMmRqqX6ki5xmhhsnZHa32mRwxdwmuyWkB0A8dd3SCA==","signature_status":"signed_v1","signed_at":"2026-05-17T23:39:06.614219Z","signed_message":"canonical_sha256_bytes"},"source_id":"2605.14475","source_kind":"arxiv","source_version":1}}},"equivocations":[],"invalid_events":[],"applied_event_ids":["sha256:b270d6983f2999a4d99acf0403988896226ecdedd62d3352e84fbcdbe7c09002","sha256:d2d0bc69e2099b1fcafe5f12d07a467abe13cd0b1346f4b2f15770d3ff8fbb18"],"state_sha256":"0fbb9995981f71210abceea15a5132a98a29c70373c34ea021981dbee51f421e"}