{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:5OGHKE3FWDZ36CCERS7LUMJ3QE","short_pith_number":"pith:5OGHKE3F","schema_version":"1.0","canonical_sha256":"eb8c751365b0f3bf08448cbeba313b8104c9ef7a5ea9c4478e481874941c871d","source":{"kind":"arxiv","id":"2604.02029","version":2},"attestation_state":"computed","paper":{"title":"The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Chen Gao, Chenglin Wu, Chengming Xu, Cheng Tan, Cheng Yang, Guanting Dong, Guibin Zhang, Haojie Huang, Huacan Wang, Jiale Tao, Jiangning Zhang, Jiayi Zhang, Jie Xu, Kaituo Feng, Kelu Yao, Kun Wang, Ronghao Chen, Ruqi Huang, Shuicheng Yan, Siyuan Ma, Tao Jin, Tianyu Fu, Wenqi Ren, Xiangyu Yue, Xiaobin Hu, Xiaogang Xu, Xinlei Yu, Yanwei Fu, Yongbo He, Yong Liu, Youxing Li, Yue Liao, Yue Ma, Yu-Gang Jiang, Yu Wang, Zhangquan Chen, Zhe Cao, Zhucun Xue, Zikun Su","submitted_at":"2026-04-02T13:36:37Z","abstract_excerpt":"Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of explicit-space computation, including linguistic redundancy, discretization bottlenecks, sequential inefficiency, and semantic loss. This survey aims to provide a unified and up-to-date landscape of latent"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"2604.02029","kind":"arxiv","version":2},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.AI","submitted_at":"2026-04-02T13:36:37Z","cross_cats_sorted":[],"title_canon_sha256":"5bba0f069211d35825ca55bfb1a33ee6790a6b3ce34128b690a5527251e2f9ed","abstract_canon_sha256":"5e1d89daf68d52abfda5908a70123d08540025d2e45eb5659dda0c56a7b318a2"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-06-08T01:04:03.457198Z","signature_b64":"aukAmh7/q2aBqTE3celD5mcnOANRV65ZzfQNotXriOQT9cztvlv1kNnROsN1YIQAsDIYMUDqB7MJGAtT6OmDDw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"eb8c751365b0f3bf08448cbeba313b8104c9ef7a5ea9c4478e481874941c871d","last_reissued_at":"2026-06-08T01:04:03.456347Z","signature_status":"signed_v1","first_computed_at":"2026-06-08T01:04:03.456347Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Chen Gao, Chenglin Wu, Chengming Xu, Cheng Tan, Cheng Yang, Guanting Dong, Guibin Zhang, Haojie Huang, Huacan Wang, Jiale Tao, Jiangning Zhang, Jiayi Zhang, Jie Xu, Kaituo Feng, Kelu Yao, Kun Wang, Ronghao Chen, Ruqi Huang, Shuicheng Yan, Siyuan Ma, Tao Jin, Tianyu Fu, Wenqi Ren, Xiangyu Yue, Xiaobin Hu, Xiaogang Xu, Xinlei Yu, Yanwei Fu, Yongbo He, Yong Liu, Youxing Li, Yue Liao, Yue Ma, Yu-Gang Jiang, Yu Wang, Zhangquan Chen, Zhe Cao, Zhucun Xue, Zikun Su","submitted_at":"2026-04-02T13:36:37Z","abstract_excerpt":"Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of explicit-space computation, including linguistic redundancy, discretization bottlenecks, sequential inefficiency, and semantic loss. This survey aims to provide a unified and up-to-date landscape of latent"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"2604.02029","kind":"arxiv","version":2},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.02029/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2604.02029","created_at":"2026-06-08T01:04:03.456465+00:00"},{"alias_kind":"arxiv_version","alias_value":"2604.02029v2","created_at":"2026-06-08T01:04:03.456465+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2604.02029","created_at":"2026-06-08T01:04:03.456465+00:00"},{"alias_kind":"pith_short_12","alias_value":"5OGHKE3FWDZ3","created_at":"2026-06-08T01:04:03.456465+00:00"},{"alias_kind":"pith_short_16","alias_value":"5OGHKE3FWDZ36CCE","created_at":"2026-06-08T01:04:03.456465+00:00"},{"alias_kind":"pith_short_8","alias_value":"5OGHKE3F","created_at":"2026-06-08T01:04:03.456465+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":18,"internal_anchor_count":18,"sample":[{"citing_arxiv_id":"2605.05997","citing_title":"4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding","ref_index":22,"is_internal_anchor":true},{"citing_arxiv_id":"2605.22051","citing_title":"EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation","ref_index":69,"is_internal_anchor":true},{"citing_arxiv_id":"2605.17766","citing_title":"LatentUMM: Dual Latent Alignment for Unified Multimodal Models","ref_index":55,"is_internal_anchor":true},{"citing_arxiv_id":"2605.20075","citing_title":"CopT: Contrastive On-Policy Thinking with Continuous Spaces for General and Agentic Reasoning","ref_index":40,"is_internal_anchor":true},{"citing_arxiv_id":"2605.11856","citing_title":"UniVLR: Unifying Text and Vision in Visual Latent Reasoning for Multimodal LLMs","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.10500","citing_title":"Visual Enhanced Depth Scaling for Multimodal Latent Reasoning","ref_index":75,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12497","citing_title":"From Web to Pixels: Bringing Agentic Search into Visual Perception","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2604.10500","citing_title":"Visual Enhanced Depth Scaling for Multimodal Latent Reasoning","ref_index":75,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24661","citing_title":"Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations","ref_index":68,"is_internal_anchor":true},{"citing_arxiv_id":"2604.23318","citing_title":"Hidden States Know Where Reasoning Diverges: Credit Assignment via Span-Level Wasserstein Distance","ref_index":30,"is_internal_anchor":true},{"citing_arxiv_id":"2605.06285","citing_title":"LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG","ref_index":53,"is_internal_anchor":true},{"citing_arxiv_id":"2605.05997","citing_title":"4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2604.17866","citing_title":"Latent Abstraction for Retrieval-Augmented Generation","ref_index":49,"is_internal_anchor":true},{"citing_arxiv_id":"2604.10500","citing_title":"Visual Enhanced Depth Scaling for Multimodal Latent Reasoning","ref_index":75,"is_internal_anchor":true},{"citing_arxiv_id":"2605.07106","citing_title":"Retrieve, Integrate, and Synthesize: Spatial-Semantic Grounded Latent Visual Reasoning","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2605.07315","citing_title":"LaTER: Efficient Test-Time Reasoning via Latent Exploration and Explicit Verification","ref_index":9,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24661","citing_title":"Agent-Centric Observation Adaptation for Robust Visual Control under Dynamic Perturbations","ref_index":68,"is_internal_anchor":true},{"citing_arxiv_id":"2604.17503","citing_title":"SkillGraph: Self-Evolving Multi-Agent Collaboration with Multimodal Graph Topology","ref_index":44,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE","json":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE.json","graph_json":"https://pith.science/api/pith-number/5OGHKE3FWDZ36CCERS7LUMJ3QE/graph.json","events_json":"https://pith.science/api/pith-number/5OGHKE3FWDZ36CCERS7LUMJ3QE/events.json","paper":"https://pith.science/paper/5OGHKE3F"},"agent_actions":{"view_html":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE","download_json":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE.json","view_paper":"https://pith.science/paper/5OGHKE3F","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2604.02029&json=true","fetch_graph":"https://pith.science/api/pith-number/5OGHKE3FWDZ36CCERS7LUMJ3QE/graph.json","fetch_events":"https://pith.science/api/pith-number/5OGHKE3FWDZ36CCERS7LUMJ3QE/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE/action/timestamp_anchor","attest_storage":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE/action/storage_attestation","attest_author":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE/action/author_attestation","sign_citation":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE/action/citation_signature","submit_replication":"https://pith.science/pith/5OGHKE3FWDZ36CCERS7LUMJ3QE/action/replication_record"}},"created_at":"2026-06-08T01:04:03.456465+00:00","updated_at":"2026-06-08T01:04:03.456465+00:00"}