Recognition: no theorem link
ActivityEditor: Learning to Synthesize Physically Valid Human Mobility
Pith reviewed 2026-05-10 18:38 UTC · model grok-4.3
The pith
A dual-LLM-agent system learns to revise activity chains so that generated human trajectories obey physical constraints in any new city.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ActivityEditor decomposes trajectory synthesis into an intention-based agent that produces demographic-driven coarse activity chains and an editor agent that iteratively revises them through reinforcement learning with multiple rewards grounded in real-world physical constraints, thereby achieving zero-shot cross-regional generation while preserving high statistical fidelity and physical validity.
What carries the argument
The editor agent, which acquires the capacity to enforce human mobility regularities through reinforcement-learning rewards derived from physical constraints and then applies iterative revisions to the coarse activity chains.
Load-bearing premise
Rewards based on physical constraints are enough to make the editor agent internalize mobility regularities without introducing artifacts or breaking socio-semantic coherence.
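The premise concerns a reward built purely from physical terms. As a concrete illustration, here is a minimal sketch of what such a multi-term reward could look like; the term definitions, the 120 km/h speed cap, and the weights are illustrative assumptions, not the paper's actual design (the reward coefficients are the framework's declared free parameters).

```python
def speed_reward(dist_km, minutes, max_kmh=120.0):
    """1.0 if the implied travel speed is physically plausible, else 0.0."""
    if minutes <= 0:
        return 0.0
    return 1.0 if dist_km / (minutes / 60.0) <= max_kmh else 0.0

def coverage_reward(activities):
    """Fraction of the day's 1440 minutes covered by exactly one activity."""
    covered = [0] * 1440
    for start_min, end_min in activities:
        for t in range(max(start_min, 0), min(end_min, 1440)):
            covered[t] += 1
    return sum(1 for c in covered if c == 1) / 1440

def total_reward(activities, travel_legs, w_speed=0.5, w_cov=0.5):
    """Weighted sum of constraint terms; the weights are free parameters."""
    if travel_legs:
        r_speed = sum(speed_reward(d, m) for d, m in travel_legs) / len(travel_legs)
    else:
        r_speed = 1.0
    return w_speed * r_speed + w_cov * coverage_reward(activities)

# A plausible day: home 00:00-08:00, work 08:30-18:00, home 18:30-24:00,
# with two 30-minute, 10 km commute legs in between.
acts = [(0, 480), (510, 1080), (1110, 1440)]
legs = [(10.0, 30), (10.0, 30)]
print(total_reward(acts, legs))
```

The skeptical question is whether maximizing such purely physical terms can also reproduce region-varying regularities (duration distributions, trip chaining), or merely makes schedules feasible.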
What would settle it
In a city never seen during training, generate trajectories with ActivityEditor and measure whether they exhibit statistically lower physical-validity scores or poorer distributional match to real mobility patterns than a simple baseline that ignores physical rewards.
Figures
Original abstract
Human mobility modeling is indispensable for diverse urban applications. However, existing data-driven methods often suffer from data scarcity, limiting their applicability in regions where historical trajectories are unavailable or restricted. To bridge this gap, we propose ActivityEditor, a novel dual-LLM-agent framework designed for zero-shot cross-regional trajectory generation. Our framework decomposes the complex synthesis task into two collaborative stages. Specifically, an intention-based agent, which leverages demographic-driven priors to generate structured human intentions and coarse activity chains to ensure high-level socio-semantic coherence. These outputs are then refined by editor agent to obtain mobility trajectories through iteratively revisions that enforces human mobility law. This capability is acquired through reinforcement learning with multiple rewards grounded in real-world physical constraints, allowing the agent to internalize mobility regularities and ensure high-fidelity trajectory generation. Extensive experiments demonstrate that ActivityEditor achieves superior zero-shot performance when transferred across diverse urban contexts. It maintains high statistical fidelity and physical validity, providing a robust and highly generalizable solution for mobility simulation in data-scarce scenarios. Our code is available at: https://anonymous.4open.science/r/ActivityEditor-066B.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ActivityEditor, a dual-LLM-agent framework for zero-shot cross-regional human mobility trajectory synthesis. An intention agent uses demographic priors to produce socio-semantically coherent activity chains; an editor agent then iteratively revises these trajectories via reinforcement learning whose rewards are grounded in real-world physical constraints (speed, distance, etc.) so that the policy internalizes mobility regularities. The authors claim that the resulting trajectories exhibit superior zero-shot transfer performance across diverse urban contexts while preserving high statistical fidelity and physical validity, with code released at an anonymous repository.
Significance. If the empirical claims are substantiated, the work would offer a practical route to mobility simulation in data-scarce regions by combining LLM priors with physically grounded RL, reducing reliance on region-specific trajectory datasets. The public code release is a clear strength that supports reproducibility and follow-up work.
major comments (3)
- Abstract and §4 (Experiments): the abstract asserts 'superior zero-shot performance' and 'high statistical fidelity and physical validity' yet reports no quantitative metrics, no baseline comparisons, and no tables or figures summarizing results. Without these, the central claim that ActivityEditor outperforms existing methods cannot be evaluated.
- §3.2 (Editor Agent) and reward design: the framework states that RL rewards grounded only in physical constraints allow the editor to 'internalize human mobility law,' but no derivation is given showing how terms such as speed and distance suffice to reproduce region-varying regularities (activity-duration distributions, trip-chaining dependencies, demographic sequencing). The skeptic concern that physical constraints alone may introduce artifacts or fail to preserve socio-semantic coherence from the intention agent therefore remains unaddressed.
- §4 (Evaluation protocol): the zero-shot transfer claim requires explicit description of how statistical fidelity is measured (e.g., which distributions are compared), how physical validity is quantified, and which baselines (data-driven or LLM-only) are used. Absence of these details makes it impossible to assess whether the reported superiority is robust or merely an artifact of the chosen metrics.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to improve clarity, completeness, and substantiation of our claims.
Point-by-point responses
Referee [1]: Abstract and §4 (Experiments): the abstract asserts 'superior zero-shot performance' and 'high statistical fidelity and physical validity' yet reports no quantitative metrics, no baseline comparisons, and no tables or figures summarizing results. Without these, the central claim that ActivityEditor outperforms existing methods cannot be evaluated.
Authors: We agree that the abstract would benefit from including key quantitative results to immediately support the claims. The experiments in §4 include comparisons against baselines with tables and figures, but these are not summarized in the abstract. We will revise the abstract to report the main performance metrics (e.g., improvements in statistical fidelity and physical validity scores) and add a concise results overview at the start of §4. revision: partial
Referee [2]: §3.2 (Editor Agent) and reward design: the framework states that RL rewards grounded only in physical constraints allow the editor to 'internalize human mobility law,' but no derivation is given showing how terms such as speed and distance suffice to reproduce region-varying regularities (activity-duration distributions, trip-chaining dependencies, demographic sequencing). The skeptic concern that physical constraints alone may introduce artifacts or fail to preserve socio-semantic coherence from the intention agent therefore remains unaddressed.
Authors: We appreciate this concern regarding the sufficiency of physical rewards. The design relies on RL allowing the editor to discover mobility regularities through constraint satisfaction, while the intention agent provides the socio-semantic starting point. To address potential artifacts or coherence loss, we will add an explanation of the reward terms in §3.2, including why physical constraints help capture higher-order patterns, and include empirical checks showing preservation of activity types and demographics from the intention agent. revision: partial
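The promised empirical check of socio-semantic preservation could be as simple as comparing activity-type multisets before and after editing. A minimal sketch under that assumption; the function name and the schedules are hypothetical, not the authors' protocol:

```python
from collections import Counter

def type_preservation(intention_chain, edited_chain):
    """Fraction of intention-agent activity instances still present after editing."""
    before = Counter(intention_chain)
    after = Counter(edited_chain)
    kept = sum(min(before[a], after[a]) for a in before)
    return kept / sum(before.values())

chain = ["home", "work", "lunch", "work", "gym", "home"]
edited = ["home", "work", "lunch", "work", "home"]  # one activity dropped by an edit
print(type_preservation(chain, edited))
```

A score well below 1.0 on real runs would indicate that physically motivated edits are eroding the intention agent's semantic content.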
Referee [3]: §4 (Evaluation protocol): the zero-shot transfer claim requires explicit description of how statistical fidelity is measured (e.g., which distributions are compared), how physical validity is quantified, and which baselines (data-driven or LLM-only) are used. Absence of these details makes it impossible to assess whether the reported superiority is robust or merely an artifact of the chosen metrics.
Authors: We agree that the evaluation protocol needs to be stated more explicitly. We will expand §4 with a dedicated subsection detailing the statistical fidelity metrics (e.g., distribution comparisons for activity durations, trip lengths, and sequencing), physical validity quantification (constraint violation rates and feasibility checks), the full list of baselines (including data-driven and LLM-only variants), and the precise zero-shot cross-regional protocol. This will make the superiority claims easier to evaluate. revision: yes
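For concreteness, a hedged sketch of the kind of metrics such a subsection might specify: Jensen-Shannon divergence between binned trip-length distributions for statistical fidelity, and a speed-cap violation rate for physical validity. The bin edges and the 120 km/h cap are assumed values, not taken from the paper.

```python
import math
from collections import Counter

def jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def trip_length_dist(trips_km, edges=(0, 1, 2, 5, 10, 20, 50, float("inf"))):
    """Histogram of trip lengths, normalized to a probability vector."""
    counts = Counter()
    for d in trips_km:
        for i in range(len(edges) - 1):
            if edges[i] <= d < edges[i + 1]:
                counts[i] += 1
                break
    n = sum(counts.values())
    return [counts[i] / n for i in range(len(edges) - 1)]

def violation_rate(legs, max_kmh=120.0):
    """Share of trips whose implied speed exceeds the physical cap."""
    bad = sum(1 for d, minutes in legs
              if minutes <= 0 or d / (minutes / 60) > max_kmh)
    return bad / len(legs)

real = trip_length_dist([0.5, 1.5, 3.0, 7.0, 12.0, 2.2, 4.1, 8.0])
synth = trip_length_dist([0.4, 1.8, 2.9, 6.5, 25.0, 2.0, 4.4, 9.1])
print(jsd(real, synth), violation_rate([(10.0, 30), (300.0, 60)]))
```

Reporting both numbers against data-driven and LLM-only baselines, as the authors promise, is exactly what would make the superiority claim checkable.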
Circularity Check
No significant circularity: RL training on external physical constraints is independent of target outputs
Full rationale
The paper's core derivation decomposes synthesis into an intention agent (demographic priors for chains) followed by an editor agent trained via RL whose rewards are defined from real-world physical constraints (speed, distance, etc.). This training process is presented as enabling the agent to internalize regularities, with zero-shot transfer evaluated on held-out urban contexts. No equations, fitted parameters, or self-citations are shown that would make the learned policy equivalent to the evaluation statistics by construction. The mechanism relies on external constraint definitions rather than re-expressing target data patterns, rendering the chain self-contained.
Axiom & Free-Parameter Ledger
free parameters (1)
- RL reward coefficients
axioms (2)
- domain assumption: Large language models can reliably translate demographic priors into socio-semantically coherent human intentions and activity chains.
- domain assumption: Iterative LLM revisions guided by RL can enforce real-world mobility laws without external simulation engines.
invented entities (1)
- ActivityEditor dual-LLM-agent framework (no independent evidence)
Editor-agent prompt excerpts (paper appendix)
The system prompt tells the editor agent "You understand human behavior patterns and generate realistic daily schedules" and instructs it to:
- Check the INITIAL SCHEDULE against 5 constraint types: Physical, Logical, Common Sense, Temporal, and Coherence
- Identify any violations in the draft
- Apply edit operations from the action space: ADD / DELETE / SHIFT / REPLACE / SPLIT
- Output the final constraint-satisfying schedule as [THOUGHT] constraint checking and applied edits [/THOUGHT] followed by [JSON] the refined schedule as a JSON array [/JSON]
Constraint checklist:
- Physical (Hard): no overlaps, full 24 h coverage, ends at 24:00
- Logical (Hard): starts/ends at home, starts at 00:00, consecutive identical activities merged
- Common Sense (Soft): activities match the socio-demographic profile (age- and employment-appropriate)
- Temporal (Soft): realistic durations for each activity type
- Coherence (Soft): logical transitions, not over-fragmented
Edit operations, each with a stated reason:
- ADD: activity 'Y' at time HH:MM
- DELETE: activity 'X' at index N
- SHIFT: activity 'Z' start/end time adjustment
- REPLACE: activity 'A' -> 'B'
- SPLIT: activity 'X' divided into two segments
A training-time variant instead asks for "Edits to Match Ground Truth", outputting "No edits needed" when the draft already matches.
Figure 5: Activity duration by state
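The check-then-edit loop visible in these prompt excerpts can be sketched in ordinary code. This rule-based version only illustrates the contract; the schedule representation and repair choices below are assumptions, and the actual agent performs the checks and edits via LLM reasoning rather than rules.

```python
def check_hard(schedule):
    """schedule: list of (activity, start_min, end_min), sorted by start time.
    Returns the hard-constraint violations named in the prompt."""
    violations = []
    if not schedule or schedule[0][0] != "home" or schedule[0][1] != 0:
        violations.append("must start at home at 00:00")
    if not schedule or schedule[-1][0] != "home" or schedule[-1][2] != 1440:
        violations.append("must end at home at 24:00")
    for (a1, _, e1), (a2, s2, _) in zip(schedule, schedule[1:]):
        if s2 < e1:
            violations.append(f"overlap before {a2}")
        if s2 > e1:
            violations.append(f"gap before {a2}")
        if a1 == a2:
            violations.append(f"merge consecutive {a1}")
    return violations

def apply_edits(schedule, edits):
    """edits: list of ('DELETE', idx) or ('SHIFT', idx, new_start, new_end)."""
    out = list(schedule)
    for op in sorted(edits, key=lambda e: -e[1]):  # apply back-to-front
        if op[0] == "DELETE":
            del out[op[1]]
        elif op[0] == "SHIFT":
            act, _, _ = out[op[1]]
            out[op[1]] = (act, op[2], op[3])
    return out

draft = [("home", 0, 470), ("work", 480, 1080), ("home", 1110, 1440)]
print(check_hard(draft))  # two gap violations
fixed = apply_edits(draft, [("SHIFT", 0, 0, 480), ("SHIFT", 2, 1080, 1440)])
print(check_hard(fixed))  # []
```

A fully faithful checker would also cover the soft constraints (profile match, durations, coherence), which is precisely where rule-based checks end and the learned reward model begins.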
discussion (0)