pith. machine review for the scientific record.

arxiv: 2605.00938 · v1 · submitted 2026-05-01 · 💻 cs.LG · cs.AI

Recognition: unknown

Fusing Urban Structure and Semantics: A Conditional Diffusion Model for Cross-City OD Matrix Generation

Pith reviewed 2026-05-09 19:17 UTC · model grok-4.3

keywords OD matrix generation · diffusion models · graph transformers · urban mobility · cross-city generalization · commuting flows · spatial constraints

The pith

A conditional diffusion model fuses urban graph structure with node semantics to generate generalizable OD matrices across cities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes SEDAN, which models each city as an attributed graph: regions are nodes carrying demographic and point-of-interest features, and commuting flows are edges. It injects adjacency and distance information as explicit conditions while using graph transformers to capture semantic interactions among nodes inside a diffusion process. This fusion is intended to produce commuting matrices that remain accurate without city-specific retraining. Accurate matrices support better traffic planning and resource decisions in urban settings. Experiments on U.S. city data show a 7.38 percent RMSE improvement over the prior best method, WEDAN, and results that remain robust across varying urban patterns.

Core claim

SEDAN represents each city as an attributed graph and fuses semantic node attributes through graph-transformer interactions with spatial structure supplied by adjacency and distance matrices as diffusion conditions, yielding OD matrices that are both behaviorally plausible and geographically coherent across heterogeneous cities.
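
To make the attributed-graph representation concrete, here is a minimal sketch of a container for one city. It is an illustration under stated assumptions, not the data format of the SEDAN code release: the class name `CityGraph`, its field names, and the toy values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CityGraph:
    """Attributed graph for one city with n regions (illustrative container)."""
    node_features: list  # n rows of demographic and POI features
    adjacency: list      # n x n binary matrix, 1 where regions are neighbors
    distance: list       # n x n matrix of pairwise distances
    flows: list          # n x n OD matrix; flows[i][j] = commuters from i to j

    def num_regions(self):
        return len(self.node_features)

    def validate(self):
        """Check that every matrix is n x n for the n regions in node_features."""
        n = self.num_regions()
        for name in ("adjacency", "distance", "flows"):
            matrix = getattr(self, name)
            if len(matrix) != n or any(len(row) != n for row in matrix):
                raise ValueError(f"{name} must be {n} x {n}")
        return True


# Two-region toy city; every number here is made up for illustration.
toy = CityGraph(
    node_features=[[1200.0, 3.0], [800.0, 7.0]],
    adjacency=[[0, 1], [1, 0]],
    distance=[[0.0, 2.5], [2.5, 0.0]],
    flows=[[0, 40], [25, 0]],
)
```

Keeping the flows separate from the two spatial matrices mirrors the claim's split between what is generated (the OD matrix) and what is supplied as a condition (adjacency and distance).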

What carries the argument

The fusion mechanism inside the conditional diffusion model that routes semantic attributes through graph transformers while conditioning the reverse diffusion steps on adjacency weights and distance values.
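
One plausible reading of "adjacency guides attention weights" is an additive bias on attention scores before the softmax. The sketch below shows that idea for a single attention head. It is a hedged illustration only: the additive form, the `adj_bias` parameter, and the function names are assumptions, and the distance conditioning of the reverse diffusion steps is not modeled here.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def adjacency_biased_attention(queries, keys, adjacency, adj_bias=1.0):
    """Return row-stochastic attention weights between regions.

    queries, keys: one feature vector per region (lists of floats).
    adjacency: binary matrix; adjacency[i][j] == 1 if regions i and j touch.
    adj_bias: hypothetical scalar added to the score of adjacent pairs.
    """
    dim = len(queries[0])
    weights = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            # Scaled dot-product score plus a bonus for neighboring regions.
            dot = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            scores.append(dot + adj_bias * adjacency[i][j])
        weights.append(softmax(scores))
    return weights


# Toy city with three regions in a line; region 0 borders region 1 only.
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
attn = adjacency_biased_attention(features, features, adjacency)
```

Because all three toy regions share the same features, any difference between `attn[0][1]` and `attn[0][2]` comes from the adjacency bias alone.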

If this is right

  • OD matrices can be produced for new cities without collecting city-specific training data or retraining the model.
  • Generated flows respect both behavioral plausibility from node attributes and geographic coherence from distance constraints.
  • The same conditioning approach may support planning tasks that rely on consistent flow estimates across multiple urban regions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same node-attribute-plus-distance conditioning could be tested on other flow types such as freight or pedestrian movement.
  • Adding temporal snapshots of the same graphs might allow the model to forecast how flows change after infrastructure updates.
  • The approach may reduce the data burden for smaller cities that lack large historical OD surveys.

Load-bearing premise

That semantic node features plus explicit adjacency and distance conditioning inside one diffusion process capture enough of the commuting process to generalize without city-specific retraining or extra latent variables.

What would settle it

Apply the trained model to OD data from an unseen city whose commuting patterns differ markedly in topography or demographics and check whether the RMSE remains lower than the previous baseline.

Figures

Figures reproduced from arXiv: 2605.00938 by Bin Chen, Chuan Ai, Fang Yang, Jingtao Ding, Runkang Guo, Yin Zhang, Zhengqiu Zhu, Zhuoya Meng.

Figure 1. Construction of the directed weighted graph for commuting OD
Figure 2. Generation process of the commuting OD matrix
… values for every OD pair $(i, j)$. The process is described as:
$$q(F^{t}_{ij} \mid F^{t-1}_{ij}) = \mathcal{N}\left(F^{t}_{ij};\ \sqrt{1-\beta_t}\,F^{t-1}_{ij},\ \beta_t I\right), \tag{6}$$
$$q(F^{1}_{ij}, \ldots, F^{T}_{ij} \mid F^{0}_{ij}) = \prod_{t=1}^{T} q(F^{t}_{ij} \mid F^{t-1}_{ij}), \tag{7}$$
where $F^{t}_{ij}$ denotes the flow value of OD pair $(i, j)$ at time step $t$, and $\beta_t$ is the noise scale parameter that controls the intensity of noise injection. As t increas…
Figure 3. Denoising network of SEDAN. (a) Urban characteristics embedding: regional attributes and the noisy OD matrix are mapped to initial node and edge features, and the spatial priors are encoded. (b) Overall framework: regional attributes, a noisy OD matrix, and two spatial priors (adjacency and distance) are provided as inputs to the model. After the urban characteristics embedding, features pass through a stacked gra…
Figure 4. Performance across different urban structures. (a) OD flow-based classification, (b) distance-based classification, (c) population-based classification, and (d) integrated classification. Each subfigure contains two radar charts: the left summarizes accuracy for cities grouped by structure: monocentric (M), polycentric (P), and uniform (U); the right reports distributional consistency measured by JSD for inf…
Figure 5. Ablation study results. Comparison of the full SEDAN model against six variants: w/o A (without adjacency prior), w/o D (without distance prior), w/o AD (without either prior), w/o POI (without POI features), w/o Diffusion (without the diffusion process), and w/o Constraint (without spatial information fusion mechanism). Performance is reported on six metrics: (a) CPC↑, (b) RMSE↓, (c) NRMSE↓, (d) Inflow J…
Figure 6. SHAP values of the top 20 demographic features. The selected features are grouped into four dimensions: (a) age structure, (b) education, (c) transportation resources, and (d) economic and household characteristics
Figure 7. SHAP values of the top 10 POI features
Overall, SHAP analysis reveals the critical influence of demographic and POI features in the OD flow generation, indicating that model predictions are jointly driven by socioeconomic characteristics and spatial environmental factors. Population composition, transportation accessibility, and the spatial layout of functional facilities collectively play a central role i…
Figure 8. Visualization of attention weights across transformer layers. (a) Binary adjacency matrix representing the graph topology. (b) Distance-based similarity matrix derived from physical distance, where shorter distances correspond to higher similarity. (c) Attention weight heatmaps across four transformer layers. Darker colors indicate larger attention weights between origin–destination region pairs. Each atte…
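
The forward noising process quoted with Figure 2 (Eqs. 6 and 7) is the standard Gaussian chain used in denoising diffusion models, so a noisy flow at step t can be sampled directly from the clean flow via the cumulative product of (1 − β). The sketch below shows this for a single OD pair. It is illustrative only: the linear schedule, its endpoints, the number of steps, and the assumption that flows are rescaled before noising are not taken from the paper.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule beta_1..beta_T (not taken from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + step * t for t in range(num_steps)]


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) for s = 1..t (t is 1-indexed)."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def noise_flow(flow0, betas, t, rng):
    """Sample F^t given F^0 using the closed form implied by Eqs. (6)-(7)."""
    a_bar = alpha_bar(betas, t)
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(a_bar) * flow0 + math.sqrt(1.0 - a_bar) * eps


betas = linear_betas(1000)
rng = random.Random(0)
# Five draws of a fully noised flow for one OD pair with normalized value 0.5.
noisy = [noise_flow(0.5, betas, 1000, rng) for _ in range(5)]
```

At the final step the cumulative product is close to zero, so the samples carry almost no information about the starting flow, which is what lets the reverse process begin from pure noise.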
Original abstract

Accurate modeling of commuting flows is important for urban governance, traffic planning, and resource allocation. However, the combined influence of individual intentions, geographic constraints, and social dynamics leads to considerable heterogeneity in commuting patterns, making it difficult to develop generation models that generalize across cities. To address this issue, we propose SEDAN, a Structure-Enhanced Diffusion model conditioned on Attributed Nodes for generalizable OD matrix generation. SEDAN models a city as an attributed graph. Each region is treated as a node with demographic and point-of-interest features, and commuting flows are modeled as weighted edges. Adjacency and distance matrices are incorporated to characterize spatial structure. Based on this representation, we design a fusion mechanism within SEDAN to jointly model semantic information and spatial information. Regional semantic attributes are used to model latent travel demand through graph-transformer-based node interactions, while spatial structure is injected into the generation process as explicit constraints. The adjacency matrix guides attention weights to strengthen interactions between neighboring regions. Meanwhile, the distance matrix serves as a diffusion condition to capture spatial proximity and travel impedance. The fusion of urban semantics and spatial constraints enables SEDAN to generate OD matrices that are both behaviorally plausible and geographically coherent. Experiments on real-world OD datasets from U.S. cities show that SEDAN achieves a 7.38% improvement in RMSE over the state-of-the-art baseline, WEDAN. It also remains robust across heterogeneous urban scenarios and varying structural patterns. Our work provides an effective and generalizable solution for commuting OD matrix generation. The code is available at https://anonymous.4open.science/r/SEDAN.
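
For readers unfamiliar with the metrics, the sketch below computes RMSE, the Common Part of Commuters (CPC, which appears in the Figure 5 caption), and a relative error reduction of the kind behind the 7.38% figure. It uses the common textbook definitions; the paper's exact normalization is not given in this excerpt, and the toy matrices are made up.

```python
import math


def rmse(true_od, pred_od):
    """Root mean squared error over all OD pairs."""
    diffs = [(t - p) ** 2
             for row_t, row_p in zip(true_od, pred_od)
             for t, p in zip(row_t, row_p)]
    return math.sqrt(sum(diffs) / len(diffs))


def cpc(true_od, pred_od):
    """Common Part of Commuters: 2 * sum(min) / (sum true + sum pred)."""
    overlap = 0.0
    total = 0.0
    for row_t, row_p in zip(true_od, pred_od):
        for t, p in zip(row_t, row_p):
            overlap += min(t, p)
            total += t + p
    return 2.0 * overlap / total if total else 0.0


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in an error metric relative to a baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Made-up 2 x 2 OD matrices, used only to exercise the functions.
true_od = [[0, 10], [4, 0]]
pred_od = [[0, 8], [6, 0]]
error = rmse(true_od, pred_od)
overlap_score = cpc(true_od, pred_od)
```

RMSE is lower-is-better, while CPC ranges from 0 (no overlap) to 1 (identical flows), which matches the arrows in the Figure 5 caption.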

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes SEDAN, a conditional diffusion model for cross-city OD matrix generation that represents cities as attributed graphs with demographic/POI node features, models latent demand via graph transformers, and injects spatial structure through explicit adjacency-matrix-guided attention and distance-matrix conditioning in the diffusion process. It reports a 7.38% RMSE improvement over the WEDAN baseline together with robustness across heterogeneous U.S. urban scenarios.

Significance. If the generalization claims are supported by appropriate cross-city protocols, the work would offer a practically useful advance for transportation planning by reducing reliance on city-specific retraining while jointly capturing semantic demand drivers and geographic constraints; the public code release is a clear reproducibility strength.

major comments (1)
  1. Abstract: the central claim of cross-city generalization and the 7.38% RMSE gain rest on an unspecified train/test partitioning across cities; without an explicit leave-one-city-out, zero-shot, or held-out-city protocol, the reported improvement cannot be confirmed to demonstrate out-of-distribution transfer rather than within-city fitting.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and for identifying the need for greater clarity regarding our cross-city evaluation protocol. We agree that the abstract should explicitly describe the partitioning to substantiate the generalization claims.

Point-by-point responses
  1. Referee: Abstract: the central claim of cross-city generalization and the 7.38% RMSE gain rest on an unspecified train/test partitioning across cities; without an explicit leave-one-city-out, zero-shot, or held-out-city protocol, the reported improvement cannot be confirmed to demonstrate out-of-distribution transfer rather than within-city fitting.

    Authors: We acknowledge that the current abstract does not specify the train/test partitioning. In the full manuscript (Sections 4.1 and 4.2), experiments are conducted on real-world OD datasets from multiple U.S. cities, with the model trained on data from a collection of cities and evaluated on held-out cities to assess transfer. To eliminate ambiguity, we will revise the abstract to state explicitly that a leave-one-city-out protocol is used: the model is trained on all cities except one and tested on the held-out city, with results averaged across folds. We will also expand Section 4.1 to detail this protocol, confirming that the reported 7.38% RMSE improvement over WEDAN is obtained under this out-of-distribution setup rather than within-city fitting. These changes will make the generalization claim fully verifiable.
    revision: yes
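
The leave-one-city-out protocol named in the response can be sketched as a fold generator plus an averaging loop. This is a minimal illustration, assuming caller-supplied `train_fn` and `eval_fn` callables; those names, and the city identifiers, are placeholders rather than anything from the SEDAN code release.

```python
def leave_one_city_out(cities):
    """Yield (train_cities, held_out_city) pairs, one fold per city."""
    for i, held_out in enumerate(cities):
        yield cities[:i] + cities[i + 1:], held_out


def average_over_folds(cities, train_fn, eval_fn):
    """Train on each fold's cities, score the held-out city, average the scores.

    train_fn(train_cities) -> model and eval_fn(model, city) -> float are
    caller-supplied placeholders, not functions from the SEDAN code release.
    """
    scores = [eval_fn(train_fn(train), held_out)
              for train, held_out in leave_one_city_out(cities)]
    return sum(scores) / len(scores)


# Hypothetical city identifiers, used only to show the fold structure.
cities = ["city_a", "city_b", "city_c"]
folds = list(leave_one_city_out(cities))
```

The property that matters for the referee's objection is that the held-out city never appears in its own training set, so the averaged score measures transfer rather than within-city fit.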

Circularity Check

0 steps flagged

No significant circularity; architecture and empirical results are independent of inputs

Full rationale

The paper describes a new conditional diffusion model (SEDAN) that fuses graph-transformer node semantics with explicit adjacency/distance conditioning, then reports RMSE improvement on external real-world U.S. city OD datasets. No equations, derivations, or claims reduce the reported performance gain or generalization claim to a fitted parameter, self-definition, or self-citation chain. The evaluation uses real datasets and a named baseline (WEDAN), keeping the central result falsifiable and non-circular by construction. Minor self-citation risk is possible in the full text but is not load-bearing here.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The approach rests on standard assumptions of diffusion models and graph transformers plus the domain assumption that cities are adequately represented by attributed graphs with adjacency and distance matrices; no new physical entities are postulated.

free parameters (1)
  • Diffusion and graph-transformer hyperparameters
    Typical in such models but not enumerated in the abstract
axioms (2)
  • standard math: Diffusion models generate structured data by learning to reverse a forward noise process
    Core generative mechanism invoked for OD matrix synthesis
  • domain assumption: Cities can be represented as attributed graphs where node features capture travel demand and adjacency/distance matrices capture spatial constraints
    Foundational modeling choice stated in the abstract

pith-pipeline@v0.9.0 · 5614 in / 1373 out tokens · 67729 ms · 2026-05-09T19:17:53.633187+00:00 · methodology
