
arxiv: 2606.00733 · v1 · pith:6WTWPHYBnew · submitted 2026-05-30 · 💻 cs.CE

Higher-order Network Analysis of Human Mobility Data

Pith reviewed 2026-06-28 17:59 UTC · model grok-4.3

classification 💻 cs.CE
keywords human mobility · higher-order networks · synthetic data · mobility simulation · path analysis · infrastructure networks · data validation · Ile-de-France

The pith

Higher-order networks show simulated human mobility has key path-based limitations despite overall promise.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a framework that represents individual movement sequences as higher-order networks on the underlying road or transit infrastructure. It tests this approach by comparing real mobility traces from thousands of people in the Île-de-France region against a detailed open-source synthetic population model of the same area. The comparison finds that the simulated traces reproduce many broad patterns yet miss important sequential dependencies that appear in the observed paths. This matters because privacy and cost barriers make large real datasets scarce, so synthetic substitutes are valuable provided their shortcomings can be identified and fixed. The path-focused lens surfaces differences that simpler origin-destination checks would overlook.

Core claim

A higher-order network representation of paths through the infrastructure network, when applied to the NetMob 2025 Île-de-France dataset and a matching synthetic mobility model, establishes that simulated data serves as a promising surrogate for observed mobility while still exhibiting key limitations from a path-based perspective.

What carries the argument

Higher-order network representation of paths individuals take through the infrastructure network, which encodes sequential movement patterns for direct comparison between observed and simulated traces.
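
As a rough illustration of what such a representation can look like, the sketch below builds an order-k network from a list of paths. Each higher-order node is a tuple of k consecutive infrastructure nodes, and each edge counts how often one tuple is followed by the next. This is a generic De Bruijn-style construction and not necessarily the paper's exact procedure; the function name `higher_order_edges` and the toy paths are hypothetical.

```python
from collections import Counter


def higher_order_edges(paths, k):
    """Count edges of an order-k network built from node sequences.

    A higher-order node is a tuple of k consecutive nodes. An edge links
    two such tuples that overlap in k-1 nodes, which corresponds to a
    window of k+1 consecutive nodes in a path.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    edges = Counter()
    for path in paths:
        # Slide a window of k+1 nodes along the path.
        for i in range(len(path) - k):
            source = tuple(path[i:i + k])
            target = tuple(path[i + 1:i + k + 1])
            edges[(source, target)] += 1
    return edges


# Hypothetical toy paths over intersections A..E (not data from the paper).
toy_paths = [["A", "C", "D"], ["B", "C", "E"], ["A", "C", "D"]]
first_order = higher_order_edges(toy_paths, 1)
second_order = higher_order_edges(toy_paths, 2)
```

With k=1 this reduces to ordinary edge counts on the infrastructure network. With k=2 the edge `(("A", "C"), ("C", "D"))` records that travelers arriving at C from A continued to D, which is the kind of sequential dependency a first-order network cannot express.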

If this is right

  • Simulated mobility traces can substitute for observed data in many aggregate analyses.
  • Simulation models require targeted improvements to better reproduce higher-order path sequences.
  • Remediations such as adjusted route-choice rules or added path constraints could reduce the identified gaps.
  • Higher-order network methods open new validation challenges that standard trip-based checks do not address.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Extending the same path comparison to other cities would test whether the simulation shortcomings are model-specific or general.
  • Urban planning tools that rely on individual movement sequences could gain accuracy by incorporating higher-order validation steps.
  • Privacy-preserving data releases might publish only aggregate higher-order network statistics rather than full traces.

Load-bearing premise

The higher-order network view of paths captures the mobility patterns that matter for deciding whether synthetic data is realistic enough to replace real traces.

What would settle it

Finding that an alternative representation such as origin-destination matrices or first-order networks shows no meaningful differences between the observed NetMob data and the synthetic model would undermine the reported path-based limitations.
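
To make this check concrete, the sketch below computes origin-destination counts, first-order edge counts, and second-order transition counts for two toy path ensembles. The ensembles are constructed (hypothetically, not from the paper) so that the two lower-order summaries coincide and only the second-order counts differ. Whether the NetMob and synthetic data behave this way is what the proposed comparison would have to establish.

```python
from collections import Counter


def od_counts(paths):
    """Count (origin, destination) pairs, ignoring the route taken."""
    return Counter((p[0], p[-1]) for p in paths if p)


def window_counts(paths, size):
    """Count every run of `size` consecutive nodes across all paths.

    size=2 gives first-order edge counts; size=3 gives the transitions
    of a second-order network (pairs of consecutive edges).
    """
    counts = Counter()
    for p in paths:
        for i in range(len(p) - size + 1):
            counts[tuple(p[i:i + size])] += 1
    return counts


# Hypothetical ensembles: both route S to T through a shared node M.
observed = [list("SAMBT"), list("SCMDT")]
synthetic = [list("SAMDT"), list("SCMBT")]

same_od = od_counts(observed) == od_counts(synthetic)
same_first_order = window_counts(observed, 2) == window_counts(synthetic, 2)
same_second_order = window_counts(observed, 3) == window_counts(synthetic, 3)
```

In this toy case the observed travelers who enter M from A always leave toward B, while the synthetic ones leave toward D. Every individual edge is used equally often in both ensembles, so only the order-2 view separates them.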

Original abstract

The detailed study of individual human mobility requires large-scale high-resolution datasets, but collecting such datasets in a way that is both statistically powerful and privacy preserving is a challenging and expensive task. In response, researchers have built tools to generate complex synthetic populations of agents that can be used to simulate synthetic individual mobility data, potentially obviating the difficulties of data collection. While these simulation-based approaches offer a promising avenue for expanding individual mobility research, it is difficult to assess whether such tools are effective at generating realistic mobility traces. In this work, we develop a framework for comparing observed and simulated mobility data using a higher-order network framework that focuses on analyzing patterns of movement in the paths individuals take through the underlying infrastructure network. We apply our framework to a case study comparing the NetMob 2025 Data Challenge Dataset, which includes individual mobility data for thousands of residents of the \^Ile-de-France region, with a sophisticated open-source synthetic population and mobility simulation model of the same region. We show that while simulated mobility data is indeed promising as a surrogate for observed mobility, there are some key limitations to the simulation paradigm from a path-based perspective, which we discuss along with potential future remediations and open challenges for higher-order mobility network analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript develops a higher-order network framework for comparing observed mobility paths (from the NetMob 2025 dataset for Île-de-France) against those generated by a synthetic population and mobility simulation model. It concludes that synthetic data is a promising surrogate but exhibits key limitations when mobility is analyzed from a path-based perspective through the infrastructure network.

Significance. If the central findings hold after addressing the issues below, the work would provide a useful validation lens for synthetic mobility models, helping to identify path-level discrepancies that standard aggregate statistics might miss and thereby supporting more reliable use of simulations in place of privacy-sensitive real data.

major comments (2)
  1. [Methods] Methods section: The framework is applied directly to the NetMob 2025 and synthetic datasets, but the manuscript provides no description of how the network order is chosen and no ablation or comparison against first-order networks, origin-destination matrices, or sequence entropy measures. This is load-bearing for the claim that the higher-order representation reveals simulation limitations that matter, because the skeptic's concern (that alternative path representations might yield no limitations or different ones) is not addressed.
  2. [Results] Results section: The abstract and results claim 'key limitations' without reporting quantitative error bars, confidence intervals, or statistical tests on the magnitude of differences in path statistics between observed and simulated data. This prevents assessment of whether the deviations are practically significant or merely detectable.
minor comments (1)
  1. [Abstract] Abstract: The region name appears as '^Ile-de-France' rather than 'Île-de-France'; this is a minor encoding or typesetting issue.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive review. The comments identify important gaps in justification and quantification that we will address in revision to strengthen the manuscript's claims about higher-order network analysis of mobility data.

Point-by-point responses
  1. Referee: [Methods] Methods section: The framework is applied directly to the NetMob 2025 and synthetic datasets, but the manuscript provides no description of how the network order is chosen and no ablation or comparison against first-order networks, origin-destination matrices, or sequence entropy measures. This is load-bearing for the claim that the higher-order representation reveals simulation limitations that matter, because the skeptic's concern (that alternative path representations might yield no limitations or different ones) is not addressed.

    Authors: We agree this justification is necessary. The network order (k=2) was selected to capture sequential path dependencies in the infrastructure network that first-order models miss, following established higher-order network methods for mobility. In the revised manuscript we will add an explicit Methods subsection describing the order selection criterion, including a brief sensitivity check across k=1 to k=3 on a subsample. We will also include a direct comparison of the same path statistics under first-order networks to demonstrate that the observed discrepancies between real and synthetic data are attenuated or absent at order 1. Full ablations against OD matrices and sequence entropy are orthogonal to the path-network focus and would expand scope substantially; we will instead add a short discussion noting these as complementary lenses without performing the full analysis. revision: partial

  2. Referee: [Results] Results section: The abstract and results claim 'key limitations' without reporting quantitative error bars, confidence intervals, or statistical tests on the magnitude of differences in path statistics between observed and simulated data. This prevents assessment of whether the deviations are practically significant or merely detectable.

    Authors: We concur that quantitative assessment of effect size is required. The current results rely on visual and descriptive comparison of path distributions. In revision we will add bootstrapped confidence intervals (or standard errors) around the reported path statistics and apply distribution-comparison tests (e.g., two-sample Kolmogorov-Smirnov) between the observed and synthetic path ensembles. These additions will be placed in the Results section and referenced in the abstract to allow readers to judge practical significance. revision: yes
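
The two additions promised in this response can be sketched with the standard library alone. The block below computes a two-sample Kolmogorov-Smirnov statistic and a percentile bootstrap confidence interval for a path statistic. It is a minimal sketch under stated assumptions: the helper names, the choice of mean path length as the statistic, and the sample values are all hypothetical and do not come from the paper. In practice a library routine such as `scipy.stats.ks_2samp` would also supply a p-value.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def bootstrap_ci(sample, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# Hypothetical path lengths in road segments; not data from the paper.
observed_lengths = [3, 4, 4, 5, 6, 6, 7, 9, 10, 12]
synthetic_lengths = [3, 3, 4, 4, 4, 5, 5, 6, 6, 7]

d = ks_statistic(observed_lengths, synthetic_lengths)
ci_low, ci_high = bootstrap_ci(observed_lengths, mean)
```

The KS statistic speaks to whether the two distributions are detectably different, while the width of the bootstrap interval helps judge whether a gap in a summary statistic is large relative to sampling noise, which is the distinction the referee draws between detectable and practically significant deviations.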

Circularity Check

0 steps flagged

No circularity: framework applied to independent external datasets

full rationale

The paper develops a higher-order network framework for path-based comparison of observed (NetMob 2025) and simulated mobility data, then reports limitations from that direct application. No equations, fitted parameters, self-definitions, or self-citations are described that would force the reported limitations to reduce to the input data or framework by construction. The analysis rests on external benchmarks rather than renaming or re-deriving its own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The framework rests on the assumption that higher-order networks of paths are the appropriate representation for mobility validation; no free parameters, axioms, or invented entities are mentioned in the abstract.

pith-pipeline@v0.9.1-grok · 5746 in / 1163 out tokens · 18642 ms · 2026-06-28T17:59:04.273282+00:00 · methodology


Acknowledgements. This work has been supported by a grant from the Fund for Energy Research with Corporate Partners administered by the Andlinger Center for Energy and the...