Recognition: no theorem link
TraXion: Rethinking Pre-training Frameworks for Mobility and Beyond
Pith reviewed 2026-05-11 00:55 UTC · model grok-4.3
The pith
TraXion is a pre-training framework built to satisfy three axioms for multi-entity spatiotemporal event streams, allowing one checkpoint per dataset to outperform task-specific models on mobility tasks and transfer directly to security and healthcare event streams.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TraXion is a pre-training framework whose objectives and architecture are jointly designed to satisfy three axioms derived from the properties of MESES data: tuple-valued events whose meaning depends on the joint distribution over location, time, and activity; persistent user signatures across trajectories; and non-independence across users due to co-location at shared places. A single TraXion checkpoint per dataset beats task-specific baselines on every task across six public mobility datasets covering anomaly detection, next-POI recommendation, next-visit prediction, and social-link prediction. The same recipe, applied unchanged to enterprise authentication logs and ICU mortality prediction, matches or exceeds prior work on both.
What carries the argument
TraXion's objectives and architecture are jointly designed to satisfy the three axioms for multi-entity spatiotemporal event streams (MESES).
If this is right
- Mobility tasks such as anomaly detection and next-visit prediction can share one pre-trained representation instead of requiring separate models for each task.
- Event streams from security and healthcare can be modeled with the same pre-training recipe as mobility data without any changes.
- Performance improvements arise because the pre-training explicitly accounts for joint distributions over event attributes, persistent entity signatures, and inter-entity interactions.
- Cross-domain transfer becomes possible once the pre-training respects the shared structural properties of MESES data rather than importing objectives from unrelated domains.
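The implications above all lean on one shared data model. As a minimal sketch of what that model implies (class and field names are our own illustration, not the paper's), a MESES record and the co-location signal behind the third structural property might look like:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MesesEvent:
    """One tuple-valued event: its meaning lies in the joint
    (location, time, activity), not in any single field."""
    entity_id: str    # persistent signature carrier (user, host, patient)
    location: str     # shared infrastructure enables cross-entity dependence
    timestamp: float  # seconds since epoch
    activity: str     # what the entity did at that place and time


def co_located(a: MesesEvent, b: MesesEvent, window_s: float = 3600.0) -> bool:
    """The third property in miniature: two distinct entities interact
    when they visit the same place within a time window."""
    return (a.entity_id != b.entity_id
            and a.location == b.location
            and abs(a.timestamp - b.timestamp) <= window_s)
```

The `window_s` threshold is a placeholder; any real co-location definition would come from the paper's own objectives.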
Where Pith is reading between the lines
- The same axiomatic approach could extend to other MESES-like data such as financial transactions or sensor networks where entities interact through shared infrastructure.
- If the axioms prove general, future models for sequential event data may need less domain-specific engineering because the core structure is captured at the pre-training stage.
- Testing which of the three axioms contributes most to gains on particular tasks would clarify whether all three are required or if subsets suffice for narrower applications.
Load-bearing premise
The three structural properties of MESES data can be turned into axioms whose satisfaction by TraXion's objectives and architecture is both necessary and sufficient for the observed performance gains, with no domain-specific post-processing or hyper-parameter search required.
What would settle it
Training TraXion on a new MESES dataset from a different domain and finding that it underperforms a task-specific baseline on at least one task or requires domain-specific hyperparameter tuning to match prior results.
Original abstract
Human mobility differs from text and from generic time series in three structural ways: visits are tuple-valued events whose meaning depends on the joint distribution over location, time, and activity; users carry persistent signatures across trajectories; and visits are not independent across users, since co-location at shared places is a primary signal. Existing pre-training recipes for mobility import objectives from language modeling, treating trajectories as sentences and visits as tokens, an analogy that fails against each of the three properties above. These properties define a broader class, multi-entity spatiotemporal event streams (MESES), spanning enterprise authentication logs, electronic health records, and other event-stream domains where entities share infrastructure, schedules, or contexts. We make the properties precise as three axioms that any pre-training framework for MESES should satisfy, and introduce TraXion, whose objectives and architecture are jointly designed to meet them. A single TraXion checkpoint per dataset beats task-specific baselines on every task across six public mobility datasets covering anomaly detection, next-POI recommendation, next-visit prediction, and social-link prediction. The same recipe, applied unchanged to enterprise authentication logs and ICU mortality prediction, matches or exceeds prior work on both, showing that event streams from domains as different as mobility, security, and healthcare can be modeled under a single framework.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that mobility trajectories and similar multi-entity spatiotemporal event streams (MESES) exhibit three structural properties—tuple-valued events whose semantics depend on joint location-time-activity distributions, persistent user signatures across trajectories, and non-independence of visits across users due to co-location—that distinguish them from text or generic time series. Existing language-modeling-style pre-training fails to respect these properties. The authors formalize the properties as three axioms that any MESES pre-training framework should satisfy, introduce TraXion whose objectives and architecture are jointly designed to meet the axioms, and report that a single TraXion checkpoint per dataset outperforms task-specific baselines on anomaly detection, next-POI recommendation, next-visit prediction, and social-link prediction across six public mobility datasets. The identical recipe, applied unchanged, matches or exceeds prior work on enterprise authentication logs and ICU mortality prediction.
Significance. If the empirical results and the necessity of the axioms are substantiated, the work would be significant: it offers a unified pre-training recipe for event-stream domains that share infrastructure or context (mobility, security, healthcare) and avoids the common practice of importing language-modeling objectives that ignore the data's joint structure and cross-entity dependencies.
major comments (3)
- [Abstract] Abstract and experimental section: the central claim that a single TraXion checkpoint beats every task-specific baseline on all tasks requires a reproducible experimental protocol, baseline implementations, statistical significance tests, and ablation results; none of these are described or referenced in the provided text, rendering the performance claims unverifiable.
- [Abstract] The argument that the three MESES axioms drive the observed gains (rather than other unablated factors such as encoder choice or training schedule) is load-bearing for the paper's contribution, yet no ablation is reported that removes or relaxes one axiom while keeping the rest of the architecture and recipe fixed and then measures the resulting performance drop on the reported tasks.
- [Abstract] Abstract: the cross-domain transfer result (authentication logs and ICU mortality) is presented as using the 'same recipe, applied unchanged,' but no details are given on whether hyper-parameter search, domain-specific post-processing, or re-tuning occurred; this information is necessary to evaluate the claim of zero-shot transfer.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important gaps in experimental documentation and validation that we will address in the revision. Below we respond point-by-point to the three major comments.
Point-by-point responses
-
Referee: [Abstract] Abstract and experimental section: the central claim that a single TraXion checkpoint beats every task-specific baseline on all tasks requires a reproducible experimental protocol, baseline implementations, statistical significance tests, and ablation results; none of these are described or referenced in the provided text, rendering the performance claims unverifiable.
Authors: We agree that the current manuscript version lacks sufficient detail for independent reproduction of the results. In the revised manuscript we will add a dedicated 'Experimental Setup' subsection that fully specifies the training protocol (including data splits, batching, optimization hyperparameters, and early-stopping criteria), provides references or links to the exact baseline implementations used, reports statistical significance (paired t-tests or Wilcoxon signed-rank tests with p-values across five random seeds), and includes the requested ablation results. These additions will make the performance claims verifiable. revision: yes
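The seed-level significance testing promised here can be sanity-checked with nothing beyond the standard library. The sketch below uses an exact paired sign-flip test, a close relative of the Wilcoxon signed-rank test the authors name; the per-seed scores are hypothetical, not from the paper:

```python
from itertools import product


def paired_signflip_pvalue(a, b):
    """Exact two-sided paired sign-flip test: under H0 each per-seed
    difference is symmetric around 0, so all 2^n sign patterns are
    equally likely; the p-value is the share of patterns whose absolute
    summed difference is at least the observed one."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    count = sum(1 for signs in product((1, -1), repeat=len(diffs))
                if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed)
    return count / 2 ** len(diffs)


# Hypothetical scores for one task over five random seeds.
traxion = [0.81, 0.83, 0.82, 0.86, 0.85]
baseline = [0.80, 0.81, 0.79, 0.82, 0.80]
print(paired_signflip_pvalue(traxion, baseline))  # 0.0625
```

With five seeds the finest achievable two-sided p-value is 2/32 = 0.0625, which is one reason the revision's choice of seed count matters for the claimed significance tests.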
-
Referee: [Abstract] The argument that the three MESES axioms drive the observed gains (rather than other unablated factors such as encoder choice or training schedule) is load-bearing for the paper's contribution, yet no ablation is reported that removes or relaxes one axiom while keeping the rest of the architecture and recipe fixed and then measures the resulting performance drop on the reported tasks.
Authors: We acknowledge that the manuscript does not yet contain ablations that isolate the contribution of each axiom. In the revision we will introduce three controlled variants of TraXion, each violating exactly one axiom while preserving the encoder, training schedule, and all other objectives: (1) an independent-event variant that factorizes the joint location-time-activity distribution, (2) a user-agnostic variant that drops persistent signature modeling, and (3) a user-independent variant that ignores co-location signals. We will report the resulting performance drops on all four mobility tasks, thereby directly linking the axioms to the observed gains. revision: yes
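A lightweight way to keep such ablations honest is to derive every variant from one shared configuration and flip exactly one flag, so nothing else can drift between runs. The sketch below is our own illustration; the flag names are hypothetical stand-ins for the three axiom-specific mechanisms, not identifiers from the paper:

```python
# Hypothetical flags for the three axiom-specific mechanisms; the encoder,
# training schedule, and all other objectives stay fixed across variants.
FULL = {
    "joint_tuple_objective": True,  # axiom 1: joint location-time-activity
    "entity_signature": True,       # axiom 2: persistent user signatures
    "colocation_links": True,       # axiom 3: cross-user co-location
}

# Which mechanism each proposed variant violates.
VIOLATES = {
    "independent-event": "joint_tuple_objective",
    "user-agnostic": "entity_signature",
    "user-independent": "colocation_links",
}


def ablation(variant: str) -> dict:
    """Copy the full config and disable the single mechanism the variant violates."""
    cfg = dict(FULL)
    cfg[VIOLATES[variant]] = False
    return cfg


for name in VIOLATES:
    # Invariant: each variant disables exactly one mechanism.
    assert sum(ablation(name).values()) == len(FULL) - 1
```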
-
Referee: [Abstract] Abstract: the cross-domain transfer result (authentication logs and ICU mortality) is presented as using the 'same recipe, applied unchanged,' but no details are given on whether hyper-parameter search, domain-specific post-processing, or re-tuning occurred; this information is necessary to evaluate the claim of zero-shot transfer.
Authors: We will clarify this point explicitly. The transfer experiments used the identical pre-training objectives, architecture, optimizer settings, and hyperparameter values as the mobility runs; the only adaptation was the minimal input tokenization required to map the new event schemas into the same vocabulary format. No hyper-parameter search, domain-specific fine-tuning, or post-processing was performed on the target domains. A new paragraph in the experimental section will document this procedure and confirm the zero-shot nature of the transfer. revision: yes
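The "minimal input tokenization" described here amounts to a schema projection: each domain's raw fields are mapped onto the shared event-tuple format while every model setting stays untouched. A hedged sketch (the schemas, field names, and sample record are invented for illustration) could look like:

```python
# Hypothetical field mappings from two target-domain schemas onto the
# shared (entity, location, time, activity) tuple format.
AUTH_SCHEMA = {"entity": "src_user", "location": "dst_host",
               "time": "epoch", "activity": "auth_type"}
ICU_SCHEMA = {"entity": "patient_id", "location": "care_unit",
              "time": "charttime", "activity": "event_code"}


def to_meses(record: dict, schema: dict) -> tuple:
    """Project a raw domain event onto the shared tuple vocabulary format."""
    return tuple(record[schema[k]] for k in ("entity", "location", "time", "activity"))


event = {"src_user": "U12", "dst_host": "C4210",
         "epoch": 151036, "auth_type": "Kerberos"}
print(to_meses(event, AUTH_SCHEMA))  # ('U12', 'C4210', 151036, 'Kerberos')
```

Under this reading, the zero-shot claim reduces to: only the `schema` dictionary changes per domain, never the pre-training recipe.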
Circularity Check
No circularity: axioms are motivated externally and performance is empirical.
Full rationale
The paper extracts three structural properties from MESES data descriptions, formalizes them as axioms, designs objectives and architecture to satisfy those axioms, and reports experimental results on held-out tasks and cross-domain datasets. No equation equates a derived quantity to a fitted parameter defined by the same metric, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled via prior work. The performance edge is presented as an observed outcome of the design rather than a mathematical identity with the inputs.
Axiom & Free-Parameter Ledger
axioms (3)
- domain assumption: Visits are tuple-valued events whose meaning depends on the joint distribution over location, time, and activity.
- domain assumption: Users carry persistent signatures across trajectories.
- domain assumption: Visits are not independent across users because co-location at shared places is a primary signal.
invented entities (2)
-
MESES (multi-entity spatiotemporal event streams)
no independent evidence
-
TraXion
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Urban Anomalies: A simulated human mobility dataset with injected anomalies
Hossein Amiri, Ruochen Kong, and Andreas Züfle. Urban Anomalies: A simulated human mobility dataset with injected anomalies. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on Geospatial Anomaly Detection, pages 1–11, 2024
2024
-
[2]
Claude Code. https://www.anthropic.com/claude-code, 2025
Anthropic. Claude Code. https://www.anthropic.com/claude-code, 2025
2025
-
[3]
ICAD: A self-supervised autoregressive approach for multi-context anomaly detection in human mobility data
Bita Azarijoo, Maria Despoina Siampou, John Krumm, and Cyrus Shahabi. ICAD: A self-supervised autoregressive approach for multi-context anomaly detection in human mobility data. In Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems, pages 595–606, 2025
2025
-
[4]
Contrastive trajectory similarity learning with dual-feature attention
Yanchuan Chang, Jianzhong Qi, Yuxuan Liang, and Egemen Tanin. Contrastive trajectory similarity learning with dual-feature attention. In 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 2933–2945. IEEE, 2023
2023
-
[5]
Recurrent neural networks for multivariate time series with missing values. Scientific Reports, 8(1):6085, 2018
Zhengping Che, Sanjay Purushotham, Kyunghyun Cho, David Sontag, and Yan Liu. Recurrent neural networks for multivariate time series with missing values. Scientific Reports, 8(1):6085, 2018
2018
-
[6]
Mutual distillation learning network for trajectory-user linking
Wei Chen, Shuzhe Li, Chao Huang, Yanwei Yu, Yongguo Jiang, and Junyu Dong. Mutual distillation learning network for trajectory-user linking. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2022
2022
-
[7]
Trajectory-user linking via hierarchical spatio-temporal attention networks. ACM Transactions on Knowledge Discovery from Data, 18(4):1–22, 2024
Wei Chen, Chao Huang, Yanwei Yu, Yongguo Jiang, and Junyu Dong. Trajectory-user linking via hierarchical spatio-temporal attention networks. ACM Transactions on Knowledge Discovery from Data, 18(4):1–22, 2024
2024
-
[8]
Friendship and mobility: user movement in location-based social networks
Eunjoon Cho, Seth A Myers, and Jure Leskovec. Friendship and mobility: user movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1082–1090, 2011
2011
-
[9]
One model, many cities: A transferable social relationship inference framework for human mobility data
Chen Chu, Cyrus Shahabi, Emmanuel Tung, and Khurram Shafique. One model, many cities: A transferable social relationship inference framework for human mobility data. In Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems, pages 66–76, 2025
2025
-
[10]
ELECTRA: Pre-training text encoders as discriminators rather than generators
Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In International Conference on Learning Representations, 2020
2020
-
[11]
BERT: Pre-training of deep bidirectional transformers for language understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, 2019
2019
-
[12]
SimMTM: A simple pre-training framework for masked time-series modeling. Advances in Neural Information Processing Systems, 36:29996–30025, 2023
Jiaxiang Dong, Haixu Wu, Haoran Zhang, Li Zhang, Jianmin Wang, and Mingsheng Long. SimMTM: A simple pre-training framework for masked time-series modeling. Advances in Neural Information Processing Systems, 36:29996–30025, 2023
2023
-
[13]
DeepLog: Anomaly detection and diagnosis from system logs through deep learning
Min Du, Feifei Li, Guineng Zheng, and Vivek Srikumar. DeepLog: Anomaly detection and diagnosis from system logs through deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 1285–1298, 2017
2017
-
[14]
Dirichlet-Hawkes processes with applications to clustering continuous-time document streams
Nan Du, Mehrdad Farajtabar, Amr Ahmed, Alexander J Smola, and Le Song. Dirichlet-Hawkes processes with applications to clustering continuous-time document streams. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 219–228, 2015
2015
-
[15]
Recurrent marked temporal point processes: Embedding event history to vector
Nan Du, Hanjun Dai, Rakshit Trivedi, Utkarsh Upadhyay, Manuel Gomez-Rodriguez, and Le Song. Recurrent marked temporal point processes: Embedding event history to vector. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1555–1564, 2016
2016
-
[16]
Back to Bayesics: Uncovering human mobility distributions and anomalies with an integrated statistical and neural framework
Minxuan Duan, Yinlong Qian, Lingyi Zhao, Zihao Zhou, Zeeshan Rasheed, Rose Yu, and Khurram Shafique. Back to Bayesics: Uncovering human mobility distributions and anomalies with an integrated statistical and neural framework. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on Geospatial Anomaly Detection, pages 56–67, 2024
2024
-
[17]
AgentMove: A large language model based agentic framework for zero-shot next location prediction
Jie Feng, Yuwei Du, Jie Zhao, and Yong Li. AgentMove: A large language model based agentic framework for zero-shot next location prediction. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1322–1338, 2025
2025
-
[18]
Improving event representation via simultaneous weakly supervised contrastive learning and clustering
Jun Gao, Wei Wang, Changlong Yu, Huan Zhao, Wilfred Ng, and Ruifeng Xu. Improving event representation via simultaneous weakly supervised contrastive learning and clustering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2022
-
[19]
Mobility-LLM: Learning visiting intentions and travel preference from human mobility data with large language models
Letian Gong, Yan Lin, Xinyue Zhang, Yiwen Lu, Xuedi Han, Yichen Liu, Shengnan Guo, Youfang Lin, and Huaiyu Wan. Mobility-LLM: Learning visiting intentions and travel preference from human mobility data with large language models. Advances in Neural Information Processing Systems, 37:36185–36217, 2024
2024
-
[20]
TabR: Tabular deep learning meets nearest neighbors
Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev, Daniil Shlenskii, Akim Kotelnikov, and Artem Babenko. TabR: Tabular deep learning meets nearest neighbors. In The Twelfth International Conference on Learning Representations, 2024
2024
-
[21]
LogBERT: Log anomaly detection via BERT
Haixuan Guo, Shuhan Yuan, and Xintao Wu. LogBERT: Log anomaly detection via BERT. In 2021 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2021
2021
-
[22]
WaveGNN: Integrating graph neural networks and transformers for decay-aware classification of irregular clinical time-series
Arash Hajisafi, Maria Despoina Siampou, Bita Azarijoo, Zhen Xiong, and Cyrus Shahabi. WaveGNN: Integrating graph neural networks and transformers for decay-aware classification of irregular clinical time-series. In 2025 IEEE International Conference on Big Data (BigData), pages 1934–1943, 2025. doi: 10.1109/BigData66926.2025.11401906
-
[23]
LogGPT: Log anomaly detection via GPT
Xiao Han, Shuhan Yuan, and Mohamed Trabelsi. LogGPT: Log anomaly detection via GPT. In 2023 IEEE International Conference on Big Data (BigData), pages 1117–1122, 2023
2023
-
[24]
MobilityGPT: Enhanced human mobility modeling with a GPT model. IEEE Transactions on Intelligent Transportation Systems, 27(1):1681–1694, 2025
Ammar Haydari, Dongjie Chen, Zhengfeng Lai, Michael Zhang, and Chen-Nee Chuah. MobilityGPT: Enhanced human mobility modeling with a GPT model. IEEE Transactions on Intelligent Transportation Systems, 27(1):1681–1694, 2025
2025
-
[25]
Masked autoencoders are scalable vision learners
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16000–16009, 2022
2022
-
[26]
TrajGPT: Controlled synthetic trajectory generation using a multitask transformer-based spatiotemporal model
Shang-Ling Hsu, Emmanuel Tung, John Krumm, Cyrus Shahabi, and Khurram Shafique. TrajGPT: Controlled synthetic trajectory generation using a multitask transformer-based spatiotemporal model. In Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems, pages 362–371, 2024
2024
-
[27]
eICU Collaborative Research Database Demo. PhysioNet, May 2021
Alistair Johnson, Tom Pollard, Omar Badawi, and Jesse Raffa. eICU Collaborative Research Database Demo. PhysioNet, May 2021. doi: 10.13026/4mxk-na84. URL https://doi.org/10.13026/4mxk-na84. Version 2.0.1
-
[28]
Self-attentive sequential recommendation
Wang-Cheng Kang and Julian McAuley. Self-attentive sequential recommendation. In 2018 IEEE International Conference on Data Mining (ICDM), pages 197–206, 2018. doi: 10.1109/ICDM.2018.00035
-
[29]
autoresearch. https://github.com/karpathy/autoresearch, 2025
Andrej Karpathy. autoresearch. https://github.com/karpathy/autoresearch, 2025
2025
-
[30]
Time2Vec: Learning a Vector Representation of Time
Seyed Mehran Kazemi, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, and Marcus Brubaker. Time2Vec: Learning a vector representation of time. arXiv preprint arXiv:1907.05321, 2019
2019
-
[31]
Comprehensive, multi-source cyber-security events data set
Alexander D Kent. Comprehensive, multi-source cyber-security events data set. Technical report, Los Alamos National Lab. (LANL), Los Alamos, NM (United States), 2015
2015
-
[32]
Similarity of neural network representations revisited
Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited. In International Conference on Machine Learning, 2019
2019
-
[33]
Prototypical contrastive learning of unsupervised representations
Junnan Li, Pan Zhou, Caiming Xiong, and Steven C.H. Hoi. Prototypical contrastive learning of unsupervised representations. In International Conference on Learning Representations, 2021
2021
-
[34]
TrajFlow: Nation-wide pseudo GPS trajectory generation with flow matching models
Peiran Li, Jiawei Wang, Haoran Zhang, Xiaodan Shi, Noboru Koshizuka, Chihiro Shimizu, and Renhe Jiang. TrajFlow: Nation-wide pseudo GPS trajectory generation with flow matching models. In The Fourteenth International Conference on Learning Representations, 2026
2026
-
[35]
BEHRT: transformer for electronic health records. Scientific Reports, 10(1):7155, 2020
Yikuan Li, Shishir Rao, José Roberto Ayala Solares, Abdelaali Hassaine, Rema Ramakrishnan, Dexter Canoy, Yajie Zhu, Kazem Rahimi, and Gholamreza Salimi-Khorshidi. BEHRT: transformer for electronic health records. Scientific Reports, 10(1):7155, 2020
2020
-
[36]
Heterogeneous hyperbolic hypergraph neural network for friend recommendation in location-based social networks. ACM Transactions on Knowledge Discovery from Data, 19(3):1–29, 2025
Yongkang Li, Zipei Fan, and Xuan Song. Heterogeneous hyperbolic hypergraph neural network for friend recommendation in location-based social networks. ACM Transactions on Knowledge Discovery from Data, 19(3):1–29, 2025
2025
-
[37]
Temporal fusion transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 37(4):1748–1764, 2021
Bryan Lim, Sercan Ö Arık, Nicolas Loeff, and Tomas Pfister. Temporal fusion transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 37(4):1748–1764, 2021
2021
-
[38]
Pre-training context and time aware location embeddings from spatial-temporal trajectories for user next location prediction
Yan Lin, Huaiyu Wan, Shengnan Guo, and Youfang Lin. Pre-training context and time aware location embeddings from spatial-temporal trajectories for user next location prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 4241–4248, 2021
2021
-
[39]
Discovering latent network structure in point process data
Scott Linderman and Ryan Adams. Discovering latent network structure in point process data. In International Conference on Machine Learning, pages 1413–1421. PMLR, 2014
2014
-
[40]
iTransformer: Inverted transformers are effective for time series forecasting
Yong Liu, Tengge Hu, Haoran Zhang, Haixu Wu, Shiyu Wang, Lintao Ma, and Mingsheng Long. iTransformer: Inverted transformers are effective for time series forecasting. In The Twelfth International Conference on Learning Representations, 2024
2024
-
[41]
Decoupled weight decay regularization
Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2019
2019
-
[42]
Multi-scale representation learning for spatial feature distributions using grid cells
Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. Multi-scale representation learning for spatial feature distributions using grid cells. In International Conference on Learning Representations (ICLR), 2020
2020
-
[43]
UMAP: Uniform manifold approximation and projection. Journal of Open Source Software, 3(29), 2018
Leland McInnes, John Healy, Nathaniel Saul, and Lukas Großberger. UMAP: Uniform manifold approximation and projection. Journal of Open Source Software, 3(29), 2018
2018
-
[44]
The neural Hawkes process: A neurally self-modulating multivariate point process. Advances in Neural Information Processing Systems, 30, 2017
Hongyuan Mei and Jason M Eisner. The neural Hawkes process: A neurally self-modulating multivariate point process. Advances in Neural Information Processing Systems, 30, 2017
2017
-
[45]
Self-supervised log parsing
Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, and Odej Kao. Self-supervised log parsing. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), 2020
2020
-
[46]
A time series is worth 64 words: Long-term forecasting with transformers
Yuqi Nie, Nam H Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A time series is worth 64 words: Long-term forecasting with transformers. In The Eleventh International Conference on Learning Representations, 2023
2023
-
[47]
CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks
Chao Pang, Xinzhuo Jiang, Krishna S Kalluri, Matthew Spotnitz, RuiJun Chen, Adler Perotte, and Karthik Natarajan. CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks. In Machine Learning for Health, pages 239–260. PMLR, 2021
2021
-
[48]
EBM: an entropy-based model to infer social strength from spatiotemporal data
Huy Pham, Cyrus Shahabi, and Yan Liu. EBM: an entropy-based model to infer social strength from spatiotemporal data. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 265–276, 2013
2013
-
[49]
Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digital Medicine, 4(1):86, 2021
Laila Rasmy, Yang Xiang, Ziqian Xie, Cui Tao, and Degui Zhi. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digital Medicine, 4(1):86, 2021
2021
-
[50]
Zero shot health trajectory prediction using transformer
Pawel Renc, Yugang Jia, Anthony E. Samir, Jaroslaw Was, Quanzheng Li, David W. Bates, and Arkadiusz Sitek. Zero shot health trajectory prediction using transformer. NPJ Digital Medicine, 7(1):256, 2024
2024
-
[51]
Neural temporal point processes: A review
Oleksandr Shchur, Ali Caner Türkmen, Tim Januschowski, and Stephan Günnemann. Neural temporal point processes: A review. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, pages 4585–4593. International Joint Conferences on Artificial Intelligence Organization, 2021
2021
-
[52]
Multi-time attention networks for irregularly sampled time series
Satya Narayan Shukla and Benjamin Marlin. Multi-time attention networks for irregularly sampled time series. In International Conference on Learning Representations, 2021
2021
-
[53]
Mobility-Embedded POIs: Learning what a place is and how it is used from human movement
Maria Despoina Siampou, Shushman Choudhury, Shang-Ling Hsu, Neha Arora, and Cyrus Shahabi. Mobility-Embedded POIs: Learning what a place is and how it is used from human movement. In Forty-third International Conference on Machine Learning, 2026. To appear
2026
-
[54]
NUMOSIM: A synthetic mobility dataset with anomaly detection benchmarks
Chris Stanford, Suman Adari, Xishun Liao, Yueshuai He, Qinhua Jiang, Chenchen Kuai, Jiaqi Ma, Emmanuel Tung, Yinlong Qian, Lingyi Zhao, et al. NUMOSIM: A synthetic mobility dataset with anomaly detection benchmarks. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on Geospatial Anomaly Detection, pages 68–78, 2024
2024
-
[55]
BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer
Fei Sun, Jun Liu, Jian Wu, Changhua Pei, Xiao Lin, Wenwu Ou, and Peng Jiang. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pages 1441–1450, 2019
2019
-
[56]
Self-supervised transformer for sparse and irregularly sampled multivariate clinical time-series. ACM Transactions on Knowledge Discovery from Data (TKDD), 16(6):1–17, 2022
Sindhu Tipirneni and Chandan K Reddy. Self-supervised transformer for sparse and irregularly sampled multivariate clinical time-series. ACM Transactions on Knowledge Discovery from Data (TKDD), 16(6):1–17, 2022
2022
-
[57]
Deep learning for unsupervised insider threat detection in structured cybersecurity data streams
Aaron Tuor, Samuel Kaplan, Brian Hutchinson, Nicole Nichols, and Sean Robinson. Deep learning for unsupervised insider threat detection in structured cybersecurity data streams. In AAAI Workshops, pages 224–231, 2017
2017
-
[58]
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018
2018
-
[59]
Pre-training time-aware location embeddings from spatial-temporal trajectories. IEEE Transactions on Knowledge and Data Engineering, 34(11):5510–5523, 2021
Huaiyu Wan, Yan Lin, Shengnan Guo, and Youfang Lin. Pre-training time-aware location embeddings from spatial-temporal trajectories. IEEE Transactions on Knowledge and Data Engineering, 34(11):5510–5523, 2021
2021
-
[60]
CoBAD: Modeling collective behaviors for human mobility anomaly detection
Haomin Wen, Shurui Cao, and Leman Akoglu. CoBAD: Modeling collective behaviors for human mobility anomaly detection. In Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems, pages 197–209, 2025
2025
-
[61]
Uncertainty-aware spatio-temporal human mobility modeling and anomaly detection
Haomin Wen, Shurui Cao, Zeeshan Rasheed, Khurram Hassan Shafique, and Leman Akoglu. Uncertainty-aware spatio-temporal human mobility modeling and anomaly detection. In Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems, pages 328–331, 2025
2025
-
[62] Xu Xie, Fei Sun, Zhaoyang Liu, Shiwen Wu, Jinyang Gao, Jiandong Zhang, Bolin Ding, and Bin Cui. Contrastive learning for sequential recommendation. In 2022 IEEE 38th International Conference on Data Engineering (ICDE), pages 1259–1273. IEEE, 2022.
[63] Bada Xin, Xin Wan, Zhuojun Jiang, Faqiang Liu, Su Chen, Rong Yang, and Qingyun Liu. COAST: Contrastive learning with augmented spatio-temporal encoding for next POI recommendation. In ICASSP 2025–2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2025.
[64] Xiaohang Xu, Renhe Jiang, Chuang Yang, Zipei Fan, and Kaoru Sezaki. Taming the long tail in human mobility prediction. Advances in Neural Information Processing Systems, 37:54748–54771, 2024.
[65] Yuanbo Xu, Hongxu Shen, Yiheng Jiang, and En Wang. Where and when: predict next POI and its explicit timestamp in sequential recommendation. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, pages 3507–3515, 2025.
[66] Hao Xue, Flora Salim, Yongli Ren, and Nuria Oliver. MobTCast: Leveraging auxiliary trajectory forecasting for human mobility prediction. Advances in Neural Information Processing Systems, 34:30380–30391, 2021.
[67] Chenghao Yang, Hongyuan Mei, and Jason Eisner. Transformer embeddings of irregularly spaced events and their participants. In International Conference on Learning Representations, 2022.
[68] Dingqi Yang, Bingqing Qu, Jie Yang, and Philippe Cudre-Mauroux. Revisiting user mobility and social relationships in LBSNs: a hypergraph embedding approach. In The World Wide Web Conference, pages 2147–2157, 2019.
[69] Song Yang, Jiamou Liu, and Kaiqi Zhao. GETNext: trajectory flow map enhanced transformer for next POI recommendation. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1144–1153, 2022.
[70] Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, and Yong Li. UniST: A prompt-empowered universal model for urban spatio-temporal prediction. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 4095–4106, 2024.
[71] Zhihan Yue, Yujing Wang, Juanyong Duan, Tianmeng Yang, Congrui Huang, Yunhai Tong, and Bixiong Xu. TS2Vec: Towards universal representation of time series. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 8980–8987, 2022.
[72] George Zerveas, Srideepika Jayaraman, Dhaval Patel, Anuradha Bhamidipaty, and Carsten Eickhoff. A transformer-based framework for multivariate time series representation learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 2114–2124, 2021.
[73] Hao Zhang, Wei Chen, Xingyu Zhao, Jianpeng Qi, Guiyuan Jiang, and Yanwei Yu. Scalable trajectory-user linking with dual-stream representation networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 13224–13232, 2025.
[74] Xiang Zhang, Marko Zeman, Theodoros Tsiligkaridis, and Marinka Zitnik. Graph-guided network for irregularly sampled multivariate time series. In International Conference on Learning Representations, 2022.
[75] Yunhao Zhang and Junchi Yan. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting. In The Eleventh International Conference on Learning Representations, 2023.
[76] Fan Zhou, Qiang Gao, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, and Fengli Zhang. Trajectory-user linking via variational autoencoder. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, pages 3212–3218, 2018.
[77] Ke Zhou, Hongyuan Zha, and Le Song. Learning triggering kernels for multi-dimensional Hawkes processes. In International Conference on Machine Learning, pages 1301–1309. PMLR, 2013.
[78] Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, and Ji-Rong Wen. S3-Rec: Self-supervised learning for sequential recommendation with mutual information maximization. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pages 1893–1902, 2020.
[79] Tian Zhou, Peisong Niu, Liang Sun, Rong Jin, et al. One fits all: Power general time series analysis by pretrained LM. Advances in Neural Information Processing Systems, 36:43322–43355, 2023.
[80] Yuanshao Zhu, James Jianqiao Yu, Xiangyu Zhao, Xun Zhou, Liang Han, Xuetao Wei, and Yuxuan Liang. UniTraj: Learning a universal trajectory foundation model from billion-scale worldwide traces. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.