UxSID: Semantic-Aware User Interests Modeling for Ultra-Long Sequence
Pith reviewed 2026-05-14 21:01 UTC · model grok-4.3
The pith
UxSID uses semantic IDs and dual-level attention to model ultra-long user sequences with target-aware preferences.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By assigning Semantic IDs to items and employing a dual-level attention strategy over the resulting semantic groups, UxSID builds a shared interest memory that captures preferences relevant to a specific target item without incurring the cost of item-by-item search or the information loss of fully item-agnostic compression.
What carries the argument
Semantic IDs (SIDs) that group items by meaning, combined with dual-level attention that first aggregates within each semantic group and then attends across groups to the target item.
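The two stages named above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the function names, the use of plain scaled dot-product attention with keys reused as values, and the choice of each group's mean vector as its level-one query are all hypothetical.

```python
import math
from collections import defaultdict

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(vectors):
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def attend(query, keys):
    """Scaled dot-product attention; keys double as values (assumption)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    return [sum(w * k[d] for w, k in zip(weights, keys))
            for d in range(len(query))]

def dual_level_attention(item_vecs, sids, target_vec):
    # Group the user's items by their (hypothetical) Semantic ID.
    groups = defaultdict(list)
    for vec, sid in zip(item_vecs, sids):
        groups[sid].append(vec)
    # Level 1, target-agnostic: one memory vector per semantic group.
    memory = {sid: attend(mean(vecs), vecs) for sid, vecs in groups.items()}
    # Level 2, target-aware: the target attends across the group memories.
    interest = attend(target_vec, list(memory.values()))
    return interest, memory

# Toy data: two items in SID 7, one item in SID 3.
items = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
sids = [7, 7, 3]
interest, memory = dual_level_attention(items, sids, target_vec=[1.0, 0.0])
```

In this sketch the level-one memory does not depend on the target, so it could be computed once per user and reused across candidate items. Only the much shorter level-two attention over groups is repeated per target.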
Load-bearing premise
Grouping items into Semantic IDs preserves the distinctions that actually matter for target-aware user preferences rather than collapsing important differences or injecting new biases.
What would settle it
Replace the learned Semantic IDs with randomly assigned group labels on the same ultra-long sequences and measure whether recommendation accuracy and revenue lift disappear.
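A hedged sketch of how the control labels for such a test might be built: the learned labels are permuted across items, so the number and size of groups stay the same while the semantic content of each group is destroyed. The mapping `item_to_sid` and the function name `shuffle_sids` are hypothetical; the paper does not describe this procedure.

```python
import random

def shuffle_sids(item_to_sid, seed=0):
    """Return a control mapping: same multiset of SIDs, randomly reassigned."""
    rng = random.Random(seed)  # seeded so the control is reproducible
    items = list(item_to_sid)
    labels = [item_to_sid[i] for i in items]
    rng.shuffle(labels)
    return dict(zip(items, labels))

# Toy learned assignment (illustrative values only).
learned = {"a": 1, "b": 1, "c": 2, "d": 3}
control = shuffle_sids(learned, seed=42)
```

Permuting rather than drawing fresh labels keeps group sizes fixed, so any accuracy gap between the two runs can be attributed to what the groups mean rather than to how many groups there are.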
Original abstract
Modeling ultra-long user sequences involves a difficult trade-off between efficiency and effectiveness. While current paradigms rely on either item-specific search or item-agnostic compression, we propose UxSID, a framework exploring a third path: semantic-group shared interest memory. By utilizing Semantic IDs (SIDs) and a dual-level attention strategy, UxSID captures target-aware preferences without the heavy cost of item-specific models. This end-to-end architecture balances computational parsimony with semantic awareness, achieving state-of-the-art performance and a 0.337% revenue lift in large-scale advertising A/B test.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes UxSID, a framework for modeling ultra-long user sequences in recommendation systems via semantic-group shared interest memory. It introduces Semantic IDs (SIDs) and a dual-level attention strategy to capture target-aware preferences efficiently, avoiding the costs of item-specific search while retaining semantic awareness, and reports state-of-the-art performance plus a 0.337% revenue lift in a large-scale advertising A/B test.
Significance. If the results hold, the work offers a practical third path for ultra-long sequence modeling that could improve scalability in industrial recommender systems without sacrificing semantic fidelity. The online A/B test result provides direct evidence of business impact, strengthening the case for adoption in advertising and related domains.
minor comments (2)
- [Abstract] The claim of SOTA performance would be strengthened by briefly naming the offline datasets, sequence lengths, and main baselines used.
- [§4, Experiments] Include error bars or statistical significance tests for the reported metrics to support the SOTA and revenue-lift claims.
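One way to attach an interval to a relative lift, such as the reported 0.337%, is a percentile bootstrap over per-user revenue. The sketch below is an assumption-laden illustration: the function name, the per-user revenue lists, and the toy numbers are hypothetical, and it does not reflect the paper's actual A/B methodology, which is not described here.

```python
import random

def bootstrap_lift_ci(control, treatment, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for mean(treatment) / mean(control) - 1."""
    rng = random.Random(seed)
    lifts = []
    for _ in range(n_boot):
        # Resample each arm with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        lifts.append((sum(t) / len(t)) / (sum(c) / len(c)) - 1.0)
    lifts.sort()
    lo = lifts[int((alpha / 2) * n_boot)]
    hi = lifts[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy per-user revenue with a true lift of 1% (illustrative only).
control = [100 + (i % 5) for i in range(200)]
treatment = [x * 1.01 for x in control]
lo, hi = bootstrap_lift_ci(control, treatment)
```

If the interval excludes zero, the lift is unlikely to be resampling noise under this simple model. A production analysis would also need to resample at the randomization unit and account for heavy-tailed revenue.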
Simulated Author's Rebuttal
We thank the referee for the positive assessment of UxSID and the recommendation for minor revision. The recognition of our semantic-group shared interest memory approach as a practical third path for ultra-long sequence modeling, along with the value of the online A/B test results, is appreciated.
Circularity Check
No significant circularity
full rationale
The paper proposes UxSID as an independent architectural framework using Semantic IDs (SIDs) and dual-level attention to model ultra-long sequences via semantic-group shared memory. No equations, fitted parameters, or derivations are presented in the abstract or described structure that reduce outputs to inputs by construction. The central claim of balancing efficiency and target-aware preferences is framed as a novel third path without self-definitional loops, self-citation load-bearing premises, or renaming of known results. The architecture is presented as self-contained with external validation via A/B test revenue lift, satisfying the criteria for a non-circular proposal.
Axiom & Free-Parameter Ledger
invented entities (2)
- Semantic IDs (SIDs): no independent evidence
- dual-level attention strategy: no independent evidence
Dataset preparation (excerpt from the paper)
We use users' interaction histories to create video sequences sorted by timestamp and filter out users with fewer than 50 comments. For each video sequence, the last 7 videos are used for testing, the 8th to 14th last videos are used for validation, and the rest are used for training. We cap each user's interaction history at 2k entries. Industrial Dataset: Colle...
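The per-user split described in the excerpt above can be sketched as follows. The function names and parameters are hypothetical, and two details are assumptions the excerpt does not confirm: that the 2k cap keeps the most recent interactions, and that it is applied before the split.

```python
def is_eligible(n_comments, min_comments=50):
    """Users with fewer than 50 comments are filtered out."""
    return n_comments >= min_comments

def split_sequence(videos, max_history=2000, n_test=7, n_val=7):
    """Split one user's videos, sorted by timestamp with the oldest first."""
    videos = videos[-max_history:]            # assumption: keep the latest 2k
    test = videos[-n_test:]                   # last 7 videos
    val = videos[-(n_test + n_val):-n_test]   # 8th to 14th last videos
    train = videos[:-(n_test + n_val)]        # everything earlier
    return train, val, test

# Toy sequence of 100 video ids in timestamp order.
seq = list(range(100))
train, val, test = split_sequence(seq)
```

With this toy sequence, the test set holds ids 93 to 99, the validation set holds ids 86 to 92, and training uses ids 0 to 85.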