arXiv: 2605.09040 · v2 · submitted 2026-05-09 · 💻 cs.AI · cs.IR · cs.LG


UxSID: Semantic-Aware User Interests Modeling for Ultra-Long Sequence

Hongwei Zhang, Qiqiang Zhong, Jiangxia Cao, Yiyang Lv, Huanjie Wang, Liwei Guan, Jing Yao, Yiyu Wang, Junfeng Shu, Zhaojie Liu, Han Li


Pith reviewed 2026-05-14 21:01 UTC · model grok-4.3

classification 💻 cs.AI · cs.IR · cs.LG

keywords semantic IDs · ultra-long sequences · dual-level attention · user interest modeling · recommendation systems · advertising · sequence compression


The pith

UxSID uses semantic IDs and dual-level attention to model ultra-long user sequences with target-aware preferences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces UxSID to solve the efficiency-effectiveness trade-off when handling very long user behavior histories in recommendation and advertising systems. Existing approaches either scan every past item individually at high cost or compress the entire history without regard to the current target item. UxSID instead assigns Semantic IDs to group related items and applies a two-stage attention process that first pools within each semantic group and then attends across groups to the target. This produces a shared interest memory that stays semantically aware yet remains computationally light. The method reaches state-of-the-art accuracy and delivers a measured 0.337 percent revenue gain in a large-scale live advertising experiment.

Core claim

By assigning Semantic IDs to items and employing a dual-level attention strategy over the resulting semantic groups, UxSID builds a shared interest memory that captures preferences relevant to a specific target item without incurring the cost of item-by-item search or the information loss of fully item-agnostic compression.

What carries the argument

Semantic IDs (SIDs) that group items by meaning, combined with dual-level attention that first aggregates within each semantic group and then attends across groups to the target item.
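As a rough illustration of the two stages named here, the sketch below pools behaviors inside each semantic group and then lets the target item attend across the pooled groups. It is not the authors' implementation: the per-group query vectors (`group_queries`), the reuse of keys as values, and every function name are assumptions made for this example.

```python
import math
from collections import defaultdict

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys):
    """Scaled dot-product attention; keys double as values (an assumption)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    pooled = [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(query))]
    return pooled, weights

def build_group_memory(history, group_queries):
    """Level 1: pool each semantic group's behaviors into one memory vector.

    history is a list of (sid, embedding) pairs. The result does not depend on
    a specific candidate item, so candidates with the same SIDs could share it.
    """
    groups = defaultdict(list)
    for sid, emb in history:
        groups[sid].append(emb)
    return {sid: attend(group_queries[sid], embs)[0] for sid, embs in groups.items()}

def target_interest(memory, target_emb):
    """Level 2: the target item attends across the per-group memories."""
    sids = sorted(memory)
    pooled, weights = attend(target_emb, [memory[s] for s in sids])
    return pooled, dict(zip(sids, weights))

# Toy data: two behaviors in group "a", one in group "b" (hypothetical values).
history = [("a", [1.0, 0.0]), ("a", [0.8, 0.2]), ("b", [0.0, 1.0])]
group_queries = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
memory = build_group_memory(history, group_queries)
interest, weights = target_interest(memory, [2.0, 0.0])
```

In this toy run the target points along the first axis, so group "a" receives the larger cross-group weight. The second stage scales with the number of groups rather than the number of past items, which is where the claimed savings would come from.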

Load-bearing premise

Grouping items into Semantic IDs preserves the distinctions that actually matter for target-aware user preferences rather than collapsing important differences or injecting new biases.

What would settle it

Replace the learned Semantic IDs with randomly assigned group labels on the same ultra-long sequences and measure whether recommendation accuracy and revenue lift disappear.
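A minimal sketch of how such a control could be constructed, assuming the learned item-to-SID assignment is available as a dictionary. The names `item_to_sid` and `shuffle_sid_assignment` are hypothetical. The labels are permuted rather than redrawn so that group sizes stay identical and only the semantics are destroyed.

```python
import random

def shuffle_sid_assignment(item_to_sid, seed=0):
    """Return a control mapping with the same group sizes but scrambled membership."""
    items = sorted(item_to_sid)                # fixed order for reproducibility
    labels = [item_to_sid[i] for i in items]   # the existing multiset of SIDs
    rng = random.Random(seed)
    rng.shuffle(labels)                        # break the item-to-meaning link
    return dict(zip(items, labels))

# Toy mapping: 12 items spread over 3 groups (hypothetical data).
item_to_sid = {f"item{i}": i % 3 for i in range(12)}
control = shuffle_sid_assignment(item_to_sid, seed=1)
```

Training the same model once with `item_to_sid` and once with `control` would then isolate how much of the gain depends on the groups being semantically meaningful.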

Figures

Figures reproduced from arXiv: 2605.09040 by Han Li, Hongwei Zhang, Huanjie Wang, Jiangxia Cao, Jing Yao, Junfeng Shu, Liwei Guan, Qiqiang Zhong, Yiyang Lv, Yiyu Wang, Zhaojie Liu.

**Figure 1.** Comparison of different paradigms for ULSM (ultra-long sequence modeling). (a) Item-specific Search: Online filtering for each candidate, incurring high computational cost. (b) Item-agnostic Compression: Offline distillation into static memories, lacking target-specificity. (c) UxSID: A semantic-specific path that shares compressed interest memories among items with the same SIDs.

**Figure 2.** The architecture of UxSID primarily comprises three components: a target SIDs Generator

**Figure 3.** AUC improvements (percentage points) across various sequence lengths on all datasets.

**Figure 4.** Hyper-parameters analysis of UxSID on all three datasets.

**Figure 5.** Efficacy of UxSID in interest modeling. (a) highlights the target SID-based attention

**Figure 6.** The overall system deployment pipeline of UxSID, comprising offline UxSID embedding

Original abstract

Modeling ultra-long user sequences involves a difficult trade-off between efficiency and effectiveness. While current paradigms rely on either item-specific search or item-agnostic compression, we propose UxSID, a framework exploring a third path: semantic-group shared interest memory. By utilizing Semantic IDs (SIDs) and a dual-level attention strategy, UxSID captures target-aware preferences without the heavy cost of item-specific models. This end-to-end architecture balances computational parsimony with semantic awareness, achieving state-of-the-art performance and a 0.337% revenue lift in a large-scale advertising A/B test.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

UxSID carves a practical middle path for ultra-long sequences via semantic IDs and dual attention, but the claims rest on limited visible validation.


The core idea is a third option for modeling thousands of user actions: group them into semantic IDs so you can share interest memory across similar items, then apply dual-level attention to stay target-aware without paying for full item-specific search or losing everything in crude compression. That framing is new enough to stand out from the usual two-paradigm split, and the end-to-end setup plus the reported 0.337% revenue lift in a large ad A/B test give it some real-world grounding for industrial use. The architecture sounds like it could cut compute while keeping enough signal, which is the kind of incremental win that matters in production recsys.

What is missing is any detail on how the Semantic IDs are constructed or assigned, and the abstract gives no ablations, dataset sizes, or error bars to show whether the dual attention actually avoids new biases or signal loss. If the full paper has those controls and they hold up, the efficiency claim strengthens; right now it is hard to judge the size of the advance.

This is for people building large-scale recommendation or advertising systems who already fight sequence-length trade-offs. A reader in that area could pull useful architecture ideas even if the numbers need more scrutiny. I would send it to peer review: the practical test and coherent framing are enough to justify referee time, though the experimental section will probably need expansion.

Referee Report

0 major / 2 minor

Summary. The paper proposes UxSID, a framework for modeling ultra-long user sequences in recommendation systems via semantic-group shared interest memory. It introduces Semantic IDs (SIDs) and a dual-level attention strategy to capture target-aware preferences efficiently, avoiding the costs of item-specific search while retaining semantic awareness, and reports state-of-the-art performance plus a 0.337% revenue lift in a large-scale advertising A/B test.

Significance. If the results hold, the work offers a practical third path for ultra-long sequence modeling that could improve scalability in industrial recommender systems without sacrificing semantic fidelity. The online A/B test result provides direct evidence of business impact, strengthening the case for adoption in advertising and related domains.

minor comments (2)

[Abstract] The claim of SOTA performance would be strengthened by briefly naming the offline datasets, sequence lengths, and main baselines used.
[§4] Experiments: include error bars or statistical significance tests for the reported metrics to support the SOTA and revenue-lift claims.
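One way the requested error bars could be produced for an offline AUC comparison is a paired bootstrap over evaluation examples. The sketch below is an assumption about how such a check might look, not something taken from the paper; `auc` and `paired_bootstrap_auc_gain` are hypothetical helper names, and the 95% percentile interval is one convention among several.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_bootstrap_auc_gain(labels, base_scores, new_scores, n_boot=1000, seed=0):
    """Percentile 95% interval for AUC(new) - AUC(base) over resampled examples."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    gains = []
    while len(gains) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:                         # resample lost a class; redraw
            continue
        new_auc = auc(ys, [new_scores[i] for i in idx])
        base_auc = auc(ys, [base_scores[i] for i in idx])
        gains.append(new_auc - base_auc)             # same examples for both models
    gains.sort()
    lo = gains[n_boot * 25 // 1000]
    hi = gains[n_boot * 975 // 1000 - 1]
    return lo, hi
```

An interval that excludes zero would support the claim that the gain is not resampling noise. The pairwise AUC here is quadratic in the number of examples, so a rank-based AUC would be the practical choice at industrial scale.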

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of UxSID and the recommendation for minor revision. The recognition of our semantic-group shared interest memory approach as a practical third path for ultra-long sequence modeling, along with the value of the online A/B test results, is appreciated.

Circularity Check

0 steps flagged

No significant circularity


The paper proposes UxSID as an independent architectural framework using Semantic IDs (SIDs) and dual-level attention to model ultra-long sequences via semantic-group shared memory. No equations, fitted parameters, or derivations are presented in the abstract or described structure that reduce outputs to inputs by construction. The central claim of balancing efficiency and target-aware preferences is framed as a novel third path without self-definitional loops, self-citation load-bearing premises, or renaming of known results. The architecture is presented as self-contained with external validation via A/B test revenue lift, satisfying the criteria for a non-circular proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

The abstract introduces Semantic IDs and dual-level attention as core components without detailing their construction or grounding; these function as new modeling primitives whose validity is assumed rather than derived from external benchmarks.

invented entities (2)

Semantic IDs (SIDs) · no independent evidence
purpose: To group items into semantic clusters for shared interest memory
Introduced as the basis for semantic-group modeling; no independent evidence or prior definition supplied in abstract

dual-level attention strategy · no independent evidence
purpose: To capture target-aware preferences at group and cross-group levels
Proposed as the mechanism balancing efficiency and semantic awareness; no formal definition or validation in available text

pith-pipeline@v0.9.0 · 5428 in / 1310 out tokens · 41978 ms · 2026-05-14T21:01:42.096991+00:00 · methodology



Dataset split (excerpt from the paper)

We use users' interaction histories to create video sequences sorted by timestamp and filter out users with fewer than 50 comments. For each video sequence, the last 7 videos are used for testing, the 8th to 14th last videos are used for validation, and the rest are used for training. We cap each user's interaction history at 2k. Industrial Dataset: Colle...
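The split rule in the excerpt (last 7 videos for testing, 8th to 14th last for validation, the rest for training) can be sketched as follows. The function name and parameters are hypothetical, and applying the 2k cap to the most recent training interactions is an assumed reading of the excerpt, which does not say where the cap is applied.

```python
def split_sequence(videos, n_test=7, n_val=7, max_history=2000):
    """Chronological split of one user's timestamp-sorted video sequence."""
    if len(videos) <= n_test + n_val:
        raise ValueError("sequence too short to leave any training data")
    test = videos[-n_test:]                      # last 7 videos
    val = videos[-(n_test + n_val):-n_test]      # 8th to 14th last videos
    train = videos[:-(n_test + n_val)]           # everything earlier
    train = train[-max_history:]                 # assumed: keep the most recent 2k
    return train, val, test

# Toy sequence of 100 video ids in time order (hypothetical data).
train, val, test = split_sequence(list(range(100)))
```

Because the split is purely positional, the sequence must already be sorted by timestamp, as the excerpt specifies.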