pith. machine review for the scientific record.

arxiv: 2605.06331 · v1 · submitted 2026-05-07 · 💻 cs.IR

Recognition: unknown

Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 05:45 UTC · model grok-4.3

classification 💻 cs.IR
keywords generative recommendation · semantic IDs · autoregressive generation · expressiveness limits · decoding tree · collaborative filtering · latent tokens

The pith

Autoregressive semantic ID generation in generative recommenders creates tree-structured probability correlations that block representation of simple user preference patterns that conventional collaborative filtering can capture.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Generative recommendation models build item indices from sequences of discrete semantic tokens and decode them autoregressively. This process traverses a fixed tree whose internal structure forces items near each other in the tree to receive similar probabilities for any given user. The resulting correlations make it impossible for the model to express even elementary preference distinctions that matrix factorization or other conventional collaborative filtering approaches represent without difficulty. The authors prove the limitation formally and demonstrate that it is a direct consequence of the single-tree decoding geometry rather than other modeling choices. They then show that inserting latent tokens before the semantic IDs splits the space into multiple conditioned trees, loosening the unwanted couplings and producing measurable gains on ranking metrics.
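To make the claimed coupling concrete, here is a toy sketch (ours, not the paper's code, and using Pearson correlation rather than the paper's Kendall measure) of an autoregressive scorer over a depth-2 semantic-ID tree with random weights. Because sibling leaves share the multiplicative prefix factor P(c1 | u), their probabilities co-move across users more than cross-branch pairs do:

    # Toy model: depth-2 semantic-ID tree, branching factor B, random weights.
    import numpy as np

    rng = np.random.default_rng(0)
    B, d, n_users = 4, 8, 2000                    # branching, embedding dim, users

    W1 = rng.normal(size=(d, B))                  # logits for P(c1 | u)
    W2 = rng.normal(size=(B, d, B))               # logits for P(c2 | c1, u)

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    U = rng.normal(size=(n_users, d))             # random user embeddings
    p1 = softmax(U @ W1)                          # P(c1 | u), shape (n_users, B)
    # Leaf probability P(c1, c2 | u) = P(c1 | u) * P(c2 | c1, u).
    P = np.stack([p1[:, [c1]] * softmax(U @ W2[c1]) for c1 in range(B)], axis=1)
    P = P.reshape(n_users, B * B)                 # one column per item

    corr = np.corrcoef(P.T)                       # item-item correlation across users
    pairs = [(i, j) for i in range(B * B) for j in range(i + 1, B * B)]
    sib = np.mean([corr[i, j] for i, j in pairs if i // B == j // B])
    far = np.mean([corr[i, j] for i, j in pairs if i // B != j // B])
    print(f"mean correlation, shared prefix: {sib:+.3f}")      # noticeably higher
    print(f"mean correlation, different prefix: {far:+.3f}")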

Core claim

The autoregressive traversal of a semantic-ID decoding tree induces correlations among item probabilities that track tree distance; the paper argues these correlations provably prevent the model from representing simple preference patterns that standard collaborative filtering models capture routinely.

What carries the argument

The decoding tree induced by the sequence of semantic ID tokens, whose leaf-to-root paths force probability similarity between nearby items.
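In symbols (our notation, inferred from the abstract rather than taken from the paper): writing item i's semantic ID as the token sequence (c_1^{(i)}, ..., c_T^{(i)}), the decoder assigns

    P(i \mid u) \;=\; \prod_{t=1}^{T} P\!\left(c_t^{(i)} \,\middle|\, c_{<t}^{(i)},\, u\right)

so two items whose IDs agree on their first k tokens share the common prefix factor P(c_1, ..., c_k \mid u), and any preference gap between them must be carried entirely by the remaining T − k conditionals.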

If this is right

  • Generative models cannot reliably rank items that share long common prefixes in their semantic IDs even when user history clearly distinguishes them.
  • Any pattern expressible only by independent per-item scores will be at least partially flattened by the tree geometry.
  • Conventional collaborative filtering remains strictly more expressive than unmodified autoregressive semantic-ID generators on certain elementary ranking tasks.
  • Modifications that introduce additional decoding paths can restore some of the lost distinctions without abandoning the generative formulation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same tree-coupling phenomenon may appear in any autoregressive generative system whose tokens define a fixed hierarchy over the output space.
  • Alternative indexing schemes that avoid deep shared prefixes, or that randomize the token order per user, could be compared directly against the latent-token fix.
  • The theoretical gap identified here supplies a concrete target for measuring how much of the performance difference between generative and discriminative recommenders is due to expressiveness rather than optimization.

Load-bearing premise

Item probability correlations arise primarily from the autoregressive traversal of the semantic ID tree rather than from other aspects of the model or data.

What would settle it

A controlled experiment in which a generative recommender is forced to assign sharply different probabilities to two sibling items under a user who strongly prefers one over the other, with no other model changes.
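One way to operationalize that test (our protocol sketch, not the paper's): fit an unconstrained toy autoregressive model to a single user who exclusively interacts with item A, then measure the probability gap to A's sibling B, which shares the entire prefix. When the token predictors are free per user, as below, the gap approaches 1; if the same probe stays small on a real GR architecture, the bottleneck is training or parameter sharing rather than the tree itself:

    # Sibling-separation probe on an unconstrained toy model (PyTorch).
    import torch

    torch.manual_seed(0)
    Bf, d = 4, 16                                    # branching factor, hidden size
    A, Bsib = (0, 0), (0, 1)                         # sibling leaves under prefix c1 = 0

    u = torch.randn(d)                               # fixed user embedding
    W1 = torch.zeros(d, Bf, requires_grad=True)      # logits for P(c1 | u)
    W2 = torch.zeros(Bf, d, Bf, requires_grad=True)  # logits for P(c2 | c1, u)
    opt = torch.optim.Adam([W1, W2], lr=0.1)

    for _ in range(200):                             # the user always picks item A
        logp1 = torch.log_softmax(u @ W1, dim=-1)
        logp2 = torch.log_softmax(u @ W2[A[0]], dim=-1)
        loss = -(logp1[A[0]] + logp2[A[1]])          # negative log-likelihood of A
        opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():
        p1 = torch.softmax(u @ W1, dim=-1)
        p2 = torch.softmax(u @ W2[A[0]], dim=-1)
        pA = (p1[A[0]] * p2[A[1]]).item()
        pB = (p1[Bsib[0]] * p2[Bsib[1]]).item()
        print(f"P(A|u)={pA:.3f}  P(B|u)={pB:.3f}  gap={pA - pB:.3f}")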

Figures

Figures reproduced from arXiv: 2605.06331 by Clark Mingxuan Ju, Eduardo Escoto, Haven Kim, Julian McAuley, Neil Shah, Yupeng Hou.

Figure 1: Illustration of the expressive limitations induced by autoregressive SID generation. (a)…
Figure 2: Correlation between tree distance and item generation probability similarity.
Figure 3: Overall framework of Latte.
Figure 4: Analysis of model performance w.r.t. the number of latent tokens M′.
Figure 5: Distribution of modality orders correspond…
read the original abstract

Generative recommendation (GR) models generate items by autoregressively producing a sequence of discrete tokens that jointly index the target item. However, this autoregressive generation process also induces a structured decoding space whose impact on model expressiveness remains underexplored. Specifically, token-by-token generation can be viewed as traversing a decoding tree induced by semantic ID tokens, where leaf nodes correspond to candidate items. We observe that the item probabilities produced by GR models are strongly correlated with this tree structure: items that are close in the tree tend to receive similar probabilities for any given user, making it difficult to distinguish among them based on user-specific preferences. We further show theoretically that such structural correlations prevent GR models from representing even simple patterns that can be well captured by conventional collaborative filtering models. To mitigate this issue, we propose Latte, a simple modification that injects a latent token before each semantic ID, reshaping the decoding space from a single tree into multiple latent-token-conditioned trees. This design creates multiple paths with varying tree distances between items, relaxing tree-induced probability coupling and yielding an average of 3.45% relative improvement on NDCG@10. Our code is available at https://github.com/hyp1231/Latte.
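As we read the abstract, Latte prepends a latent token z with K possible values and marginalizes over it, so the single decoding tree becomes K latent-conditioned trees. A structural sketch with made-up shapes (the released code at the link above is the authoritative implementation):

    # Hypothetical shape-level sketch of a latent-token factorization.
    import numpy as np

    rng = np.random.default_rng(1)
    K, B, d, n_users = 3, 4, 8, 1000              # latent values, branching, dim, users

    Wz = rng.normal(size=(d, K))                  # logits for P(z | u)
    W1 = rng.normal(size=(K, d, B))               # logits for P(c1 | z, u)
    W2 = rng.normal(size=(K, B, d, B))            # logits for P(c2 | z, c1, u)

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    U = rng.normal(size=(n_users, d))
    pz = softmax(U @ Wz)                          # P(z | u), shape (n_users, K)
    P = np.zeros((n_users, B * B))                # P(item | u), marginal over z
    for z in range(K):
        p1 = softmax(U @ W1[z])                   # P(c1 | z, u)
        for c1 in range(B):
            p2 = softmax(U @ W2[z, c1])           # P(c2 | z, c1, u)
            P[:, c1 * B:(c1 + 1) * B] += pz[:, [z]] * p1[:, [c1]] * p2
    assert np.allclose(P.sum(axis=1), 1.0)        # still a distribution over items
    # Siblings can now route their mass through different values of z, so their
    # scores are no longer tied to one shared prefix factor.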

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it: the pith above is the substance; this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that autoregressive semantic ID generation in generative recommendation (GR) models induces a decoding tree whose structure creates strong correlations in item probabilities (nearby leaves receive similar scores for a given user), limiting expressiveness relative to conventional collaborative filtering. It provides a theoretical argument that this prevents representation of simple preference patterns, and proposes Latte, which prepends a latent token to each semantic ID to reshape the space into multiple conditioned trees. This yields an average 3.45% relative NDCG@10 gain; code is released.

Significance. The empirical improvement and public code release are concrete strengths that could aid reproducibility and follow-up work. If the theoretical limit is correctly identified, the result would usefully clarify a structural constraint on current GR approaches; however, the central expressiveness claim appears to rest on an assumption about fixed conditionals that does not hold in general.

major comments (2)
  1. [§3] §3 (theoretical argument): the claim that tree traversal imposes an inherent barrier to representing patterns that CF can capture is incorrect. For any fixed semantic-ID tree and any target distribution p(item|u), one can exactly recover p by setting each prefix probability to the sum of p over the subtree and each conditional split to the normalized subtree masses. A sufficiently expressive network (user embedding fed to every token predictor) realizes these conditionals without restriction, so the AR structure itself does not create an expressiveness limit; observed correlations are therefore attributable to training or semantic sibling similarity rather than representational incapacity. This directly undercuts the motivation for Latte. (A sketch verifying this construction follows the report below.)
  2. [§5] §5 (experiments): the reported 3.45% relative NDCG@10 gain is presented without per-dataset breakdowns, variance estimates, or statistical significance tests against strong CF baselines that already capture the simple patterns the theory claims GR cannot represent. Without these controls it is unclear whether the improvement stems from relaxed tree coupling or from the added capacity of the latent tokens.
minor comments (2)
  1. [§4] Notation: the distinction between 'semantic ID tokens' and the newly introduced 'latent token' should be made explicit in the first paragraph of §4 and in all equations defining the modified factorization.
  2. [Figure 2] Figure 2: the diagram of the Latte decoding space would be clearer if it explicitly labeled the multiple parallel trees and indicated how user embeddings are shared across latent-token branches.
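Major comment 1's construction is easy to verify numerically. A minimal sketch (ours, not from the paper): for an arbitrary target distribution over the leaves of a fixed tree, setting every conditional to the normalized mass of its child subtrees reproduces the target exactly, which is the sense in which the tree alone imposes no capacity limit:

    # Exact recovery of an arbitrary p(item | u) on a fixed decoding tree.
    import numpy as np

    rng = np.random.default_rng(2)
    B, T = 4, 3                                   # branching factor, ID length (64 leaves)
    target = rng.dirichlet(np.ones(B ** T))       # arbitrary p(item | u) for one user
    p = target.reshape((B,) * T)                  # index by the token path (c1, ..., cT)

    def conditional(prefix):
        """P(next token | prefix, u) = normalized mass of each child subtree."""
        sub = p[prefix] if prefix else p          # slice down to the prefix's subtree
        child_mass = sub.reshape(B, -1).sum(axis=1)
        return child_mass / child_mass.sum()

    # Rebuild every leaf probability as the product of conditionals along its path.
    recon = np.empty(B ** T)
    for leaf in range(B ** T):
        path = [(leaf // B ** (T - 1 - t)) % B for t in range(T)]
        prob, prefix = 1.0, ()
        for tok in path:
            prob *= conditional(prefix)[tok]
            prefix += (tok,)
        recon[leaf] = prob

    assert np.allclose(recon, target)             # exact, for any target distribution
    print("max abs error:", np.abs(recon - target).max())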

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below, clarifying our theoretical claims and committing to improved experimental reporting.

read point-by-point responses
  1. Referee: [§3] §3 (theoretical argument): the claim that tree traversal imposes an inherent barrier to representing patterns that CF can capture is incorrect. For any fixed semantic-ID tree and any target distribution p(item|u), one can exactly recover p by setting each prefix probability to the sum of p over the subtree and each conditional split to the normalized subtree masses. A sufficiently expressive network (user embedding fed to every token predictor) realizes these conditionals without restriction, so the AR structure itself does not create an expressiveness limit; observed correlations are therefore attributable to training or semantic sibling similarity rather than representational incapacity. This directly undercuts the motivation for Latte.

    Authors: We thank the referee for this precise observation. Our theoretical analysis in §3 highlights that the autoregressive decoding tree, when using fixed semantic ID hierarchies, induces strong probability correlations among nearby leaves for a given user. While we acknowledge that an arbitrarily expressive network could in principle set conditionals to realize any target distribution, our argument and empirical findings focus on the practical expressiveness limits under standard GR training regimes, where parameter sharing across token predictors and the semantic structure of IDs make it difficult to break these correlations for simple preference patterns that CF models capture easily. We will revise §3 to explicitly distinguish absolute theoretical capacity from the inductive bias and practical limitations observed in trained models, thereby refining (but not removing) the motivation for Latte. revision: partial

  2. Referee: [§5] §5 (experiments): the reported 3.45% relative NDCG@10 gain is presented without per-dataset breakdowns, variance estimates, or statistical significance tests against strong CF baselines that already capture the simple patterns the theory claims GR cannot represent. Without these controls it is unclear whether the improvement stems from relaxed tree coupling or from the added capacity of the latent tokens.

    Authors: We agree that additional experimental details are warranted. In the revised manuscript we will report per-dataset NDCG@10 scores, standard deviations across multiple random seeds, and statistical significance tests against the reported CF baselines as well as additional strong collaborative filtering methods. These additions will help isolate the benefit of the latent-token mechanism from capacity increases. revision: yes

Circularity Check

0 steps flagged

No circularity; theoretical expressiveness argument is independent of fitted quantities

full rationale

The paper's core claim—that autoregressive traversal of the semantic-ID tree induces structural correlations preventing representation of patterns capturable by conventional CF—is advanced via an empirical observation of probability correlations plus a separate theoretical argument. No step reduces a derived quantity to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation, self-definition, or renamed known result. The proposed Latte modification and its reported gains are likewise presented as an independent architectural change rather than a re-expression of the same inputs. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that autoregressive token generation induces a fixed decoding tree whose structure governs probability similarity; the latent token is an invented entity introduced to alter that structure.

axioms (1)
  • domain assumption Token-by-token generation traverses a decoding tree induced by semantic ID tokens, with leaf nodes corresponding to candidate items.
    Explicitly stated in the abstract as the view of the autoregressive process.
invented entities (1)
  • Latent token · no independent evidence
    purpose: Injected before each semantic ID to reshape the single decoding tree into multiple latent-conditioned trees.
    Proposed in the abstract as the mechanism to relax tree-induced probability coupling.

pith-pipeline@v0.9.0 · 9472 in / 1270 out tokens · 79723 ms · 2026-05-08T05:45:59.022229+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

67 extracted references · 23 canonical work pages · 5 internal anchors

  1. [1]

    How expressive are graph neural networks in recommendation?

    Xuheng Cai, Lianghao Xia, Xubin Ren, and Chao Huang. How expressive are graph neural networks in recommendation? In CIKM, pages 173–182, 2023

  2. [2]

    Sui confini della probabilità

    Francesco Paolo Cantelli. Sui confini della probabilità. In Atti del Congresso Internazionale dei Matematici: Bologna del 3 al 10 de settembre di 1928, pages 47–60, 1929

  3. [3]

    Onesearch: A preliminary expl...

    Ben Chen, Xian Guo, Siyuan Wang, Zihan Liang, Yue Lv, Yufei Ma, Xinlong Xiao, Bowen Xue, Xuxin Zhang, Ying Yang, Huangyu Dai, Xing Xu, Tong Zhao, Mingcan Peng, Xiaoyang Zheng, Chao Wang, Qihang Zhao, Zhixin Zhai, Yang Zhao, Bochao Liu, Jingshan Lv, Xiao Liang, Yuqing Ding, Jing Chen, Chenyi Lei, Wenwu Ou, Han Li, and Kun Gai. Onesearch: A preliminary expl... arXiv preprint arXiv:2509.03236

  4. [4]

    Recsys challenge 2018: automatic music playlist continuation

    Ching-Wei Chen, Paul Lamere, Markus Schedl, and Hamed Zamani. Recsys challenge 2018: automatic music playlist continuation. In RecSys, pages 527–528, 2018

  5. [5]

    OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment

    Jiaxin Deng, Shiyao Wang, Kuo Cai, Lejian Ren, Qigen Hu, Weifeng Ding, Qiang Luo, and Guorui Zhou. Onerec: Unifying retrieve and rank with generative recommender and iterative preference alignment. arXiv preprint arXiv:2502.18965, 2025

  6. [6]

    How well does generative recommendation generalize?

    Yijie Ding, Zitian Guo, Jiacheng Li, Letian Peng, Shuai Shao, Wei Shao, Xiaoqiang Luo, Luke Simon, Jingbo Shang, Julian McAuley, and Yupeng Hou. How well does generative recommendation generalize? arXiv preprint arXiv:2603.19809, 2026

  7. [7]

    Inductive generative recommendation via retrieval-based speculation

    Yijie Ding, Jiacheng Li, Julian McAuley, and Yupeng Hou. Inductive generative recommendation via retrieval-based speculation. In AAAI, 2026

  8. [8]

    Talkplay: Multimodal music recommendation with large language models

    Seungheon Doh, Keunwoo Choi, and Juhan Nam. Talkplay: Multimodal music recommendation with large language models. arXiv preprint arXiv:2502.13713, 2025

  9. [9]

    TalkPlay-Tools: Conversational music recommendation with LLM tool calling

    Seungheon Doh, Keunwoo Choi, and Juhan Nam. Talkplay-tools: Conversational music recommendation with llm tool calling. arXiv preprint arXiv:2510.01698, 2025

  10. [10]

    Plum: Adapting pre-trained language models for industrial-scale generative recommendations

    Ruining He, Lukasz Heldt, Lichan Hong, Raghunandan H. Keshavan, Shifan Mao, Nikhil Mehta, Zhengyang Su, Alicia Tsai, Yueqi Wang, Shao-Chuan Wang, Xinyang Yi, Lexi Baugher, Baykal Cakici, Ed H. Chi, Cristos Goodrow, Ningren Han, He Ma, Rómer Rosales, Abby Van Soest, Devansh Tandon, Su-Lin Wu, Weilong Yang, and Yilin Zheng. PLUM: adapting pre-trained langu... arXiv preprint arXiv:2510.07784, 2025

  11. [11]

    Session-based recommendations with recurrent neural networks

    Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. Session-based recommendations with recurrent neural networks. In ICLR, 2016

  12. [12]

    Learning vector-quantized item representation for transferable sequential recommenders

    Yupeng Hou, Zhankui He, Julian McAuley, and Wayne Xin Zhao. Learning vector-quantized item representation for transferable sequential recommenders. In WWW, pages 1162–1171, 2023

  13. [13]

    Bridging Language and Items for Retrieval and Recommendation: Benchmarking LLMs as Semantic Encoders

    Yupeng Hou, Jiacheng Li, Zhankui He, An Yan, Xiusi Chen, and Julian McAuley. Bridging language and items for retrieval and recommendation. arXiv preprint arXiv:2403.03952, 2024

  14. [14]

    Generating long semantic ids in parallel for recommendation

    Yupeng Hou, Jiacheng Li, Ashley Shin, Jinsung Jeon, Abhishek Santhanam, Wei Shao, Kaveh Hassani, Ning Yao, and Julian McAuley. Generating long semantic ids in parallel for recommendation. In KDD, 2025

  15. [15]

    Towards universal sequence representation learning for recommender systems

    Yupeng Hou, Shanlei Mu, Wayne Xin Zhao, Yaliang Li, Bolin Ding, and Ji-Rong Wen. Towards universal sequence representation learning for recommender systems. In KDD, pages 585–593, 2022

  16. [16]

    ActionPiece: Contextually tokenizing action sequences for generative recommendation

    Yupeng Hou, Jianmo Ni, Zhankui He, Noveen Sachdeva, Wang-Cheng Kang, Ed H. Chi, Julian McAuley, and Derek Zhiyuan Cheng. ActionPiece: Contextually tokenizing action sequences for generative recommendation. In ICML, 2025

  17. [17]

    How to index item ids for recommendation foundation models

    Wenyue Hua, Shuyuan Xu, Yingqiang Ge, and Yongfeng Zhang. How to index item ids for recommendation foundation models. In SIGIR-AP, 2023

  18. [18]

    Language models as semantic indexers

    Bowen Jin, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, and Xianfeng Tang. Language models as semantic indexers. In ICML, 2024

  19. [19]

    Generative recommendation with semantic ids: A practitioner’s handbook

    Clark Mingxuan Ju, Liam Collins, Leonardo Neves, Bhuvesh Kumar, Louis Yufeng Wang, Tong Zhao, and Neil Shah. Generative recommendation with semantic ids: A practitioner’s handbook. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, pages 6420–6425, 2025

  20. [20]

    Semantic IDs for Recommender Systems at Snapchat: Use Cases, Technical Challenges, and Design Choices

    Clark Mingxuan Ju, Tong Zhao, Leonardo Neves, Liam Collins, Bhuvesh Kumar, Jiwen Ren, Lili Zhang, Wenfeng Zhuo, Vincent Zhang, Xiao Bai, et al. Semantic ids for recommender systems at snapchat: Use cases, technical challenges, and design choices. arXiv preprint arXiv:2604.03949, 2026

  21. [21]

    Self-attentive sequential recommendation

    Wang-Cheng Kang and Julian J. McAuley. Self-attentive sequential recommendation. In ICDM, 2018

  22. [22]

    A new measure of rank correlation

    Maurice G Kendall. A new measure of rank correlation. Biometrika, 30(1-2):81–93, 1938

  23. [23]

    Fusid: Modality-fused semantic ids for generative music recommendation

    Haven Kim, Yupeng Hou, and Julian McAuley. Fusid: Modality-fused semantic ids for generative music recommendation. arXiv preprint arXiv:2601.08764, 2026

  24. [24]

    Text is all you need: Learning language representations for sequential recommendation

    Jiacheng Li, Ming Wang, Jin Li, Jinmiao Fu, Xin Shen, Jingbo Shang, and Julian McAuley. Text is all you need: Learning language representations for sequential recommendation. In KDD, 2023

  25. [25]

    Generative recommender with end-to-end learnable item tokenization

    Enze Liu, Bowen Zheng, Cheng Ling, Lantao Hu, Han Li, and Wayne Xin Zhao. Generative recommender with end-to-end learnable item tokenization. In SIGIR, pages 729–739, 2025

  26. [26]

    Understanding generative recommendation with semantic ids from a model-scaling view

    Jingzhe Liu, Liam Collins, Jiliang Tang, Tong Zhao, Neil Shah, and Clark Mingxuan Ju. Understanding generative recommendation with semantic ids from a model-scaling view. arXiv preprint arXiv:2509.25522, 2025

  27. [27]

    Multi-behavior generative recommendation

    Zihan Liu, Yupeng Hou, and Julian McAuley. Multi-behavior generative recommendation. In CIKM, 2024

  28. [28]

    Qarm: Quantitative alignment multi-modal recommendation at kuaishou

    Xinchen Luo, Jiangxia Cao, Tianyu Sun, Jinkai Yu, Rui Huang, Wei Yuan, Hezheng Lin, Yichen Zheng, Shiyao Wang, Qigen Hu, et al. Qarm: Quantitative alignment multi-modal recommendation at kuaishou. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management, pages 5915–5922, 2025

  29. [29]

    In defense of dual-encoders for neural ranking

    Aditya Menon, Sadeep Jayasumana, Ankit Singh Rawat, Seungyeon Kim, Sashank Reddi, and Sanjiv Kumar. In defense of dual-encoders for neural ranking. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato, editors, Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine...

  30. [30]

    Efficient Estimation of Word Representations in Vector Space

    Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013

  31. [31]

    Weisfeiler and leman go neural: Higher-order graph neural networks

    Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and leman go neural: Higher-order graph neural networks, 2021

  32. [32]

    Curse of “low” dimensionality in recommender systems

    Naoto Ohsaka and Riku Togashi. Curse of “low” dimensionality in recommender systems. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’23, pages 537–547. ACM, July 2023

  33. [33]

    Contrastive learning for representation degeneration problem in sequential recommendation

    Ruihong Qiu, Zi Huang, Hongzhi Yin, and Zijian Wang. Contrastive learning for representation degeneration problem in sequential recommendation. In WSDM, 2022

  34. [34]

    Recommender systems with generative retrieval

    Shashank Rajput, Nikhil Mehta, Anima Singh, Raghunandan Hulikal Keshavan, Trung Vu, Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Q. Tran, Jonah Samost, Maciej Kula, Ed H. Chi, and Maheswaran Sathiamoorthy. Recommender systems with generative retrieval. In NeurIPS, 2023

  35. [35]

    The curse of dense low-dimensional information retrieval for large index sizes

    Nils Reimers and Iryna Gurevych. The curse of dense low-dimensional information retrieval for large index sizes. In ACL, pages 605–611, 2021

  36. [36]

    BPR: bayesian personalized ranking from implicit feedback

    Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. BPR: bayesian personalized ranking from implicit feedback. In UAI, pages 452–461, 2009

  37. [37]

    Item-based collaborative filtering recommendation algorithms

    Badrul Munir Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. Item-based collaborative filtering recommendation algorithms. In WWW, pages 285–295, 2001

  38. [38]

    How powerful is graph convolution for recommendation?

    Yifei Shen, Yongji Wu, Yao Zhang, Caihua Shan, Jun Zhang, B Khaled Letaief, and Dongsheng Li. How powerful is graph convolution for recommendation? In CIKM, pages 1619–1629, 2021

  39. [39]

    Bert4rec: Sequential recommendation with bidirectional encoder representations from transformer

    Fei Sun, Jun Liu, Jian Wu, Changhua Pei, Xiao Lin, Wenwu Ou, and Peng Jiang. Bert4rec: Sequential recommendation with bidirectional encoder representations from transformer. In CIKM, 2019

  40. [40]

    Transformer memory as a differentiable search index

    Yi Tay, Vinh Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Prakash Gupta, Tal Schuster, William W. Cohen, and Donald Metzler. Transformer memory as a differentiable search index. In NeurIPS, 2022

  41. [41]

    Pit: A dynamic personalized item tokenizer for end-to-end generative recommendation

    Huanjie Wang, Xinchen Luo, Honghui Bao, Zhang Zixing, Lejian Ren, Yunfan Wu, Hongwei Zhang, Liwei Guan, and Guang Chen. Pit: A dynamic personalized item tokenizer for end-to-end generative recommendation. arXiv preprint arXiv:2602.08530, 2026

  42. [42]

    Learnable tokenizer for llm-based generative recommendation

    Wenjie Wang, Honghui Bao, Xinyu Lin, Jizhi Zhang, Yongqi Li, Fuli Feng, See-Kiong Ng, and Tat-Seng Chua. Learnable tokenizer for llm-based generative recommendation. In CIKM, 2024

  43. [43]

    Eager: Two-stream generative recommender with behavior-semantic collaboration

    Ye Wang, Jiahao Xun, Minjie Hong, Jieming Zhu, Tao Jin, Wang Lin, Haoyuan Li, Linjun Li, Yan Xia, Zhou Zhao, and Zhenhua Dong. Eager: Two-stream generative recommender with behavior-semantic collaboration. In KDD, pages 3245–3254, 2024

  44. [44]

    Enhanced generative recommendation via content and collaboration integration

    Yidan Wang, Zhaochun Ren, Weiwei Sun, Jiyuan Yang, Zhixiang Liang, Xin Chen, Ruobing Xie, Su Yan, Xu Zhang, Pengjie Ren, Zhumin Chen, and Xin Xin. Enhanced generative recommendation via content and collaboration integration. arXiv preprint arXiv:2403.18480, 2024

  45. [45]

    Empowering large language model for sequential recommendation via multimodal embeddings and semantic ids

    Yuhao Wang, Junwei Pan, Xinhang Li, Maolin Wang, Yuan Wang, Yue Liu, Dapeng Liu, Jie Jiang, and Xiangyu Zhao. Empowering large language model for sequential recommendation via multimodal embeddings and semantic ids. In CIKM, pages 3209–3219, 2025

  46. [46]

    A neural corpus indexer for document retrieval

    Yujing Wang, Yingyan Hou, Haonan Wang, Ziming Miao, Shibin Wu, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, and Mao Yang. A neural corpus indexer for document retrieval. In NeurIPS, 2022

  47. [47]

    Oneloc: Geo-aware generative recommender systems for local life service

    Zhipeng Wei, Kuo Cai, Junda She, Jie Chen, Minghao Chen, Yang Zeng, Qiang Luo, Wencong Zeng, Ruiming Tang, Kun Gai, et al. Oneloc: Geo-aware generative recommender systems for local life service. arXiv preprint arXiv:2508.14646, 2025

  48. [48]

    On the theoretical limitations of embedding-based retrieval

    Orion Weller, Michael Boratko, Iftekhar Naim, and Jinhyuk Lee. On the theoretical limitations of embedding-based retrieval, 2025

  49. [49]

    Contrastive learning for sequential recommendation

    Xu Xie, Fei Sun, Zhaoyang Liu, Shiwen Wu, Jinyang Gao, Jiandong Zhang, Bolin Ding, and Bin Cui. Contrastive learning for sequential recommendation. In ICDE, 2022

  50. [50]

    Agentictagger: Structured item representation for recommendation with llm agents

    Zhouhang Xie, Bo Peng, Zhankui He, Ziqi Chen, Alice Han, Isabella Ye, Benjamin Coleman, Noveen Sachdeva, Fernando Pereira, Julian McAuley, et al. Agentictagger: Structured item representation for recommendation with llm agents. arXiv preprint arXiv:2602.05945, 2026

  51. [51]

    How powerful are graph neural networks?

    Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks?, 2019

  52. [52]

    Mmq: Multimodal mixture-of-quantization tokenization for semantic id generation and user behavioral adaptation

    Yi Xu, Moyu Zhang, Chenxuan Li, Zhihao Liao, Haibo Xing, Hao Deng, Jinxin Hu, Yu Zhang, Xiaoyi Zeng, and Jing Zhang. Mmq: Multimodal mixture-of-quantization tokenization for semantic id generation and user behavioral adaptation. arXiv preprint arXiv:2508.15281, 2025

  53. [53]

    Qwen3 Technical Report

    An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388, 2025

  54. [54]

    Unifying generative and dense retrieval for sequential recommendation

    Liu Yang, Fabian Paischer, Kaveh Hassani, Jiacheng Li, Shuai Shao, Zhang Gabriel Li, Yun He, Xue Feng, Nima Noorshams, Sem Park, Bo Long, Robert D Nowak, Xiaoli Gao, and Hamid Eghbalzadeh. Unifying generative and dense retrieval for sequential recommendation. arXiv preprint arXiv:2411.18814, 2024

  55. [55]

    Das: Dual-aligned semantic ids empowered industrial recommender system

    Wencai Ye, Mingjie Sun, Shaoyun Shi, Peng Wang, Wenjin Wu, and Peng Jiang. Das: Dual-aligned semantic ids empowered industrial recommender system. In CIKM, pages 6217–6224, 2025

  56. [56]

    Multimodal quantitative language for generative recommendation

    Jianyang Zhai, Zi-Feng Mai, Chang-Dong Wang, Feidiao Yang, Xiawu Zheng, Hui Li, and Yonghong Tian. Multimodal quantitative language for generative recommendation. arXiv preprint arXiv:2504.05314, 2025

  57. [57]

    Actions speak louder than words: Trillion-parameter sequential transducers for generative recommendations

    Jiaqi Zhai, Lucy Liao, Xing Liu, Yueming Wang, Rui Li, Xuan Cao, Leon Gao, Zhaojie Gong, Fangda Gu, Michael He, Yinghai Lu, and Yu Shi. Actions speak louder than words: Trillion-parameter sequential transducers for generative recommendations. In ICML, 2024

  58. [58]

    Multi-aspect cross-modal quantization for generative recommendation

    Fuwei Zhang, Xiaoyu Liu, Dongbo Xi, Jishen Yin, Huan Chen, Peng Yan, Fuzhen Zhuang, and Zhao Zhang. Multi-aspect cross-modal quantization for generative recommendation, 2025

  59. [59]

    Purely semantic indexing for llm-based generative recommendation and retrieval

    Ruohan Zhang, Jiacheng Li, Julian McAuley, and Yupeng Hou. Purely semantic indexing for LLM-based generative recommendation and retrieval. arXiv preprint arXiv:2509.16446, 2025

  60. [60]

    Feature-level deeper self-attention network for sequential recommendation

    Tingting Zhang, Pengpeng Zhao, Yanchi Liu, Victor S. Sheng, Jiajie Xu, Deqing Wang, Guanfeng Liu, and Xiaofang Zhou. Feature-level deeper self-attention network for sequential recommendation. In IJCAI, 2019

  61. [61]

    Adapting large language models by integrating collaborative semantics for recommendation

    Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, and Ji-Rong Wen. Adapting large language models by integrating collaborative semantics for recommendation. In ICDE, 2024

  62. [62]

    Pre-training generative recommender with multi-identifier item tokenization

    Bowen Zheng, Enze Liu, Zhongfu Chen, Zhongrui Ma, Yue Wang, Wayne Xin Zhao, and Ji-Rong Wen. Pre-training generative recommender with multi-identifier item tokenization. arXiv preprint arXiv:2504.04400, 2025

  63. [63]

    Pctx: Tokenizing personalized context for generative recommendation

    Qiyong Zhong, Jiajie Su, Yunshan Ma, Julian McAuley, and Yupeng Hou. Pctx: Tokenizing personalized context for generative recommendation. arXiv preprint arXiv:2510.21276, 2025

  64. [64]

    S3-rec: Self-supervised learning for sequential recommendation with mutual information maximization

    Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, and Ji-Rong Wen. S3-rec: Self-supervised learning for sequential recommendation with mutual information maximization. In CIKM, 2020

  65. [65]

    Filter-enhanced MLP is all you need for sequential recommendation

    Kun Zhou, Hui Yu, Wayne Xin Zhao, and Ji-Rong Wen. Filter-enhanced MLP is all you need for sequential recommendation. In TheWebConf, 2022

  66. [66]

    Cost: Contrastive quantization based semantic tokenization for generative recommendation

    Jieming Zhu, Mengqun Jin, Qijiong Liu, Zexuan Qiu, Zhenhua Dong, and Xiu Li. Cost: Contrastive quantization based semantic tokenization for generative recommendation. In RecSys, 2024

  67. [67]

    Beyond unimodal boundaries: Generative recommendation with multimodal semantics

    Jing Zhu, Mingxuan Ju, Yozen Liu, Danai Koutra, Neil Shah, and Tong Zhao. Beyond unimodal boundaries: Generative recommendation with multimodal semantics. arXiv preprint arXiv:2503.23333, 2025