Echoes in Filter Bubble: Diagnosing and Curing Popularity Bias in Generative Recommenders

Bangguo Zhu; Chengqi Zhang; Hao Chen; Jun Yin; Peng Huo; Ruochen Liu; Senzhang Wang; Shirui Pan

arxiv: 2605.16825 · v3 · pith:M5WLHRDPnew · submitted 2026-05-16 · 💻 cs.IR · cs.AI

Echoes in Filter Bubble: Diagnosing and Curing Popularity Bias in Generative Recommenders

Jun Yin , Bangguo Zhu , Peng Huo , Ruochen Liu , Hao Chen , Senzhang Wang , Shirui Pan , Chengqi Zhang This is my paper

Pith reviewed 2026-06-30 19:40 UTC · model grok-4.3

classification 💻 cs.IR cs.AI

keywords popularity biasgenerative recommendersdebiasingrecommendation systemstokenizationoptimizationfairness in recommendations

0 comments

The pith

Generative recommenders develop popularity bias from token-level optimization flaws and uniform item tokenization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that generative recommenders, which predict items through a single end-to-end generative model, pick up severe popularity bias because their training objective operates at the token level without distinguishing item frequency and because items are encoded as tokens without regard to popularity differences. The authors trace this bias through theoretical analysis and introduce Ghost, which adds asymmetric unlikelihood optimization to penalize popular items more during training and skeleton-founded tokenization to create more distinct item representations. Experiments on three datasets show Ghost lowers popularity bias and improves fairness metrics compared with prior debiasing attempts while accepting only small drops in overall recommendation accuracy. Readers would care if the diagnosis holds because popularity bias in generative systems can systematically hide less popular items from users.

Core claim

The severe popularity bias in generative recommenders emerges from the confluence of a token-level optimization flaw and the undifferentiated property of item tokenization. Ghost addresses these by using asymmetric unlikelihood optimization together with skeleton-founded tokenization, which substantially alleviates the bias and promotes fairer recommendations across multiple datasets while incurring only slight degradation to overall recommendation utility.

What carries the argument

Asymmetric unlikelihood optimization paired with skeleton-founded tokenization inside the Ghost generative recommender, which corrects the identified token-level flaw and undifferentiated encoding.

If this is right

Ghost produces measurably fairer item exposure distributions than prior debiasing methods on the same generative architecture.
The fixes incur only minor losses in standard ranking metrics such as recall or NDCG.
The approach works across three different recommendation datasets against multiple strong baselines.
Fairness gains come directly from changing the optimization and tokenization steps rather than post-hoc re-ranking.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar token-level optimization issues may appear in other generative models that output ranked lists, such as language-model-based search systems.
The skeleton-founded tokenization idea could be tested on non-recommendation generative tasks where frequency bias distorts outputs.
If the two flaws prove general, existing generative recommenders could be patched without full retraining by swapping only the optimization and tokenization modules.

Load-bearing premise

That the token-level optimization flaw and undifferentiated item tokenization are the main causes of popularity bias rather than other aspects of the generative framework.

What would settle it

A controlled run in which the two proposed fixes are applied yet popularity bias metrics remain unchanged or worsen.

Figures

Figures reproduced from arXiv: 2605.16825 by Bangguo Zhu, Chengqi Zhang, Hao Chen, Jun Yin, Peng Huo, Ruochen Liu, Senzhang Wang, Shirui Pan.

**Figure 1.** Figure 1: a). Comparison of Hit-Rate@10 (i.e., HR@10) between head and tail items. b). Comparison between the number of head and tail items in the recommendation list provided by three GRs. c). Tendency of HR@10 as the backbone parameters of LC-Rec scaling up. a). Trade-off between overall performance and tail performance b). Dilemma between recommendation performance and fairness [PITH_FULL_IMAGE:figures/full_fig_… view at source ↗

**Figure 2.** Figure 2: Limitations of current popularity debiasing methods on GRs. [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of the Ghost model. First, textual representations are encoded based on item features. After categorizing items into head and tail sets, SKT [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Analysis of SID lengths, including head length [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Numbers of head and tail items in the recommendation results provided by [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

**Figure 6.** Figure 6: Tendency of Ghost performance on Ins dataset, under different AUO weights [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗

**Figure 7.** Figure 7: Tendency of Ghost performance on Ins dataset, under different undesired collection sizes. The [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗

**Figure 8.** Figure 8: Long tail distribution of the item popularity. [PITH_FULL_IMAGE:figures/full_fig_p015_8.png] view at source ↗

**Figure 9.** Figure 9: Number of tail items that inherit SID prefix from the same head items. To provide a more fine-grained understanding of where the performance improvements originate, we analyze the recommendation results across different popularity segments [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗

**Figure 10.** Figure 10: Performance comparison of each equal-sized grouping on Ins dataset. [PITH_FULL_IMAGE:figures/full_fig_p018_10.png] view at source ↗

**Figure 11.** Figure 11: Tendency of Ghost performance on Ins dataset, under different learning rates. The [PITH_FULL_IMAGE:figures/full_fig_p019_11.png] view at source ↗

**Figure 12.** Figure 12: Tendency of Ghost performance on Games dataset, under different optimization epochs. The [PITH_FULL_IMAGE:figures/full_fig_p019_12.png] view at source ↗

read the original abstract

Recently, Generative Recommenders (GRs), characterized by a unified end-to-end framework, have exhibited astonishing potential in transforming the recommendation paradigm. Despite their effectiveness, we recognize that GRs are still susceptible to the long-standing issue of popularity bias that has pervaded the recommendation community. Although a few studies have attempted to extend traditional debiasing methods to GRs, their effectiveness is marginal, and the fundamental reason why GRs suffer from popularity bias remains under-explored. To bridge this gap, this study focuses on two core aspects in GRs: the optimization of generative framework and the item tokenization based on semantic index. Based on theoretical analyses, we identify that the severe popularity bias emerges from the confluence of a token-level optimization flaw and the undifferentiated property of item tokenization. Accordingly, this study develops a novel generative recommender system, called Ghost, by designing the asymmetric unlikelihood optimization and the skeleton-founded tokenization. Extensive empirical evaluations across three datasets, alongside multiple SOTA baselines, reveal that Ghost substantially alleviates popularity bias and promotes fairer recommendations, while incurring slight degradation to the overall recommendation utility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper pins popularity bias in generative recommenders on token-level optimization and undifferentiated tokenization, then shows Ghost's two fixes cut the bias with only slight utility loss on three datasets.

read the letter

The main point from this paper is that generative recommenders suffer from popularity bias because of a flaw in how they optimize at the token level and because item tokenization treats all items the same. Ghost fixes this with asymmetric unlikelihood optimization and skeleton-founded tokenization, leading to fairer recommendations with only minor loss in overall utility.

What stands out as new is the application of these ideas specifically to generative frameworks, which previous debiasing work hadn't tackled directly. The paper does well in providing both a diagnosis through theoretical analyses and empirical validation on three datasets against multiple baselines. The fact that they report the slight degradation in utility shows some honesty in the evaluation.

On the soft spots, the theoretical analyses are referenced but their depth isn't clear from the abstract alone, so the link between the identified flaws and the bias needs to hold up in the full derivations. The experiments might benefit from more ablation studies to isolate the effect of each component. Nothing looks like a load-bearing flaw, though.

This work is aimed at researchers in information retrieval and recommender systems who are exploring generative models. Readers working on bias mitigation would find the specific techniques relevant.

I think it deserves a serious referee. The problem is real, the approach is targeted, and the results are presented in a way that invites scrutiny. Recommend sending it for peer review.

Referee Report

0 major / 2 minor

Summary. The manuscript claims that popularity bias in Generative Recommenders arises from the combination of a token-level optimization flaw and undifferentiated item tokenization based on semantic indices. It proposes the Ghost system, which introduces asymmetric unlikelihood optimization and skeleton-founded tokenization to mitigate the bias. Experiments across three datasets against multiple SOTA baselines show Ghost substantially reduces popularity bias and improves fairness while incurring only slight degradation in overall recommendation utility.

Significance. If the theoretical diagnosis and empirical results hold, the work offers a principled, targeted intervention for bias in an emerging class of end-to-end generative recommenders rather than extending ad-hoc traditional debiasing techniques. The explicit acknowledgment of the utility-fairness trade-off and the multi-dataset evaluation are strengths that increase the result's credibility and potential impact on the field.

minor comments (2)

[Abstract] Abstract: the phrase 'theoretical analyses' is used to ground the root-cause diagnosis but receives no high-level outline; adding one sentence summarizing the key steps of the analysis would improve accessibility without lengthening the abstract.
[Methodology] The manuscript introduces 'Ghost' as a novel system but does not explicitly contrast its two proposed components against the exact formulations of prior GR optimization and tokenization methods; a short comparative table in the methodology section would clarify the novelty.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the constructive review and positive recommendation for minor revision. The assessment correctly captures the core contributions regarding the diagnosis of popularity bias in generative recommenders and the proposed Ghost framework.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's derivation begins with theoretical analyses of token-level optimization and item tokenization properties, identifies them as root causes of popularity bias, and proposes Ghost with asymmetric unlikelihood optimization plus skeleton-founded tokenization. No equations or steps reduce by construction to fitted inputs renamed as predictions, no self-definitional loops appear, and no load-bearing self-citations or uniqueness theorems imported from the same authors are invoked to force the result. The central claim remains independent of its own outputs, with explicit acknowledgment of the utility-bias trade-off. This is the common case of a self-contained empirical and analytical paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Limited information available from abstract only; no specific free parameters or invented physical entities mentioned.

axioms (1)

domain assumption Theoretical analyses showing the causes of popularity bias in GRs
The diagnosis relies on these analyses.

invented entities (1)

Ghost system no independent evidence
purpose: To cure popularity bias in GRs
New method proposed in the paper.

pith-pipeline@v0.9.1-grok · 5750 in / 1190 out tokens · 39463 ms · 2026-06-30T19:40:48.432790+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Time-Aware Diffusion based on Preference Disentanglement for Generative Recommendation
cs.IR 2026-06 unverdicted novelty 6.0

TDPM is a diffusion-based generative recommender that disentangles user preferences into period and point components to enable time-aware diffusion on semantic indices, reporting up to 29% gains on HR@20 and NDCG@20 o...

Reference graph

Works this paper leans on

48 extracted references · 3 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1]

Deep interest network for click-through rate prediction,

G. Zhou, C. Song, X. Zhu, Y . Fan, H. Zhu, X. Ma, Y . Yan, J. Jin, H. Li, and K. Gai, “Deep interest network for click-through rate prediction,” inProceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

2018
[2]

Deep neural networks for youtube recommendations,

P. Covington, J. Adams, and E. Sargin, “Deep neural networks for youtube recommendations,” inProceedings of the ACM Conference on Recommender Systems, 2016

2016
[3]

Deepinf: Social influence prediction with deep learning,

J. Qiu, J. Tang, H. Ma, Y . Dong, K. Wang, and J. Tang, “Deepinf: Social influence prediction with deep learning,” inProceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

2018
[4]

Recommender systems with generative retrieval,

S. Rajput, N. Mehta, A. Singh, R. H. Keshavan, T. Vu, L. Heldt, L. Hong, Y . Tay, V . Q. Tran, J. Samost, M. Kula, E. H. Chi, and M. Sathiamoorthy, “Recommender systems with generative retrieval,” inProceedings of the International Conference on Neural Information Processing Systems, 2023

2023
[5]

Adapt- ing large language models by integrating collaborative semantics for recommendation,

B. Zheng, Y . Hou, H. Lu, Y . Chen, W. X. Zhao, and M. Chen, “Adapt- ing large language models by integrating collaborative semantics for recommendation,” inProceedings of the IEEE International Conference on Data Engineering, 2024

2024
[6]

Learnable item tokenization for generative recommendation,

W. Wang, H. Bao, X. Lin, J. Zhang, Y . Li, F. Feng, S.-K. Ng, and T.-S. Chua, “Learnable item tokenization for generative recommendation,” in Proceedings of the ACM International Conference on Information and Knowledge Management, 2024

2024
[7]

Unleash llms potential for sequential recommendation by coordinating dual dynamic index mechanism,

J. Yin, Z. Zeng, M. Li, H. Yan, C. Li, W. Han, J. Zhang, R. Liu, H. Sun, W. Deng, F. Sun, Q. Zhang, S. Pan, and S. Wang, “Unleash llms potential for sequential recommendation by coordinating dual dynamic index mechanism,” inProceedings of the ACM on Web Conference, 2025

2025
[8]

Multimodal quantitative language for generative recommendation,

J. Zhai, Z.-F. Mai, C.-D. Wang, F. Yang, X. Zheng, H. Li, and Y . Tian, “Multimodal quantitative language for generative recommendation,” in Proceedings of the International Conference on Learning Representa- tions, 2025

2025
[9]

Lightgcn: Simplifying and powering graph convolution network for recommenda- tion,

X. He, K. Deng, X. Wang, Y . Li, Y . Zhang, and M. Wang, “Lightgcn: Simplifying and powering graph convolution network for recommenda- tion,” inProceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020

2020
[10]

Self-attentive sequential recommenda- tion,

W.-C. Kang and J. McAuley, “Self-attentive sequential recommenda- tion,” inProceedings of the IEEE International Conference on Data Mining, 2018

2018
[11]

Neural discrete representation learning,

A. van den Oord, O. Vinyals, and K. Kavukcuoglu, “Neural discrete representation learning,” inProceedings of the International Conference on Neural Information Processing Systems, 2017

2017
[12]

Autoregressive image generation using residual quantization,

D. Lee, C. Kim, S. Kim, M. Cho, and W.-S. Han, “Autoregressive image generation using residual quantization,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2022
[13]

Deepseek-r1 incentivizes reasoning in llms through reinforcement learning,

D. Guo, D. Yang, H. Zhang, J. Song, P. Wanget al., “Deepseek-r1 incentivizes reasoning in llms through reinforcement learning,”Nature, vol. 645, pp. 633–638, 2025

2025
[14]

Llama: Open and efficient foundation language models,

H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozi `ere, N. Goyal, E. Hambro, and F. Azhar, “Llama: Open and efficient foundation language models,” 2023

2023
[15]

Improving language understanding by generative pre-training,

A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, “Improving language understanding by generative pre-training,” 2018

2018
[16]

How do recommendation models amplify popularity bias? an analysis from the spectral perspective,

S. Lin, C. Gao, J. Chen, S. Zhou, B. Hu, Y . Feng, C. Chen, and C. Wang, “How do recommendation models amplify popularity bias? an analysis from the spectral perspective,” inProceedings of the ACM International Conference on Web Search and Data Mining, 2025

2025
[17]

Model-agnostic counterfactual reasoning for eliminating popularity bias in recommender system,

T. Wei, F. Feng, J. Chen, Z. Wu, J. Yi, and X. He, “Model-agnostic counterfactual reasoning for eliminating popularity bias in recommender system,” inProceedings of the ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021

2021
[18]

Popularity- opportunity bias in collaborative filtering,

Z. Zhu, Y . He, X. Zhao, Y . Zhang, J. Wang, and J. Caverlee, “Popularity- opportunity bias in collaborative filtering,” inProceedings of the ACM International Conference on Web Search and Data Mining, 2021

2021
[19]

Item-side fairness of large language model-based recommendation system,

M. Jiang, K. Bao, J. Zhang, W. Wang, Z. Yang, F. Feng, and X. He, “Item-side fairness of large language model-based recommendation system,” inProceedings of the ACM Web Conference, 2024

2024
[20]

Causally debiased time-aware recommendation,

L. Wang, C. Ma, X. Wu, Z. Qiu, Y . Zheng, and X. Chen, “Causally debiased time-aware recommendation,” inProceedings of the ACM Web Conference, 2024

2024
[21]

A model-agnostic popularity debias training framework for click-through rate prediction in recommender system,

F. Zhang and Q. Shen, “A model-agnostic popularity debias training framework for click-through rate prediction in recommender system,” in Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

2023
[22]

Personalised reranking of paper recommendations using paper content and user behavior,

X. Li, Y . Chen, B. Pettit, and M. D. Rijke, “Personalised reranking of paper recommendations using paper content and user behavior,”ACM Transactions on Information Systems, vol. 37, no. 3, pp. 1–23, 2019

2019
[23]

Enhancing recommendation diversity by re-ranking with large language models,

D. Carraro and D. Bridge, “Enhancing recommendation diversity by re-ranking with large language models,”ACM Transactions on Recom- mender Systems, vol. 4, no. 2, pp. 1–40, 2025

2025
[24]

Miettinen,Nonlinear multiobjective optimization

K. Miettinen,Nonlinear multiobjective optimization. Springer Science & Business Media, 1999, vol. 12

1999
[25]

Multi-task learning with user preferences: Gradient descent with controlled ascent in pareto optimization,

D. Mahapatra and V . Rajan, “Multi-task learning with user preferences: Gradient descent with controlled ascent in pareto optimization,” in Proceedings of the International Conference on Machine Learning, 2020

2020
[26]

Neural collaborative filtering,

X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T.-S. Chua, “Neural collaborative filtering,” inProceedings of the International Conference on World Wide Web, 2017

2017
[27]

Towards fair large language model-based recommender systems with- out costly retraining,

J. Li, H. Gu, S. Wang, Q. Zhang, S. Yu, C. Wang, X. Xu, and F. Chen, “Towards fair large language model-based recommender systems with- out costly retraining,” inProceedings of the ACM Web Conference, 2026

2026
[28]

Bringing reasoning to generative recommendation through the lens of cascaded ranking,

X. Lin, P. Liu, W. Wang, Y . Hu, C. Xu, F. Feng, Q. Wang, and T.-S. Chua, “Bringing reasoning to generative recommendation through the lens of cascaded ranking,” inProceedings of the ACM Web Conference, 2026

2026
[29]

Qarm: Quantitative alignment multi-modal recommendation at kuaishou,

X. Luo, J. Cao, T. Sun, J. Yu, R. Huang, W. Yuan, H. Lin, Y . Zheng, S. Wang, Q. Hu, C. Qiu, J. Zhang, X. Zhang, Z. Yan, J. Zhang, S. Zhang, M. Wen, Z. Liu, and G. Zhou, “Qarm: Quantitative alignment multi-modal recommendation at kuaishou,” inProceedings of the ACM International Conference on Information and Knowledge Management, 2025

2025
[30]

Mitigating popularity bias in recommendation with unbalanced interactions: A gra- dient perspective,

W. Ren, L. Wang, K. Liu, R. Guo, L. E. Peng, and Y . Fu, “Mitigating popularity bias in recommendation with unbalanced interactions: A gra- dient perspective,” inProceedings of the IEEE International Conference on Data Mining, 2022

2022
[31]

Gradient starvation: A learning proclivity in neural networks,

M. Pezeshki, O. Kaba, Y . Bengio, A. C. Courville, D. Precup, and G. La- joie, “Gradient starvation: A learning proclivity in neural networks,” inProceedings of the International Conference on Neural Information Processing Systems, 2021

2021
[32]

Neural text generation with unlikelihood training,

S. Welleck, I. Kulikov, S. Roller, E. Dinan, K. Cho, and J. Weston, “Neural text generation with unlikelihood training,” inProceedings of the International Conference on Learning Representations, 2020

2020
[33]

Implicit unlikelihood train- ing: Improving neural text generation with reinforcement learning,

E. Lagutin, D. Gavrilov, and P. Kalaidin, “Implicit unlikelihood train- ing: Improving neural text generation with reinforcement learning,” in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

2021
[34]

Generative recommendation with semantic ids: A practi- tioner’s handbook,

C. M. Ju, L. Collins, L. Neves, B. Kumar, L. Y . Wang, T. Zhao, and N. Shah, “Generative recommendation with semantic ids: A practi- tioner’s handbook,” inProceedings of the ACM International Conference on Information and Knowledge Management, 2025

2025
[35]

Bias and debias in recommender system: A survey and future directions,

J. Chen, H. Dong, X. Wang, F. Feng, M. Wang, and X. He, “Bias and debias in recommender system: A survey and future directions,”ACM Transactions on Information Systems, vol. 41, no. 3, pp. 1–39, 2023

2023
[36]

Recommendations as treatments: Debiasing learning and evaluation,

T. Schnabel, A. Swaminathan, A. Singh, N. Chandak, and T. Joachims, “Recommendations as treatments: Debiasing learning and evaluation,” inProceedings of the International Conference on Machine Learning, 2016

2016
[37]

Causal inference in recommender systems: A survey and future directions,

C. Gao, Y . Zheng, W. Wang, F. Feng, X. He, and Y . Li, “Causal inference in recommender systems: A survey and future directions,” ACM Transactions on Information Systems,, vol. 42, no. 4, pp. 1–32, 2024

2024
[38]

Causal intervention for leveraging popularity bias in recommendation,

Y . Zhang, F. Feng, X. He, T. Wei, C. Song, G. Ling, and Y . Zhang, “Causal intervention for leveraging popularity bias in recommendation,” inProceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

2021
[39]

An algorithm for vector quantizer design,

Y . Linde, A. Buzo, and R. Gray, “An algorithm for vector quantizer design,”IEEE Transactions on communications, vol. 28, no. 1, pp. 84– 95, 1980

1980
[40]

Taming transformers for high- resolution image synthesis,

P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high- resolution image synthesis,” inProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition, 2021, pp. 12 873– 12 883

2021
[41]

Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering,

R. He and J. McAuley, “Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering,” inProceedings of the International Conference on World Wide Web, 2016

2016
[42]

Sinkhorn distances: lightspeed computation of optimal transport,

M. Cuturi, “Sinkhorn distances: lightspeed computation of optimal transport,” inProceedings of the International Conference on Neural Information Processing Systems, vol. 2, 2013, pp. 2292–2300. PREPRINT MANUSCRIPT 12

2013
[43]

Qwen2 Technical Report

A. Yang, B. Yang, B. Hui, B. Zheng, B. Yu, C. Zhou, C. Li, C. Li, D. Liu, F. Huanget al., “Qwen2 technical report,”arXiv preprint arXiv:2407.10671, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[44]

Qwen2.5: A party of foundation models,

Q. Team, “Qwen2.5: A party of foundation models,” September 2024. [Online]. Available: https://qwenlm.github.io/blog/qwen2.5/

2024
[45]

Qwen3 Technical Report

A. Yang, A. Li, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Gao, C. Huang, C. Lvet al., “Qwen3 technical report,”arXiv preprint arXiv:2505.09388, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[46]

Adam: A method for stochastic optimization,

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” inProceedings of the International Conference on Learning Represen- tations, 2015

2015
[47]

Openonerec technical report.arXiv preprint arXiv:2512.24762, 2025a

G. Zhou, H. Bao, J. Huang, J. Deng, J. Zhang, J. She, K. Cai, L. Ren, L. Ren, Q. Luoet al., “Openonerec technical report,”arXiv preprint arXiv:2512.24762, 2025

work page arXiv 2025
[48]

Sprec: Self- play to debias llm-based recommendation,

C. Gao, R. Chen, S. Yuan, K. Huang, Y . Yu, and X. He, “Sprec: Self- play to debias llm-based recommendation,” inProceedings of the ACM on Web Conference, 2025. PREPRINT MANUSCRIPT 13 APPENDIXA NOTATION The notations and corresponding descriptions are summarized in Table IV. TABLE IV SUMMARY OFMATHEMATICAL ANDMODELNOTATIONS Symbol Description Symbol Descr...

2025

[1] [1]

Deep interest network for click-through rate prediction,

G. Zhou, C. Song, X. Zhu, Y . Fan, H. Zhu, X. Ma, Y . Yan, J. Jin, H. Li, and K. Gai, “Deep interest network for click-through rate prediction,” inProceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

2018

[2] [2]

Deep neural networks for youtube recommendations,

P. Covington, J. Adams, and E. Sargin, “Deep neural networks for youtube recommendations,” inProceedings of the ACM Conference on Recommender Systems, 2016

2016

[3] [3]

Deepinf: Social influence prediction with deep learning,

J. Qiu, J. Tang, H. Ma, Y . Dong, K. Wang, and J. Tang, “Deepinf: Social influence prediction with deep learning,” inProceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

2018

[4] [4]

Recommender systems with generative retrieval,

S. Rajput, N. Mehta, A. Singh, R. H. Keshavan, T. Vu, L. Heldt, L. Hong, Y . Tay, V . Q. Tran, J. Samost, M. Kula, E. H. Chi, and M. Sathiamoorthy, “Recommender systems with generative retrieval,” inProceedings of the International Conference on Neural Information Processing Systems, 2023

2023

[5] [5]

Adapt- ing large language models by integrating collaborative semantics for recommendation,

B. Zheng, Y . Hou, H. Lu, Y . Chen, W. X. Zhao, and M. Chen, “Adapt- ing large language models by integrating collaborative semantics for recommendation,” inProceedings of the IEEE International Conference on Data Engineering, 2024

2024

[6] [6]

Learnable item tokenization for generative recommendation,

W. Wang, H. Bao, X. Lin, J. Zhang, Y . Li, F. Feng, S.-K. Ng, and T.-S. Chua, “Learnable item tokenization for generative recommendation,” in Proceedings of the ACM International Conference on Information and Knowledge Management, 2024

2024

[7] [7]

Unleash llms potential for sequential recommendation by coordinating dual dynamic index mechanism,

J. Yin, Z. Zeng, M. Li, H. Yan, C. Li, W. Han, J. Zhang, R. Liu, H. Sun, W. Deng, F. Sun, Q. Zhang, S. Pan, and S. Wang, “Unleash llms potential for sequential recommendation by coordinating dual dynamic index mechanism,” inProceedings of the ACM on Web Conference, 2025

2025

[8] [8]

Multimodal quantitative language for generative recommendation,

J. Zhai, Z.-F. Mai, C.-D. Wang, F. Yang, X. Zheng, H. Li, and Y . Tian, “Multimodal quantitative language for generative recommendation,” in Proceedings of the International Conference on Learning Representa- tions, 2025

2025

[9] [9]

Lightgcn: Simplifying and powering graph convolution network for recommenda- tion,

X. He, K. Deng, X. Wang, Y . Li, Y . Zhang, and M. Wang, “Lightgcn: Simplifying and powering graph convolution network for recommenda- tion,” inProceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020

2020

[10] [10]

Self-attentive sequential recommenda- tion,

W.-C. Kang and J. McAuley, “Self-attentive sequential recommenda- tion,” inProceedings of the IEEE International Conference on Data Mining, 2018

2018

[11] [11]

Neural discrete representation learning,

A. van den Oord, O. Vinyals, and K. Kavukcuoglu, “Neural discrete representation learning,” inProceedings of the International Conference on Neural Information Processing Systems, 2017

2017

[12] [12]

Autoregressive image generation using residual quantization,

D. Lee, C. Kim, S. Kim, M. Cho, and W.-S. Han, “Autoregressive image generation using residual quantization,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2022

[13] [13]

Deepseek-r1 incentivizes reasoning in llms through reinforcement learning,

D. Guo, D. Yang, H. Zhang, J. Song, P. Wanget al., “Deepseek-r1 incentivizes reasoning in llms through reinforcement learning,”Nature, vol. 645, pp. 633–638, 2025

2025

[14] [14]

Llama: Open and efficient foundation language models,

H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozi `ere, N. Goyal, E. Hambro, and F. Azhar, “Llama: Open and efficient foundation language models,” 2023

2023

[15] [15]

Improving language understanding by generative pre-training,

A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, “Improving language understanding by generative pre-training,” 2018

2018

[16] [16]

How do recommendation models amplify popularity bias? an analysis from the spectral perspective,

S. Lin, C. Gao, J. Chen, S. Zhou, B. Hu, Y . Feng, C. Chen, and C. Wang, “How do recommendation models amplify popularity bias? an analysis from the spectral perspective,” inProceedings of the ACM International Conference on Web Search and Data Mining, 2025

2025

[17] [17]

Model-agnostic counterfactual reasoning for eliminating popularity bias in recommender system,

T. Wei, F. Feng, J. Chen, Z. Wu, J. Yi, and X. He, “Model-agnostic counterfactual reasoning for eliminating popularity bias in recommender system,” inProceedings of the ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021

2021

[18] [18]

Popularity- opportunity bias in collaborative filtering,

Z. Zhu, Y . He, X. Zhao, Y . Zhang, J. Wang, and J. Caverlee, “Popularity- opportunity bias in collaborative filtering,” inProceedings of the ACM International Conference on Web Search and Data Mining, 2021

2021

[19] [19]

Item-side fairness of large language model-based recommendation system,

M. Jiang, K. Bao, J. Zhang, W. Wang, Z. Yang, F. Feng, and X. He, “Item-side fairness of large language model-based recommendation system,” inProceedings of the ACM Web Conference, 2024

2024

[20] [20]

Causally debiased time-aware recommendation,

L. Wang, C. Ma, X. Wu, Z. Qiu, Y . Zheng, and X. Chen, “Causally debiased time-aware recommendation,” inProceedings of the ACM Web Conference, 2024

2024

[21] [21]

A model-agnostic popularity debias training framework for click-through rate prediction in recommender system,

F. Zhang and Q. Shen, “A model-agnostic popularity debias training framework for click-through rate prediction in recommender system,” in Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

2023

[22] [22]

Personalised reranking of paper recommendations using paper content and user behavior,

X. Li, Y . Chen, B. Pettit, and M. D. Rijke, “Personalised reranking of paper recommendations using paper content and user behavior,”ACM Transactions on Information Systems, vol. 37, no. 3, pp. 1–23, 2019

2019

[23] [23]

Enhancing recommendation diversity by re-ranking with large language models,

D. Carraro and D. Bridge, “Enhancing recommendation diversity by re-ranking with large language models,”ACM Transactions on Recom- mender Systems, vol. 4, no. 2, pp. 1–40, 2025

2025

[24] [24]

Miettinen,Nonlinear multiobjective optimization

K. Miettinen,Nonlinear multiobjective optimization. Springer Science & Business Media, 1999, vol. 12

1999

[25] [25]

Multi-task learning with user preferences: Gradient descent with controlled ascent in pareto optimization,

D. Mahapatra and V . Rajan, “Multi-task learning with user preferences: Gradient descent with controlled ascent in pareto optimization,” in Proceedings of the International Conference on Machine Learning, 2020

2020

[26] [26]

Neural collaborative filtering,

X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T.-S. Chua, “Neural collaborative filtering,” inProceedings of the International Conference on World Wide Web, 2017

2017

[27] [27]

Towards fair large language model-based recommender systems with- out costly retraining,

J. Li, H. Gu, S. Wang, Q. Zhang, S. Yu, C. Wang, X. Xu, and F. Chen, “Towards fair large language model-based recommender systems with- out costly retraining,” inProceedings of the ACM Web Conference, 2026

2026

[28] [28]

Bringing reasoning to generative recommendation through the lens of cascaded ranking,

X. Lin, P. Liu, W. Wang, Y . Hu, C. Xu, F. Feng, Q. Wang, and T.-S. Chua, “Bringing reasoning to generative recommendation through the lens of cascaded ranking,” inProceedings of the ACM Web Conference, 2026

2026

[29] [29]

Qarm: Quantitative alignment multi-modal recommendation at kuaishou,

X. Luo, J. Cao, T. Sun, J. Yu, R. Huang, W. Yuan, H. Lin, Y . Zheng, S. Wang, Q. Hu, C. Qiu, J. Zhang, X. Zhang, Z. Yan, J. Zhang, S. Zhang, M. Wen, Z. Liu, and G. Zhou, “Qarm: Quantitative alignment multi-modal recommendation at kuaishou,” inProceedings of the ACM International Conference on Information and Knowledge Management, 2025

2025

[30] [30]

Mitigating popularity bias in recommendation with unbalanced interactions: A gra- dient perspective,

W. Ren, L. Wang, K. Liu, R. Guo, L. E. Peng, and Y . Fu, “Mitigating popularity bias in recommendation with unbalanced interactions: A gra- dient perspective,” inProceedings of the IEEE International Conference on Data Mining, 2022

2022

[31] [31]

Gradient starvation: A learning proclivity in neural networks,

M. Pezeshki, O. Kaba, Y . Bengio, A. C. Courville, D. Precup, and G. La- joie, “Gradient starvation: A learning proclivity in neural networks,” inProceedings of the International Conference on Neural Information Processing Systems, 2021

2021

[32] [32]

Neural text generation with unlikelihood training,

S. Welleck, I. Kulikov, S. Roller, E. Dinan, K. Cho, and J. Weston, “Neural text generation with unlikelihood training,” inProceedings of the International Conference on Learning Representations, 2020

2020

[33] [33]

Implicit unlikelihood train- ing: Improving neural text generation with reinforcement learning,

E. Lagutin, D. Gavrilov, and P. Kalaidin, “Implicit unlikelihood train- ing: Improving neural text generation with reinforcement learning,” in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

2021

[34] [34]

Generative recommendation with semantic ids: A practi- tioner’s handbook,

C. M. Ju, L. Collins, L. Neves, B. Kumar, L. Y . Wang, T. Zhao, and N. Shah, “Generative recommendation with semantic ids: A practi- tioner’s handbook,” inProceedings of the ACM International Conference on Information and Knowledge Management, 2025

2025

[35] [35]

Bias and debias in recommender system: A survey and future directions,

J. Chen, H. Dong, X. Wang, F. Feng, M. Wang, and X. He, “Bias and debias in recommender system: A survey and future directions,”ACM Transactions on Information Systems, vol. 41, no. 3, pp. 1–39, 2023

2023

[36] [36]

Recommendations as treatments: Debiasing learning and evaluation,

T. Schnabel, A. Swaminathan, A. Singh, N. Chandak, and T. Joachims, “Recommendations as treatments: Debiasing learning and evaluation,” inProceedings of the International Conference on Machine Learning, 2016

2016

[37] [37]

Causal inference in recommender systems: A survey and future directions,

C. Gao, Y . Zheng, W. Wang, F. Feng, X. He, and Y . Li, “Causal inference in recommender systems: A survey and future directions,” ACM Transactions on Information Systems,, vol. 42, no. 4, pp. 1–32, 2024

2024

[38] [38]

Causal intervention for leveraging popularity bias in recommendation,

Y . Zhang, F. Feng, X. He, T. Wei, C. Song, G. Ling, and Y . Zhang, “Causal intervention for leveraging popularity bias in recommendation,” inProceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

2021

[39] [39]

An algorithm for vector quantizer design,

Y . Linde, A. Buzo, and R. Gray, “An algorithm for vector quantizer design,”IEEE Transactions on communications, vol. 28, no. 1, pp. 84– 95, 1980

1980

[40] [40]

Taming transformers for high- resolution image synthesis,

P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high- resolution image synthesis,” inProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition, 2021, pp. 12 873– 12 883

2021

[41] [41]

Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering,

R. He and J. McAuley, “Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering,” inProceedings of the International Conference on World Wide Web, 2016

2016

[42] [42]

Sinkhorn distances: lightspeed computation of optimal transport,

M. Cuturi, “Sinkhorn distances: lightspeed computation of optimal transport,” inProceedings of the International Conference on Neural Information Processing Systems, vol. 2, 2013, pp. 2292–2300. PREPRINT MANUSCRIPT 12

2013

[43] [43]

Qwen2 Technical Report

A. Yang, B. Yang, B. Hui, B. Zheng, B. Yu, C. Zhou, C. Li, C. Li, D. Liu, F. Huanget al., “Qwen2 technical report,”arXiv preprint arXiv:2407.10671, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024

[44] [44]

Qwen2.5: A party of foundation models,

Q. Team, “Qwen2.5: A party of foundation models,” September 2024. [Online]. Available: https://qwenlm.github.io/blog/qwen2.5/

2024

[45] [45]

Qwen3 Technical Report

A. Yang, A. Li, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Gao, C. Huang, C. Lvet al., “Qwen3 technical report,”arXiv preprint arXiv:2505.09388, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[46] [46]

Adam: A method for stochastic optimization,

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” inProceedings of the International Conference on Learning Represen- tations, 2015

2015

[47] [47]

Openonerec technical report.arXiv preprint arXiv:2512.24762, 2025a

G. Zhou, H. Bao, J. Huang, J. Deng, J. Zhang, J. She, K. Cai, L. Ren, L. Ren, Q. Luoet al., “Openonerec technical report,”arXiv preprint arXiv:2512.24762, 2025

work page arXiv 2025

[48] [48]

Sprec: Self- play to debias llm-based recommendation,

C. Gao, R. Chen, S. Yuan, K. Huang, Y . Yu, and X. He, “Sprec: Self- play to debias llm-based recommendation,” inProceedings of the ACM on Web Conference, 2025. PREPRINT MANUSCRIPT 13 APPENDIXA NOTATION The notations and corresponding descriptions are summarized in Table IV. TABLE IV SUMMARY OFMATHEMATICAL ANDMODELNOTATIONS Symbol Description Symbol Descr...

2025