Break the Optimization Barrier of LLM-Enhanced Recommenders: A Theoretical Analysis and Practical Framework
Pith reviewed 2026-05-09 23:15 UTC · model grok-4.3
The pith
Normalizing item embeddings and reducing LLM representations with a collaborative co-occurrence graph removes the training barrier in LLM-enhanced recommenders.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the optimization barrier arises specifically from norm disparity and semantic-collaboration misaligned angular clustering in LLM representations. The authors derive this through an analysis of local optimization curvature and confirm it experimentally. They address it via item embedding normalization, which provably controls conditioning, and Rec-PCA, which performs dimensionality reduction while penalizing total variation over an item-item co-occurrence graph built from interaction histories, thereby aligning semantic and collaborative structure. Both the theory and the results show that this lightweight framework enables effective training and outperforms prior LLM-enhanced methods.
What carries the argument
TF-LLMER framework, whose core mechanisms are embedding normalization to eliminate norm-driven instability and Rec-PCA to inject collaborative structure from an item-item co-occurrence graph into the representation transformation.
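The two mechanisms are simple enough to sketch. The following is an illustrative NumPy reconstruction, not the authors' released code; the function names and the raw-count graph construction are assumptions on my part.

```python
import numpy as np

def normalize_items(E, eps=1e-12):
    """L2-normalize item embeddings row-wise to remove norm disparity."""
    return E / (np.linalg.norm(E, axis=1, keepdims=True) + eps)

def cooccurrence_graph(histories, n_items):
    """Symmetric item-item co-occurrence counts from user interaction histories."""
    A = np.zeros((n_items, n_items))
    for items in histories:
        for i in items:
            for j in items:
                if i != j:
                    A[i, j] += 1.0
    return A
```

Normalization removes the norm-driven term from the conditioning analysis; the co-occurrence matrix supplies the graph over which Rec-PCA's total-variation penalty is taken.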
Load-bearing premise
The two identified causes, norm disparity and angular misalignment, are the primary and sufficient explanations for the optimization barrier, and the interaction-derived co-occurrence graph faithfully represents the collaborative structure needed for alignment.
What would settle it
Train the same LLM-enhanced backbone with and without the normalization plus Rec-PCA steps on multiple public datasets and check whether the training loss drops to the level of non-LLM baselines only when both components are active.
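This settling experiment amounts to a four-way ablation with a simple decision rule. A sketch of that rule follows; the function names and the 5% tolerance are my own choices, not from the paper.

```python
from itertools import product

def ablation_configs():
    """The four on/off combinations needed to isolate each component."""
    return [{"normalize": n, "rec_pca": p} for n, p in product([False, True], repeat=2)]

def barrier_removed(final_losses, baseline_loss, tol=0.05):
    """Settling criterion: training loss reaches the non-LLM baseline
    (within tol) only when both components are active.

    final_losses maps (normalize, rec_pca) flag pairs to final training loss.
    """
    near = lambda cfg: final_losses[cfg] <= baseline_loss * (1 + tol)
    return near((True, True)) and not any(
        near(cfg) for cfg in [(False, False), (True, False), (False, True)]
    )
```

If either single-component configuration also reaches the baseline, the claim that both causes must be addressed jointly would be weakened.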
Original abstract
Large language model (LLM)-enhanced recommendation models inject LLM representations into backbone recommenders to exploit rich item text without inference-time LLM cost. However, we find that existing LLM-enhanced methods significantly hinder the optimization of backbone models, resulting in high training losses that are difficult to reduce. To address it, we establish a comprehensive theoretical analysis of local optimization curvature and identify two key causes: 1) large norm disparity and 2) semantic-collaboration misaligned angular clustering of LLM representations. Guided by these insights, we propose Training-Friendly LLM-Enhanced Recommender (TF-LLMER), a lightweight framework with two key components. First, we highlight the necessity of item embedding normalization to eliminate norm-driven instability and achieve provable control over optimization conditioning. Second, we introduce Rec-PCA, a recommendation-aware dimensionality reduction method that injects collaborative structure into the representation transformation to resolve semantic-collaboration misaligned angular clustering. It jointly optimizes semantic information retention and alignment with an item-item co-occurrence graph constructed from interaction histories. The graph captures collaborative structure, and alignment is promoted by penalizing total variation over the graph. Both theory and extensive experiments demonstrate that TF-LLMER significantly outperforms state-of-the-art methods. Our code is available at https://github.com/woriazzc/TF-LLMER.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that LLM-enhanced recommenders suffer from an optimization barrier caused by large norm disparity and semantic-collaboration misaligned angular clustering in LLM representations. It provides a theoretical analysis of local optimization curvature to identify these two causes, then introduces the TF-LLMER framework with item embedding normalization (for provable conditioning control) and Rec-PCA (a graph-based dimensionality reduction using item-item co-occurrence from interactions and total variation penalty for alignment). Both the theory and experiments are asserted to show that TF-LLMER significantly outperforms state-of-the-art methods, with code released.
Significance. If the curvature analysis holds and the framework generalizes, the work is significant for the LLM-recommender integration literature: it isolates concrete representation properties that hinder training and supplies a lightweight, theoretically motivated fix. The explicit code release is a strength that supports reproducibility and follow-on work.
major comments (3)
- [Theoretical Analysis] Theoretical Analysis section: the local-curvature derivation identifies norm disparity and angular misalignment as the dominant causes but does not quantify or bound their contribution relative to other loss-landscape factors (e.g., non-convexity of the backbone recommender loss, interaction between frozen LLM vectors and trainable embeddings, or sparsity-induced variance in the co-occurrence graph). Without such isolation, the claim that these two factors are primary and sufficient remains unproven.
- [Rec-PCA] Rec-PCA subsection: the total-variation penalty on the interaction-derived graph is said to resolve angular misalignment while jointly retaining semantics, yet no explicit objective function, trade-off parameter analysis, or proof that semantic content is not distorted is supplied. This is load-bearing for the “provable control” claim.
- [Experiments] Experimental Evaluation: the manuscript asserts extensive experiments and outperformance, but quantitative results (effect sizes, statistical significance), baseline details, ablation studies isolating normalization versus Rec-PCA, and verification that the co-occurrence graph faithfully captures collaborative structure are not presented at a level that allows independent confirmation of the central claims.
minor comments (2)
- [Abstract] Abstract: the phrase “provable control over optimization conditioning” should be qualified by the exact statement that is proven (e.g., a bound on the condition number after normalization).
- [Notation] Notation: ensure consistent symbols for LLM embeddings, normalized embeddings, and the transformed Rec-PCA outputs across equations and text.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below, providing clarifications and indicating where revisions will be made to strengthen the presentation.
Point-by-point responses
-
Referee: [Theoretical Analysis] Theoretical Analysis section: the local-curvature derivation identifies norm disparity and angular misalignment as the dominant causes but does not quantify or bound their contribution relative to other loss-landscape factors (e.g., non-convexity of the backbone recommender loss, interaction between frozen LLM vectors and trainable embeddings, or sparsity-induced variance in the co-occurrence graph). Without such isolation, the claim that these two factors are primary and sufficient remains unproven.
Authors: We appreciate the referee's point on the need for stronger isolation. Our local-curvature analysis approximates the Hessian of the recommendation loss with respect to the injected LLM representations and explicitly decomposes the leading eigenvalues into contributions from norm disparity (scaling the gradient magnitude) and angular misalignment (affecting the off-diagonal coupling terms). While the backbone loss is non-convex, the analysis shows these representation-induced terms dominate the condition number in the early training phase. To address the concern, we will add a new paragraph in the Theoretical Analysis section that bounds the relative magnitude of these terms against the backbone Hessian and interaction sparsity effects, using a simplified linear model for illustration. revision: partial
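The norm-disparity term in this decomposition can be illustrated with a toy numerical check (a sketch of the mechanism, not the paper's analysis): row norms spanning four orders of magnitude inflate the condition number of the Gram matrix, and row-wise L2 normalization collapses it.

```python
import numpy as np

rng = np.random.default_rng(0)
directions = rng.standard_normal((200, 16))
scales = 10.0 ** rng.uniform(-2, 2, size=200)   # norms spread over four orders of magnitude
X = directions * scales[:, None]                # synthetic "LLM embeddings" with norm disparity
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)

cond_raw = np.linalg.cond(X.T @ X)     # Gram matrix of disparate-norm rows: ill-conditioned
cond_norm = np.linalg.cond(Xn.T @ Xn)  # after normalization: near-isotropic, well-conditioned
```

This only isolates the norm channel; the angular-clustering channel is what Rec-PCA targets.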
-
Referee: [Rec-PCA] Rec-PCA subsection: the total-variation penalty on the interaction-derived graph is said to resolve angular misalignment while jointly retaining semantics, yet no explicit objective function, trade-off parameter analysis, or proof that semantic content is not distorted is supplied. This is load-bearing for the “provable control” claim.
Authors: We agree that the mathematical formulation requires explicit presentation. Rec-PCA solves the optimization problem min_W ||X - X W W^T||_F^2 + λ TV(XW; G) subject to W^T W = I, where X (n items × d dimensions) stacks the LLM embeddings, W (d × k) is the projection matrix, G is the item-item co-occurrence graph with adjacency A and Laplacian L = D - A, and TV(Z; G) = (1/2) Σ_{i,j} A_ij ||z_i - z_j||^2 = tr(Z^T L Z) is the total variation of the projected embeddings Z = XW over the graph. The hyperparameter λ trades off semantic fidelity (the reconstruction term) against collaborative alignment (the total-variation term). We will insert the full objective, a sensitivity analysis over λ, and a brief argument (leveraging the fact that the graph Laplacian eigenvectors preserve low-frequency semantic clusters) showing that semantic content is not distorted beyond a controllable error bound. These additions will appear in the revised Rec-PCA subsection. revision: yes
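For concreteness, here is a minimal solver for one natural reading of this objective (an illustrative reconstruction, not the authors' released code): taking the reconstruction term as ||X - X W W^T||_F^2 with orthonormal W and the penalty as λ tr((XW)^T L (XW)), the problem reduces in closed form to the bottom-k eigenvectors of λ X^T L X - X^T X.

```python
import numpy as np

def rec_pca(X, A, k, lam=0.1):
    """Closed-form Rec-PCA sketch (assumed objective, not the paper's exact algorithm).

    With W^T W = I, ||X - X W W^T||_F^2 = tr(X^T X) - tr(W^T X^T X W), so
    minimizing reconstruction error plus lam * tr(W^T X^T L X W) amounts to
    taking the k smallest-eigenvalue eigenvectors of lam * X^T L X - X^T X.
    """
    L = np.diag(A.sum(axis=1)) - A        # combinatorial graph Laplacian
    M = lam * (X.T @ L @ X) - X.T @ X
    M = (M + M.T) / 2                     # symmetrize against round-off
    vals, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return vecs[:, :k]
```

At λ = 0 this recovers ordinary uncentered PCA directions; increasing λ tilts the projection toward directions that are smooth over the co-occurrence graph.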
-
Referee: [Experiments] Experimental Evaluation: the manuscript asserts extensive experiments and outperformance, but quantitative results (effect sizes, statistical significance), baseline details, ablation studies isolating normalization versus Rec-PCA, and verification that the co-occurrence graph faithfully captures collaborative structure are not presented at a level that allows independent confirmation of the central claims.
Authors: We acknowledge that the experimental reporting can be strengthened for reproducibility. In the revised manuscript we will expand the Experimental Evaluation section to include: (i) tables reporting relative improvements with 95% confidence intervals and paired t-test p-values; (ii) complete baseline hyperparameter settings and implementation details; (iii) dedicated ablation tables separating the effects of normalization and Rec-PCA; and (iv) an additional figure and analysis comparing the co-occurrence graph's spectrum to random and content-only graphs to confirm it encodes collaborative signals. The code release already contains the full experimental pipeline, which will be further documented. revision: yes
Circularity Check
No significant circularity; the theoretical analysis and framework are validated against external benchmarks rather than their own outputs.
full rationale
The paper's derivation begins with an external observation of optimization barriers in LLM-enhanced recommenders, followed by a claimed local-curvature analysis that isolates norm disparity and angular misalignment as causes. These motivate normalization (to control conditioning) and Rec-PCA (to align via co-occurrence graph total variation). Neither step reduces by construction to a fitted parameter renamed as prediction, nor relies on self-citation chains or imported uniqueness theorems. The co-occurrence graph is constructed directly from interaction data (independent of the target loss landscape), and performance is evaluated on held-out recommendation metrics. No equations equate the claimed improvement to its own inputs; the analysis remains falsifiable via external datasets and does not smuggle ansatzes via prior self-work.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Local optimization curvature analysis correctly diagnoses the sources of high training loss in LLM-enhanced models.
- domain assumption The item-item co-occurrence graph from interaction histories accurately reflects collaborative filtering structure.