Session-based Recommendations with Recurrent Neural Networks
37 Pith papers cite this work.
abstract
We apply recurrent neural networks (RNN) on a new domain, namely recommender systems. Real-life recommender systems often face the problem of having to base recommendations only on short session-based data (e.g. a small sportsware website) instead of long user histories (as in the case of Netflix). In this situation the frequently praised matrix factorization approaches are not accurate. This problem is usually overcome in practice by resorting to item-to-item recommendations, i.e. recommending similar items. We argue that by modeling the whole session, more accurate recommendations can be provided. We therefore propose an RNN-based approach for session-based recommendations. Our approach also considers practical aspects of the task and introduces several modifications to classic RNNs such as a ranking loss function that make it more viable for this specific problem. Experimental results on two data-sets show marked improvements over widely used approaches.
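The abstract describes encoding a session with an RNN and training it with a ranking loss rather than a classification loss. As a minimal NumPy sketch of that idea (not the paper's implementation: a hand-rolled GRU cell plus a BPR-style pairwise ranking loss; all dimensions, item ids, and names here are illustrative, and the training loop is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell: consumes one item embedding per step."""
    def __init__(self, d_in, d_hid, rng):
        s = 1.0 / np.sqrt(d_hid)
        self.Wz = rng.uniform(-s, s, (d_in + d_hid, d_hid))
        self.Wr = rng.uniform(-s, s, (d_in + d_hid, d_hid))
        self.Wh = rng.uniform(-s, s, (d_in + d_hid, d_hid))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ self.Wz)           # update gate
        r = sigmoid(xh @ self.Wr)           # reset gate
        xrh = np.concatenate([x, r * h])
        h_tilde = np.tanh(xrh @ self.Wh)    # candidate state
        return (1 - z) * h + z * h_tilde

def bpr_loss(scores, pos, negs):
    """Pairwise ranking loss: -mean log sigmoid(s_pos - s_neg)."""
    diffs = scores[pos] - scores[negs]
    return float(-np.mean(np.log(sigmoid(diffs))))

# Toy setup: 50 items, 16-dim embeddings (hypothetical sizes).
n_items, d = 50, 16
item_emb = rng.normal(0, 0.1, (n_items, d))
gru = GRUCell(d, d, rng)

session = [3, 17, 8]          # clicked item ids, in click order
h = np.zeros(d)
for item in session:
    h = gru.step(item_emb[item], h)

scores = item_emb @ h          # score every item against the session state
loss = bpr_loss(scores, pos=29, negs=[1, 5, 40])
```

In training, gradients of the pairwise loss would push the next clicked item's score above sampled negatives, which is the "ranking loss" modification the abstract refers to.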
citing papers explorer
-
F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking
F-GRPO factorizes group-relative policy optimization into generation and ranking phases within one autoregressive sequence, using order-invariant coverage and position-aware utility rewards to improve top-ranked performance on recommendation and multi-hop QA tasks.
-
Why Users Go There: World Knowledge-Augmented Generative Next POI Recommendation
AWARE augments generative next-POI recommendation with LLM agents that produce user-anchored narratives capturing events, culture, and trends, delivering up to 12.4% relative gains on three real datasets.
-
Every Preference Has Its Strength: Injecting Ordinal Semantics into LLM-Based Recommenders
OSA improves LLM-based recommenders by anchoring ordinal preference levels as numeric tokens in the model's latent space to retain fine-grained strength information when fusing collaborative signals.
-
Similar Users-Augmented Interest Network
SUIN improves CTR prediction by augmenting target user sequences with similar users' behaviors via embedding-based retrieval, user-specific position encoding, and user-aware target attention.
-
Objective Shaping with Hard Negatives: Windowed Partial AUC Optimization for RL-based LLM Recommenders
Beam-search negatives induce partial AUC optimization in GRPO for LLM recommenders; Windowed Partial AUC and TAWin improve Top-K alignment on four datasets.
-
Break the Optimization Barrier of LLM-Enhanced Recommenders: A Theoretical Analysis and Practical Framework
TF-LLMER resolves optimization barriers in LLM-enhanced recommenders through embedding normalization and Rec-PCA that aligns semantic representations with collaborative co-occurrence graphs.
-
Beyond One-Size-Fits-All: Adaptive Test-Time Augmentation for Sequential Recommendation
AdaTTA is an actor-critic RL framework that selects sequence-specific test-time augmentations and improves recommendation metrics by up to 26% over fixed augmentation strategies on four datasets.
-
Retrieval Augmented Conversational Recommendation with Reinforcement Learning
RAR retrieves candidate items from a 300k-movie corpus, then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.
-
Fusion and Alignment Enhancement with Large Language Models for Tail-item Sequential Recommendation
FAERec fuses collaborative ID embeddings with LLM semantic embeddings using adaptive gating and dual-level alignment to enhance tail-item sequential recommendations.
-
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
HSTU-based generative recommenders with 1.5 trillion parameters scale as a power law with compute up to GPT-3 scale, outperform baselines by up to 65.8% NDCG, run 5-15x faster than FlashAttention2 on long sequences, and improve online A/B metrics by 12.4%.
-
RRCM: Ranking-Driven Retrieval over Collaborative and Meta Memories for LLM Recommendation
RRCM trains an LLM to dynamically retrieve from collaborative and meta memories using group relative policy optimization driven by final top-k recommendation quality.
-
An Embarrassingly Simple Graph Heuristic Reveals Shortcut-Solvable Benchmarks for Sequential Recommendation
A simple graph heuristic without training or sequence encoders matches or outperforms trained generative recommenders on 10 of 14 sequential recommendation benchmarks by exploiting local transition and feature shortcuts.
-
Bridging Textual Profiles and Latent User Embeddings for Personalization
BLUE aligns LLM-generated textual user profiles with embedding-based recommendation objectives via reinforcement learning and next-item text supervision, yielding better zero-shot performance and cross-domain transfer than baselines.
-
Bridging Passive and Active: Enhancing Conversation Starter Recommendation via Active Expression Modeling
PA-Bridge bridges passive conversation starter recommendations with active user expressions via adversarial distribution alignment and semantic discretization, yielding 0.54% higher feature penetration in online tests.
-
DynamicPO: Dynamic Preference Optimization for Recommendation
DynamicPO prevents preference optimization collapse in multi-negative DPO by adaptively selecting boundary-critical negatives and calibrating per-sample optimization strength, yielding higher recommendation accuracy on three public datasets.
-
The Attention Market: Interpreting Online Fair Re-ranking as Manifold Optimization under Walrasian Equilibrium
Fair re-ranking is equivalent to gradient descent on a ranking manifold under Walrasian equilibrium in an attention market, yielding the ManifoldRank algorithm that adjusts gradients for supply-side fairness costs and demand-side score predictions.
-
Modeling Behavioral Intensity and Transitions for Generative Recommendation
BITRec improves generative multi-behavior recommendation by modeling behavioral intensity via separated pathways and transitions via learnable relation matrices, reporting 15-23% gains on large retail datasets.
-
WPGRec: Wavelet Packet Guided Graph Enhanced Sequential Recommendation
WPGRec is a new sequential recommender that performs multi-scale temporal modeling via stationary wavelet packets and injects high-order collaborative information through scale-aligned graph propagation with energy-aware gated fusion.
-
GraphRAG-IRL: Personalized Recommendation with Graph-Grounded Inverse Reinforcement Learning and LLM Re-ranking
GraphRAG-IRL fuses graph-grounded MaxEnt IRL pre-ranking with persona-guided LLM re-ranking to deliver up to 16.8% NDCG@10 gains over IRL-only baselines on MovieLens and consistent 4-6% gains on KuaiRand.
-
Multi-LLM Token Filtering and Routing for Sequential Recommendation
MLTFR combines user-guided token filtering with a multi-LLM mixture-of-experts and Fisher-weighted consensus expert to deliver stable gains in corpus-free sequential recommendation.
-
Federated User Behavior Modeling for Privacy-Preserving LLM Recommendation
SF-UBM enables privacy-preserving cross-domain LLM recommendation by federating semantic item representations, distilling domain knowledge, and aligning preferences into LLM soft prompts.
-
Behavior-Aware Dual-Channel Preference Learning for Heterogeneous Sequential Recommendation
BDPL improves heterogeneous sequential recommendation by constructing behavior-aware subgraphs, aggregating via cascade GNN, and enhancing representations with preference-level contrastive learning before adaptive fusion for target behavior prediction.
-
RoTE: Coarse-to-Fine Multi-Level Rotary Time Embedding for Sequential Recommendation
RoTE is a multi-level rotary time embedding module that explicitly models time spans in sequential recommendation and improves NDCG@5 by up to 20.11% when added to standard backbones on public benchmarks.
-
MOSAIC: Multi-Domain Orthogonal Session Adaptive Intent Capture for Prescient Recommendations
MOSAIC decomposes user intent into three orthogonal components via a triple-encoder architecture with adversarial training and dynamic gating to outperform baselines in multi-domain session recommendations.
-
ReRec: Reasoning-Augmented LLM-based Recommendation Assistant via Reinforcement Fine-tuning
ReRec uses reinforcement fine-tuning with dual-graph reward shaping, reasoning-aware advantage estimation, and online curriculum scheduling to improve LLM reasoning and performance in recommendation tasks.
-
Leveraging LLMs and Heterogeneous Knowledge Graphs for Persona-Driven Session-Based Recommendation
A persona-driven SBRS framework learns unsupervised user personas from an LLM-initialized heterogeneous KG and incorporates them into data-driven sequential recommenders, reporting consistent gains over session-history baselines on Amazon Books and Movies & TV.
-
From Clues to Generation: Language-Guided Conditional Diffusion for Cross-Domain Recommendation
LGCD creates pseudo-overlapping user data via LLM reasoning and uses conditional diffusion to generate target-domain user representations for inter-domain sequential recommendation without real overlapping users.
-
Pay Attention to Sequence Split: Uncovering the Impacts of Sub-Sequence Splitting on Sequential Recommendation Models
Sub-sequence splitting interferes with fair evaluation in sequential recommendation models and enhances performance only when paired with particular splitting, targeting, and loss function choices.
-
FLAME: Condensing Ensemble Diversity into a Single Network for Efficient Sequential Recommendation
FLAME condenses ensemble diversity into a single network via modular ensemble simulation and guided mutual learning during training, delivering ensemble-level performance with single-network inference speed on sequential recommendation tasks.
-
HSUGA: LLM-Enhanced Recommendation with Hierarchical Semantic Understanding and Group-Aware Alignment
HSUGA improves LLM-enhanced sequential recommendation via staged hierarchical semantic understanding for better preference extraction and group-aware alignment that varies intensity by user activity level.
-
TwiSTAR: Think Fast, Think Slow, Then Act, Generative Recommendation with Adaptive Reasoning
TwiSTAR learns to switch between fast SID retrieval and slow rationale-generating reasoning in generative recommendation, yielding better accuracy-latency trade-offs on three datasets.
-
Compressed Video Aggregator: Content-driven Module for Efficient Micro-Video Recommendation
CVA aggregates frozen VFM embeddings via latent reasoning to create compact video embeddings for efficient micro-video recommendation, delivering consistent performance gains and orders-of-magnitude efficiency improvements.
-
From Hidden Profiles to Governable Personalization: Recommender Systems in the Age of LLM Agents
LLM agents enable a shift in recommender systems from opaque hidden profiles to governable, inspectable, and portable user representations.
-
SLSREC: Self-Supervised Contrastive Learning for Adaptive Fusion of Long- and Short-Term User Interests
SLSRec disentangles long- and short-term user interests via self-supervised contrastive learning and fuses them adaptively with attention, outperforming prior models on three public recommendation benchmarks.
-
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
OneRec unifies retrieval and ranking in a generative recommender using session-wise decoding and iterative DPO-based preference alignment, achieving real-world gains on Kuaishou.
-
Driving Engagement in Daily Fantasy Sports with a Scalable and Urgency-Aware Ranking Engine
An urgency-aware adaptation of the Deep Interest Network with temporal encodings and listwise neuralNDCG loss delivers a 9% nDCG@1 lift over an optimized LightGBM baseline on a 650k-user industrial DFS dataset.
-
TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation
TME-PSR improves sequential recommendation accuracy and explanation quality by personalizing temporal rhythms, fine-grained interests, and recommendation-explanation alignment using a dual-view time encoder, multihead LRU, and dual-branch mutual information weighting.
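Many of the citing papers above report gains in NDCG@k. For reference, the standard binary-relevance definition of that metric can be computed as follows (this is the textbook formula, not any one paper's variant):

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked items."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the predicted ranking divided by the ideal DCG."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Binary relevance: the single ground-truth next item ranked 3rd of 5.
ranking = [0, 0, 1, 0, 0]
print(ndcg_at_k(ranking, 5))  # 1 / log2(4) = 0.5
```

With a single relevant item per session, NDCG@k reduces to 1/log2(rank + 1) when the item appears in the top k, and 0 otherwise.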