Breaking the Information Silo: Semantic Personas for Cross-Domain Recommendation
Pith reviewed 2026-06-28 12:52 UTC · model grok-4.3
The pith
SPHERE enables knowledge transfer for recommendations between domains with no shared users or items using LLM-generated semantic personas.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SPHERE enables recommendation knowledge transfer across strictly disjoint domains with no shared users or items by using LLMs to induce a shared behavioral vocabulary, generate structured semantic personas for users, and retrieve behaviorally similar source-domain communities that form a Community Source Persona. This semantic signal is integrated with collaborative signals through a dual-tower architecture and dynamic fusion gate, allowing SPHERE to augment standard recommender backbones. Empirical evaluation across Amazon Books, Goodreads, and Steam demonstrates consistent improvements over NCF, SVD++, and LightGCN baselines under full-ranking evaluation, showing that cross-domain transfer effectiveness depends critically on the target domain's structural density and native predictive strength, not solely on semantic proximity between domains.
What carries the argument
The Community Source Persona, which aggregates behaviorally similar source-domain communities identified via LLM-induced semantic personas and a shared behavioral vocabulary.
If this is right
- Cross-domain recommendation is possible without any shared users or items between domains.
- Transfer effectiveness depends on the target domain's structural density and predictive strength more than on semantic proximity to the source.
- Standard collaborative filtering models can be augmented with semantic signals from personas to improve performance.
- The approach maintains interpretability and modularity in the recommendation system.
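"Structural density" is not defined in this summary. A common proxy, assumed here purely for illustration, is the fraction of user-item cells that contain an observed interaction. The function name `interaction_density` and the toy data are hypothetical.

```python
def interaction_density(interactions):
    """Fraction of filled cells in the user-item matrix.

    interactions: iterable of (user_id, item_id) pairs; duplicates are ignored.
    This is an assumed proxy, not the paper's definition of density.
    """
    pairs = set(interactions)
    if not pairs:
        return 0.0
    users = {user for user, _ in pairs}
    items = {item for _, item in pairs}
    return len(pairs) / (len(users) * len(items))


# Toy example: 4 distinct interactions over 3 users and 3 items.
toy = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "c")]
print(interaction_density(toy))  # 4 / 9
```

Comparing this value across candidate target domains is one simple way to check whether observed gains track density.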
Where Pith is reading between the lines
- This approach could allow recommendation systems to operate across unrelated platforms without direct data exchange.
- Because the method's success varies with target-domain density, dense domains are the natural place for initial applications.
- Extensions could explore combining multiple source domains into the Community Source Persona.
Load-bearing premise
The LLM-induced semantic personas and Community Source Persona accurately reflect transferable behavioral similarities across domains lacking any structural overlap.
What would settle it
Finding that SPHERE's gains do not vary with target-domain density would challenge the importance of structural density for effective transfer.
Original abstract
Digital platforms increasingly operate as isolated information silos, limiting their ability to construct comprehensive user representations across domains. Cross-domain recommender systems seek to overcome this limitation by transferring knowledge from a source domain to a target domain, yet most existing approaches depend on shared users, shared items, or structurally similar interaction graphs. These assumptions are often unrealistic across independent platforms. We propose SPHERE (Semantic Personas for Heterogeneous cross-domain Recommendation), a design artifact that enables recommendation knowledge transfer across strictly disjoint domains with no shared users or items. Rather than aligning domains through identity or graph structure, SPHERE uses large language models to induce a shared behavioral vocabulary, generate structured semantic personas for users, and retrieve behaviorally similar source-domain communities that form a Community Source Persona. This semantic signal is integrated with collaborative signals through a dual-tower architecture and dynamic fusion gate, allowing SPHERE to augment standard recommender backbones. Empirical evaluation across Amazon Books, Goodreads, and Steam demonstrates consistent improvements over NCF, SVD++, and LightGCN baselines under full-ranking evaluation. The results show that cross-domain transfer effectiveness is not determined solely by semantic proximity between domains; rather, it depends critically on the structural density and native predictive strength of the target domain. The study contributes to information systems research by reframing cross-domain personalization as behavior-based semantic alignment, offering a practical mechanism for overcoming information silos while preserving interpretability and modularity.
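The dual-tower fusion described in the abstract can be pictured with a minimal sketch. The abstract does not give the gate's functional form, so everything below is an assumption: a per-dimension sigmoid gate computed from the concatenated collaborative and semantic embeddings. The names `fusion_gate`, `W`, and `b` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fusion_gate(collab, semantic, W, b):
    """Blend two equal-length embeddings with a per-dimension gate.

    Assumed form (not taken from the paper):
        g     = sigmoid(W @ [collab; semantic] + b)
        fused = g * collab + (1 - g) * semantic
    W has one row per output dimension and 2*d columns; b has d entries.
    """
    if len(collab) != len(semantic):
        raise ValueError("both towers must emit embeddings of equal length")
    concat = list(collab) + list(semantic)
    fused = []
    for i in range(len(collab)):
        z = sum(w * x for w, x in zip(W[i], concat)) + b[i]
        g = sigmoid(z)
        fused.append(g * collab[i] + (1.0 - g) * semantic[i])
    return fused


# With zero weights and biases the gate is 0.5 everywhere,
# so the output is a plain average of the two towers.
d = 2
W = [[0.0] * (2 * d) for _ in range(d)]
b = [0.0] * d
fused = fusion_gate([1.0, 0.0], [0.0, 1.0], W, b)
print(fused)  # [0.5, 0.5]
```

In a trained model `W` and `b` would be learned, letting the gate lean on the collaborative tower where interaction data is rich and on the semantic tower where it is not.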
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes SPHERE, a framework for cross-domain recommendation across strictly disjoint domains (no shared users or items) that uses LLMs to induce a shared behavioral vocabulary, generate structured semantic personas, retrieve behaviorally similar source-domain communities to form a Community Source Persona, and integrate this signal with collaborative filtering via a dual-tower architecture and dynamic fusion gate. It reports consistent empirical improvements over NCF, SVD++, and LightGCN under full-ranking evaluation on Amazon Books, Goodreads, and Steam, plus a secondary finding that transfer effectiveness depends on the target domain's structural density and native predictive strength, not solely on semantic proximity.
Significance. If the central claims hold after verification, SPHERE would provide a practical mechanism for knowledge transfer in recommendation systems without requiring overlapping entities, reframing cross-domain personalization around behavior-based semantic alignment and offering modularity and interpretability advantages over graph-alignment methods.
Major comments (2)
- [Experiments] Experiments section: the reported improvements over baselines lack an ablation that replaces LLM-retrieved Community Source Personas with randomly sampled source communities of equal size (or non-semantic matching criteria). Without this test, it is impossible to confirm that the semantic retrieval step is load-bearing for the gains rather than the dual-tower + fusion gate architecture alone.
- [Method] Method section: the description of persona generation and Community Source Persona retrieval provides no implementation details (e.g., exact LLM prompts, retrieval similarity metric, or community size selection criteria), preventing assessment of whether the semantic component reliably captures transferable behavioral similarity as claimed.
Minor comments (1)
- [Abstract] Abstract and evaluation description: the claim of 'consistent improvements' and the density observation are stated without reporting effect sizes, statistical significance tests, or error bars, which should be added for clarity.
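One way to report an effect size with uncertainty, as the minor comment requests, is a paired bootstrap over users. The sketch below assumes per-user metric values are available for a baseline and for its SPHERE-augmented variant. The function name `paired_bootstrap_ci` is hypothetical, and the numbers are made-up placeholders, not results from the paper.

```python
import random


def paired_bootstrap_ci(baseline, augmented, n_resamples=2000, alpha=0.05, seed=0):
    """Mean per-user improvement with a percentile bootstrap interval.

    baseline, augmented: per-user metric values in the same user order.
    Returns (mean_difference, lower_bound, upper_bound).
    """
    if len(baseline) != len(augmented):
        raise ValueError("inputs must be paired per user")
    diffs = [a - b for a, b in zip(augmented, baseline)]
    n = len(diffs)
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Resample users with replacement, keeping each pair together.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(diffs) / n, lower, upper


# Placeholder per-user scores for illustration only.
baseline = [0.10, 0.20, 0.00, 0.30, 0.10, 0.20]
augmented = [0.15, 0.25, 0.05, 0.30, 0.20, 0.20]
mean_diff, lower, upper = paired_bootstrap_ci(baseline, augmented)
print(mean_diff, lower, upper)
```

An interval that excludes zero suggests the improvement is unlikely to be a resampling artifact. Resampling pairs rather than individual scores preserves the per-user dependence between the two models.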
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will incorporate revisions to strengthen the paper.
Point-by-point responses
- Referee: [Experiments] Experiments section: the reported improvements over baselines lack an ablation that replaces LLM-retrieved Community Source Personas with randomly sampled source communities of equal size (or non-semantic matching criteria). Without this test, it is impossible to confirm that the semantic retrieval step is load-bearing for the gains rather than the dual-tower + fusion gate architecture alone.
  Authors: We agree that an ablation isolating the semantic retrieval component is necessary to substantiate our claims. In the revised manuscript, we will add an ablation study replacing the LLM-retrieved Community Source Personas with randomly sampled source communities of equal size (and non-semantic matching where feasible) while keeping the dual-tower and fusion gate fixed. This will directly test whether the semantic step drives the observed gains.
  Revision: yes
- Referee: [Method] Method section: the description of persona generation and Community Source Persona retrieval provides no implementation details (e.g., exact LLM prompts, retrieval similarity metric, or community size selection criteria), preventing assessment of whether the semantic component reliably captures transferable behavioral similarity as claimed.
  Authors: We acknowledge that the current method description lacks sufficient implementation details for full reproducibility and verification. In the revised version, we will expand the Method section (or add an appendix) with the exact LLM prompts used for persona generation, the retrieval similarity metric (cosine similarity on sentence embeddings), and the community size selection criteria (e.g., top-k based on behavioral similarity thresholds).
  Revision: yes
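The two responses above name cosine similarity with top-k selection for retrieval, and a random control of equal size for the ablation. A minimal sketch of both follows, under stated assumptions: the embeddings are toy 2-D vectors standing in for sentence embeddings, and the names `retrieve_top_k`, `random_control`, and the community ids are hypothetical.

```python
import math
import random


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(query, communities, k):
    """Ids of the k source communities most similar to the query embedding."""
    ranked = sorted(
        communities,
        key=lambda cid: cosine(query, communities[cid]),
        reverse=True,
    )
    return ranked[:k]


def random_control(communities, k, seed=0):
    """Ablation control: k communities drawn uniformly, ignoring semantics."""
    rng = random.Random(seed)
    return rng.sample(sorted(communities), k)


# Toy stand-ins for source-community persona embeddings.
communities = {
    "c1": [1.0, 0.0],
    "c2": [0.9, 0.1],
    "c3": [0.0, 1.0],
    "c4": [-1.0, 0.0],
}
target_persona = [1.0, 0.05]

semantic = retrieve_top_k(target_persona, communities, k=2)
control = random_control(communities, k=2)
print(semantic)  # ['c1', 'c2']
print(control)
```

Feeding `semantic` and `control` through the same downstream model, with every other component fixed, isolates the contribution of the retrieval step.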
Circularity Check
No significant circularity in methodological proposal
Full rationale
The paper presents SPHERE as an empirical design artifact relying on LLM-induced semantic personas, community retrieval, dual-tower fusion, and standard backbones evaluated on Amazon/Goodreads/Steam datasets. No equations, fitted parameters, or self-citations are invoked as load-bearing derivations; the method description treats LLM capabilities and neural architectures as external primitives. The central transfer claim is tested via full-ranking improvements rather than reducing to input definitions or prior author results by construction.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Large language models can induce a shared behavioral vocabulary from user interactions across unrelated domains that supports accurate persona generation and community retrieval.
Invented entities (2)
- Semantic Personas: no independent evidence
- Community Source Persona: no independent evidence