arxiv: 2606.13533 · v2 · pith:NGOC4W5Bnew · submitted 2026-06-11 · 💻 cs.IR

OneRetrieval: Unifying Multi-Branch E-commerce Retrieval with an Editable Generative Model

Pith reviewed 2026-06-27 05:19 UTC · model grok-4.3

classification 💻 cs.IR
keywords e-commerce retrieval · generative retrieval · editable generative model · inverted index · keyword-aligned encoding · multi-branch retrieval · codebook slots · real-time editability

The pith

OneRetrieval unifies multi-branch e-commerce retrieval into one generative model that preserves the inverted index's real-time editability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents OneRetrieval as a generative retrieval framework that collapses the multi-branch retrieval stage of industrial e-commerce search into a single model. It achieves this through Keyword-Aligned Encoding, which assigns each code position to an interpretable attribute word, combined with reserved codebook slots that accept new terms after deployment. The method maintains recall quality comparable to strong baselines while enabling edits without retraining, addressing why the inverted-index branch has persisted despite lower average performance. On real traffic data the system shows high intervention hit rates and online gains in order volume and CTR when replacing or extending existing branches.

Core claim

OneRetrieval is the first editable generative retrieval method that pairs competitive recall quality with the editability of the inverted index, achieved via Keyword-Aligned Encoding that ties each identifier position to an interpretable attribute word. An information-theoretic merging organizes 18 attribute categories into six codebook groups with non-uniform capacity; reserved slots in each codebook can be bound to new words after deployment without retraining; and a four-stage fine-tuning pipeline secures quality and editability jointly. On five million real-traffic requests it matches the deep recall of the strongest generative baseline while delivering an intervention hit rate over an o

What carries the argument

Keyword-Aligned Encoding (KAE), which ties each identifier position to an interpretable attribute word and uses reserved slots in codebooks to support post-deployment binding of new words.
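
The reserved-slot idea can be illustrated with a toy data structure. The sketch below is not the paper's implementation: the class `SlotCodebook`, the function `encode`, the three group names, and the codebook size are all hypothetical, chosen only to show how a new word might be bound to a free slot by editing a lookup table rather than model weights. How the deployed model learns to emit a newly bound slot is outside this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical group names; the paper describes six codebook groups.
GROUPS = ["category", "brand", "color"]


@dataclass
class SlotCodebook:
    """One codebook group with a fixed number of slots, some left unbound."""
    size: int
    word_to_slot: Dict[str, int] = field(default_factory=dict)
    slot_to_word: Dict[int, str] = field(default_factory=dict)

    def bind(self, word: str) -> int:
        """Bind a word to the lowest free slot and return the slot index."""
        if word in self.word_to_slot:
            return self.word_to_slot[word]
        for slot in range(self.size):
            if slot not in self.slot_to_word:
                self.word_to_slot[word] = slot
                self.slot_to_word[slot] = word
                return slot
        raise RuntimeError("no reserved slots left in this codebook")

    def free_slots(self) -> int:
        """Number of slots still available for post-deployment binding."""
        return self.size - len(self.slot_to_word)


def encode(attrs: Dict[str, Optional[str]],
           codebooks: Dict[str, SlotCodebook]) -> Tuple[int, ...]:
    """Map attribute words to one slot index per group (-1 if unbound)."""
    return tuple(
        codebooks[g].word_to_slot.get(attrs.get(g), -1) for g in GROUPS
    )


# Toy usage: bind a "training-time" vocabulary, then add a new brand later.
demo = {g: SlotCodebook(size=4) for g in GROUPS}
for group, word in [("category", "shoes"), ("brand", "nike"), ("color", "red")]:
    demo[group].bind(word)
item = {"category": "shoes", "brand": "acme", "color": "red"}
print(encode(item, demo))   # brand position is unbound before the edit
demo["brand"].bind("acme")  # the "edit": a table update, no retraining
print(encode(item, demo))
```

In this toy layout, binding `acme` changes only the `brand` table, so identifiers of items that were already encoded stay the same.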

If this is right

  • OneRetrieval matches the recall of the strongest generative baseline on five million real-traffic requests.
  • It achieves an intervention hit rate more than ten times higher than closed-codebook encodings.
  • Replacing the inverted-index branch with OneRetrieval lifts order volume in online A/B tests.
  • Extending OneRetrieval to nearly the entire retrieval stage holds conversion rate while raising CTR.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The reserved-slot mechanism could be tested in non-e-commerce retrieval settings where vocabulary changes frequently, such as news or social media search.
  • Joint optimization across what were previously separate branches may reduce the engineering cost of maintaining hand-tuned fusion rules in other large-scale systems.
  • If the attribute-word alignment generalizes, similar encodings might allow generative models to support rapid updates in domains beyond product catalogs.

Load-bearing premise

Tying each identifier position to an interpretable attribute word preserves retrieval quality while allowing new words to bind to reserved slots after deployment without retraining or quality loss.

What would settle it

Measure recall on queries that use a newly bound term before and after binding the term to a reserved slot; if recall on those queries drops below the original multi-branch system, the editability claim does not hold.
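
The check described above can be written down as a small evaluation harness. This is a minimal sketch under stated assumptions: the function names, the dictionary layout of runs and relevance judgments, and the choice of recall@k as the metric are illustrative, and the three runs (`before`, `after`, `baseline`) would have to be produced by the actual systems on queries that use the newly bound term.

```python
from typing import Dict, List, Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall(run: Dict[str, List[str]],
                qrels: Dict[str, Set[str]], k: int) -> float:
    """Average recall@k over all judged queries; missing queries score 0."""
    scores = [recall_at_k(run.get(q, []), rel, k) for q, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


def compare_runs(before: Dict[str, List[str]],
                 after: Dict[str, List[str]],
                 baseline: Dict[str, List[str]],
                 qrels: Dict[str, Set[str]],
                 k: int = 100) -> Dict[str, object]:
    """Recall before/after binding, plus the multi-branch baseline.

    `claim_survives` is False when post-binding recall falls below the
    baseline, which is the failure condition stated in the text.
    """
    r_before = mean_recall(before, qrels, k)
    r_after = mean_recall(after, qrels, k)
    r_base = mean_recall(baseline, qrels, k)
    return {
        "before": r_before,
        "after": r_after,
        "baseline": r_base,
        "claim_survives": r_after >= r_base,
    }
```

Reporting all three numbers, rather than only the verdict, also shows how much of the post-binding recall comes from the edit itself.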

Figures

Figures reproduced from arXiv: 2606.13533 by Ben Chen, Chenyi Lei, Huangyu Dai, Kun Gai, Lingtao Mao, Siyuan Wang, Tong Zhao, Wenwu Ou, Xinyu Sun, Xuxin Zhang, Ying Yang, Yue Lv, Yufei Ma, Yupeng Li, Zhipeng Qian, Zihan Liang.

Figure 1: Overview of OneRetrieval. (a) KAE maps each item and query to a six-token semantic identifier through a shared ...
Figure 2: Cumulative information loss versus target group ...
Figure 3: Attribute statistics and codebook layout. (a) ...
Figure 4: Length axis of the codebook design. Order HR ...
Figure 5: Online CTR relative gains across the top ...
Original abstract

Industrial e-commerce search serves hundreds of millions of items through a multi-branch retrieval stage fused by hand-tuned merging without joint optimization. Generative retrieval (GR) raises the prospect of collapsing this stage into a single model, yet unification is gated by more than retrieval quality: the inverted-index branch converts below the platform average yet persists because it is almost the only branch where operations can inject a new term within hours without any model update; a one-model substitute must preserve this real-time editability. Existing GR methods structurally lack it: closed-codebook methods fix each slot to a quantized embedding at training, while open-vocabulary methods leave new-term routing to model generalization. We present OneRetrieval, a one-model GR framework built on Keyword-Aligned Encoding (KAE), which ties each identifier position to an interpretable attribute word, pairing competitive recall quality with the editability of the inverted index -- to our knowledge the first editable generative retrieval method. An information-theoretic merging organizes 18 attribute categories into six codebook groups with non-uniform capacity; reserved slots in each codebook can be bound to new words after deployment without retraining; and a four-stage fine-tuning pipeline secures quality and editability jointly. On five million real-traffic requests, OneRetrieval matches the deep recall of the strongest generative baseline, with an intervention hit rate over an order of magnitude above closed-codebook encodings. Online, replacing the inverted-index branch significantly lifts order volume; extending to nearly the entire stage holds conversion while improving CTR. The system is deployed at Kuaishou, serving hundreds of millions of PVs daily.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces OneRetrieval, a generative retrieval framework for e-commerce search that uses Keyword-Aligned Encoding (KAE) to tie each identifier position to an interpretable attribute word, organizes attributes into six codebook groups with reserved slots for post-deployment binding of new terms, and applies a four-stage fine-tuning pipeline to jointly achieve competitive recall quality and editability without retraining. On five million real-traffic requests it matches the strongest generative baseline on deep recall while reporting intervention hit rates over an order of magnitude higher than closed-codebook methods; online A/B tests show lifts when replacing the inverted-index branch and maintained conversion with improved CTR when extending to nearly the full stage.

Significance. If the central claim of editability without quality loss holds under direct verification, the work would be significant for industrial retrieval: it offers the first generative method that preserves the real-time term-injection capability of inverted indexes while unifying multi-branch systems under a single optimized model, with demonstrated online business impact at scale.

major comments (2)
  1. [Abstract / Results] Abstract and experimental evaluation: the claim that binding new words to reserved slots incurs no quality loss (and requires no retraining) is load-bearing for the central contribution yet is supported only by intervention hit-rate numbers; no before/after recall, NDCG, or hit-rate metrics on a fixed item set after binding are reported, leaving the 'without quality loss' half of the editability claim unverified.
  2. [Methods] Methods (KAE and codebook construction): the information-theoretic merging of 18 attribute categories into six groups with non-uniform capacity is presented as enabling both quality and editability, but no ablation quantifies the contribution of the merging rule versus uniform allocation or versus the reserved-slot mechanism itself.
minor comments (2)
  1. [Abstract] The abstract states 'an order of magnitude above closed-codebook encodings' for intervention hit rate but supplies neither the exact numerical values nor the identity of the closed-codebook baseline.
  2. [Experimental evaluation] No error bars, confidence intervals, or data-exclusion rules are mentioned for the five-million-request offline results or the online lifts.
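
One common way to attach uncertainty to an offline hit rate is a percentile bootstrap over requests. The sketch below is generic and is not taken from the paper; `bootstrap_ci` and its parameters are hypothetical names, and the input is assumed to be one 0/1 outcome per request.

```python
import random
from typing import List, Tuple


def bootstrap_ci(outcomes: List[int],
                 n_boot: int = 2000,
                 alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) for a hit rate.

    `outcomes` holds 1 for a hit and 0 for a miss, one entry per request.
    The interval is the percentile bootstrap at level 1 - alpha.
    """
    if not outcomes:
        raise ValueError("outcomes must be non-empty")
    rng = random.Random(seed)
    n = len(outcomes)
    point = sum(outcomes) / n
    means = []
    for _ in range(n_boot):
        # Resample requests with replacement and recompute the hit rate.
        sample = rng.choices(outcomes, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper
```

At the scale of five million requests a pure-Python resampling loop like this would be slow, so a vectorized or closed-form interval would be the practical choice; the sketch only shows what such an interval measures.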

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below, agreeing that the points identify areas where additional evidence would strengthen the central claims, and we outline revisions to incorporate the suggested verifications.

Point-by-point responses
  1. Referee: [Abstract / Results] Abstract and experimental evaluation: the claim that binding new words to reserved slots incurs no quality loss (and requires no retraining) is load-bearing for the central contribution yet is supported only by intervention hit-rate numbers; no before/after recall, NDCG, or hit-rate metrics on a fixed item set after binding are reported, leaving the 'without quality loss' half of the editability claim unverified.

    Authors: We agree that direct before-and-after metrics on a fixed item set would provide stronger verification of the no-quality-loss claim. The reported intervention hit rates confirm that new terms can be bound and retrieved at high rates, but they do not quantify any potential degradation in overall retrieval quality post-binding. In the revised manuscript we will add an experiment that binds new words to reserved slots, then reports recall, NDCG, and hit-rate on the same fixed item set before and after binding. revision: yes

  2. Referee: [Methods] Methods (KAE and codebook construction): the information-theoretic merging of 18 attribute categories into six groups with non-uniform capacity is presented as enabling both quality and editability, but no ablation quantifies the contribution of the merging rule versus uniform allocation or versus the reserved-slot mechanism itself.

    Authors: We acknowledge that an ablation isolating the merging rule would be valuable for quantifying its contribution relative to uniform allocation or the reserved-slot design. The information-theoretic grouping was motivated by balancing capacity according to attribute entropy and frequency, yet without explicit comparisons the benefit remains unmeasured. We will add an ablation study in the revision that compares the proposed merging against uniform capacity allocation and against variants that omit reserved slots, reporting effects on both retrieval quality and editability metrics. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on architectural description without self-referential reductions

Full rationale

The manuscript text (abstract and summary) presents OneRetrieval as an architectural framework relying on Keyword-Aligned Encoding, reserved codebook slots, and a four-stage fine-tuning pipeline. No equations, derivations, or parameter-fitting steps are exhibited that reduce by construction to their own inputs. No self-citations are invoked as load-bearing uniqueness theorems, and no ansatzes or renamings of prior results are smuggled in. The editability claim is asserted as a direct consequence of the described design choices rather than derived from fitted quantities or prior author work. This is the normal case of a self-contained systems paper whose central assertions remain open to external verification.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the framework relies on standard information-theoretic grouping and reserved slots whose details are not specified.
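
Because the grouping rule is not specified in the abstract, the following is only a generic illustration of how an entropy argument could lead to non-uniform codebook capacity. It is not the paper's merging procedure: `shannon_entropy` is the standard definition, while `allocate_capacity`, the proportional-to-`2**H` rule, and the example attribute names are assumptions made for this sketch.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def shannon_entropy(values: Iterable[str]) -> float:
    """Shannon entropy, in bits, of the empirical value distribution."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def allocate_capacity(entropies: Dict[str, float],
                      total_slots: int) -> Dict[str, int]:
    """Split slots across groups in proportion to 2**H.

    2**H is the effective number of equally likely values, so
    higher-entropy groups receive more slots. Rounding means the
    result may not sum exactly to `total_slots`.
    """
    weights = {g: 2.0 ** h for g, h in entropies.items()}
    norm = sum(weights.values())
    return {g: max(1, round(total_slots * w / norm))
            for g, w in weights.items()}


# Toy usage with two hypothetical attribute groups.
observed = {
    "brand": ["a", "b", "c", "d"],            # four equally likely values
    "color": ["red", "red", "blue", "blue"],  # two equally likely values
}
entropies = {g: shannon_entropy(v) for g, v in observed.items()}
print(allocate_capacity(entropies, total_slots=60))
```

Under this assumed rule, a group whose values are spread over more distinct words gets a larger codebook, which is one plausible reading of "non-uniform capacity".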
