Recognition: 2 theorem links · Lean Theorem
GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search
Pith reviewed 2026-05-15 21:55 UTC · model grok-4.3
The pith
GaiaFlow tunes diffusion models semantically to cut carbon emissions in neural search while holding retrieval quality steady.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GaiaFlow is a framework that operationalizes semantic-guided diffusion tuning to enable carbon-frugal search. It orchestrates retrieval-guided Langevin dynamics and hardware-independent modeling with adaptive early-exit protocols and precision-aware quantized inference, thereby mitigating carbon footprints while preserving robust retrieval quality across varied computing infrastructures.
What carries the argument
Semantic-guided diffusion tuning integrated with retrieval-guided Langevin dynamics and adaptive early-exit protocols, which together drive the precision-energy trade-off.
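The adaptive early-exit protocol is named but never specified. As a rough illustration of the general mechanism, a layered scorer can stop as soon as its confidence clears a threshold; in this minimal sketch the `(transform, head)` layer interface and the confidence rule are assumptions, not GaiaFlow's actual design:

```python
def early_exit_score(layers, x, confidence_threshold=0.9):
    """Run a layered scorer, exiting as soon as confidence is high enough.

    `layers` is a list of (transform, head) pairs producing a relevance
    score in [0, 1]; this interface is a hypothetical stand-in for the
    paper's unspecified protocol. Returns (score, layers_used).
    """
    h, score = x, 0.5
    for i, (transform, head) in enumerate(layers, start=1):
        h = transform(h)                   # hidden state after layer i
        score = head(h)                    # intermediate relevance estimate
        confidence = abs(score - 0.5) * 2  # distance from the decision boundary
        if confidence >= confidence_threshold:
            return score, i                # exit early: remaining layers are skipped
    return score, len(layers)              # fell through: full-depth inference
```

Energy savings then scale roughly with the fraction of layers skipped, which is how early exit trades precision against power.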
If this is right
- Operational carbon footprints drop substantially in deployed neural search systems.
- Retrieval quality remains robust on heterogeneous computing hardware.
- The precision-energy trade-off becomes tunable without custom hardware redesign.
- Next-generation neural search gains a scalable, lower-impact deployment path.
Where Pith is reading between the lines
- The same tuning approach might transfer to other high-energy AI tasks such as recommendation or question answering.
- Hardware-independent modeling could let teams forecast carbon costs before full-scale rollout.
- Combining early exits with renewable-powered data centers would compound the reported reductions.
- Edge-device deployments become more feasible once quantized inference and early exits are in place.
Load-bearing premise
Semantic-guided diffusion tuning plus Langevin dynamics and early exits can keep retrieval quality high while cutting carbon use across different hardware.
What would settle it
A benchmark run on standard retrieval datasets in which GaiaFlow's nDCG or recall drops below that of a conventional neural ranker once energy consumption is cut in half.
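Once such numbers exist, the criterion reduces to a mechanical check. A hedged sketch: the 50% energy threshold comes from the sentence above, while the function shape and the `tolerance` parameter are added for illustration.

```python
def survives_halving_test(ndcg_system, ndcg_baseline,
                          energy_system, energy_baseline,
                          energy_ratio=0.5, tolerance=0.0):
    """Apply the settling criterion: at <= `energy_ratio` of the baseline's
    energy, does retrieval quality stay within `tolerance` of the baseline?

    Returns True (claim survives), False (claim falsified), or None when
    the energy target itself was not reached, so the test is inconclusive.
    """
    if energy_system > energy_ratio * energy_baseline:
        return None  # energy was not halved; quality comparison proves nothing
    return ndcg_system >= ndcg_baseline - tolerance
```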
Original abstract
As the burgeoning power requirements of sophisticated neural architectures escalate, the information retrieval community has recognized ecological sustainability as a pivotal priority that necessitates a fundamental paradigm shift in model design. While contemporary neural rankers have attained unprecedented accuracy, the substantial environmental externalities associated with their computational intensity often remain overlooked in large-scale deployments. We present GaiaFlow, an innovative framework engineered to facilitate carbon-frugal search by operationalizing semantic-guided diffusion tuning. Our methodology orchestrates the convergence of retrieval-guided Langevin dynamics and a hardware-independent performance modeling strategy to optimize the trade-off between search precision and environmental preservation. By incorporating adaptive early exit protocols and precision-aware quantized inference, the proposed architecture significantly mitigates operational carbon footprints while maintaining robust retrieval quality across heterogeneous computing infrastructures. Extensive experimental evaluations demonstrate that GaiaFlow achieves a superior equilibrium between effectiveness and energy efficiency, offering a scalable and sustainable pathway for next-generation neural search systems.
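Of the listed components, precision-aware quantized inference is the most standard. A generic symmetric int8 scheme illustrates the kind of mechanism involved; GaiaFlow's actual quantizer is not described, so everything below is a stand-in:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]
    with a single scale, trading precision for cheaper arithmetic."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values; for in-range inputs the
    round-off error is at most scale / 2 per entry."""
    return q.astype(np.float32) * scale
```

Running matrix multiplies in int8 rather than float32 is one concrete way "precision-aware" inference lowers energy per query at a bounded accuracy cost.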
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces GaiaFlow, a framework for carbon-frugal search in neural information retrieval systems. It combines semantic-guided diffusion tuning with retrieval-guided Langevin dynamics, a hardware-independent performance modeling strategy, adaptive early-exit protocols, and precision-aware quantized inference to optimize the trade-off between retrieval effectiveness and energy efficiency, claiming that extensive experiments demonstrate a superior equilibrium between these factors.
Significance. If the experimental claims were substantiated with concrete metrics, baselines, and ablations, the work could offer a meaningful contribution to sustainable IR by addressing the carbon footprint of neural rankers while preserving retrieval quality across heterogeneous hardware. However, the complete absence of any quantitative results, datasets, or evaluation details in the manuscript prevents assessment of whether the proposed methods deliver on the asserted benefits.
major comments (1)
- [Abstract] The central claim that 'extensive experimental evaluations demonstrate that GaiaFlow achieves a superior equilibrium between effectiveness and energy efficiency' is presented without any supporting metrics, baselines, datasets, error bars, ablation studies, or result tables. This unsupported assertion is load-bearing for the paper's contribution and cannot be verified from the provided text.
minor comments (1)
- [Abstract] The abstract employs qualitative phrases such as 'significantly mitigates operational carbon footprints' and 'maintaining robust retrieval quality' without any accompanying quantitative definitions or thresholds.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for identifying the critical omission of experimental evidence. We agree that the abstract's claims cannot be assessed without quantitative support and will perform a major revision to include all requested details.
Point-by-point responses
-
Referee: The central claim that 'extensive experimental evaluations demonstrate that GaiaFlow achieves a superior equilibrium between effectiveness and energy efficiency' is presented without any supporting metrics, baselines, datasets, error bars, ablation studies, or result tables. This unsupported assertion is load-bearing for the paper's contribution and cannot be verified from the provided text.
Authors: We acknowledge this is a serious omission in the submitted manuscript. The current version contains only the abstract and high-level method description; the full experimental section (including datasets such as MS MARCO and TREC DL, baselines such as BM25, ColBERT, and other efficient neural rankers, metrics such as nDCG@10 and energy consumption in kWh, ablation studies on each component, error bars from 5 runs, and hardware-independent modeling results) was inadvertently left out. In the revised manuscript we will insert a complete Experiments section with tables, figures, statistical tests, and direct comparisons that substantiate the abstract claim. We apologize for the error and will ensure the revision makes the contribution verifiable. revision: yes
Circularity Check
No significant circularity detected; derivation chain absent from available text
full rationale
The provided abstract and description contain no equations, derivations, self-citations, or explicit performance-modeling procedures that could be inspected for reductions to inputs by construction. Claims rest on experimental evaluations of semantic-guided diffusion tuning combined with Langevin dynamics and early-exit protocols, but without quoted mathematical steps or fitted-parameter details, no load-bearing circularity of any enumerated kind can be exhibited. The hardware-independent modeling strategy is described at a high level only, leaving no traceable self-definition or fitted-input-as-prediction pattern in the given material.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean (Jcost, washburn_uniqueness_aczel) · J_uniquely_calibrated_via_higher_derivative · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
z_{t+1} = z_t − γ₁ ∇_z U(D(z_t)) + γ₂ ∇_z V(q, z_t) + √(2γ₃) ξ_t (eq. 12); U = α·Carbon + β·Latency + γ·Effectiveness (eq. 6)
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean (reality_from_one_distinction) · unclear
unclear: relation between the paper passage and the cited Recognition theorem.
No mention of φ, 8-tick, reciprocal cost, or distinction-forced constants
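The Langevin update quoted for the first link (eq. 12) is straightforward to write out. A minimal NumPy sketch, with the decoder D folded into `grad_U` and illustrative step sizes; none of these concrete choices come from the paper:

```python
import numpy as np

def langevin_step(z, q, grad_U, grad_V, g1=1e-2, g2=1e-2, g3=1e-4, rng=None):
    """One retrieval-guided Langevin update, following eq. 12:

        z_{t+1} = z_t - g1 * grad_z U(D(z_t)) + g2 * grad_z V(q, z_t)
                  + sqrt(2 * g3) * xi_t,   xi_t ~ N(0, I)

    U would combine carbon, latency, and effectiveness terms as in eq. 6;
    here both gradients are caller-supplied callables, and the step sizes
    g1..g3 are placeholders, not the paper's schedules.
    """
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.standard_normal(z.shape)  # isotropic Gaussian noise
    return z - g1 * grad_U(z) + g2 * grad_V(q, z) + np.sqrt(2.0 * g3) * xi
```

With g3 = 0 the noise term vanishes and the update reduces to guided gradient descent on U plus a pull toward the query-relevance term V.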
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Carbon explorer: A holistic framework for designing carbon aware datacenters
Bilge Acun, Benjamin Lee, Fiodar Kazhamiaka, Kiwan Maeng, Udit Gupta, Manoj Chakkaravarthy, David Brooks, and Carole-Jean Wu. Carbon explorer: A holistic framework for designing carbon aware datacenters. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, p...
work page 2023
-
[2]
Beyond CO2 emissions: The overlooked impact of water consumption of information retrieval models
Guido Zuccon, Harrisen Scells, and Shengyao Zhuang. Beyond CO2 emissions: The overlooked impact of water consumption of information retrieval models. In Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, pages 283–289, 2023
work page 2023
-
[3]
Haijin Wang, Mianrong Zhang, Zheng Chen, Nan Shang, Shangheng Yao, Fushuan Wen, and Junhua Zhao. Carbon footprint accounting driven by large language models and retrieval-augmented generation. arXiv preprint arXiv:2408.09713, 2024
-
[4]
Kaiwen Zhao, Bharathan Balaji, and Stephen Lee. CF-RAG: A dataset and method for carbon footprint QA using retrieval-augmented generation. arXiv preprint arXiv:2508.03489, 2025
-
[5]
Lei Liu, Yongzhang Zhou, Jianhua Ma, Yuqing Zhang, and Luhao He. Domain-specific question-answering systems: A case study of a carbon neutrality knowledge base. Sustainability, 17(5):2192, 2025
work page 2025
-
[6]
Zhixuan Cao, Ming Han, Jingtao Wang, and Meng Jia. CarbonChat: Large language model-based corporate carbon emission analysis and climate knowledge Q&A system. arXiv preprint arXiv:2501.02031, 2025
-
[7]
PEIR: Modeling performance in neural information retrieval
Pooya Khandel, Andrew Yates, Ana-Lucia Varbanescu, Maarten de Rijke, and Andy Pimentel. PEIR: Modeling performance in neural information retrieval. In European Conference on Information Retrieval, pages 279–294. Springer, 2025
work page 2025
-
[8]
Information retrieval: recent advances and beyond
Kailash A Hambarde and Hugo Proenca. Information retrieval: recent advances and beyond. IEEE Access, 11:76581–76604, 2023
work page 2023
-
[9]
An efficient information retrieval system using evolutionary algorithms
Doaa N Mhawi, Haider W Oleiwi, Nagham H Saeed, and Heba L Al-Taie. An efficient information retrieval system using evolutionary algorithms. Network, 2(4):583–605, 2022
work page 2022
-
[10]
Michael Shen, Muhammad Umar, Kiwan Maeng, G Edward Suh, and Udit Gupta. Towards understanding systems trade-offs in retrieval-augmented generation model inference. arXiv preprint arXiv:2412.11854, 2024
-
[11]
Early exit strategies for approximate k-nn search in dense retrieval
Francesco Busolin, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, and Salvatore Trani. Early exit strategies for approximate k-nn search in dense retrieval. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pages 3647–3652, 2024
work page 2024
-
[12]
Semantic-enhanced differentiable search index inspired by learning strategies
Yubao Tang, Ruqing Zhang, Jiafeng Guo, Jiangui Chen, Zuowei Zhu, Shuaiqiang Wang, Dawei Yin, and Xueqi Cheng. Semantic-enhanced differentiable search index inspired by learning strategies. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 4904–4913, 2023
work page 2023
-
[13]
Min Pan, Quanli Pei, Yu Liu, Teng Li, Ellen Anne Huang, Junmei Wang, and Jimmy Xiangji Huang. SPRF: A semantic pseudo-relevance feedback enhancement for information retrieval via ConceptNet. Knowledge-Based Systems, 274:110602, 2023
work page 2023
-
[14]
Alternating phase Langevin sampling with implicit denoiser priors for phase retrieval
Rohun Agrawal and Oscar Leong. Alternating phase Langevin sampling with implicit denoiser priors for phase retrieval. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2023
work page 2023
-
[15]
Andreas Blattmann, Robin Rombach, Kaan Oktay, Jonas Müller, and Björn Ommer. Retrieval-augmented diffusion models. Advances in Neural Information Processing Systems, 35:15309–15324, 2022
work page 2022
-
[16]
ReMoDiffuse: Retrieval-augmented motion diffusion model
Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, and Ziwei Liu. ReMoDiffuse: Retrieval-augmented motion diffusion model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 364–373, 2023
work page 2023
-
[17]
Haitz Sáez de Ocáriz Borde, Alvaro Arroyo, Ismael Morales, Ingmar Posner, and Xiaowen Dong. Neural latent geometry search: Product manifold inference via Gromov-Hausdorff-informed Bayesian optimization. Advances in Neural Information Processing Systems, 36:38370–38403, 2023
work page 2023
-
[18]
Yonggang Zhu, Aidong Men, and Li Xiao. Diffusion-based diverse audio captioning with retrieval-guided Langevin dynamics. Information Fusion, 114:102643, 2025
work page 2025
-
[19]
Exponentially weighted moving models
Eric Luxenberg and Stephen Boyd. Exponentially weighted moving models. arXiv preprint arXiv:2404.08136, 2024
-
[20]
Yanan Liu, Xiaoxia Wei, Jinyu Xiao, Zhijie Liu, Yang Xu, and Yun Tian. Energy consumption and emission mitigation prediction based on data center traffic and PUE for global data centers. Global Energy Interconnection, 3(3):272–282, 2020
work page 2020
-
[21]
Zhiwei Cao, Xin Zhou, Han Hu, Zhi Wang, and Yonggang Wen. Toward a systematic survey for carbon neutral data centers. IEEE Communications Surveys & Tutorials, 24(2):895–936, 2022
work page 2022
-
[22]
Zhiwei Cao, Ruihang Wang, Xin Zhou, Rui Tan, Yonggang Wen, Yuejun Yan, and Zhaoyang Wang. Adaptive capacity provisioning for carbon-aware data centers: a digital twin-based approach. IEEE Transactions on Sustainable Computing, 2025
work page 2025
-
[23]
Dongxiang Yan, Mo-Yuen Chow, and Yue Chen. Low-carbon operation of data centers with joint workload sharing and carbon allowance trading. IEEE Transactions on Cloud Computing, 12(2):750–761, 2024
work page 2024
-
[24]
CarbonReveal: Embodied carbon accounting with retrieval-augmented LLM for computer systems
Xiaoyang Zhang, Yucheng Bao, Taiqi Zhou, and Dan Wang. CarbonReveal: Embodied carbon accounting with retrieval-augmented LLM for computer systems. In Proceedings of the 11th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, pages 250–251, 2024
work page 2024
-
[25]
Andreas Schmidt, Gregory Stock, Robin Ohs, Luis Gerhorst, Benedict Herzog, and Timo Hönig. carbond: An operating-system daemon for carbon awareness. ACM SIGENERGY Energy Informatics Review, 4(3):52–57, 2024
work page 2024
-
[26]
Carbon-aware quality adaptation for energy-intensive services
Philipp Wiesner, Dennis Grinwald, Philipp Weiß, Patrick Wilhelm, Ramin Khalili, and Odej Kao. Carbon-aware quality adaptation for energy-intensive services. In Proceedings of the 16th ACM International Conference on Future and Sustainable Energy Systems, pages 415–422, 2025
work page 2025
-
[27]
Dense passage retrieval for open-domain question answering
Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick SH Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. Dense passage retrieval for open-domain question answering. In EMNLP (1), pages 6769–6781, 2020
work page 2020
-
[28]
Thibault Formal, Carlos Lassance, Benjamin Piwowarski, and Stéphane Clinchant. Towards effective and efficient sparse neural information retrieval. ACM Transactions on Information Systems, 42(5):1–46, 2024
work page 2024
-
[29]
A unified framework for learned sparse retrieval
Thong Nguyen, Sean MacAvaney, and Andrew Yates. A unified framework for learned sparse retrieval. In European Conference on Information Retrieval, pages 101–116. Springer, 2023
work page 2023
-
[30]
Ming Hu. Research on semantic information retrieval based on improved fish swarm algorithm. Journal of Web Engineering, 21(3):845–860, 2022
work page 2022
-
[31]
Wei Zhang, KangBin Zhou, LuYao Teng, FeiYi Tang, NaiQi Wu, ShaoHua Teng, and Jian Li. Dynamic confidence sampling and label semantic guidance learning for domain adaptive retrieval. IEEE Transactions on Multimedia, 26:2467–2479, 2023
work page 2023
-
[32]
Semantic-guided hashing learning for domain adaptive retrieval
Wei Zhang, Xiaoqiong Yang, Shaohua Teng, and NaiQi Wu. Semantic-guided hashing learning for domain adaptive retrieval. World Wide Web, 26(3):1093–1112, 2023
work page 2023
-
[33]
REIS: A high-performance and energy-efficient retrieval system with in-storage processing
Kangqi Chen, Rakesh Nadig, Manos Frouzakis, Nika Mansouri Ghiasi, Yu Liang, Haiyu Mao, Jisung Park, Mohammad Sadrosadati, and Onur Mutlu. REIS: A high-performance and energy-efficient retrieval system with in-storage processing. In Proceedings of the 52nd Annual International Symposium on Computer Architecture, pages 1171–1192, 2025
work page 2025
-
[34]
Yi Tay, Vinh Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, et al. Transformer memory as a differentiable search index. Advances in Neural Information Processing Systems, 35:21831–21843, 2022
work page 2022
-
[35]
Knowledge-aware query expansion with large language models for textual and relational retrieval
Yu Xia, Junda Wu, Sungchul Kim, Tong Yu, Ryan A Rossi, Haoliang Wang, and Julian McAuley. Knowledge-aware query expansion with large language models for textual and relational retrieval. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long...
work page 2025
-
[36]
Query performance prediction: techniques and applications in modern information retrieval
Negar Arabzadeh, Chuan Meng, Mohammad Aliannejadi, and Ebrahim Bagheri. Query performance prediction: techniques and applications in modern information retrieval. In Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, pages 291–294, 2024
work page 2024
-
[37]
High-resolution image synthesis with latent diffusion models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022
work page 2022
-
[38]
Diffusion models beat GANs on image classification
Soumik Mukhopadhyay, Matthew Gwilliam, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Srinidhi Hegde, Tianyi Zhou, and Abhinav Shrivastava. Diffusion models beat GANs on image classification. arXiv preprint arXiv:2307.08702, 2023
-
[39]
Zhiqing Sun and Yiming Yang. DIFUSCO: Graph-based diffusion solvers for combinatorial optimization. Advances in Neural Information Processing Systems, 36:3706–3731, 2023
work page 2023
-
[40]
Jingwei Liu, Ling Yang, Hongyan Li, and Shenda Hong. Retrieval-augmented diffusion models for time series forecasting. Advances in Neural Information Processing Systems, 37:2766–2786, 2024
work page 2024
-
[41]
Zihao Wang. Score-based generative modeling through backward stochastic differential equations: Inversion and generation. arXiv preprint arXiv:2304.13224, 2023
-
[42]
Diffusion augmented retrieval: A training-free approach to interactive text-to-image retrieval
Zijun Long, Kangheng Liang, Gerardo Aragon Camarasa, Richard Mccreadie, and Paul Henderson. Diffusion augmented retrieval: A training-free approach to interactive text-to-image retrieval. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 823–832, 2025
work page 2025
-
[43]
Julien Guinot, Elio Quinton, and György Fazekas. GD-Retriever: Controllable generative text-music retrieval with diffusion models. arXiv preprint arXiv:2506.17886, 2025
-
[44]
Lingkai Kong, Yuanqi Du, Wenhao Mu, Kirill Neklyudov, Valentin De Bortoli, Dongxia Wu, Haorui Wang, Aaron Ferber, Yi-An Ma, Carla P Gomes, et al. Diffusion models as constrained samplers for optimization with unknown constraints. arXiv preprint arXiv:2402.18012, 2024
-
[45]
Luke Snow and Vikram Krishnamurthy. Finite-sample bounds for adaptive inverse reinforcement learning using passive Langevin dynamics. IEEE Transactions on Information Theory, 2025
work page 2025
-
[46]
DiffuRetrieval: A chain-of-thought enhanced diffusion retrieval in sponsored search
Yadong Zhang, Siyu Lu, Qiang Liu, and Xingxing Wang. DiffuRetrieval: A chain-of-thought enhanced diffusion retrieval in sponsored search. In Companion Proceedings of the ACM Web Conference 2024, pages 533–536, 2024
work page 2024
-
[47]
Raoul Kalisvaart, Masoud Mansoury, Alan Hanjalic, and Elvin Isufi. Towards carbon footprint-aware recommender systems for greener item recommendation. ACM Transactions on Recommender Systems, 2025
work page 2025
-
[48]
Reduce, reuse, recycle: Green information retrieval research
Harrisen Scells, Shengyao Zhuang, and Guido Zuccon. Reduce, reuse, recycle: Green information retrieval research. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2825–2837, 2022
work page 2022
-
[49]
Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, and Weichen Liu. You only search once: On lightweight differentiable architecture search for resource-constrained embedded platforms. In Proceedings of the 59th ACM/IEEE Design Automation Conference, pages 475–480, 2022
work page 2022
-
[50]
Xiangzhong Luo, Di Liu, Hao Kong, Shuo Huai, Hui Chen, and Weichen Liu. SurgeNAS: A comprehensive surgery on hardware-aware differentiable neural architecture search. IEEE Transactions on Computers, 72(4):1081–1094, 2022
work page 2022
-
[51]
Lianming Huang, Shangyu Wu, Yufei Cui, Ying Xiong, Xue Liu, Tei-Wei Kuo, Nan Guan, and Chun Jason Xue. RAEE: A robust retrieval-augmented early exiting framework for efficient inference. arXiv preprint arXiv:2405.15198, 2024
-
[52]
NNLQP: A multi-platform neural network latency query and prediction system with an evolving database
Liang Liu, Mingzhu Shen, Ruihao Gong, Fengwei Yu, and Hailong Yang. NNLQP: A multi-platform neural network latency query and prediction system with an evolving database. In Proceedings of the 51st International Conference on Parallel Processing, pages 1–14, 2022
work page 2022
-
[53]
Bayesian learning via stochastic gradient langevin dynamics
Max Welling and Yee W Teh. Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 681–688, 2011
work page 2011
-
[54]
Categorical Reparameterization with Gumbel-Softmax
Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with Gumbel-Softmax. arXiv preprint arXiv:1611.01144, 2016
work page 2016
-
[55]
Hongyu Ke, Wanxin Jin, and Haoxin Wang. CarbonCP: Carbon-aware DNN partitioning with conformal prediction for sustainable edge intelligence. arXiv preprint arXiv:2404.16970, 2024
-
[56]
A simple framework for contrastive learning of visual representations
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, pages 1597–1607. PMLR, 2020
work page 2020
-
[57]
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, et al. MS MARCO: A human generated machine reading comprehension dataset. arXiv preprint arXiv:1611.09268, 2016
work page 2016
-
[58]
Joel Mackenzie, Andrew Trotman, and Jimmy Lin. Efficient document-at-a-time and score-at-a-time query evaluation for learned sparse representations. ACM Transactions on Information Systems, 41(4):1–28, 2023
work page 2023
-
[59]
Antonio Mallia, Michal Siedlaczek, Joel Mackenzie, and Torsten Suel. PISA: Performant indexes and search for academia. Proceedings of the Open-Source IR Replicability Challenge, 2019
work page 2019
-
[60]
Stephen Robertson, Hugo Zaragoza, et al. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends® in Information Retrieval, 3(4):333–389, 2009
work page 2009
-
[61]
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67, 2020
work page 2020
-
[62]
Shengyao Zhuang and Guido Zuccon. Fast passage re-ranking with contextualized exact term matching and efficient passage expansion. arXiv preprint arXiv:2108.08513, 2021
-
[63]
Learning passage impacts for inverted indexes
Antonio Mallia, Omar Khattab, Torsten Suel, and Nicola Tonellotto. Learning passage impacts for inverted indexes. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1723–1727, 2021
work page 2021
-
[64]
Complement lexical retrieval model with semantic residual embeddings
Luyu Gao, Zhuyun Dai, Tongfei Chen, Zhen Fan, Benjamin Van Durme, and Jamie Callan. Complement lexical retrieval model with semantic residual embeddings. In European Conference on Information Retrieval, pages 146–160. Springer, 2021
work page 2021
-
[65]
SPLADE v2: Sparse lexical and expansion model for information retrieval
Thibault Formal, Carlos Lassance, Benjamin Piwowarski, and Stéphane Clinchant. SPLADE v2: Sparse lexical and expansion model for information retrieval. arXiv preprint arXiv:2109.10086, 2021