pith. machine review for the scientific record.

archive

Every paper Pith has read.

738 papers in cs.IR · page 1

  1. cs.AI 2026-05-14 reviewed
    Citations miss key context in agent graph answers

    Why Neighborhoods Matter: Traversal Context and Provenance in Agentic GraphRAG

    Maximilian von Zastrow +2

  2. stat.ML 2026-05-14 reviewed
    Optimal logging policies minimize OPE error via reward-coverage balance

    Logging Policy Design for Off-Policy Evaluation

    Connor Douglas +2

  3. cs.AI 2026-05-14 reviewed
    The paper presents a fixed six-stage deterministic workflow that confines language model…

    A Deterministic Agentic Workflow for HS Tariff Classification: Multi-Dimensional Rule Reasoning with Interpretable Decisions

    Dongjiang Zhuang +6

  4. cs.CV 2026-05-14 reviewed
    Aggregated vectors make different financial docs look identical

    A Picture is Worth a Thousand Words? An Empirical Study of Aggregation Strategies for Visual Financial Document Retrieval

    Ho Hung Lim +1

  5. cs.IR 2026-05-14 reviewed
    AsymRec raises generative recommender accuracy 15.8%

    Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization

    Bin Huang +8

  6. cs.IR 2026-05-14 reviewed
    Distilled rerankers match quality with 34% fewer tokens

    Stop Overthinking: Unlocking Efficient Listwise Reranking with Minimal Reasoning

    Danyang Liu +1

  7. cs.CV 2026-05-14 reviewed
    Adaptive gate skips reasoning for simple multimodal inputs

    Think When Needed: Adaptive Reasoning-Driven Multimodal Embeddings with a Dual-LoRA Architecture

    Guanghao Zhang +4

  8. cs.IR 2026-05-14 reviewed
    Semantic IDs halve beam search size for e-commerce retrieval

    Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL

    Bokang Wang +7

  9. cs.IR 2026-05-14 reviewed
    PaSaMaster beats GPT-5.2 in paper retrieval at 1% cost

    Towards Self-Evolving Agentic Literature Retrieval

    Fenyi Liu +10

  10. cs.IR 2026-05-13 reviewed
    Imagined future steps triple recall of distant memories

    Thinking Ahead: Prospection-Guided Retrieval of Memory with Language Models

    Chirag Shah +4

  11. cs.CR 2026-05-13 reviewed
    Small rotations hide data in embeddings undetected

    VectorSmuggle: Steganographic Exfiltration in Embedding Stores and a Cryptographic Provenance Defense

    Jascha Wanger

  12. cs.IR 2026-05-13 reviewed
    The paper describes benchmarks of XRootD and Pelican services in the Open Science Data…

    Benchmarking the Open Science Data Federation services to develop XRootD best practices

    Fabio Andrijauskas +2

  13. cs.IR 2026-05-13 reviewed
    Granite R2 models lead multilingual retrieval in 200+ languages

    Granite Embedding Multilingual R2 Models

    Aashka Trivedi +17

  14. cs.IR 2026-05-13 reviewed
    LLM profiles boost recommender simulation ranking by 7%

    Task-Aware Automated User Profile Generation for Recommendation Simulation Using Large Language Models

    Chenglong Ma +4

  15. cs.AI 2026-05-13 reviewed
    Graph links convergent claims from multiple innovation methods

    IdeaForge: A Knowledge Graph-Grounded Multi-Agent Framework for Cross-Methodology Innovation Analysis and Patent Claim Generation

    Joy Bose

  16. cs.DL 2026-05-13 reviewed
    Graph links 200k research repos to papers and artifacts

    SemRepo: A Knowledge Graph for Research Software and Its Scholarly Ecosystem

    Abdul Rafay +3

  17. cs.CL 2026-05-13 reviewed
    Parallel dataset gives medical dialogues in nine Indic languages

    IndicMedDialog: A Parallel Multi-Turn Medical Dialogue Dataset for Accessible Healthcare in Indic Languages

    Piyush Patel +2

  18. cs.CL 2026-05-13 reviewed
    Latent info gain ranks visual evidence for better multimodal RAG

    Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation

    Haofeng Zhang +5

  19. cs.IR 2026-05-13 reviewed
    LeanSearch v2 lifts Lean 4 proof success to 20 percent

    LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving

    Bin Dong +7

  20. cs.IR 2026-05-13 reviewed
    LeanSearch v2 lifts Lean 4 proof success to 20%

    LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving

    Bin Dong +7

  21. cs.MA 2026-05-13 reviewed
    Multi-agent system automates VC due diligence

    A Multi-Agent Orchestration Framework for Venture Capital Due Diligence

    Grigorios Alexandrou +1

  22. cs.IR 2026-05-13 reviewed
    Half of ReDial CRS accuracy traces to repetition shortcuts

    A Standardized Re-evaluation of Conversational Recommender Systems on the ReDial Dataset

    Ivica Kostric +1

  23. cs.IR 2026-05-13 reviewed
    LLMs predict query-specific validity horizons for web content

    RAG-Enhanced Large Language Models for Dynamic Content Expiration Prediction in Web Search

    Daiting Shi +6

  24. cs.CV 2026-05-13 reviewed
    Source figures become verifiable evidence in deep research reports

    ViDR: Grounding Multimodal Deep Research Reports in Source Visual Evidence

    Baoqin Sun +6

  25. cs.AI 2026-05-13 reviewed
    KITE tutor raises simulated student accuracy on algorithm tasks

    Retrieval-Augmented Tutoring for Algorithm Tracing and Problem-Solving in AI Education

    Arto Hellas +8

  26. cs.IR 2026-05-13 reviewed
    Context changes what the same image means for retrieval

    Same Image, Different Meanings: Toward Retrieval of Context-Dependent Meanings

    Ayuto Tsutsumi +1

  27. cs.IR 2026-05-13 reviewed
    Linked page ecosystems steer LLM agents to target recommendations

    EcoGEO: Trajectory-Aware Evidence Ecosystems for Web-Enabled LLM Search Agents

    Hengwei Ye +3

  28. cs.IR 2026-05-12 reviewed
    MLP distillation accelerates generative recommenders 8.74 times

    MLPs are Efficient Distilled Generative Recommenders

    Clark Mingxuan Ju +4

  29. cs.HC 2026-05-12 reviewed
    Admins like AI help writing WhatsApp rules but fear trust breaches

    Creating Group Rules with AI: Human-AI Collaboration in WhatsApp Moderation

    Aditya Vashistha +3

  30. cs.CL 2026-05-12 reviewed
    LLM refines embeddings at test time for up to 25% gains

    Task-Adaptive Embedding Refinement via Test-time LLM Guidance

    Ariel Gera +4

  31. cs.CL 2026-05-12 reviewed
    This paper proposes ORBIT, a method that tracks how far a fine-tuned generative retrieval…

    ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging

    Alicia Tsai +9

  32. cs.CL 2026-05-12 reviewed
    Entropy of plausibility scores estimates LLM question difficulty

    Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring

    Adam Jatowt +2

  33. cs.CL 2026-05-12 reviewed
    High-convergence sentences lift LLM accuracy on inferential questions

    Context Convergence Improves Answering Inferential Questions

    Adam Jatowt +2

  34. cs.CL 2026-05-12 reviewed
    Benchmark forces models to combine facts from two articles

    MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

    Chih-Hsuan Wei +15

  35. cs.IR 2026-05-12 reviewed
    Prototype-guided retrieval improves EHR clinical predictions

    EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records

    Dana El Samad +3

  36. cs.CL 2026-05-12 reviewed
    Retrieval lifts two-hop medical QA to 89% conceptual accuracy

    Overview of the MedHopQA track at BioCreative IX: track description, participation and evaluation of systems for multi-hop medical question answering

    Ganesh Chandrasekar +15

  37. cs.IR 2026-05-12 reviewed
    BatchBench framework equalizes autoscaling policy tests

    BatchBench: Toward a Workload-Aware Benchmark for Autoscaling Policies in Big Data Batch Processing -- A Proposed Framework

    Siri Chandana Sirigiri +1

  38. cs.IR 2026-05-12 reviewed
    Crowdsourcing validates LLM ontology mappings at scale

    Unlocking Crowdsourcing for Ontology Matching Validation

    Zhangcheng Qiang

  39. cs.CV 2026-05-12 reviewed
    One autoregressive model makes personalized ad images and text

    Design Your Ad: Personalized Advertising Image and Text Generation with Unified Autoregressive Models

    Ao Ma +17

  40. cs.CL 2026-05-12 reviewed
    Three-stage retrieval pipeline ranks 8th in SemEval multi-turn task

    Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking

    David-Maximilian Caraman +1

  41. cs.IR 2026-05-12 reviewed
    Health record trajectories improve image-based disease forecasts

    From Trajectories to Phenotypes: Disease Progression as Structural Priors for Multi-organ Imaging Representation Learning

    Chengyan Wang +11

  42. cs.DS 2026-05-12 reviewed
    Ulam similarity admits O(n/sqrt(log n)) LSH distortion

    On the LSH Distortion of Ulam and Cayley Similarities

    Erasmo Tani +3

  43. cs.IR 2026-05-12 reviewed
    Benchmark with 1M entries tests multi-dimensional rewards for recommender agents

    RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems

    Dengcan Liu +12

  44. cs.IR 2026-05-12 reviewed
    ZipRerank matches top multimodal rerankers at 10x lower latency

    Very Efficient Listwise Multimodal Reranking for Long Documents

    Lawrence B. Hsieh +2

  45. cs.IR 2026-05-12 reviewed
    Critic and generator agents iteratively refine research outlines

    AgentDisCo: Towards Disentanglement and Collaboration in Open-ended Deep Research Agents

    Jiarui Jin +4

  46. cs.IR 2026-05-12 reviewed
    Dual-context views with quality weights boost sequential recs

    Quality-Aware Collaborative Multi-Positive Contrastive Learning for Sequential Recommendation

    Wei Wang

  47. cs.IR 2026-05-12 reviewed
    Staged mining and activity grouping boost LLM recommendations

    HSUGA: LLM-Enhanced Recommendation with Hierarchical Semantic Understanding and Group-Aware Alignment

    Dugang Liu +4

  48. cs.IR 2026-05-12 reviewed
    Planner picks slow reasoning only when it improves recommendations

    TwiSTAR: Think Fast, Think Slow, Then Act, Generative Recommendation with Adaptive Reasoning

    Kaian Jiang +3

  49. cs.IR 2026-05-12 reviewed
    Conditional memory fixes SID representation conflicts in generative recommendation

    Conditional Memory Enhanced Item Representation for Generative Recommendation

    Shengyu Zhou +4

  50. cs.IR 2026-05-12 reviewed
    Codebooks quantize signals to boost multi-market CTR privately

    FedMM: Federated Collaborative Signal Quantization for Multi-Market CTR Prediction

    Dugang Liu +4