Matching primary charge explains 99.2% of the NDCG@10 gap between BM25 and best systems on LeCaRDv2 because benchmark relevance is defined by charge-encoding elements.
In: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
4 Pith papers cite this work.
Citing papers by year: 2026 (4)
Citing papers

- Charge as a Construct-Validity Factor in Chinese Legal Case Retrieval: A Cross-Benchmark Audit
  Matching primary charge explains 99.2% of the NDCG@10 gap between BM25 and best systems on LeCaRDv2 because benchmark relevance is defined by charge-encoding elements.
- When Rules Learn: A Self-Evolving Agent for Legal Case Retrieval
  An LLM agent self-evolves a set of query-rewriting rules that raise BM25 performance on the LeCaRD-v2 legal retrieval benchmark above human-designed and greedy baselines.
- Civil Court Simulation with Large Language Models
  Multi-agent LLM framework simulates Chinese civil trials through five-stage procedures with memory and retrieval, producing judgments strong in liability allocation and multi-item decisions.
- LexRubric: A Rubric-Guided Diagnostic Benchmark for Open-Ended Legal Tasks
  LexRubric is a rubric-based benchmark containing 649 instances and 12,337 atomic criteria for diagnostic evaluation of LLMs on open-ended Chinese legal consultation and judicial examination tasks across 14 scenarios.