Chatlaw: A Multi-Agent Legal Assistant based on a Role-Aligned Mixture-of-Experts Architecture

Bin Ling; Bohua Chen; Hao Li; Jiaxi Cui; Li Yuan; Munan Ning; Yang Yan; Yonghong Tian; Zongjian Li

arxiv: 2306.16092 · v3 · pith:G4F43S5Pnew · submitted 2023-06-28 · 💻 cs.CL

Jiaxi Cui, Munan Ning, Zongjian Li, Bohua Chen, Yang Yan, Hao Li, Bin Ling, Yonghong Tian, Li Yuan

keywords: legal, chatlaw, assistant, architecture, case, collaborative, expert, framework

Artificial Intelligence (AI) holds great potential in legal services, yet Large Language Models (LLMs) face two major challenges: limited knowledge of the Chinese legal system and vulnerability to hallucinations. To address these issues, we present Chatlaw, a multi-agent legal assistant. Chatlaw's framework is designed to emulate the Standard Operating Procedures (SOP) of real law firms, where different roles (e.g., assistant, researcher, senior lawyer) collaborate on a case. To computationally mirror this collaborative structure, we developed a novel Role-Aligned Mixture-of-Experts (RA-MoE) architecture. In this system, the internal "experts" are specifically trained to align with the distinct tasks of each agent role (e.g., inquiry, analysis, drafting). These specialized agents (Legal Assistant, Researcher, etc.) then form the collaborative framework. When they interact with users, retrieve legal knowledge, analyze case details, or generate reliable consultations, the RA-MoE architecture intelligently routes their computations to the corresponding dedicated expert, ensuring each step is handled by the most qualified parameters. In evaluations, Chatlaw surpasses general-purpose AI models, including GPT-4, achieving a 7.73% improvement in accuracy on the LawBench benchmark and an 11-point higher score on the Unified Qualification Exam for Legal Professionals. Real-case studies and expert assessments further confirm its robustness. Chatlaw enhances the accessibility and reliability of legal services, advancing the provision of legal support to the public.
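
The role-to-expert routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether routing is hard or learned, so the sketch assumes a hard lookup by agent role. The role-to-task pairing, the names `ROLE_TO_TASK`, `RoleAlignedRouter`, and `make_toy_expert`, and the toy experts that only rescale a vector are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]
Expert = Callable[[Vector], Vector]

# Assumed pairing of agent roles with expert tasks, read off the abstract's
# parallel examples; the paper may pair them differently.
ROLE_TO_TASK: Dict[str, str] = {
    "assistant": "inquiry",
    "researcher": "analysis",
    "senior_lawyer": "drafting",
}


def make_toy_expert(scale: float) -> Expert:
    """Return a stand-in expert that just rescales its input (hypothetical)."""
    def expert(hidden: Vector) -> Vector:
        return [scale * value for value in hidden]
    return expert


@dataclass
class RoleAlignedRouter:
    """Hard router: one dedicated expert per task, selected by agent role."""
    experts: Dict[str, Expert]

    def route(self, role: str) -> str:
        # Map the agent role to the task whose expert should handle it.
        if role not in ROLE_TO_TASK:
            raise KeyError(f"unknown role: {role}")
        return ROLE_TO_TASK[role]

    def forward(self, role: str, hidden: Vector) -> Vector:
        # Only the expert aligned with the calling role is evaluated.
        return self.experts[self.route(role)](hidden)


router = RoleAlignedRouter(
    experts={
        "inquiry": make_toy_expert(1.0),
        "analysis": make_toy_expert(2.0),
        "drafting": make_toy_expert(3.0),
    }
)
```

Calling `router.forward("researcher", hidden)` sends the vector through the "analysis" expert only. A learned gate over token representations would be another plausible reading of the abstract; the hard lookup is used here only because it is the simplest form of role alignment.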

Forward citations

Cited by 17 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Reconstruction of Personally Identifiable Information from Supervised Finetuned Models
cs.CR 2026-05 unverdicted novelty 7.0

PII can be reconstructed from SFT models via prefix attacks, with the new COVA algorithm improving success rates and leakage varying by attacker knowledge and PII type.
MAP-Law: Coverage-Driven Retrieval Control for Multi-Turn Legal Consultation
cs.AI 2026-05 unverdicted novelty 7.0

MAP-Law dynamically controls retrieval depth in legal AI by computing element coverage, evidence coverage, and marginal gain on a joint node graph, reaching 0.86 element coverage with 58% fewer rounds than fixed basel...
InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees
cs.LG 2026-05 unverdicted novelty 7.0

InvEvolve evolves white-box inventory policies from LLMs with statistical safety guarantees and outperforms classical and deep learning methods on synthetic and real retail data.
Social Dynamics as Critical Vulnerabilities that Undermine Objective Decision-Making in LLM Collectives
cs.CL 2026-04 unverdicted novelty 7.0

Social dynamics in LLM collectives cause representative agents to make less accurate decisions as peer pressure increases through larger adversarial groups, more capable peers, longer arguments, and persuasive styles.
VLegal-Bench: Cognitively Grounded Benchmark for Vietnamese Legal Reasoning of Large Language Models
cs.CL 2025-12 conditional novelty 7.0

VLegal-Bench supplies 10,450 expert-validated samples for evaluating LLMs on Vietnamese legal questions, retrieval, multi-step reasoning, and scenario solving.
LexRel: Benchmarking Legal Relation Extraction for Chinese Civil Cases
cs.CL 2025-12 unverdicted novelty 7.0

LexRel introduces a hierarchical legal relation schema and expert benchmark for Chinese civil cases, exposing LLM limitations and downstream gains from explicit relation knowledge.
InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees
cs.LG 2026-05 unverdicted novelty 6.0

InvEvolve uses LLMs and RL to generate certified inventory policies that outperform classical and deep learning methods on synthetic and real data while providing multi-period performance guarantees.
PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations
cs.CL 2026-04 unverdicted novelty 6.0

PRISM benchmark disentangles LLM hallucinations into knowledge missing, knowledge errors, reasoning errors, and instruction-following errors across three generation stages, revealing trade-offs when testing 24 models.
LexGenius: An Expert-Level Benchmark for Large Language Models in Legal General Intelligence
cs.CL 2025-12 unverdicted novelty 6.0

LexGenius benchmark reveals that even the strongest LLMs show major gaps in legal intelligence and lag behind human professionals.
Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning
cs.CL 2025-09 unverdicted novelty 6.0

KG-R1 trains a single RL agent to retrieve from and reason over knowledge graphs in one loop, achieving higher accuracy with fewer tokens than multi-module baselines and transferring to unseen graphs.
A Survey on Large Language Model based Autonomous Agents
cs.AI 2023-08 accept novelty 6.0

A survey of LLM-based autonomous agents that proposes a unified framework for their construction and reviews applications in social science, natural science, and engineering along with evaluation methods and future di...
LegalDrill: Diagnosis-Driven Synthesis for Legal Reasoning in Small Language Models
cs.CL 2026-04 unverdicted novelty 5.0

LegalDrill uses diagnosis-driven synthesis and self-reflective verification to create high-quality training data that improves small language models' legal reasoning without expert annotations.
LicenseGPT: A Fine-tuned Foundation Model for Publicly Available Dataset License Compliance
cs.SE 2024-12 unverdicted novelty 5.0

LicenseGPT fine-tuned on 500 expert-annotated licenses raises prediction agreement to 64.30% and cuts per-license analysis time by 94.44% from 108s to 6s in lawyer user studies.
KnowPilot: Your Knowledge-Driven Copilot for Domain Tasks
cs.SE 2026-04 unverdicted novelty 4.0

KnowPilot integrates knowledge retrieval and memory systems into generative agents to achieve better results on domain-specific tasks such as text generation.
A Survey on Knowledge Distillation of Large Language Models
cs.CL 2024-02 accept novelty 3.0

A comprehensive survey of knowledge distillation for LLMs structured around algorithms, skill enhancement, and vertical applications, highlighting data augmentation as a key enabler.
Retrieval-Augmented Generation for Large Language Models: A Survey
cs.CL 2023-12 unverdicted novelty 3.0

A survey of RAG paradigms, components, benchmarks, and challenges for improving LLMs on knowledge-intensive tasks.
A Survey of Hallucination in Large Foundation Models
cs.AI 2023-09 accept novelty 3.0

A survey classifying hallucination phenomena specific to large foundation models, establishing evaluation criteria, examining mitigation strategies, and discussing future directions.