EURO-5K: When Does Domain Pretraining Matter? Benchmarking Transformers for EU Reporting Obligation Extraction

Eugenia Giannini; Marios Koniaris; Panayiotis Tsanakas; Vasileios Kotronis

arxiv: 2606.02971 · v1 · pith:TZY25PYHnew · submitted 2026-06-02 · 💻 cs.CL

Pith reviewed 2026-06-28 11:12 UTC · model grok-4.3

keywords reporting obligations · EU legislation · legal information extraction · domain adaptation · BERT · large language models · parameter-efficient fine-tuning · sentence classification

The pith

Fully fine-tuned generic BERT matches legal BERT at 0.89 F1 for EU reporting obligation extraction, with LLMs reaching the same level.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to determine when legal-domain pretraining improves transformer performance on the task of pulling reporting obligations out of EU legislation. It builds a corpus of positive sentences and hard negative examples drawn from 136 acts, then benchmarks both encoder-style and generative models under full fine-tuning and parameter-efficient regimes. The results indicate that full fine-tuning largely erases any advantage from legal pretraining, while constrained adaptation still benefits from it, and that learning saturates near three thousand examples. This matters for deciding whether to pay the cost of domain-specific models when building regulatory compliance tools.

Core claim

On the EURO-5K corpus, fully fine-tuned generic and legal BERT models both reach 0.89 F1; fine-tuned LLMs match encoder accuracy at the sentence level; legal pretraining supplies only small gains for generative models but clear gains under parameter-efficient tuning; and all methods converge around three thousand samples with diminishing returns thereafter.

What carries the argument

The EURO-5K sentence-level dataset of reporting obligations paired with challenging negatives, used to compare full fine-tuning versus QLoRA on generic versus legal-pretrained encoders and on LLMs.

If this is right

Legal pretraining speeds early learning when only small amounts of task data are available.
On external regulatory texts, models trained on the corpus behave as specialised reporting-obligation extractors rather than generic regulatory classifiers.
Parameter-efficient methods gain more from legal pretraining than full fine-tuning does.
Performance plateaus near three thousand examples, indicating that additional data yields little further improvement.
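
One way to make the plateau claim checkable is to look for the training-set size after which no further step on the learning curve adds more than a small F1 gain. The sketch below is only an illustration: the function name `plateau_size`, the `min_gain` threshold, and the toy curve are assumptions made here, and the numbers are synthetic, not values from the paper.

```python
def plateau_size(curve, min_gain=0.005):
    """Return the smallest training size after which no later step
    improves F1 by more than `min_gain`, or None if no such size exists.

    `curve` is a list of (n_samples, f1) pairs. The threshold is a
    hypothetical choice, not one taken from the paper.
    """
    points = sorted(curve)
    for i, (n, _) in enumerate(points[:-1]):
        # Gains of every step from this point to the end of the curve.
        later_gains = [
            points[j + 1][1] - points[j][1]
            for j in range(i, len(points) - 1)
        ]
        if all(gain <= min_gain for gain in later_gains):
            return n
    return None


# Synthetic learning curve for illustration only (not the paper's data).
toy_curve = [(100, 0.50), (200, 0.60), (400, 0.65), (800, 0.652), (1600, 0.653)]
print(plateau_size(toy_curve))
```

Applied to the paper's released learning curves, a check of this kind would show how sensitive the "three thousand examples" figure is to the chosen gain threshold.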

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

When full fine-tuning is impractical due to compute limits, legal pretraining retains practical value for similar extraction tasks.
The same benchmark setup could test whether domain pretraining patterns hold for obligation extraction in other regulatory domains such as financial or environmental rules.
Extending the models from sentence detection to structured extraction of details like deadlines or responsible parties would be a direct next measurement.

Load-bearing premise

The sentence-level labels and the choice of hard negative examples from the 136 acts correctly reflect the practical boundary between genuine reporting obligations and similar non-obligatory text.

What would settle it

A substantial drop in F1 when the trained models are applied to a fresh collection of EU acts outside the original 136 would show that the learned distinction does not generalize.
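
The test described above amounts to computing F1 twice, once on held-out sentences from the original acts and once on sentences from fresh acts, and comparing the two. A minimal sketch follows, assuming binary sentence labels (1 for a reporting obligation, 0 otherwise); the helper names and the toy labels are hypothetical and are not taken from the paper or its released code.

```python
def f1_score(gold, pred, positive=1):
    """Binary F1 for the positive class, computed from parallel label lists."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_drop(in_domain_f1, fresh_f1):
    """Fraction of in-domain F1 lost when moving to fresh acts."""
    if in_domain_f1 <= 0:
        raise ValueError("in-domain F1 must be positive")
    return (in_domain_f1 - fresh_f1) / in_domain_f1


# Toy labels for illustration only (not data from EURO-5K).
gold = [1, 1, 0, 0]
pred = [1, 0, 0, 1]
print(f1_score(gold, pred))
```

What counts as a "substantial" drop is a judgment call; the threshold would need to be fixed before the fresh acts are scored.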

Original abstract

Extracting reporting obligations from EU legislation is critical for assessing and reducing regulatory reporting burden. However, distinguishing reporting requirements from structurally similar provisions requires specialised legal understanding. Current legal NLP methods lack specialised datasets with clear guidelines and comparative evaluation of extraction paradigms and domain adaptation strategies. We curate EURO-5K, a corpus of sentence-level reporting obligations and challenging negative examples from 136 EU legislative acts. On this dataset, we train and compare discriminative token-classification models (BERT-style) and generative span-extraction models (LLMs), evaluating both full fine-tuning and parameter-efficient QLoRA against baselines (pattern and dependency-based extraction, few-shot prompting). Results show that fully fine-tuned generic and legal BERT models achieve similar performance (0.89 F1), while fine-tuned LLMs match encoder accuracy for sentence-level extraction. Legal pretraining offers only small gains for generative models. In contrast, it is clearly beneficial when adaptation capacity is constrained, as parameter-efficient tuning of Legal-BERT outperforms its generic counterpart. Learning curve analysis demonstrates that legal pretraining accelerates early learning with minimal data. All approaches converge around 3K samples with diminishing returns thereafter, validating dataset sufficiency. Cross-dataset evaluation on two external regulatory corpora shows that our models behave as specialised reporting obligation extractors rather than generic regulatory classifiers. We release EURO-5K, trained models, and an interactive demo with explainability visualizations and structured RDF export. These demonstrate that both paradigms and parameter-efficient training provide practical tools for regulatory compliance automation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EURO-5K supplies a usable new dataset for obligation extraction and shows legal pretraining helps most under QLoRA, but the claims rest on unshown annotation details.

The paper's core contribution is EURO-5K, a 5k-sentence corpus drawn from 136 EU acts that mixes reporting obligations with deliberately hard negatives. The authors compare BERT-style encoders and LLMs under both full fine-tuning and QLoRA, against pattern-based, dependency-based, and few-shot prompting baselines. The reported result is that generic and legal BERT both reach 0.89 F1 when fully fine-tuned, LLMs match that for sentence-level work, and legal pretraining gives a clearer edge only when the parameter budget is tight. Learning curves flatten around 3k examples, and cross-dataset tests suggest the models act as specialized extractors rather than generic regulatory detectors.

They release the data, the models, and a demo with explainability and RDF export. That practical package is the strongest part of the work. The learning-curve and cross-dataset sections are straightforward and give readers concrete numbers to judge sufficiency and specialization.

The soft spot is exactly where the stress-test note flags it: dataset construction. The abstract supplies no inter-annotator agreement, no annotation guidelines, and no account of how the challenging negatives were sampled from the 136 acts. Without those details it is difficult to judge whether the 0.89 F1 reflects genuine legal distinction or curator-specific choices. If new acts contain contextual cues the current negatives missed, the performance gap and the pretraining story could shrink. The paper would be stronger with even basic label-quality stats and a short error analysis.

This is aimed at legal NLP groups and regulatory-tech teams that need a benchmark for obligation extraction. Readers already running domain-adaptation experiments will find the QLoRA contrast useful. It is not a theoretical paper, but the empirical setup is clear enough to build on.

I would bring it to a reading group focused on applied legal text work. I would not cite it in my own papers in the next year. It deserves peer review because the dataset fills a documented gap and the comparisons are concrete, even if the annotation section needs expansion in revision.

Referee Report

1 major / 1 minor

Summary. The paper introduces EURO-5K, a sentence-level corpus of reporting obligations and challenging negative examples drawn from 136 EU legislative acts. It benchmarks discriminative token-classification models (BERT variants) against generative span-extraction models (LLMs), comparing full fine-tuning and QLoRA, and reports that fully fine-tuned generic and legal BERTs both reach 0.89 F1, that fine-tuned LLMs match encoder accuracy, that legal pretraining yields only small gains for generative models but clear gains under parameter-efficient tuning, that learning curves converge around 3K samples, and that cross-dataset tests indicate the models act as specialized extractors rather than generic regulatory classifiers. The dataset, models, and interactive demo are released.

Significance. If the annotations are reliable, the results supply concrete evidence on the conditions under which domain-specific pretraining matters for legal information extraction and demonstrate the practical viability of both encoder and LLM paradigms for regulatory compliance tasks. The public release of the corpus and models is a clear strength that enables direct replication and extension.

major comments (1)

[§3 (EURO-5K curation)] The manuscript supplies no annotation guidelines, inter-annotator agreement figures, legal-expert involvement details, or explicit sampling rules for the 'challenging negative examples.' Because every headline result (0.89 F1, LLM parity, conditional pretraining benefit, learning-curve and cross-dataset claims) rests on the correctness of these sentence-level labels, the omission is load-bearing for the central empirical claims.

minor comments (1)

[Abstract] Abstract and §4: the phrase 'challenging negative examples' is used without a concise summary of the selection heuristic; a one-sentence clarification would improve readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the annotation process. We agree that additional details are necessary to substantiate the reliability of the labels and will incorporate them in the revised manuscript.

Referee: [§3 (EURO-5K curation)] The manuscript supplies no annotation guidelines, inter-annotator agreement figures, legal-expert involvement details, or explicit sampling rules for the 'challenging negative examples.' Because every headline result (0.89 F1, LLM parity, conditional pretraining benefit, learning-curve and cross-dataset claims) rests on the correctness of these sentence-level labels, the omission is load-bearing for the central empirical claims.

Authors: We acknowledge the omission of detailed annotation information in the current version of the manuscript. In the revised version, we will expand §3 to include: (1) the full annotation guidelines used by the annotators, (2) inter-annotator agreement statistics (e.g., Cohen's kappa or F1 agreement on a double-annotated subset), (3) details on legal expert involvement, including the number of experts, their background, and how disagreements were resolved, and (4) explicit sampling rules and criteria for selecting the challenging negative examples. This will allow readers to better assess the quality of EURO-5K and support the validity of our experimental results. We believe these additions will address the referee's concern without altering the core findings.

Revision: yes
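
For readers unfamiliar with the agreement statistic mentioned in the rebuttal, Cohen's kappa corrects the observed agreement between two annotators for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain-Python illustration on toy labels; the function name and the label lists are made up here and say nothing about how EURO-5K was actually annotated.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(freq_a) | set(freq_b)
    )
    if p_e == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy double-annotated subset (1 = reporting obligation, 0 = not).
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
annotator_b = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
print(round(cohens_kappa(annotator_a, annotator_b), 3))  # 0.615
```

In this toy case the annotators agree on 8 of 10 items (p_o = 0.8) while chance agreement is 0.48, which gives a kappa of about 0.615.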

Circularity Check

0 steps flagged

No significant circularity: empirical benchmarking on held-out data

This is a purely empirical study that curates EURO-5K, trains and evaluates models (BERT-style token classifiers and LLM span extractors under full fine-tuning and QLoRA), and reports measured F1 scores, learning curves, and cross-dataset results on held-out sentences. No derivations, fitted parameters renamed as predictions, uniqueness theorems, or ansatzes appear; all claims are direct experimental outcomes on external test data rather than quantities defined by the authors' own modeling choices. Self-citations, if any, are not load-bearing for any central result. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No free parameters, invented entities, or non-standard axioms; the work rests on standard supervised learning assumptions and the new dataset itself.

axioms (1)

[standard math] Standard supervised learning assumptions hold (i.i.d. samples, appropriate loss functions for token classification and span extraction).
Implicit background for all reported fine-tuning and evaluation results.

pith-pipeline@v0.9.1-grok · 5820 in / 1293 out tokens · 28446 ms · 2026-06-28T11:12:56.113167+00:00 · methodology

Reference graph

Works this paper leans on

101 extracted references · 56 canonical work pages · 1 internal anchor

[1]

2020 , address =

Chalkidis, Ilias and Fergadiotis, Manos and Malakasiotis, Prodromos and Aletras, Nikolaos and Androutsopoulos, Ion , booktitle =. 2020 , address =

2020
[2]

Lin , journal =

Ruixue Zhang and Wei Yang and Luyun Lin and Zhengkai Tu and Yuqing Xie and Zihang Fu and Yuhao Xie and Luchen Tan and Kun Xiong and Jimmy J. Lin , journal =. Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents , year =. doi:2002.01861 , eprint =

arXiv 2002
[3]

and Grabmair, Matthias , booktitle =

T.y.s.s., Santosh and Quero Hernandez, Elvin A. and Grabmair, Matthias , booktitle =. Query-driven Relevant Paragraph Extraction from Legal Judgments , year =
[4]

Large Language Models are legal but they are not: Making the case for a powerful

Jayakumar, Thanmay and Farooqui, Fauzan and Farooqui, Luqman , booktitle =. Large Language Models are legal but they are not: Making the case for a powerful. 2023 , address =. doi:10.18653/v1/2023.nllp-1.22 , url =

work page doi:10.18653/v1/2023.nllp-1.22 2023
[5]

and Lee, Wonhee and Ng, Amy and Rapstine, Natalya I

Chivers, Brian and Jiang, Mason P. and Lee, Wonhee and Ng, Amy and Rapstine, Natalya I. and Storer, Alex , booktitle =. 2022 , address =. doi:10.18653/v1/2022.deeplo-1.5 , url =

work page doi:10.18653/v1/2022.deeplo-1.5 2022
[6]

Gultekin and Achille Globo and Andrea Zugarini and Marco Ernandes and Leonardo Rigutini , journal =

S. Gultekin and Achille Globo and Andrea Zugarini and Marco Ernandes and Leonardo Rigutini , journal =. An energy-based comparative analysis of common approaches to text classification in the Legal domain , year =. doi:2311.01256 , eprint =

arXiv
[7]

Palshikar and TCS Research and TCS Research and TCS Research , month = jun, title =

Sachin Pawar and Basit Ali and Girish K. Palshikar and TCS Research and TCS Research and TCS Research , month = jun, title =. Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law , keywords =. 2023 , abstract =. doi:10.1145/3594536.3595134 , file =

work page doi:10.1145/3594536.3595134 2023
[8]

and Henderson, Peter and Ho, Daniel E

Zheng, Lucia and Guha, Neel and Anderson, Brandon R. and Henderson, Peter and Ho, Daniel E. , booktitle =. When does pretraining help?: assessing self-supervised learning for law and the CaseHOLD dataset of 53,000+ legal holdings , year =. doi:10.1145/3462757.3466088 , file =

work page doi:10.1145/3462757.3466088
[9]

, booktitle =

Wehnert, Sabine and Sudhi, Viju and Dureja, Shipra and Kutty, Libin and Shahania, Saijal and De Luca, Ernesto W. , booktitle =. Legal norm retrieval with variations of the bert model combined with TF-IDF vectorization , year =. doi:10.1145/3462757.3466104 , file =

work page doi:10.1145/3462757.3466104
[10]

, booktitle =

Yoshioka, Masaharu and Aoki, Yasuhiro and Suzuki, Youta , booktitle =. BERT-based ensemble methods with data augmentation for legal textual entailment in COLIEE statute law task , year =. doi:10.1145/3462757.3466105 , file =

work page doi:10.1145/3462757.3466105
[11]

Computable Contracts by Extracting Obligation Logic Graphs , year =

Savelka, Jaromir , booktitle =. Unlocking Practical Applications in Legal Domain: Evaluation of GPT for Zero-Shot Semantic Annotation of Legal Texts , year =. doi:10.1145/3594536.3595161 , file =

work page doi:10.1145/3594536.3595161
[12]

Computable Contracts by Extracting Obligation Logic Graphs , year =

Servantez, Sergio and Lipka, Nedim and Siu, Alexa and Aggarwal, Milan and Krishnamurthy, Balaji and Garimella, Aparna and Hammond, Kristian and Jain, Rajiv , booktitle =. Computable Contracts by Extracting Obligation Logic Graphs , year =. doi:10.1145/3594536.3595162 , keywords =

work page doi:10.1145/3594536.3595162
[13]

Legal Holding Extraction from Italian Case Documents using Italian-LEGAL-BERT Text Summarization , year =

Licari, Daniele and Bushipaka, Praveen and Marino, Gabriele and Comandé, Giovanni and Cucinotta, Tommaso , booktitle =. Legal Holding Extraction from Italian Case Documents using Italian-LEGAL-BERT Text Summarization , year =. doi:10.1145/3594536.3595177 , keywords =

work page doi:10.1145/3594536.3595177
[14]

Computable Contracts by Extracting Obligation Logic Graphs , year =

Paul, Shounak and Mandal, Arpan and Goyal, Pawan and Ghosh, Saptarshi , booktitle =. Pre-trained Language Models for the Legal Domain: A Case Study on Indian Law , year =. doi:10.1145/3594536.3595165 , file =

work page doi:10.1145/3594536.3595165
[15]

and Krass, Mark S

Huang, Zihan and Low, Charles and Teng, Mengqiu and Zhang, Hongyi and Ho, Daniel E. and Krass, Mark S. and Grabmair, Matthias , booktitle =. Context-aware legal citation recommendation using deep learning , year =. doi:10.1145/3462757.3466066 , file =

work page doi:10.1145/3462757.3466066
[16]

and Henderson, Peter and Ho, Daniel E

Aumiller, Dennis and Almasian, Satya and Lackner, Sebastian and Gertz, Michael , booktitle =. Structural text segmentation of legal documents , year =. doi:10.1145/3462757.3466085 , file =

work page doi:10.1145/3462757.3466085
[17]

Incorporating domain knowledge for extractive summarization of legal case documents , year =

Bhattacharya, Paheli and Poddar, Soham and Rudra, Koustav and Ghosh, Kripabandhu and Ghosh, Saptarshi , booktitle =. Incorporating domain knowledge for extractive summarization of legal case documents , year =. doi:10.1145/3462757.3466092 , file =

work page doi:10.1145/3462757.3466092
[18]

, booktitle =

Vold, Andrew and Conrad, Jack G. , booktitle =. Using transformers to improve answer retrieval for legal questions , year =. doi:10.1145/3462757.3466102 , file =

work page doi:10.1145/3462757.3466102
[19]

, booktitle =

Rosa, Guilherme Moraes and Rodrigues, Ruan Chaves and de Alencar Lotufo, Roberto and Nogueira, Rodrigo , booktitle =. To tune or not to tune?: zero-shot models for legal case entailment , year =. doi:10.1145/3462757.3466103 , file =

work page doi:10.1145/3462757.3466103
[20]

and Grant, Jayla C

Savelka, Jaromir and Westermann, Hannes and Benyekhlef, Karim and Alexander, Charlotte S. and Grant, Jayla C. and Amariles, David Restrepo and Hamdani, Rajaa El and Meeùs, Sébastien and Troussel, Aurore and Araszkiewicz, Michał and Ashley, Kevin D. and Ashley, Alexandra and Branting, Karl and Falduti, Mattia and Grabmair, Matthias and Harašta, Jakub and N...

work page doi:10.1145/3462757.3466149
[21]

van Drie, Romy A. N. and de Boer, Maaike H. T. and Bakker, Roos M. and Tolios, Ioannis and Vos, Daan , booktitle =. The Dutch Law as a Semantic Role Labeling Dataset , year =. doi:10.1145/3594536.3595124 , file =

work page doi:10.1145/3594536.3595124
[22]

Palshikar and TCS Research and TCS Research and TCS Research , month = jun, title =

Brugger, Tobias and Stürmer, Matthias and Niklaus, Joel , booktitle =. MultiLegalSBD: A Multilingual Legal Sentence Boundary Detection Dataset , year =. doi:10.1145/3594536.3595132 , file =

work page doi:10.1145/3594536.3595132
[23]

LeArNER: Few-shot Legal Argument Named Entity Recognition , year =

Lee, Shao-Man and Tan, Yu-Hsiang and Yu, Han-Ting , booktitle =. LeArNER: Few-shot Legal Argument Named Entity Recognition , year =. doi:10.1145/3594536.3595144 , file =

work page doi:10.1145/3594536.3595144
[24]

Semantic Extraction of Key Figures and Their Properties From Tax Legal Texts Using Neural Models , year =

Daniel Steinigen and Marcin Namysl and Markus Hepperle and Jan Krekeler and Susanne Landgraf , booktitle =. Semantic Extraction of Key Figures and Their Properties From Tax Legal Texts Using Neural Models , year =
[25]

Jarom. Can. Proceedings of the 6th Workshop on Automated Semantic Analysis of Information in Legal Text co-located with the 19th International Conference on Artificial Intelligence and Law. 2023 , editor =

2023
[26]

Elize Herrewijnen and Dennis F. W. Craandijk , booktitle =. Towards Meaningful Paragraph Embeddings for Data-Scarce Domains:. 2023 , editor =

2023
[27]

Automatic Rhetorical Roles Classification for Legal Documents using LEGAL-TransformerOverBERT , year =

Gabriele Marino and Daniele Licari and Praveen Bushipaka and Giovanni Comand. Automatic Rhetorical Roles Classification for Legal Documents using LEGAL-TransformerOverBERT , year =. Proceedings of the 6th Workshop on Automated Semantic Analysis of Information in Legal Text co-located with the 19th International Conference on Artificial Intelligence and Law
[28]

Bridging the Gap: Mapping Layperson Narratives to Legal Issues with Language Models , year =

Hannes Westermann and S. Bridging the Gap: Mapping Layperson Narratives to Legal Issues with Language Models , year =. Proceedings of the 6th Workshop on Automated Semantic Analysis of Information in Legal Text co-located with the 19th International Conference on Artificial Intelligence and Law
[29]

Applying

Henrik Palmer Olsen and Malte H. Applying. Proceedings of the 6th Workshop on Automated Semantic Analysis of Information in Legal Text co-located with the 19th International Conference on Artificial Intelligence and Law. 2023 , editor =

2023
[30]

Extracting

Malo Revel and Aur. Extracting. Proceedings of the 6th Workshop on Automated Semantic Analysis of Information in Legal Text co-located with the 19th International Conference on Artificial Intelligence and Law. 2023 , editor =

2023
[31]

Ashley , booktitle =

Huihui Xu and Kevin D. Ashley , booktitle =. Argumentative Segmentation Enhancement for Legal Summarization , year =
[32]

Enhancing Pre-Trained Language Models with Sentence Position Embeddings for Rhetorical Roles Recognition in Legal Opinions , year =

Anas Belfathi and Nicolas Hernandez and Laura Monceaux , booktitle =. Enhancing Pre-Trained Language Models with Sentence Position Embeddings for Rhetorical Roles Recognition in Legal Opinions , year =
[33]

Palshikar , booktitle =

Basit Ali and Ravina More and Sachin Pawar and Girish K. Palshikar , booktitle =. Prior Case Retrieval using Evidence Extraction from Court Judgements , year =
[34]

Automatic Judgement Forecasting for Pending Applications of the European Court of Human Rights , year =

Masha Medvedeva and Ahmet. Automatic Judgement Forecasting for Pending Applications of the European Court of Human Rights , year =. Joint Proceedings of the Workshops on Automated Semantic Analysis of Information in Legal Text
[35]

Explainable Rule Extraction via Semantic Graphs , year =

G. Explainable Rule Extraction via Semantic Graphs , year =. Joint Proceedings of the Workshops on Automated Semantic Analysis of Information in Legal Text
[36]

Automatic Semantic Annotation for the Easification of Action Rule Legislative Sentences for Specialist Readers , year =

Sherry Maynard , booktitle =. Automatic Semantic Annotation for the Easification of Action Rule Legislative Sentences for Specialist Readers , year =
[37]

Sebastian Felix Schwemer and Letizia Tomada and Tommaso Pasini , booktitle =. Legal. 2021 , editor =

2021
[38]

, journal =

van Dijck, Gijs and Aguilera, Carlos and Chakravarthy, Shashank M. , journal =. Deciphering disagreement in the annotation of EU legislation , year =. doi:10.1007/s10506-024-09423-9 , file =

work page doi:10.1007/s10506-024-09423-9
[39]

G. J. Brandsma and J. Blom‐Hansen and Christiaan Meijer and Kody Moodley , title =. ArXiv preprint , pages =. 2025 , abstract =

2025
[40]

Prosecutorial Outcome Predication with LoRA and QLoRA , year =

Kuo. Prosecutorial Outcome Predication with LoRA and QLoRA , year =. Legal Knowledge and Information Systems -. doi:10.3233/FAIA241253 , file =

work page doi:10.3233/faia241253
[41]

Leveraging

May Myo Zin and Ken Satoh and Georg Borges , booktitle =. Leveraging. 2024 , editor =. doi:10.3233/FAIA241247 , file =

work page doi:10.3233/faia241247 2024
[42]

Combining Rule-Based and Machine Learning Methods for Efficient Information Extraction from Enforcement Decisions , year =

Harry Nan and Maarten Marx and Johan Wolswinkel , booktitle =. Combining Rule-Based and Machine Learning Methods for Efficient Information Extraction from Enforcement Decisions , year =. doi:10.3233/FAIA241262 , file =

work page doi:10.3233/faia241262
[43]

New Horizons of Legal Judgement Predication via Multi-Task Learning and LoRA , year =

Chia. New Horizons of Legal Judgement Predication via Multi-Task Learning and LoRA , year =. Legal Knowledge and Information Systems -. doi:10.3233/FAIA230966 , file =

work page doi:10.3233/faia230966
[44]

Gray and Jarom

Morgan A. Gray and Jarom. Can. Legal Knowledge and Information Systems -. 2023 , editor =. doi:10.3233/FAIA230961 , file =

work page doi:10.3233/faia230961 2023
[45]

Information Extraction from Lengthy Legal Contracts: Leveraging Query-Based Summarization and

May Myo Zin and Ha. Information Extraction from Lengthy Legal Contracts: Leveraging Query-Based Summarization and. Legal Knowledge and Information Systems -. 2023 , editor =. doi:10.3233/FAIA230963 , file =

work page doi:10.3233/faia230963 2023
[46]

Harnessing GPT-3.5-Turbo for Rhetorical Role Prediction in Legal Cases , year =

Anas Belfathi and Nicolas Hernandez and Laura Monceaux , booktitle =. Harnessing GPT-3.5-Turbo for Rhetorical Role Prediction in Legal Cases , year =. doi:10.3233/FAIA230964 , file =

work page doi:10.3233/faia230964
[47]

Detecting Vague Clauses in Privacy Policies: The Analysis of Data Categories Using

Giulia Grundler and Ruta Liepina and Mariaceleste Musicco and Francesca Lagioia and Andrea Galassi and Giovanni Sartor and Paolo Torroni , booktitle =. Detecting Vague Clauses in Privacy Policies: The Analysis of Data Categories Using. 2024 , editor =. doi:10.3233/FAIA241235 , file =

work page doi:10.3233/faia241235 2024
[48]

Legal Chunking: Evaluating Methods for Effective Legal Text Retrieval , year =

Andrea Filippo Ferraris and Davide Audrito and Giovanni Siragusa and Alessandro Piovano , booktitle =. Legal Chunking: Evaluating Methods for Effective Legal Text Retrieval , year =. doi:10.3233/FAIA241255 , file =

work page doi:10.3233/faia241255
[49]

Legal Text Segmentation Through Breakpoint Detection , year =

Roberto Abbruzzese , booktitle =. Legal Text Segmentation Through Breakpoint Detection , year =. doi:10.3233/FAIA230968 , file =

work page doi:10.3233/faia230968
[50]

From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems , year =

Samyar Janatian and Hannes Westermann and Jinzhe Tan and Jarom. From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems , year =. Legal Knowledge and Information Systems -. doi:10.3233/FAIA230962 , file =

work page doi:10.3233/faia230962
[51]

Automated Semantic Annotation Pipeline for Brazilian Judicial Decisions , year =

Melissa Zorzanelli Costa and Dylan Faria Robson and Thiago Baiense Pe. Automated Semantic Annotation Pipeline for Brazilian Judicial Decisions , year =. Legal Knowledge and Information Systems -. doi:10.3233/FAIA241248 , file =

work page doi:10.3233/faia241248
[52]

Assessing Ocean's Legal Protection Using

Youssef Al Mouatamid and Jihad Zahir and Marie Bonnin and Hajar Mousannif , booktitle =. Assessing Ocean's Legal Protection Using. 2023 , editor =. doi:10.3233/FAIA230972 , file =

work page doi:10.3233/faia230972 2023
[53]

Event Extraction and Semantic Representation from Spanish Workers' Statute Using Large Language Models , year =

Gabriela Arg. Event Extraction and Semantic Representation from Spanish Workers' Statute Using Large Language Models , year =. Legal Knowledge and Information Systems -. doi:10.3233/FAIA230983 , file =

work page doi:10.3233/faia230983
[54]

American Political Science Review , author=

A Grammar of Institutions , volume=. American Political Science Review , author=. 1995 , pages=. doi:10.2307/2082975 , number=

work page doi:10.2307/2082975 1995
[55]

2009 , publisher=

Understanding institutional diversity , author=. 2009 , publisher=

2009
[56]

QLoRA: Efficient Finetuning of Quantized LLMs , year =

Tim Dettmers and Artidoro Pagnoni and Ari Holtzman and Luke Zettlemoyer , bibsource =. QLoRA: Efficient Finetuning of Quantized LLMs , year =. Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023 , editor =

2023
[57]

2024 , author =

Annotation of Reporting obligations in EU legislation dataset , url =. 2024 , author =

2024
[58]

The hitchhiker ' s guide to testing statistical significance in natural language processing

Dror, Rotem and Baumer, Gili and Shlomov, Segev and Reichart, Roi , booktitle =. The Hitchhiker's Guide to Testing Statistical Significance in Natural Language Processing , year =. doi:10.18653/v1/P18-1128 , file =

work page doi:10.18653/v1/p18-1128
[59]

2018 , edition =

Krippendorff, Klaus , title =. 2018 , edition =

2018
[60]

Bioinformatics , volume=

BioBERT: a pre-trained biomedical language representation model for biomedical text mining , author=. Bioinformatics , volume=. 2020 , publisher=

2020
[61]

2019 , address =

Beltagy, Iz and Lo, Kyle and Cohan, Arman , booktitle =. 2019 , address =. doi:10.18653/v1/D19-1371 , file =

work page doi:10.18653/v1/d19-1371 2019
[62]

Deontic Sentence Classification Using Tree Kernel Classifiers

Liga, Davide and Palmirani, Monica. Deontic Sentence Classification Using Tree Kernel Classifiers. Intelligent Systems and Applications. 2023

2023
[63]

Schwartz, J

Schwartz, Roy and Dodge, Jesse and Smith, Noah A. and Etzioni, Oren , journal =. Green AI , year =. doi:10.1145/3381831 , file =

work page doi:10.1145/3381831
[64]

and Lee, Su-In , title =

Lundberg, Scott M. and Lee, Su-In , title =. Proceedings of the 31st International Conference on Neural Information Processing Systems , pages =. 2017 , isbn =

2017
[65]

Reporting Requirement Metadata Vocabulary (RRMV) , year =
[66]

Low-Resource Deontic Modality Classification in EU Legislation , year =

Minkova, Kristina and Chakravarthy, Shashank and Dijck, Gijs , booktitle =. Low-Resource Deontic Modality Classification in EU Legislation , year =. doi:10.18653/v1/2023.nllp-1.15 , file =

work page doi:10.18653/v1/2023.nllp-1.15 2023
[67]

NOMOS: Navigating

Pennisi, Andrea and Gonz. NOMOS: Navigating. Proceedings of the. 2023 , organization =. doi:10.18653/v1/2023.nllp-1.2 , file =

work page doi:10.18653/v1/2023.nllp-1.2 2023
[68]

Scott Marcus and Apostolos Thomadakis , title =

J. Scott Marcus and Apostolos Thomadakis , title =. 2025 , abstract =. doi:10.2861/6089952 , url=

work page doi:10.2861/6089952 2025
[69]

Davide Liga and Livio Robaldo. Fine-tuning GPT-3 for legal rule classification. 2023. doi:10.1016/j.clsr.2023.105864
[70]

Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt. 2021.
[71]

Chakravarthy, Shashank M and Van Dijck, Gijs and Wilbik, Anna. To NER or Not to NER? A Case Study of Low-Resource Deontic Modalities in EU Legislation. 2025. doi:10.1109/CI-NLPSoMeCompanion65206.2025.10977902
[72]

Financial Industry Business Ontology (FIBO): Legal Obligation.
[73]

Hanindhito, Bagus and Patel, Bhavesh and John, Lizy K. Proceedings of the 16th ACM/SPEC International Conference on Performance Engineering, 2025. doi:10.1145/3676151.3719377
[74]

Ariai, Farid and Mackenzie, Joel and Demartini, Gianluca. ACM Comput. Surv., December 2025. doi:10.1145/3777009
[75]

SaulLM-7B: A pioneering Large Language Model for Law. 2024.
[76]

Jiang, Albert Q and Sablayrolles, Alexandre and Mensch, Arthur and Bamford, Chris and Chaplot, Devendra Singh and de las Casas, Diego and Bressand, Florian and Lengyel, Gianna and Lample, Guillaume and Saulnier, Lucile and others. arXiv:2310.06825
[77]

Dubey, Abhimanyu and Jauhri, Abhinav and Pandey, Abhinav and Kadian, Abhishek and Al-Dahle, Ahmad and Letman, Aiesha and Mathur, Akhil and Schelten, Alan and Yang, Amy and Fan, Angela and others. arXiv:2407.21783
[78]

Molinari, Marianna and Amantea, Ilaria Angela and Quaranta, Marinella and Governatori, Guido. Principles of Law: LLMs vs RegEx. Legal Knowledge and Information Systems.
[79]

Grundler, Giulia and Santin, Piera and Fidelangeli, Alessia and Mignone, Rachele and Galli, Federico and Galassi, Andrea and Contissa, Giuseppe and di Caro, Luigi and Torroni, Paolo. Automated Extraction of Judicial Interpretative Formulas in EU Case Law on VAT. Legal Knowledge and Information Systems.
[80]

Akiba, Takuya and Sano, Shotaro and Yanase, Toshihiko and Ohta, Takeru and Koyama, Masanori. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019. doi:10.1145/3292500.3330701
