EURO-5K: When Does Domain Pretraining Matter? Benchmarking Transformers for EU Reporting Obligation Extraction
Pith reviewed 2026-06-28 11:12 UTC · model grok-4.3
The pith
Fully fine-tuned generic BERT matches legal BERT at 0.89 F1 for EU reporting obligation extraction, with LLMs reaching the same level.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
On the EURO-5K corpus, fully fine-tuned generic and legal BERT models both reach 0.89 F1; fine-tuned LLMs match encoder accuracy at the sentence level; legal pretraining supplies only small gains for generative models but clear gains under parameter-efficient tuning; and all methods converge around three thousand samples with diminishing returns thereafter.
What carries the argument
The EURO-5K sentence-level dataset of reporting obligations paired with challenging negatives, used to compare full fine-tuning versus QLoRA on generic versus legal-pretrained encoders and on LLMs.
If this is right
- Legal pretraining speeds early learning when only small amounts of task data are available.
- On external regulatory texts, models trained on the corpus behave as specialised reporting-obligation extractors rather than generic regulatory classifiers.
- Parameter-efficient methods gain more from legal pretraining than full fine-tuning does.
- Performance plateaus near three thousand examples, indicating that additional data yields little further improvement.
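The plateau claim in the last bullet can be turned into a simple check. The sketch below is one hypothetical way to do so and is not the paper's procedure: the function name `plateau_point`, the `min_gain` threshold, and every number in `example_curve` are illustrative assumptions, not results from the paper.

```python
def plateau_point(curve, min_gain=0.005):
    """Return the smallest training-set size after which no larger
    size improves F1 by at least `min_gain`.

    `curve` is an iterable of (n_samples, f1) pairs.
    `min_gain` is an arbitrary illustrative threshold.
    """
    points = sorted(curve)
    if not points:
        raise ValueError("curve must contain at least one point")
    for i, (n_samples, f1) in enumerate(points):
        # Best score reached with strictly more data than this point.
        best_later = max((score for _, score in points[i + 1:]), default=f1)
        if best_later - f1 < min_gain:
            return n_samples
    return points[-1][0]  # unreachable in practice; kept for clarity


# Made-up learning curve for illustration only (NOT the paper's numbers).
example_curve = [
    (500, 0.70),
    (1000, 0.80),
    (2000, 0.86),
    (3000, 0.885),
    (4000, 0.888),
    (5000, 0.889),
]

if __name__ == "__main__":
    print(plateau_point(example_curve))
```

A stricter or looser `min_gain` moves the reported plateau, so any such number is only meaningful together with the threshold used.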
Where Pith is reading between the lines
- When full fine-tuning is impractical due to compute limits, legal pretraining retains practical value for similar extraction tasks.
- The same benchmark setup could test whether domain pretraining patterns hold for obligation extraction in other regulatory domains such as financial or environmental rules.
- Extending the models from sentence detection to structured extraction of details like deadlines or responsible parties would be a direct next measurement.
Load-bearing premise
The sentence-level labels and the choice of hard negative examples from the 136 acts correctly reflect the practical boundary between genuine reporting obligations and similar non-obligatory text.
What would settle it
A substantial drop in F1 when the trained models are applied to a fresh collection of EU acts outside the original 136 would show that the learned distinction does not generalize.
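As a rough sketch of that test, the snippet below computes binary F1 on an in-corpus test split and on a fresh set of acts, then reports the difference. It assumes sentence-level binary labels (1 = reporting obligation, 0 = not), which may differ from the paper's exact token- or span-level scoring; the function names and the toy labels are hypothetical.

```python
def binary_f1(gold, pred):
    """F1 for the positive class (label 1) over parallel label lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_drop(in_gold, in_pred, out_gold, out_pred):
    """In-corpus F1 minus F1 on sentences from acts outside the corpus."""
    return binary_f1(in_gold, in_pred) - binary_f1(out_gold, out_pred)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    in_gold, in_pred = [1, 1, 0, 0], [1, 1, 0, 0]
    out_gold, out_pred = [1, 1, 1, 0, 0, 1], [1, 1, 0, 0, 1, 1]
    print(f1_drop(in_gold, in_pred, out_gold, out_pred))
```

What counts as a "substantial" drop is a judgment call that the source does not quantify.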
Original abstract
Extracting reporting obligations from EU legislation is critical for assessing and reducing regulatory reporting burden. However, distinguishing reporting requirements from structurally similar provisions requires specialised legal understanding. Current legal NLP methods lack specialised datasets with clear guidelines and comparative evaluation of extraction paradigms and domain adaptation strategies. We curate EURO-5K, a corpus of sentence-level reporting obligations and challenging negative examples from 136 EU legislative acts. On this dataset, we train and compare discriminative token-classification models (BERT-style) and generative span-extraction models (LLMs), evaluating both full fine-tuning and parameter-efficient QLoRA against baselines (pattern and dependency-based extraction, few-shot prompting). Results show that fully fine-tuned generic and legal BERT models achieve similar performance (0.89 F1), while fine-tuned LLMs match encoder accuracy for sentence-level extraction. Legal pretraining offers only small gains for generative models. In contrast, it is clearly beneficial when adaptation capacity is constrained, as parameter-efficient tuning of Legal-BERT outperforms its generic counterpart. Learning curve analysis demonstrates that legal pretraining accelerates early learning with minimal data. All approaches converge around 3K samples with diminishing returns thereafter, validating dataset sufficiency. Cross-dataset evaluation on two external regulatory corpora shows that our models behave as specialised reporting obligation extractors rather than generic regulatory classifiers. We release EURO-5K, trained models, and an interactive demo with explainability visualizations and structured RDF export. These demonstrate that both paradigms and parameter-efficient training provide practical tools for regulatory compliance automation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces EURO-5K, a sentence-level corpus of reporting obligations and challenging negative examples drawn from 136 EU legislative acts. It benchmarks discriminative token-classification models (BERT variants) against generative span-extraction models (LLMs), comparing full fine-tuning and QLoRA, and reports that fully fine-tuned generic and legal BERTs both reach 0.89 F1, that fine-tuned LLMs match encoder accuracy, that legal pretraining yields only small gains for generative models but clear gains under parameter-efficient tuning, that learning curves converge around 3K samples, and that cross-dataset tests indicate the models act as specialized extractors rather than generic regulatory classifiers. The dataset, models, and interactive demo are released.
Significance. If the annotations are reliable, the results supply concrete evidence on the conditions under which domain-specific pretraining matters for legal information extraction and demonstrate the practical viability of both encoder and LLM paradigms for regulatory compliance tasks. The public release of the corpus and models is a clear strength that enables direct replication and extension.
major comments (1)
- §3 (EURO-5K curation): The manuscript supplies no annotation guidelines, inter-annotator agreement figures, legal-expert involvement details, or explicit sampling rules for the 'challenging negative examples.' Because every headline result (0.89 F1, LLM parity, conditional pretraining benefit, learning-curve and cross-dataset claims) rests on the correctness of these sentence-level labels, the omission is load-bearing for the central empirical claims.
minor comments (1)
- Abstract and §4: The phrase 'challenging negative examples' is used without a concise summary of the selection heuristic; a one-sentence clarification would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the annotation process. We agree that additional details are necessary to substantiate the reliability of the labels and will incorporate them in the revised manuscript.
Point-by-point responses
- Referee: §3 (EURO-5K curation): the manuscript supplies no annotation guidelines, inter-annotator agreement figures, legal-expert involvement details, or explicit sampling rules for the 'challenging negative examples.' Because every headline result (0.89 F1, LLM parity, conditional pretraining benefit, learning-curve and cross-dataset claims) rests on the correctness of these sentence-level labels, the omission is load-bearing for the central empirical claims.
  Authors: We acknowledge the omission of detailed annotation information in the current version of the manuscript. In the revised version, we will expand §3 to include: (1) the full annotation guidelines used by the annotators, (2) inter-annotator agreement statistics (e.g., Cohen's kappa or F1 agreement on a double-annotated subset), (3) details on legal expert involvement, including the number of experts, their background, and how disagreements were resolved, and (4) explicit sampling rules and criteria for selecting the challenging negative examples. This will allow readers to better assess the quality of EURO-5K and support the validity of our experimental results. We believe these additions will address the referee's concern without altering the core findings.
  Revision: yes
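For readers unfamiliar with the agreement statistic named in the response, the following is a minimal sketch of Cohen's kappa for two annotators labelling the same sentences. It is a generic textbook computation, not the authors' annotation tooling; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy double-annotated subset (1 = reporting obligation); illustrative only.
    annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
    annotator_2 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
    print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

Kappa corrects raw agreement for chance, which matters when one class (here, non-obligations) dominates the sample.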
Circularity Check
No significant circularity: empirical benchmarking on held-out data
This is a purely empirical study that curates EURO-5K, trains and evaluates models (BERT-style token classifiers and LLM span extractors under full fine-tuning and QLoRA), and reports measured F1 scores, learning curves, and cross-dataset results on held-out sentences. No derivations, fitted parameters renamed as predictions, uniqueness theorems, or ansatzes appear; all claims are direct experimental outcomes on external test data rather than quantities defined by the authors' own modeling choices. Self-citations, if any, are not load-bearing for any central result. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math: Standard supervised learning assumptions hold (i.i.d. samples, appropriate loss functions for token classification and span extraction).