SciNLP is the first full-text entity and relation extraction benchmark for the NLP domain, built from 60 manually annotated publications and used to evaluate models and construct a domain knowledge graph.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
5 Pith papers cite this work.
citing papers explorer
- MultiSynt/MT: Trillion-Token Multi-Parallel Pre-Training Data Translated Across 36 Languages
MultiSynt/MT supplies 4.8 trillion translated tokens in 36 languages from 100B English tokens, letting LLMs match native-data baselines with 72% fewer tokens and beat them by 15% at equal budget.
- Scaling Performance and Low-Resource Annotation with Many-Shot In-Context Learning for Named Entity Recognition
Many-shot ICL with LLMs matches or exceeds supervised BERT on NER and generates high-quality labels for low-resource settings, producing ~10% absolute F1 gains when used to fine-tune BERT.
- Building Agent Harnesses for Scientific Curation from Multimodal Sources
Beaver agent harness achieves 81.0 GRAS on multimodal scientific curation, outperforming frontier agents by over 23 points through scaffolding and evidence tooling.
- Task Decomposition for Efficient Annotation
Decomposing annotation tasks using centers from centering theory reduces aggregate inferential load via a degrees-of-freedom model and enables better sub-task allocation.