In Findings of the ACL: EMNLP 2021, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.)
4 Pith papers cite this work.
Citation roles: background (1)
Citation polarities: background (1)
Citing papers
- EvoPool: Evolutionary Programmatic Annotation for Label-Efficient Specialized Supervision
  EvoPool evolves pools of programmatic annotators that outperform LLM annotation by 0.141 average macro-F1 on 7 of 8 specialized tasks while running thousands of times faster.
- Guidelines for Empirical Studies in Software Engineering involving Large Language Models
  The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.
- Finding Meaning in Embeddings: Concept Separation Curves
  Concept Separation Curves provide a classifier-independent method to visualize and quantify how sentence embeddings distinguish conceptual meaning from syntactic variations across languages and domains.
- A Survey on Knowledge Distillation of Large Language Models
  A comprehensive survey of knowledge distillation for LLMs structured around algorithms, skill enhancement, and vertical applications, highlighting data augmentation as a key enabler.