In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp

Dodge, J · 2021 · DOI 10.18653/v1/2021.emnlp-main.98

14 Pith papers cite this work. Polarity classification is still indexing.

14 Pith papers citing it

open at publisher browse 14 citing papers

citation-role summary

background 3 other 1

citation-polarity summary

background 3 unclear 1

representative citing papers

Pretraining Exposure Explains Popularity Judgments in Large Language Models

cs.CL · 2026-05-12 · unverdicted · novelty 8.0

LLM popularity judgments align more closely with pretraining data exposure counts than with Wikipedia popularity, with stronger effects in pairwise comparisons and larger models.

Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.

SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models

cs.CL · 2026-04-16 · unverdicted · novelty 7.0

SPAGBias reveals that LLMs form nuanced gender associations with specific urban micro-spaces that exceed real-world distributions and produce failures in planning and descriptive tasks.

Who Owns the AI Recommendation? A Multi-Industry Empirical Map of Brand Category Ownership Across Large Language Models

cs.IR · 2026-06-22 · unverdicted · novelty 6.0

Empirical study of LLM brand recommendations across industries finds moderate concentration (mean Gini 0.28) and low cross-model agreement (41.6%) on top brands.

GRASPrune: Global Gating for Budgeted Structured Pruning of Large Language Models

cs.AI · 2026-04-21 · unverdicted · novelty 6.0

GRASPrune removes 50% of parameters from LLaMA-2-7B via global gating and projected straight-through estimation, reaching 12.18 WikiText-2 perplexity and competitive zero-shot accuracy after four epochs on 512 calibration sequences.

A Human-Centric Framework for Data Attribution in Large Language Models

cs.CY · 2026-02-11 · unverdicted · novelty 6.0

Introduces a parameter-driven framework for data attribution in LLMs that enables negotiation among creators, users, and intermediaries to meet stakeholder goals within the data economy.

DataComp-LM: In search of the next generation of training sets for language models

cs.LG · 2024-06-17 · unverdicted · novelty 6.0

DCLM-Baseline dataset lets a 7B model reach 64% 5-shot MMLU accuracy after 2.6T tokens, beating prior open-data models by 6.6 points on MMLU with 40% less compute.

BloombergGPT: A Large Language Model for Finance

cs.LG · 2023-03-30 · conditional · novelty 6.0

BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.

Atlas: Few-shot Learning with Retrieval Augmented Language Models

cs.CL · 2022-08-05 · unverdicted · novelty 6.0

Atlas reaches over 42% accuracy on Natural Questions with only 64 examples, outperforming a 540B-parameter model by 3% with 50x fewer parameters.

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

cs.CL · 2022-04-14 · accept · novelty 6.0

GPT-NeoX-20B is a publicly released 20B parameter autoregressive language model trained on the Pile that shows strong gains in five-shot reasoning over similarly sized prior models.

Survey-aware Machine Learning: A Guideline for Valid Population Health Inference based on Scoping Review

stat.ML · 2026-05-09 · unverdicted · novelty 5.0

The authors introduce Survey-aware Machine Learning (SaML) as a nine-step guideline that integrates survey design metadata throughout the ML lifecycle to enable valid population inference from complex health surveys.

InternLM2 Technical Report

cs.CL · 2024-03-26 · unverdicted · novelty 5.0

InternLM2 is a new open-source LLM that outperforms prior versions on 30 benchmarks and long-context tasks through scaled pre-training to 32k tokens and a conditional online RLHF alignment strategy.

Benchmark Data Contamination of Large Language Models: A Survey

cs.CL · 2024-06-06 · unverdicted · novelty 3.0

A survey reviewing benchmark data contamination in LLMs, its impact on evaluation, and alternative assessment approaches.

The Refusal--Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models

cs.AI · 2026-05-06

citing papers explorer

Showing 14 of 14 citing papers.

Pretraining Exposure Explains Popularity Judgments in Large Language Models cs.CL · 2026-05-12 · unverdicted · none · ref 5
LLM popularity judgments align more closely with pretraining data exposure counts than with Wikipedia popularity, with stronger effects in pairwise comparisons and larger models.
Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion cs.CL · 2026-04-20 · unverdicted · none · ref 76
TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.
SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models cs.CL · 2026-04-16 · unverdicted · none · ref 17
SPAGBias reveals that LLMs form nuanced gender associations with specific urban micro-spaces that exceed real-world distributions and produce failures in planning and descriptive tasks.
Who Owns the AI Recommendation? A Multi-Industry Empirical Map of Brand Category Ownership Across Large Language Models cs.IR · 2026-06-22 · unverdicted · none · ref 17
Empirical study of LLM brand recommendations across industries finds moderate concentration (mean Gini 0.28) and low cross-model agreement (41.6%) on top brands.
GRASPrune: Global Gating for Budgeted Structured Pruning of Large Language Models cs.AI · 2026-04-21 · unverdicted · none · ref 36
GRASPrune removes 50% of parameters from LLaMA-2-7B via global gating and projected straight-through estimation, reaching 12.18 WikiText-2 perplexity and competitive zero-shot accuracy after four epochs on 512 calibration sequences.
A Human-Centric Framework for Data Attribution in Large Language Models cs.CY · 2026-02-11 · unverdicted · none · ref 63
Introduces a parameter-driven framework for data attribution in LLMs that enables negotiation among creators, users, and intermediaries to meet stakeholder goals within the data economy.
DataComp-LM: In search of the next generation of training sets for language models cs.LG · 2024-06-17 · unverdicted · none · ref 56
DCLM-Baseline dataset lets a 7B model reach 64% 5-shot MMLU accuracy after 2.6T tokens, beating prior open-data models by 6.6 points on MMLU with 40% less compute.
BloombergGPT: A Large Language Model for Finance cs.LG · 2023-03-30 · conditional · none · ref 30
BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.
Atlas: Few-shot Learning with Retrieval Augmented Language Models cs.CL · 2022-08-05 · unverdicted · none · ref 186
Atlas reaches over 42% accuracy on Natural Questions with only 64 examples, outperforming a 540B-parameter model by 3% with 50x fewer parameters.
GPT-NeoX-20B: An Open-Source Autoregressive Language Model cs.CL · 2022-04-14 · accept · none · ref 26
GPT-NeoX-20B is a publicly released 20B parameter autoregressive language model trained on the Pile that shows strong gains in five-shot reasoning over similarly sized prior models.
Survey-aware Machine Learning: A Guideline for Valid Population Health Inference based on Scoping Review stat.ML · 2026-05-09 · unverdicted · none · ref 33
The authors introduce Survey-aware Machine Learning (SaML) as a nine-step guideline that integrates survey design metadata throughout the ML lifecycle to enable valid population inference from complex health surveys.
InternLM2 Technical Report cs.CL · 2024-03-26 · unverdicted · none · ref 173
InternLM2 is a new open-source LLM that outperforms prior versions on 30 benchmarks and long-context tasks through scaled pre-training to 32k tokens and a conditional online RLHF alignment strategy.
Benchmark Data Contamination of Large Language Models: A Survey cs.CL · 2024-06-06 · unverdicted · none · ref 35
A survey reviewing benchmark data contamination in LLMs, its impact on evaluation, and alternative assessment approaches.
The Refusal--Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models cs.AI · 2026-05-06 · unreviewed · ref 15

In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer