BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
98 Pith papers cite this work, alongside 6,639 external citations.
claims ledger
- background: The retrieval system only manages to fetch information about Fleming's professional achievements in the discovery of penicillin. However, the document does not provide information about his educational background, thus the model generates a hallucinatory answer. […] inappropriately activated, blindly retrieving inaccurate information and consequently leading to an undesirable response. Consequently, several studies [75, 204, 228, 378] have proposed to make a shift from passive retrieval to adaptive retrieval
citing papers explorer
-
TokAlign++: Advancing Vocabulary Adaptation via Better Token Alignment
TokAlign++ learns token alignments between LLM vocabularies from monolingual representations to enable faster adaptation, better text compression, and effective token-level distillation across 15 languages with minimal steps.
-
BadSKP: Backdoor Attacks on Knowledge Graph-Enhanced LLMs with Soft Prompts
BadSKP poisons graph node embeddings to steer soft prompts in KG-enhanced LLMs, achieving high attack success rates where text-channel backdoors fail due to semantic anchoring.
-
Scratchpad Patching: Decoupling Compute from Patch Size in Byte-Level Language Models
Scratchpad Patching decouples compute from patch size in byte-level language models by inserting entropy-triggered scratchpads to update patch context dynamically.
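The entropy trigger is the core mechanism here. A minimal sketch of how such a trigger can gate patch boundaries, assuming a hypothetical `next_byte_probs` model and an illustrative threshold (neither is the paper's):

```python
import math

ENTROPY_THRESHOLD = 2.0  # hypothetical trigger level, in bits

def shannon_entropy(probs):
    """Entropy in bits of a next-byte distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def segment_bytes(byte_stream, next_byte_probs):
    """Close the current patch whenever next-byte entropy spikes.

    next_byte_probs(prefix) -> list of 256 probabilities (assumed interface).
    """
    patches, current = [], []
    for b in byte_stream:
        current.append(b)
        if shannon_entropy(next_byte_probs(current)) > ENTROPY_THRESHOLD:
            patches.append(bytes(current))  # high uncertainty: start a new patch
            current = []
    if current:
        patches.append(bytes(current))
    return patches
```

High next-byte entropy signals an unpredictable region, which is exactly where the extra scratchpad compute would be spent.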
-
Neural Cluster First, Route Second: One-Shot Capacitated Vehicle Routing via Differentiable Optimal Transport
Neural CFRS is a non-autoregressive one-shot framework for CVRP that uses entropic optimal transport for capacitated clustering and achieves competitive gaps on large instances.
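Capacitated clustering via entropic optimal transport reduces to Sinkhorn iterations with capacity-shaped column marginals; a minimal NumPy sketch under assumed unit demands (the cost matrix, capacities, and hyperparameters are illustrative, not the paper's setup):

```python
import numpy as np

def sinkhorn_capacitated(cost, capacities, n_iters=200, eps=0.1):
    """Soft-assign n customers to k clusters under capacity marginals.

    cost: (n, k) customer-to-cluster distances; capacities: (k,) summing to n.
    Returns a transport plan whose column sums respect cluster capacities.
    """
    n, k = cost.shape
    a = np.full(n, 1.0)           # each customer supplies one unit of demand
    b = capacities.astype(float)  # each cluster absorbs up to its capacity
    K = np.exp(-cost / eps)       # entropic kernel
    u, v = np.ones(n), np.ones(k)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]  # rows are soft cluster assignments

# Toy usage: 6 customers, 2 vehicles with capacity 3 each.
rng = np.random.default_rng(0)
plan = sinkhorn_capacitated(rng.random((6, 2)), np.array([3, 3]))
```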
-
SeBA: Semi-supervised few-shot learning via Separated-at-Birth Alignment for tabular data
SeBA is a joint-embedding framework that separates tabular data into two complementary views and aligns one view's representations to the nearest-neighbor structure of the other, improving feature-label relationships and achieving SOTA results in most benchmarks without relying on augmentations.
-
Accurate and Efficient Statistical Testing for Word Semantic Breadth
A new permutation test uses a Householder reflection to align word embedding clouds before testing dispersion differences, cutting Type-I error by 32.5% and running 23x faster on GPU.
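A minimal NumPy sketch of the two ingredients, under the assumption that alignment maps one cloud's mean direction onto the other's (the paper's exact alignment target and test statistic may differ):

```python
import numpy as np

def householder_align(A, B):
    """Reflect cloud A so its mean direction matches cloud B's."""
    x = A.mean(0) / np.linalg.norm(A.mean(0))
    y = B.mean(0) / np.linalg.norm(B.mean(0))
    v = x - y
    if np.linalg.norm(v) < 1e-12:  # already aligned
        return A
    v /= np.linalg.norm(v)
    return A - 2.0 * (A @ v)[:, None] * v[None, :]  # apply I - 2vv^T

def dispersion(X):
    """Mean distance to centroid, one simple notion of semantic breadth."""
    return np.mean(np.linalg.norm(X - X.mean(0), axis=1))

def permutation_test(A, B, n_perm=10_000, seed=0):
    """Permutation p-value for a dispersion difference after alignment."""
    rng = np.random.default_rng(seed)
    A = householder_align(A, B)
    pooled = np.vstack([A, B])
    observed = abs(dispersion(A) - dispersion(B))
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        pa, pb = pooled[idx[:len(A)]], pooled[idx[len(A):]]
        count += abs(dispersion(pa) - dispersion(pb)) >= observed
    return (count + 1) / (n_perm + 1)
```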
-
TRACE: Transport Alignment Conformal Prediction via Diffusion and Flow Matching Models
TRACE creates valid conformal prediction sets for complex generative models by scoring outputs via averaged denoising or velocity errors along stochastic transport paths instead of likelihoods.
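The scoring idea is easy to state as split conformal prediction with a denoising-error nonconformity score; a sketch assuming a generic `denoiser(x_noisy, sigma)` interface (hypothetical, not TRACE's exact estimator):

```python
import numpy as np

def transport_score(x, denoiser, noise_levels, rng):
    """Average denoising error of a candidate output x along a noise path."""
    errs = []
    for sigma in noise_levels:
        x_noisy = x + sigma * rng.standard_normal(x.shape)
        errs.append(np.linalg.norm(denoiser(x_noisy, sigma) - x))
    return float(np.mean(errs))

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal quantile over calibration scores."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0))
```

At test time, any candidate whose score falls at or below the calibrated threshold enters the prediction set, which gives the usual 1 - alpha marginal coverage guarantee.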
-
Towards Self-Referential Analytic Assessment: A Profile-Based Approach to L2 Writing Evaluation with LLMs
Using a self-referential intra-learner evaluation method, LLMs outperform single human raters at spotting relative weaknesses in L2 writing profiles on the ICNALE GRA dataset, while humans are better at spotting strengths.
-
How Language Models Process Negation
LLMs implement both attention-based suppression and constructive representations for negation, with the constructive mechanism dominant, despite poor accuracy caused by late-layer attention shortcuts.
-
TCDA: Thread-Constrained Discourse-Aware Modeling for Conversational Sentiment Quadruple Analysis
TCDA introduces TC-DAG to filter cross-thread noise while preserving temporal order and D-RoPE to align semantics across layers and reduce distance dilution, achieving state-of-the-art results on two DiaASQ benchmarks.
-
A Multi-View Media Profiling Suite: Resources, Evaluation, and Analysis
Presents MBFC-2025 dataset and multi-view embeddings with fusion methods for media bias and factuality, reporting SOTA results on ACL-2020 and new benchmarks on MBFC-2025.
-
InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees
InvEvolve evolves white-box inventory policies from LLMs with statistical safety guarantees and outperforms classical and deep learning methods on synthetic and real retail data.
-
Factual and Edit-Sensitive Graph-to-Sequence Generation via Graph-Aware Adaptive Noising
DLM4G applies graph-aware adaptive noising in a diffusion framework to generate text from graphs, outperforming larger autoregressive and diffusion baselines in factual grounding and edit sensitivity on three datasets plus molecule captioning.
-
EVENT5Ws: A Large Dataset for Open-Domain Event Extraction from Documents
EVENT5Ws is a new large-scale, manually verified open-domain event extraction dataset that benchmarks LLMs and demonstrates cross-context generalization.
-
Decoding Text Spans for Efficient and Accurate Named-Entity Recognition
SpanDec achieves competitive NER accuracy with improved efficiency by using a final-stage lightweight decoder for span representations and early candidate filtering to reduce redundant computation.
-
Learning Posterior Predictive Distributions for Node Classification from Synthetic Graph Priors
NodePFN pre-trains on synthetic graphs with controllable homophily and causal feature-label models to achieve 71.27 average accuracy on 23 node classification benchmarks without graph-specific training.
-
On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability
LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with robustness further shaped by model scale and embedding geometry.
-
Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization
STOMP extends direct preference optimization to the multi-objective setting via smooth Tchebysheff scalarization and standardization of observed rewards, achieving highest hypervolume in eight of nine protein engineering evaluations.
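For reference, one standard smooth Tchebysheff construction (the paper's exact variant and its standardization step are not reproduced here): with m objectives f_i, an ideal point z*, and preference weights lambda, the non-smooth max over weighted objective gaps is softened with a log-sum-exp.

```latex
g(x) = \max_{1 \le i \le m} \lambda_i \bigl( z_i^{\ast} - f_i(x) \bigr)
\;\;\rightsquigarrow\;\;
g_{\mu}(x) = \mu \log \sum_{i=1}^{m} \exp\!\left( \frac{\lambda_i \bigl( z_i^{\ast} - f_i(x) \bigr)}{\mu} \right)
```

Since the log-sum-exp satisfies g(x) <= g_mu(x) <= g(x) + mu log m, the surrogate converges to the Tchebysheff objective as mu goes to 0 while remaining differentiable.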
-
Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
Multimodal contrastive learning using multilinear products is fragile to single bad modalities, and a gated version improves top-1 retrieval accuracy on synthetic and real trimodal data.
-
Moshi: a speech-text foundation model for real-time dialogue
Moshi is the first real-time full-duplex spoken large language model that casts dialogue as speech-to-speech generation using parallel audio streams and an inner monologue of time-aligned text tokens.
-
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
HSTU-based generative recommenders with 1.5 trillion parameters scale as a power law with compute up to GPT-3 scale, outperform baselines by up to 65.8% NDCG, run 5-15x faster than FlashAttention2 on long sequences, and improve online A/B metrics by 12.4%.
-
M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
M3-Embedding is a single model for multi-lingual, multi-functional, and multi-granular text embeddings trained via self-knowledge distillation that achieves new state-of-the-art results on multilingual, cross-lingual, and long-document retrieval benchmarks.
-
The Power of Scale for Parameter-Efficient Prompt Tuning
Prompt tuning matches full model tuning performance on large language models while tuning only a small fraction of parameters and improves robustness to domain shifts.
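The mechanism is small enough to sketch in full: a matrix of learnable prompt vectors is prepended to the input embeddings while the language model stays frozen (a minimal PyTorch sketch; dimensions and init scale are illustrative):

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt vectors prepended to input embeddings."""

    def __init__(self, n_tokens=20, d_model=768):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)

    def forward(self, input_embeds):  # (batch, seq, d_model)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Training freezes the LM and optimizes only SoftPrompt.parameters().
```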
-
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-tuning matches or exceeds fine-tuning on NLG tasks by optimizing a continuous prefix using 0.1% of parameters while keeping the LM frozen.
-
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
RAG models set new state-of-the-art results on open-domain QA by retrieving Wikipedia passages and conditioning a generative model on them, while also producing more factual text than parametric baselines.
-
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
ALBERT reduces BERT parameters via embedding factorization and layer sharing, adds inter-sentence coherence pretraining, and reaches SOTA on GLUE, RACE, and SQuAD with fewer parameters than BERT-large.
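The embedding factorization is a two-step lookup-then-project, sketched below with illustrative dimensions: the V x H embedding table becomes a V x E table plus an E x H projection.

```python
import torch.nn as nn

class FactorizedEmbedding(nn.Module):
    """ALBERT-style factorization: V x H becomes V x E plus E x H.

    With V=30k, H=4096, E=128: 30k*4096 ~ 123M params collapses to
    30k*128 + 128*4096 ~ 4.4M params.
    """

    def __init__(self, vocab_size=30_000, embed_dim=128, hidden_dim=4096):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # V x E lookup
        self.project = nn.Linear(embed_dim, hidden_dim)   # E x H projection

    def forward(self, token_ids):
        return self.project(self.embed(token_ids))
```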
-
GESR: A Genetic Programming-Based Symbolic Regression Method with Gene Editing
GESR uses two BERT models to direct mutations and crossovers inside genetic programming, yielding higher efficiency and competitive accuracy on symbolic regression benchmarks.
-
An Annotation Scheme and Classifier for Personal Facts in Dialogue
An extended annotation scheme with new categories and attributes plus a Gemma-300M-based multi-head classifier achieves 81.6% macro F1 on personal fact classification, outperforming few-shot LLM baselines by nearly 9 points with lower compute.
-
When Reviews Disagree: Fine-Grained Contradiction Analysis in Scientific Peer Reviews
Introduces the RevCI benchmark and the IMPACT multi-agent framework for evidence-level contradiction detection and graded intensity scoring in peer reviews, distilled into the efficient TIDE model.
-
Molecules Meet Language: Confound-Aware Representation Learning and Chemical Property Steering in Transformer-VAE Latent Spaces
Chemically meaningful steering for properties like cLogP and TPSA emerges in entangled Transformer-VAE latent spaces only after controlling for SELFIES representation confounds through residualization and decoded traversals.
-
A Unified Benchmark for Evaluating Knowledge Graph Construction Methods and Graph Neural Networks
A dual-purpose benchmark supplies two text-derived knowledge graphs and one expert reference graph on the same biomedical corpus to jointly measure construction method quality and GNN robustness via semi-supervised node classification.
-
Perturbation is All You Need for Extrapolating Language Models
Perturbing prefixes to semantic neighbors during training creates a hierarchical noise model that improves language model predictions on token sequences outside the training corpus support.
-
Verbal-R3: Verbal Reranker as the Missing Bridge between Retrieval and Reasoning
Verbal-R3 uses a verbal reranker to generate analytic narratives that guide retrieval and reasoning in LLMs, achieving SOTA results on complex QA benchmarks.
-
Compared to What? Baselines and Metrics for Counterfactual Prompting
Counterfactual prompting effects on LLMs are often indistinguishable from those caused by meaning-preserving paraphrases, causing most previously reported demographic sensitivities to disappear under proper statistical comparison.
-
Kernel Affine Hull Machines for Compute-Efficient Query-Side Semantic Encoding
Kernel Affine Hull Machines map lexical features to semantic embeddings via RKHS and least-mean-squares, outperforming adapters in reconstruction and retrieval metrics while reducing latency 8.5-fold on a legal benchmark.
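Kernel ridge regression from lexical features to semantic embeddings is one standard RKHS-plus-least-squares instantiation of this idea, sketched below (a stand-in, not the paper's affine-hull construction; the RBF kernel and regularizer are assumptions):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gram matrix of an RBF kernel between row-wise feature sets."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_kernel_map(X_lex, Y_sem, lam=1e-3):
    """Least-squares weights so K(X, .) @ alpha approximates embeddings."""
    K = rbf_kernel(X_lex, X_lex)
    return np.linalg.solve(K + lam * np.eye(len(K)), Y_sem)

def encode(X_query, X_lex, alpha):
    """Query-side semantic encoding without running a neural encoder."""
    return rbf_kernel(X_query, X_lex) @ alpha
```

The appeal for the query side is that encoding reduces to one kernel evaluation and a matrix product, which is where the reported latency reduction would come from.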
-
Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe
An encoding probe reconstructs transformer representations from acoustic, phonetic, syntactic, lexical and speaker features, showing independent syntactic/lexical contributions and training-dependent speaker effects.
-
PrismAgent: Illuminating Harm in Memes via a Zero-Shot Interpretable Multi-Agent Framework
PrismAgent deploys four specialized LLM agents in sequence to analyze meme intent, gather context, make preliminary judgments, and deliver a final harm verdict, outperforming prior zero-shot methods on three public datasets.
-
Cultural Benchmarking of LLMs in Standard and Dialectal Arabic Dialogues
ArabCulture-Dialogue dataset shows LLMs perform worse on dialectal Arabic than Modern Standard Arabic across cultural reasoning, translation, and generation tasks.
-
A Survey of Reasoning-Intensive Retrieval: Progress and Challenges
A survey that categorizes RIR benchmarks by domain and modality, proposes a taxonomy for integrating reasoning into retrieval pipelines, and outlines key challenges.
-
Structural Generalization on SLOG without Hand-Written Rules
A neural cellular automaton learns compositional rules from data alone to achieve structural generalization on the SLOG semantic parsing benchmark, reaching 67.3% accuracy and fully succeeding on 11 of 17 categories.
-
GLIER: Generative Legal Inference and Evidence Ranking for Legal Case Retrieval
GLIER reformulates legal case retrieval as generative inference over latent legal variables like charges and elements, then fuses generative, structural, and lexical signals, outperforming baselines on LeCaRD datasets with strong performance at 10% training data.
-
Empirical Insights of Test Selection Metrics under Multiple Testing Objectives and Distribution Shifts
A broad empirical benchmark shows how 15 existing test selection metrics perform for fault detection, performance estimation, and retraining under corrupted, adversarial, temporal, natural, and label shifts across image, text, and Android data.
-
MiMIC: Mitigating Visual Modality Collapse in Universal Multimodal Retrieval While Avoiding Semantic Misalignment
MiMIC mitigates visual modality collapse and semantic misalignment in universal multimodal retrieval via fusion-in-decoder architecture and robust single-modality training.
-
LLM Safety From Within: Detecting Harmful Content with Internal Representations
SIREN identifies safety neurons via linear probing on internal LLM layers and combines them with adaptive weighting to detect harm, outperforming prior guard models with 250x fewer parameters.
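Linear probing over cached layer activations is the standard first step for this kind of analysis; a sketch assuming a precomputed `hidden_states` cache (hypothetical layout, and omitting SIREN's adaptive-weighting stage):

```python
from sklearn.linear_model import LogisticRegression

def fit_layer_probes(hidden_states, labels):
    """Fit one linear probe per layer on cached activations.

    hidden_states: dict layer -> (n_examples, d_model) array (assumed cache).
    labels: (n_examples,) harmful/benign annotations.
    Layers whose probes separate the classes well are candidate safety signals.
    """
    probes = {}
    for layer, acts in hidden_states.items():
        probe = LogisticRegression(max_iter=1000).fit(acts, labels)
        probes[layer] = (probe, probe.score(acts, labels))  # in-sample sanity check
    return probes
```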
-
Towards E-Value Based Stopping Rules for Bayesian Deep Ensembles
E-value sequential tests enable early stopping of MCMC sampling in Bayesian deep ensembles, often needing only a fraction of the full budget while improving over standard deep ensembles.
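The generic machinery behind any such rule, for reference (the paper's concrete e-value construction is not reproduced here): a running product of conditionally valid e-values is an e-process, and Ville's inequality lets sampling stop the first time it crosses 1/alpha while controlling error at level alpha.

```latex
E_t = \prod_{s=1}^{t} e_s, \qquad
\mathbb{E}\bigl[\, e_s \mid \mathcal{F}_{s-1} \bigr] \le 1 \text{ under } H_0
\;\;\Longrightarrow\;\;
\Pr\Bigl( \exists\, t : E_t \ge \tfrac{1}{\alpha} \Bigr) \le \alpha
```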
-
Enhancing Reinforcement Learning for Radiology Report Generation with Evidence-aware Rewards and Self-correcting Preference Learning
ESC-RL improves RL for radiology reports via group-wise evidence-aware rewards (GEAR) and LLM-driven self-correcting preference learning (SPL), reaching state-of-the-art on two chest X-ray datasets.
-
MetFuse: Figurative Fusion between Metonymy and Metaphor
MetFuse provides the first dataset of 1,000 meaning-aligned quadruplets fusing literal, metonymic, metaphoric, and hybrid sentences, which augments training to boost metonymy and metaphor classification performance on benchmarks.
-
TSUBASA: Improving Long-Horizon Personalization via Evolving Memory and Self-Learning with Context Distillation
TSUBASA improves long-horizon personalization in LLMs via dynamic memory evolution for writing and context-distillation self-learning for reading, outperforming Mem0 and Memory-R1 on Qwen-3 benchmarks while reducing token use.
-
SemLink: A Semantic-Aware Automated Test Oracle for Hyperlink Verification using Siamese Sentence-BERT
SemLink applies a Siamese SBERT model to detect semantic drift in hyperlinks, achieving 96% recall at 47.5 times the speed of GPT-5.2 using a new 60k-pair dataset.
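The core check is a Siamese similarity threshold; a minimal sketch with the `sentence-transformers` library (the model name and drift threshold are illustrative, not the paper's configuration):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative choice
DRIFT_THRESHOLD = 0.5                            # illustrative cutoff

def link_has_drifted(anchor_text, target_page_text):
    """Flag a hyperlink whose target no longer matches its anchor semantically."""
    emb = model.encode([anchor_text, target_page_text], convert_to_tensor=True)
    similarity = util.cos_sim(emb[0], emb[1]).item()
    return similarity < DRIFT_THRESHOLD
```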
-
Content Fuzzing for Escaping Information Cocoons on Digital Social Media
ContentFuzz uses stance-model confidence to guide LLM rewrites of posts that flip machine-assigned labels without altering human-perceived intent, tested across four models and three datasets in two languages.