Science 359(6380):1146--1151
11 Pith papers cite this work.
citing papers explorer
- ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection
  ReMMD presents ReMMDBench (500 samples, 2756 images, five languages, five-way veracity) and ReMMD-Agent, which achieves 41.80% accuracy and 39.12% macro-F1 on five-way classification with GPT-5.2 while cutting costs versus prior agents.
- PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media
  PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.
- Attributing Emergence in Million-Agent Systems
  A scalable Aumann-Shapley attribution method for million-agent systems reveals that small-scale samples structurally misattribute emergence under nonlinear macro indicators, as shown by the Attribution Scaling Bias theorem.
- A Shared IPTC Topic Space for Cross-Source Topic Modelling
  A reproducible framework maps topics from separate corpora into a shared space anchored by the 94 IPTC Media Topics taxonomy via guided BERTopic, centroid scoring, and upward collapse to 17 parent topics.
- BOUTEF: A Multilingual Corpus for FakeNews in North Africa -- Language as a Weapon
  BOUTEF is a publicly available multilingual corpus for fake news research in Algeria and Tunisia, with narratives, comments, and debunkings across multiple languages and dialects, accompanied by thematic and engagement analyses.
- Guardrail Selection in Line Charts to Contextualize Persuasive Visualizations
  Guardrail sampling strategies embedded in line charts increase user trust, improve accuracy of performance judgments, and raise perceived completeness of context in persuasive visualizations for COVID-19 and stock data.
- On the Opportunities and Risks of Foundation Models
  Foundation models are large adaptable AI systems with emergent capabilities that offer broad opportunities but carry risks from homogenization, opacity, and inherited defects across downstream applications.
- Language Mutations Sustain the Persistences of Conspiracy Theories on Social Media
  Survival analysis of three years of X posts shows conspiracy claims with greater semantic mutations have substantially longer lifespans, linked to changes in pronouns, social words, cognitive terms, and actor-action-target structures.
- Nonlinear dynamics of information overload: Impact on source localization in complex networks
  Simulations show information overload decreases source localization effectiveness in networks, with Erdős-Rényi graphs more resilient than Barabási-Albert ones and a reversal where less dense networks perform better under strong overload.
- Position: Align AI to Our Aspirations, Not Our Flaws
  AI alignment should target objective floors of competence, accuracy, honesty, and lawfulness rather than aggregated human preferences.
- Long Live Fine-Tuning: Task-Specific Transformers Outperform Zero-Shot LLMs for Misinformation Response Classification on Reddit
  Fine-tuned RoBERTa achieves 0.62 macro-F1 on 900 Reddit comments, outperforming best zero-shot LLM at 0.50, with largest gap on detecting belief propagation.