VIGIL: An Extensible System for Real-Time Detection and Mitigation of Cognitive Bias Triggers

Bo Kang; Sander Noels; Tijl De Bie

arXiv: 2604.03261 · v2 · submitted 2026-03-12 · cs.CL · cs.CY · cs.HC

Pith reviewed 2026-05-15 11:36 UTC · model grok-4.3

classification: cs.CL · cs.CY · cs.HC

keywords: cognitive bias · browser extension · real-time detection · LLM reformulation · media literacy · disinformation mitigation · extensible system · online information

The pith

VIGIL is the first browser extension for real-time detection and mitigation of cognitive bias triggers in online text.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces VIGIL as a browser extension that identifies triggers of cognitive biases in web content while users scroll and supplies reformulated versions meant to reduce their persuasive effect. Existing media tools focus mainly on factual accuracy or source reliability, leaving a gap in addressing subtler manipulation techniques that exploit mental shortcuts. VIGIL combines scroll-synced detection, reversible LLM rephrasing, and options for offline or cloud processing, with support for third-party plugins that extend its capabilities. The authors position the tool as open-source and extensible so that validated detection methods can be added over time. If the approach works as described, readers could encounter flagged content and alternative phrasing without switching applications or losing access to the original text.

Core claim

VIGIL is the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. The system is built to be extensible with third-party plugins, with several plugins that are rigorously validated against NLP benchmarks already included.
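
One way to picture the privacy-tiered inference in this claim is a small router that never selects a backend above the tier the user has allowed. The sketch below is illustrative only. The tier names (in particular the middle `SELF_HOSTED` tier), the `pick_backend` and `reformulate` helpers, and the stand-in backends are assumptions made for this example and are not taken from the paper or the VIGIL code base. It is written in Python for brevity, although a browser extension would normally use a browser language.

```python
from enum import IntEnum
from typing import Callable, Dict


class PrivacyTier(IntEnum):
    """Hypothetical tiers; the paper only says 'fully offline to cloud'."""
    OFFLINE = 0      # text never leaves the browser
    SELF_HOSTED = 1  # assumed middle tier: a server the user controls
    CLOUD = 2        # text is sent to a third-party API


Backend = Callable[[str], str]


def pick_backend(backends: Dict[PrivacyTier, Backend],
                 max_tier: PrivacyTier) -> PrivacyTier:
    """Return the highest registered tier that does not exceed the user's limit.

    Preferring the highest allowed tier assumes that less private backends
    are more capable, which is an assumption of this sketch.
    """
    allowed = [tier for tier in backends if tier <= max_tier]
    if not allowed:
        raise LookupError(f"no backend available at or below {max_tier.name}")
    return max(allowed)


# Stand-in backends that only tag the text; real ones would call a model.
BACKENDS: Dict[PrivacyTier, Backend] = {
    PrivacyTier.OFFLINE: lambda text: f"[in-browser] {text}",
    PrivacyTier.CLOUD: lambda text: f"[cloud] {text}",
}


def reformulate(text: str, max_tier: PrivacyTier) -> str:
    """Route the text to the backend chosen for the user's privacy limit."""
    tier = pick_backend(BACKENDS, max_tier)
    return BACKENDS[tier](text)
```

With only an offline and a cloud backend registered, a user who caps the tier at `SELF_HOSTED` still gets the in-browser backend, so the text is never sent further than the user permitted.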

What carries the argument

The VIGIL browser extension that performs scroll-synced detection of cognitive bias triggers and supplies reversible LLM reformulations of the affected text.

If this is right

Users receive in-page alerts for bias triggers as they scroll and can toggle between original and reformulated text.
Detection and reformulation can run locally for privacy or use cloud models when needed.
Developers can add new plugins that extend detection methods while preserving the core interface.
The extension remains fully reversible so users retain access to the source material at all times.
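
The reversibility described in the points above can be sketched as a store that remembers the source text of every rewritten span. This is a minimal illustration, not the extension's actual data model: the `ReversiblePage` class, its method names, the span identifiers, and the example sentences are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReversiblePage:
    """Tracks displayed text per span and keeps originals for later restore."""
    spans: Dict[str, str]  # span id -> text currently shown to the user
    originals: Dict[str, str] = field(default_factory=dict, init=False)

    def apply(self, span_id: str, reformulated: str) -> None:
        # Save the source text only on the first rewrite, so repeated
        # reformulations still revert to what the page originally said.
        self.originals.setdefault(span_id, self.spans[span_id])
        self.spans[span_id] = reformulated

    def revert(self, span_id: str) -> None:
        # Restoring an untouched span is a no-op.
        if span_id in self.originals:
            self.spans[span_id] = self.originals.pop(span_id)

    def is_reformulated(self, span_id: str) -> bool:
        return span_id in self.originals


page = ReversiblePage({"p1": "This outrageous plan will ruin everything."})
page.apply("p1", "Critics argue the plan carries serious risks.")
page.apply("p1", "Critics say the plan is risky.")
page.revert("p1")  # back to the original wording, not the first rewrite
```

Storing the original once per span, rather than a history of rewrites, keeps the toggle simple: whatever the model produced, one revert returns the source text.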

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Widespread adoption might prompt platforms to build similar bias-mitigation features directly into feeds and articles.
Repeated exposure to reformulations could be tested for effects on users' long-term susceptibility to persuasive framing.
Combining VIGIL with existing fact-checking extensions could produce layered tools that address both facts and framing.

Load-bearing premise

Current NLP models and LLMs can accurately identify cognitive bias triggers in diverse real-world text and produce reformulations that reduce bias without introducing new distortions or factual errors.

What would settle it

A controlled evaluation on a diverse set of web pages in which human raters judge that the system either misses most bias triggers or that the reformulated versions frequently alter facts or add new distortions.
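
An evaluation of this kind could be scored roughly as follows. The sketch assumes human raters mark gold trigger spans as character offsets and flag each reformulation as fact-altering or not. The function names, the exact-match span criterion, and the sample numbers are assumptions chosen for illustration, not a protocol from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets of a trigger


def precision_recall_f1(gold: Set[Span],
                        predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match span scoring; partial overlaps count as misses here."""
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def distortion_rate(alters_facts: List[bool]) -> float:
    """Share of reformulations that raters judged to change the facts."""
    return sum(alters_facts) / len(alters_facts) if alters_facts else 0.0


# Made-up rater data: four gold triggers, two predictions, one of them correct.
gold_spans = {(0, 5), (10, 20), (30, 40), (50, 60)}
predicted_spans = {(0, 5), (70, 80)}
precision, recall, f1 = precision_recall_f1(gold_spans, predicted_spans)
```

Low recall on such data would correspond to the system missing most triggers, and a high distortion rate to reformulations that frequently alter facts, which are the two failure conditions named above.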

Figures

Figures reproduced from arXiv: 2604.03261 by Bo Kang, Sander Noels, Tijl De Bie.

**Figure 1.** Vigil architecture. Four extension components communicate via typed messages: the content layer extracts and renders page text, the message router coordinates, the sidepanel exposes the UI, and the inference runtime runs the in-browser LLM. Detection is delegated to browser plugins (in-process) or server plugins (HTTP), both producing Finding objects via a shared contract (dash-dotted line). The dashed l…

**Figure 2.** Vigil analyzing a tweet on Twitter/X. (A) A detected cognitive bias trigger is underlined in-page with a severity-colored highlight. (B) Hovering reveals the bias type (Loaded Language) and an explanation. (C) The sidepanel’s scroll-synced “Currently Viewing” card tracks the post in the user’s viewport. (D) The finding card details the trigger span, bias type, severity, and explanation. (E) Action buttons …
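
The shared contract mentioned in the Figure 1 caption, combined with the finding fields listed in the Figure 2 caption (trigger span, bias type, severity, explanation), might look like the sketch below. All names here (`Finding`, `DetectorPlugin`, `LoadedLanguagePlugin`, `run_plugins`) and the three-word lexicon are hypothetical. A server plugin would satisfy the same interface by wrapping an HTTP call, which is omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class Finding:
    """One detected trigger, with the fields named in the Figure 2 caption."""
    start: int        # character offset where the trigger span begins
    end: int          # offset one past the last character of the span
    bias_type: str
    severity: str
    explanation: str


class DetectorPlugin(Protocol):
    """Assumed plugin contract: text in, list of findings out."""
    name: str

    def detect(self, text: str) -> List[Finding]: ...


class LoadedLanguagePlugin:
    """Toy in-process detector backed by a tiny, made-up word list."""
    name = "loaded-language-demo"
    LEXICON = ("outrageous", "disastrous", "shameful")

    def detect(self, text: str) -> List[Finding]:
        # Assumes lowercasing keeps character offsets, which holds for ASCII.
        lowered = text.lower()
        findings: List[Finding] = []
        for word in self.LEXICON:
            pos = lowered.find(word)
            while pos != -1:
                findings.append(Finding(
                    pos, pos + len(word), "Loaded Language", "medium",
                    f"'{word}' is emotionally charged wording."))
                pos = lowered.find(word, pos + 1)
        return findings


def run_plugins(plugins: Iterable[DetectorPlugin], text: str) -> List[Finding]:
    """Collect findings from every plugin, ordered by position in the text."""
    findings = [f for plugin in plugins for f in plugin.detect(text)]
    return sorted(findings, key=lambda f: f.start)
```

Because every plugin returns the same `Finding` shape, the rendering side can highlight spans without knowing which detector produced them.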

Original abstract

The rise of generative AI is posing increasing risks to online information integrity and civic discourse. Most concretely, such risks can materialise in the form of mis- and disinformation. As a mitigation, media-literacy and transparency tools have been developed to address factuality of information and the reliability and ideological leaning of information sources. However, a subtler but possibly no less harmful threat to civic discourse is the use of persuasion or manipulation by exploiting human cognitive biases and related cognitive limitations. To the best of our knowledge, no tools exist to directly detect and mitigate the presence of triggers of such cognitive biases in online information. We present VIGIL (VIrtual GuardIan angeL), the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. VIGIL is built to be extensible with third-party plugins, with several plugins that are rigorously validated against NLP benchmarks already included. It is open-sourced at https://github.com/aida-ugent/vigil.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: a private letter to a colleague

VIGIL is a straightforward browser extension that wires together existing NLP detectors and reversible LLM rephrasing for cognitive bias triggers, but the abstract gives no numbers on how well any of it works.

The main point is that this paper ships a working browser extension called VIGIL that does scroll-synced detection of bias triggers and offers LLM reformulations you can instantly revert, with a choice between local and cloud inference. The code is open-sourced and the system is built to accept third-party plugins, which is a clean engineering move for an applied tool in this space. The architecture for reversibility and privacy tiers follows standard patterns and does not require exotic assumptions to function as a prototype.

What the paper does well is the integration: it takes off-the-shelf components and packages them into something users could actually install and try on real pages. That kind of end-to-end system description is still rare in the bias-mitigation literature.

The soft spot is the missing evaluation. The abstract says the plugins are rigorously validated on NLP benchmarks, yet no accuracy figures, baselines, or error breakdowns are shown. Without those, it is impossible to judge whether the detection is reliable across diverse text or whether the reformulations reduce bias without adding new distortions.

This paper is for people who build or evaluate practical media-literacy tools rather than for theorists. A reader who wants a concrete starting point for extensible bias-detection extensions would get value from the design and the GitHub release. It deserves peer review so the evaluation details can be checked and any gaps addressed.

Referee Report

2 major / 1 minor

Summary. The manuscript presents VIGIL, claimed as the first browser extension for real-time detection and mitigation of cognitive bias triggers in online text. It integrates scroll-synced detection via extensible plugins, LLM-powered reversible reformulation, and privacy-tiered inference (offline to cloud), with several included plugins stated to be rigorously validated against NLP benchmarks; the system is open-sourced.

Significance. If the plugin validations and reformulation accuracy hold, VIGIL could provide a meaningful extension of media-literacy tools by targeting cognitive bias triggers directly rather than only factuality or source reliability. The privacy tiers, reversibility, and extensibility are practical strengths, and the open-source release supports reproducibility and community extension.

major comments (2)

[Abstract] The statement that 'several plugins that are rigorously validated against NLP benchmarks are already included' provides no metrics, baselines, error analysis, or comparison to existing NLP methods for bias trigger detection, which is load-bearing for the claim of effective real-time mitigation.
[§3] The manuscript does not detail how specific cognitive bias triggers are operationalized in the plugins or how LLM reformulations are constrained to avoid introducing new factual errors or distortions, leaving the performance of the core mitigation step unverified.

minor comments (1)

[Abstract] The acronym expansion 'VIrtual GuardIan angeL' contains inconsistent capitalization that should be standardized.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on our manuscript. We believe these suggestions will improve the clarity and rigor of our presentation of VIGIL. Below, we provide point-by-point responses to the major comments.

Referee: [Abstract] The statement that 'several plugins that are rigorously validated against NLP benchmarks are already included' provides no metrics, baselines, error analysis, or comparison to existing NLP methods for bias trigger detection, which is load-bearing for the claim of effective real-time mitigation.

Authors: We acknowledge that the abstract does not include specific quantitative metrics, which would strengthen the claim. In the revised version, we will incorporate key results from our plugin validations, including F1-scores, baseline comparisons (e.g., against standard bias detection models like those based on BERT), and a brief error analysis summary. This will make the abstract self-contained while directing readers to the detailed evaluation in the main text.

Revision: yes

Referee: [§3] The manuscript does not detail how specific cognitive bias triggers are operationalized in the plugins or how LLM reformulations are constrained to avoid introducing new factual errors or distortions, leaving the performance of the core mitigation step unverified.

Authors: We agree that additional details on operationalization and constraints are necessary for verification. We will revise Section 3 to include: (1) explicit definitions and feature sets for each bias trigger (e.g., lexical patterns for the availability heuristic), (2) the plugin architectures and validation against NLP benchmarks with reported metrics, and (3) the specific prompting strategies and post-processing steps used in LLM reformulation to ensure factual fidelity, such as entity preservation and entailment checks. We will also report quantitative results on reformulation accuracy to verify the mitigation performance.

Revision: yes
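
The entity-preservation step mentioned in the rebuttal could, in its crudest form, be a check that numbers and capitalized names from the original survive in the reformulation. The sketch below is a rough stand-in under that assumption: the regex heuristics, the `key_tokens` and `missing_tokens` helpers, and the example sentences are hypothetical, and a real system would more plausibly use a named-entity recognizer plus the entailment check, which is not shown.

```python
import re
from typing import Set

NUMBER = re.compile(r"\d+(?:[.,]\d+)*")
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def key_tokens(text: str) -> Set[str]:
    """Numbers plus capitalized words that are not sentence-initial."""
    tokens = set(NUMBER.findall(text))
    for sentence in SENTENCE_BREAK.split(text):
        # Skip the first word: its capital letter says nothing about names.
        for word in sentence.split()[1:]:
            cleaned = word.strip(".,;:!?\"'()")
            if cleaned[:1].isupper():
                tokens.add(cleaned)
    return tokens


def missing_tokens(original: str, reformulated: str) -> Set[str]:
    """Key tokens of the original that no longer appear as whole tokens."""
    missing = set()
    for token in key_tokens(original):
        pattern = r"(?<!\w)" + re.escape(token) + r"(?!\w)"
        if not re.search(pattern, reformulated):
            missing.add(token)
    return missing


original = ("The disastrous bill, backed by Senator Smith, "
            "will cost 40 million dollars.")
faithful = ("The bill, backed by Senator Smith, "
            "is projected to cost 40 million dollars.")
distorted = "The bill, backed by a senator, will cost 400 million dollars."
```

A reformulation would be accepted only when `missing_tokens` returns an empty set. Here the faithful rewrite passes, while the distorted one is rejected because the name and the figure have changed.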

Circularity Check

0 steps flagged

No significant circularity in derivation chain

This is an engineering systems paper describing the VIGIL browser extension architecture, its extensibility via plugins, scroll-synced detection, reversible LLM reformulation, and privacy tiers. No mathematical derivations, equations, parameter fitting, or self-referential predictions appear in the provided text. The novelty claim (first such tool) and validation statements reference external NLP benchmarks and open-source release rather than reducing to self-citation chains or definitional loops. The system is self-contained as a prototype integration of existing components; no load-bearing step collapses by construction to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work integrates existing NLP models and LLMs without introducing new free parameters, invented entities, or non-standard axioms beyond the domain assumption that bias triggers are detectable by current language models.

axioms (1)

Domain assumption: Pre-trained NLP models can reliably detect cognitive bias triggers in text.
The detection component relies on this capability without new proof or calibration shown in the abstract.

pith-pipeline@v0.9.0 · 5506 in / 1167 out tokens · 37569 ms · 2026-05-15T11:36:26.297294+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "VIGIL, the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
