
arXiv: 2604.03261 · v2 · submitted 2026-03-12 · cs.CL · cs.CY · cs.HC

VIGIL: An Extensible System for Real-Time Detection and Mitigation of Cognitive Bias Triggers

Pith reviewed 2026-05-15 11:36 UTC · model grok-4.3

classification cs.CL · cs.CY · cs.HC
keywords cognitive bias · browser extension · real-time detection · LLM reformulation · media literacy · disinformation mitigation · extensible system · online information

The pith

VIGIL is the first browser extension for real-time detection and mitigation of cognitive bias triggers in online text.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces VIGIL as a browser extension that identifies triggers of cognitive biases in web content while users scroll and supplies reformulated versions meant to reduce their persuasive effect. Existing media tools focus mainly on factual accuracy or source reliability, leaving a gap in addressing subtler manipulation techniques that exploit mental shortcuts. VIGIL combines scroll-synced detection, reversible LLM rephrasing, and options for offline or cloud processing, with support for third-party plugins that extend its capabilities. The authors position the tool as open-source and extensible so that validated detection methods can be added over time. If the approach works as described, readers could encounter flagged content and alternative phrasing without switching applications or losing access to the original text.

Core claim

VIGIL is the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. The system is built to be extensible with third-party plugins, with several plugins that are rigorously validated against NLP benchmarks already included.

What carries the argument

The VIGIL browser extension that performs scroll-synced detection of cognitive bias triggers and supplies reversible LLM reformulations of the affected text.

If this is right

  • Users receive in-page alerts for bias triggers as they scroll and can toggle between original and reformulated text.
  • Detection and reformulation can run locally for privacy or use cloud models when needed.
  • Developers can add new plugins that extend detection methods while preserving the core interface.
  • The extension remains fully reversible so users retain access to the source material at all times.
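The reversibility described in the list above can be illustrated with a small data-structure sketch. This is a hypothetical illustration, not code from the VIGIL repository: the names `ReversibleSpan`, `ReformulationStore`, `apply`, `toggle`, and `revertAll` are all assumptions. The idea is simply that the original text is stored alongside every reformulation, so switching back never requires re-fetching or re-parsing the page.

```typescript
// Hypothetical sketch: none of these names come from the VIGIL codebase.
interface ReversibleSpan {
  original: string;     // text as it appeared on the page
  reformulated: string; // alternative phrasing produced by the LLM
  showing: "original" | "reformulated";
}

class ReformulationStore {
  private spans: Record<string, ReversibleSpan> = {};

  // Record a reformulation without discarding the source text.
  apply(id: string, original: string, reformulated: string): string {
    this.spans[id] = { original, reformulated, showing: "reformulated" };
    return reformulated;
  }

  // Flip between the two versions and return the text to display.
  toggle(id: string): string {
    const span = this.spans[id];
    if (!span) throw new Error(`unknown span: ${id}`);
    span.showing = span.showing === "original" ? "reformulated" : "original";
    return span.showing === "original" ? span.original : span.reformulated;
  }

  // Restore every span to its source text and return the restored strings.
  revertAll(): Record<string, string> {
    const restored: Record<string, string> = {};
    for (const id of Object.keys(this.spans)) {
      this.spans[id].showing = "original";
      restored[id] = this.spans[id].original;
    }
    return restored;
  }
}
```

In a real extension the returned string would be written back into the page by the content layer; the sketch leaves DOM handling out so that the state logic stays testable on its own.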

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Widespread adoption might prompt platforms to build similar bias-mitigation features directly into feeds and articles.
  • Repeated exposure to reformulations could be tested for effects on users' long-term susceptibility to persuasive framing.
  • Combining VIGIL with existing fact-checking extensions could produce layered tools that address both facts and framing.

Load-bearing premise

Current NLP models and LLMs can accurately identify cognitive bias triggers in diverse real-world text and produce reformulations that reduce bias without introducing new distortions or factual errors.

What would settle it

A controlled evaluation on a diverse set of web pages in which human raters judge that the system either misses most bias triggers or that the reformulated versions frequently alter facts or add new distortions.

Figures

Figures reproduced from arXiv: 2604.03261 by Bo Kang, Sander Noels, Tijl De Bie.

Figure 1: Vigil architecture. Four extension components communicate via typed messages: the content layer extracts and renders page text, the message router coordinates, the sidepanel exposes the UI, and the inference runtime runs the in-browser LLM. Detection is delegated to browser plugins (in-process) or server plugins (HTTP), both producing Finding objects via a shared contract (dash-dotted line). The dashed l…
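The Figure 1 caption says that browser and server plugins both produce Finding objects through a shared contract. The sketch below shows one way such a contract could be typed. It is an assumption-laden illustration: the field names (`start`, `end`, `biasType`, `severity`, `explanation`) are guessed from the finding card described in the Figure 2 caption, and `DetectorPlugin`, `findLoadedTerms`, and the word list are invented here. The keyword matcher is a toy stand-in, not one of the validated plugins the paper refers to.

```typescript
// Hypothetical sketch: type and field names are assumptions, not VIGIL's actual contract.
type Severity = "low" | "medium" | "high";

interface Finding {
  start: number;       // offset of the trigger span in the analyzed text
  end: number;         // exclusive end offset
  biasType: string;    // e.g. "Loaded Language"
  severity: Severity;
  explanation: string;
}

// In-process and HTTP-backed detectors could both satisfy this one interface.
interface DetectorPlugin {
  name: string;
  detect(text: string): Promise<Finding[]>;
}

// Toy word list for illustration only; a real plugin would wrap a validated model.
const LOADED_TERMS: string[] = ["shocking", "disastrous", "outrageous"];

// Case-insensitive substring search; offsets are reliable for ASCII input.
function findLoadedTerms(text: string): Finding[] {
  const findings: Finding[] = [];
  const lower = text.toLowerCase();
  for (const term of LOADED_TERMS) {
    let from = 0;
    while (true) {
      const start = lower.indexOf(term, from);
      if (start === -1) break;
      findings.push({
        start,
        end: start + term.length,
        biasType: "Loaded Language",
        severity: "medium",
        explanation: `"${term}" carries emotional weight beyond its literal content.`,
      });
      from = start + term.length;
    }
  }
  // Return findings in reading order.
  return findings.sort((a, b) => a.start - b.start);
}

const keywordPlugin: DetectorPlugin = {
  name: "toy-keyword-detector",
  detect: (text) => Promise.resolve(findLoadedTerms(text)),
};
```

Because `detect` returns a promise, a server plugin could implement the same interface by posting the text to an HTTP endpoint and parsing the response into `Finding[]`, which is presumably why a single contract can cover both plugin kinds.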
Figure 2: Vigil analyzing a tweet on Twitter/X. (A) A detected cognitive bias trigger is underlined in-page with a severity-colored highlight. (B) Hovering reveals the bias type (Loaded Language) and an explanation. (C) The sidepanel’s scroll-synced “Currently Viewing” card tracks the post in the user’s viewport. (D) The finding card details the trigger span, bias type, severity, and explanation. (E) Action buttons …
Original abstract

The rise of generative AI is posing increasing risks to online information integrity and civic discourse. Most concretely, such risks can materialise in the form of mis- and disinformation. As a mitigation, media-literacy and transparency tools have been developed to address factuality of information and the reliability and ideological leaning of information sources. However, a subtler but possibly no less harmful threat to civic discourse is the use of persuasion or manipulation by exploiting human cognitive biases and related cognitive limitations. To the best of our knowledge, no tools exist to directly detect and mitigate the presence of triggers of such cognitive biases in online information. We present VIGIL (VIrtual GuardIan angeL), the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. VIGIL is built to be extensible with third-party plugins, with several plugins that are rigorously validated against NLP benchmarks are already included. It is open-sourced at https://github.com/aida-ugent/vigil.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents VIGIL, claimed as the first browser extension for real-time detection and mitigation of cognitive bias triggers in online text. It integrates scroll-synced detection via extensible plugins, LLM-powered reversible reformulation, and privacy-tiered inference (offline to cloud), with several included plugins stated to be rigorously validated against NLP benchmarks; the system is open-sourced.

Significance. If the plugin validations and reformulation accuracy hold, VIGIL could provide a meaningful extension of media-literacy tools by targeting cognitive bias triggers directly rather than only factuality or source reliability. The privacy tiers, reversibility, and extensibility are practical strengths, and the open-source release supports reproducibility and community extension.

major comments (2)
  1. [Abstract] The statement that 'several plugins that are rigorously validated against NLP benchmarks are already included' provides no metrics, baselines, error analysis, or comparison to existing NLP methods for bias trigger detection, which is load-bearing for the claim of effective real-time mitigation.
  2. [§3] The manuscript does not detail how specific cognitive bias triggers are operationalized in the plugins or how LLM reformulations are constrained to avoid introducing new factual errors or distortions, leaving the performance of the core mitigation step unverified.
minor comments (1)
  1. [Abstract] The acronym expansion 'VIrtual GuardIan angeL' contains inconsistent capitalization that should be standardized.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on our manuscript. We believe these suggestions will improve the clarity and rigor of our presentation of VIGIL. Below, we provide point-by-point responses to the major comments.

  1. Referee: [Abstract] The statement that 'several plugins that are rigorously validated against NLP benchmarks are already included' provides no metrics, baselines, error analysis, or comparison to existing NLP methods for bias trigger detection, which is load-bearing for the claim of effective real-time mitigation.

    Authors: We acknowledge that the abstract does not include specific quantitative metrics, which would strengthen the claim. In the revised version, we will incorporate key results from our plugin validations, including F1-scores, baseline comparisons (e.g., against standard bias detection models like those based on BERT), and a brief error analysis summary. This will make the abstract self-contained while directing readers to the detailed evaluation in the main text. revision: yes

  2. Referee: [§3] The manuscript does not detail how specific cognitive bias triggers are operationalized in the plugins or how LLM reformulations are constrained to avoid introducing new factual errors or distortions, leaving the performance of the core mitigation step unverified.

    Authors: We agree that additional details on operationalization and constraints are necessary for verification. We will revise Section 3 to include: (1) explicit definitions and feature sets for each bias trigger (e.g., lexical patterns for availability heuristic), (2) the plugin architectures and validation against NLP benchmarks with reported metrics, and (3) the specific prompting strategies and post-processing steps used in LLM reformulation to ensure factual fidelity, such as entity preservation and entailment checks. We will also report quantitative results on reformulation accuracy to verify the mitigation performance. revision: yes
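To make the "entity preservation" idea in the response above concrete, here is a minimal sketch of what such a post-processing check could look like. It is not the authors' method: the functions `extractAnchors` and `missingAnchors` are hypothetical, and the regular expression is a crude stand-in for a proper named-entity recognizer. It treats numbers and capitalized words that are not sentence-initial as "anchors" and reports any anchor from the original that does not appear in the reformulation.

```typescript
// Hypothetical heuristic: a rough stand-in for real named-entity recognition.
function extractAnchors(text: string): string[] {
  const anchors: string[] = [];
  // Numbers (optionally with separators or a percent sign) or capitalized words.
  const pattern = /\d+(?:[.,]\d+)*%?|[A-Z][A-Za-z]+/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const token = match[0];
    const before = text.slice(0, match.index).replace(/\s+$/, "");
    // Capitalized words that open a sentence are usually not entities.
    const sentenceInitial = before === "" || /[.!?]$/.test(before);
    const isNumber = /^\d/.test(token);
    if ((isNumber || !sentenceInitial) && anchors.indexOf(token) === -1) {
      anchors.push(token);
    }
  }
  return anchors;
}

// Anchors from the original that the reformulation no longer contains.
// Plain substring matching: simple, but "40%" would also match inside "140%".
function missingAnchors(original: string, reformulated: string): string[] {
  return extractAnchors(original).filter((a) => reformulated.indexOf(a) === -1);
}
```

A reformulation with a non-empty `missingAnchors` result could be rejected or regenerated. A check like this only catches dropped names and figures; it says nothing about changed meaning, which is where the entailment checks mentioned in the response would have to come in.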

Circularity Check

0 steps flagged

No significant circularity in derivation chain


This is an engineering systems paper describing the VIGIL browser extension architecture, its extensibility via plugins, scroll-synced detection, reversible LLM reformulation, and privacy tiers. No mathematical derivations, equations, parameter fitting, or self-referential predictions appear in the provided text. The novelty claim (first such tool) and validation statements reference external NLP benchmarks and open-source release rather than reducing to self-citation chains or definitional loops. The system is self-contained as a prototype integration of existing components; no load-bearing step collapses by construction to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work integrates existing NLP models and LLMs without introducing new free parameters, invented entities, or non-standard axioms beyond the domain assumption that bias triggers are detectable by current language models.

axioms (1)
  • domain assumption: Pre-trained NLP models can reliably detect cognitive bias triggers in text.
    The detection component relies on this capability without new proof or calibration shown in the abstract.

pith-pipeline@v0.9.0 · 5506 in / 1167 out tokens · 37569 ms · 2026-05-15T11:36:26.297294+00:00 · methodology


