VIGIL: An Extensible System for Real-Time Detection and Mitigation of Cognitive Bias Triggers
Pith reviewed 2026-05-15 11:36 UTC · model grok-4.3
The pith
VIGIL is the first browser extension for real-time detection and mitigation of cognitive bias triggers in online text.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
VIGIL is the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. The system is built to be extensible with third-party plugins, with several plugins that are rigorously validated against NLP benchmarks already included.
What carries the argument
The VIGIL browser extension that performs scroll-synced detection of cognitive bias triggers and supplies reversible LLM reformulations of the affected text.
If this is right
- Users receive in-page alerts for bias triggers as they scroll and can toggle between original and reformulated text.
- Detection and reformulation can run locally for privacy or use cloud models when needed.
- Developers can add new plugins that extend detection methods while preserving the core interface.
- The extension remains fully reversible so users retain access to the source material at all times.
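The plugin and reversibility points above can be made concrete with a small model. This is a minimal sketch in Python rather than browser code, and every name in it (`Trigger`, `DetectorPlugin`, `KeywordPlugin`, `ReversibleBlock`) is hypothetical: the source does not describe VIGIL's actual plugin interface. The keyword matcher is a toy stand-in for a real detection model, and the lambda passed to `apply` stands in for an LLM call.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol


@dataclass(frozen=True)
class Trigger:
    """A flagged span: character offsets into the text plus a label."""
    start: int
    end: int
    label: str


class DetectorPlugin(Protocol):
    """Assumed plugin contract: a name and a detect() method."""
    name: str

    def detect(self, text: str) -> List[Trigger]: ...


class KeywordPlugin:
    """Toy detector that flags fixed phrases, ignoring case."""

    def __init__(self, name: str, label: str, phrases: List[str]) -> None:
        self.name = name
        self.label = label
        self.phrases = phrases

    def detect(self, text: str) -> List[Trigger]:
        found = [
            Trigger(m.start(), m.end(), self.label)
            for phrase in self.phrases
            for m in re.finditer(re.escape(phrase), text, re.IGNORECASE)
        ]
        return sorted(found, key=lambda t: t.start)


@dataclass
class ReversibleBlock:
    """Never overwrites the original, so a reformulation can always be undone."""
    original: str
    reformulated: Optional[str] = None
    showing_reformulated: bool = False

    def apply(self, rewrite: Callable[[str], str]) -> None:
        # Store the rewrite next to the original and show it.
        self.reformulated = rewrite(self.original)
        self.showing_reformulated = True

    def toggle(self) -> None:
        # Switching views is only meaningful once a rewrite exists.
        if self.reformulated is not None:
            self.showing_reformulated = not self.showing_reformulated

    @property
    def visible(self) -> str:
        if self.showing_reformulated and self.reformulated is not None:
            return self.reformulated
        return self.original


block = ReversibleBlock("Act now: only 3 left in stock!")
plugin = KeywordPlugin("scarcity-demo", "scarcity", ["only 3 left", "act now"])
triggers = plugin.detect(block.original)
block.apply(lambda text: "3 units are in stock.")  # stand-in for an LLM call
block.toggle()  # back to the source text
```

Keeping the original string as a field that no method mutates is one simple way to guarantee reversibility: the toggle only changes which of the two stored strings is shown.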
Where Pith is reading between the lines
- Widespread adoption might prompt platforms to build similar bias-mitigation features directly into feeds and articles.
- Repeated exposure to reformulations could be tested for effects on users' long-term susceptibility to persuasive framing.
- Combining VIGIL with existing fact-checking extensions could produce layered tools that address both facts and framing.
Load-bearing premise
Current NLP models and LLMs can accurately identify cognitive bias triggers in diverse real-world text and produce reformulations that reduce bias without introducing new distortions or factual errors.
What would settle it
A controlled evaluation on a diverse set of web pages in which human raters judge that the system either misses most bias triggers or that the reformulated versions frequently alter facts or add new distortions.
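The two failure conditions in this test can be expressed as simple rates over human ratings. The sketch below is illustrative only: the `PageRating` schema and the example numbers are assumptions, not data from the paper, and what counts as "most" or "frequently" is a threshold the evaluators would need to fix in advance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageRating:
    """Human judgments for one web page (hypothetical schema)."""
    triggers_annotated: int  # triggers the raters found on the page
    triggers_detected: int   # of those, how many the system also flagged
    reformulations: int      # reformulations the raters inspected
    distorted: int           # of those, judged to alter facts or add distortion


def miss_rate(ratings: List[PageRating]) -> float:
    """Fraction of rater-annotated triggers that the system failed to flag."""
    total = sum(r.triggers_annotated for r in ratings)
    missed = sum(r.triggers_annotated - r.triggers_detected for r in ratings)
    return missed / total if total else 0.0


def distortion_rate(ratings: List[PageRating]) -> float:
    """Fraction of inspected reformulations judged to distort the source."""
    total = sum(r.reformulations for r in ratings)
    bad = sum(r.distorted for r in ratings)
    return bad / total if total else 0.0


# Made-up ratings for two pages, purely to show the calculation.
ratings = [PageRating(10, 8, 5, 1), PageRating(10, 6, 5, 0)]
print(miss_rate(ratings), distortion_rate(ratings))
```

Pooling counts across pages, as done here, weights pages by how many triggers they contain; averaging per-page rates would be a different and equally defensible choice.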
Original abstract
The rise of generative AI is posing increasing risks to online information integrity and civic discourse. Most concretely, such risks can materialise in the form of mis- and disinformation. As a mitigation, media-literacy and transparency tools have been developed to address factuality of information and the reliability and ideological leaning of information sources. However, a subtler but possibly no less harmful threat to civic discourse is the use of persuasion or manipulation by exploiting human cognitive biases and related cognitive limitations. To the best of our knowledge, no tools exist to directly detect and mitigate the presence of triggers of such cognitive biases in online information. We present VIGIL (VIrtual GuardIan angeL), the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference from fully offline to cloud. VIGIL is built to be extensible with third-party plugins; several plugins that are rigorously validated against NLP benchmarks are already included. It is open-sourced at https://github.com/aida-ugent/vigil.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents VIGIL, claimed as the first browser extension for real-time detection and mitigation of cognitive bias triggers in online text. It integrates scroll-synced detection via extensible plugins, LLM-powered reversible reformulation, and privacy-tiered inference (offline to cloud), with several included plugins stated to be rigorously validated against NLP benchmarks; the system is open-sourced.
Significance. If the plugin validations and reformulation accuracy hold, VIGIL could provide a meaningful extension of media-literacy tools by targeting cognitive bias triggers directly rather than only factuality or source reliability. The privacy tiers, reversibility, and extensibility are practical strengths, and the open-source release supports reproducibility and community extension.
major comments (2)
- [Abstract] Abstract: the statement that 'several plugins that are rigorously validated against NLP benchmarks are already included' provides no metrics, baselines, error analysis, or comparison to existing NLP methods for bias trigger detection, which is load-bearing for the claim of effective real-time mitigation.
- [§3] The manuscript does not detail how specific cognitive bias triggers are operationalized in the plugins or how LLM reformulations are constrained to avoid introducing new factual errors or distortions, leaving the performance of the core mitigation step unverified.
minor comments (1)
- [Abstract] The acronym expansion 'VIrtual GuardIan angeL' contains inconsistent capitalization that should be standardized.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments on our manuscript. We believe these suggestions will improve the clarity and rigor of our presentation of VIGIL. Below, we provide point-by-point responses to the major comments.
- Referee: [Abstract] Abstract: the statement that 'several plugins that are rigorously validated against NLP benchmarks are already included' provides no metrics, baselines, error analysis, or comparison to existing NLP methods for bias trigger detection, which is load-bearing for the claim of effective real-time mitigation.
  Authors: We acknowledge that the abstract does not include specific quantitative metrics, which would strengthen the claim. In the revised version, we will incorporate key results from our plugin validations, including F1-scores, baseline comparisons (e.g., against standard bias detection models like those based on BERT), and a brief error analysis summary. This will make the abstract self-contained while directing readers to the detailed evaluation in the main text. Revision: yes.
- Referee: [§3] The manuscript does not detail how specific cognitive bias triggers are operationalized in the plugins or how LLM reformulations are constrained to avoid introducing new factual errors or distortions, leaving the performance of the core mitigation step unverified.
  Authors: We agree that additional details on operationalization and constraints are necessary for verification. We will revise Section 3 to include: (1) explicit definitions and feature sets for each bias trigger (e.g., lexical patterns for availability heuristic), (2) the plugin architectures and validation against NLP benchmarks with reported metrics, and (3) the specific prompting strategies and post-processing steps used in LLM reformulation to ensure factual fidelity, such as entity preservation and entailment checks. We will also report quantitative results on reformulation accuracy to verify the mitigation performance. Revision: yes.
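To illustrate what an "entity preservation" post-processing step could look like, here is a minimal sketch. It is an assumption about the general idea, not the authors' method: numbers and non-sentence-initial capitalized words are used as a crude regex stand-in for a named-entity recognizer, and the function names are hypothetical. Entailment checking, which would need an NLI model, is not covered.

```python
import re
from typing import Set

# Numbers (with optional decimals or percent sign) or plain words.
TOKEN = re.compile(r"\d+(?:[.,]\d+)*%?|[A-Za-z][A-Za-z'-]*")


def key_facts(text: str) -> Set[str]:
    """Collect numbers and capitalized words that do not start a sentence.

    Crude stand-in for named-entity recognition (an assumption for this sketch).
    """
    facts: Set[str] = set()
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for i, tok in enumerate(TOKEN.findall(sentence)):
            if tok[0].isdigit() or (i > 0 and tok[0].isupper()):
                facts.add(tok)
    return facts


def missing_facts(original: str, reformulated: str) -> Set[str]:
    """Return key facts from the original that the reformulation dropped."""
    present = set(TOKEN.findall(reformulated))
    return {fact for fact in key_facts(original) if fact not in present}


original = "Shocking! Experts warn that Acme lost 40% of users in 2024."
faithful = "Acme lost 40% of its users in 2024."
lossy = "Acme lost many users recently."

print(missing_facts(original, faithful))
print(missing_facts(original, lossy))
```

A reformulation pipeline could reject or retry any rewrite for which `missing_facts` is non-empty. The check only catches dropped tokens; it would not notice a number attached to the wrong entity, which is where an entailment check would come in.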
Circularity Check
No significant circularity in derivation chain
This is an engineering systems paper describing the VIGIL browser extension architecture, its extensibility via plugins, scroll-synced detection, reversible LLM reformulation, and privacy tiers. No mathematical derivations, equations, parameter fitting, or self-referential predictions appear in the provided text. The novelty claim (first such tool) and validation statements reference external NLP benchmarks and open-source release rather than reducing to self-citation chains or definitional loops. The system is self-contained as a prototype integration of existing components; no load-bearing step collapses by construction to its inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Pre-trained NLP models can reliably detect cognitive bias triggers in text.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "VIGIL, the first browser extension for real-time cognitive bias trigger detection and mitigation, providing in-situ scroll-synced detection, LLM-powered reformulation with full reversibility, and privacy-tiered inference"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.