reconCTI: A Proactive Approach to Cyber-Threat Intelligence

Ameer Al-Nemrat; Mohammed Mahir Rahman; Shahzad Memon; Tauseef Ahmed

arxiv: 2605.19899 · v1 · pith:4SXPCKYEnew · submitted 2026-05-19 · 💻 cs.CR

Pith reviewed 2026-05-20 04:00 UTC · model grok-4.3

classification 💻 cs.CR

keywords: cyber threat intelligence, OSINT, dark web, MITRE ATT&CK, data leaks, reconnaissance, proactive defense, threat reporting

The pith

A Python tool called reconCTI lets users keyword-scan surface and dark web sites for sensitive data leaks and map results to MITRE ATT&CK for threat reports with mitigation steps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces reconCTI to help defend against reconnaissance by threat actors who gather open-source intelligence on targets. The tool accepts specific keywords from the user, performs scans across multiple sites on both the regular web and the dark web, then evaluates the collected information against the MITRE ATT&CK framework. Results are turned into a single threat report that lists possible mitigation strategies. This setup is meant to let cybersecurity professionals and ordinary users spot risks early and respond before data is exploited.

Core claim

The authors introduce reconCTI, a command-line tool built in Python for Linux systems that searches for sensitive data leaks across surface web and dark web platforms, accepts user keywords for multi-site scans, assesses findings by referencing the MITRE ATT&CK framework, and compiles the results into a threat report that includes possible mitigation strategies.
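
The keyword-driven scan stage described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Hit` record, the `scan_pages` function, and the sample pages are all hypothetical, and the page text is assumed to have been fetched already so that the matching logic stays separate from network access.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    """One keyword occurrence on one site (hypothetical record layout)."""
    site: str
    keyword: str
    context: str


def scan_pages(pages: dict, keywords: list, window: int = 30) -> list:
    """Return one Hit per case-insensitive keyword occurrence in each page."""
    hits = []
    for site, text in pages.items():
        for keyword in keywords:
            # re.escape treats the keyword as literal text, not as a pattern.
            for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
                start = max(0, match.start() - window)
                end = min(len(text), match.end() + window)
                hits.append(Hit(site, keyword, text[start:end]))
    return hits


# Made-up sample input: site URL -> already-fetched page text.
pages = {
    "http://example.com/paste": "dump: alice@example.com:hunter2",
    "http://example.org/forum": "nothing to see here",
}
hits = scan_pages(pages, ["alice@example.com"])
for hit in hits:
    print(hit.site, "->", hit.context)
```

Keeping a short context window around each match is one way to let a later analysis step decide what kind of data was exposed; whether reconCTI does this is not stated in the text.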

What carries the argument

The reconCTI command-line tool that runs keyword-driven multi-site scans on surface and dark web platforms then maps detections to MITRE ATT&CK entries for report generation.

Load-bearing premise

That a keyword-driven scan can reliably locate and correctly interpret sensitive data leaks on the dark web while remaining technically feasible, legally permissible, and accurate enough to produce useful MITRE ATT&CK mappings and mitigation advice.

What would settle it

Running reconCTI with known leaked data on accessible dark web sites and observing that the tool either misses the leaks, produces no report, or generates incorrect MITRE ATT&CK mappings and mitigations.
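
A test of this kind could be scored as follows. The sketch assumes the planted leaks and the tool's reported findings can each be reduced to a set of comparable strings; the function name `detection_metrics` and the sample values are hypothetical and do not come from the paper.

```python
def detection_metrics(planted: set, reported: set) -> dict:
    """Compare reported findings against a known set of planted leaks."""
    true_positives = len(planted & reported)
    false_positives = len(reported - planted)
    false_negatives = len(planted - reported)
    found_any = true_positives + false_positives
    planted_any = true_positives + false_negatives
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        # Precision: share of reported items that were real planted leaks.
        "precision": true_positives / found_any if found_any else 0.0,
        # Recall: share of planted leaks that the tool reported.
        "recall": true_positives / planted_any if planted_any else 0.0,
    }


# Made-up example: four planted items, three found, one spurious report.
planted = {"leak-a", "leak-b", "leak-c", "leak-d"}
reported = {"leak-a", "leak-b", "leak-c", "unrelated-x"}
metrics = detection_metrics(planted, reported)
print(metrics)
```

A missed leak lowers recall and a spurious report lowers precision, so the two failure modes named above stay distinguishable in the result.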

Figures

Figures reproduced from arXiv: 2605.19899 by Ameer Al-Nemrat, Mohammed Mahir Rahman, Shahzad Memon, Tauseef Ahmed.

**Figure 4.** Threat report pages 1 (left) and 2 (right). This result demonstrates the capability of the reconCTI tool to successfully navigate through onion links and identify potential data leaks. The file `sc_result-2.json` illustrated in …

**Figure 2.** Input for Scenario 1.

**Figure 3.** Scraping session complete. Following this step, the user is asked if they wish to initiate an analysis of the scraped results. The latest `sc_result-n.json` file is then parsed by `threat_analysis.py`, which identifies and maps potential threats based on both local CVE mappings and MITRE ATT&CK framework references. If any threats are identified, a PDF report is automatically generated and displayed to the …

**Figure 6.** Webpage where the data was found. B. Scenario 2: The second scenario is based on surface web databases. A user’s email address was scraped from a website using commando mode …

**Figure 9.** Input snippet for Scenario 3. After the scraping was completed, the result file was saved separately for further analysis. A snippet of the scraped results file is depicted in …

**Figure 7.** Commando mode input method. The code is also designed to handle incorrect input, as shown in the snippet ( …

**Figure 8.** Threat report for Scenario 2. C. Scenario 3: As part of this testing phase, a generic keyword was searched as shown in Table IV, with the aim of independently locating potentially valuable intelligence. TABLE IV. SEARCH SPECIFICATIONS FOR SCENARIO 3 Data Type/s Value/s And/Or Website/s Name/ Text Onion Links that share data leaks for FREE And/Or http://ruc4i7xn5qu5u c7fu2sc34r6xl55xhgv xbcs56t4ayvbqo2fmp 4peh…

**Figure 12.** Further investigation on the links found.

**Figure 13.** Files with leaked data and password. This test demonstrates the strong capability of reconCTI to facilitate security research by automating the process of scanning for leaked information across darknet links. D. Performance Evaluation: In controlled tests using known leaks, reconCTI successfully identified all flagged data points, demonstrating strong detection capability. However, as the threat landscape e…

**Figure 14.** Detection rates. Overall detection rates are shown in …
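
The Figure 3 excerpt says the latest `sc_result-n.json` file is handed to `threat_analysis.py`. How "latest" is determined is not stated; the sketch below assumes it means the file with the largest `n`. The function name `latest_result` and the `{"run": n}` file contents are hypothetical and used only to make the example self-checking.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Optional

# Matches names such as sc_result-2.json and captures the number.
RESULT_NAME = re.compile(r"sc_result-(\d+)\.json")


def latest_result(directory: Path) -> Optional[Path]:
    """Return the sc_result-n.json file with the largest n, or None if absent."""
    numbered = []
    for path in directory.iterdir():
        match = RESULT_NAME.fullmatch(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    if not numbered:
        return None
    # Compare numerically so that 10 ranks above 2 (string order would not).
    return max(numbered, key=lambda item: item[0])[1]


# Demonstration in a throwaway directory with made-up result files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for n in (1, 2, 10):
        (folder / f"sc_result-{n}.json").write_text(json.dumps({"run": n}))
    (folder / "notes.txt").write_text("ignored")
    newest = latest_result(folder)
    newest_name = newest.name
    newest_run = json.loads(newest.read_text())["run"]

print(newest_name, newest_run)
```

Sorting on the parsed integer rather than the file name avoids picking `sc_result-2.json` over `sc_result-10.json`; modification time would be another reasonable definition of "latest".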

Original abstract

The rapid advancement of information technology has introduced a noticeable shift from traditional offline practices to more efficient and interconnected online environments. This transition, while offering convenience, has also increased exposure to various cyber threats such as identity theft, impersonation, and phishing scams. Reconnaissance, or briefly known as information gathering, is a key stage for threat actors, often relying on open-source intelligence (OSINT) to collect sensitive and extensive data on targets. In response to this challenge, this study introduces reconCTI, a command-line tool built using Python for Linux systems. The tool is designed to search for sensitive data leaks across both surface web and dark web platforms. It allows users to input specific keywords, scan multiple sites at once, and then assess the findings by referencing the MITRE ATT&CK framework. The results are compiled into a threat report that also includes possible mitigation strategies. reconCTI is intended to support both cybersecurity professionals and individuals in identifying risks early and taking appropriate action.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

reconCTI describes a Python CLI for keyword scanning of leaks on surface and dark web with MITRE mapping, but supplies no results, metrics, or validation.

The core point of this paper is that it introduces reconCTI, a command-line Python tool meant to let users run keyword searches across surface and dark web sites for sensitive data leaks, map the hits to MITRE ATT&CK tactics, and output a report with suggested mitigations. The work stays at the level of describing the intended workflow and does not claim any new scanning method or theoretical advance.

What it does reasonably well is package several standard OSINT steps into one Linux CLI that handles multiple sites at once and produces a single report. That convenience could matter for someone who wants a quick, local script rather than juggling separate services or browsers.

The main limitation is the total lack of evidence that the tool actually works as described. There are no scan examples, no precision or recall numbers, no case studies on real or planted leaks, and no discussion of practical issues such as dark-web access reliability, false positives, or legal constraints. Without those details, the central claim that the tool supports early risk identification remains untested.

This is the sort of paper that might interest a practitioner who is looking for an open-source starting point to experiment with threat intel gathering. A reader who already knows the basic OSINT and MITRE concepts could pull the code and try it out, but anyone expecting an evaluated contribution or reproducible findings will find little to use. I would not send it to peer review in this form. Adding even modest test results, sample outputs, and a comparison to existing tools would make it worth a referee's time; right now the absence of data makes the utility hard to assess.

Referee Report

3 major / 2 minor

Summary. The paper introduces reconCTI, a Python-based command-line tool for Linux systems that performs keyword-driven searches for sensitive data leaks across surface web and dark web platforms, scans multiple sites simultaneously, maps findings to the MITRE ATT&CK framework, and generates threat reports that include mitigation strategies to enable proactive cyber-threat intelligence for professionals and individuals.

Significance. If the described functionality were demonstrated to work reliably, the tool could offer a practical contribution to open-source intelligence (OSINT) workflows in cybersecurity by combining multi-platform scanning with standardized attack mapping and actionable reporting. This addresses the reconnaissance phase of threats such as identity theft and phishing. However, the current lack of any supporting evidence substantially reduces the assessed significance.

major comments (3)

Abstract: the claim that reconCTI 'supports both cybersecurity professionals and individuals in identifying risks early and taking appropriate action' is unsupported, as the manuscript supplies only a high-level description of intended workflow with no validation data, test results, error analysis, precision/recall metrics, or case studies on real or synthetic leaks.
Abstract and full manuscript: no implementation details, source code, scan outputs, or assessment of dark-web access feasibility are provided, leaving the central claim that keyword-driven scans can reliably locate and correctly interpret sensitive data leaks unevaluable.
Abstract: the assumption that results can be accurately mapped to MITRE ATT&CK and paired with useful mitigation strategies is presented without any discussion of mapping accuracy, false-positive handling, or legal/technical constraints of dark-web scanning, which is load-bearing for the tool's claimed utility.

minor comments (2)

Abstract: the phrase 'reconnaissance, or briefly known as information gathering' could be clarified with a standard reference to OSINT literature for improved precision.
Abstract: consider adding a brief note on the specific mechanisms or libraries intended for surface-web versus dark-web access to aid reproducibility.
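
On the last minor comment, one common arrangement routes `.onion` requests through a local Tor SOCKS proxy and fetches surface-web URLs directly. The sketch below assumes that arrangement; it is not confirmed as reconCTI's mechanism. The proxy address is Tor's default SOCKS port, while `proxies_for`, `fetch_with_retry`, the injected `fetch` callable, and the `.onion` address are hypothetical. The fetcher is passed in so the retry logic can be exercised without any network access.

```python
import time
from typing import Callable, Optional
from urllib.parse import urlparse

# Assumption: a local Tor client listening on its default SOCKS port.
# "socks5h" makes the proxy resolve hostnames, which .onion addresses need.
TOR_PROXIES = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}


def proxies_for(url: str) -> Optional[dict]:
    """Route .onion hosts through Tor; fetch surface-web URLs directly."""
    host = urlparse(url).hostname or ""
    return TOR_PROXIES if host.endswith(".onion") else None


def fetch_with_retry(
    fetch: Callable[[str, Optional[dict]], str],
    url: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url, proxies), retrying on OSError with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[OSError] = None
    for attempt in range(attempts):
        try:
            return fetch(url, proxies_for(url))
        except OSError as exc:  # covers ConnectionError and TimeoutError
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Demonstration with a fake fetcher that fails twice, then succeeds.
calls = []
delays = []


def flaky_fetch(url: str, proxies: Optional[dict]) -> str:
    calls.append(proxies)
    if len(calls) < 3:
        raise ConnectionError("circuit not ready")
    return "<html>ok</html>"


body = fetch_with_retry(
    flaky_fetch, "http://exampleonionaddress.onion/", sleep=delays.append
)
print(body, delays)
```

With the `requests` library, the injected fetcher could be a thin wrapper around `requests.get(url, proxies=proxies, timeout=...)`; SOCKS support there requires the optional `requests[socks]` extra.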

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript describing reconCTI. The comments correctly identify that the current version is primarily a high-level description of the tool's intended functionality without accompanying empirical validation or implementation specifics. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation.

Referee: Abstract: the claim that reconCTI 'supports both cybersecurity professionals and individuals in identifying risks early and taking appropriate action' is unsupported, as the manuscript supplies only a high-level description of intended workflow with no validation data, test results, error analysis, precision/recall metrics, or case studies on real or synthetic leaks.

Authors: We agree that the abstract claim regarding support for professionals and individuals is not backed by empirical evidence in the current manuscript. The text describes the designed workflow rather than demonstrated outcomes. In revision we will modify the abstract to present the claim as the tool's intended purpose and add a dedicated evaluation section that includes preliminary test cases, example outputs, and planned metrics such as precision for leak detection. revision: yes
Referee: Abstract and full manuscript: no implementation details, source code, scan outputs, or assessment of dark-web access feasibility are provided, leaving the central claim that keyword-driven scans can reliably locate and correctly interpret sensitive data leaks unevaluable.

Authors: The manuscript indeed focuses on conceptual design and does not include code, sample outputs, or feasibility analysis. We will expand the methods section with pseudocode for the multi-platform scanning logic, anonymized example scan results, and a new subsection assessing dark-web access via Tor, including technical challenges such as connectivity reliability and rate limiting. revision: yes
Referee: Abstract: the assumption that results can be accurately mapped to MITRE ATT&CK and paired with useful mitigation strategies is presented without any discussion of mapping accuracy, false-positive handling, or legal/technical constraints of dark-web scanning, which is load-bearing for the tool's claimed utility.

Authors: We acknowledge the absence of discussion on mapping accuracy, false-positive mitigation, and constraints. The revision will add a section describing the heuristic mapping approach to MITRE ATT&CK tactics, explicit handling of potential false positives through user review, and coverage of legal/ethical considerations and technical limitations of dark-web queries to provide a balanced assessment of utility. revision: yes
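
The heuristic mapping with user review promised in the last response might look roughly like the sketch below. The technique and mitigation IDs are real MITRE ATT&CK identifiers, but which data type maps to which entry is an illustrative assumption, as are the names `HEURISTIC_MAP`, `Finding`, `map_finding`, and `apply_review`.

```python
from dataclasses import dataclass, field

# Illustrative pairing only; the paper does not publish its mapping table.
HEURISTIC_MAP = {
    "email": {
        # T1589.002 Gather Victim Identity Information: Email Addresses
        # T1566 Phishing
        "techniques": ["T1589.002", "T1566"],
        # M1056 Pre-compromise, M1017 User Training
        "mitigations": ["M1056", "M1017"],
    },
    "credential": {
        # T1589.001 Gather Victim Identity Information: Credentials
        # T1078 Valid Accounts
        "techniques": ["T1589.001", "T1078"],
        # M1032 Multi-factor Authentication, M1027 Password Policies
        "mitigations": ["M1032", "M1027"],
    },
}


@dataclass
class Finding:
    data_type: str
    value: str
    techniques: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)
    needs_review: bool = True  # every finding starts unconfirmed


def map_finding(data_type: str, value: str) -> Finding:
    """Attach ATT&CK references when the data type is known."""
    entry = HEURISTIC_MAP.get(data_type)
    if entry is None:
        return Finding(data_type, value)  # unmapped, left for the analyst
    return Finding(
        data_type, value, list(entry["techniques"]), list(entry["mitigations"])
    )


def apply_review(findings: list, accepted_values: set) -> list:
    """Keep only findings the analyst confirmed and clear their review flag."""
    confirmed = []
    for finding in findings:
        if finding.value in accepted_values:
            finding.needs_review = False
            confirmed.append(finding)
    return confirmed


# Made-up findings; the analyst confirms only the first one.
findings = [
    map_finding("email", "alice@example.com"),
    map_finding("credential", "alice:hunter2"),
    map_finding("phone", "555-0100"),
]
confirmed = apply_review(findings, {"alice@example.com"})
print([f.value for f in confirmed])
```

Defaulting `needs_review` to true means nothing reaches a report as confirmed until a person has accepted it, which is one simple way to keep false positives out of the mitigation advice.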

Circularity Check

0 steps flagged

No circularity: tool-description paper with no derivations or self-referential claims

The manuscript introduces reconCTI as a Python command-line tool for keyword-driven scanning of surface and dark web for data leaks, followed by MITRE ATT&CK mapping and report generation with mitigations. No equations, fitted parameters, predictions, uniqueness theorems, or ansatzes appear anywhere in the text. The central claim is a high-level description of intended software workflow rather than a derived analytical result; therefore no load-bearing step reduces to its own inputs by construction. The paper is self-contained as a tool proposal and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software tool description rather than a theoretical or empirical scientific paper. No free parameters are fitted to data, no domain axioms beyond standard programming assumptions are invoked, and no new physical or conceptual entities are postulated.

pith-pipeline@v0.9.0 · 5707 in / 1307 out tokens · 58444 ms · 2026-05-20T04:00:42.390013+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: The tool is designed to search for sensitive data leaks across both surface web and dark web platforms. It allows users to input specific keywords, scan multiple sites at once, and then assess the findings by referencing the MITRE ATT&CK framework.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: The results are compiled into a threat report that also includes possible mitigation strategies.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages

[1] Lockheed-Martin, ‘Gaining the Advantage - Applying Cyber Kill Chain® Methodology to Network Defense’, Nov. 2024. Accessed: Nov. 25, 2024. [Online]. Available: https://www.lockheedmartin.com/content/dam/lockheed-martin/rms/documents/cyber/Gaining_the_Advantage_Cyber_Kill_Chain.pdf

[2] J. Robertson et al., Darkweb Cyber Threat Intelligence Mining. Cambridge University Press, 2017

[3] R. Raman, V. K. Nair, P. Nedungadi, I. Ray, and K. Achuthan, ‘Darkweb: Past, Present and Future Research Trends and its Mapping to Sustainable Development Goals’, Heliyon, 2023

[4] R. P, A. Mansoor, T. Mansour, M. A, and C. G, ‘Analysis Of Cyber Threat Detection And Emulation Using MITRE Attack Framework’, International Conference on Intelligent Data Science Technologies and Applications (IDSTA), 2022

[5] C. Martins and I. Medeiros, ‘Generating Quality Threat Intelligence Leveraging OSINT and a Cyber Threat Unified Taxonomy’, ACM Transactions on Privacy and Security, vol. 25, no. 3, pp. 1–39, Nov. 2022

[6] J. S. Slinde, ‘Unveiling the Potential of Open-Source Intelligence (OSINT) for Enhanced Cybersecurity Posture’, University of Agder, 2023

[7] M. G. Solomon and S.-P. Oriyano, Ethical Hacking: Techniques, Tools, and Countermeasures. Jones & Bartlett Learning, 2022

[8] W. Tounsi and H. Rais, ‘A survey on technical threat intelligence in the age of sophisticated cyber attacks’, Comput Secur, vol. 72, pp. 212–233, Nov. 2018

[9] C. Sabottke, O. Suciu, and T. Dumitraş, ‘Vulnerability Disclosure in the Age of Social Media: Exploiting Twitter for Predicting Real-World Exploits’, in Proceedings of the 24th USENIX Security Symposium, USENIX Association, Nov. 2015

[10] A. ZIÓŁKOWSKA, ‘OPEN SOURCE INTELLIGENCE (OSINT) AS AN ELEMENT OF MILITARY RECON’, War Studies University, Warsaw, 2018

[11] Google, ‘We’re All in this Together: A Year in Review of Zero-Days Exploited In-the-Wild in 2023’, Nov. 2024

[12] D. De Pascale, G. Cascavilla, D. A. Tamburri, and W. Van Den Heuvel, ‘CRATOR: a Dark Web Crawler’, arXiv:2405.06356v1, 2024

[13] B. AlKhatib and R. Basheer, ‘Crawling the Dark Web: A Conceptual Perspective, Challenges and Implementation’, Journal of Digital Information Management, vol. 17, no. 2, 2019

[14] F. Ahmed, P. Khatri, G. Surange, and A. Agrawal, ‘SearchOL: A Tool for Reconnaissance’, Journal of Network and Innovative Computing, vol. 11, pp. 021–029, 2023

[15] M. Al Ismaili, ‘Enhancing Cybersecurity: Exploring Effective Ethical Hacking Techniques with Kali Linux’, Research and Applications Towards Mathematics and Computer Science, pp. 135–152, 2023

[16] P. Kashyap and V. Selvarajah, ‘Analysis of Different Methods of Reconnaissance’, in 3rd International Conference on Integrated Intelligent Computing Communication & Security (ICIIC 2021), Atlantis Press, 2021, pp. 509–519

[17] R. Botwright, Advanced OSINT Strategies: Online Investigations And Intelligence Gathering. Pastor Publishing Limited, 2024
