arXiv: 2603.00646 · v2 · submitted 2026-02-28 · cs.SI · cs.CR

MoltGraph: A Longitudinal Temporal Graph Dataset of Moltbook for Coordinated-Agent Detection

Kunal Mukherjee, Cuneyt Gurcan Akcora, Murat Kantarcioglu

Pith reviewed 2026-05-15 18:52 UTC · model grok-4.3

keywords: MoltGraph · Moltbook · coordinated agents · temporal graph dataset · social network analysis · agentic platforms · exposure effects · longitudinal data

The pith

Coordinated posts on Moltbook show over 500 percent higher early interaction rates and over 240 percent higher downstream exposure than non-coordinated controls.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MoltGraph, a longitudinal temporal graph dataset drawn from the Moltbook agent-native platform, to enable study of how coordinated agents manipulate visibility through comments and upvotes. It characterizes the network with heavy-tailed connectivity, rapid hub formation where the top one percent of agents drive nearly thirty percent of activity, and short-lived coordination bursts that produce large measurable gains in post performance. A sympathetic reader would care because the dataset supplies the first graph-native resource for linking coordination behavior directly to downstream exposure effects in emerging multi-agent social systems.
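
To make the concentration statistic concrete, the sketch below computes the share of total engagements attributable to the most active fraction of agents. The function name `top_share` and the toy counts are illustrative assumptions, not the paper's code or data.

```python
import math

def top_share(engagement_counts, fraction=0.01):
    """Share of all engagements produced by the top `fraction` of agents.

    `engagement_counts` holds one non-negative count per agent. The number of
    top agents is rounded up so that at least one agent is always included.
    """
    counts = sorted(engagement_counts, reverse=True)
    total = sum(counts)
    if not counts or total == 0:
        return 0.0
    n_top = max(1, math.ceil(fraction * len(counts)))
    return sum(counts[:n_top]) / total

# Hypothetical toy network: one hub, 50 light users, 49 silent agents.
toy_counts = [50] + [1] * 50 + [0] * 49
print(top_share(toy_counts, fraction=0.01))
```

Applied to per-agent engagement totals from the released dataset, the same function would give the figure the paper reports as the top 1% share, assuming engagements are counted the way the authors count them.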

Core claim

Using MoltGraph, the authors provide the first graph-centric characterization of Moltbook as a dynamic network. They report power-law exponents between 1.86 and 2.72, accelerating hub formation, and 98.33 percent of coordination episodes lasting under twenty-four hours. In matched analyses, posts receiving coordinated engagement exhibit 506.35 percent higher early interaction rates within five days and 242.63 percent higher downstream exposure in feeds than non-coordinated controls.
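
Percent-higher figures are easy to misread as ratios. Assuming the usual definition, (treated - control) / control × 100, a lift of p percent corresponds to a multiplicative factor of 1 + p/100. The helper below is an illustrative sketch with a hypothetical name; the paper's exact definition of the lift is not given in the abstract.

```python
def lift_to_ratio(percent_higher):
    """Convert 'p percent higher than control' into a treated/control ratio."""
    return 1.0 + percent_higher / 100.0

early_ratio = lift_to_ratio(506.35)     # roughly six times the control rate
exposure_ratio = lift_to_ratio(242.63)  # roughly 3.4 times the control exposure
print(round(early_ratio, 2), round(exposure_ratio, 2))
```

Under that definition, the reported exposure effect is well above a doubling: the coordinated posts would see more than three times the control exposure.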

What carries the argument

MoltGraph, the longitudinal temporal graph dataset that jointly records heterogeneous interactions, temporal drift, and visibility signals in the Moltbook agentic network.

Load-bearing premise

Coordinated episodes can be accurately identified and matched to non-coordinated controls in the Moltbook data, without significant selection bias and with adequate validation against ground truth.

What would settle it

A re-analysis of the released MoltGraph dataset that finds no statistically significant difference in early interaction rates or downstream exposure between coordinated posts and their matched controls after controlling for content and timing factors.
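
One way such a re-analysis could be set up, assuming the released data yield one-to-one matched (coordinated, control) outcome pairs, is a paired sign-flip permutation test on the within-pair differences. The sketch below is an assumption about a reasonable test, not the authors' procedure. The function name and toy values are hypothetical, and the test does not by itself control for content or timing.

```python
import random

def paired_sign_flip_pvalue(treated, control, n_perm=10000, seed=0):
    """Two-sided p-value for a zero mean paired difference under sign flips."""
    if len(treated) != len(control):
        raise ValueError("treated and control must be matched one-to-one")
    diffs = [t - c for t, c in zip(treated, control)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Under the null, each pair's difference is equally likely to be + or -.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical early-interaction counts for eight matched pairs.
coordinated = [12, 15, 9, 20, 14, 18, 11, 16]
controls = [3, 4, 2, 5, 3, 6, 2, 4]
print(paired_sign_flip_pvalue(coordinated, controls))
```

A large p-value on the real matched pairs, after the content and timing controls are applied upstream, would be the kind of result that undercuts the headline claim.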

Figures

Figures reproduced from arXiv: 2603.00646 by Cuneyt Gurcan Akcora, Kunal Mukherjee, Murat Kantarcioglu.

**Figure 2.** Distribution of reply latency across the most active submolts.

Original abstract

Agent-native social platforms such as Moltbook are rapidly emerging, yet they inherit and amplify classical influence and abuse attacks, where coordinated agents strategically comment and upvote to manipulate visibility and propagate narratives across communities. However, rigorous measurement and learning-based monitoring remain constrained by the absence of longitudinal, graph-native datasets for agentic social networks that jointly capture heterogeneous interactions, temporal drift, and visibility signals needed to connect coordination behavior to downstream exposure. We introduce MoltGraph as a realistic longitudinal agentic social-network graph dataset for studying how agents behave, coordinate, and evolve in the wild, enabling reproducible measurement on emerging multi-agent social ecosystems. Using MoltGraph, we provide the first graph-centric characterization of Moltbook as a dynamic network: (i) heavy-tailed connectivity with power-law exponents in the range alpha in [1.86, 2.72], (ii) accelerating hub formation and attention centralization where the top 1% agents account for 29.00% of engagements, (iii) bursty, short-lived coordination episodes, 98.33% last under 24 hours, and (iv) measurable exposure effects across submolts. In matched analyses, posts receiving coordinated engagement exhibit 506.35% higher early interaction rates (within H=5 days) and 242.63% higher downstream exposure in feeds than non-coordinated controls.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MoltGraph supplies a new longitudinal dataset for Moltbook coordination studies, but the absence of labeling rules and validation leaves the 506% early-interaction and 243% exposure claims untestable.

MoltGraph is a new longitudinal temporal graph dataset from Moltbook that captures agent interactions, temporal drift, and visibility signals for studying coordinated behavior. The paper releases this resource and gives initial network characterizations that could support work on influence and abuse in agent-native platforms.

It reports heavy-tailed connectivity with power-law exponents from 1.86 to 2.72, notes that the top 1% of agents account for 29% of engagements, finds that 98.33% of coordination episodes last under 24 hours, and shows in matched comparisons that coordinated posts receive 506.35% higher early interaction rates within five days and 242.63% higher downstream exposure than controls. These measurements are concrete and give a picture of bursty, short-lived coordination and attention centralization. The dataset itself is the main addition, since prior work has not supplied graph-native longitudinal data for this platform type with the same mix of temporal and visibility features.

The soft spot is the missing description of how coordinated episodes are identified and matched to controls. No algorithm, threshold, feature set, or validation step is given, so the exposure differences cannot be reproduced or checked for selection bias. Data collection methods are also not detailed. For a dataset paper these gaps matter because the utility depends on whether others can trust or extend the labeling.

This is for researchers in social network analysis who work on coordination detection and platform moderation. A reader building models for emerging agentic networks would get value from the raw data once the construction process is documented. I would send it to peer review so the methods can be clarified and the dataset properly evaluated.

Referee Report

3 major / 1 minor

Summary. The paper introduces MoltGraph, a longitudinal temporal graph dataset from the Moltbook platform for studying coordinated-agent behavior and detection. It characterizes the network with heavy-tailed connectivity (power-law exponents in [1.86, 2.72]), accelerating hub formation (top 1% agents account for 29% of engagements), short-lived coordination episodes (98.33% last under 24 hours), and reports that matched posts with coordinated engagement show 506.35% higher early interaction rates (within H=5 days) and 242.63% higher downstream exposure than non-coordinated controls.

Significance. If the dataset is released with full documentation and the coordination labeling is made reproducible, MoltGraph could fill a gap in graph-native longitudinal resources for agentic social networks, supporting reproducible measurement of coordination effects on visibility and exposure in emerging platforms.

major comments (3)

[Abstract] Abstract: the central matched-analysis claims of 506.35% higher early interaction rates and 242.63% higher downstream exposure are presented without any description of the coordination labeling procedure, detection rules, thresholds, features (e.g., bursty comment/upvote patterns), temporal windows, or graph motifs used to identify episodes.
[Abstract] Abstract and methods (implied): no information is supplied on data collection methods, sampling of the longitudinal graph, visibility-signal capture, ground-truth validation for coordination labels, or potential selection biases, leaving the reported percentages and network statistics unsupported by visible evidence.
[Abstract] Abstract: the matched-control analysis lacks any description of the matching procedure, statistical controls, or confounding-factor handling, making it impossible to assess whether the exposure differences are attributable to coordination.

minor comments (1)

[Abstract] Abstract: the power-law exponent range is stated as 'alpha in [1.86, 2.72]' without specifying which degree sequences or interaction types each value corresponds to.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript introducing MoltGraph. We address each major comment below and outline targeted revisions to the abstract and methods to improve self-containment and reproducibility.

Referee: [Abstract] Abstract: the central matched-analysis claims of 506.35% higher early interaction rates and 242.63% higher downstream exposure are presented without any description of the coordination labeling procedure, detection rules, thresholds, features (e.g., bursty comment/upvote patterns), temporal windows, or graph motifs used to identify episodes.

Authors: We agree the abstract should be self-contained on this point. The full manuscript (Section 3) specifies the labeling via bursty comment/upvote patterns within 24-hour temporal windows, using graph motifs for coordinated engagement detection with explicit thresholds. We will revise the abstract to include a concise summary of these elements.
revision: yes

Referee: [Abstract] Abstract and methods (implied): no information is supplied on data collection methods, sampling of the longitudinal graph, visibility-signal capture, ground-truth validation for coordination labels, or potential selection biases, leaving the reported percentages and network statistics unsupported by visible evidence.

Authors: The manuscript includes a Methods section detailing data collection from Moltbook, longitudinal sampling, visibility-signal capture, and label validation. We will expand the abstract with a brief overview of these and add explicit discussion of selection biases in the revised methods section.
revision: partial

Referee: [Abstract] Abstract: the matched-control analysis lacks any description of the matching procedure, statistical controls, or confounding-factor handling, making it impossible to assess whether the exposure differences are attributable to coordination.

Authors: We concur that the abstract should reference the matching approach. Section 4.2 describes propensity score matching on initial engagement and community features to control for confounders. We will revise the abstract to briefly note this procedure and the controls applied.
revision: yes

Circularity Check

0 steps flagged

No significant circularity; direct empirical measurements

This is a dataset introduction paper whose central claims consist of direct empirical measurements (power-law exponents, hub percentages, episode durations, and exposure differentials) computed from the released MoltGraph data. No mathematical derivation chain, fitted parameters renamed as predictions, self-definitional equations, or load-bearing self-citations appear in the manuscript. The reported percentages are simple ratios and statistics extracted from the collected interactions; they do not reduce to any prior modeling assumption by construction. The absence of a detection algorithm description affects reproducibility but does not constitute circularity under the defined criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the creation of a new dataset and direct measurement of network properties and exposure effects; no free parameters are introduced, and the work relies on standard assumptions about graph connectivity and temporal episode detection.

axioms (1)

Domain assumption: network connectivity follows heavy-tailed distributions amenable to power-law fitting.
Invoked when reporting power-law exponents in the range [1.86, 2.72].
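
For readers who want to check such exponents, the continuous maximum-likelihood estimator described by Clauset, Shalizi, and Newman (2009) is alpha = 1 + n / sum(ln(x_i / x_min)). The sketch below assumes `x_min` is supplied by the caller and treats degrees as continuous. A discrete estimator and data-driven `x_min` selection are omitted, and nothing here is confirmed as the paper's fitting procedure. Function names and the synthetic sample are hypothetical.

```python
import math
import random

def powerlaw_alpha_mle(values, x_min):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x / x_min)), x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum

def sample_powerlaw(alpha, x_min, n, seed=0):
    """Inverse-transform samples from a continuous power law (for checking)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the power is always well defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

# Synthetic data with a known exponent, used only to sanity-check the estimator.
synthetic = sample_powerlaw(alpha=2.5, x_min=1.0, n=50000)
print(round(powerlaw_alpha_mle(synthetic, x_min=1.0), 2))
```

Running the estimator separately on each degree sequence or interaction type would also show which fitted value corresponds to which end of the reported range.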

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation to the paper passage: unclear

We operationalize coordination as near-synchronous co-engagement... Fix a time window Δ (minutes) and a minimum participant threshold k. A coordination episode on target c occurs at time t if at least k distinct agents perform the same action family on c within a sliding window
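
The quoted definition can be read as a sliding-window count of distinct agents. The sketch below is one possible reading, assuming events arrive as (timestamp in minutes, agent id) tuples for a single target c and a single action family. The function name, the event layout, and the choice to merge overlapping qualifying windows into one episode are assumptions rather than the paper's algorithm.

```python
from collections import Counter, deque

def detect_episodes(events, delta_minutes, k):
    """Return (start, end) intervals where >= k distinct agents act within delta.

    `events` is an iterable of (timestamp_minutes, agent_id) for one target and
    one action family. Overlapping qualifying windows are merged.
    """
    window = deque()    # events currently inside the sliding window
    active = Counter()  # agent_id -> number of that agent's events in the window
    episodes = []
    for ts, agent in sorted(events):
        window.append((ts, agent))
        active[agent] += 1
        # Drop events older than delta relative to the newest event.
        while ts - window[0][0] > delta_minutes:
            _, old_agent = window.popleft()
            active[old_agent] -= 1
            if active[old_agent] == 0:
                del active[old_agent]
        if len(active) >= k:
            start = window[0][0]
            if episodes and start <= episodes[-1][1]:
                episodes[-1] = (episodes[-1][0], ts)  # extend the previous episode
            else:
                episodes.append((start, ts))
    return episodes

# Hypothetical upvote events on one post: a burst of three agents, then stragglers.
toy_events = [(0, "a"), (2, "b"), (3, "c"), (4, "a"), (120, "a"), (121, "b"), (300, "d")]
print(detect_episodes(toy_events, delta_minutes=5, k=3))
```

Counting distinct agents rather than raw events keeps a single agent's repeated actions from triggering an episode on its own.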

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · relation to the paper passage: unclear

In matched observational analyses, posts receiving coordinated engagement exhibit 506.35% higher early interaction rates... than matched non-coordinated controls

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment
cs.CL · 2026-05 · unverdicted · novelty 7.0

An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.
