
arxiv: 2605.17655 · v1 · pith:RGMVTRHHnew · submitted 2026-05-17 · 💻 cs.CY · cs.SI

Disarranged Harmonization of Transparency Reporting by Social Media Platforms Under the Digital Services Act

Pith reviewed 2026-05-19 22:24 UTC · model grok-4.3

classification 💻 cs.CY cs.SI
keywords Digital Services Act · transparency reporting · social media platforms · data quality · harmonization · content moderation · European Union regulation

The pith

Transparency reports from major social media platforms remain inconsistent and incomplete under the Digital Services Act.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts the first systematic evaluation of transparency reporting by the eight largest social media platforms in the European Union after the Digital Services Act introduced harmonization requirements. It runs quantitative analyses on reporting dimensions and compares results across platforms and mechanisms. The analysis reveals that all platforms show problems with data formatting, timeliness, consistency, and completeness. Some platforms submit differing information through different channels, and many earlier problems with transparency reporting persist. These findings matter because poor data quality blocks reliable auditing of platform behavior and limits public accountability.

Core claim

Despite the DSA's push for harmonized transparency reporting, the eight largest EU social media platforms display varying compliance levels with persistent issues in data formatting, timeliness, consistency, and completeness; some platforms use different procedures across mechanisms and submit contrasting information; interoperability between mechanisms stays limited; and many previously noted problems with transparency reporting remain unresolved.

What carries the argument

Large-scale quantitative analyses of key reporting dimensions, followed by a structured comparative assessment of data quality and consistency across platforms and reporting mechanisms.

If this is right

  • Harmonization under the DSA has not produced consistent reporting across different mechanisms for the same platform.
  • Data quality problems continue to obstruct effective auditing of platform transparency.
  • Interoperability between reporting mechanisms is still blocked by differing procedures.
  • Many pre-DSA issues with transparency reporting have not been fixed by the new rules.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Mandating uniform data formats and submission deadlines could reduce the observed inconsistencies.
  • Auditors may need to cross-check multiple reporting channels to obtain a reliable picture of platform activity.
  • Persistent gaps could limit the ability of researchers and regulators to track changes in content moderation over time.
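
The cross-checking idea above can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the `ChannelCounts` fields, the platform names, the counts, and the 5% tolerance are all hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ChannelCounts:
    """Counts of the same kind of action as reported through two channels.

    All field names are hypothetical; they do not mirror any official schema.
    """
    platform: str
    database_count: int  # e.g. a count derived from database submissions
    report_count: int    # e.g. the count stated in a periodic report


def relative_gap(a: int, b: int) -> float:
    """Symmetric relative difference in [0, 1]; 0 means the channels agree."""
    if a == 0 and b == 0:
        return 0.0
    return abs(a - b) / max(a, b)


def flag_discrepancies(rows, tolerance=0.05):
    """Return the platforms whose two channels differ by more than `tolerance`."""
    return [
        r.platform
        for r in rows
        if relative_gap(r.database_count, r.report_count) > tolerance
    ]


# Made-up numbers, for illustration only.
rows = [
    ChannelCounts("platform_a", 1000, 990),
    ChannelCounts("platform_b", 1000, 400),
]
flagged = flag_discrepancies(rows)
```

With these made-up numbers only `platform_b` exceeds the tolerance. A real audit would also need to confirm that both channels cover the same period and the same category of action before comparing counts.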

Load-bearing premise

The transparency reports submitted by platforms are sufficiently complete and accessible to allow direct quantitative comparison of data quality and consistency across different reporting mechanisms without significant missing context or selection effects.

What would settle it

Finding even one of the eight platforms that submits identical, timely, complete, and consistently formatted data through every required DSA reporting mechanism would contradict the claim that all platforms exhibit these issues.

Figures

Figures reproduced from arXiv: 2605.17655 by Amaury Trujillo, Benedetta Tessa, Stefano Cresci.

Figure 1: Platform-wise, country-level number of average monthly active recipients (AMAR), for the second half of 2025.
Figure 2: Platform-wise daily number of SoR submissions (N) and mean communication delay (D), during the second half of …
Figure 3: Platform-wise distributions of the use of automated means in statements of reasons, ordered by an ad hoc automation …
Figure 4: Platform-wise proportions of T&C own-initiative actions with solely automated means in the Transparency Database (TDB) and Reports (TRs), sorted by effect size of their difference (Cohen’s h). TDB actions with partially automated decisions are taken as fully automated.
Figure 5: Platform-wise distributions of self-reported decision appeals on internal-complaint mechanisms. Decisions …
Figure 6: Platform-wise distributions of decision application delays in statements of reasons (compared to the content creation …
Figure 7: Self-reported own-initiative actions for violations of terms and conditions (T …
Figure 8: Platform-wise self-reported human resources working on moderation by language and total. A moderator might be …
Figure 9: Platform-wise self-reported number of received notices, via either Trusted Flaggers or Article 16 notice and action …
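
The Figure 4 caption ranks platforms by Cohen's h, the standard effect size for the difference between two proportions, defined as h = 2·arcsin(√p1) − 2·arcsin(√p2). The sketch below shows only the textbook formula; the function name and the example proportions are illustrative and are not taken from the paper.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h for two proportions, each in [0, 1].

    The sign shows which proportion is larger; the magnitude is the effect size.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p1)) - 2.0 * math.asin(math.sqrt(p2))


# Illustrative values only: share of solely automated actions in two channels.
h_example = cohens_h(0.90, 0.60)
```

By Cohen's conventional rule of thumb, |h| around 0.2 is a small effect, 0.5 medium, and 0.8 large. Identical proportions give h = 0, and the maximum possible magnitude is π.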
Original abstract

The European Commission recently introduced new regulation to harmonize transparency reporting of large online platforms under the Digital Services Act (DSA). Here, we present the first systematic evaluation of transparency reporting data quality after this normative change, for the eight largest social media platforms in the European Union. In detail, we run a set of large-scale quantitative analyses on key reporting dimensions, followed by a structured comparative assessment across platforms and reporting mechanisms. Among our findings is that: (i) the analyzed platforms had varying degrees of compliance and data quality, but all exhibited issues on data formatting, timeliness, consistency, and completeness; (ii) some platforms employed differing reporting procedures across mechanisms, which caused them to submit contrasting information; (iii) despite the harmonization, a number of issues still prevent interoperability between reporting mechanisms; and (iv) many of the previously identified issues with transparency reporting are still unresolved. We conclude by discussing implications for transparency auditing and proposing key targeted improvements to strengthen the reliability and interoperability of DSA transparency reporting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents the first systematic evaluation of transparency reporting data quality for the eight largest social media platforms in the EU under the Digital Services Act (DSA). Through large-scale quantitative analyses on key reporting dimensions and a structured comparative assessment across platforms and reporting mechanisms, it finds that all platforms exhibit issues with data formatting, timeliness, consistency, and completeness; some platforms employ differing procedures leading to contrasting information; interoperability between mechanisms remains limited; and many previously identified transparency issues persist. The paper concludes with implications for auditing and targeted improvement proposals.

Significance. If the empirical findings hold, this work provides timely evidence on the practical outcomes of DSA harmonization efforts, highlighting persistent gaps in transparency reporting that could inform regulatory refinements and auditing standards. The comparative assessment across eight platforms and multiple mechanisms is a clear strength, offering a broad view of compliance variation that prior studies have not systematically addressed at this scale.

major comments (3)
  1. [Methods] Methods section: The retrieval protocol, inclusion criteria, and handling of missing entries or inaccessible historical versions for the DSA transparency reports are not detailed. This is load-bearing for the central claims, as the findings that all platforms exhibited issues on formatting, timeliness, consistency, and completeness, plus contrasting information from differing procedures, assume the collected reports form a representative sample without selection effects from platform-specific publication practices.
  2. [Results] Results section (quantitative analyses): Specifics on dataset sizes, error handling, statistical methods, or metrics for assessing data quality dimensions are absent. This undermines evaluation of the robustness of the claim that all eight platforms exhibited issues, particularly given the abstract's emphasis on large-scale analyses.
  3. [Comparative assessment] Comparative assessment: The assertion that differing reporting procedures caused platforms to submit contrasting information requires concrete examples linked to specific data points, tables, or figures to demonstrate the scale and implications for interoperability.
minor comments (2)
  1. [Abstract] Abstract: Consider briefly specifying one or two example metrics or dimensions used in the quantitative analyses to give readers a clearer sense of the evaluation scope.
  2. Notation and terminology: Ensure consistent use of terms like 'reporting mechanisms' and 'transparency reports' throughout to avoid minor ambiguity in cross-platform comparisons.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which helps clarify key aspects of our methodology and presentation of results. We address each major comment below, indicating revisions where we agree additional detail or examples will strengthen the manuscript.

Point-by-point responses
  1. Referee: [Methods] Methods section: The retrieval protocol, inclusion criteria, and handling of missing entries or inaccessible historical versions for the DSA transparency reports are not detailed. This is load-bearing for the central claims, as the findings that all platforms exhibited issues on formatting, timeliness, consistency, and completeness, plus contrasting information from differing procedures, assume the collected reports form a representative sample without selection effects from platform-specific publication practices.

    Authors: We agree that greater detail on data collection is needed to support the representativeness of our sample and address potential selection effects. The current manuscript outlines the overall approach at a high level but does not fully specify the retrieval protocol, inclusion criteria, or procedures for missing or historical versions. In the revised manuscript, we will expand the Methods section with a dedicated subsection describing the data sources (official DSA repositories and platform disclosures), the time window and search strategy used, explicit inclusion/exclusion criteria, and how inaccessible or missing reports were handled, including any platform-specific publication variations encountered. revision: yes

  2. Referee: [Results] Results section (quantitative analyses): Specifics on dataset sizes, error handling, statistical methods, or metrics for assessing data quality dimensions are absent. This undermines evaluation of the robustness of the claim that all eight platforms exhibited issues, particularly given the abstract's emphasis on large-scale analyses.

    Authors: We acknowledge that the Results section would benefit from explicit reporting of dataset characteristics and analytical procedures to allow readers to assess robustness. While the manuscript presents the outcomes of the large-scale quantitative analyses on the key dimensions, it does not currently include dataset sizes, error-handling steps, or the precise metrics and methods applied. In the revision, we will add these specifics, including the number of reports analyzed per platform and mechanism, how inconsistencies or errors were identified and coded, the quantitative metrics used for each quality dimension (e.g., formatting compliance rates, timeliness thresholds), and any descriptive or comparative statistics employed. revision: yes

  3. Referee: [Comparative assessment] Comparative assessment: The assertion that differing reporting procedures caused platforms to submit contrasting information requires concrete examples linked to specific data points, tables, or figures to demonstrate the scale and implications for interoperability.

    Authors: We agree that linking the observed differences in reporting procedures to concrete examples would improve clarity and demonstrate the practical implications. The comparative assessment section already identifies instances where platforms used differing procedures across mechanisms leading to contrasting information, but these could be more explicitly tied to underlying data. In the revised manuscript, we will add specific examples drawn from the collected reports, cross-referenced to particular data points, and include or expand a table or figure that illustrates selected cases of inconsistency and their effects on interoperability. revision: yes
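
The quality metrics mentioned in the second response (completeness rates, timeliness measures) could take a form like the following. This is a sketch under stated assumptions, not the authors' method: the record layout and the field names `decision_date`, `submitted_at`, and `category` are hypothetical and do not reproduce any official DSA schema.

```python
from datetime import datetime
from statistics import mean

# Hypothetical required fields for one reported moderation decision.
REQUIRED_FIELDS = ("decision_date", "submitted_at", "category")


def completeness_rate(records):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(field) not in (None, "") for field in REQUIRED_FIELDS)
        for r in records
    )
    return complete / len(records)


def mean_delay_days(records):
    """Mean delay, in days, between a decision and its submission.

    Records with missing or unparseable dates are skipped, not guessed.
    Returns None when no record has two usable dates.
    """
    delays = []
    for r in records:
        try:
            decided = datetime.fromisoformat(r["decision_date"])
            submitted = datetime.fromisoformat(r["submitted_at"])
        except (KeyError, TypeError, ValueError):
            continue
        delays.append((submitted - decided).total_seconds() / 86400)
    return mean(delays) if delays else None


# Made-up records: one clean, one with an empty field, one with a malformed date.
records = [
    {"decision_date": "2025-07-01", "submitted_at": "2025-07-03", "category": "x"},
    {"decision_date": "2025-07-01", "submitted_at": "2025-07-02", "category": ""},
    {"decision_date": "not-a-date", "submitted_at": "2025-07-02", "category": "y"},
]
```

The two metrics are deliberately kept separate. The third record counts as complete because all fields are filled, yet it is excluded from the delay because its date does not parse, which illustrates how a formatting problem can hide behind a good completeness score.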

Circularity Check

0 steps flagged

No circularity: empirical analysis of public DSA transparency reports relies on direct data comparison without derivations or self-referential reductions

Full rationale

The paper presents a systematic evaluation of transparency reports submitted by eight social media platforms under the DSA. It describes running large-scale quantitative analyses on dimensions such as formatting, timeliness, consistency, and completeness, followed by comparative assessment. No equations, fitted parameters, predictions, or first-principles derivations are claimed or present in the abstract or described methodology. The central findings rest on direct inspection and comparison of publicly accessible reports rather than any internal construction where outputs reduce to inputs by definition or self-citation chains. This is a standard empirical study of external data sources, self-contained against public benchmarks, with no load-bearing steps that exhibit the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The evaluation assumes that publicly filed DSA reports can be treated as comparable raw data across platforms and mechanisms; no free parameters or invented entities are introduced.

axioms (1)
  • Domain assumption: the eight largest social media platforms in the EU are representative for assessing overall DSA transparency reporting compliance.
    The abstract states that the analysis covers the eight largest platforms, without further justification of the selection criteria.

pith-pipeline@v0.9.0 · 5708 in / 1094 out tokens · 52990 ms · 2026-05-19T22:24:20.796045+00:00 · methodology

