
arXiv: 2606.09922 · v1 · pith:VT7KLSVOnew · submitted 2026-06-07 · 💻 cs.IT · cs.AI · math.IT

The Bioelectrical Information Theory: Investigating the theoretical compression limit of bioelectrical signals under artificial intelligence

Pith reviewed 2026-06-27 18:17 UTC · model grok-4.3

classification 💻 cs.IT · cs.AI · math.IT
keywords bioelectrical signals · information theory · signal compression · deep learning · brain-computer interfaces · semantic level · conditional entropy · physiological encoders

The pith

The compression limit of bioelectrical signals is a model- and task-conditioned quantity rather than a fixed property of the waveform.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper sets out an information-theoretic view in which bioelectrical signal compression depends on physiological structure, model capacity, and the needs of a downstream task in addition to raw waveform fidelity. It organizes the process into a three-level hierarchy: the signal level removes noise to retain information about latent sources, the physiological level uses parametric encoders to create compact quantized representations, and the semantic level discards task-irrelevant content by letting deep learning models substitute conditional entropy for marginal entropy. A reader would care because brain-computer interfaces face growing bandwidth demands, and this framing suggests that more expressive models can reduce what must be transmitted to only the residuals needed for interpretation. The central shift is from preserving the entire signal to transmitting only what the task and model require.

Core claim

The paper claims that the effective information of bioelectrical data is determined not only by signal fidelity, but also by physiological structure, model capacity and downstream task requirements. It formulates bioelectrical compression as a three-level hierarchy. At the signal level, noise is reduced to the information carried about latent physiological sources. At the physiological level, parametric encoders map purified signals into compact, structured and quantized representations. At the semantic level, task-irrelevant information is discarded while deep learning models exploit causal dependencies to replace marginal entropy with conditional entropy. This perspective reframes the compression limit of bioelectrical signals as a model- and task-conditioned quantity rather than a fixed property of the waveform.

What carries the argument

The three-level hierarchy of signal, physiological, and semantic compression that makes the effective limit depend on model capacity and task requirements.

Load-bearing premise

Deep learning models can reliably exploit causal dependencies in bioelectrical data to replace marginal entropy with conditional entropy at the semantic level without losing task-critical information.
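
The premise turns on the gap between marginal and conditional entropy. As a minimal sketch of that gap, and not anything taken from the paper, the toy joint distribution below (hypothetical numbers) treats X as a signal symbol and Y as side information that a model could supply to the decoder, then compares H(X) with H(X|Y).

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical joint pmf p(x, y): rows index the signal symbol X,
# columns index the side information Y (illustrative values only).
joint = np.array([
    [0.40, 0.05],
    [0.05, 0.40],
    [0.05, 0.05],
])

p_x = joint.sum(axis=1)   # marginal of X
p_y = joint.sum(axis=0)   # marginal of Y

h_x = entropy(p_x)                                   # marginal entropy H(X)
h_x_given_y = entropy(joint.ravel()) - entropy(p_y)  # chain rule: H(X|Y) = H(X,Y) - H(Y)

print(f"H(X)   = {h_x:.3f} bits")
print(f"H(X|Y) = {h_x_given_y:.3f} bits")
```

Conditioning never increases entropy, so H(X|Y) is at most H(X) for any joint distribution. Whether the gap is large for real bioelectrical data, and whether the discarded part is truly task-irrelevant, is exactly what the premise leaves open.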

What would settle it

An experiment in which compression rates for a given bioelectrical task remain at or above the marginal entropy of the raw signal even after scaling model capacity and integrating task-specific training.
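
The comparison such an experiment would make can be sketched on a toy source. The example below is an illustration under stated assumptions, not the paper's protocol: a hypothetical two-state Markov source stands in for the signal, and an ideal first-order predictor stands in for the model. It computes the marginal entropy per symbol and the conditional entropy rate that a predictor with access to the previous symbol could approach.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability vector (zero entries are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical transition matrix with strong temporal dependence:
# P[i, j] = probability of moving from state i to state j.
P = np.array([[0.95, 0.05],
              [0.05, 0.95]])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

# Rate with no model: marginal entropy of a single symbol.
marginal_rate = entropy_bits(pi)

# Rate with an ideal first-order model: entropy of the next symbol
# given the current one, averaged over the stationary distribution.
conditional_rate = float(sum(pi[i] * entropy_bits(P[i]) for i in range(len(pi))))

print(f"marginal entropy : {marginal_rate:.3f} bits/symbol")
print(f"conditional rate : {conditional_rate:.3f} bits/symbol")
```

In this toy case the conditional rate falls well below the marginal entropy. The settling experiment described above would be the opposite finding on real recordings: measured rates that stay at or above the marginal entropy however capable the model.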

Figures

Figures reproduced from arXiv: 2606.09922 by Bo Yan, Jiawen Zou.

Figure 1: Hierarchical information reduction for bioelectrical signal compression. Raw bioelectrical recordings first undergo signal-level purification, where noise and measurement noise are removed to recover cleaner physiological signals. The purified signals are then mapped by an encoder into compact quantized features, yielding a structured physiological representation with reduced dimensionality and finite reso…
Original abstract

Bioelectrical signals are increasingly acquired at scales that challenge the bandwidth of brain-computer interfaces. However, their compression is still often framed as a problem of waveform preservation, limited by the entropy of the raw signal. Here we propose an information-theoretic framework in which the effective information of bioelectrical data is determined not only by signal fidelity, but also by physiological structure, model capacity and downstream task requirements. We formulate bioelectrical compression as a three-level hierarchy. At the signal level, noise is reduced to the information they carry about latent physiological sources. At the physiological level, parametric encoders map purified signals into compact, structured and quantized representations. At the semantic level, task-irrelevant information is discarded, while deep learning models exploit causal dependencies to replace marginal entropy with conditional entropy. This perspective reframes the compression limit of bioelectrical signals as a model- and task-conditioned quantity rather than a fixed property of the waveform. As increasingly expressive models become integrated with neural and physiological interfaces, bioelectrical compression may shift from transmitting signals to transmitting only the residual information required for task-level interpretation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an information-theoretic framework for bioelectrical signal compression framed as a three-level hierarchy. At the signal level, noise is reduced to information about latent physiological sources. At the physiological level, parametric encoders produce compact quantized representations. At the semantic level, task-irrelevant information is discarded while deep learning models exploit causal dependencies to replace marginal entropy with conditional entropy. The central claim is that this makes the effective compression limit a model- and task-conditioned quantity rather than a fixed property of the waveform.

Significance. If the hierarchy were shown to produce a strictly lower rate-distortion function at the semantic level for equivalent task fidelity, the reframing could influence compression strategies in brain-computer interfaces. The manuscript supplies only a verbal description with no rate-distortion derivation, entropy definitions, or empirical comparison, so the claimed shift remains an assertion rather than a demonstrated result.

major comments (2)
  1. [Abstract] Abstract (final paragraph): the assertion that 'deep learning models exploit causal dependencies to replace marginal entropy with conditional entropy' is load-bearing for the reframing claim, yet the manuscript supplies neither definitions of the relevant conditional entropies nor a distortion measure tied to task performance, nor any achievability argument showing a lower achievable rate than classical waveform-level rate-distortion theory.
  2. [Conceptual framework] The three-level hierarchy description (throughout): no explicit expressions are given for the conditional entropies at each level, no comparison of the resulting rate-distortion functions is derived, and no argument is made that the semantic level yields a quantitatively different (lower) limit for the same task fidelity; this absence prevents verification of the central claim.
minor comments (1)
  1. [Abstract] The sentence 'noise is reduced to the information they carry about latent physiological sources' is grammatically awkward; consider rephrasing for clarity.
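
For reference, the classical waveform-level baseline that both major comments compare against is the standard rate-distortion function. The block below restates the textbook definition and one well-known special case; the symbols S, \hat{S}, d and \sigma^2 are chosen here for illustration and do not appear in the manuscript.

```latex
% Classical rate-distortion function for a source S with
% reconstruction \hat{S} and distortion measure d:
R(D) \;=\; \min_{p(\hat{s}\mid s)\,:\;\mathbb{E}[d(S,\hat{S})]\,\le\, D} I(S;\hat{S})

% Textbook special case: memoryless Gaussian source with variance \sigma^2
% under squared-error distortion, for 0 < D \le \sigma^2:
R(D) \;=\; \tfrac{1}{2}\log_2\frac{\sigma^2}{D}
```

A task-level counterpart would replace d with a distortion tied to task performance. The report's objection is that the manuscript specifies neither that distortion nor the resulting function.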

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and precise comments. The report correctly observes that the manuscript advances a conceptual reframing of bioelectrical compression without supplying explicit entropy expressions, rate-distortion derivations, or empirical comparisons. We will revise the text to incorporate informal but explicit definitions of the relevant quantities at each hierarchy level and to clarify the intended scope as a perspective on model- and task-conditioned limits rather than a new theorem. These changes will improve verifiability while preserving the paper's focus.

Point-by-point responses
  1. Referee: [Abstract] Abstract (final paragraph): the assertion that 'deep learning models exploit causal dependencies to replace marginal entropy with conditional entropy' is load-bearing for the reframing claim, yet the manuscript supplies neither definitions of the relevant conditional entropies nor a distortion measure tied to task performance, nor any achievability argument showing a lower achievable rate than classical waveform-level rate-distortion theory.

    Authors: We agree that the claim would be strengthened by formal support. The manuscript is a perspective proposing a hierarchy rather than deriving new bounds. In revision we will add explicit (informal) definitions: at the semantic level the effective rate is governed by the conditional entropy H(S | T, M) where S denotes the signal, T the downstream task and M the model, contrasted with the marginal H(S). We will also introduce a task-dependent distortion D_task and note that the relevant quantity is then the conditional rate-distortion function R(D_task | M). A general achievability proof establishing strict improvement over the classical waveform rate-distortion function lies beyond the scope of this work and depends on specific causal structure captured by M; we will revise the abstract to state this limitation explicitly. revision: partial

  2. Referee: [Conceptual framework] The three-level hierarchy description (throughout): no explicit expressions are given for the conditional entropies at each level, no comparison of the resulting rate-distortion functions is derived, and no argument is made that the semantic level yields a quantitatively different (lower) limit for the same task fidelity; this absence prevents verification of the central claim.

    Authors: We accept that the absence of explicit expressions hinders verification. The hierarchy is presented conceptually to highlight the shift from waveform to semantic compression. In the revised manuscript we will supply expressions for each level: signal level uses mutual information I(S; physiological sources); physiological level uses entropy of the quantized representation H(Q); semantic level uses conditional entropy H(residual | T, M). We will argue conceptually that, for fixed task fidelity, conditioning on model knowledge can reduce the required rate relative to marginal entropy, but we will not claim or derive a general inequality between the associated rate-distortion functions, as any quantitative comparison is instance-specific. This revision will make the central reframing more precise within its stated conceptual scope. revision: yes
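
The three quantities named in this response can be written side by side. The block below is a sketch of that notation with only standard facts attached. Z for the latent physiological sources, \mathcal{Q} for a finite codebook, and R for the residual are labels introduced here, and treating the task T and model M as random variables that can be conditioned on is an assumption of the sketch.

```latex
\begin{align}
  \text{signal level:}        \quad & I(S; Z) \\
  \text{physiological level:} \quad & H(Q) \;\le\; \log_2 \lvert \mathcal{Q} \rvert \\
  \text{semantic level:}      \quad & H(R \mid T, M) \;\le\; H(R)
\end{align}
```

The two inequalities are general properties of discrete entropy: a variable over a finite alphabet has entropy at most the log of the alphabet size, and conditioning cannot increase entropy. Neither says how large the reduction is for a given task, which matches the authors' statement that any quantitative comparison is instance-specific.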

Circularity Check

0 steps flagged

No significant circularity; conceptual reframing without load-bearing derivations or self-referential reductions

Full rationale

The paper advances a verbal three-level hierarchy (signal, physiological, semantic) that reframes compression limits as model- and task-conditioned. No equations, rate-distortion functions, entropy definitions, or parameter fittings appear in the provided text. The central claim rests on an interpretive perspective rather than a derivation chain that reduces to its inputs by construction. None of the enumerated circularity patterns (self-definitional, fitted-input prediction, self-citation load-bearing, etc.) are instantiated. This is the expected non-finding for a perspective manuscript that supplies no formal mathematical steps to inspect.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

The framework rests entirely on domain assumptions about separability of signal, physiological, and semantic levels and on the unverified capabilities of parametric encoders and deep learning models; no free parameters, invented entities, or external benchmarks are specified in the abstract.

axioms (3)
  • domain assumption: Bioelectrical signals can be purified to information about latent physiological sources by noise reduction.
    Invoked at the signal level of the proposed hierarchy.
  • domain assumption: Parametric encoders exist that map purified signals into compact, structured, and quantized representations.
    Invoked at the physiological level.
  • ad hoc to paper: Deep learning models can exploit causal dependencies to replace marginal entropy with conditional entropy by discarding task-irrelevant information.
    Central assumption at the semantic level that enables the model- and task-conditioned reframing.
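
The second axiom, a quantized representation with finite resolution, can be illustrated with a minimal sketch. The example assumes a plain uniform scalar quantizer applied to synthetic Gaussian features. The paper does not specify its encoder or quantizer, so every name and parameter below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=10_000)  # synthetic stand-in for purified encoder outputs

def quantize_uniform(x, n_levels, lo=-4.0, hi=4.0):
    """Map each value to one of n_levels equal-width bins on [lo, hi].

    Values outside the range fall into the first or last bin.
    """
    edges = np.linspace(lo, hi, n_levels + 1)
    return np.digitize(x, edges[1:-1])  # indices in 0 .. n_levels - 1

def empirical_entropy_bits(symbols, n_levels):
    """Plug-in entropy estimate in bits from symbol counts."""
    counts = np.bincount(symbols, minlength=n_levels)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

n_levels = 16
q = quantize_uniform(features, n_levels)
h_q = empirical_entropy_bits(q, n_levels)

print(f"empirical H(Q) = {h_q:.3f} bits, upper bound log2({n_levels}) = {np.log2(n_levels):.0f} bits")
```

The entropy of the quantized symbols is bounded by the log of the number of levels and sits below that bound whenever the bins are used unevenly. That finite bound is what a quantized physiological representation buys over a continuous waveform.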

pith-pipeline@v0.9.1-grok · 5718 in / 1423 out tokens · 26110 ms · 2026-06-27T18:17:57.194238+00:00 · methodology

