PaaF: Raising the perceived quality of INR-Based Image Compression

Dario Allegra; Lorenzo Catania

arxiv: 2606.21655 · v1 · pith:KL4PLA7Onew · submitted 2026-06-19 · 📡 eess.IV · cs.CV · cs.MM

Pith reviewed 2026-06-26 12:27 UTC · model grok-4.3

classification 📡 eess.IV · cs.CV · cs.MM

keywords implicit neural representations, image compression, adaptive quantization, entropy coding, rate-distortion performance, perceptual quality, INR-based codec

The pith

PaaF improves INR-based image compression through refined architecture, adaptive quantization, and entropy coding.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Implicit neural representations encode images as continuous functions rather than fixed pixel arrays. Prior INR codecs suffered from long encoding times and a persistent gap on classic metrics such as PSNR. PaaF adds three targeted changes (an updated network layout, adaptive quantization, and a compact entropy coder) while keeping the decoder simple and parallel. If these changes raise rate-distortion curves and perceptual scores, INR approaches move closer to practical use without sacrificing their decoding advantages. The paper reports gains over earlier INR baselines on the Kodak and CLIC2020 image sets.
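To make the idea of an image as a continuous function concrete, here is a minimal sketch in plain Python. It is not PaaF's architecture: the two-layer sine-activated network, its random weights, and the names `make_inr`, `inr_pixel`, `decode`, and `omega` are illustrative assumptions. It shows only that decoding amounts to evaluating one function at each pixel coordinate independently, on whatever sampling grid is requested.

```python
import math
import random

def make_inr(hidden=16, seed=0):
    """Build a tiny 2 -> hidden -> 3 sine-activated MLP with random weights (illustrative only)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    # Scale the output weights so the pre-clamp colour stays within [-0.5, 1.5].
    w2 = [[rng.uniform(-1, 1) / hidden for _ in range(hidden)] for _ in range(3)]
    b2 = [0.5, 0.5, 0.5]
    return w1, b1, w2, b2

def inr_pixel(params, x, y, omega=10.0):
    """Evaluate the function at one continuous coordinate (x, y) in [-1, 1]^2."""
    w1, b1, w2, b2 = params
    h = [math.sin(omega * (w[0] * x + w[1] * y + b)) for w, b in zip(w1, b1)]
    rgb = [sum(wi * hi for wi, hi in zip(w, h)) + b for w, b in zip(w2, b2)]
    return tuple(min(1.0, max(0.0, c)) for c in rgb)

def decode(params, height, width):
    """Sample the function on a pixel grid; every pixel is computed independently."""
    def coord(i, n):
        return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0
    return [[inr_pixel(params, coord(c, width), coord(r, height))
             for c in range(width)] for r in range(height)]

params = make_inr()
small = decode(params, 4, 6)
large = decode(params, 8, 12)  # same weights, denser sampling grid
```

Because each pixel depends only on its own coordinate and the shared weights, the loops in `decode` could be split across threads or GPU lanes without coordination. That is the decoding property the paper says it wants to preserve.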

Core claim

The paper states that an INR codec named PaaF, built with an improved architectural design, adaptive quantization, and an efficient entropy coding scheme, delivers higher rate-distortion performance and better perceptual quality than previous INR-based compressors while retaining the simplicity and parallelizability of INR decoding.

What carries the argument

PaaF, the INR codec whose performance rests on the joint use of improved network architecture, adaptive quantization, and efficient entropy coding.
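The abstract does not say how PaaF adapts its quantizer or which entropy coder it uses, so the following is only a generic sketch of how quantization and entropy coding interact in an INR codec. It assumes, purely for illustration, that each weight group gets a uniform step size derived from its own range, and it estimates the coded size with the empirical order-0 entropy rather than a real coder. The names `quantize`, `dequantize`, `entropy_bits`, and the two weight groups are hypothetical.

```python
import math
import random
from collections import Counter

def quantize(weights, num_levels=16):
    """Uniform quantization with a step size derived from this group's own range."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (num_levels - 1) if hi > lo else 1.0
    return [round((w - lo) / step) for w in weights], lo, step

def dequantize(symbols, lo, step):
    """Map integer symbols back to approximate weight values."""
    return [lo + s * step for s in symbols]

def entropy_bits(symbols):
    """Total bits under an ideal order-0 entropy coder (cost of sending the model excluded)."""
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in Counter(symbols).values())

# Hypothetical weight groups with very different dynamic ranges.
rng = random.Random(0)
layers = {
    "first": [rng.gauss(0.0, 1.0) for _ in range(64)],
    "last": [rng.gauss(0.0, 0.05) for _ in range(64)],
}

total_bits = 0.0
max_error = {}
for name, weights in layers.items():
    symbols, lo, step = quantize(weights)
    restored = dequantize(symbols, lo, step)
    max_error[name] = (max(abs(a - b) for a, b in zip(weights, restored)), step)
    total_bits += entropy_bits(symbols)

# Estimated rate for a 512x768 image, ignoring side information such as lo and step.
bpp = total_bits / (512 * 768)
```

With a per-group step, the narrow-range group is reconstructed with a much smaller absolute error than it would be under one global step, at the same number of levels. Whether PaaF adapts per layer, per channel, or by some other criterion cannot be determined from the abstract.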

If this is right

INR-based compression can now reach higher PSNR values at given bit rates than earlier functional methods.
Perceptual quality improves alongside the quantitative metrics under the same rate constraints.
Decoding stays simple and parallelizable because the added components act only at the encoder.
The performance gap between functional representations and established codecs narrows on both objective and subjective measures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the gains hold on video sequences, the same functional approach could extend beyond still images without pixel-grid overhead.
Adaptive quantization may shorten encoding time, addressing one of the original drawbacks noted for INR codecs.
The method leaves open whether similar gains appear when the same components are grafted onto non-INR learned codecs.

Load-bearing premise

The three added components actually raise rate-distortion performance without losing the decoder's simplicity and parallel speed.

What would settle it

A head-to-head test on standard datasets showing that PaaF produces no higher PSNR or perceptual scores than prior INR methods at the same bit rate would falsify the central claim.
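The comparison described above could be scored along these lines. This is a minimal sketch using the standard PSNR definition, 10·log10(peak² / MSE), and bits per pixel computed from file size; the helper `wins_at_matched_rate` and its tolerance `rate_tol` are assumptions introduced here, not a protocol taken from the paper.

```python
import math

def psnr(reference, decoded, peak=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel samples."""
    if len(reference) != len(decoded):
        raise ValueError("images must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def bits_per_pixel(num_bytes, height, width):
    """Rate of a compressed file, normalized by image area."""
    return 8.0 * num_bytes / (height * width)

def wins_at_matched_rate(candidate, baseline, rate_tol=0.01):
    """Each argument is (bpp, psnr_db). True only if the rates match within
    rate_tol and the candidate's PSNR is strictly higher."""
    (rate_c, quality_c), (rate_b, quality_b) = candidate, baseline
    return abs(rate_c - rate_b) <= rate_tol and quality_c > quality_b
```

A single rate point is a weak test. Comparing full rate-distortion curves, and repeating the check with perceptual metrics such as MS-SSIM and LPIPS, would be needed before calling the claim confirmed or falsified.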

Figures

Figures reproduced from arXiv: 2606.21655 by Dario Allegra, Lorenzo Catania.

**Figure 1.** Overall scheme of the compression pipeline adopted in PaaF. Note that the weight restart is not performed during the last epoch of each…

**Figure 2.** Architecture of the self-modulated layer (top) and the overall…

**Figure 3.** Quantitative experiments on the Kodak and CLIC2020…

**Figure 4.** Visual comparisons on CLIC2020 image details. Reported metrics are, in order, bits-per-pixel, PSNR, MS-SSIM and LPIPS.

Original abstract

Implicit Neural Representations (INRs) have recently emerged as a promising paradigm for image compression, offering a fundamentally different approach from traditional and learned codecs. Nevertheless, INR-based methods for image compression suffer from long encoding times and a consistent performance gap in classic quality metrics such as PSNR. In this work, we explore the potential of purely INR-based compression methods and we propose PaaF (Picture as a Function), a novel INR-based image codec that introduces improved architectural design, adaptive quantization, and an efficient entropy coding scheme. These components are designed to enhance rate-distortion performance while preserving the simplicity and parallelizability of INR-based decoding. Experimental results demonstrate consistent improvements over existing INR-based methods in both quantitative metrics and perceptual quality. These findings highlight the potential of INR-based approaches and contribute to narrowing the gap between functional representations and more established compression paradigms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

PaaF adds adaptive quantization and entropy coding to INR compression but the abstract supplies no numbers or comparisons to support the performance claims.

The main takeaway is that this is an incremental engineering paper on INR image compression. It takes the usual INR setup, layers on an improved architecture, adaptive quantization, and an entropy coding scheme, then asserts better rate-distortion numbers and perceptual quality while keeping decoding simple and parallel.

What the work does reasonably well is name the actual problems in the subfield: long encoding times and the persistent PSNR gap versus conventional codecs. Framing the fixes around those three components and explicitly trying to hold onto parallel decoding is a sensible constraint. The abstract also positions the contribution clearly as an extension rather than a new paradigm.

The soft spots are straightforward and central. The abstract states that experimental results show consistent improvements, yet it contains no metrics, no baselines, no error bars, and no direct comparisons to the prior INR methods it references. Without that data it is impossible to judge whether the claimed gains are real or how large they are. The parallelizability claim is also thin. Standard entropy coding is sequential; the abstract gives no indication of a block-parallel or otherwise non-serial implementation, so the stress-test concern stands on the evidence provided. If the full paper has the experiments and a parallel entropy module, that would change the picture, but nothing in the given text shows it.

This paper is aimed at the small set of researchers already working on implicit representations for compression. Someone in that niche might pick up the architectural or quantization details if the full manuscript supplies the missing results. Outside that group the paper has little to offer because the evidence is missing. It does not rise to the level that would justify sending it to peer review in its current form.

Referee Report

1 major / 0 minor

Summary. The paper proposes PaaF, an INR-based image codec that adds an improved architectural design, adaptive quantization, and an efficient entropy coding scheme. The central claim is that these three components raise rate-distortion performance and perceptual quality while preserving the simplicity and parallelizability of INR decoding; experimental results are said to show consistent gains over prior INR methods.

Significance. If the rate-distortion and parallelizability claims are substantiated with concrete measurements, the work would narrow the documented performance gap between INR representations and conventional codecs while retaining the parallel decoding advantage that distinguishes INR approaches. The absence of any quantitative tables, baselines, or latency figures in the supplied text prevents assessment of whether this potential is realized.

major comments (1)

[Abstract] The central claim that the three proposed components 'enhance rate-distortion performance while preserving the simplicity and parallelizability of INR-based decoding' is load-bearing, yet the text supplies neither the entropy-coding algorithm (arithmetic, range, ANS, etc.), its context model, nor any wall-clock decoding latency numbers on multi-threaded hardware. Standard entropy coders are sequential; without an explicit parallel variant or timing data the preservation assertion cannot be evaluated.
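One common way to reconcile a sequential entropy coder with parallel decoding is to code independent chunks, so that no chunk needs another chunk's decoder state. The sketch below illustrates that general idea only. It uses `zlib` (DEFLATE) from the Python standard library as a stand-in coder, which is an assumption made for runnability and not the scheme used in PaaF; `encode_chunks`, `decode_chunks`, and `chunk_size` are hypothetical names.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def encode_chunks(symbols, chunk_size):
    """Split byte-valued symbols (0..255) into chunks and compress each one independently."""
    data = bytes(symbols)
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def decode_chunks(chunks, workers=4):
    """Decompress chunks concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(zlib.decompress, chunks))
    return list(b"".join(parts))

# Hypothetical quantized weights, already mapped to small non-negative integers.
symbols = [i % 7 for i in range(1000)]
chunks = encode_chunks(symbols, chunk_size=256)
restored = decode_chunks(chunks)
```

The trade-off is rate: independent chunks cannot share statistics, and each carries its own header, so smaller chunks mean more parallelism but a larger bitstream. Timing data of the kind the referee asks for would show where that balance sits in practice.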

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed comment. We agree that the abstract claim regarding entropy coding and parallelizability requires concrete support and will revise the manuscript to address this.

Referee: [Abstract] The central claim that the three proposed components 'enhance rate-distortion performance while preserving the simplicity and parallelizability of INR-based decoding' is load-bearing, yet the text supplies neither the entropy-coding algorithm (arithmetic, range, ANS, etc.), its context model, nor any wall-clock decoding latency numbers on multi-threaded hardware. Standard entropy coders are sequential; without an explicit parallel variant or timing data the preservation assertion cannot be evaluated.

Authors: We agree that the submitted manuscript does not supply the requested details on the entropy-coding algorithm, context model, or latency measurements, which prevents full evaluation of the parallelizability claim. In the revised version we will (1) expand the abstract to name the entropy coder and briefly note its context model, (2) add a dedicated subsection describing the full entropy-coding procedure, and (3) include wall-clock decoding latency figures measured on multi-threaded hardware. These additions will allow readers to assess whether the overall pipeline preserves the parallel decoding advantage of INR methods.

Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical proposal without self-referential derivations

The paper introduces PaaF as an INR-based codec with three new components (architectural design, adaptive quantization, efficient entropy coding) and supports its claims solely via experimental comparisons on rate-distortion and perceptual metrics. No equations, fitted parameters, uniqueness theorems, or ansatzes appear in the abstract or described structure; the central assertions reduce to measured performance deltas rather than any quantity defined in terms of itself or prior self-citations. The parallelizability claim is an engineering assertion open to external verification and does not constitute a derivation that collapses by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities can be identified.

Reference graph

Works this paper leans on

26 extracted references

[1] G. K. Wallace, The JPEG still picture compression standard, IEEE Transactions on Consumer Electronics (1992)

[2] N. Barman, M. G. Martini, An evaluation of the next-generation image coding standard AVIF, in: International Conference on Quality of Multimedia Experience, 2020

[3] J. Ballé, V. Laparra, E. P. Simoncelli, End-to-end optimized image compression, in: International Conference on Learning Representations, 2017

[4] X. Pan, Z. Guo, Z. Chen, Analyzing time complexity of practical learned image compression models, in: International Conference on Visual Communications and Image Processing, 2021

[5] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, R. Ng, NeRF: Representing scenes as neural radiance fields for view synthesis, in: European Conference on Computer Vision, 2020

[6] V. Sitzmann, J. N. Martel, A. W. Bergman, D. B. Lindell, G. Wetzstein, Implicit neural representations with periodic activation functions, in: Neural Information Processing Systems, 2020

[7] E. Dupont, A. Goliński, M. Alizadeh, Y. W. Teh, A. Doucet, COIN: Compression with implicit neural representations, arXiv:2103.03123 (2021)

[8] Y. Strümpler, J. Postels, R. Yang, L. V. Gool, F. Tombari, Implicit neural representations for image compression, in: European Conference on Computer Vision, 2022

[9] T. Ladune, P. Philippe, F. Henry, G. Clare, T. Leguay, COOL-CHIC: Coordinate-based low complexity hierarchical image codec (2023)

[10] J. J. Park, P. Florence, J. Straub, R. Newcombe, S. Lovegrove, DeepSDF: Learning continuous signed distance functions for shape representation, in: Computer Vision and Pattern Recognition, 2019

[11] M. Tancik, P. P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. T. Barron, R. Ng, Fourier features let networks learn high frequency functions in low dimensional domains, Neural Information Processing Systems (2020)

[12] V. Saragadam, D. LeJeune, J. Tan, G. Balakrishnan, A. Veeraraghavan, R. G. Baraniuk, WIRE: Wavelet implicit neural representations, arXiv:2301.05187 (2022)

[13] H. Chen, B. He, H. Wang, Y. Ren, S. N. Lim, A. Shrivastava, NeRV: Neural representations for videos, Neural Information Processing Systems (2021)

[14] L. Catania, D. Allegra, NIF: A fast implicit image compression with bottleneck layers and modulated sinusoidal activations, in: ACM International Conference on Multimedia, 2023

[15] J. Alakuijala, A. Farruggia, P. Ferragina, E. Kliuchnikov, R. Obryk, Z. Szabadka, L. Vandevenne, Brotli: A general-purpose data compressor, ACM Transactions on Information Systems (2018)

[16] R. Bamler, Understanding entropy coding with asymmetric numeral systems (ANS): a statistician's perspective, arXiv:2201.01741 (2022)

[17] W. Shi, J. Caballero, F. Huszar, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, Z. Wang, Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network, in: Conference on Computer Vision and Pattern Recognition, 2016

[18] I. Mehta, M. Gharbi, C. Barnes, E. Shechtman, R. Ramamoorthi, M. Chandraker, Modulated periodic activations for generalizable local functional representations, in: International Conference on Computer Vision, 2021

[19] Y. Xie, K. L. Cheng, Q. Chen, Enhanced invertible encoding for learned image compression, in: ACM International Conference on Multimedia, 2021

[20] Z. Wang, E. Simoncelli, A. Bovik, Multiscale structural similarity for image quality assessment, in: Asilomar Conference on Signals, Systems & Computers, 2003

[21] Z. Wang, A. Bovik, H. Sheikh, E. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing (2004)

[22] Kodak, Lossless true color image suite (1999)

[23] G. Toderici, W. Shi, R. Timofte, L. Theis, J. Ballé, E. Agustsson, N. Johnston, F. Mentzer, Challenge on learned image compression (CLIC2020), in: Computer Vision and Pattern Recognition Workshops, 2020

[24] F. Rivas-Manzaneque, A. Ribeiro, O. Avila-García, ICE: Implicit coordinate encoder for multiple image neural representation, IEEE Transactions on Image Processing (2023)

[25] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, O. Wang, The unreasonable effectiveness of deep features as a perceptual metric, in: Conference on Computer Vision and Pattern Recognition, 2018

[26] C. Herglotz, H. Och, A. Meyer, G. Ramasubbu, L. Eichermüller, M. Kränzler, F. Brand, K. Fischer, D. T. Nguyen, A. Regensky, A. Kaup, The Bjøntegaard bible: why your way of comparing video codecs may be wrong, IEEE Transactions on Image Processing (2024)
