GS-SCNet unifies 3D Gaussian Splatting with a disparity-guided semantic codec and direct Gaussian parameter prediction for efficient real-time 3D video communications with strong generalization.
End-to-end optimized image compression
16 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
SAD is a new explicit differentiable image representation based on soft anisotropic additively weighted Voronoi partitions that achieves higher PSNR and 4-19x faster training than Image-GS and Instant-NGP at matched bitrate.
NDGI compresses temporal lightmaps via neural feature maps and lightweight networks, delivering high-quality dynamic global illumination with low storage and modest real-time decompression cost.
Finite scalar quantization simplifies VQ-VAE latents by independently rounding a few dimensions to fixed levels, producing an equivalent-sized implicit codebook with competitive performance and no collapse.
MambaRaw uses SSM-based context modeling with TileMambaBlock and EAR modules for efficient JPEG-guided 4K raw reconstruction, reporting 1.2-1.4 dB PSNR gains over baselines and 9% lower latency on Sony, Olympus, and Samsung datasets.
MoECodec replaces FFN layers with token-wise MoE plus stable routing and GShMLP experts to support multiple downstream tasks in a single image compression model.
ECC integrates hyperprior side information, channel-wise context, latent residual prediction, temporal modeling, and entropy skip into a learned entropy model, yielding 39.9% and 76.3% average BD-rate reductions on ViSQOL and PESQ over baselines.
Few-step generative models can be reformulated as lossy codecs in the reverse channel coding framework without retraining, yielding faster encoding/decoding on low-resolution image benchmarks.
RDVQ enables joint rate-distortion optimization for vector-quantized generative image compression via differentiable codebook distribution relaxation and an autoregressive entropy model.
FrequencyFormer co-designs a multi-scale DCT tokenizer, LUT-based near-sensor hardware, and modified MIPI communication to enable frequency-domain ViT inference with up to 128x data reduction and 230x lower communication energy.
Derives optimality constraints for nonnegative joint dictionary learning that explain observed SAE behaviors such as feature splitting, absorption, and dense antipodal features.
A practical learned image codec delivers 2.3-3x bitrate savings over AV1/VVC and 20-40% over prior learned codecs while encoding 12MP images in 230ms on iPhone.
SAMIC introduces semantic-aware Mamba blocks and SVD-based redundancy reduction to achieve efficient perceptual image compression with improved rate-distortion-perception tradeoffs.
ActDiff-VC partitions video into segments, transmits adaptive keyframes and budget-aware point trajectories, and reconstructs frames via conditional diffusion, reporting up to 64.6% bitrate reduction at matched NIQE on UVG and MCL-JCV.
A bilinear CNN that fuses features from a distortion-type classifier and an image classifier achieves superior BIQA performance on both synthetic and authentic distortion databases.
DinoLink uses saliency-aware token pruning plus residual vector quantization to cut V2X bitrate by 139x while reporting 32.8% mAP on nuScenes.
citing papers explorer
Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments