INT vs. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 2

citing papers by year: 2026 (7), 2025 (1)

representative citing papers

QuantClaw: Precision Where It Matters for OpenClaw

cs.AI · 2026-04-24 · unverdicted · novelty 6.0

QuantClaw dynamically routes precision in agent workflows to cut cost by up to 21.4% and latency by 15.7% while keeping or improving task performance.

MixFP4: Enhancing NVFP4 with Adaptive FP4/INT4 Block Representations

cs.AR · 2026-05-29 · unverdicted · novelty 5.0

MixFP4 extends NVFP4 by adaptively selecting between two FP4 micro-formats per block using repurposed scale sign bits and a unified E2M2 compute path, claiming better accuracy than standard NVFP4 at 3.1% area and 1.5% power overhead.

Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding

cs.AR · 2026-05-26 · unverdicted · novelty 5.0

Cassandra is a self-speculative decoding system that builds a draft model via fine-grained data selection and optimized pruning/mantissa truncation, achieving up to 2.41x speedup over BF16 and 1.81x more tokens than Eagle-3 on Llama 3 8B without training.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • MixFP4: Enhancing NVFP4 with Adaptive FP4/INT4 Block Representations cs.AR · 2026-05-29 · unverdicted · none · ref 3

  • Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding cs.AR · 2026-05-26 · unverdicted · none · ref 3
