hub Canonical reference

Roofline: An insightful visual performance model for multicore architectures

Canonical reference. 80% of classified citations from Pith papers cite this work as background.

31 Pith papers citing it
Background 80% of classified citations

citation-role summary

background 4 · method 1

representative citing papers

Apple Neural Engine: Architecture, Programming, and Performance

cs.AR · 2026-06-21 · unverdicted · novelty 8.0

The paper provides reverse-engineered documentation of the Apple Neural Engine's architecture, dispatch mechanisms, weight compression, and roofline performance, based on measurements from M1 and M5 chips and analysis of private runtime components.

Enabling AI ASICs for Zero Knowledge Proof

cs.AR · 2026-04-20 · conditional · novelty 8.0

MORPH reformulates ZKP MSM and NTT kernels into GEMM operations for TPUs using a new Big-T complexity model, achieving up to 10x NTT throughput over GZKP.

KernelSight-LM: A Kernel-Level LLM Inference Simulator

cs.PF · 2026-06-26 · unverdicted · novelty 6.0 · 2 refs

KernelSight-LM simulates LLM inference at kernel granularity with cross-generation (12.1% per-kernel error) and target-measured (3.8% error) tiers, yielding end-to-end median errors of 15.4%/12.8%/3.0% and 14.3%/6.2%/2.7% for TTFT/TPOT/throughput across six model families.

The Recurrent Transformer: Greater Effective Depth and Efficient Decoding

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

Recurrent Transformers add per-layer recurrent memory via self-attention on own activations plus a tiling algorithm that reduces training memory traffic, yielding better C4 pretraining cross-entropy than parameter-matched standard transformers with fewer layers.
