Training deep neural networks with low precision multiplications

Jean-Pierre David; Matthieu Courbariaux; Yoshua Bengio

arXiv: 1412.7024 · v5 · pith:B26NHBQ4new · submitted 2014-12-22 · 💻 cs.LG · cs.CV · cs.NE

classification 💻 cs.LG · cs.CV · cs.NE

keywords networks · multiplications · neural · point · precision · training · datasets · deep

Multipliers are the most space- and power-hungry arithmetic operators in digital implementations of deep neural networks. We train a set of state-of-the-art neural networks (Maxout networks) on three benchmark datasets: MNIST, CIFAR-10 and SVHN. Each network is trained with three distinct formats: floating point, fixed point and dynamic fixed point. For each dataset and each format, we assess the impact of the precision of the multiplications on the final error after training. We find that very low precision is sufficient not just for running trained networks but also for training them. For example, it is possible to train Maxout networks with 10-bit multiplications.
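
To make the number formats named in the abstract concrete, here is a minimal Python sketch of simulated fixed-point multiplication. It is an illustration under stated assumptions, not the authors' implementation: the function names (`to_fixed_point`, `choose_frac_bits`, `low_precision_dot`), the round-to-nearest and saturation behaviour, and the max-magnitude rule for picking the scaling factor are all hypothetical choices. The abstract does not say how the dynamic fixed point format sets its scaling factors.

```python
import math


def to_fixed_point(x, total_bits=10, frac_bits=5):
    """Round x onto a signed fixed-point grid and saturate at the range limits.

    Assumption: round-to-nearest with saturation; the abstract does not
    specify the rounding or overflow behaviour.
    """
    scale = 2 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    q = round(x * scale)              # integer code for x
    q = max(q_min, min(q_max, q))     # saturate instead of wrapping around
    return q / scale


def choose_frac_bits(values, total_bits=10):
    """Pick fractional bits so the largest magnitude in a group fits.

    Hypothetical stand-in for a dynamic fixed-point scaling rule: one
    shared scaling factor per group of values, chosen from the peak.
    """
    peak = max(abs(v) for v in values)
    if peak == 0:
        return total_bits - 1
    int_bits = math.floor(math.log2(peak)) + 1  # bits for the integer part
    return total_bits - 1 - int_bits            # one bit is kept for the sign


def low_precision_dot(weights, inputs, total_bits=10, frac_bits=5):
    """Quantize both operands of every multiplication; accumulate in float."""
    return sum(
        to_fixed_point(w, total_bits, frac_bits)
        * to_fixed_point(a, total_bits, frac_bits)
        for w, a in zip(weights, inputs)
    )


if __name__ == "__main__":
    group = [0.1, 0.3, -0.4]
    frac = choose_frac_bits(group, total_bits=10)
    print(frac, [to_fixed_point(v, 10, frac) for v in group])
    print(low_precision_dot([0.5, 0.25], [1.0, 2.0]))
```

With a fixed `frac_bits`, every value shares one range and resolution. Choosing `frac_bits` per group lets small-magnitude groups keep more fractional precision, which is the general idea behind a dynamic format.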

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU
cs.DC · 2026-05 · conditional · novelty 7.0

LlamaWeb is a WebGPU backend for llama.cpp that uses static memory planning, tunable kernels, and templated multi-precision support to cut memory use by 29-33% and raise decode throughput by 45-69% versus prior browse...

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
cs.LG · 2022-08 · conditional · novelty 7.0

LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
cs.CV · 2017-04 · accept · novelty 7.0

MobileNets introduce depthwise separable convolutions plus width and resolution multipliers to produce efficient CNNs that trade off latency and accuracy for mobile and embedded vision applications.

Computing $k$-means in mixed precision
math.NA · 2024-07 · conditional · novelty 4.0

Mixed-precision Lloyd's k-means remains stable and effective for normalized data in clustering and image segmentation tasks, with care needed for unnormalized data to avoid overflow.