BinaryConnect: Training Deep Neural Networks with binary weights during propagations

Matthieu Courbariaux; Yoshua Bengio; Jean-Pierre David

arXiv: 1511.00363 · v3 · pith:NWGBSLYEnew · submitted 2015-11-02 · cs.LG · cs.CV · cs.NE

classification: cs.LG, cs.CV, cs.NE

keywords: weights, binaryconnect, training, binary, deep, networks, neural, results

Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. In the past, GPUs enabled these breakthroughs because of their greater computational speed. In the future, faster computation at both training and test time is likely to be crucial for further progress and for consumer applications on low-power devices. As a result, there is much interest in research and development of dedicated hardware for Deep Learning (DL). Binary weights, i.e., weights which are constrained to only two possible values (e.g. -1 or 1), would bring great benefits to specialized DL hardware by replacing many multiply-accumulate operations with simple accumulations, as multipliers are the most space- and power-hungry components of the digital implementation of neural networks. We introduce BinaryConnect, a method which consists of training a DNN with binary weights during the forward and backward propagations, while retaining the precision of the stored weights in which gradients are accumulated. We show that BinaryConnect, like other dropout schemes, acts as a regularizer, and we obtain near state-of-the-art results with BinaryConnect on the permutation-invariant MNIST, CIFAR-10 and SVHN.
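
The two ideas in the abstract (propagating with binary weights while accumulating gradients in stored real-valued weights, and replacing multiply-accumulates with plain accumulations) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `binarize`, `binaryconnect_step`, and `binary_dot` are hypothetical, the model is a single linear layer with squared-error loss, binarization is a deterministic sign function, and clipping the stored weights to [-1, 1] is an assumption that the abstract does not state.

```python
import numpy as np

def binarize(w):
    # Deterministic binarization (one possible choice): map each weight to -1 or +1.
    return np.where(w >= 0, 1.0, -1.0)

def binary_dot(x, w_bin):
    # With weights in {-1, +1}, a dot product needs no multiplications:
    # add the inputs whose weight is +1 and subtract those whose weight is -1.
    return x[:, w_bin > 0].sum(axis=1) - x[:, w_bin < 0].sum(axis=1)

def binaryconnect_step(w_real, x, y, lr=0.05):
    """One SGD step on a linear model with squared-error loss (toy setting)."""
    w_bin = binarize(w_real)           # binary weights used during propagation
    pred = binary_dot(x, w_bin)        # forward pass with binary weights
    err = pred - y
    grad = x.T @ err / len(x)          # gradient taken with respect to the binary weights
    w_real = w_real - lr * grad        # update accumulated in the real-valued weights
    return np.clip(w_real, -1.0, 1.0)  # assumption: keep the stored weights bounded

# Toy data: targets generated by a hidden vector of +/-1 weights (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))
w_true = np.array([1, -1, 1, 1, -1, -1, 1, -1], dtype=float)
y = x @ w_true

w_real = rng.uniform(-0.1, 0.1, size=8)  # stored real-valued weights
for _ in range(200):
    w_real = binaryconnect_step(w_real, x, y)

w_bin = binarize(w_real)  # weights that would be deployed at test time
```

After training, `w_bin` can be compared against `w_true`, and `binary_dot(x, w_bin)` can be checked against `x @ w_bin` to confirm that the accumulation-only form computes the same product.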

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Geometric Analysis of Sign-Magnitude Asymmetry in a ReLU + RMSNorm Block under Ternary Quantization
cs.LG · 2026-05 · unverdicted · novelty 6.0
Sign-flip perturbations produce π/(π-2) ≈ 2.75 times more transverse output energy than equal-norm sign-preserving perturbations in a ReLU + RMSNorm block because ReLU creates directional asymmetry that RMSNorm's tran...

Learning Instance-wise Sparsity for Accelerating Deep Models
cs.CV · 2019-07 · unverdicted · novelty 5.0
Instance-wise sparsity is learned via feature decay regularization to accelerate deep model inference by pruning uninformative intermediate features per data instance.

New pointwise convolution in Deep Neural Networks through Extremely Fast and Non Parametric Transforms
cs.CV · 2019-06 · unverdicted · novelty 5.0
Replacing pointwise convolutions with DWHT yields a model with 79.1% fewer parameters, 48.4% fewer FLOPs, and 1.49% higher accuracy than MobileNet-V1 on CIFAR-100.

VeloxQ: A Fast and Efficient QUBO Solver
quant-ph · 2025-01 · unverdicted · novelty 4.0
VeloxQ is a classical QUBO solver that reports competitive or superior performance and unique scalability to 10^8-variable sparse instances across benchmarks against quantum annealers, physics-inspired methods, and co...

Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems
cs.LG · 2019-07 · unverdicted · novelty 4.0
Experiments show that shifted-ReLU layers can replace batch-normalization in single-bit-weight wide residual networks on CIFAR-10/100 and ImageNet without consistent accuracy penalty.