A Biologically Plausible Learning Rule for Deep Learning in the Brain

Isabella Pozzi; Pieter Roelfsema; Sander Bohté

arxiv: 1811.01768 · v3 · pith:MICJWXMGnew · submitted 2018-11-05 · 💻 cs.NE

keywords: learning, biologically, deep, brain, networks, tasks, CIFAR100, complexity

Researchers have proposed that deep learning, which is driving important progress in a wide range of high-complexity tasks, might inspire new insights into learning in the brain. However, the methods used for deep learning in artificial neural networks are biologically unrealistic and would need to be replaced by biologically realistic counterparts. Previous biologically plausible reinforcement learning rules, like AGREL and AuGMEnT, showed promising results but focused on shallow networks with three layers. Will these learning rules also generalize to networks with more layers, and can they handle tasks of higher complexity? We demonstrate the learning scheme on classical and hard image-classification benchmarks, namely MNIST, CIFAR10 and CIFAR100, cast as direct reward tasks, for fully connected, convolutional and locally connected architectures. We show that our learning rule, Q-AGREL, performs comparably to supervised learning via error-backpropagation, with this type of trial-and-error reinforcement learning requiring only 1.5-2.5 times more epochs, even when classifying 100 different classes as in CIFAR100. Our results provide new insights into how deep learning may be implemented in the brain.
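
As a rough illustration of what casting classification as a direct reward task can look like, here is a minimal sketch of a single trial. It is not the authors' Q-AGREL rule: it assumes a single linear layer, epsilon-greedy exploration, and a 0/1 reward, and the name `reward_task_trial` and its parameters are hypothetical. The point it shows is that the learner never receives the correct label as a target; it only learns whether its chosen class was rewarded, and only the chosen output is updated.

```python
import random

def reward_task_trial(weights, x, label, lr=0.1, epsilon=0.05, rng=random):
    """Run one trial; `weights` is a list of per-class weight rows (illustrative)."""
    # Predicted reward for choosing each class, given input x.
    q = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]

    # Assumed exploration scheme: mostly greedy, occasionally a random class.
    if rng.random() < epsilon:
        action = rng.randrange(len(q))
    else:
        action = max(range(len(q)), key=q.__getitem__)

    # The environment only reveals whether the guess was correct.
    reward = 1.0 if action == label else 0.0

    # Reward prediction error for the chosen class only.
    delta = reward - q[action]

    # Only the selected output unit's weights are changed.
    weights[action] = [w_i + lr * delta * x_i
                       for w_i, x_i in zip(weights[action], x)]
    return action, reward, delta
```

Because each trial carries feedback about a single class rather than the full target vector, such a scheme would be expected to need more passes over the data than supervised training, which is in line with the 1.5-2.5 times more epochs reported in the abstract.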

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Spike-based alignment learning solves the weight transport problem
q-bio.NC · 2025-03 · unverdicted · novelty 6.0

SAL is a spike-timing-based local learning rule that aligns feedback weights to forward weights in spiking networks by exploiting noise and Hebbian/anti-Hebbian plasticity to recover the true gradient.