Bilinear gating of motor primitives: a principle linking dendritic computation to rapid goal-directed adaptation

Andrea Ciardiello; Cristiano Capone; Luca Falorsi; Luca Manneschi

arxiv: 2606.10891 · v1 · pith:TLVK4WBLnew · submitted 2026-06-09 · 🧬 q-bio.NC

Pith reviewed 2026-06-27 10:48 UTC · model grok-4.3

keywords burst fraction · motor cortex · dendritic coincidence · bilinear gating · goal-directed movement · layer-5 pyramidal neurons · reinforcement learning · macaque

The pith

In macaque motor cortex, a neuron's burst fraction encodes reach direction more selectively than its firing rate; the paper attributes this to dendritic coincidence detection, which makes burst probability the product of goal and state signals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that burst fraction, rather than overall firing rate, selectively encodes goal information in motor cortex neurons. This dissociation arises from apical goal-related inputs coinciding with basal state-related drives in layer-5 pyramidal neurons, resulting in burst probability equaling the product of the two signals. A minimal two-compartment spiking model reproduces this bilinear gating, and when incorporated into a reinforcement-learning agent, it enables zero-shot generalization to new goals and rapid online adaptation. A reader would care because it connects a cellular mechanism directly to computational advantages in motor control and learning.

Core claim

Burst fraction of neurons in motor cortex encodes reach direction far more selectively than firing rate, as a direct consequence of dendritic coincidence detection in layer-5 pyramidal neurons where goal-related apical input coincides with state-related basal drive, so that burst probability computes the bilinear gate G(g) Y(s). This same multiplicative gate, when used in a reinforcement-learning agent, supports zero-shot generalisation to new goals and rapid online adaptation.

What carries the argument

The bilinear gate G(g) Y(s): coincidence of apical goal input and basal state drive in layer-5 pyramidal neurons makes burst probability the product of the goal and state signals.
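
As a minimal illustration of why coincidence can yield a product (a sketch, not the paper's two-compartment spiking model): if an apical event and a basal event occur independently on each trial, with probabilities standing in for G(g) and Y(s), the chance that both occur together is their product. The function name, the independence assumption, and the probabilities used below are illustrative choices, not values from the paper.

```python
import numpy as np

def burst_probability(p_apical, p_basal, n_trials=200_000, seed=0):
    """Monte Carlo estimate of the coincidence probability of two
    independent events (hypothetical stand-ins for G(g) and Y(s))."""
    rng = np.random.default_rng(seed)
    apical = rng.random(n_trials) < p_apical  # goal-related apical event
    basal = rng.random(n_trials) < p_basal    # state-related basal event
    # Assumption: a burst is emitted only when both events coincide.
    return float(np.mean(apical & basal))

estimate = burst_probability(0.6, 0.5)
analytic = 0.6 * 0.5  # product rule for independent events
print(f"estimated {estimate:.3f}, analytic {analytic:.3f}")
```

If the two inputs were correlated, the coincidence probability would no longer factor this way, which is one reason the compartmental assignment matters for the bilinear reading.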

If this is right

The observed dissociation between burst fraction and firing rate holds in all 12 recording sessions, spanning three animals and two laboratories.
A minimal two-compartment spiking model reproduces the selective encoding in bursts.
Embedding the multiplicative gate in an RL agent confers zero-shot generalisation to new goals.
The mechanism provides a computational rationale for segregating goal information into bursts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

This form of gating could allow similar rapid adaptation in other cognitive tasks involving goal and state separation.
Disrupting apical inputs specifically might eliminate the goal selectivity in bursts while preserving firing rates.
Similar bilinear computations might be identifiable in other cortical regions using burst fraction analysis.

Load-bearing premise

The dissociation is produced by apical goal-related input coinciding with basal state-related drive in layer-5 pyramidal neurons rather than other cellular or network mechanisms.

What would settle it

Simultaneously controlling apical and basal inputs to layer-5 neurons in vitro and measuring whether burst probability equals the product of the two input strengths.
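
One way such a measurement could be analysed, sketched under stated assumptions: tabulate burst probability over a grid of apical and basal input strengths. A purely multiplicative dependence gives a rank-1 table, so the ratio of the second to the first singular value should be close to zero, whereas an additive dependence leaves a clearly non-zero second singular value. The gain values and the additive comparison table are hypothetical, and real recordings would need a noise-aware statistical test rather than this noiseless check.

```python
import numpy as np

def rank1_residual(table):
    """Ratio of second to first singular value; near 0 for a product form."""
    s = np.linalg.svd(np.asarray(table, dtype=float), compute_uv=False)
    return float(s[1] / s[0])

# Hypothetical input strengths, not taken from the paper.
apical = np.array([0.1, 0.4, 0.8])
basal = np.array([0.2, 0.5, 0.9, 1.0])

product_table = np.outer(apical, basal)                       # P = G * Y
additive_table = 0.5 * (apical[:, None] + basal[None, :])     # P = (G + Y) / 2

print("product :", rank1_residual(product_table))
print("additive:", rank1_residual(additive_table))
```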

Figures

Figures reproduced from arXiv: 2606.10891 by Andrea Ciardiello, Cristiano Capone, Luca Falorsi, Luca Manneschi.

**Figure 1.** Burst fraction encodes reach direction more selectively than tonic spike rate. (A) Delayed center-out reaching task: after a hold and a go cue, the monkey reaches to one of several peripheral targets (red, active target). (B) Spikes of an example neuron across trials to its preferred direction, aligned to the go cue (500 ms window); each spike is classified as a burst spike if its interval to the preceding…

**Figure 2.** Burst-gated neural actor with voltage-based burst probability for goal-conditioned locomotion. (A) Two-compartment pyramidal neuron model. Feedforward basal current I_bas drives the somatic voltage V_s; top-down apical current I_api = g · IAPI_SCALE (IAPI_SCALE = 4.0) drives the dendritic voltage V_d. Burst requires co-activation of both compartments. (B) Somatic membrane-potential traces (V_s, 300 ms) under t…

**Figure 3.** MLP-based bilinear actor-critic architecture. A. Scheme of our architecture: the actor and critic are decomposed into K parallel basis modules, policy primitives Y_k(s) and value components ψ_k(s,a). B. Biological interpretation: sensory and contextual inputs affect the dynamics of pyramidal neurons in a multiplicative way. C. Scheme of the navigation task: a robot with 8 DOF is asked to move in a specific d…

**Figure 4.** Zero-shot learning. A. Task scheme: the MuJoCo Ant agent is pre-trained on target directions and tested on new ones. Pretrained bilinear agent is evaluated on unseen goal directions (or task descriptors) without any parameter update, by conditioning on g. B. Performance compared against baselines when switching to novel directions. C-D. Behavior trajectories for training and test directions, respectively, …

**Figure 5.** Zero-shot directional control and reward-weighted cold-start adaptation. (A) Actual movement direction as a function of commanded direction for three G magnitudes (½Ḡ, Ḡ, 2Ḡ, where Ḡ is the mean zero-shot norm). Across all amplitudes, the ant tracks the commanded direction closely (dashed line: identity), demonstrating that the direction of G controls movement direction independently of its magnitud…

**Figure 6.** Multiplicative context-gating enables robust generalisation across floor friction conditions. (A) Principal component analysis of the G-module activations collected over 100 evaluation episodes per friction level. Each point represents the gate vector g = G(s,c) at a single timestep, coloured by friction coefficient (blue = low, red = high). Stars denote training frictions; circles denote novel test fricti…

**Figure 7.** Gating representations in SAC locomotion on the Unitree Go1 quadruped. (a) MuJoCo simulation environment. The robot is rewarded for walking in a direction θ ∈ {−45°, 0°, +45°} (encoded as [cos θ, sin θ] appended to the observation), which changes randomly every 500 steps within each 1500-step episode. (b) Learning curves for four architectures (median ± IQR over 5 independent runs, averaged in chunks of 10…

Original abstract

Movement requires the motor cortex to specify both what action to produce and which goal it serves, yet how individual neurons separate these factors is not understood. Here we show that in macaque motor cortex the burst fraction of a neuron, the proportion of its spikes emitted in high-frequency bursts, encodes reach direction far more selectively than its overall firing rate. This dissociation is highly consistent: it holds in every one of 12 recording sessions spanning three animals and two laboratories (all p<10^{-12}) and survives controls that remove any contribution of firing rate, showing that goal information is concentrated specifically in bursts. We then show that this coding signature is the predicted consequence of dendritic coincidence detection in layer-5 pyramidal neurons: when a goal-related apical input coincides with a state-related basal drive the neuron bursts, so burst probability computes the product of goal and state, a bilinear gate G(g) Y(s). A minimal two-compartment spiking model reproduces the effect, and the same multiplicative gate, embedded in a reinforcement-learning agent, supports zero-shot generalisation to new goals and rapid online adaptation, providing a computational rationale for segregating goal information into bursts. These results identify burst fraction as a goal-selective code in motor cortex, tie it to a concrete cellular mechanism, and show that the mechanism confers a learning advantage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Burst fraction selectivity for goals is a solid empirical result, but the dendritic bilinear gate remains an inference without compartment-level evidence.

The data show that burst fraction encodes reach direction more selectively than mean rate in every session, but the link to apical-basal coincidence as a bilinear gate G(g)Y(s) is plausible rather than demonstrated.

The empirical dissociation stands out. It appears in all 12 sessions from three animals with p-values below 10^{-12} and survives the rate-removal controls. That is a clear observation worth noting on its own. The two-compartment model then shows how goal input on the apical side coinciding with state input on the basal side can produce burst probability as their product, which matches the reported pattern. Placing the same multiplicative gate inside an RL agent produces zero-shot generalization to new goals and faster online adaptation, which gives a computational reason why segregating the signal into bursts could be useful.
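
The multiplicative gate in the agent can be pictured as a gated sum of K primitives, a(s, g) = Σ_k G_k(g) Y_k(s). The sketch below is only an illustration of that form: the random weights, the layer sizes, the linear gate, and the single-layer tanh primitives are all simplifying assumptions, whereas the paper describes MLP-based modules whose details are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
K, STATE_DIM, GOAL_DIM, ACTION_DIM = 4, 6, 2, 3  # hypothetical sizes

W_Y = rng.normal(size=(K, ACTION_DIM, STATE_DIM))  # weights of primitives Y_k
W_G = rng.normal(size=(K, GOAL_DIM))               # weights of gates G_k

def primitives(s):
    """Y_k(s): one action proposal per module, shape (K, ACTION_DIM)."""
    return np.tanh(W_Y @ s)

def gates(g):
    """G_k(g): one scalar gate per module, shape (K,). Linear by assumption."""
    return W_G @ g

def bilinear_action(s, g):
    """a(s, g) = sum_k G_k(g) * Y_k(s), shape (ACTION_DIM,)."""
    return gates(g) @ primitives(s)

s = rng.normal(size=STATE_DIM)
g_east, g_north = np.array([1.0, 0.0]), np.array([0.0, 1.0])
g_new = g_east + g_north  # a goal combination used here as a "novel" goal
print(bilinear_action(s, g_new))
```

With a linear gate, the action for `g_new` is exactly the sum of the actions for `g_east` and `g_north`, so a new goal reuses the same state primitives without changing any weights. Whether the trained MLP gates in the paper behave this simply is not something this sketch can show.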

The paper handles the model part cleanly by keeping it minimal, so the logic is easy to check. The RL demonstration is straightforward and illustrates the claimed advantage without overclaiming a direct fit to the neural numbers.

The soft spot is exactly the one flagged in the stress-test note. Extracellular recordings do not resolve whether goal signals arrive apically, whether coincidence detection is necessary, or whether intrinsic bursting properties, inhibitory timing, or population effects could generate the same rate-independent selectivity. No compartment-specific measurements or targeted interventions are described, so the bilinear interpretation stays one hypothesis among alternatives. The RL section is also presented as a consequence rather than a parameter-matched test against the recorded statistics.

This is for motor neuroscientists and people modeling dendritic computation or adaptive control. A reader looking for a concrete cellular story tied to a learning advantage will get something usable from the data and the model, even if the mechanism needs more direct tests.

It deserves peer review. The statistical consistency is strong enough to warrant checking the full methods and controls, and the model is simple enough for referees to evaluate the claims directly.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that in macaque motor cortex, the burst fraction of layer-5 pyramidal neurons encodes reach direction more selectively than mean firing rate (consistent across all 12 sessions in 3 animals with p<10^{-12} and surviving rate-matched controls), that this dissociation arises because burst probability implements a bilinear gate G(g)Y(s) via apical goal-related coincidence with basal state-related drive, that a minimal two-compartment spiking model reproduces the effect, and that embedding the same multiplicative gate in an RL agent yields zero-shot generalization to new goals and rapid online adaptation.

Significance. If the mechanistic interpretation is supported, the work supplies a concrete cellular-to-computational link between dendritic coincidence detection and goal-directed motor adaptation, with the statistical consistency across sessions and the RL demonstration of learning advantages constituting clear strengths. The result would be significant for theories that separate goal and state representations at the single-neuron level.

major comments (2)

[Abstract / two-compartment spiking model] Abstract and the section presenting the two-compartment model: the central claim that burst probability computes the product G(g)Y(s) via apical-basal coincidence in L5 pyramids is an inference; the extracellular recordings establish only that burst fraction is more direction-selective than rate, while the model shows the dissociation is possible under the assumed compartmental assignment. No compartment-specific evidence (e.g., intracellular or optogenetic) is provided to rule out alternative cellular or circuit mechanisms (intrinsic bursting, inhibitory timing, or population synchrony), which is load-bearing for the proposed dendritic mechanism.
[RL agent section] RL demonstration section: the claim that the multiplicative gate supports zero-shot generalization and rapid adaptation is presented as a computational rationale, yet the manuscript does not report whether the RL parameters were tuned to match the observed neural statistics or chosen independently; without this, it is unclear whether the performance advantage is a genuine prediction or a consequence of the modeling choices.

minor comments (2)

[Methods] Methods: the controls that 'remove any contribution of firing rate' are described only at high level; explicit equations or pseudocode for how burst fraction is computed and how rate-matched surrogates are generated would improve clarity and reproducibility.
[Figures] Figure legends: several panels compare burst fraction and rate selectivity but do not state the exact statistical test or correction used for the reported p<10^{-12} values across sessions.
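
To make the first minor comment concrete, here is a hedged sketch of one common way to compute burst fraction from spike times. It assumes a spike counts as a burst spike when the interval to either neighbouring spike falls below a threshold; the 10 ms default and the two-sided rule are illustrative assumptions, not the criterion or value used in the paper.

```python
import numpy as np

def burst_fraction(spike_times, isi_threshold=0.010):
    """Fraction of spikes that belong to bursts (spike times in seconds).

    Assumption: a spike is a burst spike if the inter-spike interval to
    its preceding or following spike is shorter than isi_threshold.
    """
    t = np.sort(np.asarray(spike_times, dtype=float))
    if t.size == 0:
        return float("nan")  # undefined without spikes
    short = np.diff(t) < isi_threshold   # short[i]: interval between spike i and i+1
    in_burst = np.zeros(t.size, dtype=bool)
    in_burst[1:] |= short                # spike that follows a short interval
    in_burst[:-1] |= short               # spike that precedes a short interval
    return float(in_burst.mean())

# Toy spike train: a triplet, an isolated spike, then a doublet.
spikes = [0.000, 0.004, 0.008, 0.100, 0.300, 0.303]
print(burst_fraction(spikes))
```

Because the measure is a ratio of burst spikes to all spikes, scaling the overall spike count does not by itself change it, which is the property the rate-removal controls are meant to exploit.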

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help sharpen the presentation of our claims. We respond to each major comment below.

Referee: [Abstract / two-compartment spiking model] Abstract and the section presenting the two-compartment model: the central claim that burst probability computes the product G(g)Y(s) via apical-basal coincidence in L5 pyramids is an inference; the extracellular recordings establish only that burst fraction is more direction-selective than rate, while the model shows the dissociation is possible under the assumed compartmental assignment. No compartment-specific evidence (e.g., intracellular or optogenetic) is provided to rule out alternative cellular or circuit mechanisms (intrinsic bursting, inhibitory timing, or population synchrony), which is load-bearing for the proposed dendritic mechanism.

Authors: We agree that the bilinear-gate interpretation is a mechanistic hypothesis supported by the two-compartment model rather than a direct experimental demonstration from the extracellular data. The recordings establish the greater direction selectivity of burst fraction (with the reported statistical consistency), while the model shows that apical-basal coincidence can produce this dissociation. We will revise the abstract and model section to state explicitly that the dendritic mechanism is proposed on the basis of the model and is consistent with the data, while noting that alternative cellular or circuit mechanisms remain possible and are not ruled out by the present recordings.
Revision: yes
Referee: [RL agent section] RL demonstration section: the claim that the multiplicative gate supports zero-shot generalization and rapid adaptation is presented as a computational rationale, yet the manuscript does not report whether the RL parameters were tuned to match the observed neural statistics or chosen independently; without this, it is unclear whether the performance advantage is a genuine prediction or a consequence of the modeling choices.

Authors: The RL parameters were chosen independently of the recorded neural statistics to demonstrate the general computational benefit of the multiplicative gate; they were not fitted to the observed burst fractions or firing rates. We will add an explicit statement to this effect in the revised manuscript.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is deductive from independent observations

The paper reports an empirical dissociation (burst fraction more direction-selective than mean rate) that is statistically robust across sessions and survives rate controls. It then deduces that apical-basal coincidence in L5 pyramids produces burst probability as the product G(g)Y(s), demonstrates this with a minimal two-compartment model constructed from standard dendritic properties, and shows that the same gate confers RL advantages. None of these steps reduce to a fit of the target data, a self-citation chain, or a renaming; the neural statistics are external to the model construction and the bilinear form follows directly from the coincidence logic without parameter tuning to the observed selectivity ratios.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on the abstract alone, the central claim rests on the domain assumption that layer-5 pyramidal bursting implements a multiplicative coincidence gate; no explicit free parameters or new entities are introduced.

axioms (1)

domain assumption: Apical goal input coinciding with basal state drive produces high-frequency bursts in layer-5 pyramidal neurons.
This premise converts the empirical burst-fraction observation into the bilinear gate G(g) Y(s).

