Progressive Neural Networks
Pith reviewed 2026-05-12 16:07 UTC · model grok-4.3
The pith
Progressive neural networks learn sequences of tasks without forgetting by adding task-specific columns with lateral connections to prior features.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Progressive networks are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features, outperforming common baselines based on pretraining and finetuning across a wide variety of reinforcement learning tasks in Atari and 3D maze games.
What carries the argument
The progressive network architecture consisting of task-specific columns linked by lateral connections to features in all earlier columns.
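The column-and-lateral structure can be illustrated with a minimal pure-Python sketch. It assumes fully connected layers with the same widths in every column, plain linear lateral connections, and ReLU hidden units. The names `Column`, `forward`, and the layer sizes are hypothetical and chosen only for illustration; any adapter or scaling applied on the lateral path in the paper is omitted here.

```python
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class Column:
    """One task-specific column (hypothetical layout, no biases)."""
    def __init__(self, sizes, n_prev, rng):
        n_layers = len(sizes) - 1
        # W[i] maps this column's layer-i activations to layer i+1.
        self.W = [rand_matrix(sizes[i + 1], sizes[i], rng) for i in range(n_layers)]
        # U[i][j] maps layer-i activations of earlier column j into layer i+1
        # of this column. The first layer has no laterals because every
        # column reads the same raw input there.
        self.U = [
            [rand_matrix(sizes[i + 1], sizes[i], rng) for _ in range(n_prev)]
            if i > 0 else []
            for i in range(n_layers)
        ]

def forward(columns, x):
    """Return acts, where acts[k][i] is the layer-i activation of column k."""
    acts = []
    for col in columns:
        h = [x]
        for i, W in enumerate(col.W):
            pre = matvec(W, h[i])
            # Add lateral input from every earlier column j.
            for j, U in enumerate(col.U[i]):
                lat = matvec(U, acts[j][i])
                pre = [a + b for a, b in zip(pre, lat)]
            is_output = i == len(col.W) - 1
            h.append(pre if is_output else relu(pre))
        acts.append(h)
    return acts

rng = random.Random(0)
sizes = [4, 6, 6, 3]                      # hypothetical layer widths
x = [0.1, -0.2, 0.3, 0.4]

col1 = Column(sizes, n_prev=0, rng=rng)   # column for task 1
out_before = forward([col1], x)[0][-1]

col2 = Column(sizes, n_prev=1, rng=rng)   # column for task 2, reads from col1
out_after = forward([col1, col2], x)
```

In this sketch column 1 never reads from column 2, so adding the second column leaves the first column's output unchanged as long as its weights are held fixed, which is the sense in which forgetting is avoided by construction.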
If this is right
- The network can accumulate skills across a sequence of tasks without interference between them.
- Transfer of knowledge occurs at both low-level sensory features and high-level control policies.
- The approach outperforms standard pretraining and finetuning on Atari games and 3D navigation tasks.
- A sensitivity measure confirms the locations of useful feature reuse within the policy.
Where Pith is reading between the lines
- This column-based design may extend to domains outside reinforcement learning where tasks arrive over time.
- It could reduce the need to restart training from scratch when environments or goals change gradually.
- Scaling the number of columns might eventually require mechanisms to manage computational cost.
Load-bearing premise
Lateral connections between columns will reliably produce positive transfer across tasks without introducing harmful interference.
What would settle it
If progressive networks exhibit significant forgetting of prior tasks or underperform fine-tuning on a sequence of reinforcement learning tasks, the central claim would be falsified.
Original abstract
Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces progressive neural networks for continual learning in RL: a new column is added per task, prior columns are frozen to prevent forgetting, and lateral connections from previous columns to the new one enable transfer of features. The architecture is evaluated on Atari games and 3D maze navigation tasks, with claims of outperformance over pretraining and finetuning baselines plus a sensitivity analysis showing transfer at sensory and control layers.
Significance. If the performance gains can be shown to arise from the lateral transfer mechanism rather than capacity scaling, the approach offers a concrete, scalable architecture for avoiding catastrophic forgetting while reusing knowledge across tasks. This would be a useful contribution to multi-task and lifelong RL, with the sensitivity measure providing a starting point for analyzing where transfer occurs.
major comments (2)
- [Evaluation] Evaluation section: the central claim that lateral connections enable positive transfer (and thus outperformance) is not isolated from the fact that total model capacity grows linearly with the number of tasks. No capacity-matched baseline (e.g., a single larger network with equivalent total parameters) or lateral-connection ablation is reported, so the evidence that gains are due to transfer rather than extra parameters remains indirect.
- [Sensitivity analysis] The sensitivity measure is introduced to quantify cross-column influence, but without reported numerical values, error bars, or controls for task difficulty, it is unclear how strongly it supports the claim of transfer at both low- and high-level layers.
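The capacity concern in the first major comment can be made concrete by counting parameters. The sketch below is an illustration under stated assumptions, not the paper's accounting: it assumes fully connected columns with biases, linear lateral connections into every layer above the first, and hypothetical layer widths. The function names are invented for this example.

```python
def column_params(sizes):
    """Weights plus biases of one fully connected column."""
    return sum(sizes[i] * sizes[i + 1] + sizes[i + 1]
               for i in range(len(sizes) - 1))

def lateral_params(sizes, n_prev):
    """Lateral weights feeding one new column from n_prev earlier columns.

    Assumes a linear lateral connection into every layer above the first.
    """
    per_source = sum(sizes[i] * sizes[i + 1]
                     for i in range(1, len(sizes) - 1))
    return n_prev * per_source

def progressive_params(sizes, n_tasks):
    """Total parameters after n_tasks columns have been added."""
    return sum(column_params(sizes) + lateral_params(sizes, k)
               for k in range(n_tasks))

sizes = [64, 32, 32, 8]  # hypothetical layer widths
for n_tasks in (1, 2, 4):
    print(n_tasks, progressive_params(sizes, n_tasks))

# Parameter budget a capacity-matched single network would need after 4 tasks.
budget = progressive_params(sizes, 4)
```

Under these assumptions the column term grows linearly with the number of tasks, while the lateral term grows with the number of column pairs. A capacity-matched baseline of the kind the referee requests would be a single network sized to `budget`.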
minor comments (2)
- [Abstract] The abstract states outperformance but supplies no quantitative metrics, task counts, or statistical details; these should be summarized with key numbers and error bars for readers.
- [Methods] Notation for the lateral connections and column indexing could be clarified with a single diagram or equation set early in the methods.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment point by point below, acknowledging where the concerns are valid and indicating the revisions we will make to strengthen the manuscript.
Point-by-point responses
-
Referee: [Evaluation] Evaluation section: the central claim that lateral connections enable positive transfer (and thus outperformance) is not isolated from the fact that total model capacity grows linearly with the number of tasks. No capacity-matched baseline (e.g., a single larger network with equivalent total parameters) or lateral-connection ablation is reported, so the evidence that gains are due to transfer rather than extra parameters remains indirect.
Authors: We agree that the current evaluation does not fully isolate the contribution of lateral connections from the increase in total model capacity, as progressive networks add new columns (and thus parameters) for each task. The pretraining and finetuning baselines use fixed-capacity networks equivalent to a single column, which is the standard comparison in this setting, but a capacity-matched single-network baseline would indeed provide stronger evidence. We will add a dedicated discussion of this limitation in the revised manuscript and include an ablation or capacity-matched comparison where feasible with existing compute resources. This revision will clarify the role of the lateral transfer mechanism while preserving the core result that the architecture avoids catastrophic forgetting.
revision: partial
-
Referee: [Sensitivity analysis] The sensitivity measure is introduced to quantify cross-column influence, but without reported numerical values, error bars, or controls for task difficulty, it is unclear how strongly it supports the claim of transfer at both low- and high-level layers.
Authors: The sensitivity analysis is presented via figures in the manuscript showing relative influence across layers. To address this, we will revise the relevant section to explicitly report the numerical sensitivity values, include error bars from multiple runs, and add a brief discussion of how task difficulty was accounted for in the analysis. These additions will provide quantitative support for the observation that transfer occurs at both sensory and control layers.
revision: yes
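The reporting promised in the second response (numerical values with error bars across runs) could take roughly the following shape. This is a sketch under assumptions: it treats sensitivity as a non-negative raw score per column at a given layer and normalizes the scores so they sum to one across columns. The normalization choice, the function names, and the input numbers are placeholders for illustration and are not results from the paper.

```python
import statistics

def normalize(scores):
    """Scale per-column scores at one layer so they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]

def summarize(runs):
    """Summarize one layer across runs.

    runs[r][k] is the raw sensitivity score of column k in run r.
    Returns one (mean, sample standard deviation) pair per column.
    """
    normed = [normalize(r) for r in runs]
    n_cols = len(normed[0])
    summary = []
    for k in range(n_cols):
        values = [n[k] for n in normed]
        summary.append((statistics.mean(values), statistics.stdev(values)))
    return summary

# Placeholder scores for two columns over three runs (not real data).
runs = [[3.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
summary = summarize(runs)
for k, (mean, std) in enumerate(summary):
    print(f"column {k}: {mean:.2f} +/- {std:.2f}")
```

Reporting a mean and spread per layer and column in this way would let readers judge whether a difference between sensory and control layers exceeds run-to-run variation.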
Circularity Check
No circularity: empirical architecture evaluated on external benchmarks
full rationale
The paper introduces progressive neural networks as an architecture for continual RL, with lateral connections for transfer and frozen columns to prevent forgetting. It reports performance on Atari and 3D maze tasks against pretraining/finetuning baselines, plus a sensitivity analysis for transfer. No equations, first-principles derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text or abstract. The central claims rest on external empirical comparisons rather than internal definitions or tautological reductions, satisfying the self-contained benchmark criterion.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
HierarchyEmergence · hierarchy_emergence_forces_phi · contradicts?
CONTRADICTS: the theorem conflicts with this paper passage, or marks a claim that would need revision before publication.
Paper passage: "the addition of new capacity alongside pretrained networks gives these models the flexibility to both reuse old computations and learn new ones"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 60 Pith papers
-
ReConText3D: Replay-based Continual Text-to-3D Generation
ReConText3D is the first replay-memory framework for continual text-to-3D generation that prevents catastrophic forgetting on new textual categories while preserving quality on previously seen classes.
-
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning
LIBERO is a new benchmark for lifelong robot learning that evaluates transfer of declarative, procedural, and mixed knowledge across 130 manipulation tasks with provided demonstration data.
-
MedCRP-CL: Continual Medical Image Segmentation via Bayesian Nonparametric Semantic Modality Discovery
MedCRP-CL discovers semantic modalities online via CRP from text prompts and maintains modality-specific LoRA adapters with intra-modality EWC, achieving 73.3% Dice and 4.1% forgetting on 16 tasks while using 6x fewer...
-
Continual Learning of Domain-Invariant Representations
Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, med...
-
Matrix-Space Reinforcement Learning for Reusing Local Transition Geometry
MSRL represents trajectory segments as PSD matrices to prove additive composition properties and bootstrap value functions for better transfer, reaching 0.73 AUC versus 0.57-0.65 baselines.
-
KAN-CL: Per-Knot Importance Regularization for Continual Learning with Kolmogorov-Arnold Networks
KAN-CL cuts catastrophic forgetting by 88-93% on Split-CIFAR-10/5T and Split-CIFAR-100/10T by anchoring KAN parameters at per-knot granularity while matching baseline accuracy.
-
MIST: Reliable Streaming Decision Trees for Online Class-Incremental Learning via McDiarmid Bound
MIST fixes unreliable splits in streaming decision trees for class-incremental learning by using a K-independent McDiarmid bound on Gini impurity, Bayesian moment projection for knowledge transfer, and KLL quantile sk...
-
MIST: Reliable Streaming Decision Trees for Online Class-Incremental Learning via McDiarmid Bound
MIST fixes unreliable splits in streaming decision trees for class-incremental learning by replacing Hoeffding-style bounds with a K-independent McDiarmid radius on Gini, plus Bayesian parent-to-child inheritance and ...
-
Dynamic Full-body Motion Agent with Object Interaction via Blending Pre-trained Modular Controllers
A two-stage framework augments HOI data with dynamic priors and blends pre-trained dynamic motion and static interaction agents via a composer network to enable long-term dynamic human-object interactions with higher ...
-
Beyond Forgetting in Continual Medical Image Segmentation: A Comprehensive Benchmark Study
Benchmark experiments in continual medical image segmentation reveal that no single method satisfies all clinical requirements, with replay-based approaches offering the best stability-plasticity trade-off while forwa...
-
Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay
A structure-aware VAE generates realistic FC matrices for replay, combined with multi-level knowledge distillation and hierarchical contextual bandit sampling, to enable continual fMRI-based brain disorder diagnosis a...
-
EMBER: Autonomous Cognitive Behaviour from Learned Spiking Neural Network Dynamics in a Hybrid LLM Architecture
A hybrid SNN-LLM system uses learned spiking dynamics and lateral STDP propagation to trigger LLM actions without external prompts, producing the first autonomous action after 7 exchanges from a clean start.
-
SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning
SafeAdapt certifies a Rashomon set of safe policies from demonstration data and projects updates from arbitrary RL algorithms onto it to guarantee preservation of safety on source tasks.
-
SLE-FNO: Single-Layer Extensions for Task-Agnostic Continual Learning in Fourier Neural Operators
SLE-FNO achieves zero forgetting and strong plasticity-stability balance in continual learning for FNO surrogate models of pulsatile blood flow by adding minimal single-layer extensions across four out-of-distribution tasks.
-
Prism: Policy Reuse via Interpretable Strategy Mapping in Reinforcement Learning
PRISM transfers RL policies zero-shot by aligning causally validated discrete concepts from agent encoders, achieving 69-76% win rates in Go 7x7 but random performance in Atari Breakout.
-
Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting
The paper offers a comprehensive survey and proposes a new taxonomy for continual learning strategies in VLMs and MLLMs to combat catastrophic forgetting beyond traditional methods.
-
A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA
LoRA adapters should be scaled by 1/sqrt(rank) rather than 1/rank to stabilize learning and enable effective use of higher ranks during fine-tuning of large language models.
-
A Generalist Agent
Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.
-
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI Five achieved superhuman performance in Dota 2 by defeating the world champions using scaled self-play reinforcement learning.
-
NetTailor: Tuning the Architecture, Not Just the Weights
NetTailor adapts CNN architecture for new tasks by assembling pre-trained universal blocks with task-specific layers, trained via activation mimicry and complexity penalties to match accuracy while reducing size for s...
-
Understanding Goal Generalisation in Sequential Reinforcement Learning
Empirical analysis of over 100 sequential RL training pipelines across 250+ OOD environments finds salient features drive generalization and early goals persist, with latent policy gradients simulating latent variable...
-
Expandable, Compressible, Mineable: Open-World Thermal Image Restoration
ECMRNet is a continual-learning restoration network that decomposes features into isolated groups, expands new groups for novel degradations, prunes via structural entropy, and mines historical components for compound...
-
NeuroMAS: Multi-Agent Systems as Neural Networks with Joint Reinforcement Learning
NeuroMAS reframes multi-agent language systems as neural architectures where LLM agents learn coordination via reinforcement learning rather than predefined roles.
-
TFGN: Task-Free, Replay-Free Continual Pre-Training Without Catastrophic Forgetting at LLM Scale
TFGN is an architectural overlay for transformers enabling task-free, replay-free continual pre-training across heterogeneous domains at LLM scale with near-zero backward transfer and high gradient orthogonality.
-
MoRe: Modular Representations for Principled Continual Representation Learning on Sequential Data
MoRe identifies modular structure in representations themselves to enable principled reuse, alignment, and expansion of modules during continual adaptation on sequential data.
-
DIMoE-Adapters: Dynamic Expert Evolution for Continual Learning in Vision-Language Models
DIMoE-Adapters uses self-calibrated expert evolution and prototype-guided selection to dynamically grow and allocate experts, outperforming prior continual learning methods on vision-language models.
-
Shortcut Solutions Learned by Transformers Impair Continual Compositional Reasoning
BERT learns shortcut solutions that impair generalization and forward transfer in continual LEGO, while ALBERT learns loop-like solutions for better performance, yet both fail at cross-experience composition, with ALB...
-
MILE: Mixture of Incremental LoRA Experts for Continual Semantic Segmentation across Domains and Modalities
MILE combines incremental LoRA experts with prototype-guided gating to support continual semantic segmentation across domains and modalities while adding only a small number of parameters per task.
-
Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting
Sharpness-aware pretraining and related flat-minima interventions reduce catastrophic forgetting by up to 80% after post-training across 20M-150M models and by 31-40% at 1B scale.
-
NORACL: Neurogenesis for Oracle-free Resource-Adaptive Continual Learning
NORACL dynamically grows network capacity via neurogenesis-inspired signals to achieve oracle-level continual learning performance without pre-specifying architecture size.
-
Cortex-Inspired Continual Learning: Unsupervised Instantiation and Recovery of Functional Task Networks
FTN achieves near-zero forgetting on continual learning benchmarks by isolating task subnetworks via self-organizing binary masks generated through gradient descent, smoothing, and k-winner-take-all.
-
Learning Without Losing Identity: Capability Evolution for Embodied Agents
Embodied agents maintain a persistent identity while evolving capabilities via modular ECMs, raising simulated task success from 32.4% to 91.3% over 20 iterations with zero policy drift or safety violations.
-
Learning Without Losing Identity: Capability Evolution for Embodied Agents
Embodied agents maintain persistent identity while evolving modular capabilities through a closed-loop process, raising simulated task success from 32.4% to 91.3% with zero policy drift.
-
Information as Structural Alignment: A Dynamical Theory of Continual Learning
IBF achieves near-zero forgetting and positive backward transfer in continual learning by driving configurations toward coherence through motion and modification dynamics without storing raw data.
-
When Modalities Remember: Continual Learning for Multimodal Knowledge Graphs
MRCKG combines a multimodal-structural curriculum, cross-modal preservation, and contrastive replay to let multimodal knowledge graphs learn new entities and relations over time without catastrophic forgetting.
-
Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments
AMC models memory consolidation via a Liquid-Glass-Crystal process governed by an SDE with proven convergence to a Beta distribution, yielding 34-43% better forward transfer and 67-80% less forgetting on standard cont...
-
Evidence of an Emergent "Self" in Continual Robot Learning
Continual learning robots form a significantly more stable invariant subnetwork than constant-task controls, and preserving it improves adaptation while damaging it hurts performance.
-
Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning
CPNS regularization with dual counterfactual generators mitigates intra-task and inter-task spurious correlations in class-incremental learning feature expansion.
-
CrispEdit: Low-Curvature Projections for Scalable Non-Destructive LLM Editing
CrispEdit edits LLMs via low-curvature projections using Bregman divergence and K-FAC approximations, achieving high edit success with under 1% average capability degradation.
-
Robust Policy Optimization to Prevent Catastrophic Forgetting
FRPO applies a max-min robust optimization over KL-bounded policy neighborhoods during RLHF to reduce catastrophic forgetting of safety and accuracy under subsequent SFT or RL fine-tuning.
-
CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion
CLARE is an exemplar-free continual learning framework for VLAs that autonomously expands modular adapters based on feature similarity and uses autoencoder routing for label-free deployment.
-
Continually Evolving Skill Knowledge in Vision Language Action Model
Stellar VLA achieves continual learning in VLA models by maintaining a growing knowledge space and routing tasks to specialized experts conditioned on semantic relations, delivering strong LIBERO benchmark results wit...
-
Routing-Based Continual Learning for Multimodal Large Language Models
Routing architecture for MLLMs enables continual learning with constant compute, matching multi-task learning performance and supporting cross-modal transfer.
-
A Survey of Continual Reinforcement Learning
The paper surveys CRL literature, proposes a taxonomy of methods into four categories based on knowledge storage and transfer, reviews metrics and benchmarks, and outlines challenges and future research directions.
-
Little by Little: Continual Learning via Incremental Mixture of Rank-1 Associative Memory Experts
MoRAM frames continual learning as incremental addition of rank-1 adapters viewed as self-activating key-value associative memory units in a mixture-of-experts setup.
-
No Forgetting Learning: Buffer-free Continual Learning Classification
NFL is a buffer-free continual learning framework that decomposes networks, applies stepwise freezing with knowledge distillation, and adds an auto-encoder in NFL+ to match replay-based performance on image benchmarks...
-
Continual Domain Randomization
Continual Domain Randomization trains RL policies sequentially on randomization parameter subsets with continual learning to achieve robust sim-to-real transfer in robotic reaching and grasping.
-
Scalable and Efficient Continual Learning from Demonstration via a Hypernetwork-generated Stable Dynamics Model
A hypernetwork generates clock-augmented stable neural ODEs (sNODEs) for scalable continual learning from demonstration, achieving O(N) training time via stochastic regularization while outperforming baselines on LfD ...
-
Attentive Multi-Task Deep Reinforcement Learning
Attention mechanism dynamically groups task knowledge at state granularity in multi-task DRL to enable positive transfer and avoid negative transfer, matching or exceeding prior methods with fewer parameters.
-
Lifelong Learning Starting From Zero
A blank-slate neural network grows via expansion, generalization, forgetting, and backpropagation for lifelong learning with claimed gains in accuracy, efficiency, and versatility.
-
Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction
CDAN framework uses diversity exploration and adversarial self-correction for continual RL in continuous control, evaluated on new CAM environment with NSD metric showing 18.35% NSD improvement over baseline.
-
HyLoVQA: Dynamic Hypernetwork-Generated Low-Rank Adaptation for Continual Visual Question Answering
HyLoVQA combines an anchor memory bank with hypernetwork-generated LoRA adapters and an alignment loss to adapt to new VQA tasks while limiting interference with prior knowledge.
-
Tunable MAGMAX: Preference-Aware Model Merging for Continual Learning
Tunable MAGMAX adds a tunable preference vector to model merging for continual learning, enabling automatic adaptation to target environments using small amounts of data while maintaining or improving task-wise performance.
-
CP-MoE: Consistency-Preserving Mixture-of-Experts for Continual Learning
CP-MoE uses a transient expert, consistency-preserving routing bias, and guided regularization to reduce catastrophic forgetting in MoE-based LLMs and VLMs while preserving cross-task transfer, reporting SOTA on Super...
-
On the Stability of Growth in Structural Plasticity
Newborn units in growing neural networks are forward-active but backward-starved, receiving weaker gradients than existing units and creating integration challenges that make growth less reliable than pruning in compl...
-
MoRe: Modular Representations for Principled Continual Representation Learning on Sequential Data
MoRe identifies modular representations in sequential data for continual learning with identifiability guarantees, enabling principled adaptation without disrupting old modules.
-
MoRe: Modular Representations for Principled Continual Representation Learning on Sequential Data
MoRe decomposes representations into identifiable hierarchical modules to enable principled continual adaptation on sequential data.
-
FLAME: Adaptive Mixture-of-Experts for Continual Multimodal Multi-Task Learning
FLAME is an MoE architecture using modality-specific routers and low-rank compression of expert knowledge to support efficient continual multimodal multi-task learning while reducing catastrophic forgetting.
-
Learning Material-Aware Hamiltonian Risk Fields for Safe Navigation
A learned context-energy term in port-Hamiltonian policies creates selective risk navigation that activates evasive forces only when safer paths are available.
-
A Domain Incremental Continual Learning Benchmark for ICU Time Series Model Transportability
Proposes a domain incremental continual learning benchmark for ICU time series model transportability across US regions and evaluates data replay and EWC methods.