pith. machine review for the scientific record.

arxiv: 1312.6026 · v5 · submitted 2013-12-20 · 💻 cs.NE · cs.LG · stat.ML

Recognition: unknown

How to Construct Deep Recurrent Neural Networks

Authors on Pith: no claims yet
classification 💻 cs.NE · cs.LG · stat.ML
keywords deep · neural · rnns · recurrent · depth · function · networks · novel
0 comments
Original abstract

In this paper, we explore different ways to extend a recurrent neural network (RNN) to a deep RNN. We start by arguing that the concept of depth in an RNN is not as clear as it is in feedforward neural networks. By carefully analyzing and understanding the architecture of an RNN, however, we find three points of an RNN which may be made deeper: (1) input-to-hidden function, (2) hidden-to-hidden transition and (3) hidden-to-output function. Based on this observation, we propose two novel architectures of a deep RNN which are orthogonal to an earlier attempt of stacking multiple recurrent layers to build a deep RNN (Schmidhuber, 1992; El Hihi and Bengio, 1996). We provide an alternative interpretation of these deep RNNs using a novel framework based on neural operators. The proposed deep RNNs are empirically evaluated on the tasks of polyphonic music prediction and language modeling. The experimental result supports our claim that the proposed deep RNNs benefit from the depth and outperform the conventional, shallow RNNs.
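The three depth points named in the abstract map directly onto code. Below is a minimal NumPy sketch, not the authors' implementation: the layer sizes, weight names, and the exact form of the transition network are illustrative assumptions. It contrasts a conventional shallow RNN step with a deep-transition variant in which the hidden-to-hidden function passes through an extra intermediate layer; the input-to-hidden and hidden-to-output functions could be deepened in the same way.

```python
import numpy as np

class ShallowRNN:
    """Conventional RNN: one affine + tanh per step, covering the three
    functions the abstract names: input-to-hidden, hidden-to-hidden,
    hidden-to-output."""
    def __init__(self, n_in, n_hid, n_out, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.Wx = rng.normal(0.0, s, (n_in, n_hid))   # input-to-hidden
        self.Wh = rng.normal(0.0, s, (n_hid, n_hid))  # hidden-to-hidden
        self.Wy = rng.normal(0.0, s, (n_hid, n_out))  # hidden-to-output
        self.bh = np.zeros(n_hid)
        self.by = np.zeros(n_out)

    def step(self, x_t, h_prev):
        h_t = np.tanh(x_t @ self.Wx + h_prev @ self.Wh + self.bh)
        return h_t, h_t @ self.Wy + self.by


class DeepTransitionRNN(ShallowRNN):
    """Deep-transition sketch: the hidden-to-hidden step goes through an
    intermediate layer, so each transition is a deeper nonlinear map.
    (The inherited Wh is unused here; kept only to reuse the parent init.)"""
    def __init__(self, n_in, n_hid, n_out, n_mid=64, seed=0):
        super().__init__(n_in, n_hid, n_out, seed)
        rng = np.random.default_rng(seed + 1)
        s = 0.1
        self.Wxm = rng.normal(0.0, s, (n_in, n_mid))   # input -> intermediate
        self.Whm = rng.normal(0.0, s, (n_hid, n_mid))  # prev hidden -> intermediate
        self.Wmh = rng.normal(0.0, s, (n_mid, n_hid))  # intermediate -> new hidden
        self.bm = np.zeros(n_mid)

    def step(self, x_t, h_prev):
        m_t = np.tanh(x_t @ self.Wxm + h_prev @ self.Whm + self.bm)  # extra depth
        h_t = np.tanh(m_t @ self.Wmh + self.bh)
        return h_t, h_t @ self.Wy + self.by


# Tiny usage example on random data (illustrative only).
if __name__ == "__main__":
    rnn = DeepTransitionRNN(n_in=8, n_hid=32, n_out=8)
    h = np.zeros(32)
    for x_t in np.random.default_rng(1).normal(size=(5, 8)):
        h, y = rnn.step(x_t, h)
    print(y.shape)  # (8,)
```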

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Pointer Sentinel Mixture Models

    cs.CL 2016-09 conditional novelty 7.0

    Pointer sentinel-LSTM mixes context copying with softmax prediction to reach 70.9 perplexity on Penn Treebank using fewer parameters than standard LSTMs.

  2. Cortico-cerebellar modularity as an architectural inductive bias for efficient temporal learning

    q-bio.NC 2026-05 unverdicted novelty 5.0

    CB-RNNs with a cerebellar feedforward module learn temporal tasks faster than matched RNNs, with the module driving efficiency even after freezing the recurrent core as a fixed reservoir.