super hub Mixed citations

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

Mixed citation behavior. Most common role is background (62%).

125 Pith papers citing it
Background: 62% of classified citations

abstract

In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units. Also, we found GRU to be comparable to LSTM.

citation-role summary

background 8 · method 3 · baseline 1 · other 1

representative citing papers

CanViT: Toward Active-Vision Foundation Models

cs.CV · 2026-03-23 · conditional · novelty 8.0

CanViT is the first task- and policy-agnostic active-vision foundation model (AVFM), pretrained via passive-to-active dense latent distillation on 13.2M scenes and 1B random glimpses; it achieves 38.5% ADE20K mIoU in one glimpse and 84.5% ImageNet-1k top-1 accuracy after fine-tuning.

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

cs.LG · 2023-12-01 · unverdicted · novelty 8.0

Mamba is a linear-time sequence model built on input-dependent selective SSMs. It achieves state-of-the-art results across modalities and matches Transformers twice its size on language modeling, with 5x higher inference throughput.

AdaState: Self-Evolving Anchors for Streaming Video Generation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

AdaState replaces the static first-frame KV anchor with an evolving hidden latent that the model denoises alongside content, treating time as relative to enable recurrence and richer dynamics in streaming video generation.

Learning to Theorize the World from Observation

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

NEO is a probabilistic neural model that induces compositional programs as a learned Language of Thought from non-textual observations and executes them via a shared transition model to enable explanation-driven generalization.

Estimation–Prediction Tradeoff in Causal Probabilistic Temporal Graphs

cs.LG · 2026-06-26 · unverdicted · novelty 6.0

The paper characterizes an estimation-prediction tradeoff in binary logistic models for causal probabilistic temporal graphs and proposes a framework to jointly evaluate temporal link prediction with causal parameter recovery via Cramér-Rao bounds.

Kolmogorov-Arnold Reservoir Computing

cs.LG · 2026-06-18 · unverdicted · novelty 6.0 · 2 refs

KARC is a lightweight KAN-style reservoir that admits closed-form training and outperforms standard reservoir computing on PDE benchmarks at comparable cost.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Adaptive Learned State Estimation based on KalmanNet · cs.RO · 2026-04-02 · unverdicted · none · ref 17 · internal anchor

    AM-KNet adds sensor-specific modules, hypernetwork conditioning on target type and pose, and Joseph-form covariance estimation to KalmanNet, yielding better accuracy and stability than base KalmanNet on nuScenes and View-of-Delft data.

  • Physics-based Digital Twins for Integrated Thermal Energy Systems Using Active Learning · cs.LG · 2026-05-07 · unverdicted · none · ref 25 · internal anchor

    Active learning with physics-informed surrogates achieves comparable accuracy for a glycol heat exchanger digital twin using only one-fifth the high-fidelity simulation trajectories needed by random sampling.

  • A Survey on Deep Learning Techniques for Action Anticipation · cs.CV · 2023-09-29 · unverdicted · none · ref 146 · internal anchor

    A literature survey reviewing deep learning approaches to action anticipation in everyday scenarios, with method classifications, dataset and metric summaries, and future directions.