pith. machine review for the scientific record. sign in

hub

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

35 Pith papers cite this work. Polarity classification is still indexing.

35 Pith papers citing it
abstract

In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units. Also, we found GRU to be comparable to LSTM.

hub tools

claims ledger

  • abstract In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units. Also, we found GRU to be comparable to LSTM.

co-cited works

clear filters

representative citing papers

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

cs.LG · 2023-12-01 · unverdicted · novelty 8.0

Mamba is a linear-time sequence model using input-dependent selective SSMs that achieves SOTA results across modalities and matches twice-larger Transformers on language modeling with 5x higher inference throughput.

Learning to Theorize the World from Observation

cs.LG · 2026-05-05 · unverdicted · novelty 6.0

NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.

UniDetect: LLM-Driven Universal Fraud Detection across Heterogeneous Blockchains

cs.CR · 2026-04-14 · unverdicted · novelty 6.0

UniDetect is an LLM-based system that generates universal transaction summary texts and uses two-stage multimodal training on text plus graphs to detect fraudulent accounts across heterogeneous blockchains, outperforming baselines by 5.57-7.58% KS and achieving over 94.58% zero-shot cross-chain and

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

cs.CL · 2025-06-16 · unverdicted · novelty 6.0

MiniMax-M1 is a 456B parameter hybrid-attention MoE model trained with CISPO RL that achieves performance comparable or superior to DeepSeek-R1 and Qwen3-235B on reasoning and software engineering tasks while training in three weeks on 512 GPUs.

ReMedi: Reasoner for Medical Clinical Prediction

cs.CL · 2026-05-02 · unverdicted · novelty 5.0

ReMedi boosts LLM performance on EHR clinical predictions by up to 19.9% F1 through ground-truth-guided rationale regeneration and fine-tuning.

citing papers explorer

Showing 34 of 34 citing papers after filters.