arxiv: 2407.07239 · v3 · pith:EOKBE5XPnew · submitted 2024-07-09 · 💻 cs.LG · stat.ML

RotRNN: Modelling Long Sequences with Rotations

classification 💻 cs.LG stat.ML
keywords: linear, recurrent, rotrnn, long, modelling, performance, model, models

Linear recurrent neural networks, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long sequence modelling benchmarks. Despite their success, their empirical performance is not well understood and they come with a number of drawbacks, most notably their complex initialisation and normalisation schemes. In this work, we address some of these issues by proposing RotRNN -- a linear recurrent model which utilises the convenient properties of rotation matrices. We show that RotRNN provides a simple and efficient model with a robust normalisation procedure, and a practical implementation that remains faithful to its theoretical derivation. RotRNN also achieves competitive performance to state-of-the-art linear recurrent models on several long sequence modelling datasets.

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Survey of Mamba

    cs.LG 2024-08 unverdicted novelty 2.0

    The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.