RotRNN: Modelling Long Sequences with Rotations
read the original abstract
Linear recurrent neural networks, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long sequence modelling benchmarks. Despite their success, their empirical performance is not well understood and they come with a number of drawbacks, most notably their complex initialisation and normalisation schemes. In this work, we address some of these issues by proposing RotRNN -- a linear recurrent model which utilises the convenient properties of rotation matrices. We show that RotRNN provides a simple and efficient model with a robust normalisation procedure, and a practical implementation that remains faithful to its theoretical derivation. RotRNN also achieves competitive performance to state-of-the-art linear recurrent models on several long sequence modelling datasets.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
A Survey of Mamba
The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.