Efficiently Modeling Long Sequences with Structured State Spaces

Albert Gu, Karan Goel, Christopher Ré · 2021 · cs.LG · arXiv 2111.00396

Canonical reference. 77% of citing Pith papers cite this work as background.

128 Pith papers citing it

Background 77% of classified citations

abstract

A central goal of sequence modeling is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies. Although conventional models including RNNs, CNNs, and Transformers have specialized variants for capturing long dependencies, they still struggle to scale to very long sequences of $10000$ or more steps. A promising recent approach proposed modeling sequences by simulating the fundamental state space model (SSM) $x'(t) = Ax(t) + Bu(t),\ y(t) = Cx(t) + Du(t)$, and showed that for appropriate choices of the state matrix $A$, this system could handle long-range dependencies mathematically and empirically. However, this method has prohibitive computation and memory requirements, rendering it infeasible as a general sequence modeling solution. We propose the Structured State Space sequence model (S4) based on a new parameterization for the SSM, and show that it can be computed much more efficiently than prior approaches while preserving their theoretical strengths. Our technique involves conditioning $A$ with a low-rank correction, allowing it to be diagonalized stably and reducing the SSM to the well-studied computation of a Cauchy kernel. S4 achieves strong empirical results across a diverse range of established benchmarks, including (i) 91% accuracy on sequential CIFAR-10 with no data augmentation or auxiliary losses, on par with a larger 2-D ResNet, (ii) substantially closing the gap to Transformers on image and language modeling tasks, while performing generation $60\times$ faster, and (iii) SoTA on every task from the Long Range Arena benchmark, including solving the challenging Path-X task of length 16k that all prior work fails on, while being as efficient as all competitors.
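The state space model in the abstract can be made concrete with a small sketch. The code below is an illustration only, under loudly stated assumptions: it uses a diagonal $A$ (which makes the arithmetic elementwise and is not S4's low-rank-corrected parameterization), a bilinear discretization with a hypothetical step size, and made-up parameter values. All function names are hypothetical. It shows that the discretized SSM can be evaluated either step by step as a recurrence or all at once as a causal convolution with the kernel $K_j = C \bar{A}^j \bar{B}$, and that the two agree.

```python
# Sketch: discretize x'(t) = A x(t) + B u(t), y(t) = C x(t) + D u(t)
# with a DIAGONAL A (an illustrative assumption, not S4's structured matrix).

def discretize_bilinear(a_diag, b, step):
    """Bilinear discretization; elementwise because A is assumed diagonal."""
    a_bar = [(1 + step * a / 2) / (1 - step * a / 2) for a in a_diag]
    b_bar = [step * bi / (1 - step * a / 2) for a, bi in zip(a_diag, b)]
    return a_bar, b_bar

def run_recurrence(a_bar, b_bar, c, d, u):
    """x_k = A_bar x_{k-1} + B_bar u_k ; y_k = C x_k + D u_k, with x_{-1} = 0."""
    x = [0.0] * len(a_bar)
    ys = []
    for u_k in u:
        x = [a * xi + bi * u_k for a, xi, bi in zip(a_bar, x, b_bar)]
        ys.append(sum(ci * xi for ci, xi in zip(c, x)) + d * u_k)
    return ys

def ssm_kernel(a_bar, b_bar, c, length):
    """Convolution kernel K_j = C A_bar^j B_bar for j = 0..length-1."""
    return [
        sum(ci * (a ** j) * bi for ci, a, bi in zip(c, a_bar, b_bar))
        for j in range(length)
    ]

def run_convolution(kernel, d, u):
    """y_k = sum_{j<=k} K_j u_{k-j} + D u_k (direct O(L^2) causal convolution)."""
    return [
        sum(kernel[j] * u[k - j] for j in range(k + 1)) + d * u[k]
        for k in range(len(u))
    ]

# Hypothetical parameters, chosen only so the example runs.
a_diag = [-1.0, -0.5, -2.0]   # stable diagonal state matrix (assumption)
b = [1.0, 1.0, 1.0]
c = [0.5, -0.25, 1.0]
d = 0.0
step = 0.1
u = [1.0, 0.0, -1.0, 2.0, 0.5, 0.0, 0.0, 1.5]

a_bar, b_bar = discretize_bilinear(a_diag, b, step)
y_rec = run_recurrence(a_bar, b_bar, c, d, u)
y_conv = run_convolution(ssm_kernel(a_bar, b_bar, c, len(u)), d, u)
print(max(abs(p - q) for p, q in zip(y_rec, y_conv)))
```

The direct convolution here costs quadratic time in the sequence length and is written that way for readability; it says nothing about how S4 itself computes its kernel, which the abstract attributes to a Cauchy kernel computation.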

citation-role summary

background: 20 · method: 5 · baseline: 1

citation-polarity summary

background: 20 · use method: 5 · baseline: 1

authors

Albert Gu, Christopher Ré, Karan Goel

representative citing papers

Rotation Equivariant Mamba for Vision Tasks

cs.CV · 2026-03-10 · unverdicted · novelty 8.0

EQ-VMamba adds rotation-equivariant cross-scan and group Mamba blocks to enforce end-to-end rotation equivariance, yielding better rotation robustness, competitive accuracy, and roughly 50% fewer parameters than non-equivariant baselines across classification, segmentation, and super-resolution.

Test-Time Training with KV Binding Is Secretly Linear Attention

cs.LG · 2026-02-24 · conditional · novelty 8.0

Test-time training with KV binding reduces to learned linear attention.

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

cs.CL · 2023-09-28 · unverdicted · novelty 8.0

Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.

MASS: Motion-Aligned Selective Scan for Refinement in Flow-Based Video Frame Interpolation

cs.CV · 2026-06-26 · unverdicted · novelty 7.0

MASS reformulates SSM-based feature scanning in flow-based VFI to follow dynamic motion trajectories via learnable path integration and velocity-aware sampling, claiming SOTA on challenging large-displacement cases.

AURA: Action-Gated Memory for Robot Policies at Constant VRAM

cs.AI · 2026-06-01 · unverdicted · novelty 7.0

AURA-Mem uses an action-gated recurrent memory trained on closed-loop action error to deliver constant 4,224-byte state and 5-9x fewer writes than baselines while matching base policy success on LIBERO-Long.

Trading Complexity for Expressivity Through Structured Generalized Linear Token Mixing

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

Presents a structured generalized linear token mixing framework that extends recurrence equations to multiple past states, enabling new patterns with provable complexity-expressivity trade-offs for causal generation.

UWM-JEPA: Predictive World Models That Imagine in Belief Space

cs.LG · 2026-05-25 · unverdicted · novelty 7.0

UWM-JEPA uses a density-matrix latent and unitary predictor in JEPA to preserve joint-state spectrum during blind rollouts, achieving 0.77 accuracy on a five-step hidden-velocity task versus 0.53 for an LSTM baseline.

Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation

cs.CV · 2026-05-24 · unverdicted · novelty 7.0

MVCHead uses a hierarchical state space model with bi-directional scans and an SE(3) critic to enforce 3D consistency in Gaussian avatars trained only on 2D images.

Exact expression for maximum Lyapunov exponent during transients in computationally powerful dynamical networks

nlin.CD · 2026-05-20 · unverdicted · novelty 7.0

Exact analytical expression for the time-dependent maximum Lyapunov exponent during transients in a network supporting dynamics-based computation.

Social-Mamba: Socially-Aware Trajectory Forecasting with State-Space Models

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

Social-Mamba introduces a Cycle Mamba block and social triplet factorization to achieve state-of-the-art trajectory forecasting accuracy with linear-time social interaction modeling on five benchmarks.

A Novel Schur-Decomposition-Based Weight Projection Method for Stable State-Space Neural-Network Architectures

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

A real Schur decomposition projection maps the state matrix of discrete-time state-space layers onto its nearest stable counterpart, delivering accuracy comparable to prior stable identification methods with fewer weights.

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

QLAM extends state-space models with quantum superposition in the hidden state for linear-time long-sequence modeling and reports consistent gains over RNN and transformer baselines on sequential image tasks.

Parallel Scan Recurrent Neural Quantum States for Scalable Variational Monte Carlo

cond-mat.str-el · 2026-05-13 · conditional · novelty 7.0

PSR-NQS makes recurrent neural quantum states scalable for variational Monte Carlo by using parallel scan recurrence, reaching accurate results on 52x52 two-dimensional lattices.

Selection, Not Fusion: Radar-Modulated State Space Models for Radar-Camera Depth Estimation

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

Radar-Modulated Selection perturbs only the step size Δ and readout C parameters inside Mamba's selective scan with radar data while keeping other components image-only, yielding state-of-the-art depth estimation on nuScenes with up to 34% MAE reduction.

TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

TCP-SSM conditions stable poles on visual tokens to explicitly control memory decay and oscillation in SSMs, cutting computation up to 44% while matching or exceeding accuracy on classification, segmentation, and detection.

TIDES: Implicit Time-Awareness in Selective State Space Models

cs.LG · 2026-05-10 · unverdicted · novelty 7.0

TIDES reconciles selective SSM expressivity with continuous-time physical discretization by moving input dependence onto the state matrix, enabling native irregular time series handling and achieving SOTA on UEA and Physiome-ODE benchmarks.

PairAlign: A Framework for Sequence Tokenization via Self-Alignment with Applications to Audio Tokenization

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

PairAlign learns compact variable-length token sequences for audio via self-alignment on paired content-preserving views, achieving 55% fewer archive tokens than VQ while preserving edit-distance retrieval at 12.71 tokens/s.

Render, Don't Decode: Weight-Space World Models with Latent Structural Disentanglement

cs.CV · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

NOVA represents world states as INR weights for decoder-free rendering, compactness, and unsupervised disentanglement of background, foreground, and motion in video world models.

How Long Does Infinite Width Last? Signal Propagation in Long-Range Linear Recurrences

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

In linear recurrent models, infinite-width signal propagation remains accurate only for depths t much smaller than sqrt(width n), with a critical regime at t ~ c sqrt(n) where finite-width effects emerge and dominate for larger t.

The Predictive-Causal Gap: An Impossibility Theorem and Large-Scale Neural Evidence

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Predictive representation learning structurally favors encoding slower or less noisy environment modes over causal system modes, as shown by an impossibility theorem for linear-Gaussian dynamics and large-scale neural experiments.

FLUID: Continuous-Time Hyperconnected Sparse Transformer for Sink-Free Learning

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

FLUID is a continuous-time transformer using Liquid Attention Networks to model attention as stable ODE solutions that interpolate between discrete SDPA and CT-RNNs, with an explicit sink gate and liquid hyper-connections for better information flow.

Rethink MAE with Linear Time-Invariant Dynamics

cs.CV · 2026-04-29 · unverdicted · novelty 7.0

Token order in frozen visual representations is exploitable via SSM-based LTI probes, revealing pre-training-dependent heterogeneity that fixed pooling misses.

Mamba Sequence Modeling meets Model Predictive Control

math.OC · 2026-04-15 · unverdicted · novelty 7.0

Mamba-MPC stabilizes and tracks references on SISO and MIMO systems in simulation and hardware while outperforming LSTM-MPC with faster computation.

RSGMamba: Reliability-Aware Self-Gated State Space Model for Multimodal Semantic Segmentation

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

RSGMamba introduces a reliability-aware self-gated Mamba block for dynamic cross-modal feature selection in semantic segmentation, delivering state-of-the-art mIoU on RGB-D and RGB-T benchmarks with 48.6M parameters.

citing papers explorer

Showing 50 of 128 citing papers.

State Stream Transformer (SST) V2: Parallel Training of Nonlinear Recurrence for Latent Space Reasoning cs.LG · 2026-04-30 · unverdicted · none · ref 22 · internal anchor
SST V2 introduces parallel-trainable nonlinear recurrence in latent space to let transformers reason continuously across positions, delivering +15 points on GPQA-Diamond and halving remaining GSM8K errors over matched baselines.
Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling cs.CL · 2026-04-27 · unverdicted · none · ref 17 · internal anchor
HyLo upcycles Transformer LLMs into hybrids with MLA and Mamba2/Gated DeltaNet blocks via staged training and distillation, extending context to 2M tokens and outperforming prior upcycled hybrids on long-context benchmarks.
FETS Benchmark: Foundation Models Outperform Dataset-specific Machine Learning in Energy Time Series Forecasting cs.LG · 2026-04-24 · unverdicted · none · ref 68 · internal anchor
Foundation models outperform dataset-specific machine learning in energy time series forecasting across 54 datasets in 9 categories.
An explicit operator explains end-to-end computation in the modern neural networks used for sequence and language modeling cs.NE · 2026-04-22 · unverdicted · none · ref 8 · internal anchor
S4D state space models correspond exactly to wave propagation and nonlinear wave interactions in a one-dimensional ring oscillator network, with a closed-form operator describing the complete input-output map.
Hero-Mamba: Mamba-based Dual Domain Learning for Underwater Image Enhancement cs.CV · 2026-04-17 · unverdicted · none · ref 3 · internal anchor
Hero-Mamba combines parallel spatial-spectral Mamba processing and a background-light-guided ColorFusion block to enhance underwater images, reporting PSNR 25.802 and SSIM 0.913 on the LSUI benchmark.
Event-Adaptive State Transition and Gated Fusion for RGB-Event Object Tracking cs.CV · 2026-04-15 · unverdicted · none · ref 3 · internal anchor
MambaTrack improves RGB-Event object tracking via event-adaptive state transitions in a Dynamic State Space Model and a Gated Projection Fusion module, reporting state-of-the-art results on FE108 and FELT datasets.
TCL: Enabling Fast and Efficient Cross-Hardware Tensor Program Optimization via Continual Learning cs.LG · 2026-04-14 · conditional · none · ref 14 · internal anchor
TCL delivers 16.8x faster tuning on CPU and 12.48x on GPU with modestly lower inference latency by combining RDU active sampling, a lightweight Mamba cost model, and cross-platform continual knowledge distillation.
RetentiveKV: State-Space Memory for Uncertainty-Aware Multimodal KV Cache Eviction cs.LG · 2026-04-14 · unverdicted · none · ref 35 · internal anchor
RetentiveKV uses entropy to drive state-space model transitions that retain and reactivate low-attention visual tokens in a continuous memory instead of pruning them, delivering 5x KV cache compression and 1.5x faster decoding.
Membership Inference Attacks Expose Participation Privacy in ECG Foundation Encoders cs.LG · 2026-04-12 · unverdicted · none · ref 13 · internal anchor
Membership inference attacks can detect whether specific ECG data participated in pretraining self-supervised foundation encoders, with leakage strongest in small cohorts and contrastive models.
Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework eess.AS · 2026-04-09 · unverdicted · none · ref 16 · internal anchor
The GG-AVSE framework uses listener gaze direction combined with YOLO5Face and AVSEMamba to resolve target-speaker ambiguity in audio-visual speech enhancement, yielding gains in PESQ, STOI, and SI-SDR.
CloudMamba: An Uncertainty-Guided Dual-Scale Mamba Network for Cloud Detection in Remote Sensing Imagery cs.CV · 2026-04-08 · unverdicted · none · ref 51 · internal anchor
CloudMamba combines uncertainty-guided refinement with a dual-scale Mamba network to outperform prior methods on cloud segmentation accuracy while maintaining linear computational cost.
Physics-Aligned Spectral Mamba: Decoupling Semantics and Dynamics for Few-Shot Hyperspectral Target Detection cs.CV · 2026-04-07 · unverdicted · none · ref 55 · internal anchor
SpecMamba decouples stable semantic features from agile spectral adaptation via DCT-Mamba adapters, prior-guided tri-encoders, and self-supervised test-time mapping to improve few-shot hyperspectral target detection.
MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model cs.CV · 2026-03-27 · unverdicted · none · ref 22 · internal anchor
MPDiT uses a hierarchical multi-patch design in transformers to lower computation in diffusion models by handling coarse global features first then fine local details, plus faster-converging embeddings.
Generative Event Pretraining with Foundation Model Alignment cs.CV · 2026-03-24 · unverdicted · none · ref 19 · internal anchor
GEP transfers semantic knowledge from image foundation models to event data via alignment and generative pretraining on mixed sequences to create transferable event-based visual models.
M$^2$RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling cs.LG · 2026-03-15 · unverdicted · none · ref 12 · internal anchor
M²RNN achieves perfect state tracking at unseen lengths and outperforms Gated DeltaNet hybrids by 0.4-0.5 perplexity on 7B models with 3x smaller recurrent states.
Upper Generalization Bounds for Neural Oscillators cs.LG · 2026-03-10 · conditional · none · ref 6 · internal anchor
Upper generalization bounds for neural oscillators scale polynomially with MLP size and time length, avoiding the curse of parametric complexity, with numerical validation on a Bouc-Wen nonlinear system.
Rethinking Efficiency in Neural Combinatorial Optimization: Batched Preference Optimization with Mamba cs.LG · 2026-02-24 · unverdicted · none · ref 29 · internal anchor
ECO uses supervised warm-up plus iterative batched DPO on a Mamba backbone to reach top neural performance on TSP and CVRP while lowering memory growth and raising throughput.
Latent-Space Causal Discovery from Indirect Neuroimaging Observations q-bio.NC · 2026-01-30 · unverdicted · none · ref 1 · internal anchor
INCAMA recovers directed causal graphs from indirect neuroimaging by physics-aware inversion plus delay-aware Mamba encoding, yielding 2-3x F1 gains in simulations and anatomically plausible sparse graphs on HCP motor fMRI.
Physics-Guided Tiny-Mamba Transformer for Reliability-Aware Early Fault Warning cs.LG · 2026-01-29 · unverdicted · none · ref 9 · internal anchor
PG-TMT couples a physics-aligned tri-branch encoder with EVT-calibrated decision rules to achieve higher PR-AUC and shorter detection times at controlled false-alarm rates across multiple bearing datasets.
LinMU: Multimodal Understanding Made Linear cs.CV · 2026-01-04 · conditional · none · ref 7 · internal anchor
LinMU achieves linear-complexity multimodal understanding by swapping self-attention for an M-MATE dual-branch block and distilling from a frozen teacher VLM, matching accuracy with up to 2.7x faster TTFT and 9x higher throughput.
Gated KalmaNet: A Fading Memory Layer Through Test-Time Ridge Regression cs.LG · 2025-11-26 · unverdicted · none · ref 21 · internal anchor
Gated KalmaNet uses exact Kalman gain computation with adaptive gating and Chebyshev iteration to improve SSM performance on long-context tasks over prior approximations like DeltaNet.
Higher-order Linear Attention cs.LG · 2025-10-31 · unverdicted · none · ref 6 · internal anchor
Higher-order Linear Attention realizes second-order and higher interactions in linear-time causal attention via constant-size state and associative scans.
Kimi Linear: An Expressive, Efficient Attention Architecture cs.CL · 2025-10-30 · unverdicted · none · ref 31 · internal anchor
Kimi Linear hybridizes linear attention with a new KDA module to beat full attention on tasks while slashing KV cache by 75% and speeding decoding up to 6x.
Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks cs.LG · 2025-08-16 · unverdicted · none · ref 4 · internal anchor
Gating in RNNs couples state time-scales with parameter gradients to produce lag- and direction-dependent effective learning rates, shown via exact Jacobians and first-order expansion.
MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent cs.CL · 2025-07-03 · unverdicted · none · ref 36 · internal anchor
MemAgent uses multi-conversation RL to train a memory agent that reads text in segments and overwrites memory, extrapolating from 8K training to 3.5M token QA with under 5% loss and 95%+ on 512K RULER.
CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model cs.LG · 2025-06-10 · unverdicted · none · ref 36 · internal anchor
CodeBrain introduces a decoupled TFDual-Tokenizer and multi-scale EEGSSM architecture for an EEG foundation model pretrained on a large corpus, claiming strong generalization across eight downstream tasks and ten datasets.
Quantitative Error Feedback for Quantization Noise Reduction of Filtering over Graphs cs.LG · 2025-06-02 · unverdicted · none · ref 53 · internal anchor
Introduces quantitative error feedback from digital filter techniques to exactly compensate quantization noise in graph filtering, with closed-form optimal coefficients for deterministic, random-graph, and asynchronous scenarios.
Fine-Grained Fusion: The Missing Piece in Area-Efficient State Space Model Acceleration cs.AR · 2025-04-24 · unverdicted · none · ref 5 · 2 links · internal anchor
Fine-grained fusion and adaptive scheduling in SSMs deliver up to 4.8x speedup and 10x lower on-chip memory, enabling a fusion-aware accelerator with 1.78x higher performance than MARCA at equal area.
Retentive Network: A Successor to Transformer for Large Language Models cs.CL · 2023-07-17 · unverdicted · none · ref 7 · internal anchor
RetNet is a new sequence modeling architecture that delivers parallel training, constant-time inference, and competitive language modeling performance as a potential replacement for Transformers.
RCL-Mamba: A Dual-domain State Space Model for Measurement-oriented Image Restoration in Rotational Sparse-View Scanning Computed Laminography cs.CV · 2026-06-30 · unverdicted · none · ref 24 · internal anchor
RCL-Mamba restores images from sparse rotational computed laminography scans via cascaded projection-domain blur correction and image-domain artifact suppression using a Mamba-CNN dual-branch module.
CogSENet: Blind Image Deblurring with Blur-Conditioned Semantic Routing and Explicit Frequency Fusion cs.CV · 2026-06-29 · unverdicted · none · ref 14 · internal anchor
CogSENet proposes semantic-driven state space modules, bi-frequency fusion blocks, and continuous blur field estimation to outperform prior blind deblurring methods with fewer parameters.
SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition cs.AI · 2026-05-30 · unverdicted · none · ref 6 · internal anchor
SHARP separates memory accumulation from pattern recognition and uses accelerated offline replay of structured traces to achieve exponentially growing effective context at linear compute cost while learning non-stationary streams.
Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning cs.LG · 2026-05-29 · unverdicted · none · ref 2 · internal anchor
Linear recurrent filters exactly reproduce HMM belief logits under deterministic transitions and achieve near-zero decoding error under nearly deterministic ones, extending to action-controlled cases.
Graph Mamba Survival Analysis Based on Topology-Aware ordering cs.LG · 2026-05-23 · unverdicted · none · ref 59 · internal anchor
TopoMamSurv introduces topology-aware ordering and bidirectional Mamba with GCN for efficient WSI graph survival analysis, claiming performance gains on five TCGA datasets.
PlexRL: Cluster-Level Orchestration of Serviceized LLM Execution for RLVR cs.DC · 2026-05-20 · unverdicted · none · ref 7 · internal anchor
PlexRL multiplexes unified LLM services across RLVR jobs at the cluster level to exploit anti-correlated idle times and reduce GPU-hour costs by up to 37.58% with minimal per-job overhead.
MUSE: Multimodal Uncertainty Quantification of State Estimation cs.RO · 2026-05-17 · unverdicted · none · ref 26 · internal anchor
MUSE applies Mamba sequential modeling to produce real-time uncertainty estimates for visual-inertial state estimation from asynchronous multimodal sensors.
Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations cs.AR · 2026-05-12 · unverdicted · none · ref 14 · 2 links · internal anchor
BMRUs enable analog recurrent neural network hardware via discrete outputs that suppress noise 20-fold, with one-to-one parameter-to-circuit mapping and linear power scaling for recurrence.
Kaczmarz Linear Attention cs.LG · 2026-05-09 · unverdicted · none · ref 13 · internal anchor
Kaczmarz Linear Attention replaces the empirical coefficient in Gated DeltaNet with a key-norm-normalized step size derived from the online regression objective, yielding lower perplexity and better needle-in-haystack performance.
mHC-SSM: Manifold-Constrained Hyper-Connections for State Space Language Models with Stream-Specialized Adapters cs.LG · 2026-05-08 · unverdicted · none · ref 13 · internal anchor
Manifold-constrained multi-stream mixing plus per-stream adapters improves SSM language model validation loss from 6.3507 to 6.1353 and perplexity from 572.91 to 461.88 on WikiText-2.
StreamPhy: Streaming Inference of High-Dimensional Physical Dynamics via State Space Models cs.LG · 2026-05-08 · unverdicted · none · ref 19 · 2 links · internal anchor
StreamPhy introduces an end-to-end streaming framework using state-space models and an expressive FT-FiLM decoder to infer continuous physical dynamics from irregular sparse data, claiming 48% better accuracy and 20-100X faster inference than diffusion baselines.
Cubit: Token Mixer with Kernel Ridge Regression cs.LG · 2026-05-07 · unverdicted · none · ref 28 · 2 links · internal anchor
Cubit replaces Transformer's attention with a closed-form Kernel Ridge Regression token mixer and reports larger gains as training sequence length increases.
Neural Co-state Policies: Structuring Hidden States in Recurrent Reinforcement Learning cs.LG · 2026-05-06 · unverdicted · none · ref 58 · 2 links · internal anchor
Recurrent RL policies can have their hidden states aligned with PMP co-states through a derived loss, yielding robust performance on partially observable control tasks.
SAMIC: A Lightweight Semantic-Aware Mamba for Efficient Perceptual Image Compression cs.CV · 2026-05-06 · unverdicted · none · ref 13 · internal anchor
SAMIC introduces semantic-aware Mamba blocks and SVD-based redundancy reduction to achieve efficient perceptual image compression with improved rate-distortion-perception tradeoffs.
Selective Attention-Based Network for Robust Infrared Small Target Detection cs.CV · 2026-04-27 · unverdicted · none · ref 18 · internal anchor
SANet augments U-Net with a Dual-path Semantic-aware Module using pinwheel convolutions and CBAM, plus a Selective Attention Fusion Module for adaptive cross-scale feature fusion, to improve detection of sub-pixel infrared targets.
Looking Into the Past: Eye Movements Characterize Elements of Autobiographical Recall in Interviews with Holocaust Survivors cs.MM · 2026-04-23 · unverdicted · none · ref 10 · internal anchor
Eye movements during Holocaust survivor interviews vary by episodic, semantic, affective and temporal memory dimensions, with pre-onset gaze sufficient to predict sentence temporal context.
Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction cs.MM · 2026-04-22 · unverdicted · none · ref 22 · internal anchor
A new joint spatio-temporal enlargement model for micro-video popularity prediction using frame scoring for long sequences and a topology-aware memory bank for unbounded historical associations.
FG$^2$-GDN: Enhancing Long-Context Gated Delta Networks with Doubly Fine-Grained Control cs.LG · 2026-04-21 · unverdicted · none · ref 10 · internal anchor
FG²-GDN replaces the scalar beta in the delta update with a channel-wise vector and decouples key/value scaling to improve recall over prior GDN and KDA models.
Sessa: Selective State Space Attention cs.LG · 2026-04-20 · unverdicted · none · ref 9 · internal anchor
Sessa integrates attention within recurrent paths to achieve power-law memory tails and flexible non-decaying selective retrieval, outperforming baselines on long-context tasks.
MedMamba: Recasting Mamba for Medical Time Series Classification eess.SP · 2026-04-17 · unverdicted · none · ref 14 · internal anchor
MedMamba introduces a principle-guided bidirectional multi-scale Mamba model that outperforms prior methods on EEG, ECG, and activity classification benchmarks while delivering 4.6x inference speedup.
A Mamba-Based Multimodal Network for Multiscale Blast-Induced Rapid Structural Damage Assessment cs.AI · 2026-04-13 · unverdicted · none · ref 11 · internal anchor
A new Mamba multimodal network integrates multi-scale blast-loading information with satellite images to improve rapid structural damage assessment after explosions, showing gains over prior methods on the Beirut 2020 case.
