
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

Caglar Gulcehre, Junyoung Chung, Kyunghyun Cho, Yoshua Bengio · 2014 · cs.NE · arXiv 1412.3555

Mixed citation behavior. Most common role is background (62%).

130 Pith papers citing it

Background 62% of classified citations


abstract

In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). Especially, we focus on more sophisticated units that implement a gating mechanism, such as a long short-term memory (LSTM) unit and a recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units. Also, we found GRU to be comparable to LSTM.
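The gating mechanism the abstract refers to can be illustrated with a single GRU update. The sketch below is a minimal pure-Python version under stated assumptions: it uses the commonly cited GRU formulation (update gate z, reset gate r, candidate state, then interpolation with the previous state), omits bias terms, and the names `gru_step`, `matvec`, and the parameter keys `W_z`, `U_z`, `W_r`, `U_r`, `W_h`, `U_h` are hypothetical. This page gives no equations, so this is not presented as the authors' code.

```python
import math

def sigmoid(x):
    """Logistic function used by both gates."""
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_step(x, h_prev, p):
    """One GRU update (biases omitted). `p` maps hypothetical names to weights."""
    # Update gate: how much of the candidate state replaces the old state.
    z = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_z"], x), matvec(p["U_z"], h_prev))]
    # Reset gate: how much of the old state feeds the candidate.
    r = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_r"], x), matvec(p["U_r"], h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    # Candidate state computed from the input and the reset-gated old state.
    h_tilde = [math.tanh(a + b)
               for a, b in zip(matvec(p["W_h"], x), matvec(p["U_h"], gated))]
    # Interpolate between the old state and the candidate.
    return [(1 - zi) * hp + zi * ht
            for zi, hp, ht in zip(z, h_prev, h_tilde)]

# Usage: 1-dimensional input, 2-dimensional hidden state, all-zero weights.
W0 = [[0.0], [0.0]]
U0 = [[0.0, 0.0], [0.0, 0.0]]
params = {"W_z": W0, "U_z": U0, "W_r": W0, "U_r": U0, "W_h": W0, "U_h": U0}
h_next = gru_step([1.0], [0.4, -0.2], params)
```

With all-zero weights both gates evaluate to 0.5 and the candidate state is 0, so the new state is half the previous one. Some libraries write the interpolation with z and (1 - z) swapped; the two conventions are equivalent up to relabeling the gate.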


citation-role summary

background 8 · method 3 · baseline 1 · other 1
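The 62% background share quoted earlier is consistent with these counts if the percentage is taken over the 13 classified citations listed here. The page does not state the denominator, so that is an assumption. A minimal sketch of the arithmetic, with the hypothetical helper name `role_shares`:

```python
# Citation-role counts as listed in the summary above.
role_counts = {"background": 8, "method": 3, "baseline": 1, "other": 1}

def role_shares(counts):
    """Return each role's share of classified citations, rounded to whole percent."""
    total = sum(counts.values())
    return {role: round(100 * n / total) for role, n in counts.items()}

shares = role_shares(role_counts)
print(shares["background"])  # 8 of 13 classified citations, rounded
```

Because each share is rounded independently, the rounded percentages need not sum to exactly 100.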

citation-polarity summary

background 8 · use method 3 · baseline 1 · unclear 1


authors

Caglar Gulcehre, Junyoung Chung, Kyunghyun Cho, Yoshua Bengio


representative citing papers

CanViT: Toward Active-Vision Foundation Models

cs.CV · 2026-03-23 · conditional · novelty 8.0

CanViT is the first task- and policy-agnostic AVFM pretrained via passive-to-active dense latent distillation on 13.2M scenes and 1B random glimpses, achieving 38.5% ADE20K mIoU in one glimpse and 84.5% ImageNet-1k top-1 after fine-tuning.

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

cs.LG · 2023-12-01 · unverdicted · novelty 8.0

Mamba is a linear-time sequence model using input-dependent selective SSMs that achieves SOTA results across modalities and matches twice-larger Transformers on language modeling with 5x higher inference throughput.

Identifying Latent Concepts and Structures for Generalized Category Discovery

cs.CV · 2026-07-01 · unverdicted · novelty 7.0

CPF-GCD enforces low-rank compositional structure on vision backbone features via spatial primitive fields so that novel categories emerge as new activation patterns over a shared vocabulary of reusable visual primitives.

Urdu Katib Handwritten Dataset: A Historical Document Dataset for Offline Urdu Handwritten Text Recognition with CRNN-Based Baseline Evaluation

cs.CV · 2026-06-17 · unverdicted · novelty 7.0

Presents UKHD, the first historical offline Urdu handwritten text lines dataset from Katib materials, and benchmarks CRNN-based models with CNN-BGRU-CTC showing lowest CER and WER.

LongSpike: Fractional Order Spiking State Space Models for Efficient Long Sequence Learning

cs.LG · 2026-06-11 · unverdicted · novelty 7.0

LongSpike integrates fractional-order state-space modeling into spiking neural networks, enabling better long-sequence performance than prior SNNs on LRA, WikiText-103, and Speech Commands benchmarks while retaining sparse computation.

CoMetaPNS: Continually Meta-learning Personalized Neural Surrogates for Cardiac Electrophysiology Simulations

cs.LG · 2026-06-05 · unverdicted · novelty 7.0

CoMetaPNS combines meta-learned neural surrogates with a continual Bayesian Gaussian Mixture Model to adapt cardiac electrophysiology simulations to new data while avoiding catastrophic forgetting.

AdaState: Self-Evolving Anchors for Streaming Video Generation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

AdaState replaces the static first-frame KV anchor with an evolving hidden latent that the model denoises alongside content, treating time as relative to enable recurrence and richer dynamics in streaming video generation.

LC-Flow: Learning Local Continuous Optical Flow and Confidence from events

cs.CV · 2026-05-23 · unverdicted · novelty 7.0

LC-Flow introduces a continuous local recurrent network for learning sparse optical flow and confidence directly from event streams, with confidence-guided aggregation reaching new SOTA on MVSEC.

Nested-GPT for variable-multiplicity parton showers: A case study in the resummation of non-global logarithms

hep-ph · 2026-05-18 · unverdicted · novelty 7.0 · 2 refs

Nested-GPT is an autoregressive Transformer surrogate that generates variable-multiplicity parton showers while enforcing ordered Markovian branching and matches reference Monte Carlo results for leading-log non-global logarithm resummation in the large-Nc limit.

Identify Then Project: Contrastive Learning of Latent Dynamics from Partial Observations with Port-Hamiltonian Structure

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

A two-stage contrastive teacher-student framework learns and then projects latent dynamics onto port-Hamiltonian submanifolds from partial observations.

TokAlign++: Advancing Vocabulary Adaptation via Better Token Alignment

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

TokAlign++ learns token alignments between LLM vocabularies from monolingual representations to enable faster adaptation, better text compression, and effective token-level distillation across 15 languages with minimal steps.

Vector-Quantized Discrete Latent Factors Meet Financial Priors: Dynamic Cross-Sectional Stock Ranking Prediction for Portfolio Construction

cs.LG · 2026-05-13 · conditional · novelty 7.0

PRISM-VQ integrates vector-quantized latent factors with financial priors and a structure-conditioned mixture-of-experts to deliver improved cross-sectional stock return predictions and portfolio performance on CSI 300 and S&P 500.

What-Where Transformer: A Slot-Centric Visual Backbone for Concurrent Representation and Localization

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

The What-Where Transformer achieves explicit what-where separation in a ViT-style backbone via concurrent token and attention-map streams, yielding emergent object discovery from attention maps and better weakly-supervised localization.

TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

TailedTS supplies 24.69 billion Wikipedia page-view records as a public benchmark for heavy-tailed time series forecasting and periodicity analysis, revealing weaker periodic structure in high-traffic pages.

TCRTransBench: A Comprehensive Benchmark for Bidirectional TCR-Peptide Sequence Generation

q-bio.CB · 2026-05-06 · unverdicted · novelty 7.0

TCRTransBench provides a new benchmark with bidirectional TCR-peptide generation tasks, a large validated dataset, and metrics to evaluate neural models for immunological sequence modeling.

Learning to Theorize the World from Observation

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

NEO is a probabilistic neural model that induces compositional programs as a learned Language of Thought from non-textual observations and executes them via a shared transition model to enable explanation-driven generalization.

Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning

cs.LG · 2026-04-09 · unverdicted · novelty 7.0

CLOVER augments value decomposition with a GNN mixer whose weights depend on the realized wireless communication graph, proving permutation invariance, monotonicity, and greater expressiveness than QMIX while showing gains on Predator-Prey and Lumberjacks under p-CSMA channels.

Oscillators Are All You Need: Irregular Time Series Modelling via Damped Harmonic Oscillators with Closed-Form Solutions

cs.LG · 2026-02-12 · unverdicted · novelty 7.0

Damped harmonic oscillators with closed-form solutions model keys, values, and queries in continuous attention for irregular time series, preserving universal approximation while being orders of magnitude faster than prior NODE-based methods.

Mitigating Error Accumulation in Continuous Navigation via Memory-Augmented Kalman Filtering

cs.RO · 2026-01-30 · unverdicted · novelty 7.0

NeuroKalman mitigates state drift in vision-language UAV navigation by using memory-augmented Kalman filtering where attention retrieves historical anchors to correct predictions without gradient updates.

ExDoS: Expert-Guided Dual-Focus Cross-Modal Distillation for Smart Contract Vulnerability Detection

cs.CR · 2025-09-12 · unverdicted · novelty 7.0

ExDoS uses expert-guided dual-focus distillation between source semantic graphs and bytecode control-flow graphs plus a dual-attention network to improve smart contract vulnerability detection, reporting 3-6% F1 gains over baselines.

Unsupervised Learning of Local Updates for Maximum Independent Set in Dynamic Graphs

cs.LG · 2025-05-19 · unverdicted · novelty 7.0

Unsupervised GNN model learns local updates for approximate MaxIS on dynamic graphs, achieving competitive ratios on 200-1000 node instances and 1.00-1.18x larger solutions than other unsupervised models when generalizing to 100x larger graphs.

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

cs.LG · 2024-05-31 · unverdicted · novelty 7.0

Transformers and SSMs are unified through structured state space duality, producing a 2-8X faster Mamba-2 model that remains competitive with Transformers.

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

cs.LG · 2024-02-29 · unverdicted · novelty 7.0

Griffin hybrid model matches Llama-2 performance while trained on over 6 times fewer tokens and offers lower inference latency with higher throughput.

Estimation--Prediction Tradeoff in Causal Probabilistic Temporal Graphs

cs.LG · 2026-06-26 · unverdicted · novelty 6.0

Characterizes an estimation-prediction tradeoff in binary logistic models for causal probabilistic temporal graphs and proposes a framework to jointly evaluate temporal link prediction with causal parameter recovery via Cramér-Rao bounds.

citing papers explorer

Showing 50 of 130 citing papers.

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention cs.CL · 2025-06-16 · unverdicted · none · ref 5 · internal anchor
MiniMax-M1 is a 456B parameter hybrid-attention MoE model trained with CISPO RL that achieves performance comparable or superior to DeepSeek-R1 and Qwen3-235B on reasoning and software engineering tasks while training in three weeks on 512 GPUs.
Logo-LLM: Local and Global Modeling with Large Language Models for Time Series Forecasting cs.LG · 2025-05-16 · unverdicted · none · ref 2 · internal anchor
Logo-LLM improves time series forecasting by pulling local dynamics from shallow LLM layers and global trends from deeper layers, then aligning them via new Local-Mixer and Global-Mixer modules.
DeePen: Penetration Testing for Audio Deepfake Detection cs.CR · 2025-02-27 · unverdicted · none · ref 63 · internal anchor
DeePen demonstrates that both production and academic audio deepfake detectors can be reliably deceived by simple signal processing attacks such as time-stretching or echo addition, with some attacks resistible via retraining and others remaining effective.
Non-invasive electromyographic speech neuroprosthesis: a geometric perspective eess.AS · 2025-02-09 · unverdicted · none · ref 15 · internal anchor
Direct sequence-to-sequence EMG-to-text conversion for silent articulation using a geometric representation of high-dimensional signals, without audio targets or time-alignment.
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive cs.CL · 2024-02-20 · conditional · none · ref 143 · internal anchor
DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.
R-Transformer: Recurrent Neural Network Enhanced Transformer cs.LG · 2019-07-12 · unverdicted · none · ref 6 · internal anchor
R-Transformer integrates RNNs with multi-head attention to model local and global sequence dependencies without position embeddings and reports large-margin gains over prior methods on diverse tasks.
A Bi-directional Transformer for Musical Chord Recognition cs.SD · 2019-07-05 · unverdicted · none · ref 16 · internal anchor
A bi-directional Transformer achieves competitive chord recognition by using self-attention to capture long-term dependencies in audio in a single training phase.
Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments cs.CV · 2019-06-23 · unverdicted · none · ref 10 · internal anchor
The MIA model with GC, RGA, and BFM modules achieves state-of-the-art performance on the CUHK-PEDES dataset for description-based person re-identification.
Learning to Distributedly Estimate under Partially Known Dynamics: A Covariance-Agnostic Neural Kalman Consensus Filter cs.LG · 2026-06-26 · unverdicted · none · ref 14 · internal anchor
CA-NKCF is a hybrid neural-Kalman consensus filter for distributed state estimation that operates without noise covariance knowledge and shows robustness to model misspecification in linear, chaotic, and wireless scenarios.
Interpretable Kolmogorov-Arnold Network with Feature-Isolated Temporal Attention Mechanism for Electricity Load Forecasting cs.LG · 2026-06-22 · unverdicted · none · ref 33 · internal anchor
LoadKAN combines feature-isolated temporal attention with KAN to produce competitive load forecasts on three U.S. markets and enables quantitative analysis of non-linear mobility-load relationships via learned activation functions.
Graph Grounded Cross Attention Transformer Neural Network for Structurally Constrained Full Event Sequence Generation in Predictive Process Monitoring cs.LG · 2026-06-17 · unverdicted · none · ref 7 · internal anchor
GGATN combines graph grounding with transformer self- and cross-attention to generate full event sequences, timestamps, length, and attributes in a single pass followed by Viterbi-style constrained decoding, outperforming prompted LLM baselines on six logs with zero hallucinated activities.
Predicting Cognitive Load from Speech and Interaction Dynamics in Dyadic Conversations cs.LG · 2026-06-11 · unverdicted · none · ref 43 · internal anchor
A GRU encoder using static acoustic, dynamic, and interaction features from 53 dyads predicts cognitive load related to time pressure, mental work, effort, and performance, with turn-taking linked to temporal demand and imbalanced participation to mental demand.
KinematicRL: A Sim-to-Real Reinforcement Learning Framework For Social Navigation With Kinodynamic Feasibility cs.RO · 2026-06-10 · unverdicted · none · ref 39 · internal anchor
KinematicRL is a sim-to-real RL framework for social navigation using second-order control inputs, iLQR pretraining, and cluster-based 2D LiDAR tracking to produce kinodynamically feasible policies deployable on real robots with minimal modifications.
Boosting ECG Classification Performance by Pre-training with Synthesized Data cs.LG · 2026-06-09 · unverdicted · none · ref 4 · internal anchor
Pre-training ten DNN architectures on knowledge-driven synthetic ECGs generated via Gaussian PQRST wave composition improves classification of AF, AFLT, PVC, and WPW, with largest gain of 33.2% for AFLT and stronger benefits on smaller real datasets.
Reconstructing and forecasting disease trajectories of patients with Alzheimer's disease using routine data in resource-constrained settings cs.AI · 2026-06-05 · unverdicted · none · ref 29 · internal anchor
GNOVA reconstructs and forecasts CDR-SB and MMSE scores with MAEs of 1.35 and 2.28 on 1727 ADNI patients over 10 years using only routine visit data, enabling interpolation, extrapolation, and uncertainty estimates.
EvoCSFL: Surrogate-Assisted Evolutionary Client Selection for Efficient and Robust Federated Learning cs.LG · 2026-06-05 · unverdicted · none · ref 4 · internal anchor
EvoCSFL combines candidate generation, a multi-objective metric, surrogate approximation, and evolutionary search to optimize client subsets in federated learning, reporting faster convergence and lower energy on image classification tasks.
HoT-SSM:Higher-order Temporal Knowledge Graph Reasoning with State Space Models for Health Care cs.LG · 2026-06-04 · unverdicted · none · ref 7 · internal anchor
HoT-SSM combines hypergraph construction from domain knowledge with a dynamic state space model to jointly capture higher-order clinical interactions and long-range temporal dependencies, yielding improved predictions on MIMIC-III and MIMIC-IV.
RePercENT: Scaling Disentangled Representation Learning Beyond Two Modalities cs.LG · 2026-06-03 · unverdicted · none · ref 4 · internal anchor
RePercENT introduces a plug-and-play self-supervised framework for scalable pairwise disentangled representation learning across more than two modalities using pre-extracted embeddings and a joint optimization objective with theoretical optimality guarantees.
3D Temporal Analysis for Autism Spectrum Disorder Screening During Attention Tasks cs.CV · 2026-06-03 · unverdicted · none · ref 9 · internal anchor
A 3D temporal framework extracts head pose and facial features via DECA from VR tasks, then uses GRU classifiers to achieve 84.6% accuracy distinguishing ASD from TD children, outperforming 2D baselines.
Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions cs.LG · 2026-06-02 · unverdicted · none · ref 10 · internal anchor
GTR introduces a bounded non-monotonic Gaussian trust region and Mixture Gaussian Anchor to enable effective behavior transitions in non-stationary RL where standard PPO fails.
Physics-Guided Recurrent State-Space Neural Networks for Multi-Step Prediction eess.SY · 2026-06-01 · unverdicted · none · ref 3 · internal anchor
PG-RSSNN adds recurrent structures to physics-guided neural networks to enable stable multi-step prediction that beats both physics-only and black-box models even with partial physics and limited data.
CHAM-net: A Contrastive Hierarchical Adaptive Meta-network for Robust Global Methane Flux Prediction cs.LG · 2026-05-29 · unverdicted · none · ref 38 · internal anchor
CHAM-net is a contrastive hierarchical adaptive meta-network that conditions predictions on historical site data to outperform baselines on methane flux tasks with nRMSE down to 0.43.
Decoupled Delay Compensation: Enhancing Pre-trained MARL Policies via Learned Dynamics Filtering cs.MA · 2026-05-25 · unverdicted · none · ref 8 · internal anchor
A decoupled estimator combining gated dynamics learning and recursive Kalman filtering improves robustness of pre-trained MARL policies under stale observations and message loss.
DeGRe: Dense-supervised Generative Reranking for Recommendation cs.IR · 2026-05-25 · unverdicted · none · ref 7 · internal anchor
DeGRe decouples offline exploration via a lookahead evaluator using beam search and cumulative regression to distill dense supervision into an online generator that approximates optimal reranking sequences with greedy decoding.
Atoms of Thought: Universal EEG Representation Learning with Microstates cs.LG · 2026-05-19 · unverdicted · none · ref 17 · internal anchor
Microstate tokenizer from clustered EEG signals provides universal representations that outperform traditional time- and frequency-domain features across sleep staging, emotion recognition, and motor imagery tasks.
Adaptive Outer-Loop Control of Quadrotors via Reinforcement Learning cs.RO · 2026-05-15 · unverdicted · none · ref 13 · internal anchor
An RL-based outer-loop quadrotor controller augmented with an online Residual Dynamics Predictor for disturbance estimation and a data-efficient sim-to-real calibration bridge.
Spatiotemporal decoupled physics-informed Stone-Weierstrass neural operator for long-time prediction of time-dependent parametric PDEs physics.comp-ph · 2026-05-15 · unverdicted · none · ref 30 · internal anchor
A spatiotemporally decoupled physics-informed Stone-Weierstrass neural operator for stable long-time prediction of time-dependent parametric PDEs.
Rethinking Random Transformers as Adaptive Sequence Smoothers for Sleep Staging cs.LG · 2026-05-11 · unverdicted · none · ref 91 · internal anchor
Randomly initialized Transformers act as adaptive sequence smoothers for sleep staging via a Random Attention Prior Kernel, with gains mainly from inductive bias rather than training.
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention cs.LG · 2026-05-07 · unverdicted · none · ref 108 · internal anchor
MDN parallelizes stepwise momentum for delta linear attention using geometric reordering and dynamical systems analysis, yielding performance gains over Mamba2 and GDN on 400M and 1.3B models.
Neural Co-state Policies: Structuring Hidden States in Recurrent Reinforcement Learning cs.LG · 2026-05-06 · unverdicted · none · ref 42 · 2 links · internal anchor
Recurrent RL policies can have their hidden states aligned with PMP co-states through a derived loss, yielding robust performance on partially observable control tasks.
ReMedi: Reasoner for Medical Clinical Prediction cs.CL · 2026-05-02 · unverdicted · none · ref 3 · internal anchor
ReMedi boosts LLM performance on EHR clinical predictions by up to 19.9% F1 through ground-truth-guided rationale regeneration and fine-tuning.
HOI-aware Adaptive Network for Weakly-supervised Action Segmentation cs.CV · 2026-04-29 · unverdicted · none · ref 5 · internal anchor
AdaAct employs a HOI encoder and two-branch hypernetwork to adaptively adjust temporal encoding parameters based on video-level human-object interactions for improved weakly-supervised action segmentation.
STK-Adapter: Incorporating Evolving Graph and Event Chain for Temporal Knowledge Graph Extrapolation cs.IR · 2026-04-21 · unverdicted · none · ref 5 · internal anchor
STK-Adapter adds Spatial-Temporal MoE, Event-Aware MoE, and Cross-Modality Alignment MoE to integrate evolving TKG graphs and event chains into LLMs, reducing information loss and improving extrapolation performance over prior methods.
Gated Memory Policy cs.RO · 2026-04-21 · unverdicted · none · ref 9 · internal anchor
GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones.
Efficient and Effective Internal Memory Retrieval for LLM-Based Healthcare Prediction cs.CL · 2026-04-08 · unverdicted · none · ref 2 · internal anchor
K2K framework enables internal memory retrieval in LLMs for healthcare outcome prediction, achieving state-of-the-art results on four benchmarks.
Adaptive Learned State Estimation based on KalmanNet cs.RO · 2026-04-02 · unverdicted · none · ref 17 · internal anchor
AM-KNet adds sensor-specific modules, hypernetwork conditioning on target type and pose, and Joseph-form covariance estimation to KalmanNet, yielding better accuracy and stability than base KalmanNet on nuScenes and View-of-Delft data.
Probabilistic Hysteresis Factor Prediction for Electric Vehicle Batteries with Graphite Anodes Containing Silicon cs.LG · 2026-03-10 · unverdicted · none · ref 33 · internal anchor
A data-driven probabilistic approach predicts the hysteresis factor for silicon-graphite anode batteries in electric vehicles, with tests for generalization across vehicle models.
Multimodal Large Language Models with Adaptive Preference Optimization for Sequential Recommendation cs.IR · 2025-11-24 · unverdicted · none · ref 8 · internal anchor
HaNoRec dynamically weights harder preference samples and applies Gaussian perturbations to output distributions to improve multimodal LLM performance on sequential recommendation tasks.
WaveletInception Networks for on-board Vibration-Based Infrastructure Health Monitoring cs.LG · 2025-07-17 · unverdicted · none · ref 49 · internal anchor
The WaveletInception-BiGRU network uses learnable wavelet packet transforms, 1D Inception-ResNet modules, and BiGRU layers to generate high-resolution, spatially mapped health profiles from variable-speed vibration data, outperforming prior methods on track stiffness and transition zone tasks.
From Time-series Generation, Model Selection to Transfer Learning: A Comparative Review of Pixel-wise Approaches for Large-scale Crop Mapping cs.CV · 2025-07-16 · unverdicted · none · ref 13 · internal anchor
A comparative review with experiments identifying optimal preprocessing, models, and transfer strategies for large-scale pixel-wise crop mapping using Landsat 8 data across five sites.
What Matters in Building Vision-Language-Action Models for Generalist Robots cs.RO · 2024-12-18 · unverdicted · none · ref 10 · internal anchor
Systematic tests of VLM backbones, policy architectures, and cross-embodiment data yield RoboVLMs that set new SOTA on robot manipulation benchmarks while requiring few manual designs.
Click-Through Rate Prediction with the User Memory Network cs.IR · 2019-07-09 · unverdicted · none · ref 4 · internal anchor
MA-DNN augments DNNs with per-user memory vectors capturing likes and dislikes to exploit historical behavior for CTR prediction while remaining simpler than RNNs.
Latent Multi-Criteria Ratings for Recommendations cs.LG · 2019-06-26 · unverdicted · none · ref 8 · internal anchor
Uses variational autoencoders on user reviews to generate latent multi-criteria ratings that outperform baselines on multiple datasets.
Attention Is All You Need cs.CL · 2017-06-12 · unverdicted · none · ref 7 · internal anchor
Introduces the Transformer, a sequence transduction architecture based solely on attention mechanisms, dispensing with recurrence and convolutions.
Improving Patient Subtyping on Longitudinal Data using Representations from Mamba-based Architecture cs.LG · 2026-06-26 · unverdicted · none · ref 7 · internal anchor
Self-supervised Mamba model learns EHR representations that improve patient subtyping on longitudinal data compared to baselines.
Risk Stratification for ICU Delirium using Pervasive Ambient Sensing Information cs.LG · 2026-06-17 · unverdicted · none · ref 29 · internal anchor
Ambient sound and light data from ICU rooms predict delirium with AUC 0.80 using convolutional neural networks, with sound as the dominant predictor, on data from 309 patients across 9 ICUs.
Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation cs.LG · 2026-06-10 · unverdicted · none · ref 25 · internal anchor
Frozen Chronos-2 TSFM embeddings plus a lightweight regression head outperform standard baselines for RUL estimation on two industrial sensor datasets.
Learning and Adaptation in Wire Arc Additive Manufacturing Bead Geometry Control cs.RO · 2026-05-27 · unverdicted · none · ref 19 · internal anchor
RNN predictive control with layer-wise adaptation improves bead height and width consistency in WAAM experiments over constant-input and static-model baselines.
Enhancing BiGRU with a KAN Block for Legal Document Classification and Summarization cs.CL · 2026-05-27 · unverdicted · none · ref 11 · internal anchor
KAN-enhanced BiGRU reaches 67.96% accuracy and 0.65 F1 on legal document classification plus ROUGE-1/2/L of 0.38/0.23/0.31 on summarization, with ablation attributing a 10.62-point accuracy gain to the KAN block.
Bridging Classification and Reconstruction: Cooperative Time Series Anomaly Detection cs.LG · 2026-05-25 · unverdicted · none · ref 70 · internal anchor
CoAD unifies outlier exposure classification and masked autoencoder reconstruction in a cooperative loop to detect subtle and prolonged time series anomalies.
