RNNs with ranking loss outperform item-to-item baselines for session-based recommendations on two datasets.
On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
32 Pith papers cite this work.
abstract
Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns the grammatical structure of a sentence automatically.
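As a rough illustration of the encoder half described in the abstract, the sketch below folds a variable-length sequence of token ids into a fixed-length vector with a plain tanh recurrence. It is a minimal sketch, not the paper's model: the class name `TinyEncoder`, the untrained random weights, the vocabulary size, and the hidden size are all assumptions made for illustration, and the gated units and the decoder are left out.

```python
import math
import random


def make_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these (assumption: untrained).
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    # Plain matrix-vector product on Python lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyEncoder:
    """Hypothetical tanh-RNN encoder, used only to show the fixed-length idea."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.embed = make_matrix(vocab_size, hidden_size, rng)
        self.w_in = make_matrix(hidden_size, hidden_size, rng)
        self.w_rec = make_matrix(hidden_size, hidden_size, rng)

    def encode(self, token_ids):
        # Start from a zero state and update it once per input token.
        h = [0.0] * self.hidden_size
        for token in token_ids:
            from_input = matvec(self.w_in, self.embed[token])
            from_state = matvec(self.w_rec, h)
            h = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        # The final state is the fixed-length summary of the whole sequence.
        return h


encoder = TinyEncoder(vocab_size=10, hidden_size=4)
short_vec = encoder.encode([1, 2])
long_vec = encoder.encode([1, 2, 3, 4, 5, 6, 7])
print(len(short_vec), len(long_vec))  # both lengths equal hidden_size
```

Both calls return a vector of length `hidden_size` regardless of input length. Every sentence, short or long, is squeezed into the same fixed-length vector, which is one plausible reading of why the abstract reports degraded performance on longer sentences.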
citation roles: background 3
citation polarities: background 3
representative citing papers
A task-specific iterative framework for weakly supervised 4D radar scene flow estimation uses instance-aware self-supervised losses from 2D tracking/segmentation and a rigid static loss from odometry to outperform LiDAR-dependent cross-modal and fully supervised methods on the VoD dataset.
PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.
Koopman autoencoders with forcings and temporal unrolling deliver accurate year-long predictions for coastal-ocean models at 300-1400x speedup, outperforming POD in two of three cases.
RVM uses recurrent computation inside a masked autoencoder to learn video representations that match or exceed prior video and image models on classification, tracking, and dense spatial tasks with up to 30x better parameter efficiency.
TCD-Arena is a new customizable testing framework that runs millions of experiments to map how 33 different assumption violations affect time series causal discovery methods and shows ensembles can boost overall robustness.
A zero-shot machine learning decoder for handwriting BCIs achieves 64% hits@3 retrieval on unseen letters by exploiting conserved kinematic neural representations.
M²RNN achieves perfect state tracking at unseen lengths and outperforms Gated DeltaNet hybrids by 0.4-0.5 perplexity on 7B models with 3x smaller recurrent states.
SpectraLLM is an LLM fine-tuned to predict small-molecule structures from single or multiple spectra, reporting state-of-the-art results on four public benchmarks with gains from multi-modal input.
DragNUWA integrates text, image, and trajectory controls into a diffusion video model using a Trajectory Sampler, Multiscale Fusion, and Adaptive Training to enable fine-grained open-domain video generation.
A variational Bayesian framework exploits reciprocity between referents and context plus semantic reproduction to improve referring expression grounding over pairwise methods in supervised and unsupervised settings.
A bidirectional RNN with attention models users' knowledge in real time from question-response sequences to predict correctness, outperforming baselines especially for new users on a large TOEIC mobile app dataset.
An RNN for arrivals is paired with a recurrent GAN for service times to model queuing dynamics without assuming specific inter-event distributions.
BMIL learns belief modules jointly with policies for GAIL-style imitation learning in POMDPs, outperforming separate training and standard GAIL on continuous control tasks.
New seq2seq architectures for permutation indexing outperform baselines on synthetic reference-resolution tasks and reduce real decompilation error rates by 42%.
Primate ventral stream encodes visual stimuli through evolving neural dynamics that carry category information beyond any fixed spatial pattern during the initial feedforward pass.
ACARec attends over artist catalogs to generate CF embeddings for new tracks, more than doubling recall and NDCG versus content-only baselines in music recommendation.
Offline RL for ICU sedation shows that adding 30-day mortality to the objective yields policies whose clinician agreement correlates negatively with mortality, unlike pain-only versions.
StateX post-trains RNNs to expand recurrent state size, improving recall and in-context learning with negligible parameter growth.
A comparative review with experiments identifies optimal preprocessing, models, and transfer strategies for large-scale pixel-wise crop mapping using Landsat 8 data across five sites.
A normalizing-flow neural topic model and a control mechanism are added to Transformer summarizers to supply and regulate global semantics, with reported gains over prior models on five benchmarks.
Proposes a tool-use inspired framework with multiple test sets to measure specified types of generalization in RL.
EHR-RAGp is a retrieval-augmented EHR foundation model that employs prototype-guided retrieval to dynamically integrate relevant historical patient context, outperforming prior models on clinical prediction tasks.
Delta6 delivers a low-cost 6-DOF force-sensing end-effector with 3.8% FS accuracy using sequence models, validated on robot-arm tasks like buffing and tight assembly.
citing papers explorer
Fast-ULCNet: A fast and ultra low complexity network for single-channel speech enhancement
Fast-ULCNet matches original ULCNet speech enhancement quality while cutting model size by more than half and latency by 34% via FastGRNN replacement and a state-drift filter.