hub Canonical reference

Attention is all you need

2017

Canonical reference. 89% of citing Pith papers cite this work as background.

88 Pith papers citing it

Background 89% of classified citations



citation-role summary

background 8 · method 1

citation-polarity summary

background 8 · use method 1

claims ledger

background characteristics inherent in power load time series. Data-driven approaches based on artificial intelligence have become mainstream in recent years. Early methods centered on recurrent neural networks (RNNs) and convolutional neural networks (CNNs), which are adept at capturing temporal dependencies and inter-variable relationships [5]. With the advent of the Transformer architecture [6], attention-based models have advanced rapidly for time series forecasting, giving rise to numerous variants
background However, these models still face challenges: their ability to explicitly model local interactions remains limited, and their interpretability is relatively weak. These drawbacks motivate our approach, which leverages physically grounded quantum walk dynamics to provide both richer local structural modeling and improved interpretability. Formally, the self-attention mechanism in the Transformer framework [20] is defined as $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d}}\right)V$ (2), where $Q, K, V \in \mathbb{R}^{n \times d}$ are the
background Generating accurate, human-like motion requires accounting for variability in emotion and semantic emphasis, two aspects that remain underexplored. Computational efficiency is an additional requirement for real-time robotics applications. Model architectures have evolved from recurrent networks such as long short-term memory (LSTM) [16] to attention-based transformers [17]. Adversarial and diffusion-based methods have also been proposed to improve motion realism and diversity [2], [14], [18]
background Neural Machine Translation (NMT) has emerged as a powerful end-to-end approach for automated translation, employing a single neural network to directly model the probability of a target sentence given a source sentence [1]. In recent years, NMT models have significantly improved translation quality, accompanied by a substantial expansion in model scale. Since Transformer introduced [2], the parameter count of NMT models has grown exponentially. For instance, M2M-100 (12 billion parameters) [
background after GEMM completion while the output tiles still reside in on-chip memory (L1/L2 caches or registers), we avoid costly global memory traffic. However, conventional normalization layers operate along the feature dimension, which often misaligns with the physical data layout of GEMM outputs. To address this, we propose sBlockNorm, a normalization approach inspired by GroupNorm [71] which is originally designed to apply normalization within individual channels of a feature map. In our version of
background bias parameters (γ, β) from conditional inputs, then modulates intermediate features via γ⊙x+β to achieve lightweight conditional feature selection [7]. We observe that this channel-level modulation effectively adjusts feature weights with low overhead and good trainability. In contrast, attention-based cross-modal fusion typically relies on spatial weights or token-level interactions [9], [15]-[17], which increase computational/parameter overhead and may complicate optimization in reinforcement
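
One ledger excerpt above quotes the Transformer self-attention definition: softmax of QK^T scaled by the square root of d, applied to V. The following is a minimal sketch of that formula in plain Python, assuming Q, K, and V are n×d lists of lists. The function names `softmax` and `attention` are illustrative and do not come from any of the cited papers. Masking, multiple heads, and learned projections are left out.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention; Q, K, V are n x d lists of lists (assumed layout)."""
    d = len(Q[0])
    scale = math.sqrt(d)
    out = []
    for q in Q:
        # Dot product of this query with every key, divided by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        weights = softmax(scores)
        # Each output row is the attention-weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

if __name__ == "__main__":
    Q = [[0.0, 0.0], [0.0, 0.0]]
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0]]
    print(attention(Q, K, V))
```

With all-zero queries every score is equal, so the weights are uniform and each output row is the plain average of the value rows. That makes a quick sanity check for the softmax normalization.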

co-cited works

representative citing papers

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

QLAM extends state-space models with quantum superposition in the hidden state for linear-time long-sequence modeling and reports consistent gains over RNN and transformer baselines on sequential image tasks.

CTQWformer: A CTQW-based Transformer for Graph Classification

cs.LG · 2026-05-10 · unverdicted · novelty 7.0

CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.

Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems

cs.AR · 2026-05-07 · unverdicted · novelty 7.0

CAIS delivers 1.38x end-to-end LLM training speedup over NVLS and 1.61x over T3 by making in-switch computing aware of computation memory requirements instead of treating communication as an isolated phase.

Computer-Aided Design Generation by Cascaded Discrete Diffusion Model

cs.CV · 2026-05-06 · unverdicted · novelty 7.0

Cascaded discrete diffusion generates CAD command sequences with absorbing transitions and parameters with Gaussian, scale-invariant, and prior-preserving kernels, outperforming autoregressive and continuous diffusion baselines on the DeepCAD dataset.

Tempus: A Temporally Scalable Resource-Invariant GEMM Streaming Framework for Versal AI Edge

cs.DC · 2026-05-01 · unverdicted · novelty 7.0

Tempus delivers 607 GOPS at 10.677 W using fixed 16 AIE cores on Versal AI Edge, with 211.2x better platform-aware utility than spatial SOTA ARIES and zero URAM/DSP utilization.

Autocorrelation Reintroduces Spectral Bias in KANs for Time Series Forecasting

cs.LG · 2026-04-26 · unverdicted · novelty 7.0

Temporal autocorrelation reintroduces spectral bias in KANs for time series forecasting, which DCT preprocessing can mitigate.

Latent Space Probing for Adult Content Detection in Video Generative Models

cs.CV · 2026-04-25 · unverdicted · novelty 7.0

Latent space probing on CogVideoX achieves 97.29% F1 for adult content detection on a new 11k-clip dataset with 4-6ms overhead.

Planar Gaussian Splatting with Bilinear Spatial Transformer for Wireless Radiance Field Reconstruction

eess.SP · 2026-04-17 · unverdicted · novelty 7.0

BiSplat-WRF applies 2D planar Gaussians rendered on angular domains plus a bilinear spatial transformer to capture electromagnetic interactions, outperforming prior NeRF and GS methods on SSIM for wireless radiance field reconstruction.

DEMUX: Boundary-Aware Multi-Scale Traffic Demixing for Multi-Tab Website Fingerprinting

cs.CR · 2026-04-17 · unverdicted · novelty 7.0

DEMUX achieves state-of-the-art multi-tab website fingerprinting accuracy by preserving boundary signals, modeling at multiple scales, and associating dispersed traffic fragments with a new three-component architecture.

Beyond Visual Cues: Semantic-Driven Token Filtering and Expert Routing for Anytime Person ReID

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

STFER uses LVLM-generated identity-consistent semantic text to drive visual token filtering and expert routing for improved any-time person re-identification under clothing changes and modality shifts.

CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

CDPR integrates polarization priors into a diffusion-based monocular depth estimator via shared latent space and adaptive gating, outperforming RGB-only methods in challenging scenes.

Multimodal Reasoning with LLM for Encrypted Traffic Interpretation: A Benchmark

cs.CR · 2026-04-09 · unverdicted · novelty 7.0

Creates the BGTD benchmark and mmTraffic architecture to enable explainable multimodal interpretation of encrypted network traffic using LLMs.

SHIELD: A Segmented Hierarchical Memory Architecture for Energy-Efficient LLM Inference on Edge NPUs

cs.AR · 2026-04-08 · unverdicted · novelty 7.0

SHIELD reduces eDRAM refresh energy by 35% for LLM inference on edge NPUs by isolating sign/exponent from mantissa bits, disabling refresh on transient QO mantissas, and relaxing it on persistent KV mantissas while keeping accuracy intact.

LiftFormer: Lifting and Frame Theory Based Monocular Depth Estimation Using Depth and Edge Oriented Subspace Representation

cs.CV · 2026-04-08 · unverdicted · novelty 7.0

LiftFormer transforms monocular depth prediction into depth-oriented geometric and edge-aware subspace representations via lifting and frame theory, achieving state-of-the-art results on standard datasets.

A Clinical Point Cloud Paradigm for In-Hospital Mortality Prediction from Multi-Level Incomplete Multimodal EHRs

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

HealthPoint represents clinical events as points in a 4D space (content, time, modality, case) and applies low-rank relational attention to achieve state-of-the-art mortality prediction from multi-level incomplete multimodal EHRs.

CBEN -- A Multimodal Machine Learning Dataset for Cloud Robust Remote Sensing Image Understanding

cs.CV · 2026-02-13 · accept · novelty 7.0

CBEN provides paired optical-radar images with cloud occlusion, revealing 23-33 point AP drops in clear-sky trained models and 17-29 point relative gains when models are trained on cloudy data.

Debate-Enhanced Pseudo Labeling and Frequency-Aware Progressive Debiasing for Weakly-Supervised Camouflaged Object Detection with Scribble Annotations

cs.CV · 2025-12-23 · unverdicted · novelty 7.0

D³ETOR combines debate-enhanced pseudo labeling from SAM with frequency-aware progressive debiasing in FADeNet to achieve state-of-the-art weakly-supervised camouflaged object detection using scribbles.

ExDoS: Expert-Guided Dual-Focus Cross-Modal Distillation for Smart Contract Vulnerability Detection

cs.CR · 2025-09-12 · unverdicted · novelty 7.0

ExDoS uses expert-guided dual-focus distillation between source semantic graphs and bytecode control-flow graphs plus a dual-attention network to improve smart contract vulnerability detection, reporting 3-6% F1 gains over baselines.

Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence

math.OC · 2025-05-06 · conditional · novelty 7.0

GT-NSGDm achieves the optimal non-asymptotic convergence rate O(1/T^{(p-1)/(3p-2)}) for decentralized nonconvex stochastic optimization under zero-mean heavy-tailed noise with p-th moment.

UniT: Unified Geometry Learning with Group Autoregressive Transformer

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

UniT unifies online and offline 3D geometry perception via a Group Autoregressive Transformer that processes observation groups with anchor-free point map prediction and a scale-adaptive loss.

Memory-Augmented Query Intent Understanding for Efficient Chat-based Image Retrieval

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

MAQIU adds a memorization module and recall mechanism to update query intent dynamically in chat-based image retrieval, cutting FLOPs by 86.4% versus ChatIR while improving results.

CSI-JEPA: Towards Foundation Representations for Ubiquitous Sensing with Minimal Supervision

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

CSI-JEPA learns temporal-spectral representations from unlabeled CSI via masked prediction and achieves up to 10.64 percentage points accuracy gain and 98% label savings on seven real-world Wi-Fi sensing tasks.

LoKA: Low-precision Kernel Applications for Recommendation Models At Scale

cs.LG · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

LoKA enables practical FP8 use in numerically sensitive large recommendation models via online profiling of activations, reusable model modifications for stability, and dynamic kernel dispatching.

Evolving Knowledge Distillation for Lightweight Neural Machine Translation

cs.CL · 2026-05-11 · conditional · novelty 6.0

EKD trains lightweight NMT students progressively from a chain of teachers with rising capacity, achieving BLEU scores within 0.08 of the largest teacher on IWSLT-14.

citing papers explorer

Showing 50 of 88 citing papers.

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling cs.LG · 2026-05-13 · unverdicted · none · ref 3
QLAM extends state-space models with quantum superposition in the hidden state for linear-time long-sequence modeling and reports consistent gains over RNN and transformer baselines on sequential image tasks.
CTQWformer: A CTQW-based Transformer for Graph Classification cs.LG · 2026-05-10 · unverdicted · none · ref 20
CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.
Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems cs.AR · 2026-05-07 · unverdicted · none · ref 50
CAIS delivers 1.38x end-to-end LLM training speedup over NVLS and 1.61x over T3 by making in-switch computing aware of computation memory requirements instead of treating communication as an isolated phase.
Computer-Aided Design Generation by Cascaded Discrete Diffusion Model cs.CV · 2026-05-06 · unverdicted · none · ref 29
Cascaded discrete diffusion generates CAD command sequences with absorbing transitions and parameters with Gaussian, scale-invariant, and prior-preserving kernels, outperforming autoregressive and continuous diffusion baselines on the DeepCAD dataset.
Tempus: A Temporally Scalable Resource-Invariant GEMM Streaming Framework for Versal AI Edge cs.DC · 2026-05-01 · unverdicted · none · ref 25
Tempus delivers 607 GOPS at 10.677 W using fixed 16 AIE cores on Versal AI Edge, with 211.2x better platform-aware utility than spatial SOTA ARIES and zero URAM/DSP utilization.
Autocorrelation Reintroduces Spectral Bias in KANs for Time Series Forecasting cs.LG · 2026-04-26 · unverdicted · none · ref 8
Temporal autocorrelation reintroduces spectral bias in KANs for time series forecasting, which DCT preprocessing can mitigate.
Latent Space Probing for Adult Content Detection in Video Generative Models cs.CV · 2026-04-25 · unverdicted · none · ref 48
Latent space probing on CogVideoX achieves 97.29% F1 for adult content detection on a new 11k-clip dataset with 4-6ms overhead.
Planar Gaussian Splatting with Bilinear Spatial Transformer for Wireless Radiance Field Reconstruction eess.SP · 2026-04-17 · unverdicted · none · ref 17
BiSplat-WRF applies 2D planar Gaussians rendered on angular domains plus a bilinear spatial transformer to capture electromagnetic interactions, outperforming prior NeRF and GS methods on SSIM for wireless radiance field reconstruction.
DEMUX: Boundary-Aware Multi-Scale Traffic Demixing for Multi-Tab Website Fingerprinting cs.CR · 2026-04-17 · unverdicted · none · ref 30
DEMUX achieves state-of-the-art multi-tab website fingerprinting accuracy by preserving boundary signals, modeling at multiple scales, and associating dispersed traffic fragments with a new three-component architecture.
Beyond Visual Cues: Semantic-Driven Token Filtering and Expert Routing for Anytime Person ReID cs.CV · 2026-04-16 · unverdicted · none · ref 13
STFER uses LVLM-generated identity-consistent semantic text to drive visual token filtering and expert routing for improved any-time person re-identification under clothing changes and modality shifts.
CDPR: Cross-modal Diffusion with Polarization for Reliable Monocular Depth Estimation cs.CV · 2026-04-13 · unverdicted · none · ref 58
CDPR integrates polarization priors into a diffusion-based monocular depth estimator via shared latent space and adaptive gating, outperforming RGB-only methods in challenging scenes.
Multimodal Reasoning with LLM for Encrypted Traffic Interpretation: A Benchmark cs.CR · 2026-04-09 · unverdicted · none · ref 33
Creates the BGTD benchmark and mmTraffic architecture to enable explainable multimodal interpretation of encrypted network traffic using LLMs.
SHIELD: A Segmented Hierarchical Memory Architecture for Energy-Efficient LLM Inference on Edge NPUs cs.AR · 2026-04-08 · unverdicted · none · ref 3
SHIELD reduces eDRAM refresh energy by 35% for LLM inference on edge NPUs by isolating sign/exponent from mantissa bits, disabling refresh on transient QO mantissas, and relaxing it on persistent KV mantissas while keeping accuracy intact.
LiftFormer: Lifting and Frame Theory Based Monocular Depth Estimation Using Depth and Edge Oriented Subspace Representation cs.CV · 2026-04-08 · unverdicted · none · ref 25
LiftFormer transforms monocular depth prediction into depth-oriented geometric and edge-aware subspace representations via lifting and frame theory, achieving state-of-the-art results on standard datasets.
A Clinical Point Cloud Paradigm for In-Hospital Mortality Prediction from Multi-Level Incomplete Multimodal EHRs cs.LG · 2026-04-06 · unverdicted · none · ref 30
HealthPoint represents clinical events as points in a 4D space (content, time, modality, case) and applies low-rank relational attention to achieve state-of-the-art mortality prediction from multi-level incomplete multimodal EHRs.
CBEN -- A Multimodal Machine Learning Dataset for Cloud Robust Remote Sensing Image Understanding cs.CV · 2026-02-13 · accept · none · ref 83
CBEN provides paired optical-radar images with cloud occlusion, revealing 23-33 point AP drops in clear-sky trained models and 17-29 point relative gains when models are trained on cloudy data.
Debate-Enhanced Pseudo Labeling and Frequency-Aware Progressive Debiasing for Weakly-Supervised Camouflaged Object Detection with Scribble Annotations cs.CV · 2025-12-23 · unverdicted · none · ref 44
D³ETOR combines debate-enhanced pseudo labeling from SAM with frequency-aware progressive debiasing in FADeNet to achieve state-of-the-art weakly-supervised camouflaged object detection using scribbles.
ExDoS: Expert-Guided Dual-Focus Cross-Modal Distillation for Smart Contract Vulnerability Detection cs.CR · 2025-09-12 · unverdicted · none · ref 51
ExDoS uses expert-guided dual-focus distillation between source semantic graphs and bytecode control-flow graphs plus a dual-attention network to improve smart contract vulnerability detection, reporting 3-6% F1 gains over baselines.
Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence math.OC · 2025-05-06 · conditional · none · ref 14
GT-NSGDm achieves the optimal non-asymptotic convergence rate O(1/T^{(p-1)/(3p-2)}) for decentralized nonconvex stochastic optimization under zero-mean heavy-tailed noise with p-th moment.
UniT: Unified Geometry Learning with Group Autoregressive Transformer cs.CV · 2026-05-20 · unverdicted · none · ref 41
UniT unifies online and offline 3D geometry perception via a Group Autoregressive Transformer that processes observation groups with anchor-free point map prediction and a scale-adaptive loss.
Memory-Augmented Query Intent Understanding for Efficient Chat-based Image Retrieval cs.CV · 2026-05-17 · unverdicted · none · ref 39
MAQIU adds a memorization module and recall mechanism to update query intent dynamically in chat-based image retrieval, cutting FLOPs by 86.4% versus ChatIR while improving results.
CSI-JEPA: Towards Foundation Representations for Ubiquitous Sensing with Minimal Supervision cs.LG · 2026-05-13 · unverdicted · none · ref 38
CSI-JEPA learns temporal-spectral representations from unlabeled CSI via masked prediction and achieves up to 10.64 percentage points accuracy gain and 98% label savings on seven real-world Wi-Fi sensing tasks.
LoKA: Low-precision Kernel Applications for Recommendation Models At Scale cs.LG · 2026-05-11 · unverdicted · none · ref 71 · 2 links
LoKA enables practical FP8 use in numerically sensitive large recommendation models via online profiling of activations, reusable model modifications for stability, and dynamic kernel dispatching.
Evolving Knowledge Distillation for Lightweight Neural Machine Translation cs.CL · 2026-05-11 · conditional · none · ref 2
EKD trains lightweight NMT students progressively from a chain of teachers with rising capacity, achieving BLEU scores within 0.08 of the largest teacher on IWSLT-14.
Generating Roadside LiDAR Datasets from Vehicle-Side Datasets via Novel View Synthesis cs.RO · 2026-05-07 · unverdicted · none · ref 29
VRS generates annotated roadside LiDAR data from vehicle observations via novel view synthesis with geometry completion and occupancy constraints, improving 3D object detection generalization.
Accelerating MoE with Dynamic In-Switch Computing on Multi-GPUs cs.AR · 2026-05-07 · unverdicted · none · ref 52
DySHARP accelerates MoE expert parallelism via dynamic multimem addressing and token-centric kernel fusion to cut redundant traffic and deliver up to 1.79x speedup over prior in-switch solutions.
Text-to-CAD Retrieval: a Strong Baseline cs.CV · 2026-05-07 · unverdicted · none · ref 42
Text-to-CAD retrieval is introduced as a cross-modal task with a baseline that learns joint embeddings from CAD construction sequences, point clouds, and text queries via a masked feature decoder.
Stage Light is Sequence$^2$: Multi-Light Control via Imitation Learning cs.MM · 2026-05-05 · unverdicted · none · ref 12
SeqLight maps music to multi-light HSV control via SkipBART for global color prediction followed by hybrid imitation learning in a goal-conditioned MDP to decompose colors across lights.
RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation cs.CV · 2026-04-30 · unverdicted · none · ref 30
RIHA proposes a hierarchical alignment transformer that uses multi-scale visual and textual feature pyramids plus optimal transport to generate more accurate radiology reports from medical images.
DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference cs.DC · 2026-04-29 · unverdicted · none · ref 15
DUAL-BLADE uses a dual-path KV-cache framework with NVMe-direct access to reduce prefill and decode latency by up to 33% and 42% while improving SSD utilization 2.2x under tight memory budgets.
VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection cs.CR · 2026-04-29 · unverdicted · none · ref 8
VulStyle pre-trains on 4.9M functions using code, non-terminal ASTs, and stylometry features, then fine-tunes to achieve SOTA F1 gains of 4-48% on BigVul and VulDeePecker.
FusionCIM: Accelerating LLM Inference with Fusion-Driven Computing-in-Memory Architecture cs.AR · 2026-04-28 · unverdicted · none · ref 15
FusionCIM is a fusion-driven CIM accelerator for LLM inference that maps QKT to IP-CIM and PV to OP-CIM, uses QO-stationary dataflow, and applies pattern-aware online softmax, delivering up to 3.86x energy savings and 1.98x speedup on LLaMA-3 at 29.4 TOPS/W.
BridgeACT: Bridging Human Demonstrations to Robot Actions via Unified Tool-Target Affordances cs.RO · 2026-04-25 · unverdicted · none · ref 25
BridgeACT learns robot manipulation from human videos alone by predicting task-relevant grasp regions and 3D motion affordances that map directly to robot controllers.
SparKV: Overhead-Aware KV Cache Loading for Efficient On-Device LLM Inference cs.NI · 2026-04-23 · unverdicted · none · ref 28
SparKV reduces time-to-first-token by 1.3x-5.1x and energy use by 1.5x-3.3x for on-device LLM inference by adaptively choosing between cloud KV streaming and local computation while overlapping execution and adjusting for runtime conditions.
Lossless Compression via Chained Lightweight Neural Predictors with Information Inheritance cs.IT · 2026-04-16 · unverdicted · none · ref 14
A new chain of lightweight neural predictors with information inheritance achieves near state-of-the-art lossless compression ratios while delivering 1.2-6.3x faster encoding and 2.8-12.3x faster decoding than PAC on GPUs.
Boundary-Centric Active Learning for Temporal Action Segmentation cs.CV · 2026-04-16 · unverdicted · none · ref 40
B-ACT improves label efficiency in temporal action segmentation by selecting only boundary frames for annotation via a two-stage uncertainty-driven process that fuses neighborhood uncertainty, class ambiguity, and temporal dynamics.
Expressivity of Transformers: A Tropical Geometry Perspective cs.LG · 2026-04-16 · unverdicted · none · ref 1
Self-attention in transformers corresponds exactly to Power Voronoi diagrams under tropical geometry, yielding tight bounds of Theta(N to the power of d_model times L) linear regions.
Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus cs.LG · 2026-04-15 · conditional · none · ref 9
CMAT uses a transformer decoder to produce a high-level consensus vector in latent space, enabling simultaneous order-independent actions by all agents and optimization via single-agent PPO, with superior results on StarCraft II, Multi-Agent MuJoCo, and Google Research Football.
Frequency-aware Decomposition Learning for Sensorless Wrench Forecasting on a Vibration-rich Hydraulic Manipulator cs.RO · 2026-04-14 · unverdicted · none · ref 27
FDN uses spectral decomposition, asymmetric heads for deterministic and probabilistic wrench components, and frequency-aware filtering to forecast high-frequency wrench from proprioception, outperforming baselines on hydraulic manipulator grinding data after pretraining and transfer.
CODO: An Automated Compiler for Comprehensive Dataflow Optimization cs.AR · 2026-04-14 · unverdicted · none · ref 38
CODO automates comprehensive dataflow optimization on FPGAs, achieving 1.45x-4.52x speedups on kernels and up to 33.8x on DNN models over state-of-the-art frameworks.
VLMaterial: Vision-Language Model-Based Camera-Radar Fusion for Physics-Grounded Material Identification eess.SP · 2026-04-13 · unverdicted · none · ref 42
VLMaterial fuses VLMs and physics-based radar analysis via PRCA extraction and context-augmented generation to reach 96.08% material identification accuracy on 41 everyday objects without task-specific training.
The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems cs.CR · 2026-04-13 · unverdicted · none · ref 40
Salami Attack chains low-risk inputs to cumulatively trigger high-risk LLM behaviors, achieving over 90% success on GPT-4o and Gemini while resisting some defenses.
MAG-Net: Physics-Aware Multi-Modal Fusion of Geostationary Satellite and Radar for Severe Convective Precipitation Nowcasting physics.ao-ph · 2026-04-03 · unverdicted · none · ref 16
MAG-Net integrates radar dynamics with satellite IR, WV, and BTD channels via dual-stream encoding and uncertainty-weighted decoding to raise CSI40 by 0.083 over prior baselines for intense convective events.
Light-ResKAN: A Parameter-Sharing Lightweight KAN with Gram Polynomials for Efficient SAR Image Recognition cs.CV · 2026-04-02 · unverdicted · none · ref 17
Light-ResKAN reaches 99.09% accuracy on MSTAR SAR images with 82.9 times fewer FLOPs and 163.78 times fewer parameters than VGG16 by combining KAN convolutions, Gram polynomials, and channel-wise parameter sharing.
VAN-AD: Visual Masked Autoencoder with Normalizing Flow For Time Series Anomaly Detection cs.LG · 2026-03-27 · unverdicted · none · ref 17
VAN-AD adapts a pretrained visual MAE with distribution mapping and normalizing flow modules to detect anomalies in time series data more effectively across different datasets.
UniMamba: A Unified Spatial-Temporal Modeling Framework with State-Space and Attention Integration cs.LG · 2026-03-06 · unverdicted · none · ref 12
UniMamba integrates Mamba state-space dynamics with attention layers and transforms like FFT-Laplace to outperform prior models on multivariate time series forecasting benchmarks.
Rethinking Efficiency in Neural Combinatorial Optimization: Batched Preference Optimization with Mamba cs.LG · 2026-02-24 · unverdicted · none · ref 15
ECO uses supervised warm-up plus iterative batched DPO on a Mamba backbone to reach top neural performance on TSP and CVRP while lowering memory growth and raising throughput.
Attention-Based Neural-Augmented Kalman Filter for Legged Robot State Estimation cs.RO · 2026-01-26 · unverdicted · none · ref 21
AttenNKF augments InEKF with an attention-based neural compensator trained in latent space to correct foot-slip errors in legged robot state estimation.
BERTO: Intent-Driven Network Time Series Forecasting via Natural Language Operator Preferences cs.LG · 2025-12-05 · unverdicted · none · ref 12
BERTO introduces a prompt-conditioned BERT framework for cellular traffic forecasting that uses a balancing loss to enable flexible trade-offs between power consumption and SLA violations using natural language inputs.
Learning A Unified Risk Map for Autonomous Driving in Partially Observable Environments cs.RO · 2026-05-21 · unverdicted · none · ref 31
The paper proposes a unified risk map modeling and learning framework integrated with diffusion-based adversarial scenario generation for risk-aware planning in partially observable autonomous driving, demonstrating improved time-to-collision metrics on the Waymo Open Motion Dataset.
