RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
13 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary: use method 1
representative citing papers
MM-Eval unifies evaluation of multimodal summaries by integrating factual text quality, cross-modal relevance via MLLM judge, and visual diversity via truncated CLIP entropy, then calibrates their combination on human preferences.
MIRL uses mutual information to guide trajectory selection and provide separate rewards for visual perception in RLVR for VLMs, achieving 70.22% average accuracy with 25% fewer full trajectories.
LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.
Low-bit post-training quantization of reasoning LLMs increases reasoning token counts while preserving accuracy, introducing a hidden test-time compute cost.
Introduces a representation-geometry-based taxonomy for continual learning in speech and audio, identifies mismatches with current CL assumptions in foundation models, and lists open challenges.
Mean-field theory of dropout at the edge of chaos derives scaling laws showing front-loaded schedules outperform constant dropout by shifting the perfect-alignment fixed point.
SPeCTrA-Sum uses hierarchical cross-modal fusion via DVP and DPP-distilled image selection via VRP to generate more accurate and visually grounded multimodal summaries.
ViSA-R2 recovers single executable SymPy expressions for linear steady-state fields from visualizations using a self-verifying chain-of-thought that recognizes patterns, hypothesizes solution families, derives parameters, and checks consistency.
VCON is a unified framework for smooth iterative DNN compression that uses parallel execution and an affine combination to progressively replace the original model with its compressed form during fine-tuning.
KLR Hopfield networks exhibit robustness to quantization but sensitivity to pruning, interpreted as arising from dense bimodal parameterization of sparse input mappings.
Gated-SwinRMT unifies Swin windowed attention with retentive Manhattan decay via gating, reaching 80.22% top-1 accuracy on Mini-ImageNet versus 73.74% for the RMT baseline.
A DenseNet201 base model trained on a constructed plant leaf disease dataset outperforms baselines and enables faster, more robust transfer learning with less data than general models.
citing papers explorer
- Measuring What Matters Beyond Text: Evaluating Multimodal Summaries by Quality, Alignment, and Diversity
  MM-Eval unifies evaluation of multimodal summaries by integrating factual text quality, cross-modal relevance via MLLM judge, and visual diversity via truncated CLIP entropy, then calibrates their combination on human preferences.
- Quantization Inflates Reasoning: Token Inflation as a Hidden Cost of Low-Bit Reasoning Models
  Low-bit post-training quantization of reasoning LLMs increases reasoning token counts while preserving accuracy, introducing a hidden test-time compute cost.
- Towards Visually Grounded Multimodal Summarization via Cross-Modal Transformer and Gated Attention
  SPeCTrA-Sum uses hierarchical cross-modal fusion via DVP and DPP-distilled image selection via VRP to generate more accurate and visually grounded multimodal summaries.
- Hidden in Plain Sight: Visual-to-Symbolic Analytical Solution Inference from Field Visualizations
  ViSA-R2 recovers single executable SymPy expressions for linear steady-state fields from visualizations using a self-verifying chain-of-thought that recognizes patterns, hypothesizes solution families, derives parameters, and checks consistency.