Gradient-based learning applied to document recognition
7 Pith papers cite this work.
citing papers explorer
- We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
  WE-MATH benchmark reveals most LMMs rely on rote memorization for visual math while GPT-4o has shifted toward knowledge generalization.
- High Probability Guarantees for Random Reshuffling
  High-probability ergodic and last-iterate complexity guarantees for random reshuffling SGD on smooth nonconvex optimization that match best in-expectation bounds up to logarithmic factors without extra assumptions.
- Federated Distillation on Edge Devices: Efficient Client-Side Filtering for Non-IID Data
  EdgeFD uses a KMeans-based client-side filter to improve federated distillation accuracy close to IID levels on non-IID data distributions for resource-constrained edge devices.
- Efficient compression of neural networks and datasets
  Refined probabilistic and smooth l0 pruning techniques approximate minimum description length for neural networks, achieving high compression with minimal accuracy loss and empirically verifying better sample efficiency and generalization on image and text tasks.
- Image Classification with Hierarchical Multigraph Networks
  Hierarchical multigraph GCNs applied to superpixels achieve competitive or superior accuracy to CNNs on standard image classification benchmarks.
- Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
  Position paper claiming that distributed training across massive edge devices can overcome data depletion and centralized compute monopolies in LLM scaling.
- Machine Unlearning: A Comprehensive Survey
  A survey classifying machine unlearning into centralized (exact and approximate), distributed/irregular data, verification, and privacy/security categories with technique overviews.