and Tao, Dacheng , year=

Gou, J · 2021 · DOI 10.1007/s11263-021-01453-z

8 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

HASTE: A Framework for Training-Free, Dynamic, and Steerable Compression of Pre-Trained Convolutional Neural Networks

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

HASTE enables training-free dynamic compression of pre-trained CNNs by patch-wise LSH-based merging of redundant channels, reporting 46.2% FLOPs reduction on ResNet34 CIFAR-10 with 1.25% accuracy drop.

Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAML meta-learns a progressively refined inductive bias from active-learning queries to improve robustness to spurious correlations, reporting accuracy gains on minority groups across several benchmarks.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning

astro-ph.GA · 2026-04-28 · unverdicted · novelty 7.0

A single end-to-end Transformer model unifies stellar labels from heterogeneous spectroscopic surveys into a self-consistent scale without post-hoc recalibration.

CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation

cs.CL · 2025-02-28 · unverdicted · novelty 7.0

CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.

Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

SkyPart achieves state-of-the-art single-pass cross-view geo-localization on SUES-200, University-1652, and DenseUAV by using prototype-based part discovery, altitude-conditioned modulation, and Kendall-weighted loss, with widening gains under weather corruptions.

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

cs.CL · 2023-05-22 · unverdicted · novelty 6.0

Uptraining multi-head transformer checkpoints to grouped-query attention models achieves near multi-head quality at multi-query inference speeds using 5% additional compute.

Query-efficient model evaluation using cached responses

cs.LG · 2026-05-08

citing papers explorer

HASTE: A Framework for Training-Free, Dynamic, and Steerable Compression of Pre-Trained Convolutional Neural Networks · cs.CV · 2026-06-29 · unverdicted · none · ref 33
Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations · cs.LG · 2026-05-20 · unverdicted · none · ref 95
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost · cs.AI · 2026-05-07 · conditional · none · ref 46
Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning · astro-ph.GA · 2026-04-28 · unverdicted · none · ref 25
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation · cs.CL · 2025-02-28 · unverdicted · none · ref 96
Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery · cs.CV · 2026-05-12 · unverdicted · none · ref 22 · 2 links
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints · cs.CL · 2023-05-22 · unverdicted · none · ref 44
Query-efficient model evaluation using cached responses · cs.LG · 2026-05-08 · unreviewed · ref 156
