TokenSkip: Controllable Chain-of-Thought Compression in LLMs

Xia, Heming; Leong, Chak Tou; Wang, Wenjie; Li, Yongqi; Li, Wenjie · 2025 · DOI 10.18653/v1/2025.emnlp-main.165

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

State commitment learning: training language models to distinguish computation from memory

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Introduces state commitment learning and Counterfactual Erasure RL (CERL) to train models to commit only persistent state, reducing answer dependence on hidden thoughts across math, logic, QA, and tool-use tasks without accuracy loss.

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

cs.LG · 2026-06-04 · unverdicted · novelty 6.0

Post-hoc model-based compression of reasoning traces cuts training tokens to 12-30% and speeds training 2-7.6x while retaining up to 96% of raw-trace accuracy, though raw traces remain superior at every scale.

CAT: Confidence-Adaptive Thinking for Efficient Reasoning of Large Reasoning Models

cs.CL · 2026-07-01 · unverdicted · novelty 5.0

CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.

citing papers explorer

State commitment learning: training language models to distinguish computation from memory

cs.LG · 2026-05-22 · unverdicted · none · ref 16

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

cs.LG · 2026-06-04 · unverdicted · none · ref 38

CAT: Confidence-Adaptive Thinking for Efficient Reasoning of Large Reasoning Models

cs.CL · 2026-07-01 · unverdicted · none · ref 9
