In high-dimensional continual linear regression, the optimal fixed L2 regularization strength scales as T/ln T in the number of tasks T and mitigates label noise for arbitrary linear teachers.
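To give a feel for the T/ln T scaling, here is a minimal sketch that evaluates it for a few task counts. The function name `reg_strength` and the constant `c` are assumptions introduced for illustration; the summary above states only the scaling, not a constant or a specific formula for the regularizer.

```python
import math


def reg_strength(num_tasks: int, c: float = 1.0) -> float:
    """Return c * T / ln(T) for T = num_tasks.

    `c` is a hypothetical proportionality constant (not given in the source).
    """
    if num_tasks < 2:
        # ln(1) = 0, so the expression is only defined for T >= 2.
        raise ValueError("need at least 2 tasks so that ln(T) > 0")
    return c * num_tasks / math.log(num_tasks)


# Print the scaling for a few task counts.
for T in (10, 100, 1000):
    print(T, reg_strength(T))
```

Under this scaling the strength grows slightly slower than linearly: the ratio to T equals c/ln T, which shrinks as more tasks are added.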
Canonical reference
90% of citing Pith papers cite this work as background.
representative citing papers
Empirical study on production-scale clinical NLP shows direct learning from verifier rejections fails due to sparse data while fixed ontology and evidence-support filters succeed, with selectivity determined by matching verifier evidence.
CARLOS employs an aggregate deep neural network trained on progressively finer time grids with adaptive sampling to learn continuous-time exercise boundaries for optimal stopping, delivering higher values than discrete Bermudan methods.
EST-PRM stress-tests five PRM models on 4,687 reasoning chains from MATH-500, GSM8K, and PRMBench using three label-preserving transformations and reports model-specific vulnerability patterns.
Formalizes continual model routing (CMR), releases CMRBench with over 2000 models, and presents CARvE which outperforms retrieval, fine-tuning and adapter-merging baselines on model/family/domain accuracy.
Introduces Unlearning Depth Score (UDS) via activation patching to quantify LLM unlearning depth and claims it outperforms 20 other metrics in faithfulness and robustness on 150 models.
Introduces a unified benchmark for continual anomaly detection with discrete and continuous protocols plus a training-free DINOSaur method that outperforms prior CAD approaches with zero forgetting and sub-100ms edge inference.
PMF-CL derives Pareto-minimal-forgetting algorithms for linear/basis-function regression and quadratic-bounded losses like logistic regression, achieving static O(d²) memory for d-parameter models.
Formalizes Reasoning Portability (RP) and proposes RDB-CL to modulate per-sample KL regularization in RLVR for MLLM continual learning, achieving +12.0% Last accuracy over vanilla RLVR baseline by preserving reusable reasoning on high-RP samples.
Pre-pretraining on MP-STRUCT matches k-Shuffle Dyck baselines in efficiency while adding human-like resistance to implausible languages and challenges the need for C-RASP definability in effective PPT languages.
Fuzzy ARTMAP models are highly vulnerable to a new white-box attack aligned with their category competition, but progressive selective training yields stronger replay-free robustness than offline adversarial training under adaptive evaluation.
A cross-version swap protocol reveals dominant skills that swing composition success by up to 50 percentage points, and an atomic probe with selective revalidation governs updates at lower cost than always re-testing full compositions.
Preconditioned delta-rule models with a diagonal curvature approximation improve upon standard DeltaNet, GDN, and KDA by better approximating the test-time regression objective.
CapTrack shows post-training causes drift beyond facts, with instruction fine-tuning producing stronger behavioral changes than preference optimization across model families.
LoRA adapters should be scaled by 1/sqrt(rank) rather than 1/rank to stabilize learning and enable effective use of higher ranks during fine-tuning of large language models.
Presents a deterministic episodic-to-semantic consolidation function, with a structural lemma proving identity invariance, and demonstrates it in synthetic experiments on an embodied service agent.
IPLoc-ID extends prior localization-only work to full identification and localization by using a self-posed query in VLMs to reject negative images while preserving comparable localization accuracy.
Quality-aware self-distillation using soft correctness-aware gating and teacher-probability scaling improves VLM performance on GUI grounding benchmarks when both components are combined.
RLVR exhibits correct-set turnover where solved problems regress during training, and a periodic review mechanism exploiting a repair-window principle improves retention and performance over baselines.
Structured text representations like CML and MolJSON outperform SMILES variants on structural tasks while IUPAC dominates semantic tasks such as molecule retrieval across all tested LLMs.
AdvCL repurposes adversarial perturbations into geometric control signals for continual learning using Intra-Smooth, Proto-Clip, and Inter-Align modules, reporting gains in performance, robustness, lower forgetting, and stronger transfer.
PROXYMIX learns a dynamic replay controller on a small proxy model and transfers it to a large target model, improving accuracy by 3.4 points and reducing forgetting by 3.5 points on LLaMA-3-8B continual tuning sequences.
Divergence Decoding steers LLM logits using small auxiliary models to unlearn specific data at inference time, outperforming baselines and generalizing to images.
Introduces 9 synthetic annotation tasks and benchmarks for behavioral cloning, finding hierarchical skill learning, scaling benefits, effective multi-task pretraining, and shared internal representations of task phases and mistakes.
citing papers explorer
EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure
EASE closes three residual anchors in federated multimodal unlearning using bilateral displacement, cosine-sine decomposition, and forget lock, achieving near-retrain performance on forget and retain data.