In high-dimensional continual linear regression, the optimal fixed L2 regularization strength scales as T/ln T in the number of tasks T and mitigates label noise for arbitrary linear teachers.
Canonical reference. 90% of citing Pith papers cite this work as background.
representative citing papers
CARLOS employs an aggregate deep neural network trained on progressively finer time grids with adaptive sampling to learn continuous-time exercise boundaries for optimal stopping, delivering higher values than discrete Bermudan methods.
EST-PRM stress-tests five PRM models on 4,687 reasoning chains from MATH-500, GSM8K, and PRMBench using three label-preserving transformations and reports model-specific vulnerability patterns.
The paper formalizes Reasoning Portability (RP) and proposes RDB-CL, which modulates per-sample KL regularization in RLVR for MLLM continual learning, achieving +12.0% Last accuracy over the vanilla RLVR baseline by preserving reusable reasoning on high-RP samples.
Pre-pretraining on MP-STRUCT matches k-Shuffle Dyck baselines in efficiency while adding human-like resistance to implausible languages, and it challenges the need for C-RASP definability in effective PPT languages.
Fuzzy ARTMAP models are highly vulnerable to a new white-box attack aligned with their category competition, but progressive selective training yields stronger replay-free robustness than offline adversarial training under adaptive evaluation.
A cross-version swap protocol reveals dominant skills that swing composition success by up to 50 percentage points, and an atomic probe with selective revalidation governs updates at lower cost than always re-testing full compositions.
Preconditioned delta-rule models with a diagonal curvature approximation improve upon standard DeltaNet, GDN, and KDA by better approximating the test-time regression objective.
CapTrack shows post-training causes drift beyond facts, with instruction fine-tuning producing stronger behavioral changes than preference optimization across model families.
LoRA adapters should be scaled by 1/sqrt(rank) rather than 1/rank to stabilize learning and enable effective use of higher ranks during fine-tuning of large language models.
Quality-aware self-distillation using soft correctness-aware gating and teacher-probability scaling improves VLM performance on GUI grounding benchmarks when both components are combined.
RLVR exhibits correct-set turnover where solved problems regress during training, and a periodic review mechanism exploiting a repair-window principle improves retention and performance over baselines.
Structured text representations like CML and MolJSON outperform SMILES variants on structural tasks while IUPAC dominates semantic tasks such as molecule retrieval across all tested LLMs.
AdvCL repurposes adversarial perturbations into geometric control signals for continual learning using Intra-Smooth, Proto-Clip, and Inter-Align modules, reporting gains in performance, robustness, lower forgetting, and stronger transfer.
PROXYMIX learns a dynamic replay controller on a small proxy model and transfers it to a large target model, improving accuracy by 3.4 points and reducing forgetting by 3.5 points on LLaMA-3-8B continual tuning sequences.
Divergence Decoding steers LLM logits using small auxiliary models to unlearn specific data at inference time, outperforming baselines and generalizing to images.
2D-ProteinRAG is a dual-dimensional RAG framework that incorporates BLAST workflows plus horizontal attribute alignment and vertical homology denoising to improve protein-text QA on both in-distribution and out-of-distribution cases.
Silent collapse in recursive learning contracts internal distributional measures such as entropy and diversity despite stable metrics; it is preceded by three precursors that enable the MTR monitoring framework to intervene early.
OP-Mix is an on-policy data mixing method that uses low-rank adapter interpolation to find near-optimal data mixtures throughout language model training with reduced compute.
SLICE applies gradient surgery via projection and truncated SVD to initialize LoRA adapters, yielding better stability-plasticity trade-offs on continual learning benchmarks including adversarial task sequences.
Early mixing of post-training data into pretraining improves retention of acquired capabilities after subsequent fine-tuning in language models.
Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.
DynaMiCS uses short probing runs to build a slope matrix of cross-domain effects and solves a constrained optimization over mixture weights to improve targets while respecting performance bounds on constrained domains.
Self-evolving LLM agents exhibit capability erosion under continual adaptation, which Capability-Preserving Evolution mitigates by raising retained simple-task performance from 41.8% to 52.8% in workflow evolution under GPT-5.1.
citing papers explorer
- Streaming Adversarial Robustness in Fuzzy ARTMAP: Mechanism-Aligned Evaluation, Progressive Training, and Interpretable Diagnostics
  Fuzzy ARTMAP models are highly vulnerable to a new white-box attack aligned with their category competition, but progressive selective training yields stronger replay-free robustness than offline adversarial training under adaptive evaluation.
- Atomic-Probe Governance for Skill Updates in Compositional Robot Policies
  A cross-version swap protocol reveals dominant skills that swing composition success by up to 50 percentage points, and an atomic probe with selective revalidation governs updates at lower cost than always re-testing full compositions.
- Not How Many, But Which: Parameter Placement in Low-Rank Adaptation
  Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.
- DynaMiCS: Fine-tuning LLMs with Performance Constraints using Dynamic Mixtures
  DynaMiCS uses short probing runs to build a slope matrix of cross-domain effects and solves a constrained optimization over mixture weights to improve targets while respecting performance bounds on constrained domains.
- Do Self-Evolving Agents Forget? Capability Degradation and Preservation in Lifelong LLM Agent Adaptation
  Self-evolving LLM agents exhibit capability erosion under continual adaptation, which Capability-Preserving Evolution mitigates by raising retained simple-task performance from 41.8% to 52.8% in workflow evolution under GPT-5.1.
- RareCP: Regime-Aware Retrieval for Efficient Conformal Prediction
  RareCP improves interval efficiency for time series conformal prediction by retrieving and weighting regime-specific calibration examples while adapting to drift and maintaining coverage.
- EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure
  EASE closes three residual anchors in federated multimodal unlearning using bilateral displacement, cosine-sine decomposition, and forget lock, achieving near-retrain performance on forget and retain data.
- The Dynamic Gist-Based Memory Model (DGMM): A Memory-Centric Architecture for Artificial Intelligence
  DGMM is proposed as an explicit graph-structured memory architecture for AI that enables persistent episodic memory, cue-based recall, and context-dependent interpretation without retraining.
- Efficient Task Adaptation in Large Language Models via Selective Parameter Optimization
  The paper claims a selective fine-tuning method that identifies and freezes core parameters to mitigate catastrophic forgetting in LLMs while improving domain adaptation, shown in experiments with GPT-J and LLaMA-3.