In high-dimensional continual linear regression, the optimal fixed L2 regularization strength scales as T/ln T in the number of tasks T and mitigates label noise for arbitrary linear teachers.
Canonical reference. 90% of citing Pith papers cite this work as background.
representative citing papers
CARLOS employs an aggregate deep neural network trained on progressively finer time grids with adaptive sampling to learn continuous-time exercise boundaries for optimal stopping, delivering higher values than discrete Bermudan methods.
EST-PRM stress-tests five PRM models on 4,687 reasoning chains from MATH-500, GSM8K, and PRMBench using three label-preserving transformations and reports model-specific vulnerability patterns.
Formalizes continual model routing (CMR), releases CMRBench with over 2000 models, and presents CARvE which outperforms retrieval, fine-tuning and adapter-merging baselines on model/family/domain accuracy.
Introduces Unlearning Depth Score (UDS) via activation patching to quantify LLM unlearning depth and claims it outperforms 20 other metrics in faithfulness and robustness on 150 models.
Introduces a unified benchmark for continual anomaly detection with discrete and continuous protocols plus a training-free DINOSaur method that outperforms prior CAD approaches with zero forgetting and sub-100ms edge inference.
PMF-CL derives Pareto-minimal-forgetting algorithms for linear/basis-function regression and quadratic-bounded losses like logistic regression, achieving static O(d²) memory for d-parameter models.
Formalizes Reasoning Portability (RP) and proposes RDB-CL to modulate per-sample KL regularization in RLVR for MLLM continual learning, achieving +12.0% Last accuracy over the vanilla RLVR baseline by preserving reusable reasoning on high-RP samples.
Pre-pretraining on MP-STRUCT matches k-Shuffle Dyck baselines in efficiency while adding human-like resistance to implausible languages and challenges the need for C-RASP definability in effective PPT languages.
Fuzzy ARTMAP models are highly vulnerable to a new white-box attack aligned with their category competition, but progressive selective training yields stronger replay-free robustness than offline adversarial training under adaptive evaluation.
A cross-version swap protocol reveals dominant skills that swing composition success by up to 50 percentage points, and an atomic probe with selective revalidation governs updates at lower cost than always re-testing full compositions.
Preconditioned delta-rule models with a diagonal curvature approximation improve upon standard DeltaNet, GDN, and KDA by better approximating the test-time regression objective.
CapTrack shows post-training causes drift beyond facts, with instruction fine-tuning producing stronger behavioral changes than preference optimization across model families.
LoRA adapters should be scaled by 1/sqrt(rank) rather than 1/rank to stabilize learning and enable effective use of higher ranks during fine-tuning of large language models.
Quality-aware self-distillation using soft correctness-aware gating and teacher-probability scaling improves VLM performance on GUI grounding benchmarks when both components are combined.
RLVR exhibits correct-set turnover where solved problems regress during training, and a periodic review mechanism exploiting a repair-window principle improves retention and performance over baselines.
Structured text representations like CML and MolJSON outperform SMILES variants on structural tasks while IUPAC dominates semantic tasks such as molecule retrieval across all tested LLMs.
AdvCL repurposes adversarial perturbations into geometric control signals for continual learning using Intra-Smooth, Proto-Clip, and Inter-Align modules, reporting gains in performance, robustness, lower forgetting, and stronger transfer.
PROXYMIX learns a dynamic replay controller on a small proxy model and transfers it to a large target model, improving accuracy by 3.4 points and reducing forgetting by 3.5 points on LLaMA-3-8B continual tuning sequences.
Divergence Decoding steers LLM logits using small auxiliary models to unlearn specific data at inference time, outperforming baselines and generalizing to images.
Introduces 9 synthetic annotation tasks and benchmarks for behavioral cloning, finding hierarchical skill learning, scaling benefits, effective multi-task pretraining, and shared internal representations of task phases and mistakes.
TypedCSIP applies typed counterfactual selective intervention pretraining on expert revisions to lift macro-F1 by 0.9-1.3 pp on the LCR-CN Chinese legislative conflict classification benchmark under a pre-registered multi-seed test.
Introduces a representation-geometry-based taxonomy for continual learning in speech and audio, identifies mismatches with current CL assumptions in foundation models, and lists open challenges.
2D-ProteinRAG is a dual-dimensional RAG framework that incorporates BLAST workflows plus horizontal attribute alignment and vertical homology denoising to improve protein-text QA on both in-distribution and out-of-distribution cases.
citing papers explorer
- CLIMB: Centroid-Based Hierarchical Memory for Online Continual Self-Supervised Learning
CLIMB uses a bounded hierarchical centroid memory with knowledge distillation to outperform prior OCSSL methods on Split CIFAR-100 and Split ImageNet-100 including irregular task distributions.
- EVAF: A Test-Retest Protocol for Selective Parametric Consolidation
EVAF and test-retest protocol show selective parametric consolidation of high-valence experiences in GPT-2 and TinyLlama while preserving factual retrieval.
- ROMEVA: Geometry-Preserving Vocabulary Expansion for Roman Urdu Language Models
ROMEVA stabilizes embeddings during vocabulary expansion for Roman Urdu but naive fine-tuning outperforms it on sentiment classification, showing a disconnect between stability and task results.
- Fine-tuning MLIP foundation models: strategies for accuracy and transferability
Systematic tests show naive fine-tuning excels for single-task accuracy while multihead replay best preserves out-of-distribution robustness in MLIP adaptation.
- TaskFusion: Continual Anomaly Detection for Heterogeneous Tabular Data
TaskFusion combines AGF feature mapping, cross-task augmentation, and distilled replay for continual anomaly detection on heterogeneous tabular data, reporting gains over baselines on 21 datasets.
- Non-Forgetting Knowledge Allocation with Bi-level Competition for Class-Incremental Learning
NoFA-BC proposes a non-forgetting allocator using recursive least-squares and bi-level competition for improved knowledge allocation in class-incremental learning.
- Audio Deepfake Detection with Half-Truth Localisation Using Cross-Attentive Feature Fusion
CAFNet performs joint ternary classification and temporal boundary regression for half-truth audio deepfakes via cross-attentive fusion of MFCC, LFCC, and Chroma-STFT features, reporting 92.71% accuracy and 0.075s MAE on MLADDC T2+T3.
- Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting
A plug-and-play KL regularizer masks the target token and renormalizes probabilities to improve the learning-forgetting trade-off in LoRA adaptation of LLMs.
- Adapting Automotive Aerodynamics Surrogates to New Vehicle Families via Transfer Learning
LoRA adapters enable a 61.47M-parameter aerodynamics Transformer pretrained on four vehicle families to adapt to a held-out fifth family with 20 samples, reaching R²=0.85 and outperforming full fine-tuning and from-scratch training with 3x more data.
- Forgetting in Language Models: Capacity, Optimization, and Self-Generated Replay
Self-generated replay from language models nearly eliminates catastrophic forgetting during finetuning except when models are pretrained close to saturation.
- Reinforcement Learning Improves LLM Accuracy and Reasoning in Disease Classification from Radiology Reports
SFT followed by GRPO improves LLM accuracy and reasoning recall in disease classification from radiology reports on three radiologist-annotated datasets.
- Reward-Free Code Alignment from Pretrained or Fine-Tuned LLM: Unpacking the Trade-offs for Code Generation
Empirical study on five LLMs finds pretrained-to-aligned paths yield bigger gains over baseline than finetuned-to-aligned paths, though absolute accuracy remains lower for pretrained starts.
- RECALL: Recovery Experience Collection for Active Lifelong Learning in Vision-Language-Action Models
RECALL introduces uncertainty-guided active data collection for continual fine-tuning of VLAs, showing efficiency gains over passive imitation but requiring replay or regularization to mitigate catastrophic forgetting.
- What Shapes Emergent Misalignment? Insights from Training Dynamics, Model Priors, and Data
Empirical study finds that pre-fine-tuning model activations predict post-fine-tuning alignment scores and that activation deltas show moderate-to-high subspace overlap between training and evaluation data.
- CLaaS: Continual learning as a service for sample efficient online learning
CLaaS enables sample-efficient online continual learning for agents via replay-buffered parametric updates, outperforming in-context learning in forward transfer and retention on an adversarial task.
- RIZZ: Routing Interactions to Near Zero-Interference Zones for Continual Adaptation of Black-Box Agents
RIZZ is a continual adaptation framework for black-box LLM agents that uses dynamically spawned memory branches, context-aware routing, verifier-gated updates, and prompt compilation to control interference across nonstationary inputs.
- Gyan: An Explainable Neuro-Symbolic Language Model
Gyan is a novel explainable non-transformer language model that achieves SOTA results on multiple datasets by mimicking human-like compositional context and world models.
- MPCS: Neuroplastic Continual Learning via Multi-Component Plasticity and Topology-Aware EWC
MPCS integrates eleven plasticity mechanisms and reaches a Normalized Efficiency Score of 94.2 on a 31-task benchmark, with ablations showing that removing EWC and Hebbian updates yields higher performance at lower cost.
- Transparent and Controllable Recommendation Filtering via Multimodal Multi-Agent Collaboration
A multi-agent multimodal system with fact-grounded adjudication and a dynamic two-tier preference graph cuts false positives in content filtering by 74.3% and nearly doubles F1-score versus text-only baselines while supporting user-driven Delta adjustments.
- Adaptive Unknown Fault Detection and Few-Shot Continual Learning for Condition Monitoring in Ultrasonic Metal Welding
The method detects unknown faults in ultrasonic metal welding at 96% accuracy and incorporates new fault types from only five labeled samples to reach 98% classification accuracy.
- How Complexity Contributes to Learning Opacity in Machine Learning
Neural network learning opacity stems from three dynamical complexity properties in training, rendering some sources of opacity irreducible.
- The Dynamic Gist-Based Memory Model (DGMM): A Memory-Centric Architecture for Artificial Intelligence
DGMM is proposed as an explicit graph-structured memory architecture for AI that enables persistent episodic memory, cue-based recall, and context-dependent interpretation without retraining.
- Efficient Task Adaptation in Large Language Models via Selective Parameter Optimization
The paper claims a selective fine-tuning method that identifies and freezes core parameters to mitigate catastrophic forgetting in LLMs while improving domain adaptation, shown in experiments with GPT-J and LLaMA-3.
- ARROW: Augmented Replay for RObust World models