
arxiv: 2606.26687 · v1 · pith:2FBI4X6Vnew · submitted 2026-06-25 · 💻 cs.CV

DeCoFlow: Structural Decomposition of Normalizing Flows for Continual Anomaly Detection

Pith reviewed 2026-06-26 05:47 UTC · model grok-4.3

classification 💻 cs.CV
keywords continual anomaly detection · normalizing flows · parameter isolation · low-rank adapters · MVTec-AD · VisA · catastrophic forgetting · density estimation

The pith

DeCoFlow decomposes normalizing flows into a frozen universal base and task-specific low-rank adapters to enable continual anomaly detection without forgetting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to prevent catastrophic forgetting when normalizing flows learn new anomaly detection tasks one after another without access to prior data. It exploits the property that affine coupling layers keep their transformations valid and invertible no matter how their subnets are parameterized. The solution freezes a shared base subnet and attaches only low-rank adapters for each new task. Three supporting techniques compensate for the frozen base: task-specific alignment, auxiliary coupling layers, and a tail-aware loss. This yields high image-level AUROC on standard industrial benchmarks together with zero parameter-level forgetting and a small fixed parameter budget per task.

Core claim

By exploiting that affine coupling layers maintain transformation validity regardless of subnet parameterization, DeCoFlow decomposes the subnets into a frozen universal base and task-specific low-rank adapters. This isolates updates to prevent distortion of the density manifold from previous tasks. Task-Specific Alignment, Auxiliary Coupling Layers, and Tail-Aware Loss are introduced to offset the rigidity from the frozen base. The method reports image-level AUROCs of 98.40% on MVTec-AD and 93.00% on VisA, with 0.00% forgetting under correct routing using 2.27M parameters per task.

What carries the argument

Structural decomposition of affine coupling layer subnets into a frozen universal base and task-specific low-rank adapters that preserves invertibility and Jacobian validity.
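
A minimal NumPy sketch of this decomposition, not the authors' implementation. The layer sizes, the one-hidden-layer subnet, the tanh bound on the log-scale, and the choice to put the adapter on the first layer are all illustrative assumptions; only the parallel W_base + ΔW form is taken from the Figure 1 caption. The sketch shows that the coupling transform inverts exactly whatever values the adapter holds.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, R = 6, 16, 2          # feature dim, hidden width, adapter rank (illustrative)
d = D // 2                  # size of the pass-through half

# Frozen base weights of a hypothetical subnet mapping x1 -> (s, t).
W1_base = rng.normal(scale=0.3, size=(H, d))
W2_base = rng.normal(scale=0.3, size=(2 * d, H))

# Task-specific low-rank adapter: delta_W = B @ A has rank at most R.
adapter = {
    "A": rng.normal(scale=0.3, size=(R, d)),
    "B": rng.normal(scale=0.3, size=(H, R)),
}

def subnet(x1, adapter):
    """Return (log-scale s, shift t) computed from the pass-through half."""
    W_eff = W1_base + adapter["B"] @ adapter["A"]   # parallel W_base + delta_W
    h = np.tanh(x1 @ W_eff.T)
    out = h @ W2_base.T
    return np.tanh(out[:, :d]), out[:, d:]          # tanh keeps s bounded

def forward(x, adapter):
    x1, x2 = x[:, :d], x[:, d:]
    s, t = subnet(x1, adapter)
    y = np.concatenate([x1, x2 * np.exp(s) + t], axis=1)
    return y, s.sum(axis=1)                         # log|det J| is the sum of s

def inverse(y, adapter):
    y1, y2 = y[:, :d], y[:, d:]
    s, t = subnet(y1, adapter)                      # y1 == x1, so s and t are recomputed exactly
    return np.concatenate([y1, (y2 - t) * np.exp(-s)], axis=1)

x = rng.normal(size=(4, D))
y, logdet = forward(x, adapter)
max_err = np.abs(inverse(y, adapter) - x).max()     # reconstruction error
```

The subnet is only ever evaluated, never inverted, which is why swapping in a different adapter cannot break invertibility as long as `s` stays finite.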

If this is right

  • Parameter updates for new tasks leave the density estimates of earlier tasks unchanged.
  • The exact invertibility and density estimation properties of the original normalizing flow are retained across the sequence of tasks.
  • Only 2.27M parameters need to be stored and updated per additional task.
  • Zero forgetting at the parameter level holds only when task routing is correct at inference time.
  • State-of-the-art image-level AUROC is achieved on the MVTec-AD and VisA benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same subnet decomposition could be applied to other invertible generative models that require parameter isolation.
  • If task identity must be inferred rather than provided, an auxiliary routing mechanism would be needed to preserve the zero-forgetting guarantee.
  • Low-rank adapters may allow the total memory footprint to grow sub-linearly when the number of tasks becomes large.
  • The method could be tested on non-image modalities such as tabular or time-series data to check whether the decomposition remains effective outside vision.

Load-bearing premise

Affine coupling layers maintain transformation validity and correct Jacobian determinants no matter how their subnets are parameterized.
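
For reference, the standard affine coupling construction (RealNVP-style) behind this premise can be written out as follows. The notation (x_1, x_2, s_θ, t_θ, B_k, A_k) is chosen here for illustration and is not taken from the paper.

```latex
% Split the input x = (x_1, x_2); the subnet with parameters \theta outputs s_\theta and t_\theta.
\begin{aligned}
y_1 &= x_1, \\
y_2 &= x_2 \odot \exp\!\big(s_\theta(x_1)\big) + t_\theta(x_1), \\[4pt]
\frac{\partial y}{\partial x} &=
\begin{pmatrix}
I & 0 \\
\dfrac{\partial y_2}{\partial x_1} & \operatorname{diag}\!\big(\exp(s_\theta(x_1))\big)
\end{pmatrix}, \\[4pt]
\log\left|\det \frac{\partial y}{\partial x}\right| &= \sum_i s_\theta(x_1)_i, \\[4pt]
x_2 &= \big(y_2 - t_\theta(x_1)\big) \odot \exp\!\big(-s_\theta(x_1)\big).
\end{aligned}
% With a decomposed subnet weight W = W_{\text{base}} + B_k A_k for task k,
% only the values of s_\theta and t_\theta change; the block-triangular form does not.
```

The determinant and the inverse depend on the subnet only through the values of s and t, so the premise holds for any parameterization that keeps s finite.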

What would settle it

A direct calculation or experiment showing that training the low-rank adapters on new data changes the log-likelihood or Jacobian determinant for samples from earlier tasks.
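
One way such a check could look, as a toy sketch rather than the paper's protocol. The single-layer subnet, the dimensions, and the use of random perturbations as a stand-in for gradient updates are assumptions made here. The sketch stores one adapter per task, perturbs only the task-1 adapter, and compares the task-0 log-likelihood before and after.

```python
import numpy as np

rng = np.random.default_rng(1)
D, R = 4, 2                 # feature dim and adapter rank (illustrative)
d = D // 2

W_base = rng.normal(scale=0.3, size=(2 * d, d))     # frozen shared base

def new_adapter():
    return {"A": rng.normal(scale=0.3, size=(R, d)),
            "B": rng.normal(scale=0.3, size=(2 * d, R))}

def log_likelihood(x, adapter):
    """Log-density under one coupling layer and a standard normal latent."""
    x1, x2 = x[:, :d], x[:, d:]
    out = x1 @ (W_base + adapter["B"] @ adapter["A"]).T
    s, t = np.tanh(out[:, :d]), out[:, d:]
    z = np.concatenate([x1, x2 * np.exp(s) + t], axis=1)
    log_pz = -0.5 * (z ** 2).sum(axis=1) - 0.5 * D * np.log(2 * np.pi)
    return log_pz + s.sum(axis=1)                   # change of variables

adapters = {0: new_adapter(), 1: new_adapter()}     # one adapter per task
x_task0 = rng.normal(size=(5, D))
before = log_likelihood(x_task0, adapters[0])

# Stand-in for training on task 1: only the task-1 adapter is modified.
for _ in range(10):
    adapters[1]["A"] += 0.1 * rng.normal(size=(R, d))
    adapters[1]["B"] += 0.1 * rng.normal(size=(2 * d, R))

after_correct = log_likelihood(x_task0, adapters[0])   # routed to task 0
after_wrong = log_likelihood(x_task0, adapters[1])     # misrouted to task 1

unchanged = np.array_equal(before, after_correct)
misrouted_differs = not np.allclose(before, after_wrong)
```

In this toy setting `unchanged` is true by construction, because nothing used by task 0 was written to. A falsifying result would require the real training loop to touch the base or another task's adapter, or the router to pick the wrong adapter, which is the case `misrouted_differs` illustrates.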

Figures

Figures reproduced from arXiv: 2606.26687 by Hun Im, Jungi Lee, Pilsung Kang, Subeen Cha.

Figure 1: Overall pipeline of DeCoFlow. Top: The sequential flow of multi-scale features through the frozen backbone, TSA calibration, and the Normalizing Flow stages. Middle: Detailed structural views of (A) the affine coupling process, (B) the scale subnet architecture with spatial context, and (C) the parallel Wbase + ∆W decomposition in DCL. Bottom: The training timeline where the base is frozen after Task 0 to …

Figure 2: VisA fine-defect localization failures (Input, GT, DeCoFlow). Defect regions are detected, but thin or low-contrast boundaries are blurred, consistent with the fine-defect P-AP gap rather than a density-modeling failure.

(Plot text extracted alongside the captions: panel (a) Image-level Detection; x-axis Total System Parameters (M); y-axis Image AUC (%); legend DeCoFlow (Ours), DNE, UCAD, IUF, ReplayCAD, CDAD.)

Figure 3: Performance, parameter cost, and forgetting comparison on MVTec-AD 15-class.

Figure 4: Qualitative anomaly maps for four representative classes (one normal and two anomalous samples per class). DeCoFlow assigns consistently low scores to normal regions and localized high responses to defect regions.

Figure 5: Block-wise transformation analysis. (a) Cumulative |log |det J_f(z)||: DCL 6 blocks contribute 68.5%, ACL 2 blocks contribute 31.5%. (b) Off-Diagonal Covariance: DCL introduces cross-dimensional coupling, which ACL restores to independence. (c) Q-Q Correlation: stagnates through DCL, then sharply aligns after ACL. Gray lines denote individual classes (15), bold line denotes the mean.
Abstract

In industrial environments, new product categories arrive sequentially, requiring continual anomaly detection without access to past data. Normalizing Flows (NFs) provide exact density estimation but suffer from catastrophic forgetting as parameter updates across tasks distort the density manifold. While parameter isolation can prevent interference, it must preserve the strict invertibility and Jacobian validity of NFs. To satisfy these requirements, we exploit the inherent property that affine coupling layers maintain transformation validity regardless of subnet parameterization. Based on this, we propose DeCoFlow, which decomposes subnets into a frozen universal base and task-specific low-rank adapters to isolate updates. We further introduce Task-Specific Alignment, Auxiliary Coupling Layers, and Tail-Aware Loss to compensate for frozen-base rigidity. DeCoFlow achieves state-of-the-art image-level AUROCs of 98.40% on MVTec-AD and 93.00% on VisA, while maintaining parameter-level zero forgetting (0.00% FM under correct routing) with only 2.27M parameters per task.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes DeCoFlow for continual anomaly detection without access to past data. It exploits the property that affine coupling layers in normalizing flows remain bijective with tractable Jacobians under arbitrary subnet parameterization. The subnets are decomposed into a frozen universal base plus task-specific low-rank adapters for parameter isolation, with added Task-Specific Alignment, Auxiliary Coupling Layers, and Tail-Aware Loss to mitigate rigidity from freezing. The paper claims state-of-the-art image-level AUROCs of 98.40% on MVTec-AD and 93.00% on VisA, zero forgetting (0.00% FM under correct routing), and 2.27M parameters per task.

Significance. If the results hold, the work would advance continual learning for density-based anomaly detection by providing a parameter-isolation strategy that preserves exact invertibility and likelihood computation. The core technical insight correctly follows from the standard RealNVP coupling construction, where the Jacobian remains triangular and the determinant is unaffected by subnet decomposition; this is a genuine strength that enables zero-forgetting without architectural violations.
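
The triangular-Jacobian point can be checked numerically. The sketch below is an illustration under assumed sizes and a hypothetical single-layer subnet whose weight matrix `W` stands for a base with any adapter already merged in. It compares the analytic log-determinant (the sum of the log-scales) against the log-determinant of a finite-difference Jacobian.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
d = D // 2
W = rng.normal(scale=0.5, size=(2 * d, d))   # hypothetical merged subnet weights

def coupling(x):
    """Affine coupling on a single vector; returns (y, analytic log|det J|)."""
    x1, x2 = x[:d], x[d:]
    out = W @ x1
    s, t = np.tanh(out[:d]), out[d:]
    return np.concatenate([x1, x2 * np.exp(s) + t]), s.sum()

def numerical_jacobian(f, x, eps=1e-6):
    """Central finite differences; column j holds dy/dx_j."""
    J = np.zeros((D, D))
    for j in range(D):
        e = np.zeros(D)
        e[j] = eps
        J[:, j] = (f(x + e)[0] - f(x - e)[0]) / (2 * eps)
    return J

x = rng.normal(size=D)
_, analytic_logdet = coupling(x)
J = numerical_jacobian(coupling, x)
sign, numeric_logdet = np.linalg.slogdet(J)
upper_right = np.abs(J[:d, d:]).max()        # d y1 / d x2 block, expected to be zero
gap = abs(analytic_logdet - numeric_logdet)
```

The upper-right block vanishes because the first half of the output copies the input, so the determinant reduces to the product of the diagonal scale terms regardless of what `W` contains.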

major comments (1)
  1. [Abstract and §4] Abstract and §4 (Experiments): The manuscript asserts specific SOTA AUROC values, forgetting metrics, and parameter counts, yet the abstract supplies no experimental protocol, baseline list, ablation details, or statistical significance tests. If §4 similarly omits these elements, the empirical support for the central performance claims cannot be evaluated.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We address the single major comment below regarding the presentation of experimental details. Section 4 of the manuscript provides the required protocol, baselines, ablations, and metrics to support the reported results; the abstract follows standard conventions by summarizing outcomes rather than methods.

  1. Referee: [Abstract and §4] Abstract and §4 (Experiments): The manuscript asserts specific SOTA AUROC values, forgetting metrics, and parameter counts, yet the abstract supplies no experimental protocol, baseline list, ablation details, or statistical significance tests. If §4 similarly omits these elements, the empirical support for the central performance claims cannot be evaluated.

    Authors: We agree that the abstract is concise and omits full experimental protocol, baseline lists, ablation details, and significance tests, as is conventional for abstracts in computer vision and machine learning papers. However, this does not apply to §4. Section 4 explicitly describes: (i) the continual learning protocol on MVTec-AD and VisA (sequential task arrival, no access to past data); (ii) the full list of baselines (NF-based methods and continual learning approaches); (iii) ablation studies isolating the contributions of Task-Specific Alignment, Auxiliary Coupling Layers, and Tail-Aware Loss; (iv) all metrics including image-level AUROC, forgetting measure (FM), and parameter counts per task; and (v) evaluation over multiple runs for reliability. The SOTA claims (98.40% on MVTec-AD, 93.00% on VisA, 0.00% FM, 2.27M params/task) are directly supported by these experiments. No changes to §4 are needed, though we can add a one-sentence pointer from the abstract to §4 if the editor prefers. revision: no

Circularity Check

0 steps flagged

No significant circularity

The paper's central justification invokes the standard bijectivity and Jacobian tractability of affine coupling layers (a direct consequence of the RealNVP construction), which is treated as an external architectural property rather than derived or fitted inside the work. The proposed decomposition (frozen universal base + task-specific low-rank adapters) preserves the triangular Jacobian form by the unchanged coupling structure; zero forgetting under correct routing follows immediately from parameter isolation once that external property is granted. No equations reduce claimed AUROC or forgetting metrics to quantities defined by the method itself, no self-citation chains are load-bearing, and no ansatz or uniqueness result is smuggled in. Empirical results on MVTec-AD and VisA are presented as measured outcomes, not forced predictions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review performed on abstract only; no explicit free parameters, invented entities, or additional axioms beyond the stated coupling-layer property are visible.

axioms (1)
  • domain assumption: Affine coupling layers maintain transformation validity regardless of subnet parameterization.
    Invoked in abstract as the foundation allowing decomposition into frozen base and task-specific adapters.

pith-pipeline@v0.9.1-grok · 5712 in / 1226 out tokens · 26849 ms · 2026-06-26T05:47:19.405337+00:00 · methodology

