Explainable Outlier Detection for Multivariate Functional Data

Horst Lewitschnig; Marcus Mayrhofer; Peter Filzmoser; Una Radojičić

arxiv: 2605.20325 · v1 · pith:IKQO7SFWnew · submitted 2026-05-19 · 📊 stat.ME · stat.CO

Marcus Mayrhofer, Una Radojičić, Horst Lewitschnig, Peter Filzmoser

Pith reviewed 2026-05-21 00:58 UTC · model grok-4.3

keywords: multivariate functional data · outlier detection · separable covariance · MMCD estimator · Shapley values · robust covariance estimation · Mahalanobis distance · interpretability

The pith

Multivariate functional data with separable covariance structures correspond to matrix-variate distributions, enabling robust MMCD estimation and linear-complexity Shapley explanations for outlier detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a connection between stochastic processes that have separable covariance and the matrix-variate distributions formed by their basis representations. This link allows the matrix-variate minimum covariance determinant estimator, together with a truncated functional Mahalanobis distance, to produce robust estimates of mean and covariance. The same connection supports a generalization of Shapley-value explanations that decomposes an observation's overall outlyingness into contributions from individual time points and variables. The decomposition runs in linear rather than exponential time relative to the number of components. A sympathetic reader would care because the approach aims to deliver both reliable detection and human-understandable reasons for flagged outliers in settings where multivariate functional observations arise.

Core claim

Stochastic processes with separable covariance structures have basis representations whose joint distribution is matrix-variate; this fact permits direct application of the matrix minimum covariance determinant estimator to obtain robust location and scatter estimates for multivariate functional data. The resulting robust Mahalanobis semi-distance then serves as an outlyingness measure. For interpretability, the paper generalizes multivariate Shapley-value outlier explanations to the functional setting and shows that the otherwise exponential complexity in the number of components reduces to linear complexity while preserving the essential additive properties of the Shapley value.
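
The practical content of this claim can be checked numerically on a toy case. The sketch below is not the paper's estimator: it only illustrates, under assumed toy inputs, that a Mahalanobis distance computed in matrix form with separate row and column covariances agrees with the ordinary Mahalanobis distance of the vectorized coefficients when their covariance is the Kronecker product of the two factors. The names `row_cov`, `col_cov`, and `dev`, and the helper functions, are hypothetical; exact rational arithmetic is used so the two results can be compared without rounding.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Exact Gauss-Jordan inverse of a small nonsingular matrix."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        head = aug[c][c]
        aug[c] = [v / head for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def kron(a, b):
    """Kronecker product of a and b."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matrix_mahalanobis_sq(dev, row_cov, col_cov):
    """tr(col_cov^-1 dev' row_cov^-1 dev) for a centered p x m matrix dev."""
    prod = matmul(matmul(inverse(col_cov), transpose(dev)),
                  matmul(inverse(row_cov), dev))
    return sum(prod[i][i] for i in range(len(prod)))


def vectorized_mahalanobis_sq(dev, row_cov, col_cov):
    """vec(dev)' (col_cov kron row_cov)^-1 vec(dev), with column-stacking vec."""
    p, m = len(dev), len(dev[0])
    v = [dev[i][j] for j in range(m) for i in range(p)]
    prec = inverse(kron(col_cov, row_cov))
    return sum(v[a] * prec[a][b] * v[b] for a in range(p * m) for b in range(p * m))


# Hypothetical toy inputs: p = 2 variables, m = 3 basis coefficients per variable.
row_cov = [[2, 1], [1, 2]]
col_cov = [[3, 1, 0], [1, 2, 1], [0, 1, 2]]
dev = [[1, -2, 0], [3, 1, -1]]  # coefficient matrix minus its mean

d_matrix = matrix_mahalanobis_sq(dev, row_cov, col_cov)
d_vector = vectorized_mahalanobis_sq(dev, row_cov, col_cov)
print(d_matrix == d_vector)
```

The matrix form only ever inverts a p x p and an m x m matrix, whereas the vectorized form inverts a pm x pm matrix, which is one reason a matrix-variate estimator is attractive when separability is plausible.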

What carries the argument

The correspondence between separable-covariance stochastic processes and matrix-variate distributions of their basis representations, which permits direct use of the matrix-variate minimum covariance determinant estimator.

If this is right

Robust mean and covariance estimates become available for multivariate functional data that satisfy the separability condition.
Outlyingness can be decomposed into explicit time-coordinate and variable contributions.
The computational cost of the decomposition scales linearly rather than exponentially with the number of components.
The integrated use of MMCD estimation, truncated Mahalanobis distances, and Shapley decompositions yields both detection and explanation within a single framework.
Theoretical properties of the matrix-variate estimator carry over to the functional setting under the separability premise.
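
The complexity claim in this list can be illustrated in the plain multivariate case. The sketch below assumes a value function that resets every coordinate outside a coalition to its mean and then evaluates the squared Mahalanobis distance; that choice, the function names, and the toy inputs are assumptions for illustration, and the paper's functional generalization is not reproduced here. Under this assumption the Shapley value of coordinate j collapses to the closed form `dev[j] * (prec @ dev)[j]`, which the code compares against the textbook sum over all coalitions.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def coalition_value(dev, prec, coalition):
    """Squared Mahalanobis distance after resetting coordinates outside `coalition` to the mean."""
    return sum(dev[j] * prec[j][k] * dev[k] for j in coalition for k in coalition)


def shapley_brute_force(dev, prec):
    """Textbook Shapley values: a weighted sum over all coalitions (exponential cost)."""
    n = len(dev)
    values = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        total = Fraction(0)
        for size in range(n):
            weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
            for coalition in combinations(others, size):
                gain = (coalition_value(dev, prec, coalition + (j,))
                        - coalition_value(dev, prec, coalition))
                total += weight * gain
        values.append(total)
    return values


def shapley_closed_form(dev, prec):
    """phi_j = dev_j * (prec @ dev)_j: one matrix-vector product, no coalitions."""
    n = len(dev)
    return [dev[j] * sum(prec[j][k] * dev[k] for k in range(n)) for j in range(n)]


# Hypothetical inputs: a symmetric, diagonally dominant precision matrix
# and a centered observation (observation minus mean).
prec = [[3, -1, 0, 1], [-1, 3, 1, 0], [0, 1, 3, -1], [1, 0, -1, 4]]
dev = [1, -2, 0, 3]

brute = shapley_brute_force(dev, prec)
closed = shapley_closed_form(dev, prec)
total_distance = coalition_value(dev, prec, range(len(dev)))

print(brute == closed)                 # both routes give the same attribution
print(sum(closed) == total_distance)   # contributions add up to the full distance
```

In this toy case the third coordinate sits exactly at its mean and receives a contribution of zero, while the contributions of the remaining coordinates sum to the full squared distance, which is the additive property the list refers to.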

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The linear-complexity decomposition could be applied directly to data sets with a larger number of components than previously feasible.
If separability holds only approximately, the method may still serve as a computationally convenient approximation whose accuracy can be checked against non-separable baselines.
The same basis-representation link might be reusable for other robust estimation tasks that currently rely on vectorized representations of functional data.
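
One way to make "approximately separable" measurable is the nearest-Kronecker-product idea: rearrange the covariance matrix so that an exactly separable matrix becomes rank one, then ask how much of the squared Frobenius norm the leading singular value captures. The sketch below is a descriptive index only, not a formal test and not something the paper proposes; `separability_index`, the power-iteration helper, and the 2 x 2 factors are hypothetical.

```python
import math


def kron(a, b):
    """Kronecker product of a and b."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def rearrange(sigma, m, p):
    """One row per p x p block of the (m*p) x (m*p) matrix sigma, block flattened row-wise."""
    return [[sigma[a * p + i][b * p + j] for i in range(p) for j in range(p)]
            for a in range(m) for b in range(m)]


def top_singular_value_sq(mat, iters=200):
    """Largest squared singular value via power iteration on mat' mat."""
    nrow, ncol = len(mat), len(mat[0])
    v = [1.0 + 0.1 * j for j in range(ncol)]  # arbitrary start vector
    for _ in range(iters):
        u = [sum(row[j] * v[j] for j in range(ncol)) for row in mat]          # u = mat v
        w = [sum(mat[i][j] * u[i] for i in range(nrow)) for j in range(ncol)]  # w = mat' u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    u = [sum(row[j] * v[j] for j in range(ncol)) for row in mat]
    return sum(x * x for x in u)


def separability_index(sigma, m, p):
    """Share of the squared Frobenius norm captured by the nearest Kronecker product."""
    mat = rearrange(sigma, m, p)
    frob_sq = sum(x * x for row in mat for x in row)
    return top_singular_value_sq(mat) / frob_sq


# Hypothetical 2 x 2 factors.
c1, r1 = [[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]
c2, r2 = [[1.0, 0.0], [0.0, 3.0]], [[1.0, 0.5], [0.5, 1.0]]

separable = kron(c1, r1)
# Sum of two Kronecker products with non-proportional factors: not separable.
mixed = [[x + y for x, y in zip(ra, rb)]
         for ra, rb in zip(kron(c1, r1), kron(c2, r2))]

index_separable = separability_index(separable, 2, 2)
index_mixed = separability_index(mixed, 2, 2)
print(index_separable, index_mixed)
```

An index of 1 means the matrix is exactly a Kronecker product; one minus the index is the squared relative Frobenius error of the best separable approximation. Applying it to an estimated covariance would additionally require accounting for sampling noise, which this sketch ignores.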

Load-bearing premise

The multivariate functional data under study possess a separable covariance structure.

What would settle it

Simulated data generated from a non-separable covariance model in which the MMCD-based robust distances and the associated Shapley decompositions fail to recover the true outlying observations more accurately than standard non-robust alternatives.

Figures

Figures reproduced from arXiv: 2605.20325 by Horst Lewitschnig, Marcus Mayrhofer, Peter Filzmoser, Una Radojičić.

**Figure 2.** Rank-based comparison of methods across all separable simulation settings in Table …

**Figure 3.** Density plots of F-score, AUC, and log covariance estimation error for a representative setting with …

**Figure 4.** AUC values for detecting small (upper row) and large (lower row) shift outliers in non-separable processes …

**Figure 5.** Mirrored density plots of AUC values for component- (left) and time-specific (right) Shapley values, absolute …

**Figure 6.** Q–Q plot of robust squared Mahalanobis distances against χ² quantiles for all 75 periods. [Plot labels recovered from extraction: Robust fMD of region 'Niño 3.4'; Robust fMMD; Number of 'El Niño' and 'La Niña' months in period.]

**Figure 7.** Distance-distance comparison of the robust …

**Figure 8.** Smoothed SST curves across the four Niño regions (columns) during El Niño and La Niña years (rows). The …

**Figure 9.** ROC curves and their AUC values to evaluate outlier detection performance in the resistance spot welding …

**Figure 10.** Smoothed welding DRCs with two of the outliers highlighted by a dashed/dotted line. The color intensity of …

**Figure 11.** Map of the 4 regions in the equatorial Pacific Ocean where SST related to ENSO are measured: Niño 1+2 …

**Figure 12.** Smoothed SST measurements of all four Niño regions with outlying observations colored either red or blue, …

**Figure 13.** Smoothed SST curves for the years 1956:1957 (left) and 2015:2016 (right) across the four Niño regions …

**Figure 14.** Q–Q plot of robust squared Mahalanobis distances against …

**Figure 15.** Smoothed age-specific fertility curves for all 22 countries/regions.

**Figure 16.** Q–Q plot of robust squared Mahalanobis distances against …

**Figure 17.** Robust analysis of the smoothed ASFRs. [Plot labels recovered from extraction: Age; Country/Region.]

**Figure 18.** Age-specific (left) or year-specific (right) outlyingness contributions based on Shapley values for the …

**Figure 19.** Age-specific and year-specific outlyingness contributions based on Shapley values for the smoothed ASFRs …

**Figure 20.** Rank-based comparison of methods across all simulation settings in the non-Gaussian setting for …

**Figure 21.** Precision of shift outlier detection for a stochastic process with t-distributed innovations ( …

**Figure 22.** Recall of shift outlier detection for a stochastic process with t-distributed innovations ( …

**Figure 23.** F-score of shift outlier detection for a stochastic process with t-distributed innovations ( …

**Figure 24.** AUC values of shift outlier detection for a stochastic process with t-distributed innovations ( …

**Figure 25.** Log covariance estimation error for a stochastic process with t-distributed innovations ( …

**Figure 26.** Comparison of log computation time in seconds for the Gaussian setting outlined in Section …

**Figure 27.** Density plots of F-score for a representative setting with …

**Figure 28.** Density plots of AUC for a representative setting with …

**Figure 29.** Density plots of log covariance estimation error for a representative setting with …

**Figure 30.** Comparison of Spearman correlation of squared Mahalanobis distances based on the sample covariance of …

**Figure 31.** Precision for detecting small and large shift outliers in non-separable processes obtained as sums of …

**Figure 32.** Recall for detecting small and large shift outliers in non-separable processes obtained as sums of …

**Figure 33.** F-score for detecting small and large shift outliers in non-separable processes obtained as sums of …

**Figure 34.** AUC for detecting small and large shift outliers in non-separable processes obtained as sums of …

**Figure 35.** Rank-based comparison of methods across all simulation settings in Table …

**Figure 36.** Density plots of precision for shift outliers in the Gaussian setting with …

**Figure 37.** Density plots of recall for shift outliers in the Gaussian setting with …

**Figure 38.** Density plots of F-score for shift outliers in the Gaussian setting with …

**Figure 39.** Density plots of AUC values for shift outliers in the Gaussian setting with …

**Figure 40.** Density plots of log covariance estimation error for shift outliers in the Gaussian setting with …

**Figure 41.** Density plots of precision for shape outliers in the Gaussian setting with …

**Figure 42.** Density plots of recall for shape outliers in the Gaussian setting with …

**Figure 43.** Density plots of F-score for shape outliers in the Gaussian setting with …

**Figure 44.** Density plots of AUC values for shape outliers in the Gaussian setting with …

**Figure 45.** Density plots of log covariance estimation error for shape outliers in the Gaussian setting with …

**Figure 46.** Density plots of precision for isolated outliers in the Gaussian setting with …

**Figure 47.** Density plots of recall for isolated outliers in the Gaussian setting with …

**Figure 48.** Density plots of F-score for isolated outliers in the Gaussian setting with …

**Figure 49.** Density plots of AUC values for isolated outliers in the Gaussian setting with …

**Figure 50.** Density plots of log covariance estimation error for isolated outliers in the Gaussian setting with …

**Figure 51.** Density plots of precision for shift outliers in the Gaussian setting with …

**Figure 52.** Density plots of recall for shift outliers in the Gaussian setting with …

**Figure 53.** Density plots of F-score for shift outliers in the Gaussian setting with …

**Figure 54.** Density plots of AUC values for shift outliers in the Gaussian setting with …

**Figure 55.** Density plots of log covariance estimation error for shift outliers in the Gaussian setting with …

Original abstract

This work addresses the challenges of robust covariance estimation and interpretable outlier detection for multivariate functional data with separable covariance structure. We develop a method that simultaneously improves robustness and interpretability in this context by establishing a connection between stochastic processes with separable covariance structures and the corresponding matrix-variate distribution of their basis representations. Leveraging this connection, we employ the recently developed matrix-variate counterpart of the Minimum Covariance Determinant estimator (MMCD) in conjunction with a truncated multivariate functional Mahalanobis semi-distance to robustly estimate mean and covariance for multivariate functional data. For interpretable outlier detection, we generalize multivariate outlier explanations based on Shapley values to decompose overall multivariate functional outlyingness into time-coordinate-specific contributions. Importantly, we reduce the otherwise exponential computational complexity (relative to the number of components) to linear complexity, while retaining the key properties of the Shapley value. This integrated framework combines robust Mahalanobis distances, MMCD estimators, and Shapley value-based outlyingness decomposition to provide a robust and interpretable approach for analyzing multivariate functional data with separable covariance structures. The effectiveness of this approach is demonstrated through both theoretical analysis and practical applications, including simulations and real-world examples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper links separable multivariate functional data to matrix-variate MMCD for robust estimation and adds a linear-complexity Shapley decomposition for time-and-variable explanations of outliers.

The main point is that this work gives a practical route to robust outlier detection for multivariate functional data when the covariance separates into a time part and a variable part. It maps the functional process to a matrix-variate distribution of basis coefficients, plugs in the MMCD estimator for mean and covariance, scores outlyingness with a truncated Mahalanobis semi-distance, and then decomposes that score with a Shapley-style breakdown that runs in linear rather than exponential time.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a framework for robust covariance estimation and interpretable outlier detection in multivariate functional data that possess a separable covariance structure. It links stochastic processes with separable covariances to the matrix-variate distribution of their basis representations, applies the matrix-variate Minimum Covariance Determinant (MMCD) estimator together with a truncated functional Mahalanobis semi-distance, and generalizes Shapley-value explanations to decompose outlyingness into time-coordinate contributions while reducing computational cost from exponential to linear in the number of components. Effectiveness is illustrated via theoretical analysis, simulations, and real-data examples.

Significance. If the separability assumption is satisfied and the claimed equivalence is rigorously derived, the work would provide a useful combination of robustness and interpretability for outlier detection in multivariate functional data, with the linear-complexity Shapley decomposition offering a clear practical benefit over standard approaches.

major comments (2)

[Abstract] The central reduction of the functional covariance operator to a Kronecker-structured matrix-variate covariance is invoked to justify direct use of the MMCD estimator and the subsequent truncated Mahalanobis semi-distance. The manuscript supplies neither a diagnostic for checking separability nor a perturbation analysis for mild violations, yet such violations are common and would invalidate both the robustness guarantees and the Shapley decomposition.
[Theoretical analysis] The claimed equivalence between a separable-covariance stochastic process and the matrix-variate distribution of its finite basis coefficients is load-bearing for all downstream results, yet the manuscript provides neither the explicit operator-level derivation nor the conditions under which the finite-basis approximation preserves the Kronecker structure.
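
For orientation, the easy direction of the link discussed in the second comment can be written out in a few lines. This is a sketch under assumed notation, not the paper's derivation: a finite basis expansion with coefficient matrix A is taken as given, and U and V are placeholder symbols for the two covariance factors (the paper's own row and column naming may differ). Only second moments are used, so normality is not required. The converse direction and the effect of truncation, which are what the referee asks for, are not covered.

```latex
% Assumed notation (not the paper's): U is p x p, V is m x m, both symmetric positive definite.
\begin{align*}
X(t) &= A\,\phi(t), \qquad A \in \mathbb{R}^{p \times m},\quad
  \mathbb{E}[A] = M,\quad \operatorname{Cov}(\operatorname{vec} A) = V \otimes U, \\
\mu(t) &= \mathbb{E}[X(t)] = M\,\phi(t), \\
\operatorname{Cov}\bigl(X(s), X(t)\bigr)
  &= \mathbb{E}\bigl[(A - M)\,\phi(s)\,\phi(t)^{\top}(A - M)^{\top}\bigr] \\
  &= \operatorname{tr}\bigl(\phi(s)\,\phi(t)^{\top} V\bigr)\, U \\
  &= \underbrace{\phi(s)^{\top} V\, \phi(t)}_{\kappa(s,t)}\; U .
\end{align*}
```

The covariance therefore factors into a scalar kernel in time, κ(s, t), multiplied by a fixed matrix across variables, U, which is the separable form the framework relies on.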

minor comments (2)

[Method] Notation for the truncated functional Mahalanobis semi-distance should be introduced with an explicit equation number rather than inline description.
[Simulations] The simulation study would benefit from an explicit table reporting the fraction of replicates in which separability was approximately satisfied.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of the separability assumption and its theoretical justification. We address each major comment below and describe the revisions planned for the next version of the manuscript.

Referee: [Abstract] The central reduction of the functional covariance operator to a Kronecker-structured matrix-variate covariance is invoked to justify direct use of the MMCD estimator and the subsequent truncated Mahalanobis semi-distance. The manuscript supplies neither a diagnostic for checking separability nor a perturbation analysis for mild violations, yet such violations are common and would invalidate both the robustness guarantees and the Shapley decomposition.

Authors: We agree that the separability assumption underpins the connection to the matrix-variate MMCD estimator and the associated Shapley decomposition. While the manuscript focuses on the case where separability holds, we acknowledge that guidance on verifying this assumption would improve usability. In the revised version we will add a dedicated paragraph discussing practical diagnostics for separability (referencing existing tests for functional data) together with a short simulation study that examines the effect of mild violations on the MMCD estimator and the linear-complexity Shapley values. This addition will clarify the scope and limitations of the proposed framework without altering its core contribution. Revision: yes.

Referee: [Theoretical analysis] The claimed equivalence between a separable-covariance stochastic process and the matrix-variate distribution of its finite basis coefficients is load-bearing for all downstream results, yet the manuscript provides neither the explicit operator-level derivation nor the conditions under which the finite-basis approximation preserves the Kronecker structure.

Authors: The referee is correct that an explicit derivation is necessary to support the subsequent results. We will expand the Theoretical analysis section to include a step-by-step operator-level argument showing how the separable covariance kernel induces the Kronecker product structure on the covariance matrix of the basis coefficients. We will also state the precise conditions (orthonormality of the basis, truncation level, and moment assumptions) under which the finite-dimensional representation exactly or approximately inherits the Kronecker form. These additions will make the theoretical foundation fully rigorous and self-contained. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity; central claims rest on external estimators and explicit separability assumption

The paper establishes a connection between stochastic processes with separable covariance and the matrix-variate distribution of their basis representations to justify applying the MMCD estimator and a truncated functional Mahalanobis distance. This connection is presented as a theoretical reduction rather than a self-referential definition. MMCD is described as the 'recently developed matrix-variate counterpart' of an existing estimator, indicating external origin. Shapley-value decomposition is generalized from the multivariate case with a complexity reduction to linear. No equations in the abstract or described framework reduce a claimed prediction or result to a parameter fitted inside the same paper. Separability is stated as a required premise for the reduction, not derived from the method itself. The derivation chain therefore remains self-contained against external benchmarks and does not collapse to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption of separability and on standard properties of Mahalanobis distance and Shapley values drawn from prior literature; no free parameters or new entities are introduced in the abstract.

axioms (1)

Domain assumption: multivariate functional data possess a separable covariance structure.
This assumption is invoked to establish the connection to matrix-variate distributions and to justify use of the MMCD estimator.

pith-pipeline@v0.9.0 · 5750 in / 1278 out tokens · 66669 ms · 2026-05-21T00:58:47.912387+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "establishing a connection between stochastic processes with separable covariance structures and the corresponding matrix-variate distribution of their basis representations... employ the recently developed matrix-variate counterpart of the Minimum Covariance Determinant estimator (MMCD)"

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · embed_injective · unclear

Paper passage: "fMMD²(X; M) := ... truncated functional multivariate Mahalanobis semi-distance... under separable covariance structures"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1]

J., and Zamar, R

Agostinelli, C., Leung, A., Yohai, V . J., and Zamar, R. H. (2015). Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination. Test, 24:441–461. Alvarez, M. A., Rosasco, L., Lawrence, N. D., et al. (2012). Kernels for vector-valued functions: A review.Foundations and Trends® in Machine Learning, 4(3):195...

work page 2015
[2]

Aston, J

Academic press. Aston, J. A., Pigoli, D., and Tavakoli, S. (2017). Tests for separability in nonparametric covariance operators of random surfaces. The Annals of Statistics, pages 1431–1461. Basna, R., Nassar, H., and Podgórski, K. (2022). Data driven orthogonal basis selection for functional data analysis. Journal of Multivariate Analysis, 189:104868. Be...

work page arXiv 2017
[3]

Ferraty, F. (2006). Nonparametric Functional Data Analysis. Springer. Galeano, P., Joseph, E., and Lillo, R. E. (2015). The Mahalanobis distance for functional data with applications to classification. Technometrics, 57(2):281–291. Genton, M. G. (2007). Separable approximations of space-time covariance matrices. Environmetrics: The Official Journal of the...

work page 2006
[4]

Zuo, Y . (2003). Projection-based depth functions and associated medians.The Annals of Statistics, 31(5):1460–1490. Zuo, Y . and Serfling, R. (2000). General notions of statistical depth function.The Annals of Statistics, pages 461–482. 23 Explainable Outlier Detection for Multivariate Functional DataA PREPRINT A Further Preliminaries A.1 Matrix Normal Di...

work page 2003
[5]

for computing the maximum likelihood estimates (20)- (21). Starting from any positive definite initialization, the proposed procedure is shown to converge almost surely to the positive definite covariance estimates, provided h≥ ⌊ p/m+ m/p⌋+ 2 . The convergence also holds if the ellipticity assumption is violated. For technical and implementation details, ...

work page 2025
[6]

By collecting the coefficients in a matrixA= (a 1,

, p. By collecting the coefficients in a matrixA= (a 1, . . . ,ap)′ ∈R p×m we can write the multivariate process as Y(t) =A ϕ(t) + ˜ε(t). The coefficients ajk, j= 1, . . . , p , k= 1, . . . , m, are usually determined based on a least squares approach, and often a roughness penalty is involved; see Ramsay and Silverman (2005) for more details. A.5 FPCA fo...

work page 2005
[7]

, mgive the orthonormality of the corresponding products

, p.(23) To see that the relations in (23) indeed hold, observe first that orthogonality of V row and orthonormality of ξi, i= 1, . . . , mgive the orthonormality of the corresponding products. Furthermore, λker i λrow j (ξi(s)vrow j ) =λ row j Z T κ(s, t)ξi(t)vrow j dt = Z T Σrowκ(s, t)ξi(t)vrow j dt= Z T K(s, t)(ξi(t)vrow j ) dt. (24) The uniqueness of ...

work page 2018
[8]

Thus, functions in {ξ(j) i ej :i≥1, j= 1,

, p . Thus, functions in {ξ(j) i ej :i≥1, j= 1, . . . , p} , are the eigenfunctions of K, while λ(j) i , i≥1, j= 1, . . . , p are the corresponding eigenvalues. In other words, the spectrum of K corresponds to the union of the spectra of individual covariance operators Kj, j= 1, . . . , p . Then, for m1, . . . , mp as described in the statement of the res...

work page 2008
[9]

Corollary B.0.1.Let X(t) =a ′ ϕ(t) be a rank m∈N stochastic process with mean µ and covariance κ, with coefficientsa∈R m and basisϕ= (ϕ 1,

finally gives that A∼ MN(0,Σ col,Σ row), thus completing the proof. Corollary B.0.1.Let X(t) =a ′ ϕ(t) be a rank m∈N stochastic process with mean µ and covariance κ, with coefficientsa∈R m and basisϕ= (ϕ 1, . . . , ϕm)′. Then the following holds: (i)a has a multivariate distribution with mean ma and covariance Cov(a) =Σ∈PDS(m) such that m′ a ϕ(t) =µ(t)and...

work page 2025
[10]

,fMMD(Xn)) Algorithm 1 yields robust estimators

Run MMCD procedure onAand get( ˆMA,H ∗ , ˆΣrow H ∗ , ˆΣcol H ∗ ,MMD(A)); 3:Obtain functional data objects for mean and covariance ˆµ(t) = ˆMA,H ∗ ϕ(t); ˆΣrow = ˆΣcol H ∗; ˆκ(s, t) =ϕ′(s) ˆΣrow H ∗ ϕ(t); fMMD(Xi) = fMMD(Xi, ˆµ; ˆΣrow,ˆκ, mp) = MMD(Ai, ˆMA,H ∗; ˆΣrow H ∗ , ˆΣcol H ∗) Output: ˆµ, ˆΣrow,ˆκ,(fMMD(X1), . . . ,fMMD(Xn)) Algorithm 1 yields robust...

work page 1999
[11]

The pointwise estimates of mean and covariance evaluated at observed time points t1,

Algorithm 1 can be easily adapted for the analysis of raw data: step 1 in the algorithm is omitted, and p×q matrices of raw data observations are supplied to the MMCD in step 2 . The pointwise estimates of mean and covariance evaluated at observed time points t1, . . . , tq are the output of MMCD step 2 . Usually, post-smoothing is applied to those estima...

work page 2005
[12]

(a) Univariate functional data. Symbol Description X(t)∈L 2(T)Univariate stochastic process µ(t)Mean function ofX κ(s, t)Covariance kernel ofX KCovariance operator ofX ψi,π i Eigenfunctions and eigenvalues ofK T= Sd a=1 Ta Functional domain partitioned intoddisjoint subintervals ⟨X, Y⟩ Ta = R Ta X(t)Y(t)dtInner product on subintervalT a D={1, . . . , d}In...

work page 2024
[13]

We selected the subset of n= 22 countries/regions with no missing values for the years between 1960 and

In the context of ASFRs, we refer to women towards the lower end of the spectrum as younger women, women aged around 30 years old as middle-aged women, and women at the higher end of the spectrum 42 Explainable Outlier Detection for Multivariate Functional DataA PREPRINT as older women. We selected the subset of n= 22 countries/regions with no missing val...

work page 1960
[14]

, 2015:2019), which results in observations that are naturally arranged in12×31 matrices for each country

To facilitate the interpretability of the results, we aggregate the annual ASFRs into five-year intervals (1960:1964, 1965:1969, . . . , 2015:2019), which results in observations that are naturally arranged in12×31 matrices for each country. This matrix structure reflects the average ASFRs for each of the 12 five-year periods across the ages from 15 to

work page 1960
[15]

Overall, fertility is clearly declining over the years, and women give birth at older ages as time progresses

Here, every plot shows the fertility curves for one of the five-year intervals. Overall, fertility is clearly declining over the years, and women give birth at older ages as time progresses. Moreover, the curves are similar in the last period while there is more difference between the countries in the earlier years. We see that some countries/regions form...

[16]

Taking Belgium (BEL) as an example, the outlyingness scores reveal the following: the age-specific outlying contributions highlight higher than expected fertility for women aged 15 to 24 years and lower than average fertility for [Figure panels: BEL, CHE, CZE, DEUTNP, DEUTW, KOR, NLD, NOR, POL, SVK, USA] 20...

[1]

Agostinelli, C., Leung, A., Yohai, V. J., and Zamar, R. H. (2015). Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination. Test, 24:441–461.
Alvarez, M. A., Rosasco, L., Lawrence, N. D., et al. (2012). Kernels for vector-valued functions: A review. Foundations and Trends® in Machine Learning, 4(3):195...

[2]

Academic Press.
Aston, J. A., Pigoli, D., and Tavakoli, S. (2017). Tests for separability in nonparametric covariance operators of random surfaces. The Annals of Statistics, pages 1431–1461.
Basna, R., Nassar, H., and Podgórski, K. (2022). Data driven orthogonal basis selection for functional data analysis. Journal of Multivariate Analysis, 189:104868.
Be...

[3]

Ferraty, F. (2006). Nonparametric Functional Data Analysis. Springer.
Galeano, P., Joseph, E., and Lillo, R. E. (2015). The Mahalanobis distance for functional data with applications to classification. Technometrics, 57(2):281–291.
Genton, M. G. (2007). Separable approximations of space-time covariance matrices. Environmetrics: The Official Journal of the...

[4]

Zuo, Y. (2003). Projection-based depth functions and associated medians. The Annals of Statistics, 31(5):1460–1490.
Zuo, Y. and Serfling, R. (2000). General notions of statistical depth function. The Annals of Statistics, pages 461–482.

A Further Preliminaries

A.1 Matrix Normal Di...

[5]

for computing the maximum likelihood estimates (20)-(21). Starting from any positive definite initialization, the proposed procedure is shown to converge almost surely to the positive definite covariance estimates, provided $h \geq \lfloor p/m + m/p \rfloor + 2$. The convergence also holds if the ellipticity assumption is violated. For technical and implementation details, ...
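
As a rough illustration of such an iterative scheme, the sketch below alternates between the two maximum likelihood updates of a matrix normal model. The excerpt does not name the procedure, so treating it as the classical alternating ("flip-flop") iteration is an assumption, as are the function names `flip_flop` and `min_subset_size`, the stopping rule, and the synthetic data. Only the bound on $h$ is taken from the text.

```python
import numpy as np

def min_subset_size(p, m):
    """Smallest h with h >= floor(p/m + m/p) + 2, the bound quoted in the text."""
    return int(np.floor(p / m + m / p)) + 2

def flip_flop(X, n_iter=100, tol=1e-10):
    """Alternating ML updates for a sample X of shape (n, p, m).

    Returns the sample mean (p, m), a row covariance (p, p) and a column
    covariance (m, m). The two covariances are identified only up to a
    reciprocal scaling constant.
    """
    n, p, m = X.shape
    mean = X.mean(axis=0)
    R = X - mean                                  # centred observations
    S_row, S_col = np.eye(p), np.eye(m)           # positive definite start
    for _ in range(n_iter):
        # S_row = sum_i R_i S_col^{-1} R_i' / (n m)
        S_row_new = np.einsum("iab,bc,idc->ad", R, np.linalg.inv(S_col), R) / (n * m)
        # S_col = sum_i R_i' S_row^{-1} R_i / (n p)
        S_col_new = np.einsum("iba,bc,icd->ad", R, np.linalg.inv(S_row_new), R) / (n * p)
        change = np.linalg.norm(S_row_new - S_row) + np.linalg.norm(S_col_new - S_col)
        S_row, S_col = S_row_new, S_col_new
        if change < tol:
            break
    return mean, S_row, S_col

rng = np.random.default_rng(1)
n, p, m = 200, 3, 4
X = rng.standard_normal((n, p, m))                # synthetic sample, not real data
M_hat, S_row_hat, S_col_hat = flip_flop(X)
h_min = min_subset_size(p, m)
```

In this toy setting the sample size is far above `h_min`, so both estimates stay positive definite throughout the iteration.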

[6]

, $p$. By collecting the coefficients in a matrix $A = (a_1, \dots, a_p)' \in \mathbb{R}^{p \times m}$, we can write the multivariate process as $Y(t) = A\phi(t) + \tilde{\varepsilon}(t)$. The coefficients $a_{jk}$, $j = 1, \dots, p$, $k = 1, \dots, m$, are usually determined by a least squares approach, often with a roughness penalty; see Ramsay and Silverman (2005) for more details. A.5 FPCA fo...
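
A minimal sketch of this least squares step is given below. It assumes that all $p$ component functions are observed on a common grid and uses a small Fourier-type basis purely for illustration. The function name `fit_basis_coefficients`, the parameter `lam`, and the optional `penalty` matrix are hypothetical and not taken from the paper.

```python
import numpy as np

def fit_basis_coefficients(Y, Phi, lam=0.0, penalty=None):
    """Estimate A in Y(t) = A phi(t) + noise by (penalized) least squares.

    Y:       (p, q) values of the p component functions at q time points.
    Phi:     (q, m) basis functions evaluated at the same time points.
    penalty: optional symmetric (m, m) roughness matrix, weighted by lam.
    Minimizes ||Y - A Phi'||_F^2 + lam * tr(A penalty A').
    """
    m = Phi.shape[1]
    if penalty is None:
        penalty = np.zeros((m, m))
    gram = Phi.T @ Phi + lam * penalty            # (m, m) normal-equation matrix
    return np.linalg.solve(gram, Phi.T @ Y.T).T   # coefficient matrix, shape (p, m)

# Illustrative basis on [0, 1]: constant, sine and cosine.
t = np.linspace(0.0, 1.0, 50)
Phi = np.column_stack([np.ones_like(t), np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])

# Noise-free toy curves generated from known coefficients.
A_true = np.array([[1.0, 0.5, -0.2], [0.0, -1.0, 0.3]])
Y = A_true @ Phi.T
A_hat = fit_basis_coefficients(Y, Phi)
```

With `lam=0.0` this reduces to ordinary least squares. A positive `lam` together with a roughness matrix would shrink the fit towards smoother curves.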

[7]

, $p$. (23)

To see that the relations in (23) indeed hold, observe first that orthogonality of $V^{\mathrm{row}}$ and orthonormality of $\xi_i$, $i = 1, \dots, m$, give the orthonormality of the corresponding products. Furthermore,

$$\lambda^{\mathrm{ker}}_i \lambda^{\mathrm{row}}_j \bigl(\xi_i(s)\, v^{\mathrm{row}}_j\bigr) = \lambda^{\mathrm{row}}_j \int_T \kappa(s, t)\, \xi_i(t)\, v^{\mathrm{row}}_j \, dt = \int_T \Sigma^{\mathrm{row}} \kappa(s, t)\, \xi_i(t)\, v^{\mathrm{row}}_j \, dt = \int_T K(s, t) \bigl(\xi_i(t)\, v^{\mathrm{row}}_j\bigr) \, dt. \tag{24}$$

The uniqueness of ...

[8]

, $p$. Thus, the functions in $\{\xi^{(j)}_i e_j : i \geq 1,\ j = 1, \dots, p\}$ are the eigenfunctions of $K$, while $\lambda^{(j)}_i$, $i \geq 1$, $j = 1, \dots, p$, are the corresponding eigenvalues. In other words, the spectrum of $K$ corresponds to the union of the spectra of the individual covariance operators $K_j$, $j = 1, \dots, p$. Then, for $m_1, \dots, m_p$ as described in the statement of the res...

[9]

finally gives that $A \sim \mathrm{MN}(0, \Sigma^{\mathrm{col}}, \Sigma^{\mathrm{row}})$, thus completing the proof.

Corollary B.0.1. Let $X(t) = a'\phi(t)$ be a rank $m \in \mathbb{N}$ stochastic process with mean $\mu$ and covariance $\kappa$, with coefficients $a \in \mathbb{R}^m$ and basis $\phi = (\phi_1, \dots, \phi_m)'$. Then the following holds: (i) $a$ has a multivariate distribution with mean $m_a$ and covariance $\mathrm{Cov}(a) = \Sigma \in \mathrm{PDS}(m)$ such that $m_a' \phi(t) = \mu(t)$ and...

[10]

,fMMD(Xn)) Algorithm 1 yields robust estimators

Run MMCD procedure onAand get( ˆMA,H ∗ , ˆΣrow H ∗ , ˆΣcol H ∗ ,MMD(A)); 3:Obtain functional data objects for mean and covariance ˆµ(t) = ˆMA,H ∗ ϕ(t); ˆΣrow = ˆΣcol H ∗; ˆκ(s, t) =ϕ′(s) ˆΣrow H ∗ ϕ(t); fMMD(Xi) = fMMD(Xi, ˆµ; ˆΣrow,ˆκ, mp) = MMD(Ai, ˆMA,H ∗; ˆΣrow H ∗ , ˆΣcol H ∗) Output: ˆµ, ˆΣrow,ˆκ,(fMMD(X1), . . . ,fMMD(Xn)) Algorithm 1 yields robust...
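
The final step of the algorithm, turning matrix-valued estimates into functional objects and distances, can be sketched as follows. This is a hedged illustration, not the authors' implementation: the robust MMCD fit itself is not shown, the inputs `M`, `cov_vars` ($p \times p$) and `cov_coef` ($m \times m$) are made-up placeholders for its output, and which of them corresponds to the row and column estimates of step 2 is an assumption. The distance used is the standard matrix Mahalanobis form $\mathrm{tr}(\Sigma_2^{-1} D' \Sigma_1^{-1} D)$ with $D = A - M$; all function names are hypothetical.

```python
import numpy as np

def matrix_mahalanobis_sq(A, M, cov_vars, cov_coef):
    """Squared matrix Mahalanobis distance tr(cov_coef^{-1} D' cov_vars^{-1} D), D = A - M."""
    D = A - M
    return float(np.trace(np.linalg.solve(cov_coef, D.T) @ np.linalg.solve(cov_vars, D)))

def mean_function(M, phi_t):
    """Mean vector mu(t) = M phi(t) for a (p, m) coefficient mean M."""
    return M @ phi_t

def covariance_kernel(cov_coef, phi_s, phi_t):
    """Kernel kappa(s, t) = phi(s)' cov_coef phi(t)."""
    return float(phi_s @ cov_coef @ phi_t)

def phi(t):
    """Illustrative basis with m = 3 functions (an assumption, not the paper's basis)."""
    return np.array([1.0, np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])

# Placeholder estimates standing in for the output of a robust fit.
p, m = 2, 3
M = np.zeros((p, m))
cov_vars = np.array([[2.0, 0.5], [0.5, 1.0]])
cov_coef = np.diag([1.0, 0.5, 0.25])

rng = np.random.default_rng(2)
A = rng.standard_normal((p, m))                   # one coefficient matrix A_i
d2 = matrix_mahalanobis_sq(A, M, cov_vars, cov_coef)
k00 = covariance_kernel(cov_coef, phi(0.0), phi(0.0))
```

The trace form avoids building the $pm \times pm$ Kronecker covariance explicitly, although both give the same value.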
