A Generalized Central Limit Theorem for Two–Sample
U–Statistics Under Dependence
1 Introduction
This document sketches (at a high level) a proof of the asymptotic normality of two–sample
U–statistics, even though the summands exhibit dependence across indices. The standard
i.i.d. Central Limit Theorem (CLT) does not apply directly because each \(X_i\) is re-used against all \(Y_j\) (and
vice versa), causing correlations. Nonetheless, there is a well-known generalized CLT (often attributed
to Hoeffding [Hoe48], detailed in [Ser80], [Vaa98], and others) that shows such U–statistics still
converge to a normal distribution when properly normalized.
2 Two–Sample U–Statistics
Let \(\{X_i\}_{i=1}^m\) be i.i.d. random variables from a distribution \(F\), and let \(\{Y_j\}_{j=1}^n\) be i.i.d. from a distribution \(G\), independent
of \(\{X_i\}\). We define a (measurable) kernel
\[ \phi (x,y) : \mathcal {X} \times \mathcal {Y} \;\to \; \mathbb {R} \]
with \(\E [\phi (X,Y)^2] < \infty \). The associated two–sample U–statistic is
\[ U_{m,n} \;=\; \frac {1}{m\,n} \sum _{i=1}^m \sum _{j=1}^n \phi \bigl (X_i, Y_j\bigr ). \]
We want to prove that for large \(m,n\), this statistic is asymptotically normal after an appropriate centering
and scaling.
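As a concrete illustration, here is a minimal Python sketch that computes \(U_{m,n}\) for the Mann–Whitney kernel \(\phi (x,y)=\mathbf {1}\{x\le y\}\) (the kernel and the Gaussian distributions are our choices, purely for illustration; any kernel with finite second moment is handled the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 300
X = rng.normal(0.0, 1.0, size=m)   # i.i.d. draws from F = N(0, 1)
Y = rng.normal(0.5, 1.0, size=n)   # i.i.d. draws from G = N(0.5, 1), independent of X

# Kernel phi(x, y) = 1{x <= y} (Mann-Whitney), evaluated on all (i, j) pairs.
phi = (X[:, None] <= Y[None, :]).astype(float)   # m x n matrix of phi(X_i, Y_j)

U_mn = phi.mean()   # U_{m,n} = (1/(mn)) sum_{i,j} phi(X_i, Y_j)
print(U_mn)         # here theta = P(X <= Y) = Phi(0.5/sqrt(2)) ~ 0.638
```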
Notation and Limits
Define \(N = m+n\) as the total sample size. Often one assumes \(m,n \to \infty \) with
\[ \frac {m}{m+n} \;\to \; \alpha \in (0,1), \quad \text {so } \frac {n}{m+n} \;\to \; 1-\alpha . \]
We will show that
\[ \sqrt {m+n}\,\Bigl (U_{m,n} - \E [U_{m,n}]\Bigr ) \;\xrightarrow {\;d\;}\; \mathcal {N}\bigl (0,\sigma ^2\bigr ) \]
for some \(\sigma ^2 > 0\) that depends on \(F,G,\phi \) and on the asymptotic ratio \(\alpha \).
3 Hoeffding–Type Decomposition
A classical approach is to write a Hoeffding decomposition of the kernel \(\phi (x,y)\) in terms of its
“one–dimensional projections.” That is, we write
\[ \phi (x,y) \;=\; \theta \;+\; f(x) \;+\; g(y) \;+\; h(x,y), \]
where:
\[ \theta \;=\; \E [\phi (X,Y)], \quad f(x) \;=\; \E [\phi (x,Y)] - \theta , \quad g(y) \;=\; \E [\phi (X,y)] - \theta , \]
and
\[ h(x,y) \;=\; \phi (x,y) - \theta - f(x) - g(y). \]
By construction, \(h(x,y)\) satisfies
\[ \E [h(X,Y) \mid X=x] \;=\; 0, \quad \E [h(X,Y) \mid Y=y] \;=\; 0. \]
Hence \(h\) is degenerate: it has conditional mean \(0\) given either coordinate, and in particular mean \(0\) overall.
Then
\[ U_{m,n} \;=\; \frac {1}{mn} \sum _{i=1}^m \sum _{j=1}^n \phi (X_i,Y_j) \;=\; \theta \;+\; \frac {1}{mn}\sum _{i=1}^m \sum _{j=1}^n \Bigl [f(X_i) + g(Y_j) + h(X_i,Y_j)\Bigr ]. \]
We separate this into three sums:
\[ U_{m,n} - \theta \;=\; \frac {1}{mn}\sum _{i,j} f(X_i) \;+\; \frac {1}{mn}\sum _{i,j} g(Y_j) \;+\; \frac {1}{mn}\sum _{i,j} h(X_i,Y_j). \]
Notice that
\[ \frac {1}{mn}\sum _{i=1}^m \sum _{j=1}^n f(X_i) \;=\; \frac {1}{m}\,\sum _{i=1}^m f(X_i) \quad \text {and} \quad \frac {1}{mn}\sum _{i=1}^m \sum _{j=1}^n g(Y_j) \;=\; \frac {1}{n}\,\sum _{j=1}^n g(Y_j). \]
Hence
\[ U_{m,n} - \theta \;=\; \frac {1}{m}\,\sum _{i=1}^m f(X_i) \;+\; \frac {1}{n}\,\sum _{j=1}^n g(Y_j) \;+\; \frac {1}{mn}\sum _{i=1}^m\sum _{j=1}^n h(X_i,Y_j). \]
Define
\[ A_m \;=\; \sum _{i=1}^m f(X_i), \quad B_n \;=\; \sum _{j=1}^n g(Y_j), \quad C_{m,n} \;=\; \sum _{i=1}^m\sum _{j=1}^n h(X_i,Y_j). \]
So
\[ U_{m,n} - \theta \;=\; \frac {A_m}{m} \;+\; \frac {B_n}{n} \;+\; \frac {C_{m,n}}{mn}. \]
Centering
We also have \(\E [A_m] = m\,\E [f(X)] = 0\) by construction of \(f\). Similarly \(\E [B_n]=0\), and \(\E [C_{m,n}]=0\). Thus
\[ \E [U_{m,n}] \;=\; \theta . \]
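The decomposition is an exact algebraic identity, not an approximation, and this is easy to verify numerically. In the Gaussian/Mann–Whitney toy model above (our choice; there \(\theta \), \(f\), \(g\) have closed forms through the normal CDF):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
m, n = 200, 300
X = rng.normal(0.0, 1.0, size=m)    # F = N(0, 1)
Y = rng.normal(0.5, 1.0, size=n)    # G = N(0.5, 1)

# Closed-form projections for phi(x, y) = 1{x <= y} under this model:
theta = norm.cdf(0.5 / np.sqrt(2))          # theta = P(X <= Y)
f = (1.0 - norm.cdf(X - 0.5)) - theta       # f(X_i) = P(X_i <= Y | X_i) - theta
g = norm.cdf(Y) - theta                     # g(Y_j) = P(X <= Y_j | Y_j) - theta

phi = (X[:, None] <= Y[None, :]).astype(float)
h = phi - theta - f[:, None] - g[None, :]   # degenerate remainder h(X_i, Y_j)

# U_{m,n} - theta = A_m/m + B_n/n + C_{m,n}/(mn), exactly:
lhs = phi.mean() - theta
rhs = f.mean() + g.mean() + h.mean()
assert np.isclose(lhs, rhs)
```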
4 Proof of Asymptotic Normality: Outline
We want to show
\[ \sqrt {m+n} \, \Bigl (U_{m,n} - \theta \Bigr ) \;=\; \sqrt {m+n} \, \Bigl ( \frac {A_m}{m} + \frac {B_n}{n} + \frac {C_{m,n}}{mn} \Bigr ) \;\xrightarrow {\;d\;}\; \mathcal {N}(0,\sigma ^2). \]
Below is a sketch of how each term behaves and how one applies classical CLT arguments:
Term 1: \(\sqrt {m+n}\,\tfrac {A_m}{m}\)
Recall \(A_m = \sum _{i=1}^m f(X_i)\) where \(\{X_i\}\) are i.i.d. from \(F\). Then
\[ \E [f(X_i)] = 0, \quad \Var [f(X_i)] =: \sigma _f^2 < \infty . \]
A classical CLT for i.i.d. sequences shows
\[ \frac {A_m}{\sqrt {m}} \;=\; \frac {1}{\sqrt {m}} \sum _{i=1}^m f(X_i) \;\xrightarrow {\;d\;}\; \mathcal {N}(0,\,\sigma _f^2). \]
But we are multiplying by \(\sqrt {m+n}\) and dividing by \(m\). Observe that
\[ \sqrt {m+n}\,\frac {A_m}{m} \;=\; \frac {\sqrt {m+n}}{m}\,\sum _{i=1}^m f(X_i) \;=\; \sqrt {\frac {m+n}{m}} \,\frac {A_m}{\sqrt {m}}. \]
If \(m/(m+n)\to \alpha \in (0,1)\), then \(\frac {m+n}{m} \to \frac {1}{\alpha }\). Thus
\[ \sqrt {\frac {m+n}{m}} \;\to \; \frac {1}{\sqrt {\alpha }}. \]
Hence
\[ \sqrt {m+n}\,\frac {A_m}{m} \;=\; \left (\frac {1}{\sqrt {\alpha }} + o(1)\right )\,\frac {A_m}{\sqrt {m}}. \]
So in the limit, by Slutsky’s theorem, this term converges in distribution to \(\mathcal {N}\bigl (0,\,\sigma _f^2/\alpha \bigr )\).
Term 2: \(\sqrt {m+n}\,\tfrac {B_n}{n}\)
An exactly analogous argument applies to \(B_n = \sum _{j=1}^n g(Y_j)\) with \(\Var [g(Y_j)] =: \sigma _g^2\), so
\[ \sqrt {m+n}\,\frac {B_n}{n} \;=\; \sqrt {\frac {m+n}{n}} \, \frac {B_n}{\sqrt {n}} \;\xrightarrow {\;d\;}\; \mathcal {N}\Bigl (0,\;\frac {\sigma _g^2}{1-\alpha }\Bigr ). \]
Term 3: \(\sqrt {m+n}\,\tfrac {C_{m,n}}{mn}\)
Recall
\[ C_{m,n} \;=\; \sum _{i=1}^m\sum _{j=1}^n h(X_i,Y_j), \quad \E [C_{m,n}] = 0, \]
where \(h(x,y)\) satisfies \(\E [h(X,Y)\mid X]=0\) and \(\E [h(X,Y)\mid Y]=0\). Intuitively, \(h(X_i,Y_j)\) is a zero-mean “interaction part,” and this degeneracy makes the terms uncorrelated across all distinct pairs \((i,j)\neq (i',j')\), even those sharing an \(X_i\) or a \(Y_j\). The literature on two–sample U–statistics (see [Ser80], [Vaa98, Chapter 12]) shows that
\[ \Var \Bigl (\sum _{i,j} h(X_i,Y_j)\Bigr ) \;=\; O\bigl (m\,n\bigr ): \]
the partial overlaps do not inflate the order beyond \(mn\) (they do not produce an \((mn)^2\) term).
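To see why no \((mn)^2\) term can arise, fix \(i\) and \(j\neq j'\) and condition on \(X_i\): since \(Y_j\) and \(Y_{j'}\) are independent of each other and of \(X_i\),
\begin{align*} \Cov \bigl (h(X_i,Y_j),\,h(X_i,Y_{j'})\bigr ) &= \E \Bigl [\E \bigl [h(X_i,Y_j)\,h(X_i,Y_{j'}) \mid X_i\bigr ]\Bigr ] \\ &= \E \Bigl [\E \bigl [h(X_i,Y_j)\mid X_i\bigr ]\;\E \bigl [h(X_i,Y_{j'})\mid X_i\bigr ]\Bigr ] \;=\; 0, \end{align*}
using \(\E [h\mid X]=0\). The case \(i\neq i'\), \(j=j'\) is symmetric (condition on \(Y_j\)), and fully disjoint pairs are independent. Hence only the \(mn\) diagonal terms contribute, and in fact \(\Var (C_{m,n}) = mn\,\Var \bigl (h(X,Y)\bigr )\) exactly.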
It follows that
\[ \Var \!\Bigl (\frac {C_{m,n}}{mn}\Bigr ) \;=\; \frac {1}{(mn)^2}\,\Var \!\Bigl (\sum _{i,j} h(X_i,Y_j)\Bigr ) \;=\; O\Bigl (\frac {m\,n}{(mn)^2}\Bigr ) \;=\; O\Bigl (\frac {1}{mn}\Bigr ). \]
Hence
\[ \sqrt {m+n}\,\frac {C_{m,n}}{mn} \;=\; O_p\Bigl (\sqrt {m+n}\,\frac {1}{\sqrt {mn}}\Bigr ) \;=\; O_p\Bigl (\sqrt {\frac {m+n}{mn}}\Bigr ). \]
Since \(mn/(m+n) = \alpha (1-\alpha )(m+n)\,(1+o(1)) \to \infty \) under the proportional-sampling assumption, we see that
\[ \sqrt {\frac {m+n}{mn}} \;=\; \sqrt {\frac {1}{m} + \frac {1}{n}} \;\to \; 0. \]
Thus \(\sqrt {m+n}\,\frac {C_{m,n}}{mn} \xrightarrow {p} 0\), meaning it is negligible in the \(\sqrt {m+n}\) scaling.
Combining the Three Terms
Summarizing:
\[ \sqrt {m+n}\,\Bigl (U_{m,n} - \theta \Bigr ) \;=\; \underbrace {\sqrt {m+n}\,\frac {A_m}{m}}_{\text {CLT } \to \, \mathcal {N}(0,\,\sigma _f^2/\alpha )} \;+\; \underbrace {\sqrt {m+n}\,\frac {B_n}{n}}_{\text {CLT } \to \, \mathcal {N}(0,\,\sigma _g^2/(1-\alpha ))} \;+\; \underbrace {\sqrt {m+n}\,\frac {C_{m,n}}{mn}}_{\xrightarrow {p} 0}. \]
By Slutsky’s theorem (and properties of sums of asymptotically normal terms), we get an overall
normal limit whose variance is \(\sigma _f^2/\alpha + \sigma _g^2/(1-\alpha )\). Here,
\[ \sigma _f^2 \;=\; \Var \!\bigl (f(X)\bigr ), \quad \sigma _g^2 \;=\; \Var \!\bigl (g(Y)\bigr ), \]
and \( f(x) = \E [\phi (x,Y)]-\theta , \; g(y) = \E [\phi (X,y)]-\theta . \)
That yields the main theorem:
\[ \sqrt {m+n}\,\bigl (U_{m,n} - \theta \bigr ) \;\xrightarrow {d}\; \mathcal {N}\Bigl (0,\;\sigma ^2\Bigr ) \quad \text {where}\quad \sigma ^2 \;=\; \frac {\Var [f(X)]}{\alpha } \;+\; \frac {\Var [g(Y)]}{1-\alpha }. \]
(There are alternative expressions involving \(\xi _{10}\), \(\xi _{01}\) from the “four-case” breakdown of Section 5.2. They coincide with \(\sigma _f^2,\sigma _g^2\) under the definitions above, since \(\Var [f(X)] = \Var \bigl (\E [\phi \mid X]\bigr )\) and \(\Var [g(Y)] = \Var \bigl (\E [\phi \mid Y]\bigr )\); see [Ser80; Vaa98] for details.)
Remark 1 (Multiple or vector-valued kernels). The above argument generalizes to the case \(\phi : \mathcal {X}\times \mathcal {Y} \to \mathbb {R}^d\), so
that \((U_{m,n}-\theta )\) is in \(\mathbb {R}^d\). Then \(f\) and \(g\) also take values in \(\mathbb {R}^d\). The same decomposition and variance considerations
apply (just use the multivariate CLT).
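As an empirical illustration of the theorem, the following Monte Carlo sketch (again in the Gaussian/Mann–Whitney toy model; the sample sizes and replicate count are arbitrary choices) standardizes \(U_{m,n}\) by the population \(\sigma ^2\) and checks that the result is approximately standard normal:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
m, n = 200, 300
alpha = m / (m + n)
theta = norm.cdf(0.5 / np.sqrt(2))   # P(X <= Y) for X ~ N(0,1), Y ~ N(0.5,1)

# Population projection variances sigma_f^2, sigma_g^2 by 1-d quadrature.
xs = np.linspace(-8.0, 8.0, 20001)
w = norm.pdf(xs) * (xs[1] - xs[0])                                 # N(0,1) weights
sigma_f2 = np.sum(((1.0 - norm.cdf(xs - 0.5)) - theta) ** 2 * w)   # Var f(X)
sigma_g2 = np.sum((norm.cdf(xs + 0.5) - theta) ** 2 * w)           # Var g(Y), via Y = X + 0.5
sigma2 = sigma_f2 / alpha + sigma_g2 / (1 - alpha)

reps = 2000
Z = np.empty(reps)
for r in range(reps):
    X = rng.normal(0.0, 1.0, size=m)
    Y = rng.normal(0.5, 1.0, size=n)
    U = (X[:, None] <= Y[None, :]).mean()
    Z[r] = np.sqrt(m + n) * (U - theta) / np.sqrt(sigma2)

print(Z.mean(), Z.std())   # should be close to 0 and 1, respectively
```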
5 Variance Estimators
5.1 Projection-Based (Hoeffding) Variance Estimator
For a real-valued (one-dimensional) kernel, it is straightforward to estimate the variance \(\sigma ^2\) of the limiting normal distribution by replacing \(\theta \), \(\E [\phi (x,Y)]\), and \(\E [\phi (X,y)]\) with their empirical versions.
- Estimate \(\theta \): A consistent estimate is the U–statistic itself:
\[ \hat {\theta } \;=\; U_{m,n} \;=\; \frac {1}{mn}\,\sum _{i=1}^m \sum _{j=1}^n \phi (X_i,Y_j). \]
- Estimate \(f(x)\): For each observed \(X_i\), the true value is \(f(X_i) = \E [\phi (X_i, Y)] - \theta \). We approximate \(\E [\phi (X_i, Y)]\) by averaging over the \(Y_j\)’s:
\[ \hat {f}_i \;=\; \frac {1}{n}\,\sum _{j=1}^n \phi (X_i, Y_j) \;-\; \hat {\theta }. \]
- Estimate \(g(y)\): By symmetry, for each \(Y_j\),
\[ \hat {g}_j \;=\; \frac {1}{m}\,\sum _{i=1}^m \phi (X_i, Y_j) \;-\; \hat {\theta }. \]
Sample-based estimates of \(\Var [f(X)]\) and \(\Var [g(Y)]\) are given by
\[ \widehat {\Var }\bigl [f(X)\bigr ] \;=\; \frac {1}{m}\,\sum _{i=1}^m \bigl (\hat {f}_i\bigr )^2 \qquad \text {and} \qquad \widehat {\Var }\bigl [g(Y)\bigr ] \;=\; \frac {1}{n}\,\sum _{j=1}^n \bigl (\hat {g}_j\bigr )^2. \]
(One might prefer \(m-1\) or \(n-1\) in the denominators for unbiasedness, but for large \(m,n\) the difference is negligible.)
Putting it all together, if
\[ \alpha _{m,n} \;=\; \frac {m}{m+n} \to \alpha \in (0,1), \]
a consistent estimator for the asymptotic variance \(\sigma ^2\) is
\[ \widehat {\sigma }^2 \;=\; \frac {\widehat {\Var }[f(X)]}{\,\alpha _{m,n}\,} \;+\; \frac {\widehat {\Var }[g(Y)]}{\,1-\alpha _{m,n}\,}. \]
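A direct implementation of this estimator (a sketch; the function name and the convention of passing the precomputed \(m\times n\) kernel matrix are our choices):

```python
import numpy as np

def projection_variance(phi_matrix):
    """Hoeffding projection estimate of the asymptotic variance sigma^2 of
    sqrt(m+n) * (U_mn - theta), given phi_matrix[i, j] = phi(X_i, Y_j)."""
    m, n = phi_matrix.shape
    theta_hat = phi_matrix.mean()                  # hat{theta} = U_{m,n}
    f_hat = phi_matrix.mean(axis=1) - theta_hat    # hat{f}_i, averaging over the Y_j
    g_hat = phi_matrix.mean(axis=0) - theta_hat    # hat{g}_j, averaging over the X_i
    var_f = np.mean(f_hat ** 2)                    # widehat{Var}[f(X)]
    var_g = np.mean(g_hat ** 2)                    # widehat{Var}[g(Y)]
    alpha_mn = m / (m + n)
    return var_f / alpha_mn + var_g / (1.0 - alpha_mn)
```

Dividing the result by \(m+n\) then gives a plug-in estimate of \(\Var (U_{m,n})\) itself.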
5.2 Direct Covariance Decomposition Estimator
An alternative way to arrive at a variance estimator is to “unpack” the variance of the U-statistic by
writing
\[ U_{m,n} = \frac {1}{mn} \sum _{i=1}^m \sum _{j=1}^n \phi (X_i,Y_j) \]
and then writing
\[ \Var (U_{m,n}) = \frac {1}{m^2n^2}\sum _{i,j}\sum _{i',j'} \Cov \Bigl (\phi (X_i,Y_j),\phi (X_{i'},Y_{j'})\Bigr ). \]
Because the two samples are independent, we have four distinct cases for the indices:
- Case 1: \(i=i'\) and \(j=j'\). There are \(mn\) such terms, each contributing
\[ \Var \bigl (\phi (X_i,Y_j)\bigr ). \]
- Case 2: \(i=i'\) and \(j\neq j'\). There are \(m \, n(n-1)\) such terms. For a fixed \(i\), conditioning on \(X_i\) and using the law of total expectation,
\begin{align*} \Cov \Bigl (\phi (X_i,Y_j),\phi (X_i,Y_{j'})\Bigr ) &= \E \Bigl [\phi (X_i,Y_j)\phi (X_i,Y_{j'})\Bigr ] - \theta ^2 \\ &= \E \Bigl [ \bigl (\E [\phi (X_i,Y_j) \mid X_i]\bigr )^2\Bigr ] - \theta ^2 \\ &= \sigma _1^2, \end{align*}
where \(\sigma _1^2 = \Var \bigl (\E [\phi (X,Y) \mid X]\bigr )\).
- Case 3: \(i\neq i'\) and \(j=j'\). Similarly, there are \(n \, m(m-1)\) such terms, each with covariance
\[ \Cov \Bigl (\phi (X_i,Y_j),\phi (X_{i'},Y_j)\Bigr ) = \sigma _2^2, \]
where \(\sigma _2^2 = \Var \bigl (\E [\phi (X,Y) \mid Y]\bigr )\).
- Case 4: \(i\neq i'\) and \(j\neq j'\). In this case the pairs \((X_i,Y_j)\) and \((X_{i'},Y_{j'})\) are independent, so the covariance is zero.
Thus, in population terms we have
\[ \Var (U_{m,n}) = \frac {1}{m^2 n^2}\Bigl \{\, m\,n\Var \bigl (\phi (X,Y)\bigr ) + m\,n(n-1)\,\sigma _1^2 + n\,m(m-1)\,\sigma _2^2 \Bigr \}. \]
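This population formula can be checked by simulation. In the Gaussian/Mann–Whitney toy model, \(\Var \bigl (\phi (X,Y)\bigr ) = \theta (1-\theta )\) (an indicator kernel is Bernoulli) and \(\sigma _1^2, \sigma _2^2\) reduce to one-dimensional integrals. A sketch (the setup is ours):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
m, n = 50, 80
theta = norm.cdf(0.5 / np.sqrt(2))       # P(X <= Y) for X ~ N(0,1), Y ~ N(0.5,1)

# sigma_1^2 = Var(E[phi|X]) and sigma_2^2 = Var(E[phi|Y]) by quadrature.
xs = np.linspace(-8.0, 8.0, 20001)
w = norm.pdf(xs) * (xs[1] - xs[0])
s1 = np.sum(((1.0 - norm.cdf(xs - 0.5)) - theta) ** 2 * w)
s2 = np.sum((norm.cdf(xs + 0.5) - theta) ** 2 * w)
var_phi = theta * (1.0 - theta)          # indicator kernel => Bernoulli variance

var_formula = (m * n * var_phi + m * n * (n - 1) * s1
               + n * m * (m - 1) * s2) / (m ** 2 * n ** 2)

U = [(rng.normal(0.0, 1.0, m)[:, None] <= rng.normal(0.5, 1.0, n)[None, :]).mean()
     for _ in range(20000)]
print(var_formula, np.var(U))            # the two values should nearly agree
```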
A natural plug-in estimator replaces the population quantities by their sample analogues, centering each kernel value at the overall mean \(U_{m,n}\) (defined above) and forming the following estimates:
- For the variance term (Case 1):
\[ \widehat {V} = \frac {1}{mn-1}\sum _{i=1}^m\sum _{j=1}^n \Bigl [\phi (X_i,Y_j)-U_{m,n}\Bigr ]^2. \]
- For the covariance among terms sharing the same \(X_i\) (Case 2):
\[ \widehat {C}_1 = \frac {1}{m\,n(n-1)} \sum _{i=1}^m \sum _{j=1}^n \sum _{\substack {j'=1 \\ j'\neq j}}^n \Bigl [\phi (X_i,Y_j)-U_{m,n}\Bigr ] \Bigl [\phi (X_i,Y_{j'})-U_{m,n}\Bigr ]. \]
- For the covariance among terms sharing the same \(Y_j\) (Case 3):
\[ \widehat {C}_2 = \frac {1}{n\,m(m-1)} \sum _{j=1}^n \sum _{i=1}^m \sum _{\substack {i'=1 \\ i'\neq i}}^m \Bigl [\phi (X_i,Y_j)-U_{m,n}\Bigr ] \Bigl [\phi (X_{i'},Y_j)-U_{m,n}\Bigr ]. \]
Then a direct plug-in estimator for the variance of \(U_{m,n}\) is given by
\[ \widehat {\Var }(U_{m,n}) \;=\; \frac {1}{m^2n^2}\Bigl \{\, mn\,\widehat {V} \;+\; m\,n(n-1)\,\widehat {C}_1 \;+\; n\,m(m-1)\,\widehat {C}_2 \Bigr \}. \]
Equivalently, after canceling a factor of \(mn\) in the numerator,
\[ \widehat {\Var }(U_{m,n}) \;=\; \frac {\widehat {V}}{mn} \;+\; \frac {n-1}{mn}\,\widehat {C}_1 \;+\; \frac {m-1}{mn}\,\widehat {C}_2. \]
This estimator directly reflects the contribution to the variance from the four covariance
configurations in the double sum, and under standard conditions it provides a consistent estimate of
\(\Var (U_{m,n})\).
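Computationally, the triple sums in \(\widehat {C}_1\) and \(\widehat {C}_2\) need not be evaluated term by term: by the row- and column-sum identities derived in Appendix A, everything reduces to \(O(mn)\) work. A sketch (names ours):

```python
import numpy as np

def direct_variance(phi_matrix):
    """Plug-in estimate of Var(U_mn) from the four-case covariance
    decomposition, given phi_matrix[i, j] = phi(X_i, Y_j)."""
    m, n = phi_matrix.shape
    C = phi_matrix - phi_matrix.mean()     # centered kernel c_{ij}
    sq = np.sum(C ** 2)
    S_row = C.sum(axis=1)                  # row sums S_{i.}
    S_col = C.sum(axis=0)                  # column sums S_{.j}
    V_hat = sq / (m * n - 1)                                  # Case 1
    C1_hat = (np.sum(S_row ** 2) - sq) / (m * n * (n - 1))    # Case 2
    C2_hat = (np.sum(S_col ** 2) - sq) / (n * m * (m - 1))    # Case 3 (Case 4 vanishes)
    return (V_hat / (m * n)
            + (n - 1) / (m * n) * C1_hat
            + (m - 1) / (m * n) * C2_hat)
```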
Matrix Representation
For compactness and ease of computation, one can arrange the centered kernels into an \(m\times n\) matrix
\[ C = \Bigl [\phi (X_i,Y_j)-U_{m,n}\Bigr ]_{i=1,\dots ,m; \; j=1,\dots ,n}. \]
Then, introducing the \(m\times m\) and \(n\times n\) matrices of ones, \(J_m\) and \(J_n\), a little matrix algebra (carried out in Appendix A) shows that the variance estimator is equivalent to
\[ \widehat {\Var }(U_{m,n}) = \frac {1}{m^2n^2}\trace \Bigl (C\bigl [J_m C + C J_n - C\bigr ]^\top \Bigr ). \]
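In code, the trace formula is essentially a one-liner (a sketch; as noted in Appendix A, exact agreement with the three-term estimator requires the \(1/(mn)\) normalization of \(\widehat {V}\)):

```python
import numpy as np

def trace_variance(phi_matrix):
    """Matrix-form variance estimator via the trace identity."""
    m, n = phi_matrix.shape
    C = phi_matrix - phi_matrix.mean()          # centered kernel matrix
    Jm, Jn = np.ones((m, m)), np.ones((n, n))
    return np.trace(C @ (Jm @ C + C @ Jn - C).T) / (m ** 2 * n ** 2)
```

For large samples one would avoid forming \(J_m\) and \(J_n\) explicitly and use the row/column-sum form from Appendix A, which needs only \(O(mn)\) time and \(O(m+n)\) extra storage.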
References
[Hoe48] Wassily Hoeffding. “A Class of Statistics with Asymptotically Normal Distribution”. The Annals of Mathematical Statistics 19.3 (1948), pp. 293–325. doi: 10.1214/aoms/1177730196.

[Ser80] Robert J. Serfling. Approximation Theorems of Mathematical Statistics. Wiley Series in Probability and Mathematical Statistics. Hoboken: John Wiley & Sons, 1980. doi: 10.1002/9780470316481.

[Vaa98] A. W. van der Vaart. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press, 1998. doi: 10.1017/CBO9780511802256.
A Matrix Representation of the Variance
We start by writing
\[ c_{ij} \;=\; \phi (X_i,Y_j)-U_{m,n}, \]
and let
\[ C = \bigl [c_{ij}\bigr ]_{i=1,\dots ,m; \,j=1,\dots ,n}. \]
Also, define the row- and column–sums as
\[ S_{i\cdot } \;=\; \sum _{j=1}^n c_{ij} \quad \text {and}\quad S_{\cdot j} \;=\; \sum _{i=1}^m c_{ij}. \]
Expressing the Trace
Notice that
\[ \trace \Bigl ( C\bigl [J_m C + C J_n - C\bigr ]^\top \Bigr ) =\sum _{i=1}^m\sum _{j=1}^n c_{ij}\,\Bigl [(J_m C)_{ij}+(C J_n)_{ij}-c_{ij}\Bigr ]. \]
Because \(J_m\) is the \(m\times m\) matrix of ones and \(J_n\) is the \(n\times n\) matrix of ones, we have
\[ (J_m C)_{ij} = \sum _{k=1}^m c_{kj} \;=\; S_{\cdot j} \]
and
\[ (C J_n)_{ij} = \sum _{\ell =1}^n c_{i\ell } \;=\; S_{i\cdot }. \]
Thus, the trace becomes
\begin{align*} \trace \Bigl (C\bigl [J_m C + C J_n - C\bigr ]^\top \Bigr ) &= \sum _{i=1}^m \sum _{j=1}^n c_{ij} \Bigl [S_{\cdot j} + S_{i\cdot } - c_{ij}\Bigr ] \\ &= \sum _{j=1}^n \Bigl (\sum _{i=1}^m c_{ij}\Bigr )^2 +\sum _{i=1}^m \Bigl (\sum _{j=1}^n c_{ij}\Bigr )^2 -\sum _{i=1}^m\sum _{j=1}^n c_{ij}^2 \\ &= \sum _{j=1}^n S_{\cdot j}^2 + \sum _{i=1}^m S_{i\cdot }^2 - \sum _{i=1}^m \sum _{j=1}^n c_{ij}^2. \end{align*}
Rewriting the Variance Estimator
The variance estimator is given by
\[ \widehat {\Var }(U_{m,n}) \;=\; \frac {\widehat {V}}{mn} \;+\; \frac {n-1}{mn}\, \widehat {C}_1 \;+\; \frac {m-1}{mn}\,\widehat {C}_2, \]
with
\[ \widehat {V} \;=\; \frac {1}{mn-1} \sum _{i,j} c_{ij}^2. \]
To see the connection with the trace formula, we re-express the covariance terms. A short calculation, expanded below, shows that
\[ \widehat {C}_1 \;=\; \frac {1}{m\,n(n-1)}\Biggl [\sum _{i=1}^m S_{i\cdot }^2 - \sum _{i,j} c_{ij}^2\Biggr ] \]
and
\[ \widehat {C}_2 \;=\; \frac {1}{n\,m(m-1)}\Biggl [\sum _{j=1}^n S_{\cdot j}^2 - \sum _{i,j} c_{ij}^2\Biggr ]. \]
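Both identities follow by expanding the square of a row (or column) sum. For the first,
\[ \sum _{i=1}^m \sum _{j=1}^n \sum _{\substack {j'=1 \\ j'\neq j}}^n c_{ij}\,c_{ij'} \;=\; \sum _{i=1}^m \Biggl [\Bigl (\sum _{j=1}^n c_{ij}\Bigr )^2 - \sum _{j=1}^n c_{ij}^2\Biggr ] \;=\; \sum _{i=1}^m S_{i\cdot }^2 \;-\; \sum _{i,j} c_{ij}^2 , \]
and the second is obtained symmetrically from the column sums \(S_{\cdot j}\).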
Substituting these into the variance estimator, all three terms combine over the common denominator \(m^2n^2\): the \(\widehat {C}_1\) and \(\widehat {C}_2\) contributions are exact, and the first term equals \(\frac {1}{m^2n^2}\sum _{i,j} c_{ij}^2\) once \(\widehat {V}\) is taken with the \(1/(mn)\) normalization (the \(1/(mn-1)\) version differs only by the asymptotically negligible factor \(\tfrac {mn}{mn-1}\)). One finds
\[ \widehat {\Var }(U_{m,n}) \;=\; \frac {1}{m^2n^2} \Bigl [\sum _{i=1}^m S_{i\cdot }^2 + \sum _{j=1}^n S_{\cdot j}^2 - \sum _{i,j} c_{ij}^2\Bigr ]. \]
But the right-hand side is exactly the trace expression we obtained above. Hence,
\[ \widehat {\Var }(U_{m,n}) \;=\; \frac {1}{m^2n^2} \trace \Bigl (C \bigl [J_m C + C J_n - C\bigr ]^\top \Bigr ). \]