Public Documentation
Documentation for CovarianceEstimation.jl
's public interface.
See Internal Documentation for internal package docs.
Contents
Index
CovarianceEstimation.AnalyticalNonlinearShrinkage
CovarianceEstimation.BiweightMidcovariance
CovarianceEstimation.CommonCovariance
CovarianceEstimation.ConstantCorrelation
CovarianceEstimation.DiagonalCommonVariance
CovarianceEstimation.DiagonalUnequalVariance
CovarianceEstimation.DiagonalUnitVariance
CovarianceEstimation.LinearShrinkage
CovarianceEstimation.NormLossCov
CovarianceEstimation.PerfectPositiveCorrelation
CovarianceEstimation.StatLossCov
CovarianceEstimation.WoodburyEstimator
Statistics.cov
Public Interface
Statistics.cov
— Functioncov(lse::LinearShrinkage, X, [weights::FrequencyWeights]; dims=1, mean=nothing)
Linear shrinkage covariance estimator for matrix X
along dimension dims
. Computed using the method described by lse
.
Optionally provide weights
associated with each observation in X
(see StatsBase.FrequencyWeights
).
Theoretical guidance for the use of weights in shrinkage estimation seems sparse. FrequencyWeights
have a straightforward implementation, but support for other AbstractWeight
subtypes awaits analytical justification.
cov(ans::AnalyticalNonlinearShrinkage, X, [weights]; dims=1, mean=nothing)
Nonlinear covariance estimator derived from the sample covariance estimator S
and its eigenvalue decomposition (which can be given through decomp
). See Ledoit and Wolf's paper http://www.econ.uzh.ch/static/wp/econwp264.pdf The keyword mean
can be nothing
(centering via estimated mean), zero (no centering) or a provided vector. In the first case, a rank-1 modification is applied and therefore the effective sample size is decreased by one (see analytical_nonlinear_shrinkage
). In the latter two case the mean cannot have been estimated on the data (otherwise the effective sample size will be 1 larger than it should be resulting in numerical instabilities). If you are unsure, use either nothing
or provide an explicit (non-estimated) vector (possibly a zero vector) and avoid the use of mean=0
.
- Time complexity (including formation of
S
)- (p<n): O(np^2 + n^2) with moderate constant
- (p>n): O(p^3) with low constant (dominated by eigendecomposition of S)
cov(estimator::WoodburyEstimator, X::AbstractMatrix, weights::FrequencyWeights...; dims::Int=1, mean=nothing, UsV=nothing)
Estimate the covariance matrix from the data matrix X
using a "spiked" covariance model
Σ = σ²I + U * Λ * U',
where U
is a low-rank matrix of eigenvectors and Λ
is a diagonal matrix.
Reference: Donoho, D.L., Gavish, M. and Johnstone, I.M., 2018. Optimal shrinkage of eigenvalues in the spiked covariance model. Annals of statistics, 46(4), p.1742.
When σ²
is not supplied in estimator
, it is calculated from the residuals X - X̂
, where X̂
is the low-rank approximation of X
used to generate U
and Λ
.
If X
is too large to manipulate in memory, you can pass UsV = (U, s, V)
(a truncated SVD of X - mean(X; dims)
) and then X
will only be used compute the dimensionality and number of observations. This requires that you specify estimator.σ²
.
CovarianceEstimation.LinearShrinkage
— TypeLinearShrinkage(target, shrinkage; corrected=false, drop_var0=false)
Linear shrinkage estimator described by equation $(1 - \lambda) S + \lambda F$ where $S$ is standard covariance matrix, $F$ is shrinkage target described by argument target
and $\lambda$ is a shrinkage parameter, either given explicitly in shrinkage
or automatically determined according to one of the supported methods.
The corrected estimator is used if corrected
is true. drop_var0=true
drops the zero-variance variables from the computation of \lambda
.
CovarianceEstimation.DiagonalUnitVariance
— TypeDiagonalUnitVariance
Target for linear shrinkage: unit matrix. A subtype of LinearShrinkageTarget
where
- $F_{ij}=1$ if $i=j$ and
- $F_{ij}=0$ otherwise
CovarianceEstimation.DiagonalCommonVariance
— TypeDiagonalCommonVariance
Target for linear shrinkage: unit matrix multiplied by average variance of variables. A subtype of LinearShrinkageTarget
where
- $F_{ij}=v$ if $i=j$ with $v=\mathrm{tr}(S)/p$ and
- $F_{ij}=0$ otherwise
CovarianceEstimation.DiagonalUnequalVariance
— TypeDiagonalUnequalVariance
Target for linear shrinkage: diagonal of covariance matrix. A subtype of LinearShrinkageTarget
where
- $F_{ij}=s_{ij}$ if $i=j$ and
- $F_{ij}=0$ otherwise
CovarianceEstimation.CommonCovariance
— TypeCommonCovariance
Target for linear shrinkage: see target_C
. A subtype of LinearShrinkageTarget
where
- $F_{ij}=v$ if $i=j$ with $v=\mathrm{tr}(S)/p$ and
- $F_{ij}=c$ with $c=\sum_{i\neq j} S_{ij}/(p(p-1))$ otherwise
CovarianceEstimation.PerfectPositiveCorrelation
— TypePerfectPositiveCorrelation
Target for linear shrinkage: see target_E
. A subtype of LinearShrinkageTarget
where
- $F_{ij}=S_{ij}$ if $i=j$ and
- $F_{ij}=\sqrt{S_{ii}S_{jj}}$ otherwise
CovarianceEstimation.ConstantCorrelation
— TypeConstantCorrelation
Target for linear shrinkage: see target_F
. A subtype of LinearShrinkageTarget
where
- $F_{ij}=S_{ij}$ if $i=j$ and
- $F_{ij}=\overline{r}\sqrt{S_{ii}S_{jj}}$ otherwise where $\overline{r}$ is the average sample correlation
CovarianceEstimation.AnalyticalNonlinearShrinkage
— TypeAnalyticalNonlinearShrinkage
Analytical nonlinear shrinkage estimator. See docs for analytical_nonlinear_shrinkage
for details.
CovarianceEstimation.BiweightMidcovariance
— TypeBiweightMidcovariance(; c=9.0, modify_sample_size=false)
The biweight midcovariance is a covariance estimator that is resilient to outliers.
The technique derives originally from astrophysics [1], and is implemented in the Python module Astropy [2], as well as in NIST's Dataplot [3].
Consider two random variables $x$ and $y$, for which we have $n$ observations $\{(x_i, y_i)\}$. The biweight midcovariance is then defined to be:
\[n_s\cdot\frac{ \sum_{|u_i|<1,|v_i|<1}(x_i - M_x)(1 - u_i^2)^2(y_i - M_y)(1 - v_i^2)^2 }{ \left(\sum_{|u_i|<1}(1 - u_i^2)(1-5u_i^2)\right) \left(\sum_{|v_i|<1}(1 - v_i^2)(1-5v_i^2)\right) }\]
where $n_s$ is the sample size, $M_x$ and $M_y$ are the medians of $\{x_i\}$ and $\{y_i\}$ respectively, and
\[\begin{aligned} u_i &= \frac{x_i - M_x}{c \cdot \mathrm{MAD}_x} \\ v_i &= \frac{y_i - M_y}{c \cdot \mathrm{MAD}_y}, \end{aligned}\]
where $\mathrm{MAD}$ represents the median absolute deviation,
\[\begin{aligned} \mathrm{MAD}_x &= \mathrm{median}(\left\{\left|x_i - M_x\right|\right\}) \\ \mathrm{MAD}_y &= \mathrm{median}(\left\{\left|y_i - M_y\right|\right\}). \end{aligned}\]
If either $\mathrm{MAD}_x = 0$ or $\mathrm{MAD}_y = 0$, the pairwise covariance is defined to be zero.
The parameter $c$ is a tuning constant, for which the default is $9.0$. Larger values will reduce the number of outliers that are removed — i.e. reducing robustness, but increasing sample efficiency.
Fields
c::Float64
: The tuning constant corresponding to $c$ above.modify_sample_size::Bool
: Iffalse
, then we use a sample size $n_s$ equal to the total number of observations $n$. This is consistent with the standard definition of biweight midcovariance in the literature. Otherwise, we count only those elements which are not rejected as outliers in the numerator, i.e. those for which $|u_i|<1$ and $|v_i|<1$. This follows the implementation in astropy [2].
Complexity
- Space: $O(p^2)$
- Time: $O(np^2)$
References
[1] Beers, Flynn, and Gebhardt (1990; AJ 100, 32) "Measures of Location and Scale for Velocities in Clusters of Galaxies – A Robust Approach"
CovarianceEstimation.NormLossCov
— TypeNormLossCov(norm::Symbol, pivotidx::Int)
Specify a loss function for which the estimated covariance will be optimal. norm
is one of :L1
, :L2
, or :Linf
, and pivotidx
is an integer from 1 to 7, as specified in Table 1 (p. 1755) of Donoho et al. (2018). In the table below, A
and B
are the target and sample covariances, respectively, and the loss function is the specified norm on the quantity in the pivot
column:
pivotidx | pivot | Notes |
---|---|---|
1 | A - B | |
2 | A⁻¹ - B⁻¹ | |
3 | A⁻¹ B - I | Not available for :L1 |
4 | B⁻¹ A - I | Not available for :L1 |
5 | A⁻¹ B + B⁻¹ A - 2I | Not supported |
6 | sqrt(A) \ B / sqrt(A) - I | |
7 | log(sqrt(A) \ B / sqrt(A)) | Not supported |
See also StatLossCov
.
Reference: Donoho, D.L., Gavish, M. and Johnstone, I.M., 2018. Optimal shrinkage of eigenvalues in the spiked covariance model. Annals of statistics, 46(4), p.1742.
CovarianceEstimation.StatLossCov
— TypeStatLossCov(mode::Symbol)
Specify a loss function for which the estimated covariance will be optimal. mode
is one of :st
, :ent
, :div
, :aff
, or :fre
, as specified in Table 2 (p. 1757) of Donoho et al. (2018). In the table below, A
and B
are the target and sample covariances, respectively:
mode | loss | Interpretation |
---|---|---|
:st | st(A, B) = tr(A⁻¹ B - I) - log(det(B)/det(A)) | Minimize KL-divergence between N(0, A) and N(0, B) where N is normal distribution |
:ent | st(B, A) | Minimize errors in Mahalanobis distances |
:div | st(A, B) + st(B, A) | |
:aff | 0.5 * log(det(A + B) / (2 * sqrt(det(A*B)))) | Minimize Hellinger distance between N(0, A) and N(0, B) |
:fre | tr(A + B - 2sqrt(A*B)) |
CovarianceEstimation.WoodburyEstimator
— TypeWoodburyEstimator(loss::LossFunction, rank::Integer;
σ²::Union{Real,Nothing}=nothing, corrected::Bool=false)
Specify that covariance matrices should be estimated using a "spiked" covariance model
Σ = σ²I + U * Λ * U'
loss
is either a NormLossCov
or StatLossCov
object, which specifies the loss function for which the estimated covariance will be optimal. rank
is the maximum number of eigenvalues Λ
to retain in the model. Optionally, one may specify σ²
directly, or it can be estimated from the data matrix (σ²=nothing
). Set corrected=true
to use the unbiased estimator of the variance.