Review Article Volume 7 Issue 3
Virginia Polytechnic Institute and State University, USA
Correspondence: Donald R Jensen, Professor Emeritus, Virginia Polytechnic Institute and State University, Blacksburg VA, 24061, USA, Tel 540 639 0865
Received: May 02, 2018 | Published: May 23, 2018
Citation: Jensen DR. Linear inference under alpha–stable errors. Biom Biostat Int J. 2018;7(3):205–210. DOI: 10.15406/bbij.2018.07.00210
Linear inference remains pivotal in statistical practice, despite errors often having excessive tails and thus being deficient in the moments required in conventional usage. Such errors are modeled here via spherical α-stable measures on $\mathbb{R}^n$ with stability index $\alpha\in(0,2]$, arising in turn through multivariate central limit theory devoid of the second moments required for Gaussian limits. This study revisits linear inference under α-stable errors, focusing on aspects to be salvaged from the classical theory even without moments. Critical entities include Ordinary Least Squares (OLS) solutions, residuals, and conventional $F$ ratios in inference. Closure properties are seen in that OLS solutions and residual vectors under α-stable errors also have α-stable distributions, whereas $F$ ratios remain exact in level and power as for Gaussian errors. Although correlations are undefined for want of second moments, corresponding scale parameters are seen to gauge degrees of association under α-stable symmetry.
AMS subject classification: 62E15, 62H15, 62J20
Keywords: excessive errors, central limit theory, stable laws, linear inference
Models here are $\{Y=X\beta+\varepsilon\}$ with error vector $\varepsilon\in\mathbb{R}^n$. Classical linear inference rests heavily on means, variances, correlations, skewness and kurtosis parameters, these requiring moments to fourth order. To the contrary, distributions having excessive tails, and devoid of moments even of first or second order, arise in a variety of circumstances. These encompass acoustics, image processing, radar tracking, biometrics, portfolio analysis and risk management in finance, and other venues in contemporary practice. Supporting references include 1–6; monographs of note are 7–9, together with the recent work of Nolan.10 In these settings the classical foundations necessarily must be reworked.
To place this study in perspective, alternatives to Gaussian laws long have been sought in theory and practice, culminating in the class $\{S_n(0,\Sigma)\}$ consisting of elliptically contoured distributions in $\mathbb{R}^n$ centered at $0$ with scale parameters $\Sigma$. These typically are taken to be rich in moments, and to provide alternatives to the use of large-sample approximate Gaussian distributions under conditions for central limit theory. Comprehensive treatises on the theory and applications of these models are given in references 11–13.
In contrast, errors having excessive tails are modeled on occasion via spherically symmetric α-stable (SαS) distributions in $\mathbb{R}^n$ with index $\alpha\in(0,2]$. These comprise the limit distributions of standardized vector sums, specifically, Gaussian limits at $\alpha=2$, Cauchy limits at $\alpha=1$, and corresponding stable limits otherwise. These distributions are contained in the class $\{S_n(0,I_n)\}$, thus sharing its essential geometric features, but instead are deficient in moments usually ascribed to $\{S_n(0,I_n)\}$. Despite the venues cited, α-stable errors have seen limited usage for want of closed expressions for stable density functions, known only in selected cases but topics of continuing research. Nonetheless, findings reported here rest on well defined characteristic functions (chfs), on critical representations for these, and on the inversion of the latter in order to represent the α-stable densities themselves. Even here a divide emerges between independent, identically distributed (iid) α-stable sequences and dependent SαS variables, as reported in Jensen14 and as summarized here for completeness in an Appendix. In addition, many findings of the present study are genuinely nonparametric, in applying for all or portions of distributions in the range $\alpha\in(0,2]$, and thus remaining distribution-free within that class. An outline follows.
Notation and technical foundations are provided in the next major section, Preliminaries, to include Notation and accounts of Special Distributions, Central Limit Theory and Essentials of SαS Distributions as subsections. The principal sections following these address Linear Models under SαS Errors, with a separate subsection on Models Having Cauchy Errors, and Conclusions. Collateral topics are contained for completeness in Appendix A.
Notation
Spaces of note include $\mathbb{R}^n$ as Euclidean $n$-space, with $\mathbb{S}_n$ as the real symmetric $(n\times n)$ matrices and $\mathbb{S}_n^{+}$ as their positive definite varieties. Vectors and matrices are set in bold type; the transpose, inverse, trace, and determinant of $A$ are $A'$, $A^{-1}$, $\mathrm{tr}(A)$, and $|A|$; the unit vector in $\mathbb{R}^n$ is $1_n=[1,\dots,1]'$; and $I_n$ is the $(n\times n)$ identity.
Moreover, $\mathrm{Diag}(A_1,\dots,A_k)$ is a block-diagonal array, and $\Sigma^{1/2}$ is the spectral square root of $\Sigma$.
Special distributions
Given $Y=[Y_1,\dots,Y_n]'\in\mathbb{R}^n$, its distribution, expected value, and dispersion matrix are designated as $\mathcal{L}(Y)$, $E(Y)=\mu$, and $V(Y)=\Sigma$, with variance $\mathrm{Var}(Y)=\sigma^2$ on $\mathbb{R}^1$. Specifically, $\mathcal{L}(Y)=N_n(\mu,\Sigma)$ is Gaussian on $\mathbb{R}^n$ with parameters $(\mu,\Sigma)$. Distributions on $\mathbb{R}^1$ of note include the $\chi^2(u;\nu,\lambda)$ and related $\chi(u;\nu,\lambda)$ distributions, together with the Snedecor-Fisher $F(u;\nu_1,\nu_2,\lambda)$, these having $(\nu,\nu_1,\nu_2)$ as degrees of freedom and $\lambda$ a noncentrality parameter. The characteristic function (chf) for $Y\in\mathbb{R}^n$ is the expectation $\phi_Y(t)=E[e^{\iota t'Y}]$ with argument $t'=[t_1,\dots,t_n]$ and $\iota=\sqrt{-1}$; a standard source is Lukacs & Laha.15 Attention is drawn subsequently to probability density (pdf) and cumulative distribution (cdf) functions. Moreover, the class $\{\mathcal{L}(Z)\in S_n(0,\Sigma)\}$ consists of elliptically contoured distributions in $\mathbb{R}^n$ centered at $0$ and having chfs of type $\phi_Z(t)=\psi(t'\Sigma t)$. We adopt the following.
Definition 1 A distribution $P$ on $\mathbb{R}^n$ is said to be monotone unimodal about $0\in\mathbb{R}^n$ if for every $y\in\mathbb{R}^n$ and every convex set $C$ symmetric about $0\in\mathbb{R}^n$, $P[C+ky]$ is nonincreasing in $k\in[0,\infty)$. See reference 16.
Central limit theory
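For iid vectors $\{Z_1,Z_2,Z_3,\dots\}$ in $\mathbb{R}^n$, let $\bar{Z}_N=N^{-1}[Z_1+\dots+Z_N]$, and consider limit distributions of type $\{\mathcal{L}_\infty(c\bar{Z}_N)=\liminf\mathcal{L}(c\bar{Z}_N)\}$ for suitably chosen $c$. On specializing from the elliptical class $S_n(\delta,\Sigma)$ having location-scale parameters $(\delta,\Sigma)$, we consider α-stable limit distributions as follow on identifying $\mathcal{L}_\infty(c\bar{Z}_N)$ with $\mathcal{L}(Z)$.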
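Definition 2 Let $\mathcal{L}(Z)\in S_n^{\alpha}(\delta,\Sigma)$ designate an elliptical α-stable law on $\mathbb{R}^n$ centered at $\delta\in\mathbb{R}^n$ with scale parameters $\Sigma$ and stable index $\alpha\in(0,2]$, having the chf $\phi_Z(t)=\exp\{\iota t'\delta-\tfrac{1}{2}(t'\Sigma t)^{\alpha/2}\}$. Each marginal distribution of $S_n^{\alpha}(\delta 1_n,I_n)$ on $\mathbb{R}^1$, namely $S_1^{\alpha}(\delta,1)$, has the chf $\phi_{Z_i}(t)=\exp\{\iota t\delta-\tfrac{1}{2}|t|^{\alpha}\}$. Let $S\alpha S=\{S_n^{\alpha}(\delta,\Sigma);\ (\delta,\Sigma)\in(\mathbb{R}^n\otimes\mathbb{S}_n)\}$ designate the class of all such distributions.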
Remark 1 $\mathcal{L}(Z)$ is of full rank and has a density in $\mathbb{R}^n$ if and only if $\Sigma$ is of full rank in $\mathbb{S}_n^{+}$; otherwise $\mathcal{L}(Z)$ is concentrated in a subspace of $\mathbb{R}^n$ of dimension equal to the rank of $\Sigma$.
To continue, designate by $D_\alpha$ the domain of attraction of each element $Z_i$ in $\{Z_1,Z_2,Z_3,\dots\}$ in $\mathbb{R}^n$ having $\liminf\mathcal{L}(c\bar{Z}_N)$ in $S\alpha S$. That is, their chfs satisfy $\{\liminf\phi_{c\bar{Z}_N}(t)=\exp[\iota t'\delta-\tfrac{1}{2}(t'\Sigma t)^{\alpha/2}]\}$ when scaled suitably. Specifically, the distributions $D_2$ attracted to Gaussian limits comprise all distributions $\mathcal{L}(Z_i)$ in $\mathbb{R}^n$ having second moments. More generally, domains of attraction to distributions in $S\alpha S$ have been studied in references 17–20, to include Lindeberg conditions in Barbosa & Dorea,21 together with rates of convergence to stable limits in Paulauskas.22
Remark 2 That $\phi_Z(t)=\exp[\iota t'\delta-\tfrac{1}{2}(t'\Sigma t)^{\alpha/2}]$ has elliptical contours derives from the spherical chf $\phi_U(t)=\exp[\iota t'\theta-\tfrac{1}{2}(t't)^{\alpha/2}]$ through the transformation a
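One linear map that carries the spherical chf into the elliptical one can be written out explicitly. The sketch below assumes the map $Z=\Sigma^{1/2}U$ and the identification $\delta=\Sigma^{1/2}\theta$; both are illustrative assumptions, since the transformation intended in Remark 2 is not recoverable from the text.

```latex
\begin{aligned}
\phi_Z(t) &= E\!\left[e^{\iota t'\Sigma^{1/2}U}\right]
           = \phi_U\!\left(\Sigma^{1/2}t\right) \\
          &= \exp\!\left[\iota\, t'\Sigma^{1/2}\theta
             - \tfrac{1}{2}\left(t'\Sigma^{1/2}\Sigma^{1/2}t\right)^{\alpha/2}\right] \\
          &= \exp\!\left[\iota\, t'\delta - \tfrac{1}{2}\left(t'\Sigma t\right)^{\alpha/2}\right],
          \qquad \delta=\Sigma^{1/2}\theta .
\end{aligned}
```

The quadratic form $t'\Sigma t$ is constant on ellipsoids, which is the sense in which the contours of the chf are elliptical.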
Essentials for SαS distributions
As noted, closed expressions for SαS densities are known in selected cases only, to be complemented by results to follow. Here $g_n(u;\delta,\Sigma)$ is the Gaussian density on $\mathbb{R}^n$ having parameters $(\delta,\Sigma)$, and $f_n^{\alpha}(u;\delta,\Sigma)$ is the provisional SαS density corresponding to $\phi_Z(t)=\exp[\iota t'\delta-\tfrac{1}{2}(t'\Sigma t)^{\alpha/2}]$. The following properties are essential.
Theorem 1 Let $\mathcal{L}(Z)\in S_n^{\alpha}(\delta,\Sigma)$ have the chf $\phi_Z(t)=\exp[\iota t'\delta-\tfrac{1}{2}(t'\Sigma t)^{\alpha/2}]$ and density function $f_n^{\alpha}(z;\delta,\Sigma)$ if defined. Then the following properties hold.
Proof: Conclusion (i) is Theorem 6.5.4 of Press.23 Conclusion (ii) invokes a result of Hartman et al.,24 namely, the process $\{Z_t;\ t=1,2,\dots\}$ is spherically invariant if and only if, for each $n$ and $Z=[Z_1,\dots,Z_n]$, the chf $\phi_Z(t)$ is a scale mixture of spherical Gaussian chfs on $\mathbb{R}^n$, to give conclusion (ii) on transforming from spherical to elliptical symmetry. To continue, $f_Z(z)=(2\pi)^{-n}\int_{\mathbb{R}^n}e^{-\iota t'z}\phi_Z(t)\,\Lambda(dt)$ is the standard inversion formula from chfs to densities in $\mathbb{R}^n$, with $\Lambda(\cdot)$ as Lebesgue measure, so that from conclusion (ii) we recover

$$f_n^{\alpha}(z;\delta,I_n)=\frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}e^{-\iota t'z}\int_0^{\infty}e^{\iota t'\delta-t't/(2s)}\,d\Psi(s;\alpha)\,\Lambda(dt).\qquad(1)$$

Reversing the order of integration inverts the Gaussian chf to give conclusion (iii). Conclusion (iv) follows as in Wolfe25 in conjunction with conclusion (iii). Finally observe from conclusion (iii), with $\int_0^{\infty}g_n(z;\delta,\Sigma/s)\,d\Psi(s;\alpha)$, that the change of variables $Z\to U=T(Z)$ behind the integral is independent of $\Psi(s;\alpha)$, since $T(Z)$ is scale-invariant independently of $s$, to give conclusion (v).
It remains to reconsider degrees of association in SαS distributions, as distinct from the classical second-moment correlation parameters $\{\rho_{ij}=\sigma_{ij}/(\sigma_{ii}\sigma_{jj})^{1/2}\}$. For $\mathcal{L}(Z)\in S_k^{\alpha}(\delta,\Sigma)$ with $\alpha<2$, the elements of $\Sigma$ serve instead as scale parameters, since $U=\Sigma^{-1/2}Z$ and $U'U=Z'\Sigma^{-1}Z$ are dimensionless. As to whether $\{\rho_{ij}\}$ again might quantify associations for $\alpha<2$, a definitive answer is supplied in the following.
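Lemma 1 Let $\mathcal{L}(Z)\in S_n^{\alpha}(\delta,\Sigma)$. For $\alpha<2$, the parameters $\{\rho_{ij}=\sigma_{ij}/(\sigma_{ii}\sigma_{jj})^{1/2}\}$ serve to quantify degrees of association between $(Z_i,Z_j)$, the extent of their association increasing with $\rho_{ij}$.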
Proof: It suffices to consider $(Z_1,Z_2)$ centered at $(0,0)$ with $\Sigma=\begin{bmatrix}1&\rho\\ \rho&1\end{bmatrix}$. On taking $U=(Z_1-Z_2)$, $\mathcal{L}(U)$ clearly is symmetric about $0$ with scale parameter $\sigma_U=2(1-\rho)$. A result of Fefferman et al.26 shows for each $c>0$ that $P(U\in(-c,c))$ is decreasing in $\sigma_U$, thus increasing in $\rho$. Equivalently, $P(|Z_1-Z_2|\le c)\uparrow 1$ as $\rho\uparrow 1$, identifying the sense in which $(Z_1,Z_2)$ become increasingly indistinguishable, thus associated, with increasing values of $\rho$.
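The scale parameter $2(1-\rho)$ used in the proof can be read off the chf of Definition 2. The following worked step is a sketch, with the contrast vector $a=[1,-1]'$ introduced here only for illustration:

```latex
\begin{aligned}
U &= Z_1 - Z_2 = a'Z, \qquad a = [1,\,-1]', \\
a'\Sigma a &= 1 - 2\rho + 1 = 2(1-\rho), \\
\phi_U(t) &= \phi_Z(ta)
           = \exp\!\left[-\tfrac{1}{2}\left(t^{2}\,a'\Sigma a\right)^{\alpha/2}\right]
           = \exp\!\left[-\tfrac{1}{2}\,\{2(1-\rho)\}^{\alpha/2}\,|t|^{\alpha}\right].
\end{aligned}
```

As $\rho\uparrow 1$ the coefficient of $|t|^{\alpha}$ tends to zero, so the law of $U$ concentrates at $0$, in keeping with the limit stated in the proof.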
Definition 3 For $\mathcal{L}(Z)\in S_n^{\alpha}(\delta,\Sigma)$ with $\alpha<2$, the entities $\{\rho_{ij}=\sigma_{ij}/(\sigma_{ii}\sigma_{jj})^{1/2}\}$ are called pseudo-correlation, specifically, α-association parameters.
The principal findings
Take $\mathcal{L}(Y)\in S_n^{\alpha}(X\beta,\sigma^2I_n)$ with $(X\beta,\sigma^2I_n)$ as centering and scale parameters, where $\{Y\in\mathbb{R}^n,\ X\in F_{n\times k},\ \beta\in\mathbb{R}^k\}$. OLS solutions $\hat{\beta}=(X'X)^{-1}X'Y$, as minimally dispersed unbiased linear estimates, are available here only for $\alpha=2$, whereas alternative moment criteria necessarily are subject to moment constraints. Specifically, for scalars $(\hat{\theta},\theta)\in\mathbb{R}^1$ under loss $L(\hat{\theta},\theta)=|\hat{\theta}-\theta|$, the risk $R(\hat{\theta})=E[L(\hat{\theta},\theta)]$ is undefined for $\alpha<1$, as for Cauchy errors at $\alpha=1$. Moreover, risk functions $\{R(\hat{\theta})=E(|\hat{\theta}-\theta|^{\kappa})\}$ are defined but concave for $\{\kappa<\alpha<1\}$, and for $\{1<\kappa<\alpha\le 2\}$ are convex, at issue in attaining global optima. Versions of these apply also for vector parameters; however, minimal risk estimation would require not only knowledge regarding $\alpha$, but also optimizing algorithms. Instead we seek what might be salvaged from classical linear models under the constraints of SαS errors. In addition, portions of our findings extend beyond Gauss–Markov theory and OLS to include the much larger class of equivariant estimators.
Definition 4 An estimator $\delta(Y)$ for $\beta\in\mathbb{R}^k$ is translation-equivariant if for $\{Y\to Y+Xb\}$, then $\{\delta(Y+Xb)=\delta(Y)+b\}$ for every $b\in\mathbb{R}^k$.
On taking $P=[I_n-X(X'X)^{-1}X']$, the elements of $e=PY$ comprise the observed residuals and $S^2=e'e/(n-k)$ the residual mean square. Normal-theory tests for $H_0:\beta=\beta_0$ against $H_1:\beta\ne\beta_0$ utilize $F=(\hat{\beta}-\beta_0)'X'X(\hat{\beta}-\beta_0)/S^2$ having the distribution $F(u;k,n-k,\lambda)$ with $\lambda=(\beta-\beta_0)'X'X(\beta-\beta_0)/\sigma^2$. We proceed to examine essential properties of $S_n^{\alpha}(X\beta,\sigma^2I_n)$ as $\alpha$ ranges over $(0,2)$, where some expressions simplify on taking $\sigma^2=1$, then reinstating $\sigma^2$ as needed. The following properties are fundamental.
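Theorem 2 Given $\mathcal{L}(Y)=S_n^{\alpha}(X\beta,\sigma^2I_n)$, consider $[\hat{\beta},e]$ with $e=PY$ as the residual vector, and $U=(n-k)S^2/\sigma^2$. Then
(i) $\mathcal{L}(\hat{\beta},e)=S_{n+k}^{\alpha}([\beta,0],\Sigma)$, with $\Sigma=\sigma^2\,\mathrm{Diag}((X'X)^{-1},P)$, a distribution on $\mathbb{R}^{n+k}$ of rank $n$;
(ii) The marginals are $\mathcal{L}(\hat{\beta})=S_k^{\alpha}(\beta,\sigma^2(X'X)^{-1})$ centered at $\beta$ with scale parameters $\sigma^2(X'X)^{-1}$, and
(iii) $\mathcal{L}(e)=S_n^{\alpha}(0,\sigma^2P)$ on $\mathbb{R}^n$ of rank $n-k$ centered at $0$ with scale parameters $\sigma^2P$;
(iv) $U=(n-k)S^2/\sigma^2$ has density $f(u;\nu,\alpha)=\int_0^{\infty}h(u;\nu,s)\,d\Psi(s;\alpha)$, with $h(u;\nu,s)$ as the central chi-squared density on $\nu=(n-k)$ degrees of freedom, scaled by $s$, and with $\Psi(s;\alpha)$ as a mixing distribution.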
Proof. Let $L'=(X'X)^{-1}X'$ and $P=[I_n-X(X'X)^{-1}X']$ to project onto the error space, so that $G=[L,P]$ operates on $Y$ to give

$$Z=G'Y=\begin{bmatrix}\hat{\beta}\\ e\end{bmatrix}=\begin{bmatrix}L'\\ P'\end{bmatrix}Y\in\mathbb{R}^{n+k}\quad\text{and}\quad G'G=\begin{bmatrix}(X'X)^{-1}&0\\ 0&P\end{bmatrix},\qquad(2)$$

the latter of order $[(n+k)\times(n+k)]$ and rank $n$. The chf with argument $s'=[s_1,\dots,s_{n+k}]$ is $E[\exp(\iota s'Z)]=E[\exp(\iota s'G'Y)]=E[\exp(\iota v'Y)]=\phi_Y(v)$ with argument $v=Gs$ replacing $t$, to give conclusion (i). Next partition $s'=[s_1',s_2']$, where now $s_1\in\mathbb{R}^k$ and $s_2\in\mathbb{R}^n$ denote subvectors, to obtain $\phi_Z(s)=\exp[\iota s'G'X\beta-\tfrac{1}{2}(s'G'Gs)^{\alpha/2}]=\exp[\iota s_1'\beta-\tfrac{1}{2}(s_1'(X'X)^{-1}s_1+s_2'Ps_2)^{\alpha/2}]$. The marginals of $\hat{\beta}$ and $e$ follow on setting $s_2=0$, then $s_1=0$ in succession, to give conclusions (ii) and (iii). Conclusion (iv) is attributed to Hartman et al.24 through Theorem 1. Specifically, a change of variables $u\to e\to e'e=(n-k)S^2$ behind the integral on the right of Theorem 1(iii) gives the conditional density for $\mathcal{L}((n-k)S^2\mid s)$, namely the scaled chi-squared density $h(u;\nu,s)$ depending on $s$, so that integrating with respect to $d\Psi(s;\alpha)$ gives conclusion (iv).
Remark 3 That $\Sigma=\sigma^2\,\mathrm{Diag}((X'X)^{-1},P)$ is block-diagonal in conclusion (i) assures under SαS errors that $(\hat{\beta},e)$ are α-unassociated as in Definition 3, and are well known to be mutually uncorrelated under second moments.
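The residual mean square and the $F$ ratio defined above lend themselves to a small numerical check of why the scale mixture representation leaves $F$ untouched under $H_0$. The Python sketch below is illustrative only: it assumes a simple two-parameter regression (intercept and slope), generates a spherical Cauchy error vector as a Gaussian vector divided by one common random scale, and uses hypothetical helper names (`ols_simple`, `f_ratio`) that do not come from the article.

```python
import math
import random


def ols_simple(x, y):
    """Closed-form OLS intercept and slope for y = b0 + b1*x (k = 2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def f_ratio(x, y, beta0):
    """F = (b - beta0)' X'X (b - beta0) / S^2 with S^2 = e'e / (n - k)."""
    n, k = len(x), 2
    b0, b1 = ols_simple(x, y)
    d0, d1 = b0 - beta0[0], b1 - beta0[1]
    # (b - beta0)' X'X (b - beta0) is the squared length of X(b - beta0).
    quad = sum((d0 + d1 * xi) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    return quad / s2


rng = random.Random(0)                      # fixed seed, illustration only
x = [float(i) for i in range(1, 11)]        # hypothetical design points
beta = (2.0, 0.5)                           # hypothetical true coefficients

g = [rng.gauss(0.0, 1.0) for _ in x]        # spherical Gaussian errors
w = abs(rng.gauss(0.0, 1.0))                # one common random scale
eps_cauchy = [gi / w for gi in g]           # spherical Cauchy errors (t, 1 df)

y_gauss = [beta[0] + beta[1] * xi + gi for xi, gi in zip(x, g)]
y_cauchy = [beta[0] + beta[1] * xi + ei for xi, ei in zip(x, eps_cauchy)]

f_gauss = f_ratio(x, y_gauss, beta)         # F under H0 with Gaussian errors
f_cauchy = f_ratio(x, y_cauchy, beta)       # F under H0 with Cauchy errors
print(f_gauss, f_cauchy)
```

Because the common scale multiplies both $\hat{\beta}-\beta_0$ and $e$ under $H_0$, it cancels from the ratio, and the two printed values agree up to rounding. The sketch covers only the null case and only the Cauchy member of the class.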
It remains to reexamine topics in inference under SαS errors. The following are germane.
Definition 5 An estimator $\hat{\theta}$ for $\theta\in\mathbb{R}^k$ is said to be linearly median unbiased if and only if the median $\mathrm{med}(a'\hat{\theta})=a'\theta$ for each $a\in\mathbb{R}^k$; and to be modal unbiased provided that the mode
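Definition 6 An estimator $\hat{\theta}$ for $\theta$ is said to be more concentrated about $\theta$ than $\tilde{\theta}$ provided that $P((\hat{\theta}-\theta)\in C_0)\ge P((\tilde{\theta}-\theta)\in C_0)$ for every convex set $C_0$ in $\mathbb{R}^k$ symmetric under reflection about $0\in\mathbb{R}^k$.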
Essential properties under SαS errors include the following.
Theorem 3 For $\mathcal{L}(Y)=S_n^{\alpha}(X\beta,\sigma^2I_n)$, consider properties of the OLS solutions $\hat{\beta}=(X'X)^{-1}X'Y$, and of the equivariant estimators $\tilde{\beta}=\delta(Y)$ of Definition 4.
$\hat{\beta}$ is most concentrated about $\beta$ among all equivariant estimators $\tilde{\beta}=\delta(Y)$.
Proof. Conclusions (i)–(vi) carry over from Jensen27 without benefit of moments, regardless of membership in the SαS class. To consider concentration properties of modal-unbiased estimators, begin with $\phi_Y(t)=\exp[\iota t'X\beta-\tfrac{1}{2}(t't)^{\alpha/2}]$, and consider $\tilde{\beta}=L'Y$ with $L'=[(X'X)^{-1}X'+G']$, so that

$$\phi_{\tilde{\beta}}(s)=\exp[\iota s'L'X\beta-\tfrac{1}{2}(s'L'Ls)^{\alpha/2}];$$

$$s'L'X\beta=s'[(X'X)^{-1}X'+G']X\beta.$$

That $\tilde{\beta}$ should have mode at $\beta$, it is necessary that $s'L'X\beta=s'\beta$, i.e. $G'X=0$. Accordingly, $\phi_{\tilde{\beta}}(s)=\exp[\iota s'\beta-\tfrac{1}{2}(s'\Omega s)^{\alpha/2}]$, with $\Omega=L'L=[(X'X)^{-1}+G'G]$. Clearly the matrix $[L'L-(X'X)^{-1}]=G'G$ is positive semidefinite, giving conclusion (vii) from Jensen.28 Conclusion (viii) follows from Theorem 2.7 of Burk et al.,29 since the distributions are unimodal from Theorem 1(iv).
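The identity $\Omega=L'L=(X'X)^{-1}+G'G$ can be checked by direct expansion. The step below is a sketch that reads $L'$ as the sum $(X'X)^{-1}X'+G'$ and uses the condition $G'X=0$ from the proof; the reading of $L'$ as a sum is an assumption about the intended notation.

```latex
\begin{aligned}
L'L &= \left[(X'X)^{-1}X' + G'\right]\left[X(X'X)^{-1} + G\right] \\
    &= (X'X)^{-1}X'X(X'X)^{-1} + (X'X)^{-1}X'G + G'X(X'X)^{-1} + G'G \\
    &= (X'X)^{-1} + G'G,
    \qquad \text{since } G'X = 0 \text{ and } X'G = (G'X)' = 0 .
\end{aligned}
```

The excess $G'G$ over $(X'X)^{-1}$ is a Gram matrix, hence positive semidefinite, which is the comparison the concentration argument relies on.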
Spherical multivariate $t$ errors on $\nu$ degrees of freedom trace to Zellner,30 to include Cauchy errors at $\nu=1$, equivalently, at $\alpha=1$ in the class SαS. Specializing from Theorem 1(ii), the spherical Cauchy chf is $\phi_Z(t)=\exp[\iota t'\delta-\tfrac{1}{2}(t't)^{1/2}]$. Recast in terms of linear inference, we have the following specialization of Theorems 1 and 2.
Corollary 1 Under the conditions of Theorems 1 and 2, the following properties hold under spherical Cauchy errors.
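(i) $$f_n^{1}(z;\delta,I_n)=\int_0^{\infty}g_n(z;\delta,s^{-2}I_n)\,d\Psi(s;1)=c(n)\,[1+(z-\delta)'(z-\delta)]^{-(n+1)/2},\qquad c(n)=\Gamma\!\left(\tfrac{n+1}{2}\right)\Big/\pi^{(n+1)/2},$$
where $d\Psi(s;1)=(2/\pi)^{1/2}e^{-s^2/2}\,ds$, the mixing $\chi(s;1)$ density.
(ii) $$f_k^{1}(\hat{\beta};\beta,X'X)=c(k)\,[1+(\hat{\beta}-\beta)'X'X(\hat{\beta}-\beta)]^{-(k+1)/2}\qquad(3)$$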
Proof. The multivariate $t$ distribution on $\mathbb{R}^n$ is that of $\{T_i=Y_i/S;\ 1\le i\le n\}$ from $\mathcal{L}(Y)=N_n(\delta,\sigma^2I_n)$, with $S$ as a sample standard deviation on $\nu$ degrees of freedom, known to be spherical Cauchy at $\nu=1$. This gives conclusion (i) on specializing the conventional multivariate density. Conclusion (ii) follows directly on specializing Theorem 2(ii) at $\alpha=1$.
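The mixture representation of the spherical Cauchy density can be checked numerically. The Python sketch below is a minimal illustration, assuming the mixing density is the chi density on one degree of freedom, taken here as $(2/\pi)^{1/2}e^{-s^2/2}$ on $(0,\infty)$, and using a composite Simpson rule on a truncated range; the function names and the truncation point are choices made for this sketch, not part of the article.

```python
import math


def gaussian_density(r2, n, s):
    """N_n(delta, s^-2 I_n) density at squared distance r2 = (z-delta)'(z-delta)."""
    return (s * s / (2.0 * math.pi)) ** (n / 2.0) * math.exp(-0.5 * s * s * r2)


def chi1_density(s):
    """Chi density on one degree of freedom (half-normal) for s >= 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * s * s)


def mixture_density(r2, n, upper=12.0, steps=20000):
    """Integrate the Gaussian scale mixture over s in [0, upper] by Simpson's rule."""
    h = upper / steps                       # steps must be even
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        if i == 0 or i == steps:
            weight = 1.0
        elif i % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * gaussian_density(r2, n, s) * chi1_density(s)
    return total * h / 3.0


def cauchy_density(r2, n):
    """Closed form c(n) * (1 + r2)^(-(n+1)/2) with c(n) = Gamma((n+1)/2) / pi^((n+1)/2)."""
    c = math.gamma((n + 1) / 2.0) / math.pi ** ((n + 1) / 2.0)
    return c * (1.0 + r2) ** (-(n + 1) / 2.0)


for n in (1, 2, 3):
    for r2 in (0.0, 0.5, 4.0):
        print(n, r2, mixture_density(r2, n), cauchy_density(r2, n))
```

The two columns printed for each pair $(n, r^2)$ should agree to several decimal places, which is the content of the first display of Corollary 1 under the stated assumption on the mixing density.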
The viability ($Y_i$) for each of $n=13$ biological specimens was recorded after storage under additives $X_{i1}$ and $X_{i2}$, as listed in Table 1, from Walpole & Myers.31 The model is $\{Y_i=\beta_0+\beta_1X_{i1}+\beta_2X_{i2}+\varepsilon_i\}$, where the errors are taken to be spherical Cauchy. The conventional OLS solutions are $\hat{\beta}_0=36.094$, $\hat{\beta}_1=1.031$, $\hat{\beta}_2=-1.870$, as elements of $\hat{\beta}=[\hat{\beta}_0,\hat{\beta}_1,\hat{\beta}_2]'$. The matrix $X'X$, its inverse $(X'X)^{-1}$, and the transition of the latter into its α-association form of Definition 3 are given respectively by
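$$\begin{bmatrix}13&59.43&81.82\\ 59.43&394.7255&360.6621\\ 81.82&360.6621&576.7264\end{bmatrix}^{-1}=\begin{bmatrix}1.0114&-0.0494&-0.1126\\ -0.0494&0.0083&0.0018\\ -0.1126&0.0018&0.0166\end{bmatrix}\to\begin{bmatrix}1&-0.5392&-0.8690\\ -0.5392&1&0.1533\\ -0.8690&0.1533&1\end{bmatrix}.$$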
The following properties are evident.
| $Y_i$ | $X_{i1}$ | $X_{i2}$ | $Y_i$ | $X_{i1}$ | $X_{i2}$ |
|---|---|---|---|---|---|
| 0.5 | 1.74 | 3.3 | 31.2 | 6.32 | 5.42 |
| 0.9 | 6.22 | 8.41 | 38.4 | 10.52 | 4.63 |
| 0.4 | 1.19 | 11.6 | 26.7 | 1.22 | 5.85 |
| 0.4 | 4.1 | 6.62 | 25.9 | 6.32 | 8.72 |
| 0 | 4.08 | 4.42 | 25.2 | 4.15 | 7.6 |
| 0.7 | 10.15 | 4.83 | 35.7 | 1.72 | 3.12 |
| 0.5 | 1.7 | 5.3 | | | |
Table 1 The viability (Yi) of n=13 biological specimens after storage under additives Xi1 and Xi2
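The transition from $(X'X)^{-1}$ to its α-association form in the example can be reproduced in a few lines. The Python sketch below applies the ratio $\rho_{ij}=\sigma_{ij}/(\sigma_{ii}\sigma_{jj})^{1/2}$ of Definition 3 to the inverse matrix as printed (rounded to four decimals); the function name `association_form` is a hypothetical label used only here.

```python
import math

# (X'X)^{-1} as printed in the example, rounded to four decimals.
xtx_inv = [
    [1.0114, -0.0494, -0.1126],
    [-0.0494, 0.0083, 0.0018],
    [-0.1126, 0.0018, 0.0166],
]


def association_form(sigma):
    """Rescale a scale matrix to rho_ij = sigma_ij / sqrt(sigma_ii * sigma_jj)."""
    k = len(sigma)
    return [
        [sigma[i][j] / math.sqrt(sigma[i][i] * sigma[j][j]) for j in range(k)]
        for i in range(k)
    ]


rho = association_form(xtx_inv)
for row in rho:
    print("  ".join(f"{value: .4f}" for value in row))
```

Since the input is rounded, the off-diagonal entries can be expected to match the printed α-association matrix only to about three decimals.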
This study offers further insight into the class SαS comprising the spherical α-stable laws as limit distributions under conditions for central limit theory. In addition to their essential properties, expanded here to include representations for density functions, this study focuses on models of type $\{Y=X\beta+\varepsilon\}$ when devoid of moments undergirding the classical theory. Recall that normal-theory procedures routinely are applied in practice as large-sample approximations in distributions attracted to Gaussian laws. Specifically, Berry–Esséen bounds on rates of convergence to Gaussian limits are given in Jensen,32,33 with special reference to linear models in Jensen.34,35 Results here validate corresponding large-sample approximations for distributions attracted to SαS laws as cited in references 17–21. Of similar importance are rates of convergence to stable limits as in Paulauskas.22 By showing that many standard properties carry over in essence under significantly weakened assumptions, this study gives further credence to the widely and correctly held view that Gauss–Markov estimation and normal-theory inferences extend considerably beyond the confines of the classical theory.
The preceding study has developed exclusively around spherically dependent SαS errors, as an alternative to iid stable errors. This choice is prompted by discrepancies encountered in the simplest case $\{Z_i\to Z_i+\delta;\ i=1,2,\dots,N\}$ with common location parameter. Essential details from Jensen14 may be summarized as follows. To distinguish the disparate properties of iid vs spherical SαS models, sequences $\mathbb{Z}=\{Z_1,Z_2,Z_3,\dots\}$ are fundamental in order to take limits. Of significance is that averages of SαS sequences with $\alpha<2$ may be inconsistent for iid sequences but consistent under SαS symmetry. Accordingly, let $\mathcal{L}_\infty(Z_N)=\liminf\mathcal{L}(Z_N)$. Essentials follow.
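Lemma 2 Given $\mathbb{Z}=\{Z_1,Z_2,Z_3,\dots\}$, consider the case that $Z'=[Z_1,\dots,Z_N]$ either are iid $S_1^{\alpha}(\delta,1)$, with chf $\phi_{Z_i}(t)=\exp\{\iota t\delta-|t|^{\alpha}\}$, or are SαS on $\mathbb{R}^N$ with chf $\phi_Z(t)=\exp\{\iota\delta t'1_N-(t't)^{\alpha/2}\}$. Let $S_N=(Z_1+\dots+Z_N)$ and $\bar{Z}_N=N^{-1}S_N$, and consider the standardized variables $U_N=N^{1/2}(\bar{Z}_N-\delta)$.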
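For $0<\alpha<1$: $\phi_{\bar{Z}_N}(t)=e^{\iota t\delta-N^{\varepsilon}|t|^{\alpha}}$ for $\varepsilon>0$, so that $\bar{Z}_N$ is inconsistent for $\delta$.
For $\alpha=1$: $\phi_{\bar{Z}_N}(t)=e^{\iota t\delta-|t|^{\alpha}}\equiv\phi_{Z_i}(t)$, so that $\bar{Z}_N$ is inconsistent for $\delta$.
For $1<\alpha\le 2$: $\phi_{\bar{Z}_N}(t)=e^{\iota t\delta-N^{-\varepsilon}|t|^{\alpha}}$ for $\varepsilon>0$, so that $\bar{Z}_N$ is consistent for $\delta$.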
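For the iid case, the three displays above follow from a single computation. The worked step below is a sketch that uses only independence and the marginal chf stated in Lemma 2; the explicit identification of $\varepsilon$ with $|1-\alpha|$ is an inference made here, not a statement taken from the article.

```latex
\begin{aligned}
\phi_{\bar{Z}_N}(t)
  &= \prod_{i=1}^{N}\phi_{Z_i}\!\left(\tfrac{t}{N}\right)
   = \exp\!\left\{\iota t\delta - N\left|\tfrac{t}{N}\right|^{\alpha}\right\}
   = \exp\!\left\{\iota t\delta - N^{\,1-\alpha}\,|t|^{\alpha}\right\}, \\[4pt]
N^{\,1-\alpha} &=
  \begin{cases}
    N^{\varepsilon},  & \varepsilon = 1-\alpha > 0, \quad 0<\alpha<1,\\
    1,                & \alpha = 1,\\
    N^{-\varepsilon}, & \varepsilon = \alpha-1 > 0, \quad 1<\alpha\le 2 .
  \end{cases}
\end{aligned}
```

The scale of $\bar{Z}_N$ thus grows with $N$ for $\alpha<1$, is unchanged at $\alpha=1$, and shrinks to zero for $\alpha>1$, matching the three cases listed.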
Authors declare that there is no conflict of interest.
©2018 Jensen. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.