Research Article Volume 6 Issue 3
Department of Statistics, Virginia Tech, USA
Correspondence: Donald Jensen, Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA
Received: October 25, 2016 | Published: October 5, 2017
Citation: Jensen D. On directed alternatives in linear inference. Biom Biostat Int J. 2017;6(3):364-371. DOI: 10.15406/bbij.2017.06.00171
Tests for vector hypotheses H0: θ=θ0 against H1: θ≠θ0 in ℝk typically have powers depending on quadratic forms of type λ=(θ−θ0)'Ξ−1(θ−θ0). This study examines the case that (θ−θ0) is restricted to subspaces, for example, (θ−θ0)'=c(1,1,0,…,0), where θ and θ0 differ only in their first two coordinates. These are called directed alternatives. The spectral decomposition of Ξ supports the identification of the one–dimensional alternatives least likely and most likely to be discerned, to complement conventional data analysis. Applications are drawn in the use of Hotelling’s T2 and of F–tests in linear inference. Moreover, it is seen that a given design may be recast so as to reverse the least likely and most likely alternatives. Numerical examples serve to illustrate the findings.
62J05, 62H10 and 62P30
Keywords: Linear models; F tests; Hotelling's T2 tests; Directed alternatives; Reversal designs
Power in statistical inference is driven by non–null distributions. For observations in ℝk having dispersion matrix Ξ, noncentrality parameters often emerge as the Mahalanobis [1] distance between points (u,v) in ℝk, namely,
$$D^2_{\Xi}(u,v) = (u-v)'\,\Xi^{-1}(u-v). \tag{1}$$
This specializes to the Euclidean metric for the case that Ξ=c Ιk , in which case the model is called isotropic. In particular, nonparametric and other statistics often have noncentral chi–squared distributions, either in small samples or asymptotically. In addition, pervasive venues in parametric inference, to be reexamined in some detail, include the following.
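As a minimal sketch of expression (1), the following computes the squared Mahalanobis distance and checks the isotropic special case. The matrix `xi`, the points `u` and `v`, and the helper name `mahalanobis_sq` are illustrative assumptions, not quantities from this study.

```python
import numpy as np

def mahalanobis_sq(u, v, xi):
    """Squared Mahalanobis distance (u - v)' Xi^{-1} (u - v)."""
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Solve Xi z = d rather than forming the inverse explicitly.
    return float(d @ np.linalg.solve(xi, d))

# Hypothetical anisotropic dispersion matrix and points (illustration only).
xi = np.array([[2.0, 0.5],
               [0.5, 1.0]])
u = np.array([1.0, 2.0])
v = np.array([0.0, 0.0])
d2_aniso = mahalanobis_sq(u, v, xi)

# Isotropic case Xi = c * I: the distance is the squared Euclidean
# distance divided by c, so it no longer depends on direction.
c = 3.0
d2_iso = mahalanobis_sq(u, v, c * np.eye(2))
d2_euclid = float((u - v) @ (u - v))
```

Under isotropy the value depends on (u, v) only through the Euclidean length of u − v, which is the directional invariance taken up later in Definition 1.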
Case (i). Hotelling [2] Test: T2=n(Ȳ−μ0)'S−1(Ȳ−μ0), where (Ȳ,S) are the sample mean and dispersion matrix of n Gaussian vectors in ℝk having the location–scale parameters (μ,Σ). Then in testing H0: μ=μ0 against H1: μ≠μ0, the power function is Ψ(λ) with noncentrality λ=n(μ−μ0)'Σ−1(μ−μ0).
Case (ii). The General Linear Model: {Y=Xβ+ε} with Gaussian errors having zero means and dispersion matrix σ2In . Then in testing H0:β=βo against H1:β≠βo in ℝk , the power function is Ψ(λ) with noncentrality λ=(β−βo)'X'X(β−βo)/σ2 .
Classical theory allows for any (μ−μ0)∈ℝk for Case (i), and (β−βo)∈ℝk for Case (ii). On the other hand, alternatives lying in designated subspaces may hold substantive interest per se. For example, taking (μ−μ0)'=c (1,1,0,…,0) allows for discrepancies between μ and μ0 in their first two coordinates only, whereas (μ−μ0)'=c (1,1,…,1) allows for deviations along the equiangular line in ℝk . Both are one–dimensional; subspaces of dimension greater than one are considered subsequently. Alternatives lying in designated subspaces of ℝk are called directed alternatives, and the goal here is to study powers of tests against alternatives of these types.
The present study expands on this as follows. Not only do distinct alternatives differ in importance to users, but so too their probabilities of detection. Here the spectral decomposition of Ξ , if anisotropic, supports the identification of alternatives least likely and most likely to be discovered, as well as intermediate cases. These serve to bracket the effective range of inferences intrinsic to a given study, and thereby complement conventional options in data analysis. Applications are drawn in the use of Hotelling’s T2 in multivariate samples, and of F–tests in the analysis of linear models. Moreover, it is shown that a given design may be modified so as to reverse the least likely and most likely alternatives, in the event that this would better serve the objectives of an experiment.
This study is organized as follows. Supporting developments are given next in Section 2, followed by the principal findings of Section 3. Several examples in Section 4 illustrate the essential results. Collateral materials are deferred for completeness to an Appendix.
Notation
Spaces include ℝn as Euclidean n-space; ℝn+ as its positive orthant; Sn as the real symmetric (n×n) matrices; S+n as their positive definite varieties; Fn×k as the real (n×k) matrices of rank k≤n; and Ok as the (k×k) orthogonal group. Vectors and matrices are set in bold type; the transpose, inverse, trace, and determinant of A are A', A−1, tr(A), and |A|; the unit vector in ℝn is 1n=[1,…,1]'; In is the (n×n) identity; and Diag(A1,…,Ak) is a block-diagonal array. If B=[b1,…,bk] is of order (n×k) and rank k<n, then Sp(B) designates the column span of B, i.e., the k–dimensional subspace of ℝn spanned by [b1,…,bk]. The ordered eigenvalues of A∈Sn are {λi(A)=αi; 1≤i≤n} with {α1≥α2≥…≥αn}, and its spectral decomposition is $A = PD_\alpha P' = \sum_{i=1}^{n}\alpha_i p_i p_i'$, where P=[p1,…,pn]∈On and Dα=Diag(α1,...,αn). By convention its condition number is C1(A)=α1/αn. The singular decomposition of B∈Fn×k is $B = PD_\delta Q' = \sum_{i=1}^{k}\delta_i p_i q_i'$, where the mutually orthogonal columns of P=[p1,…,pk] comprise the left–singular vectors; the diagonal elements of Dδ=Diag(δ1,…,δk) are its singular values; and columns of Q∈Ok are the right–singular vectors.
Special Distributions
For Y∈ℝn , its distribution, mean, and dispersion matrix are L(Y), E(Y)=μ and V(Y)=Σ , say, with variance Var(Y)=σ2 on ℝ1 . Specifically, L(Y) =Nn(μ,Σ) is Gaussian on ℝn with parameters (μ,Σ) . Distributions on ℝ1+ include the χ2(⋅;ν,λ) with ν degrees of freedom and noncentrality parameter λ ; the Snedecor–Fisher F(⋅;ν1,ν2,λ) with degrees of freedom (ν1,ν2) and noncentrality λ ; and Hotelling [2] T2k(⋅,ν,λ) of order k having ν degrees of freedom and noncentrality λ . Recall that F(⋅;ν1,ν2,λ) increases stochastically with increasing λ with other parameters held fixed. Identify cα in context as the upper α –level rejection rule. The power of a test, to be considered as a function of λ , is designated by ψ(λ) .
Directed alternatives
Our notation encompasses both (i) Hotelling [2] T2 and (ii) General Linear Models, having location–scale parameters (δ,Ξ). What distinguishes this study are directed alternatives with examples as noted, but expanded to include alternatives {θj; 1≤j≤k} aligned with the orthonormal eigenvectors Q=[q1,…,qk] of Ξ, thus standardized to unit lengths. To continue, as Ω=Ξ−1 assumes a central role, take $\Omega = \sum_{i=1}^{k}\kappa_i q_i q_i' = QD_\kappa Q'$ ∈ S+k as its spectral decomposition, with {κ1≥…≥κk>0}. As in Appendix A.1, undertake the expansions
$$\Omega = Q_1 D_{\kappa 1} Q_1' + Q_2 D_{\kappa 2} Q_2' + Q_3 D_{\kappa 3} Q_3' \tag{2}$$
$$\Omega = Q_1 D_{\kappa 1} Q_1' + \kappa_r Q_2 Q_2' + Q_3 D_{\kappa 3} Q_3', \quad \kappa_r \text{ repeated } s \text{ times}, \tag{3}$$
where elements of Q=[Q1,Q2,Q3] are of orders {(k×(r−1)),(k×s),(k×d)} with d=k−r−s+1, and where Dκ=Diag(Dκ1,Dκ2,Dκ3) is partitioned conformably. In regard to quadratic forms of type Q(u)=u'Ωu serving as noncentrality parameters, a principal result is the following.
Theorem 1. Given is a location–scale model with parameters (δ,Ξ), together with a test for H0: δ=δ0 against H1: δ≠δ0 having power Ψ(λ) increasing monotonically with λ=D2Ξ(δ,δ0)=(δ−δ0)'Ξ−1(δ−δ0). Take {(δ−δ0)=θj; 1≤j≤k} in succession as the eigenvectors {θj=qj; 1≤j≤k} of Ξ−1 with eigenvalues {κ1≥…≥κk>0}.
Proof: Conclusion (i) follows directly from $\lambda(\theta_j) = \theta_j'\Xi^{-1}\theta_j = \theta_j'\left(\sum_{i=1}^{k}\kappa_i q_i q_i'\right)\theta_j = \kappa_j$, since θj=qj, whereas {qj'qi=0; i≠j} and qj'qj=1 by orthonormality. Conclusion (ii) follows directly from variational properties of Rayleigh quotients as in Lemma A.1(i) of the Appendix. In like manner conclusion (iii) follows from Lemma A.1(ii) as variational properties over subspaces. Conclusion (iv) follows from (iii) since {qj'qi=0; i≠j}.
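The proof cites conclusions (i) to (iv), which are not itemized in the statement of Theorem 1 as it appears here. The display below is a hedged reading of those conclusions, inferred only from the proof, from Lemma A.1, and from the later appeals to Theorem 1(ii) and Theorem 1(iv); it is a sketch and not the author's wording. It assumes that Q2 has s columns, written q_r, …, q_{r+s−1}, as implied by the stated orders of [Q1, Q2, Q3].

```latex
\begin{itemize}
  \item[(i)]   $\lambda(\theta_j) = \theta_j'\,\Xi^{-1}\theta_j = \kappa_j$,
               so the power at $\theta_j = q_j$ is $\Psi(\kappa_j)$, $1 \le j \le k$.
  \item[(ii)]  For every unit vector $\theta \in \mathbb{R}^k$,
               $\kappa_k \le \theta'\,\Xi^{-1}\theta \le \kappa_1$; hence $q_1$ is the
               most likely and $q_k$ the least likely alternative to be discerned.
  \item[(iii)] For every unit vector $\theta \in \mathrm{Sp}(Q_2)$,
               $\kappa_{r+s-1} \le \theta'\,\Xi^{-1}\theta \le \kappa_r$.
  \item[(iv)]  If $\kappa_r$ is repeated $s$ times as in expression (3), then
               $\theta'\,\Xi^{-1}\theta = \kappa_r$ for every unit vector
               $\theta \in \mathrm{Sp}(Q_2)$, so all such alternatives have
               power $\Psi(\kappa_r)$.
\end{itemize}
```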
Remark 1. The directed alternatives δ1'=[1,1,0,...,0] and δ2'=[1,1,...,1] were featured earlier as discrepancies in the first two coordinates of (δ−δ0), and as deviations about the equiangular line in ℝk. Let Ξ−1=[ξij]. Then powers Ψ(λ) at these alternatives will depend on $\lambda_1 = \sum_{i=1}^{2}\sum_{j=1}^{2}\xi_{ij}$ at δ1'=[1,1,0,...,0], and on $\lambda_2 = \sum_{i=1}^{k}\sum_{j=1}^{k}\xi_{ij}$ at δ2'=[1,1,...,1].
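The two sums in Remark 1 can be checked numerically. The sketch below assumes a hypothetical dispersion matrix `xi` with entries 0.5^|i−j| (chosen only for illustration) and confirms that each quadratic form equals the corresponding sum of elements of Ξ−1.

```python
import numpy as np

# Hypothetical positive definite dispersion matrix (illustration only).
k = 4
idx = np.arange(k)
xi = 0.5 ** np.abs(idx[:, None] - idx[None, :])
omega = np.linalg.inv(xi)            # Omega = Xi^{-1} = [xi_ij]

delta1 = np.zeros(k)
delta1[:2] = 1.0                     # discrepancy in the first two coordinates
delta2 = np.ones(k)                  # deviation along the equiangular line

# Noncentrality parameters as quadratic forms ...
lam1 = float(delta1 @ omega @ delta1)
lam2 = float(delta2 @ omega @ delta2)

# ... and as sums of elements of Xi^{-1}, as stated in Remark 1.
lam1_from_sum = float(omega[:2, :2].sum())
lam2_from_sum = float(omega.sum())
```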
Corollary 1. On specializing the location–scale parameters (δ,Ξ) , Theorem 1 applies verbatim as follows.
(i) Hotelling [2] T2: (δ,Ξ)=(μ,Σ); L(T2)=T2k(⋅;n−1,λ) , the power Ψ(λ) depending on λ=n(μ−μ0)'Σ−1(μ−μ0) .
(ii) General Linear Models: (δ,Ξ)=(β,(X'X)−1); L(F)=F(⋅;k,n−k,λ), the power Ψ(λ) depending on λ=(β−βo)'X'X(β−βo)/σ2.
Proof: The noncentral distribution F(⋅;k,n−k,λ) clearly satisfies the assumptions of Theorem 1 on identifying (δ,Ξ)=(β,(X'X)−1) as claimed. Similarly in testing H0: μ=μ0 against H1: μ≠μ0, Hotelling’s T2 inherits these properties through the conversion $\mathcal{L}\left((\nu-k+1)T^2/(k\nu)\right) = F(\cdot;k,\nu-k+1,\lambda)$, with λ=n(μ−μ0)'Σ−1(μ−μ0).
Remark 2. Note that alternatives {θj=qj; 1≤j≤k} of unit lengths give noncentrality parameters {κj; 1≤j≤k}. If instead the directed alternatives are {θj=cjqj; 1≤j≤k}, then the noncentrality parameters will be $\{c_j^2\kappa_j;\ 1\le j\le k\}$.
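A short numerical sketch may help fix ideas for the eigenvector alternatives of Theorem 1 and the scaling in Remark 2. The matrix `xi` below is a hypothetical example, and `noncentrality` is an illustrative helper; neither is taken from the study.

```python
import numpy as np

# Hypothetical dispersion matrix Xi (illustration only).
xi = np.array([[4.0, 1.0, 0.5],
               [1.0, 3.0, 0.2],
               [0.5, 0.2, 2.0]])
omega = np.linalg.inv(xi)

# eigh returns ascending eigenvalues; reverse so kappa_1 >= ... >= kappa_k.
kappa, q = np.linalg.eigh(omega)
kappa, q = kappa[::-1], q[:, ::-1]

def noncentrality(theta):
    """lambda(theta) = theta' Xi^{-1} theta."""
    return float(theta @ omega @ theta)

# Unit-length eigenvector alternatives give lambda = kappa_j.
lam_unit = np.array([noncentrality(q[:, j]) for j in range(3)])

# Scaled alternatives c * q_j give lambda = c^2 * kappa_j (Remark 2).
c = 2.5
lam_scaled = np.array([noncentrality(c * q[:, j]) for j in range(3)])

# Random unit directions stay within [kappa_k, kappa_1] (Rayleigh bounds).
rng = np.random.default_rng(0)
theta = rng.normal(size=(1000, 3))
theta /= np.linalg.norm(theta, axis=1, keepdims=True)
lam_random = np.einsum("ij,jk,ik->i", theta, omega, theta)
```

Because the power Ψ(λ) is assumed monotone in λ, the ordering of these noncentrality values is also the ordering of the powers.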
Remark 3. Note that the foregoing developments are for the general case that Ω=QDκQ' is anisotropic with {κ1≥…≥κk>0}. If isotropic, then the following applies.
Definition 1. The model Ξ is called isotropic if and only if Ξ=dIk, in which case power functions are directionally invariant, not depending on directions of alternatives in ℝk.
Sphericity
The density for Nk(μ,Σ) has spherical contours for the case that Σ=dIk, i.e., the model is isotropic. Sample evidence regarding the isotropy of T2 is available. Mauchly [3] derived the Likelihood Ratio test for sphericity, namely, H0: Σ=dIk against H1: Σ≠dIk. A contemporary test utilizes the modified statistic
$$LRM = -\left(\nu - \frac{2k^2+k+2}{6k}\right)\ln\left[\frac{k^k\,|S|}{(\operatorname{tr} S)^k}\right] \tag{4}$$
taking S as the sample dispersion matrix from n observations, rejecting at level α for LRM>cα with ν=n−1 and with cα as the upper percentile of the central distribution χ2(⋅;k(k+1)/2−1,0). See, for example, Rencher [4].
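The statistic (4) is straightforward to evaluate. The following is a minimal sketch, assuming SciPy's central chi-squared distribution for the reference law; the function name `mauchly_lrm` and the two example matrices are hypothetical and serve only to illustrate the computation.

```python
import numpy as np
from scipy.stats import chi2

def mauchly_lrm(s, n):
    """Modified LR statistic (4) for sphericity, its df, and p-value."""
    s = np.asarray(s, dtype=float)
    k = s.shape[0]
    nu = n - 1
    # ln[k^k |S| / (tr S)^k], assembled on the log scale for stability.
    _, logdet = np.linalg.slogdet(s)
    log_u = k * np.log(k) + logdet - k * np.log(np.trace(s))
    stat = -(nu - (2 * k**2 + k + 2) / (6 * k)) * log_u
    df = k * (k + 1) // 2 - 1
    return stat, df, chi2.sf(stat, df)

# Hypothetical sample dispersion matrices (illustration only).
stat_iso, df_iso, p_iso = mauchly_lrm(2.0 * np.eye(3), n=29)
stat_aniso, df_aniso, p_aniso = mauchly_lrm(np.diag([5.0, 1.0, 0.2]), n=29)
```

For an exactly spherical S the bracketed ratio equals one, so the statistic is zero; departures from sphericity make the ratio smaller and the statistic larger.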
Design Reversals
Developments thus far are predicated in part on the desirability to identify alternatives having varying powers of discernment. These include the most likely and least likely as in Theorem 1(ii). If the least likely is deemed to be of greatest interest, it remains to ask whether it might serve instead as the most likely alternative. In the context of designed experiments the answer is affirmative, as the intrinsic structure offers a venue for modifying a given design so as to achieve these ends. Details follow.
Consider the model {Y=[1n, X][α,β']'+ε} with X centered such that 1n'X=0, where location–scale parameters for β̂ are (β,Ξ) with Ξ=(X'X)−1 as in Corollary 1(ii). In particular, the test for H0: β=βo against H1: β≠βo utilizes F=(β̂−βo)'X'X(β̂−βo)/(kS2) with S2 as the residual mean square and with noncentrality λ=(β−βo)'X'X(β−βo)/σ2, where it often suffices to take σ2=1.0. For X∈Fn×k its singular decomposition, followed by X'X, is
$$X = P D_\delta Q' = \sum_{i=1}^{k}\delta_i p_i q_i', \qquad X'X = Q D_\kappa Q' \tag{5}$$
with P=[p1,…,pk] as its left–singular vectors, Dδ as its singular values, and columns of Q∈Ok as its right–singular vectors. Clearly $\{\kappa_i = \delta_i^2;\ 1\le i\le k\}$. Our principal reconstruction is articulated in the following.
Theorem 2. Let π(δ) be a permutation operator reversing the ordered array [δ1≥…≥δk] to {δk≤{δ2,…,δk−1}≤δ1}, and let Dπ=Diag(δk,δ2,…,δk−1,δ1). Next construct
$$X_\pi = P D_\pi Q' = \delta_k p_1 q_1' + \sum_{i=2}^{k-1}\delta_i p_i q_i' + \delta_1 p_k q_k'$$
such that pairs (δk,q1) and (δ1,qk) are realigned.
Conclusion: The most likely and least likely alternatives for design Xπ are reversed from those of X, so that θ1=qk now is most likely with power Ψ(κ1) depending on κ1, and θk=q1 least likely with power Ψ(κk) depending on κk.
Proof: Clearly the conventional reordering of eigenvalues gives
$$X_\pi' X_\pi = \kappa_1 q_k q_k' + \sum_{i=2}^{k-1}\kappa_i q_i q_i' + \kappa_k q_1 q_1' \tag{6}$$
and the conclusion follows on applying Theorem 1(ii) in the context of Corollary 1(ii).
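The construction in Theorem 2 can be sketched numerically as follows. The design `x` is randomly generated and centered purely for illustration (it is not a design from this study), and the variable names are assumptions of the sketch.

```python
import numpy as np

# Hypothetical centered first-order design (illustration only).
rng = np.random.default_rng(1)
n, k = 12, 4
x = rng.normal(size=(n, k))
x -= x.mean(axis=0)                       # center so that 1'X = 0

# Thin SVD: X = P diag(delta) Q', with delta_1 >= ... >= delta_k.
p, delta, qt = np.linalg.svd(x, full_matrices=False)
q = qt.T
kappa = delta**2

# Swap the largest and smallest singular values; keep the others in place.
delta_pi = delta.copy()
delta_pi[0], delta_pi[-1] = delta[-1], delta[0]
x_pi = p @ np.diag(delta_pi) @ qt

# Noncentrality (with sigma^2 = 1) along each right-singular vector q_j.
lam_x = np.array([q[:, j] @ x.T @ x @ q[:, j] for j in range(k)])
lam_x_pi = np.array([q[:, j] @ x_pi.T @ x_pi @ q[:, j] for j in range(k)])
```

In this sketch the direction q1 carries κ1 under X but κk under Xπ, and conversely for qk, while the intermediate directions are unchanged.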
Remark 4. Variations on Xπ are apparent. Any permutation of {δ2,…,δk−1} gives the same conclusion. In addition, any pairs {(δi,qi),(δj,qj)} may be selected in like manner as most likely and least likely to be discerned. Note, however, that these tools are available in the case of first–order designs.
Studies in Hotelling [2] T2 and second–order response models are given, to illustrate Theorem 1 and Corollary 1. Moreover, an example design serves to illustrate the Theorem 2 reversal of most likely and least likely alternatives.
Hotelling’s T2 Tests
We reexamine the role of calcium in the growth of turnip greens, using data as reported in Kramer et al. [5]. In each of 29 experimental plots the plant calcium (Y1) was determined, and the soil calcium was assayed as available (Y2) and exchangeable (Y3) calcium. The units all are milliequivalents per hundred grams. Horticultural specialists expect these to run at about 15.00, 6.00 and 2.85 units, respectively. The sample means are Ȳ'=[17.97, 4.39, 2.46]; and the sample dispersion matrix is S with inverse S−1=QDκQ' in spectral form, where
$$Q = \begin{bmatrix} 0.01024 & 0.22479 & -0.97435 \\ 0.05973 & -0.97280 & -0.22381 \\ -0.99816 & -0.05591 & -0.02339 \end{bmatrix}$$
and Dκ=Diag(5.61102, 0.04857, 0.00416). The data are ill–conditioned, with condition number c1(S)=1,348.80.
The statistic reported is T2=n(Ȳ−μ0)'S−1(Ȳ−μ0)=24.97 with μ0'=[15.00, 6.00, 2.85], rejecting at level α=0.01 the hypothesis H0: μ=μ0 in favor of some H1: μ≠μ0. Indeed, the p–value is P(T2>24.97 | H0)=0.000751 with Cα=14.980. On applying Corollary 1(i) with S in lieu of Σ, the columns of Q=[q1,q2,q3] are taken as successive alternatives to (μ−μ0), namely
$$\begin{bmatrix}\mu_1-15.00\\ \mu_2-6.00\\ \mu_3-2.85\end{bmatrix} \in \left\{ \begin{bmatrix}0.01024\\ 0.05973\\ -0.99816\end{bmatrix},\ \begin{bmatrix}0.22479\\ -0.97280\\ -0.05591\end{bmatrix},\ \begin{bmatrix}-0.97435\\ -0.22381\\ -0.02339\end{bmatrix} \right\}$$
where the dominant terms are in bold type. The noncentrality parameters {λi=nκi; 1≤i≤3} are {162.720, 1.409, 0.121}, and taking α=0.05 and Cα=9.612, powers at these alternatives are
$$P(T^2>9.612\mid\lambda_1)=1.0000,\quad P(T^2>9.612\mid\lambda_2)=0.1316,\quad P(T^2>9.612\mid\lambda_3)=0.0562.$$
Accordingly, T2 has essentially unit power to distinguish the hypothetical deviation (μ3− 2.85) from -0.99816, since the discrepancies 0.01024 for (μ1− 15.00) and 0.05973 for (μ2− 6.00) in q1 are negligible. Similarly, T2 is marginally able to distinguish [( μ1 −15.00), ( μ2 −6.00)] from [0.22479, −0.97280] with power 0.1316, but is virtually unable to separate [( μ1 −15.00), ( μ2 −6.00)] from [−0.97435, −0.22381] with negligible power of 0.0562. In short, the latter suggests [14.03, 5.78] to be plausible values for [μ1,μ2] .
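The critical values and powers quoted above can be approximated through the conversion of T2 to a noncentral F given in the proof of Corollary 1. The sketch below assumes SciPy's `f` and `ncf` distributions and uses the eigenvalues reported in the text; the helper names `t2_critical` and `t2_power` are illustrative.

```python
import numpy as np
from scipy.stats import f, ncf

def t2_critical(alpha, k, nu):
    """Upper alpha point of central T^2_k(nu) via nu*k/(nu-k+1) * F(k, nu-k+1)."""
    return nu * k / (nu - k + 1) * f.ppf(1 - alpha, k, nu - k + 1)

def t2_power(lam, alpha, k, nu):
    """P(T^2 > critical value | lambda), computed on the F scale."""
    dfd = nu - k + 1
    f_crit = f.ppf(1 - alpha, k, dfd)
    return ncf.sf(f_crit, k, dfd, lam)

n, k = 29, 3
kappa = np.array([5.61102, 0.04857, 0.00416])   # eigenvalues of S^{-1} from the text
lam = n * kappa                                 # noncentrality parameters

crit_05 = t2_critical(0.05, k, n - 1)
crit_01 = t2_critical(0.01, k, n - 1)
powers = t2_power(lam, 0.05, k, n - 1)
```

Under these assumptions the computed values should be close to those reported above, with small differences attributable to rounding of the eigenvalues.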
This is an example, as seen subsequently also, where elements of Q=[q1,q2,q3], especially q1, convey useful information in regard to the objectives of the study. In summary, details regarding directed alternatives, enabled here by Theorem 1 and Corollary 1(i), go beyond conventional usage for T2.
Hotelling’s T2 Charts.
Multivariate diagnostics figure prominently in Statistical Process Control (SPC), as reviewed subsequently. In monitoring the manufacture of bomb sights during World War II, Hotelling [6] devised T2 charts for multivariate means in ℝk. Here successive values {T2i; i=1,2,...} are charted against time, where the chart signals the process to be out–of–control at level α whenever T2i>Cα. Moreover, with power ψ(λ)=P((T2i|λ)>Cα), the Average Run Length (ARL) of time–to–signal is ARL=1/ψ(λ). To monitor the mean μ against its target value μ0, successive samples of size n yield (Ȳi,Si), together with T2i=n(Ȳi−μ0)'Si−1(Ȳi−μ0) having the T2k(⋅;n−1,λ) distribution with λ=n(μ−μ0)'Σ−1(μ−μ0). Phase I in SPC is set to establish base line process capabilities, to include parameter estimation, followed in Phase II by implementing the control charts themselves.
To continue, consider the data of Quesenberry [7] to be in Phase I, comprising n=30 records of 11 quality characteristics indexed in time–order of production. Following Williams et al. [8], dimensions are reduced on selecting the first k=5 quality characteristics, namely, [Y1,...,Y5] having means [μ1,...,μ5], respectively. As in Section Hotelling’s T2 Tests we take S in lieu of Σ, finding the spectral resolution S−1=QDκQ' as reported in Table 1. The data are seen to be highly ill–conditioned, with condition number c1(S)=3,945.64. In keeping with Corollary 1(i), five directed alternatives comprise the columns of Q=[q1,...,q5] in Table 1, where dominant elements again are in bold type. In particular, this example shows {q1,...,q5} to be separately informative per se, as each corresponds essentially to deviations in (μi−μi0) for observations {Y1,Y3,Y5,Y2,Y4}, respectively, since values other than those in bold type are negligible.
| Eigenvector | Eigenvalue κi | Y1 | Y2 | Y3 | Y4 | Y5 | Power, α=0.05, n=30 | Power, α=0.05, n=8 | ARL, α=0.05, n=8 |
|---|---|---|---|---|---|---|---|---|---|
| q1 | 520.4304 | 0.99893 | −0.00607 | 0.04535 | −0.00396 | −0.00505 | 1.0000 | 1.0000 | 1.0000 |
| q2 | 11.9529 | −0.04553 | −0.00445 | 0.99863 | −0.01681 | −0.01907 | 1.0000 | 1.0000 | 1.0000 |
| q3 | 1.1404 | 0.00490 | 0.14300 | 0.01937 | −0.02275 | 0.98926 | 0.9914 | 0.1848 | 5.41 |
| q4 | 1.0843 | 0.00543 | 0.98712 | 0.00322 | 0.07515 | −0.14105 | 0.9882 | 0.1779 | 5.62 |
| q5 | 0.1319 | 0.00291 | −0.07126 | 0.01722 | 0.99676 | 0.03287 | 0.2367 | 0.0645 | 15.50 |

Table 1: Spectral values for S−1=QDκQ' for the data of Quesenberry (2001) of order (30×5). Each row lists an eigenvalue, the elements of the corresponding eigenvector, and the power and ARL at that directed alternative.
Taking {λi=nκi; 1≤i≤5} with n=30, α=0.05, and critical value Cα=15.097, powers against these directed alternatives are listed in Table 1. These show that deviations in the directions of {μ1,μ2,μ3,μ5} essentially would be discerned with power at least 0.9882, whereas power in the direction of μ4 would be diminished to 0.2367. Note, however, that these values are inflated by the value n=30 in Phase I. Samples much smaller in size ordinarily would be taken in Phase II, say n=8 in this example. Then the powers corresponding to {λi=8κi; 1≤i≤5} at α=0.05 are given subsequently in Table 1. In short, with ARL=1/ψ(λ) as in the final row of Table 1 with samples of size n=8, these charts would detect changes in μ1 and μ3 immediately on average, but would be less responsive otherwise, with ARLs of (5.41, 5.62, 15.50) for (μ5, μ2, μ4), respectively.
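The relation ARL=1/ψ(λ) used above can be illustrated directly from the powers reported in Table 1 for n=8. This is only a restatement of that arithmetic; the variable names are assumptions of the sketch.

```python
# Powers at the five directed alternatives for n = 8, as listed in Table 1.
powers_n8 = [1.0000, 1.0000, 0.1848, 0.1779, 0.0645]

# If successive samples signal independently with probability psi, the run
# length is geometric, so its mean (the ARL) is 1 / psi.
arls = [round(1.0 / psi, 2) for psi in powers_n8]
```

The resulting values agree with the ARL column of Table 1 to two decimal places.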
Remark 5. This example goes beyond conventional uses of T2 charts, in demonstrating that the capacity of a given chart to detect alternatives may differ widely across alternatives in ℝ5 of compelling practical interest. In short, each of five ARLs pertains here to an informative one–dimensional alternative.
Second–Order Designs
Second–order models of type
$$\{Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_{11}X_{1i}^2 + \beta_{22}X_{2i}^2 + \beta_{12}X_{1i}X_{2i} + \varepsilon_i;\ 1\le i\le n\} \tag{8}$$
are considered having zero mean, uncorrelated errors with variance σ2 . In a typical setting the yield ( Yi ) of a chemical process is examined at specified reaction time (X1i) and temperature (X2i) . Small designs of historical consequence are the Central Composite (CCD) designs of Box et al. [9], having design points as listed in Table 2.
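| Run | X1 | X2 |
|---|---|---|
| 1 | −1.00 | −1.00 |
| 2 | −1.00 | 1.00 |
| 3 | −α | 0.00 |
| 4 | α | 0.00 |
| 5 | 0.00 | −α |
| 6 | 0.00 | α |
| 7 | 1.00 | −1.00 |
| 8 | 1.00 | 1.00 |
| 9 | 0.00 | 0.00 |

Table 2: Regressor vectors for the CCD design X' of order (2×9), where α=√2. Each row of the table is one column of X'.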
Proceeding as in Corollary 1(ii), we seek spectral values for the Fisher Information Matrix X′X, specifically, the eigenvalues and eigenvectors as listed in Table 3, with dominant terms again in bold type. Powers at these directed alternatives follow on taking the eigenvalues κi as surrogates in the noncentrality parameters for F=(β̂−βo)′X′X(β̂−βo)/(kS2) with k=6 and S2 as the residual mean square. Owing to only two degrees of freedom for error, computations replaced S2 by σ2=1.0; then powers were determined from scaled noncentral χ2(6,κi) distributions with noncentrality parameters as listed in Table 3.
| Eigenvector | Eigenvalue κi | β0 | β1 | β2 | β11 | β22 | β12 | Power, α=0.05 |
|---|---|---|---|---|---|---|---|---|
| q1 | 24.3427 | 0.59349 | 0.00000 | 0.00000 | 0.56911 | 0.56911 | 0.00000 | 0.9765 |
| q2 | 8.0000 | 0.00000 | 0.00000 | 0.00000 | 0.70711 | −0.70711 | 0.00000 | 0.5307 |
| q3 | 8.0000 | 0.00000 | 1.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.5307 |
| q4 | 8.0000 | 0.00000 | 0.00000 | −1.00000 | 0.00000 | 0.00000 | 0.00000 | 0.5307 |
| q5 | 4.0000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 1.00000 | 0.2698 |
| q6 | 0.6573 | 0.80484 | 0.00000 | 0.00000 | −0.41966 | −0.41966 | 0.00000 | 0.0775 |

Table 3: Spectral values for the Fisher Information Matrix X'X for the CCD design. Each row lists an eigenvalue, the elements of the corresponding eigenvector in the coefficient order of model (8), and the power at that directed alternative.
Arranged in decreasing order of their powers, these are q1 with power 0.9765; {q2,q3,q4} each with power 0.5307; and q5 and q6 as alternatives with powers 0.2698 and 0.0775. Here elements of Q are separately informative: q2 for discrepancies between (β11,β22) and their hypothetical values; and (q3,q4,q5) for discrepancies between (β1, β2,β12) and their hypothetical values, respectively. To continue, observe that the eigenvalue κ2=8.0000 is repeated three times. On applying Theorem 1(iv) in the context of Corollary 1(ii), we see that all standardized elements in Sp(q2,q3,q4) have power 0.5307. These include, in addition to q3 for β1 and q4 for β2 , the standardized sums
$$\theta_1 = (q_3+q_4)/\sqrt{2} = [0,\ 1,\ -1,\ 0,\ 0,\ 0]'/\sqrt{2} \tag{9}$$
$$\theta_2 = (q_2+q_3+q_4)/\sqrt{3} = [0,\ 1,\ -1,\ 0.7071,\ -0.7071,\ 0]'/\sqrt{3} \tag{10}$$
for example, with θ1 for the discrepancy between [(β1−β10), (β2−β20)] and [1, −1]/√2.
In summary, the directed second–order alternatives treated here are innovations not found in classical linear inference. Instead, these are enabled by Theorem 1 and Corollary 1(ii). Again the elements of Q, especially {q2,...,q5}, are separately informative about coefficients of the model (8). Their simple and revealing structure may be attributed to the symmetry and balance of CCD designs.
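The eigenvalues and powers of Table 3 can be recomputed from the design points of Table 2. The sketch below assumes the coefficient ordering (β0, β1, β2, β11, β22, β12) for the columns of the model matrix and, following the text, replaces S2 by σ2=1.0 so that powers come from a noncentral chi-squared distribution with 6 degrees of freedom; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, ncx2

a = np.sqrt(2.0)
# Design points (X1, X2) of the CCD in Table 2.
points = np.array([[-1, -1], [-1, 1], [-a, 0], [a, 0], [0, -a],
                   [0, a], [1, -1], [1, 1], [0, 0]], dtype=float)
x1, x2 = points[:, 0], points[:, 1]

# Second-order model matrix; the column order is an assumption of this sketch.
xmat = np.column_stack([np.ones(9), x1, x2, x1**2, x2**2, x1 * x2])
info = xmat.T @ xmat                       # Fisher Information Matrix X'X

# Eigenvalues in decreasing order, kappa_1 >= ... >= kappa_6.
kappa = np.linalg.eigvalsh(info)[::-1]

# With sigma^2 = 1 the quadratic form is noncentral chi-squared(6, kappa_i);
# reject when it exceeds the central upper 5% point.
crit = chi2.ppf(0.95, 6)
powers = ncx2.sf(crit, 6, kappa)
```

Under these assumptions the eigenvalues and powers should reproduce the corresponding columns of Table 3 to about three decimal places.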
Design Reversals
Begin with { Y = [1n, X][β0 ,β′]′+ε} with X0=[1n,X] and X in centered form as in Section 3.2, with β′ = [β1,β2,β3], having the design X = PDiag (δ1,δ2,δ3) Q′ as listed in Table 4. Construct X1 = PDiag (δ3,δ2,δ1)Q′ on permuting singular values, but retaining the left and right singular vectors.
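Design X'=[PDiag(δ1,δ2,δ3)Q']'

| Row of X' | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | −1.0000 | 1.0000 | −1.4142 | 1.4142 | −1.0000 | 1.0000 | 0.0000 | 0.0000 |
| 2 | 0.0000 | 0.0000 | 1.0000 | 1.0000 | 0.0000 | 0.0000 | −1.0000 | −1.0000 |
| 3 | −0.8000 | 0.6000 | −0.8000 | 2.0000 | −0.8000 | 0.6000 | −0.8000 | 0.0000 |

Design X'1=[PDiag(δ3,δ2,δ1)Q']'

| Row of X'1 | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.4279 | −1.0391 | 0.2809 | 0.7126 | 0.4279 | −1.0391 | −1.1080 | 1.3370 |
| 2 | −0.0193 | −0.3197 | −1.3211 | −0.3556 | −0.0193 | −0.3197 | 0.4993 | 1.8556 |
| 3 | −0.1811 | 0.8999 | 0.2484 | −1.1704 | −0.1811 | 0.8999 | 1.1798 | −1.6954 |

Left–Singular Vectors P'

| Row of P' | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | −0.3308 | 0.2946 | −0.3736 | 0.6594 | −0.3308 | 0.2946 | −0.1792 | −0.0342 |
| 2 | 0.0960 | −0.1138 | 0.6024 | 0.3785 | 0.0960 | −0.1138 | −0.5082 | −0.4372 |
| 3 | 0.0996 | −0.3618 | −0.1302 | 0.2927 | 0.0996 | −0.3618 | −0.3435 | 0.7055 |

Right–Singular Vectors [θ1,θ2,θ3], listed as Q

| −0.7099 | −0.1307 | −0.6921 |
| 0.3508 | −0.9177 | −0.1865 |
| 0.6108 | 0.3752 | −0.6973 |

Detection Probabilities: Design X

| θ1 | θ2 | θ3 |
|---|---|---|
| 0.4947 | 0.1828 | 0.0577 |

Table 4: Design matrix X=PDδQ' and the modified X1=[PDiag(δ3,δ2,δ1)Q']; the left (P') and right (Q) singular vectors; and the singular values δ'=[3.8198, 2.0992, 0.5318].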
To continue, take the right–singular vectors, now Q=[θ1,θ2,θ3], as directed alternatives to (β−βo)′=[(β1−β10), (β2−β20), (β3−β30)], together with κ′=[14.5909, 4.4066, 0.2828] as {κi=δi²; 1≤i≤3}. Here the level 0.05 critical value is 6.5914; then powers are determined in turn from F(⋅;3,4,κi) together with the power functions {ψ(κi)=P(F>6.5914 | κi); 1≤i≤3} having values listed in the final row of Table 4.
In particular, discovering alternatives (β−βo)′ in the direction θ3 = −[0.6921, 0.1865, 0.6973] is seen to be unlikely, with power 0.0577. On the other hand, suppose instead that it is critical in context to discover alternatives in the negative orthant of ℝ3 . Then the design X1 serves to reverse these so that alternatives in the directions of { θ3, θ2, θ1} are now detected with probabilities [0.4947, 0.1828, 0.0577] , respectively.
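The detection probabilities in Table 4 follow from noncentral F distributions with (3, 4) degrees of freedom. A minimal sketch, assuming SciPy's `f` and `ncf` and the eigenvalues quoted in the text, is given below.

```python
import numpy as np
from scipy.stats import f, ncf

# Eigenvalues kappa_i = delta_i^2 as quoted in the text for design X.
kappa = np.array([14.5909, 4.4066, 0.2828])

# Level 0.05 critical value of the central F(3, 4) distribution.
crit = f.ppf(0.95, 3, 4)

# Power at each directed alternative: P(F > crit | kappa_i), sigma^2 = 1.
powers = ncf.sf(crit, 3, 4, kappa)

# Under the reversed design X1 the same three powers apply, but attached to
# the directions in reverse order (theta_3, theta_2, theta_1).
powers_reversed_order = powers[::-1]
```

Under these assumptions the computed powers should be close to the detection probabilities listed in Table 4.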
Further properties of the design X and its reversal X1 deserve mention. Observe that X0′X0=Diag(n, X′X), whereas X′X=Q Diag(κ1,κ2,κ3)Q′ and X1′X1=QR Diag(κ1,κ2,κ3)QR′, where now QR=[q3,q2,q1] following the convention that {κ1≥κ2≥κ3} remain ordered. Clearly V(β̂|X)=σ2(X′X)−1 and V(β̂|X1)=σ2(X1′X1)−1 differ, where their diagonal elements are listed as variances in Table 5. However, their eigenvalues are identical by construction, as are their {A,D,E} efficiency indices as the trace, determinant, and largest eigenvalue of (X0′X0)−1 under both designs X and X1.
Design Characteristics

| Estimate | Variance, design X | Variance, design X1 |
|---|---|---|
| β̂0 | 0.12500 | 0.12500 |
| β̂1 | 1.38170 | 1.83546 |
| β̂2 | 0.69002 | 0.26120 |
| β̂3 | 1.76012 | 1.73518 |

| Eigenvalues of Γ, design X | Eigenvalues of Γ, design X1 |
|---|---|
| 3.53636 | 3.53636 |
| 0.22694 | 0.22694 |
| 0.12500 | 0.12500 |
| 0.06854 | 0.06854 |

| Diagnostic | A | D | E |
|---|---|---|---|
| Σ0, Ξ0 | 3.95684 | 0.00688 | 3.53636 |

Table 5: Variances of OLS solutions; eigenvalues of the dispersion matrix Γ∈{Σ0,Ξ0} as (X'0X0)−1 for the designs {X,X1}; and A, D, and E efficiencies for these designs.
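The entries of Table 5 can be approximated from the design X listed in Table 4. The sketch below rebuilds the reversed design from NumPy's singular value decomposition, so the left and right singular vectors may differ in sign from those tabulated; the quantities compared here do not depend on those signs. The helper names `dispersion` and `ade` are illustrative, and σ2=1 is assumed.

```python
import numpy as np

# Rows of X' as listed in Table 4 (three regressors, eight runs).
xt = np.array([
    [-1.0, 1.0, -1.4142, 1.4142, -1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, -1.0, -1.0],
    [-0.8, 0.6, -0.8, 2.0, -0.8, 0.6, -0.8, 0.0],
])
x = xt.T
n = x.shape[0]

# Thin SVD with singular values in decreasing order.
p, delta, qt = np.linalg.svd(x, full_matrices=False)

# Reversed design: same singular vectors, singular values in reverse order.
x1 = p @ np.diag(delta[::-1]) @ qt

def dispersion(design):
    """(X0'X0)^{-1} for X0 = [1_n, design], taking sigma^2 = 1."""
    x0 = np.column_stack([np.ones(n), design])
    return np.linalg.inv(x0.T @ x0)

def ade(m):
    """A, D, E indices: trace, determinant, largest eigenvalue."""
    return np.trace(m), np.linalg.det(m), np.linalg.eigvalsh(m).max()

v, v1 = dispersion(x), dispersion(x1)
variances_x, variances_x1 = np.diag(v), np.diag(v1)
ade_x, ade_x1 = ade(v), ade(v1)
```

In this sketch the individual variances differ between the two designs, while the A, D, and E indices coincide, as the text asserts.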
This study reexamines the concept of directional invariance, or isotropy, for distributions on ℝk having location–scale parameters (δ,Ξ). Powers of tests for H0: δ=δ0 against H1: δ≠δ0 often depend on noncentrality parameters of type λ=(δ−δ0)′Ξ−1(δ−δ0). The spectral decomposition of Ξ supports the identification of directed alternatives in directions determined by the eigenvectors of Ξ, to encompass the alternatives most likely and least likely in a given study. Powers of these types are independent of direction if and only if Ξ=σ2Ik. Applications are drawn in the use of Hotelling [2] T2 in multivariate samples, and of F–tests in linear models. Case studies are given where this approach leads to the discovery of further insight regarding the natural parameters of a problem.
One concept of directional invariance figures prominently in the SPC literature. The following is excerpted from Linna et al. [10].
“It is well known that the ARL performance of multivariate SPC procedures depends heavily on the covariance structure of the observed data. See, for example, Mason et al. [11]. Further, it has been noted by Pignatiello et al. [12] that many multivariate procedures, including the χ2 chart, Hotellings T2 chart, and most of the multivariate CUSUM charts, are directionally invariant. The performance of the multivariate EWMA chart proposed in Lowry et al. [13] is also directionally invariant. Lowry et al. [14] and others also note the directional invariance of many of these multivariate control charting methods. Directional invariance means that the performance of a procedure does not depend on the specific direction in p-space of a shift in the mean vector of the process variables being monitored. Instead, performance of a directionally invariant procedure depends only on the statistical (or Mahalanobis) distance between the in-control mean vector μ0 and the out-of-control mean vector under consideration, μ1 .” (Italics supplied.) That is, D2Ξ(μ0, μ1) = (μ0− μ1)′Ξ−1(μ0−μ1) .
Unfortunately, this notion of directional invariance is grossly misleading, is antithetical to the very concept of invariance as in our Definition 1, at best is a misnomer, and in any event deserves to be clarified in the SPC literature. In fact, such essentials as power functions and ARLs do indeed depend on directions of alternatives, as seen in the counterexamples of Section 4.2, unless the model is isotropic.
On the other hand, a disclaimer of Tsui et al. [15] should be noted: “There is no reason in practice, however, for a shift to μ = μ* to be always considered as important as a shift to μ = μ** just because the corresponding values of the noncentrality parameters are equal.” Our study represents a substantial elaboration on this point.
Rayleigh Quotients.
At issue are variational properties of quadratic forms of type Q(υ)=υ′Ωυ/υ′υ, known as Rayleigh Quotients; see Bellman [16]. Write $\Omega = \sum_{i=1}^{k}\kappa_i q_i q_i' = QD_\kappa Q'$ in its spectral form with Q∈Ok so that {qi′Ωqi=κi; 1≤i≤k}. Further partition Q=[Q1,Q2,Q3] of orders {(k×(r−1)), (k×s), (k×d)} with d=k−r−s+1, and partition Dκ=Diag(Dκ1,Dκ2,Dκ3) conformably. Then
$$\Omega = Q_1 D_{\kappa 1} Q_1' + Q_2 D_{\kappa 2} Q_2' + Q_3 D_{\kappa 3} Q_3' \tag{11}$$
where, in particular, Q2=[qr,...,qr+s−1] and Dκ2=Diag(κr,...,κr+s−1). Essentials follow.
Lemma A.1. Consider the positive definite form υ′Ωυ/υ′υ with Ω as in expression (11).
(i) Variational properties of Q(υ) as υ varies over ℝk are
$$\kappa_k \le \frac{\upsilon'\Omega\upsilon}{\upsilon'\upsilon} \le \kappa_1 \tag{12}$$
where the lower and upper limits are attained at υ=qk and υ=q1, respectively.
(ii) Variational properties of Q(υ) as υ varies over Sp(Q2) are
$$\kappa_{r+s-1} \le \frac{\upsilon'\Omega\upsilon}{\upsilon'\upsilon} \le \kappa_r \tag{13}$$
where the lower and upper limits are attained at υ=qr+s−1 and υ=qr, respectively.
Proof: Conclusion (i) is given in Bellman [16], where the limits are attained as given since κk=qk′Ωqk and κ1=q1′Ωq1. To see conclusion (ii), υ∈Sp(Q2) may be represented as υ=Q2a with a′=[ar,...,ar+s−1]. Then for υ∈Sp(Q2), expression (11) gives $\upsilon'\Omega\upsilon = a'Q_2'Q_2 D_{\kappa 2} Q_2'Q_2 a = \sum_{i=r}^{r+s-1}\kappa_i a_i^2$. The lower and upper limits for υ∈Sp(Q2) follow as in conclusion (i) for υ∈ℝk, except that now these are attained as given since qr+s−1′Ωqr+s−1=κr+s−1 and qr′Ωqr=κr.

The author is indebted to Professor Donald E. Ramirez for substantial contributions, including computations using the MINITAB and MAPLE software packages.
©2017 Jensen. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.