On directed alternatives in linear inference

doi:10.15406/bbij.2017.06.00171

eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 6 Issue 3


Donald Jensen


Department of Statistics, Virginia Tech, USA

Correspondence: Donald Jensen, Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA

Received: October 25, 2016 | Published: October 5, 2017

Citation: Jensen D. On directed alternatives in linear inference. Biom Biostat Int J. 2017;6(3):364-371. DOI: 10.15406/bbij.2017.06.00171


Abstract

Tests for vector hypotheses $H_{0} : θ = θ_{0}$ against $H_{1} : θ \neq θ_{0}$ in $ℝ^{k}$ typically have powers depending on quadratic forms of type $λ = (θ - θ_{0})' Ξ^{- 1} (θ - θ_{0})$. This study examines the case that $(θ - θ_{0})$ is restricted to subspaces, for example, $(θ - θ_{0})' = c (1, 1, 0, \dots, 0)$, where $θ$ and $θ_{0}$ differ only in their first two coordinates. These are called directed alternatives. The spectral decomposition of $Ξ$ supports the identification of one–dimensional alternatives least likely and most likely to be discerned, to complement conventional data analysis. Applications are drawn in the use of Hotelling’s $T^{2}$ and of $F$–tests in linear inference. Moreover, it is seen that a given design may be recast so as to reverse the least likely and most likely alternatives. Numerical examples serve to illustrate the findings.

Mathematics Subject Classification: 62J05, 62H10 and 62P30

Keywords: Linear models; $F$ tests; Hotelling's $T^{2}$ tests; Directed alternatives; Reversal designs

Introduction

Power in statistical inference is driven by non–null distributions. For observations in $ℝ^{k}$ having dispersion matrix $Ξ$ , noncentrality parameters often emerge as the Mahalanobis [1] distance between points $(u, v)$ in $ℝ^{k}$ , namely,

$D_{Ξ}^{2} (u, v) = (u - v)' Ξ^{- 1} (u - v) .$ (1)

This specializes to the Euclidean metric for the case that $Ξ = c Ι_{k}$ , in which case the model is called isotropic. In particular, nonparametric and other statistics often have noncentral chi–squared distributions, either in small samples or asymptotically. In addition, pervasive venues in parametric inference, to be reexamined in some detail, include the following.

Case (i). Hotelling [2] Test: $T^{2} = n (\bar{Y} - μ_{0})' S^{- 1} (\bar{Y} - μ_{0})$, where $(\bar{Y}, S)$ are the sample mean and dispersion matrix of $n$ Gaussian vectors in $ℝ^{k}$ having the location–scale parameters $(μ, Σ)$. Then in testing $H_{0} : μ = μ_{0}$ against $H_{1} : μ \neq μ_{0}$, the power function is $Ψ (λ)$ with noncentrality $λ = n (μ - μ_{0})' Σ^{- 1} (μ - μ_{0})$.

Case (ii). The General Linear Model: ${Y = X β + ε}$ with Gaussian errors having zero means and dispersion matrix $σ^{2} I_{n}$ . Then in testing $H_{0} : β = β_{o}$ against $H_{1} : β \neq β_{o}$ in $ℝ^{k}$ , the power function is $Ψ (λ)$ with noncentrality $λ = (β - β_{o})' X' X (β - β_{o}) / σ^{2}$ .

Classical theory allows for any $(μ - μ_{0}) \in ℝ^{k}$ for Case (i), and $(β - β_{o}) \in ℝ^{k}$ for Case (ii). On the other hand, alternatives lying in designated subspaces may hold substantive interest per se. For example, taking $(μ - μ_{0})' = c (1, 1, 0, \dots, 0)$ allows for discrepancies between $μ$ and $μ_{0}$ in their first two coordinates only, whereas $(μ - μ_{0})' = c (1, 1, \dots, 1)$ allows for deviations along the equiangular line in $ℝ^{k}$ . Both are one–dimensional; subspaces of dimension greater than one are considered subsequently. Alternatives lying in designated subspaces of $ℝ^{k}$ are called directed alternatives, and the goal here is to study powers of tests against alternatives of these types.

The present study expands on this as follows. Not only do distinct alternatives differ in importance to users, but so too their probabilities of detection. Here the spectral decomposition of $Ξ$ , if anisotropic, supports the identification of alternatives least likely and most likely to be discovered, as well as intermediate cases. These serve to bracket the effective range of inferences intrinsic to a given study, and thereby complement conventional options in data analysis. Applications are drawn in the use of Hotelling’s $T^{2}$ in multivariate samples, and of F–tests in the analysis of linear models. Moreover, it is shown that a given design may be modified so as to reverse the least likely and most likely alternatives, in the event that this would better serve the objectives of an experiment.

This study is organized as follows. Supporting developments are given next in Section 2, followed by the principal findings of Section 3. Several examples in Section 4 illustrate the essential results. Collateral materials are deferred for completeness to an Appendix.

Preliminaries

Notation

Spaces include $ℝ^{n}$ as Euclidean $n$-space; $ℝ_{+}^{n}$ as its positive orthant; $S_{n}$ as the real symmetric $(n \times n)$ matrices; $S_{n}^{+}$ as their positive definite varieties; $F_{n \times k}$ as the real $(n \times k)$ matrices of rank $k \leq n$; and $O_{k}$ as the $(k \times k)$ orthogonal group. Vectors and matrices are set in bold type; the transpose, inverse, trace, and determinant of $A$ are $A'$, $A^{- 1}$, $\mathrm{tr}(A)$, and $| A |$; the unit vector in $ℝ^{n}$ is $1_{n} = [1, \dots, 1]'$; $I_{n}$ is the $(n \times n)$ identity; and $\mathrm{Diag}(A_{1}, \dots, A_{k})$ is a block–diagonal array. If $B = [b_{1}, \dots, b_{k}]$ is of order $(n \times k)$ and rank $k < n$, then $S_{p} (B)$ designates the column span of $B$, i.e., the $k$–dimensional subspace of $ℝ^{n}$ spanned by $[b_{1}, \dots, b_{k}]$. The ordered eigenvalues of $A \in S_{n}$ are $\{λ_{i} (A) = α_{i}; 1 \leq i \leq n\}$ with $\{α_{1} \geq α_{2} \geq \dots \geq α_{n}\}$, and its spectral decomposition is $A = P D_{α} P' = \sum_{i = 1}^{n} α_{i} p_{i} p_{i}'$, where $P = [p_{1}, \dots, p_{n}] \in O_{n}$ and $D_{α} = \mathrm{Diag}(α_{1}, \dots, α_{n})$. By convention its condition number is $c_{1} (A) = α_{1} / α_{n}$. The singular decomposition of $B \in F_{n \times k}$ is $B = P D_{δ} Q' = \sum_{i = 1}^{k} δ_{i} p_{i} q_{i}'$, where the mutually orthogonal columns of $P = [p_{1}, \dots, p_{k}]$ comprise the left–singular vectors; the diagonal elements of $D_{δ} = \mathrm{Diag}(δ_{1}, \dots, δ_{k})$ are its singular values; and columns of $Q \in O_{k}$ are the right–singular vectors.

Special Distributions

For $Y \in ℝ^{n}$, its distribution, mean, and dispersion matrix are $L (Y)$, $E (Y) = μ$ and $V (Y) = Σ$, say, with variance $\mathrm{Var}(Y) = σ^{2}$ on $ℝ^{1}$. Specifically, $L (Y) = N_{n} (μ, Σ)$ is Gaussian on $ℝ^{n}$ with parameters $(μ, Σ)$. Distributions on $ℝ_{+}^{1}$ include the $χ^{2} (\cdot; ν, λ)$ with $ν$ degrees of freedom and noncentrality parameter $λ$; the Snedecor–Fisher $F (\cdot; ν_{1}, ν_{2}, λ)$ with degrees of freedom $(ν_{1}, ν_{2})$ and noncentrality $λ$; and Hotelling [2] $T_{k}^{2} (\cdot; ν, λ)$ of order $k$ having $ν$ degrees of freedom and noncentrality $λ$. Recall that $F (\cdot; ν_{1}, ν_{2}, λ)$ increases stochastically with increasing $λ$ with other parameters held fixed. Identify $c_{α}$ in context as the upper $α$–level critical value. The power of a test, to be considered as a function of $λ$, is designated by $ψ (λ)$.

The Principal Findings

Directed alternatives

Our notation encompasses both (i) Hotelling [2] $T^{2}$ and (ii) General Linear Models, having location–scale parameters $(δ, Ξ)$ . What distinguishes this study are directed alternatives with examples as noted, but expanded to include alternatives ${θ_{j}; 1 \leq j \leq k}$ aligned with the orthonormal eigenvectors $Q = [q_{1}, \dots, q_{k}]$ of $Ξ$ , thus standardized to unit lengths. To continue, as $Ω = Ξ^{- 1}$ assumes a central role, take $Ω = Σ_{i = 1}^{k} κ_{i} q_{i} q_{i}' = Q D_{κ} Q' \in S_{k}^{+}$ as its spectral decomposition, with ${κ_{1} \geq \dots \geq κ_{k} > 0}$ . As in Appendix A.1, undertake the expansions

$Ω = Q_{1} D_{κ_{1}} Q_{1}' + Q_{2} D_{κ_{2}} Q_{2}' + Q_{3} D_{κ_{3}} Q_{3}'$ (2)

$Ω = Q_{1} D_{κ_{1}} Q_{1}' + κ_{r} Q_{2} Q_{2}' + Q_{3} D_{κ_{3}} Q_{3}'$, with $κ_{r}$ repeated $s$ times, (3)

where elements of $Q = [Q_{1}, Q_{2}, Q_{3}]$ are of orders $\{(k \times (r - 1)), (k \times s), (k \times d)\}$ with $d = k - r - s + 1$, and where $D_{κ} = \mathrm{Diag}(D_{κ_{1}}, D_{κ_{2}}, D_{κ_{3}})$ is partitioned conformably. In regard to quadratic forms of type $Q (u) = u' Ω u$ serving as noncentrality parameters, a principal result is the following.

Theorem 1. Given is a location–scale model with parameters $(δ, Ξ)$, together with a test for $H_{0} : δ = δ_{0}$ against $H_{1} : δ \neq δ_{0}$ having power $Ψ (λ)$ increasing monotonically with $λ = D_{Ξ}^{2} (δ, δ_{0}) = (δ - δ_{0})' Ξ^{- 1} (δ - δ_{0})$. Take $\{(δ - δ_{0}) = θ_{j}; 1 \leq j \leq k\}$ in succession as the eigenvectors $\{θ_{j} = q_{j}; 1 \leq j \leq k\}$ of $Ξ^{- 1}$ with eigenvalues $\{κ_{1} \geq \dots \geq κ_{k} > 0\}$.

(i) Then powers $Ψ (λ_{j})$ of the test at alternatives $\{(δ - δ_{0}) = θ_{j}; 1 \leq j \leq k\}$ depend on the noncentrality parameters $\{λ_{j} = κ_{j}; 1 \leq j \leq k\}$, respectively.

(ii) In particular, the alternatives most likely and least likely to be discerned in terms of power are $θ_{1}$ and $θ_{k}$, having powers $Ψ (κ_{1})$ and $Ψ (κ_{k})$, respectively.

(iii) Consider alternatives $\{γ_{j} \in S_{p} (Q_{2})\}$ standardized to unit lengths. Then bounds on powers at these local alternatives are given by $Ψ (κ_{r + s}) \leq Ψ (λ_{j}) \leq Ψ (κ_{r})$ for every $γ_{j} \in S_{p} (Q_{2})$.

(iv) Suppose that $κ_{r}$ is repeated $s$ times as in the spectral resolution (3) for $Ω$. Then for each alternative $γ_{j} = Q_{2} a_{j} \in S_{p} (Q_{2})$, with $a_{j}' = [a_{j 1}, \dots, a_{j s}]$, the noncentrality parameter is $λ_{j} = κ_{r} a_{j}' a_{j}$, with corresponding power $Ψ (λ_{j})$.

Proof: Conclusion (i) follows directly from $λ (θ_{j}) = θ_{j}' Ξ^{- 1} θ_{j} = θ_{j}' (\sum_{i = 1}^{k} κ_{i} q_{i} q_{i}') θ_{j} = κ_{j}$, since $θ_{j} = q_{j}$, whereas $\{q_{j}' q_{i} = 0; i \neq j\}$ and $q_{j}' q_{j} = 1$ by orthonormality. Conclusion (ii) follows directly from variational properties of Rayleigh quotients as in Lemma A.1(i) of the Appendix. In like manner conclusion (iii) follows from Lemma A.1(ii) as variational properties over subspaces. Conclusion (iv) follows from (iii) since $\{q_{j}' q_{i} = 0; i \neq j\}$.
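Conclusions (i) and (ii) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `Xi` holds made-up illustrative values and is not taken from any data set in this paper. It confirms that the quadratic form at each eigenvector equals the matching eigenvalue, and that randomly drawn unit directions give values bracketed by $κ_{k}$ and $κ_{1}$.

```python
import numpy as np

# Hypothetical positive definite dispersion matrix (illustrative values only).
Xi = np.array([[4.0, 1.0, 0.5],
               [1.0, 2.0, 0.3],
               [0.5, 0.3, 1.0]])
Omega = np.linalg.inv(Xi)

# eigh returns ascending eigenvalues; reverse so kappa_1 >= ... >= kappa_k.
kappa, Q = np.linalg.eigh(Omega)
kappa, Q = kappa[::-1], Q[:, ::-1]


def noncentrality(theta, omega):
    """Quadratic form theta' Omega theta for a direction theta."""
    return float(theta @ omega @ theta)


# Conclusion (i): at theta_j = q_j the noncentrality equals kappa_j.
lam_eigen = np.array([noncentrality(Q[:, j], Omega) for j in range(len(kappa))])

# Conclusion (ii): every unit direction gives a value in [kappa_k, kappa_1].
rng = np.random.default_rng(0)
U = rng.normal(size=(1000, len(kappa)))
U /= np.linalg.norm(U, axis=1, keepdims=True)
lam_random = np.einsum("ij,jk,ik->i", U, Omega, U)
```

Since the power is assumed monotone in $λ$, the ordering of `lam_eigen` carries over directly to the ordering of powers.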

Remark 1. The directed alternatives $δ_{1}' = [1, 1, 0, \dots, 0]$ and $δ_{2}' = [1, 1, \dots, 1]$ were featured earlier as discrepancies in the first two coordinates of $(δ - δ_{0})$, and as deviations along the equiangular line in $ℝ^{k}$. Let $Ξ^{- 1} = [ξ^{i j}]$. Then powers $Ψ (λ)$ at these alternatives will depend on $λ_{1} = \sum_{i = 1}^{2} \sum_{j = 1}^{2} ξ^{i j}$ at $δ_{1}' = [1, 1, 0, \dots, 0]$, and on $λ_{2} = \sum_{i = 1}^{k} \sum_{j = 1}^{k} ξ^{i j}$ at $δ_{2}' = [1, 1, \dots, 1]$.
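The two sums in Remark 1 are simply block sums of the entries of $Ξ^{- 1}$. A small sketch, assuming NumPy and a hypothetical $4 \times 4$ dispersion matrix built only for illustration, compares the quadratic forms with the entry sums.

```python
import numpy as np

# Hypothetical dispersion matrix: A A' + I is positive definite by construction.
A = np.arange(16, dtype=float).reshape(4, 4) / 10.0
Xi = A @ A.T + np.eye(4)
Omega = np.linalg.inv(Xi)          # entries xi^{ij} of the inverse

delta1 = np.array([1.0, 1.0, 0.0, 0.0])   # first two coordinates only
delta2 = np.ones(4)                        # equiangular direction

lam1_quadratic = delta1 @ Omega @ delta1
lam1_block_sum = Omega[:2, :2].sum()       # sum over i, j = 1, 2

lam2_quadratic = delta2 @ Omega @ delta2
lam2_full_sum = Omega.sum()                # sum over all i, j
```

Neither direction is scaled to unit length here, in keeping with the Remark; Remark 2 covers the effect of rescaling.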

Corollary 1. On specializing the location–scale parameters $(δ, Ξ)$ , Theorem 1 applies verbatim as follows.

(i) Hotelling [2] $T^{2} : (δ, Ξ) = (μ, Σ); L (T^{2}) = T_{k}^{2} (\cdot; n - 1, λ)$ , the power $Ψ (λ)$ depending on $λ = n (μ - μ_{0})' Σ^{- 1} (μ - μ_{0})$ .

(ii) General Linear Models: $(δ, Ξ) = (β, {(X' X)}^{- 1}); L (F) = F (\cdot; k, n - k, λ)$, the power $Ψ (λ)$ depending on $λ = (β - β_{o})' X' X (β - β_{o}) / σ^{2}$.

Proof: The noncentral distribution $F (\cdot; k, n - k, λ)$ clearly satisfies the assumptions of Theorem 1 on identifying $(δ, Ξ) = (β, {(X' X)}^{- 1})$ as claimed. Similarly, in testing $H_{0} : μ = μ_{0}$ against $H_{1} : μ \neq μ_{0}$, Hotelling’s $T^{2}$ inherits these properties through the conversion $L (\frac{(ν - k + 1) T^{2}}{k ν}) = F (\cdot; k, ν - k + 1, λ)$, with $λ = n (μ - μ_{0})' Σ^{- 1} (μ - μ_{0})$.

Remark 2. Note that alternatives $\{θ_{j} = q_{j}; 1 \leq j \leq k\}$ of unit lengths give noncentrality parameters $\{κ_{j}; 1 \leq j \leq k\}$. If instead the directed alternatives are $\{θ_{j} = c_{j} q_{j}; 1 \leq j \leq k\}$, then the noncentrality parameters will be $\{c_{j}^{2} κ_{j}; 1 \leq j \leq k\}$.

Remark 3. Note that the foregoing developments are for the general case that $Ω = Q D_{κ} Q'$ is anisotropic with $\{κ_{1} \geq \dots \geq κ_{k} > 0\}$. If isotropic, then the following applies.

Definition 1. The model is called isotropic if and only if $Ξ = d I_{k}$, in which case power functions are directionally invariant, not depending on directions of alternatives in $ℝ^{k}$.

Sphericity

The density for $N_{k} (μ, Σ)$ has spherical contours for the case that $Σ = d I_{k}$, i.e., the model is isotropic. Sample evidence regarding the isotropy of $T^{2}$ is available. Mauchly [3] derived the Likelihood Ratio test for sphericity, namely, $H_{0} : Σ = d I_{k}$ against $H_{1} : Σ \neq d I_{k}$. A contemporary test utilizes the modified statistic

$LR_{M} = - (ν - \frac{2 k^{2} + k + 2}{6 k}) \ln [\frac{k^{k} | S |}{{(\mathrm{tr}\, S)}^{k}}]$ (4)

taking $S$ as the sample dispersion matrix from $n$ observations, rejecting at level $α$ for $LR_{M} > c_{α}$ with $ν = n - 1$ and with $c_{α}$ as the upper percentile of the central distribution $χ^{2} (\cdot; k (k + 1) / 2 - 1, 0)$. See, for example, Rencher [4].
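The statistic (4) is straightforward to evaluate. The sketch below assumes NumPy; the function name `mauchly_statistic` and the example matrices are hypothetical choices for illustration. It returns the statistic together with its degrees of freedom $k (k + 1) / 2 - 1$, leaving the comparison against the chi-squared critical value $c_{α}$ to the caller.

```python
import numpy as np


def mauchly_statistic(S, n):
    """Modified likelihood-ratio statistic (4) for sphericity and its df.

    S is a (k x k) sample dispersion matrix from n observations.
    """
    S = np.asarray(S, dtype=float)
    k = S.shape[0]
    nu = n - 1
    sign, logdet = np.linalg.slogdet(S)
    if sign <= 0:
        raise ValueError("S must be positive definite")
    # ln[ k^k |S| / (tr S)^k ], computed on the log scale for stability.
    log_u = k * np.log(k) + logdet - k * np.log(np.trace(S))
    factor = nu - (2 * k**2 + k + 2) / (6 * k)
    df = k * (k + 1) // 2 - 1
    return -factor * log_u, df


# Spherical case: the statistic vanishes because k^k |S| = (tr S)^k.
stat_iso, df_iso = mauchly_statistic(2.5 * np.eye(3), n=30)

# Anisotropic case (illustrative values): the statistic is strictly positive.
stat_aniso, df_aniso = mauchly_statistic(np.diag([5.0, 1.0, 1.0]), n=30)
```

By the arithmetic-geometric mean inequality on the eigenvalues of $S$, the logarithm is never positive, so the statistic is nonnegative whenever the leading factor is positive.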

Design Reversals

Developments thus far are predicated in part on the desirability to identify alternatives having varying powers of discernment. These include the most likely and least likely as in Theorem 1(ii). If the least likely is deemed to be of greatest interest, it remains to ask whether it might serve instead as the most likely alternative. In the context of designed experiments the answer is affirmative, as the intrinsic structure offers a venue for modifying a given design so as to achieve these ends. Details follow.

Consider the model $Y = [1_{n}, X] [α, β']' + ε$ with $X$ centered such that $1_{n}' X = 0$, where location–scale parameters for $\hat{β}$ are $(β, Ξ)$ with $Ξ = {(X' X)}^{- 1}$ as in Corollary 1(ii). In particular, the test for $H_{0} : β = β_{o}$ against $H_{1} : β \neq β_{o}$ utilizes $F = (\hat{β} - β_{o})' X' X (\hat{β} - β_{o}) / k S^{2}$ with $S^{2}$ as the residual mean square and with noncentrality $λ = (β - β_{o})' X' X (β - β_{o}) / σ^{2}$, where it often suffices to take $σ^{2} = 1.0$. For $X \in F_{n \times k}$ its singular decomposition, followed by that of $X' X$, is

$X = P D_{δ} Q' = \sum_{i = 1}^{k} δ_{i} p_{i} q_{i}', \quad X' X = Q D_{κ} Q'$ (5)

with $P = [p_{1}, \dots, p_{k}]$ as its left–singular vectors, $D_{δ}$ as its singular values, and columns of $Q \in O_{k}$ as its right–singular vectors. Clearly $\{κ_{i} = δ_{i}^{2}; 1 \leq i \leq k\}$. Our principal reconstruction is articulated in the following.

Theorem 2. Let $π (δ)$ be a permutation operator reversing the ordered array $[δ_{1} \geq \dots \geq δ_{k}]$ to $\{δ_{k} \leq \{δ_{2}, \dots, δ_{k - 1}\} \leq δ_{1}\}$, and let $D_{π} = \mathrm{Diag}(δ_{k}, δ_{2}, \dots, δ_{k - 1}, δ_{1})$. Next construct

$X_{π} = P D_{π} Q' = δ_{k} p_{1} q_{1}' + \sum_{i = 2}^{k - 1} δ_{i} p_{i} q_{i}' + δ_{1} p_{k} q_{k}'$

such that pairs $(δ_{k}, q_{1})$ and $(δ_{1}, q_{k})$ are realigned.

Conclusion: The most likely and least likely alternatives for design $X_{π}$ are reversed from those of $X$ , so that $θ_{1} = q_{k}$ now is most likely with power $Ψ (κ_{1})$ depending on $κ_{1}$ , and $θ_{k} = q_{1}$ least likely with power $Ψ (κ_{k})$ depending on $κ_{k}$ .

Proof: Clearly the conventional reordering of eigenvalues gives

$X_{π}' X_{π} = κ_{1} q_{k} q_{k}' + \sum_{i = 2}^{k - 1} κ_{i} q_{i} q_{i}' + κ_{k} q_{1} q_{1}'$ (6)

and the conclusion follows on applying Theorem 1(ii) in the context of Corollary 1(ii).

Remark 4. Variations on $X_{π}$ are apparent. Any permutation of $\{δ_{2}, \dots, δ_{k - 1}\}$ gives the same conclusion. In addition, any pairs $\{(δ_{i}, q_{i}), (δ_{j}, q_{j})\}$ may be selected in like manner as most likely and least likely to be discerned. Note, however, that these tools are available in the case of first–order designs.

Case Studies

Studies in Hotelling [2] $T^{2}$ and second–order response models are given, to illustrate Theorem 1 and Corollary 1. Moreover, an example design serves to illustrate the Theorem 2 reversal of most likely and least likely alternatives.

Hotelling’s $T^{2}$ Tests

We reexamine the role of calcium in the growth of turnip greens, using data as reported in Kramer et al. [5]. In each of 29 experimental plots the plant calcium ($Y_{1}$) was determined, and the soil calcium was assayed as available ($Y_{2}$) and exchangeable ($Y_{3}$) calcium. The units all are milliequivalents per hundred grams. Horticultural specialists expect these to run at about 15.00, 6.00 and 2.85 units, respectively. The sample means are $\bar{Y}' = [17.97, 4.39, 2.46]$, and the sample dispersion matrix is $S$ with inverse $S^{- 1} = Q D_{κ} Q'$ in spectral form, where

$Q = \begin{bmatrix} 0.01024 & 0.22479 & -0.97435 \\ 0.05973 & -0.97280 & -0.22381 \\ -0.99816 & -0.05591 & -0.02339 \end{bmatrix}$

and $D_{κ} = \mathrm{Diag}(5.61102, 0.04857, 0.00416)$. The data are ill–conditioned, with condition number $c_{1} (S) = 1348.80$.

The statistic reported is $T^{2} = n (\bar{Y} - μ_{0})' S^{- 1} (\bar{Y} - μ_{0}) = 24.97$ with $μ_{0}' = [15.00, 6.00, 2.85]$, rejecting at level $α = 0.01$ the hypothesis $H_{0} : μ = μ_{0}$ in favor of some $H_{1} : μ \neq μ_{0}$. Indeed, the $p$–value is $P (T^{2} > 24.97 \mid H_{0}) = 0.000751$ with $C_{α} = 14.980$. On applying Corollary 1(i) with $S$ in lieu of $Σ$, the columns of $Q = [q_{1}, q_{2}, q_{3}]$ are taken as successive alternatives to $(μ - μ_{0})$, namely

$\begin{bmatrix} μ_{1} - 15.00 \\ μ_{2} - 6.00 \\ μ_{3} - 2.85 \end{bmatrix} \in \begin{bmatrix} 0.01024 & 0.22479 & \mathbf{-0.97435} \\ 0.05973 & \mathbf{-0.97280} & -0.22381 \\ \mathbf{-0.99816} & -0.05591 & -0.02339 \end{bmatrix}$

where the dominant terms are in bold type. The noncentrality parameters ${λ_{i} = n κ_{i}; 1 \leq i \leq 3}$ are {162.720, 1.409, 0.121}, and taking $α = 0.05$ and $C_{α} = 9.612,$ powers at these alternatives are

$P (T^{2} > 9.612 \mid λ_{1}) = 1.0000, \quad P (T^{2} > 9.612 \mid λ_{2}) = 0.1316, \quad P (T^{2} > 9.612 \mid λ_{3}) = 0.0562.$

Accordingly, $T^{2}$ has essentially unit power to distinguish the hypothetical deviation $(μ_{3} - 2.85)$ from −0.99816, since the discrepancies 0.01024 for $(μ_{1} - 15.00)$ and 0.05973 for $(μ_{2} - 6.00)$ in $q_{1}$ are negligible. Similarly, $T^{2}$ is marginally able to distinguish [($μ_{1}$ −15.00), ($μ_{2}$ −6.00)] from [0.22479, −0.97280] with power 0.1316, but is virtually unable to separate [($μ_{1}$ −15.00), ($μ_{2}$ −6.00)] from [−0.97435, −0.22381], with negligible power of 0.0562. In short, the latter suggests [14.03, 5.78] to be plausible values for $[μ_{1}, μ_{2}]$.

This is an example, as seen subsequently also, where elements of $Q = [q_{1}, q_{2}, q_{3}]$, especially $q_{1}$, convey useful information in regard to the objectives of the study. In summary, details regarding directed alternatives, enabled here by Theorem 1 and Corollary 1(i), go beyond conventional usage for $T^{2}$.
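The powers in this example can be recomputed through the $F$ conversion of Corollary 1. The following sketch assumes SciPy is installed and uses its `scipy.stats.f` and `scipy.stats.ncf` distributions; the helper name `t2_power` is hypothetical. The eigenvalues are those of $S^{- 1}$ reported above, with $n = 29$ and $k = 3$.

```python
import numpy as np
from scipy import stats


def t2_power(lam, k, nu, alpha=0.05):
    """Critical value and power of Hotelling's T^2 via (nu-k+1) T^2 / (k nu) ~ F."""
    dfn, dfd = k, nu - k + 1
    f_crit = stats.f.ppf(1.0 - alpha, dfn, dfd)
    t2_crit = k * nu / dfd * f_crit          # critical value on the T^2 scale
    power = stats.ncf.sf(f_crit, dfn, dfd, lam)
    return t2_crit, power


n, k = 29, 3
kappa = np.array([5.61102, 0.04857, 0.00416])   # eigenvalues of S^{-1}
lam = n * kappa                                  # noncentralities n * kappa_i
t2_crit, power = t2_power(lam, k, nu=n - 1)
```

The resulting `t2_crit` and `power` may be compared with the reported critical value 9.612 and the powers (1.0000, 0.1316, 0.0562); small differences in the last digit would reflect rounding of the published eigenvalues.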

Hotelling’s $T^{2}$ Charts.

Multivariate diagnostics figure prominently in Statistical Process Control (SPC), as reviewed subsequently. In monitoring the manufacture of bomb sights during World War II, Hotelling [6] devised $T^{2}$ charts for multivariate means in $ℝ^{k}$. Here successive values $\{T_{i}^{2}; i = 1, 2, \dots\}$ are charted against time, where the chart signals the process to be out–of–control at level $α$ whenever $T_{i}^{2} > C_{α}$. Moreover, with power $ψ (λ) = P (T_{i}^{2} > C_{α} \mid λ)$, the Average Run Length (ARL) of time–to–signal is $ARL = 1 / ψ (λ)$. To monitor the mean $μ$ against its target value $μ_{0}$, successive samples of size $n$ yield $(\bar{Y}_{i}, S_{i})$, together with $T_{i}^{2} = n (\bar{Y}_{i} - μ_{0})' S_{i}^{- 1} (\bar{Y}_{i} - μ_{0})$ having the $T_{k}^{2} (\cdot; n - 1, λ)$ distribution with $λ = n (μ - μ_{0})' Σ^{- 1} (μ - μ_{0})$. Phase I in SPC is set to establish baseline process capabilities, to include parameter estimation, followed in Phase II by implementing the control charts themselves.

To continue, consider the data of Quesenberry [7] to be in Phase I, comprising $n = 30$ records of 11 quality characteristics indexed in time–order of production. Following Williams et al. [8], dimensions are reduced on selecting the first $k = 5$ quality characteristics, namely, $[Y_{1}, \dots, Y_{5}]$ having means $[μ_{1}, \dots, μ_{5}]$, respectively. As in Section Hotelling’s $T^{2}$ Tests we take $S$ in lieu of $Σ$, finding the spectral resolution $S^{- 1} = Q D_{κ} Q'$ as reported in Table 1. The data are seen to be highly ill–conditioned, with condition number $c_{1} (S) = 3945.64$. In keeping with Corollary 1(i), five directed alternatives comprise the columns of $Q = [q_{1}, \dots, q_{5}]$ in Table 1, where dominant elements again are in bold type. In particular, this example shows $\{q_{1}, \dots, q_{5}\}$ to be separately informative per se, as each corresponds essentially to deviations in $(μ_{i} - μ_{i 0})$ for observations $\{Y_{1}, Y_{3}, Y_{5}, Y_{2}, Y_{4}\}$, respectively, since values other than those in bold type are negligible.

Eigenvalues
$κ_{1}$	$κ_{2}$	$κ_{3}$	$κ_{4}$	$κ_{5}$
520.4304	11.9529	1.1404	1.0843	0.1319
Eigenvectors
$q_{1}$	$q_{2}$	$q_{3}$	$q_{4}$	$q_{5}$
$\mathbf{0.99893}$	−0.04553	0.00490	0.00543	0.00291
−0.00607	−0.00445	0.14300	$\mathbf{0.98712}$	−0.07126
0.04535	$\mathbf{0.99863}$	0.01937	0.00322	0.01722
−0.00396	−0.01681	−0.02275	0.07515	$\mathbf{0.99676}$
−0.00505	−0.01907	$\mathbf{0.98926}$	−0.14105	0.03287
Power at $α = 0.05$ and $n = 30$
1.0000	1.0000	0.9914	0.9882	0.2367
Power at $α = 0.05$ and $n = 8$
1.0000	1.0000	0.1848	0.1779	0.0645
ARLs at $α = 0.05$ and $n = 8$
1.0000	1.0000	5.41	5.62	15.50

Table 1: Spectral values for $S^{- 1} = Q D_{κ} Q'$ for the data of Quesenberry (2001) of order (30×5).

Taking $\{λ_{i} = n κ_{i}; 1 \leq i \leq 5\}$ with $n = 30$, $α = 0.05$, and critical value $C_{α} = 15.097$, powers against these directed alternatives are listed in Table 1. These show that deviations in the directions of $\{μ_{1}, μ_{2}, μ_{3}, μ_{5}\}$ essentially would be discerned with power at least 0.9882, whereas power in the direction of $μ_{4}$ would be diminished to 0.2367. Note, however, that these values are inflated by the value $n = 30$ in Phase I. Samples much smaller in size ordinarily would be taken in Phase II, say $n = 8$ in this example. Then the powers corresponding to $\{λ_{i} = 8 κ_{i}; 1 \leq i \leq 5\}$ at $α = 0.05$ are given subsequently in Table 1. In short, with $ARL = 1 / ψ (λ)$ as in the final row of Table 1 with samples of size $n = 8$, these charts would detect changes in $μ_{1}$ and $μ_{3}$ immediately on average, but would be less responsive otherwise, with ARLs of (5.41, 5.62, 15.50) for $(μ_{5}, μ_{2}, μ_{4})$, respectively.

Remark 5. This example goes beyond conventional uses of $T^{2}$ charts, in demonstrating that the capacity of a given chart to detect alternatives may differ widely across alternatives in $ℝ^{5}$ of compelling practical interest. In short, each of five ARLs pertains here to an informative one–dimensional alternative.
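The final two rows of Table 1 are linked by $ARL = 1 / ψ (λ)$, which a few lines of plain Python can confirm. The eigenvalues and the $n = 8$ powers below are copied from Table 1; the function names are hypothetical.

```python
# Eigenvalues of S^{-1} and powers at alpha = 0.05, n = 8, copied from Table 1.
kappa = [520.4304, 11.9529, 1.1404, 1.0843, 0.1319]
power_n8 = [1.0000, 1.0000, 0.1848, 0.1779, 0.0645]


def noncentralities(kappa_values, n):
    """Noncentrality parameters lambda_i = n * kappa_i for samples of size n."""
    return [n * value for value in kappa_values]


def average_run_length(power):
    """Average run length to signal, ARL = 1 / power."""
    if not 0.0 < power <= 1.0:
        raise ValueError("power must lie in (0, 1]")
    return 1.0 / power


lam_n8 = noncentralities(kappa, 8)
arl_n8 = [round(average_run_length(p), 2) for p in power_n8]
# arl_n8 -> [1.0, 1.0, 5.41, 5.62, 15.5]
```

Because the ARL is the reciprocal of the power, the weakest direction ($q_{5}$, aligned with $μ_{4}$) has an ARL roughly fifteen times that of the strongest.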

Second–Order Designs

Second–order models of type

$\{Y_{i} = β_{0} + β_{1} X_{1 i} + β_{2} X_{2 i} + β_{11} X_{1 i}^{2} + β_{22} X_{2 i}^{2} + β_{12} X_{1 i} X_{2 i} + ε_{i}; 1 \leq i \leq n\}$ (8)

are considered having zero mean, uncorrelated errors with variance $σ^{2}$ . In a typical setting the yield ( $Y_{i}$ ) of a chemical process is examined at specified reaction time $(X_{1 i})$ and temperature $(X_{2 i})$ . Small designs of historical consequence are the Central Composite (CCD) designs of Box et al. [9], having design points as listed in Table 2.

$X'$	-1.00	1.00	$- α$	0.00	$α$	0.00	$- α$	0.00	$α$	1.00	-1.00	1.00	0.00

Table 2: Regressor vectors for the CCD design $X'$ of order (2×9), where $α = \sqrt{2}$ .

Proceeding as in Corollary 1(ii), we seek spectral values for the Fisher Information Matrix $X' X$, specifically, the eigenvalues and eigenvectors as listed in Table 3, with dominant terms again in bold type. Powers at these directed alternatives follow on taking $\{κ_{i}; 1 \leq i \leq 6\}$ as surrogates in the noncentrality parameters for $F = (\hat{β} - β_{o})' X' X (\hat{β} - β_{o}) / k S^{2}$ with $k = 6$ and $S^{2}$ as the residual mean square. Owing to only two degrees of freedom for error, computations replaced $S^{2}$ by $σ^{2} = 1.0$; then powers were determined from scaled noncentral $χ^{2} (6, κ_{i})$ distributions with noncentrality parameters as listed in Table 3.
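With $σ^{2}$ taken as known, the power calculation reduces to an upper tail of a noncentral chi-squared distribution with 6 degrees of freedom. The sketch below uses only the Python standard library: it writes the noncentral tail as a Poisson mixture of central tails, which have a closed form for even degrees of freedom. The function names and the series length are assumptions of this sketch, and 12.5916 is the standard tabled upper 5% point of the central chi-squared distribution with 6 degrees of freedom.

```python
import math


def chi2_sf_even(x, df):
    """Upper tail P(chi2_df > x) of a central chi-squared with even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        if j > 0:
            term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total


def ncx2_power(lam, df, crit, terms=200):
    """P(chi2(df, lam) > crit) as a Poisson(lam/2) mixture of central tails."""
    mu = lam / 2.0
    weight = math.exp(-mu)            # Poisson weight at i = 0
    power = 0.0
    for i in range(terms):
        if i > 0:
            weight *= mu / i
        power += weight * chi2_sf_even(crit, df + 2 * i)
    return power


CRIT_05_DF6 = 12.5916                 # tabled upper 5% point, 6 df
kappa = [24.3427, 8.0, 8.0, 8.0, 4.0, 0.6573]
powers = [ncx2_power(lam, 6, CRIT_05_DF6) for lam in kappa]
```

At $λ = 0$ the function returns the nominal level 0.05, and at $λ = 8$ it gives approximately 0.53, in line with the power reported for the repeated eigenvalue.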

Eigenvalues
$κ_{1}$	$κ_{2}$	$κ_{3}$	$κ_{4}$	$κ_{5}$	$κ_{6}$
24.3427	8.0000	8.0000	8.0000	4.0000	0.6573
Eigenvectors
$q_{1}$	$q_{2}$	$q_{3}$	$q_{4}$	$q_{5}$	$q_{6}$
0.59349	0.00000	0.00000	0.00000	0.00000	0.80484
0.00000	0.00000	1.00000	0.00000	0.00000	0.00000
0.00000	0.00000	0.00000	-1.00000	0.00000	0.00000
0.56911	0.70711	0.00000	0.00000	0.00000	-0.41966
0.56911	-0.70711	0.00000	0.00000	0.00000	-0.41966
0.00000	0.00000	0.00000	0.00000	1.00000	0.00000
Power at $α = 0.05$
0.9765	0.5307	0.5307	0.5307	0.2698	0.0775

Table 3: Spectral values for the Fisher Information Matrix $X' X$ for the CCD design.

Arranged in decreasing order of their powers, these are $q_{1}$ with power 0.9765; ${q_{2}, q_{3}, q_{4}}$ each with power 0.5307; and $q_{5}$ and $q_{6}$ as alternatives with powers 0.2698 and 0.0775. Here elements of $Q$ are separately informative: $q_{2}$ for discrepancies between $(β_{11}, β_{22})$ and their hypothetical values; and $(q_{3}, q_{4}, q_{5})$ for discrepancies between $(β_{1}, β_{2}, β_{12})$ and their hypothetical values, respectively. To continue, observe that the eigenvalue $κ_{2} = 8.0000$ is repeated three times. On applying Theorem 1(iv) in the context of Corollary 1(ii), we see that all standardized elements in $S_{p} (q_{2}, q_{3}, q_{4})$ have power 0.5307. These include, in addition to $q_{3}$ for $β_{1}$ and $q_{4}$ for $β_{2}$ , the standardized sums

$θ_{1} = (q_{3} + q_{4}) / \sqrt{2} = [0, 1, - 1, 0, 0, 0]^{'} / \sqrt{2}$ (9)

$θ_{2} = (q_{2} + q_{3} + q_{4}) / \sqrt{3} = [0, 1, - 1, 0.7071, - 0.7071, 0]' / \sqrt{3}$ (10)

for example, with $θ_{1}$ for the discrepancy between $[(β_{1} - β_{10}), (β_{2} - β_{20})]$ and $[1, - 1] / \sqrt{2}$ .

In summary, the directed second–order alternatives treated here are innovations not found in classical linear inference. Instead, these are enabled by Theorem 1 and Corollary 1(ii). Again the elements of $Q$, especially $\{q_{2}, \dots, q_{5}\}$, are separately informative about coefficients of the model (8). Their simple and revealing structure may be attributed to the symmetry and balance of CCD designs.
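The spectral values of Table 3 and the common noncentrality of the alternatives (9) and (10) can be reproduced as follows. This sketch assumes NumPy and, importantly, assumes the standard two-factor CCD layout of four factorial runs, four axial runs at distance $α = \sqrt{2}$, and one center run; the run order is arbitrary and does not affect $X' X$. Model columns are ordered as $[1, X_{1}, X_{2}, X_{1}^{2}, X_{2}^{2}, X_{1} X_{2}]$.

```python
import numpy as np

a = np.sqrt(2.0)
# Assumed standard CCD in two factors: factorial, axial, and center runs.
points = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
                   [-a, 0], [a, 0], [0, -a], [0, a],
                   [0, 0]], dtype=float)
x1, x2 = points[:, 0], points[:, 1]

# Second-order model matrix with columns [1, x1, x2, x1^2, x2^2, x1*x2].
M = np.column_stack([np.ones(len(points)), x1, x2, x1**2, x2**2, x1 * x2])
info = M.T @ M                                  # Fisher information X'X

kappa = np.sort(np.linalg.eigvalsh(info))[::-1]

# Standardized alternatives (9) and (10) in the eigenspace of kappa = 8.
r = 1.0 / np.sqrt(2.0)
theta1 = np.array([0, 1, -1, 0, 0, 0]) / np.sqrt(2.0)
theta2 = np.array([0, 1, -1, r, -r, 0]) / np.sqrt(3.0)
lam1 = theta1 @ info @ theta1
lam2 = theta2 @ info @ theta2
```

Both `lam1` and `lam2` equal 8, as Theorem 1(iv) requires for unit vectors in the span of $(q_{2}, q_{3}, q_{4})$, so each inherits the power 0.5307.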

Design Reversals

Begin with $Y = [1_{n}, X] [β_{0}, β']' + ε$ with $X_{0} = [1_{n}, X]$ and $X$ in centered form as in Section 3.2, with $β' = [β_{1}, β_{2}, β_{3}]$, having the design $X = P \mathrm{Diag}(δ_{1}, δ_{2}, δ_{3}) Q'$ as listed in Table 4. Construct $X_{1} = P \mathrm{Diag}(δ_{3}, δ_{2}, δ_{1}) Q'$ on permuting singular values, but retaining the left and right singular vectors.

Design $X' = [P D i a g (δ_{1}, δ_{2}, δ_{3}) Q']'$
$X'$	-1.0000 1.0000 -1.4142 1.4142 -1.0000 1.0000 0.0000 0.0000
	0.0000 0.0000 1.0000 1.0000 0.0000 0.0000 -1.0000 -1.0000
	-0.8000 0.6000 -0.8000 2.0000 -0.8000 0.6000 -0.8000 0.0000
Design $X_{1}^{'} = [P D i a g (δ_{3}, δ_{2}, δ_{1}) Q']'$
$X_{1}^{'}$	0.4279 -1.0391 0.2809 0.7126 0.4279 -1.0391 -1.1080 1.3370
	-0.0193 -0.3197 -1.3211 -0.3556 -0.0193 -0.3197 0.4993 1.8556
	-0.1811 0.8999 0.2484 -1.1704 -0.1811 0.8999 1.1798 -1.6954
Left–Singular Vectors $P'$
$P'$	-0.3308 0.2946 -0.3736 0.6594 -0.3308 0.2946 -0.1792 -0.0342
	0.0960 -0.1138 0.6024 0.3785 0.0960 -0.1138 -0.5082 -0.4372
	0.0996 -0.3618 -0.1302 0.2927 0.0996 -0.3618 -0.3435 0.7055
Right–Singular Vectors $[θ_{1}, θ_{2}, θ_{3}]$
$Q$	-0.7099 -0.1307 -0.6921
	0.3508 -0.9177 -0.1865
	0.6108 0.3752 -0.6973
Detection Probabilities: Design $X$
	0.4947 0.1828 0.0577

Table 4: Design matrix $X = P D_{δ} Q'$ and the modified $X_{1} = P \mathrm{Diag}(δ_{3}, δ_{2}, δ_{1}) Q'$; the left ($P'$) and right ($Q$) singular vectors; and the singular values $δ' = [3.8198, 2.0992, 0.5318]$.

To continue, take the right–singular vectors, now $Q = [θ_{1}, θ_{2}, θ_{3}]$, as directed alternatives to $(β - β_{o})' = [(β_{1} - β_{10}), (β_{2} - β_{20}), (β_{3} - β_{30})]$, together with $κ' = [14.5909, 4.4066, 0.2828]$ as $\{κ_{i} = δ_{i}^{2}; 1 \leq i \leq 3\}$. Here the level 0.05 critical value is 6.5914; then powers are determined in turn from $F (\cdot; 3, 4, κ_{i})$ together with the power functions $\{ψ (κ_{i}) = P (F > 6.5914 \mid κ_{i}); 1 \leq i \leq 3\}$ having values listed in the final row of Table 4.

In particular, discovering alternatives $(β - β_{o})^{'}$ in the direction $θ_{3} = - [0.6921, 0.1865, 0.6973]$ is seen to be unlikely, with power 0.0577. On the other hand, suppose instead that it is critical in context to discover alternatives in the negative orthant of $ℝ^{3}$ . Then the design $X_{1}$ serves to reverse these so that alternatives in the directions of ${θ_{3}, θ_{2}, θ_{1}}$ are now detected with probabilities $[0.4947, 0.1828, 0.0577]$ , respectively.
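The reversal construction can be traced numerically from the design points alone. The sketch below assumes NumPy; the array `Xt` is transcribed from the rows of $X'$ in Table 4. Since the signs of singular vectors returned by a numerical SVD are arbitrary, the code checks sign-free properties only: the singular values, the equality of the eigenvalues of $X' X$ and $X_{1}' X_{1}$, and the alignment of the leading eigenvector of $X_{1}' X_{1}$ with the trailing right–singular vector of $X$.

```python
import numpy as np

# Centered design X of order (8 x 3); rows of Xt are the rows of X' in Table 4.
Xt = np.array([
    [-1.0, 1.0, -1.4142, 1.4142, -1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, -1.0, -1.0],
    [-0.8, 0.6, -0.8, 2.0, -0.8, 0.6, -0.8, 0.0],
])
X = Xt.T

# Thin SVD: X = P Diag(delta) Q', with delta returned in decreasing order.
P, delta, Qt = np.linalg.svd(X, full_matrices=False)

# Swap the largest and smallest singular values; keep P and Q fixed.
delta_rev = delta.copy()
delta_rev[[0, -1]] = delta[[-1, 0]]
X1 = P @ np.diag(delta_rev) @ Qt

# Eigenvalues of the two information matrices, in decreasing order.
kappa = np.linalg.eigvalsh(X.T @ X)[::-1]
kappa1, V1 = np.linalg.eigh(X1.T @ X1)
kappa1, V1 = kappa1[::-1], V1[:, ::-1]

# |cosine| between the leading eigenvector of X1'X1 and the trailing
# right-singular vector of X; a value of 1 confirms the reversal.
alignment = abs(V1[:, 0] @ Qt[-1])
```

The reversed design remains centered, since the columns of $P$ are orthogonal to $1_{n}$ whenever $X$ is centered and of full rank.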

Further properties of the design $X$ and its reversal $X_{1}$ deserve mention. Observe that $X_{0}' X_{0} = \mathrm{Diag}(n, X' X)$, whereas $X' X = Q \mathrm{Diag}(κ_{1}, κ_{2}, κ_{3}) Q'$ and $X_{1}' X_{1} = Q_{R} \mathrm{Diag}(κ_{1}, κ_{2}, κ_{3}) Q_{R}'$, where now $Q_{R} = [q_{3}, q_{2}, q_{1}]$ following the convention that $\{κ_{1} \geq κ_{2} \geq κ_{3}\}$ remain ordered. Clearly $V (\hat{β} \mid X) = σ^{2} {(X' X)}^{- 1}$ and $V (\hat{β} \mid X_{1}) = σ^{2} {(X_{1}' X_{1})}^{- 1}$ differ, where their diagonal elements are listed as variances in Table 5. However, their eigenvalues are identical by construction, as are their $\{A, D, E\}$ efficiency indices as the trace, determinant, and largest eigenvalue of ${(X_{0}' X_{0})}^{- 1}$ under both designs $X$ and $X_{1}$.

Design Characteristics
Estimates	Variances ($X$)	Variances ($X_{1}$)	Eigenvalues ($X$)	Eigenvalues ($X_{1}$)
$\hat{β}_{0}$	0.12500	0.12500	3.53636	3.53636
$\hat{β}_{1}$	1.38170	1.83546	0.22694	0.22694
$\hat{β}_{2}$	0.69002	0.26120	0.12500	0.12500
$\hat{β}_{3}$	1.76012	1.73518	0.06854	0.06854
Diagnostic	A	D	E
$Σ_{0}, Ξ_{0}$	3.95684	0.00688	3.53636

Table 5: Variances of OLS solutions; eigenvalues of the dispersion matrix $Γ \in \{Σ_{0}, Ξ_{0}\}$ as ${(X_{0}' X_{0})}^{- 1}$ for the designs $\{X, X_{1}\}$; and $A$, $D$, and $E$ efficiencies for these designs.
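The three indices in Table 5 depend on the design only through the eigenvalues of ${(X_{0}' X_{0})}^{- 1}$, which is why they coincide for $X$ and $X_{1}$. A short standard-library sketch, using the eigenvalues as printed in Table 5 and a hypothetical helper name, recovers them.

```python
import math


def design_indices(eigenvalues):
    """A, D, E indices as trace, determinant, and largest eigenvalue."""
    return sum(eigenvalues), math.prod(eigenvalues), max(eigenvalues)


# Eigenvalues of (X0'X0)^{-1}, identical for designs X and X1 (Table 5).
eigs = [3.53636, 0.22694, 0.12500, 0.06854]
A_index, D_index, E_index = design_indices(eigs)
```

Rounded to five decimals, these give 3.95684, 0.00688, and 3.53636, matching the final row of Table 5.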

Summary and Discussion

This study reexamines the concept of directional invariance, or isotropy, for distributions on $ℝ^{k}$ having location–scale parameters $(δ, Ξ)$. Powers of tests for $H_{0} : δ = δ_{0}$ against $H_{1} : δ \neq δ_{0}$ often depend on noncentrality parameters of type $λ = (δ - δ_{0})' Ξ^{- 1} (δ - δ_{0})$. The spectral decomposition of $Ξ$ supports the identification of directed alternatives in directions determined by the eigenvectors of $Ξ$, to encompass the alternatives most likely and least likely in a given study. Powers of these types are independent of direction if and only if $Ξ = σ^{2} I_{k}$. Applications are drawn in the use of Hotelling [2] $T^{2}$ in multivariate samples, and of $F$–tests in linear models. Case studies are given where this approach leads to the discovery of further insight regarding the natural parameters of a problem.

One concept of directional invariance figures prominently in the SPC literature. The following is excerpted from Linna et al. [10].

“It is well known that the ARL performance of multivariate SPC procedures depends heavily on the covariance structure of the observed data. See, for example, Mason et al. [11]. Further, it has been noted by Pignatiello et al. [12] that many multivariate procedures, including the $χ^{2}$ chart, Hotellings $T^{2}$ chart, and most of the multivariate CUSUM charts, are directionally invariant. The performance of the multivariate EWMA chart proposed in Lowry et al. [13] is also directionally invariant. Lowry et al. [14] and others also note the directional invariance of many of these multivariate control charting methods. Directional invariance means that the performance of a procedure does not depend on the specific direction in p-space of a shift in the mean vector of the process variables being monitored. Instead, performance of a directionally invariant procedure depends only on the statistical (or Mahalanobis) distance between the in-control mean vector $μ_{0}$ and the out-of-control mean vector under consideration, $μ_{1}$ .” (Italics supplied.) That is, $D_{Ξ}^{2} (μ_{0}, μ_{1}) = (μ_{0} - μ_{1})^{'} Ξ^{- 1} (μ_{0} - μ_{1})$ .

Unfortunately, this notion of directional invariance is grossly misleading, is antithetical to the very concept of invariance as in our Definition 1, is at best a misnomer, and in any event deserves to be clarified in the SPC literature. In fact, unless the model is isotropic, such essentials as power functions and ARLs do depend on the directions of alternatives, as the counterexamples of Section 4.2 show.
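A small numerical illustration may help fix the point. The Python sketch below considers a $χ^{2}$ test on $k = 2$ degrees of freedom with known dispersion $Ξ = \mathrm{Diag}(4, 1)$, a hypothetical choice not taken from this study, and compares two shifts of the same Euclidean length along the two eigenvectors of $Ξ$. Power is computed from the Poisson-mixture form of the noncentral $χ^{2}$ distribution, using the closed-form tail probability available for even degrees of freedom; the function names are ours.

```python
import math

def noncentrality(shift, xi):
    """lambda = shift' Xi^{-1} shift for a diagonal Xi with diagonal entries xi."""
    return sum(d * d / v for d, v in zip(shift, xi))

def power_chi2_2df(lam, alpha=0.05, terms=100):
    """Power of the level-alpha chi-square test on 2 df at noncentrality lam.

    Poisson mixture: sum over j of Pois(j; lam/2) * P(chi2_{2j+2} > crit),
    with P(chi2_{2m} > x) = exp(-x/2) * sum_{i<m} (x/2)^i / i!.
    The series is truncated after `terms` terms.
    """
    crit = -2.0 * math.log(alpha)      # upper-alpha point of chi-square(2)
    half = crit / 2.0
    weight = math.exp(-lam / 2.0)      # Poisson weight at j = 0
    term = 1.0                         # (crit/2)^j / j!
    partial = 0.0                      # running sum of term over i <= j
    power = 0.0
    for j in range(terms):
        partial += term
        power += weight * math.exp(-half) * partial
        weight *= (lam / 2.0) / (j + 1)
        term *= half / (j + 1)
    return power

xi = (4.0, 1.0)            # eigenvalues of the hypothetical Xi = Diag(4, 1)
shift_a = (2.0, 0.0)       # length 2 along the eigenvector with eigenvalue 4
shift_b = (0.0, 2.0)       # length 2 along the eigenvector with eigenvalue 1

lam_a = noncentrality(shift_a, xi)
lam_b = noncentrality(shift_b, xi)
power_a = power_chi2_2df(lam_a)
power_b = power_chi2_2df(lam_b)

print(f"lambda: {lam_a:.2f} vs {lam_b:.2f}")
print(f"power : {power_a:.4f} vs {power_b:.4f}")
```

The two shifts have equal length yet unequal noncentrality and hence unequal power. The power is the same only for shifts at equal Mahalanobis distance, which in this anisotropic example requires unequal Euclidean lengths.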

On the other hand, a disclaimer of Tsui et al. [15] should be noted: “There is no reason in practice, however, for a shift to $μ = μ^{*}$ to be always considered as important as a shift to $μ = μ^{**}$ just because the corresponding values of the noncentrality parameters are equal.” Our study represents a substantial elaboration on this point.

Appendix A

Rayleigh Quotients.

At issue are variational properties of quadratic forms of type $Q (υ) = υ^{'} Ω υ / υ^{'} υ$, known as Rayleigh quotients; see Bellman [16]. Write $Ω = \sum_{i = 1}^{k} κ_{i} q_{i} q_{i}^{'} = Q D_{κ} Q^{'}$ in its spectral form with $Q \in O_{k}$ and $D_{κ} = \mathrm{Diag}(κ_{1}, \dots, κ_{k})$, so that $\{q_{i}^{'} Ω q_{i} = κ_{i}; 1 \leq i \leq k\}$ . Further partition $Q = [Q_{1}, Q_{2}, Q_{3}]$ into blocks of orders $\{(k \times (r - 1)), (k \times (s + 1)), (k \times d)\}$ with $d = k - r - s$, and partition $D_{κ} = \mathrm{Diag}(D_{κ_{1}}, D_{κ_{2}}, D_{κ_{3}})$ conformably. Then

$Ω = Q_{1} D_{κ_{1}} {Q^{'}}_{1} + Q_{2} D_{κ_{2}} {Q^{'}}_{2} + Q_{3} D_{κ_{3}} {Q^{'}}_{3}$ (11)

where, in particular, $Q_{2} = [q_{r}, \dots, q_{r + s}]$ and $D_{κ_{2}} = \mathrm{Diag}(κ_{r}, \dots, κ_{r + s})$ . Essentials follow.

Lemma A.1 Consider the positive definite form $υ^{'} Ω υ / υ^{'} υ$ with $Ω$ as in expression (11).

(i) Variational properties of $Q (υ)$ as $υ$ varies over $ℝ^{k}$ are

$κ_{k} \leq \frac{υ^{'} Ω υ}{υ^{'} υ} \leq κ_{1}$ (12)

where the lower and upper limits are attained at $υ = q_{k}$ and $υ = q_{1}$ , respectively.

(ii) Variational properties of $Q (υ)$ as $υ$ varies over $S_{p} (Q_{2})$ are

$κ_{r + s} \leq \frac{υ^{'} Ω υ}{υ^{'} υ} \leq κ_{r}$ (13)

where the lower and upper limits are attained at $υ = q_{r + s}$ and $υ = q_{r}$ , respectively.

Proof: Conclusion (i) is given in Bellman [16], where the limits are attained as given since $κ_{k} = q_{k}^{'} Ω q_{k}$ and $κ_{1} = q_{1}^{'} Ω q_{1}$ . To see conclusion (ii), $υ \in S_{p} (Q_{2})$ may be represented as $υ = Q_{2} a$ with $a^{'} = [a_{r}, \dots, a_{r + s}]$ . Then for $υ \in S_{p} (Q_{2})$ , expression (11) gives $υ^{'} Ω υ = a^{'} Q_{2}^{'} Q_{2} D_{κ_{2}} Q_{2}^{'} Q_{2} a = \sum_{i = r}^{r + s} κ_{i} a_{i}^{2}$ , while $υ^{'} υ = a^{'} a = \sum_{i = r}^{r + s} a_{i}^{2}$ because $Q_{2}^{'} Q_{2}$ is the identity. The lower and upper limits for $υ \in S_{p} (Q_{2})$ follow as in conclusion (i) for $υ \in ℝ^{k}$ , except that now these are attained as given since $q_{r + s}^{'} Ω q_{r + s} = κ_{r + s}$ and $q_{r}^{'} Ω q_{r} = κ_{r}$ .
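Both conclusions of Lemma A.1 can be checked numerically. The Python sketch below draws a random positive definite $Ω$ and samples directions in $ℝ^{k}$ and in $S_{p} (Q_{2})$; the sizes $k = 6$, $r = 2$, $s = 2$, the random matrix, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
k, r, s = 6, 2, 2        # illustrative sizes; Q2 = [q_r, ..., q_{r+s}] (1-based)

# Random symmetric positive definite Omega.
B = rng.normal(size=(k, k))
Omega = B @ B.T + k * np.eye(k)

# Spectral form with kappa_1 >= ... >= kappa_k.
kappa, Q = np.linalg.eigh(Omega)     # eigh returns eigenvalues in ascending order
kappa, Q = kappa[::-1], Q[:, ::-1]

def rayleigh(v):
    """Rayleigh quotient v' Omega v / v'v."""
    return (v @ Omega @ v) / (v @ v)

# (i) v ranging over R^k.
samples = rng.normal(size=(1000, k))
q_all = np.array([rayleigh(v) for v in samples])

# (ii) v = Q2 a ranging over Sp(Q2).
Q2 = Q[:, r - 1 : r + s]             # columns q_r, ..., q_{r+s}, s + 1 in all
coefs = rng.normal(size=(1000, s + 1))
q_sub = np.array([rayleigh(Q2 @ a) for a in coefs])

tol = 1e-9
print("R^k    :", kappa[-1], "<=", q_all.min(), "...", q_all.max(), "<=", kappa[0])
print("Sp(Q2) :", kappa[r + s - 1], "<=", q_sub.min(), "...", q_sub.max(), "<=", kappa[r - 1])
```

The sampled quotients stay within the limits (12) and (13) up to rounding, and evaluating `rayleigh` at the corresponding columns of `Q` returns the limits themselves.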

Acknowledgement

The author is indebted to Professor Donald E. Ramirez for substantial contributions, including computations using the MINITAB and MAPLE software packages.