Research Article Volume 6 Issue 3
On directed alternatives in linear inference
Donald Jensen
Department of Statistics, Virginia Tech, USA
Correspondence: Donald Jensen, Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA
Received: October 25, 2016 | Published: October 5, 2017
Citation: Jensen D. On directed alternatives in linear inference. Biom Biostat Int J. 2017;6(3):364-371. DOI: 10.15406/bbij.2017.06.00171
Abstract
Tests for vector hypotheses $H_0: \theta = \theta_0$ against $H_1: \theta \neq \theta_0$ in $\mathbb{R}^k$ typically have powers depending on quadratic forms of type $(\theta-\theta_0)'\Sigma^{-1}(\theta-\theta_0)$. This study examines the case that $\theta-\theta_0$ is restricted to subspaces, for example, $(\theta, \theta_0)$ differing only in their first two coordinates. These are called directed alternatives. The spectral decomposition of $\Sigma^{-1}$ supports the identification of one–dimensional alternatives least likely and most likely to be discerned, to complement conventional data analysis. Applications are drawn in the use of Hotelling’s $T^2$ and of $F$–tests in linear inference. Moreover, it is seen that a given design may be recast so as to reverse the least likely and most likely alternatives. Numerical examples serve to illustrate the findings.
Keywords: Linear models; $F$ tests; Hotelling's $T^2$ tests; Directed alternatives; Reversal designs
Introduction
Power in statistical inference is driven by non–null distributions. For observations in $\mathbb{R}^k$ having dispersion matrix $\Sigma$, noncentrality parameters often emerge as the Mahalanobis [1] distance between points $(\theta, \theta_0)$ in $\mathbb{R}^k$, namely,
$$\lambda = (\theta-\theta_0)'\Sigma^{-1}(\theta-\theta_0).$$ (1)
This specializes to the Euclidean metric for the case that $\Sigma = I_k$, in which case the model is called isotropic. In particular, nonparametric and other statistics often have noncentral chi–squared distributions, either in small samples or asymptotically. In addition, pervasive venues in parametric inference, to be reexamined in some detail, include the following.
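Case (i). Hotelling [2] $T^2$ Test: $T^2 = n(\bar{Y}-\mu_0)'S^{-1}(\bar{Y}-\mu_0)$, where $(\bar{Y}, S)$ are the sample mean and dispersion matrix of $n$ Gaussian vectors in $\mathbb{R}^k$ having the location–scale parameters $(\mu, \Sigma)$. Then in testing $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$, the power function is $P(T^2 > c_\alpha)$ with noncentrality $\lambda = n(\mu-\mu_0)'\Sigma^{-1}(\mu-\mu_0)$.
Case (ii). The General Linear Model: $Y = X\beta + \varepsilon$ with Gaussian errors having zero means and dispersion matrix $\sigma^2 I_n$. Then in testing $H_0: \beta = \beta_0$ against $H_1: \beta \neq \beta_0$ in $\mathbb{R}^k$, the power function is $P(F > c_\alpha)$ with noncentrality $\lambda = (\beta-\beta_0)'X'X(\beta-\beta_0)/\sigma^2$.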
Classical theory allows for any
for Case (i), and
for Case (ii). On the other hand, alternatives lying in designated subspaces may hold substantive interest per se. For example, taking
allows for discrepancies between
and
in their first two coordinates only, whereas
allows for deviations along the equiangular line in
. Both are one–dimensional; subspaces of dimension greater than one are considered subsequently. Alternatives lying in designated subspaces of
are called directed alternatives, and the goal here is to study powers of tests against alternatives of these types.
The present study expands on this as follows. Not only do distinct alternatives differ in importance to users, but so too their probabilities of detection. Here the spectral decomposition of
, if anisotropic, supports the identification of alternatives least likely and most likely to be discovered, as well as intermediate cases. These serve to bracket the effective range of inferences intrinsic to a given study, and thereby complement conventional options in data analysis. Applications are drawn in the use of Hotelling’s
in multivariate samples, and of F–tests in the analysis of linear models. Moreover, it is shown that a given design may be modified so as to reverse the least likely and most likely alternatives, in the event that this would better serve the objectives of an experiment.
This study is organized as follows. Supporting developments are given next in Section 2, followed by the principal findings of Section 3. Several examples in Section 4 illustrate the essential results. Collateral materials are deferred for completeness to an Appendix.
Preliminaries
Notation
Spaces include
as Euclidean
-space;
as its positive orthant;
as the real symmetric
matrices;
as their positive definite varieties;
as the real
matrices of rank
; and
as the
orthogonal group. Vectors and matrices are set in bold type; the transpose, inverse, trace, and determinant of
are
,
,
, and
; the unit vector in
is
;
is the
identity; and
is a block-diagonal array. If
is of order
and rank
, then
designates the column span of
, i.e., the
–dimensional subspace of
spanned by
. The ordered eigenvalues of
are
with
, and its spectral decomposition is
, where
and
. By convention its condition number is
. The singular decomposition of
is
, where the mutually orthogonal columns of
comprise the left–singular vectors;
are its singular values; and columns of
are the right–singular vectors.
Special Distributions
For
, its distribution, mean, and dispersion matrix are L(Y),
and
, say, with variance
on
. Specifically, L(Y)
is Gaussian on
with parameters
. Distributions on
include the
with
degrees of freedom and noncentrality parameter
; the Snedecor–Fisher
with degrees of freedom
and noncentrality
; and Hotelling [2]
of order
having
degrees of freedom and noncentrality
. Recall that
increases stochastically with increasing
with other parameters held fixed. Identify
in context as the upper
–level rejection rule. The power of a test, to be considered as a function of
, is designated by
.
The Principal Findings
Directed alternatives
Our notation encompasses both (i) Hotelling [2]
and (ii) General Linear Models, having location–scale parameters
. What distinguishes this study are directed alternatives with examples as noted, but expanded to include alternatives
aligned with the orthonormal eigenvectors
of
, thus standardized to unit lengths. To continue, as
assumes a central role, take
as its spectral decomposition, with
. As in Appendix A.1, undertake the expansions
(2)
, (3)
where elements of
are of orders
with
, and where
is partitioned conformably. In regard to quadratic forms of type
serving as noncentrality parameters, a principal result is the following.
Theorem 1. Given is a location–scale model with parameters
, together with a test for
against
having power
increasing monotonically with
. Take
in succession as the eigenvectors
of
with eigen values
.
- Then powers
of the test at alternatives
depend on the noncentrality parameters
, respectively.
- In particular, the alternatives most likely and least likely to be discerned in terms of power are
and
having powers
and
, respectively.
- Consider alternatives
standardized to unit lengths. Then bounds on powers at these local alternatives are given by
- Suppose that
is repeated s times as in the spectral resolution (2) for
. Then for each alternative
, the noncentrality parameter is
, with corresponding power
.
Proof: Conclusion (i) follows directly from
since
, whereas
and
by orthonormality. Conclusion (ii) follows directly from variational properties of Rayleigh quotients as in Lemma A.1(i) of the Appendix. In like manner conclusion (iii) follows from Lemma A.1(ii) as variational properties over subspaces. Conclusion (iv) follows from (iii) since
Remark 1. The directed alternatives
and
were featured earlier as discrepancies in the first two coordinates of
, and as deviations about the equiangular line in
. Let
. Then powers
at these alternatives will depend on
at
, and on
at
.
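The dependence described in Remark 1 can be checked numerically. The following is a minimal sketch, assuming a hypothetical 4×4 positive definite matrix `omega` standing in for the inverse dispersion matrix (its entries are illustrative and not taken from this study). For the direction [1, 1, 0, ..., 0]' the quadratic form reduces to the sum of the two leading diagonal elements plus twice their off-diagonal element, and for the equiangular direction it reduces to the sum of all elements.

```python
import numpy as np

def quadratic_form(omega, u):
    """Return u' omega u, the quadratic form that drives the noncentrality."""
    u = np.asarray(u, dtype=float)
    return float(u @ omega @ u)

# Hypothetical positive definite matrix (illustrative values only).
omega = np.array([
    [4.0, 1.0, 0.5, 0.0],
    [1.0, 3.0, 0.2, 0.1],
    [0.5, 0.2, 2.0, 0.3],
    [0.0, 0.1, 0.3, 1.0],
])
k = omega.shape[0]

first_two = np.zeros(k)
first_two[:2] = 1.0            # direction [1, 1, 0, ..., 0]'
equiangular = np.ones(k)       # direction along the equiangular line

qf_first_two = quadratic_form(omega, first_two)      # omega11 + 2*omega12 + omega22
qf_equiangular = quadratic_form(omega, equiangular)  # sum of all elements
print(qf_first_two, qf_equiangular)
```

Rescaling either direction by a constant c multiplies the quadratic form by c squared, which is the scaling noted for directed alternatives of other than unit length.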
Corollary 1. On specializing the location–scale parameters
, Theorem 1 applies verbatim as follows.
(i) Hotelling [2]
, the power
depending on
.
(ii) General Linear Models:
, the power
depending on
,
Proof: The noncentral distribution
clearly satisfies the assumptions of Theorem 1 on identifying
as claimed. Similarly in testing
against
, Hotelling’s
inherits these properties through the conversion
. With
Remark 2. Note that alternatives
of unit lengths give noncentrality parameters
. If instead the directed alternatives are
, then the noncentrality parameters will be
.
Remark 3 Note that the foregoing developments are for the general case that
is anisotropic with
. If isotropic, then the following applies.
Definition 1. The model
is called isotropic if and only if
, in which case power functions are directionally invariant, not depending on directions of alternatives in $\mathbb{R}^k$.
Sphericity
The density for
has spherical contours for the case that
, i.e., the model is isotropic. Sample evidence regarding the isotropy of
is available. Mauchly [3] derived the Likelihood Ratio test for sphericity, namely,
against
. A contemporary test utilizes the modified statistic
(4)
taking
as the sample dispersion matrix from n observations, rejecting at level α for
with
and with
as the upper percentile of the central distribution
. See, for example, Rencher [4].
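A sphericity check of this kind can be sketched as follows. The sketch assumes one familiar textbook form of the modified statistic: $u = k^k |S| / (\mathrm{tr}\, S)^k$ and $u' = -\{\nu - (2k^2+k+2)/(6k)\} \ln u$ with $\nu = n-1$, referred to a central chi–squared distribution with $k(k+1)/2 - 1$ degrees of freedom. Whether this matches expression (4) term by term should be confirmed against Rencher [4]. The function name and the example matrices are hypothetical.

```python
import numpy as np

def sphericity_statistic(S, n):
    """Return (u, u_prime, df) for a sample dispersion matrix S from n observations.

    Assumed form: u = k^k |S| / (tr S)^k, u' = -(nu - (2k^2 + k + 2)/(6k)) ln u,
    with nu = n - 1 and df = k(k+1)/2 - 1.
    """
    S = np.asarray(S, dtype=float)
    k = S.shape[0]
    sign, logdet = np.linalg.slogdet(S)
    if sign <= 0:
        raise ValueError("S must be positive definite")
    log_u = k * np.log(k) + logdet - k * np.log(np.trace(S))
    nu = n - 1
    u_prime = -(nu - (2 * k**2 + k + 2) / (6 * k)) * log_u
    df = k * (k + 1) // 2 - 1
    return float(np.exp(log_u)), float(u_prime), df

# Isotropic case: u equals 1 and the statistic vanishes.
u_iso, stat_iso, df_iso = sphericity_statistic(2.5 * np.eye(3), n=29)

# Anisotropic case (illustrative): u falls below 1 and the statistic is positive.
u_aniso, stat_aniso, df_aniso = sphericity_statistic(np.diag([4.0, 1.0, 1.0]), n=29)
print(u_iso, stat_iso, u_aniso, stat_aniso, df_aniso)
```

The statistic would then be compared with the upper percentile of the chi–squared distribution on `df` degrees of freedom.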
Design Reversals
Developments thus far are predicated in part on the desirability of identifying alternatives having varying powers of discernment. These include the most likely and least likely as in Theorem 1(ii). If the least likely is deemed to be of greatest interest, it remains to ask whether it might serve instead as the most likely alternative. In the context of designed experiments the answer is affirmative, as the intrinsic structure offers a venue for modifying a given design so as to achieve these ends. Details follow.
Consider the model
with
centered such that
, where location–scale parameters for
are
with
as in Corollary 1(ii). In particular, the test for
against
utilizes
with
as the residual mean square and with noncentrality
, where it often suffices to take
. For
its singular decomposition, followed by
, is
(5)
with
as its left–singular vectors,
as its singular values, and columns of
as its right–singular vectors. Clearly
. Our principal reconstruction is articulated in the following.
Theorem 2. Let
be a permutation operator reversing the ordered array
to
, and let
. Next construct
such that pairs
and
are realigned.
Conclusion: The most likely and least likely alternatives for design
are reversed from those of
, so that
now is most likely with power
depending on
, and
least likely with power
depending on
.
Proof: Clearly the conventional reordering of eigenvalues gives
(6)
and the conclusion follows on applying Theorem 1(ii) in the context of Corollary 1(ii).
Remark 4. Variations on
are apparent. Any permutation of
gives the same conclusion. In addition, any pairs
may be selected in like manner as most likely and least likely to be discerned. Note, however, that these tools are available in the case of first–order designs.
Case Studies
Studies in Hotelling [2] $T^2$ and second–order response models are given, to illustrate Theorem 1 and Corollary 1. Moreover, an example design serves to illustrate the Theorem 2 reversal of most likely and least likely alternatives.
Hotelling’s $T^2$ Tests
We reexamine the role of calcium in the growth of turnip greens, using data as reported in Kramer et al. [5]. In each of 29 experimental plots the plant calcium (
) was determined, and the soil calcium was assayed as available (
) and exchangeable (
) calcium. The units all are milliequivalents per hundred grams. Horticultural specialists expect these to run at about 15.00, 6.00 and 2.85 units, respectively. The sample means are
and the sample dispersion matrix is
with inverse
in spectral form, as listed in
where
The data are ill–conditioned, with condition number
:
The statistic reported is
with
, rejecting at level
the hypothesis
in favor of some
. Indeed, the
–value is
with
. On applying Corollary 1(i) with
in lieu of
, the columns of
are taken as successive alternatives to
, namely
where the dominant terms are in bold type. The noncentrality parameters
are {162.720, 1.409, 0.121}, and taking
and
powers at these alternatives are
Accordingly,
has essentially unit power to distinguish the hypothetical deviation
from -0.99816, since the discrepancies 0.01024 for
and 0.05973 for
in
are negligible. Similarly,
is marginally able to distinguish [(
−15.00), (
−6.00)] from [0.22479, −0.97280] with power 0.1316, but is virtually unable to separate [(
−15.00), (
−6.00)] from [−0.97435, −0.22381] with negligible power of 0.0562. In short, the latter suggests [14.03, 5.78] to be plausible values for
.
This is an example, as seen subsequently also, where elements of
, especially
, convey useful information in regard to the objectives of the study. In summary, details regarding directed alternatives, enabled here by Theorem 1 and Corollary 1(i), go beyond conventional usage for $T^2$.
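Powers of this kind can be approximated by simulation. The following is a minimal sketch rather than the computation used in this study: it swaps in a Monte Carlo estimate for the exact noncentral $F$ distribution function, and it assumes the usual conversion of $T^2$ on $k=3$ coordinates and $n=29$ plots to an $F$ statistic with degrees of freedom $(k, n-k) = (3, 26)$. The function name, the number of draws, and the seed are hypothetical choices.

```python
import numpy as np

def simulated_f_power(nonc, dfn, dfd, alpha=0.05, draws=400_000, seed=0):
    """Monte Carlo estimate of P(F > c_alpha) for a noncentral F(dfn, dfd, nonc)."""
    rng = np.random.default_rng(seed)
    # Critical value estimated as the upper-alpha quantile of the central F.
    central = rng.f(dfn, dfd, size=draws)
    crit = np.quantile(central, 1.0 - alpha)
    # Proportion of noncentral draws exceeding the critical value.
    shifted = rng.noncentral_f(dfn, dfd, nonc, size=draws)
    return float(np.mean(shifted > crit))

k, n = 3, 29
dfn, dfd = k, n - k                          # assumed degrees of freedom (3, 26)
noncentralities = [162.720, 1.409, 0.121]    # values reported in the text
powers = [simulated_f_power(lam, dfn, dfd) for lam in noncentralities]
print(powers)
```

If the assumed degrees of freedom agree with those used in the study, the three estimates should fall close to the reported powers, up to simulation error.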
Hotelling’s $T^2$ Charts
Multivariate diagnostics figure prominently in Statistical Process Control (SPC), as reviewed subsequently. In monitoring the manufacture of bomb sights during World War II, Hotelling [6] devised $T^2$ charts for multivariate means in $\mathbb{R}^k$. Here successive values $T_i^2$ are charted against time, where the chart signals the process to be out–of–control at level $\alpha$ whenever $T_i^2 > c_\alpha$. Moreover, with power $\pi$ the Average Run Length (ARL) of time–to–signal is $1/\pi$. To monitor the mean
against its target value
, successive samples of size n yield
, together with
having the
distribution with
. Phase I in SPC is set to establish base line process capabilities, to include parameter estimation, followed in Phase II by implementing the control charts themselves.
To continue, consider the data of Quesenberry [7] to be in Phase I, comprising n=30 records of 11 quality characteristics indexed in time–order of production. Following Williams et al. [8], dimensions are reduced on selecting the first k=5 quality characteristics, namely,
having means
, respectively. As in Section Hotelling’s
Tests we take
in lieu of
, finding the spectral resolution
as reported in Table 1. The data are seen to be highly ill–conditioned, with condition number
. In keeping with Corollary 1(i), five directed alternatives comprise the columns of
in Table 1, where dominant elements again are in bold type. In particular, this example shows
to be separately informative per se, as each corresponds essentially to deviations in
for observations
, respectively, since values other than those in bold type are negligible.
| Eigenvalues | 520.4304 | 11.9529 | 1.1404 | 1.0843 | 0.1319 |
| Eigenvectors | 0.99893 | −0.00607 | 0.04535 | −0.00396 | −0.00505 |
| | −0.04553 | −0.00445 | 0.99863 | −0.01681 | −0.01907 |
| | 0.00490 | 0.14300 | 0.01937 | −0.02275 | 0.98926 |
| | 0.00543 | 0.98712 | 0.00322 | 0.07515 | −0.14105 |
| | 0.00291 | −0.07126 | 0.01722 | 0.99676 | 0.03287 |
| Power at $\alpha$=0.05 and $n$=30 | 1.0000 | 1.0000 | 0.9914 | 0.9882 | 0.2367 |
| Power at $\alpha$=0.05 and $n$=8 | 1.0000 | 1.0000 | 0.1848 | 0.1779 | 0.0645 |
| ARLs at $\alpha$=0.05 and $n$=8 | 1.0000 | 1.0000 | 5.41 | 5.62 | 15.50 |
Table 1: Spectral values for $S^{-1}$ for the data of Quesenberry (2001) of order (30×5).
Taking
with
, and critical value
, powers against these directed alternatives are listed in Table 1. These show that deviations in the directions of
essentially would be discerned with power at least 0.9882, whereas power in the direction of
would be diminished to 0.2367. Note, however, that these values are inflated by the value n=30 in Phase I. Samples much smaller in size ordinarily would be taken in Phase II, say n=8 in this example. The corresponding powers at α=0.05 are then as given in Table 1. In short, with ARLs as in the final row of Table 1 for samples of size n=8, these charts would detect changes along the first two directed alternatives immediately on average, but would be less responsive otherwise, with ARLs of 5.41, 5.62, and 15.50, respectively.
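The final two rows of Table 1 are linked by the reciprocal relation between power and ARL. As a minimal sketch, assuming successive samples are independent so that the time–to–signal is geometric, the ARLs follow directly from the tabled powers at n=8; the function name is hypothetical.

```python
def average_run_length(power):
    """ARL of time-to-signal for a chart that signals with the given probability."""
    if not 0.0 < power <= 1.0:
        raise ValueError("power must lie in (0, 1]")
    return 1.0 / power

# Powers at alpha = 0.05 and n = 8, as listed in Table 1.
powers_n8 = [1.0000, 1.0000, 0.1848, 0.1779, 0.0645]
arls = [round(average_run_length(p), 2) for p in powers_n8]
print(arls)  # [1.0, 1.0, 5.41, 5.62, 15.5]
```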
Remark 5. This example goes beyond conventional uses of
charts, in demonstrating that the capacity of a given chart to detect alternatives may differ widely across alternatives in
of compelling practical interest. In short, each of five ARLs pertains here to an informative one–dimensional alternative.
Second–Order Designs
Second–order models of type
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon$$ (8)
are considered having zero mean, uncorrelated errors with variance $\sigma^2$. In a typical setting the yield ($Y$) of a chemical process is examined at specified reaction time $x_1$ and temperature $x_2$. Small designs of historical consequence are the Central Composite (CCD) designs of Box et al. [9], having design points as listed in Table 2.
| $x_1$ | $x_2$ |
| -1.00 | -1.00 |
| -1.00 | 1.00 |
| $-a$ | 0.00 |
| $a$ | 0.00 |
| 0.00 | $-a$ |
| 0.00 | $a$ |
| 1.00 | -1.00 |
| 1.00 | 1.00 |
| 0.00 | 0.00 |
Table 2: Regressor vectors $(x_1, x_2)$ for the CCD design of order (2×9), where $a = \sqrt{2}$.
Proceeding as in Corollary 1(ii), we seek spectral values for the Fisher Information Matrix
, specifically, the eigenvalues and eigenvectors as listed in Table 3, with dominant terms again in bold type. Powers at these directed alternatives follow on taking
as surrogates in the noncentrality parameters for
with
and
as the residual mean square. Owing to only two degrees of freedom for error, computations replaced
then powers were determined from scaled noncentral
distributions with noncentrality parameters as listed in Table 3.
| Eigenvalues | 24.3427 | 8.0000 | 8.0000 | 8.0000 | 4.0000 | 0.6573 |
| Eigenvectors | 0.59349 | 0.00000 | 0.00000 | 0.56911 | 0.56911 | 0.00000 |
| | 0.00000 | 0.00000 | 0.00000 | 0.70711 | -0.70711 | 0.00000 |
| | 0.00000 | 1.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
| | 0.00000 | 0.00000 | -1.00000 | 0.00000 | 0.00000 | 0.00000 |
| | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 1.00000 |
| | 0.80484 | 0.00000 | 0.00000 | -0.41966 | -0.41966 | 0.00000 |
| Power at $\alpha$=0.05 | 0.9765 | 0.5307 | 0.5307 | 0.5307 | 0.2698 | 0.0775 |
Table 3: Spectral values for the Fisher Information Matrix $X'X$ for the CCD design. Each row of the eigenvector array is one eigenvector, listed in the order of the eigenvalues, with coordinates ordered as the coefficients of model (8).
Arranged in decreasing order of their powers, these are
with power 0.9765;
each with power 0.5307; and
and
as alternatives with powers 0.2698 and 0.0775. Here elements of
are separately informative:
for discrepancies between
and their hypothetical values; and
for discrepancies between
and their hypothetical values, respectively. To continue, observe that the eigenvalue
is repeated three times. On applying Theorem 1(iv) in the context of Corollary 1(ii), we see that all standardized elements in
have power 0.5307. These include, in addition to
for
and
for
, the standardized sums
(9)
(10)
for example, with
for the discrepancy between
and
.
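The spectral values for the CCD design can be reproduced in a few lines. The following is a minimal sketch, assuming axial distance $a = \sqrt{2}$ (inferred here from the repeated eigenvalue 8.0000, not stated in the surviving text) and regressors ordered as intercept, $x_1$, $x_2$, $x_1^2$, $x_2^2$, $x_1 x_2$. It also checks that standardized sums within the eigenspace of the repeated eigenvalue retain the same quadratic form.

```python
import numpy as np

a = np.sqrt(2.0)  # assumed axial distance
points = np.array([
    [-1, -1], [-1, 1], [-a, 0], [a, 0], [0, -a], [0, a], [1, -1], [1, 1], [0, 0],
], dtype=float)
x1, x2 = points[:, 0], points[:, 1]

# Second-order model matrix: intercept, linear, pure quadratic, interaction.
X = np.column_stack([np.ones(len(points)), x1, x2, x1**2, x2**2, x1 * x2])
info = X.T @ X
eigenvalues = np.sort(np.linalg.eigvalsh(info))[::-1]
print(np.round(eigenvalues, 4))

# Unit vectors in the eigenspace of the repeated eigenvalue 8.
e = np.eye(6)
u = (e[1] + e[2]) / np.sqrt(2.0)   # standardized sum of the two linear directions
v = (e[3] - e[4]) / np.sqrt(2.0)   # contrast between the two pure quadratic terms
print(u @ info @ u, v @ info @ v)
```

Both quadratic forms equal 8, so under these assumptions the corresponding alternatives share the noncentrality, and hence the power, attached to the repeated eigenvalue.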
In summary, the directed second–order alternatives treated here are innovations not found in classical linear inference. Instead, these are enabled by Theorem 1 and Corollary 1(ii). Again the elements of
, especially
are separately informative about coefficients of the model (8). Their simple and revealing structure may be attributed to the symmetry and balance of CCD designs.
Design Reversals
Begin with
with
and
in centered form as in Section 3.2, with
having the design
as listed in Table 4. Construct
on permuting singular values, but retaining the left and right singular vectors.
| Design, original (rows: regressors; columns: runs) | -1.0000 | 1.0000 | -1.4142 | 1.4142 | -1.0000 | 1.0000 | 0.0000 | 0.0000 |
| | 0.0000 | 0.0000 | 1.0000 | 1.0000 | 0.0000 | 0.0000 | -1.0000 | -1.0000 |
| | -0.8000 | 0.6000 | -0.8000 | 2.0000 | -0.8000 | 0.6000 | -0.8000 | 0.0000 |
| Design, modified (rows: regressors; columns: runs) | 0.4279 | -1.0391 | 0.2809 | 0.7126 | 0.4279 | -1.0391 | -1.1080 | 1.3370 |
| | -0.0193 | -0.3197 | -1.3211 | -0.3556 | -0.0193 | -0.3197 | 0.4993 | 1.8556 |
| | -0.1811 | 0.8999 | 0.2484 | -1.1704 | -0.1811 | 0.8999 | 1.1798 | -1.6954 |
| Left–Singular Vectors | -0.3308 | 0.2946 | -0.3736 | 0.6594 | -0.3308 | 0.2946 | -0.1792 | -0.0342 |
| | 0.0960 | -0.1138 | 0.6024 | 0.3785 | 0.0960 | -0.1138 | -0.5082 | -0.4372 |
| | 0.0996 | -0.3618 | -0.1302 | 0.2927 | 0.0996 | -0.3618 | -0.3435 | 0.7055 |
| Right–Singular Vectors | -0.7099 | -0.1307 | -0.6921 |
| | 0.3508 | -0.9177 | -0.1865 |
| | 0.6108 | 0.3752 | -0.6973 |
| Detection Probabilities: original design | 0.4947 | 0.1828 | 0.0577 |
Table 4: Design matrix and the modified design; the left and right singular vectors; and the singular values.
To continue, take the right–singular vectors, now
as directed alternatives to
together with
as
. Here the level 0.05 critical value is 6.5914; then powers are determined in turn from
together with the power functions
having values listed in the final row of Table 4.
In particular, discovering alternatives
in the direction
is seen to be unlikely, with power 0.0577. On the other hand, suppose instead that it is critical in context to discover alternatives in the negative orthant of
. Then the design
serves to reverse these so that alternatives in the directions of
are now detected with probabilities
, respectively.
Further properties of the design
and its reversal
deserve mention. Observe that
whereas
and
, where now
following the convention that
remain ordered. Clearly
and
differ, where their diagonal elements are listed as variances in Table 5. However, their eigenvalues are identical by construction, as are their
efficiency indices as the trace, determinant, and largest eigenvalues of
under both designs
.
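The reversal can be sketched numerically from the design in Table 4. This is a minimal illustration under stated assumptions, not the computation used for the study: the tabled entries ±1.4142 are taken to be ±√2, the singular value decomposition is obtained from `numpy.linalg.svd`, and the reversed design is formed by reversing the order of the singular values while retaining both sets of singular vectors. The names `X0`, `XR`, and `ols_variances` are hypothetical. Signs of singular vectors are not unique, so the reversed design may differ from the tabled one by an overall sign without affecting any of the quantities checked here.

```python
import numpy as np

r2 = np.sqrt(2.0)  # assumed exact value behind the tabled 1.4142
# Original centered design: Table 4 lists regressors by row, so transpose to runs by row.
X0 = np.array([
    [-1.0, 1.0, -r2, r2, -1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, -1.0, -1.0],
    [-0.8, 0.6, -0.8, 2.0, -0.8, 0.6, -0.8, 0.0],
]).T

# Singular decomposition X0 = P diag(xi) Q', with xi in decreasing order.
P, xi, Qt = np.linalg.svd(X0, full_matrices=False)
# Reversed design: same singular vectors, singular values in reverse order.
XR = P @ np.diag(xi[::-1]) @ Qt

def ols_variances(Xc):
    """Diagonal of (X'X)^{-1} for the model with intercept and centered regressors Xc."""
    X = np.column_stack([np.ones(Xc.shape[0]), Xc])
    return np.diag(np.linalg.inv(X.T @ X))

var0, varR = ols_variances(X0), ols_variances(XR)
eig0 = np.sort(np.linalg.eigvalsh(X0.T @ X0))
eigR = np.sort(np.linalg.eigvalsh(XR.T @ XR))

q1 = Qt[0]  # right-singular vector paired with the largest singular value of X0
print(np.round(var0, 5), np.round(varR, 5))
print(q1 @ X0.T @ X0 @ q1, q1 @ XR.T @ XR @ q1)
```

The two information matrices share their eigenvalues, while the direction `q1` moves from the largest quadratic form under the original design to the smallest under the reversed design.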
Design Characteristics
| Design | Quantity | | | | |
| Original | Variances of estimates | 0.12500 | 1.38170 | 0.69002 | 1.76012 |
| Reversed | Variances of estimates | 0.12500 | 1.83546 | 0.26120 | 1.73518 |
| Original | Eigenvalues of $(X'X)^{-1}$ | 3.53636 | 0.22694 | 0.12500 | 0.06854 |
| Reversed | Eigenvalues of $(X'X)^{-1}$ | 3.53636 | 0.22694 | 0.12500 | 0.06854 |
| Diagnostic | A | D | E |
| | 3.95684 | 0.00688 | 3.53636 |
Table 5: Variances of OLS solutions; eigenvalues of the dispersion matrix $(X'X)^{-1}$ for the original and reversed designs; and (A, D, E) efficiencies for these designs.
Summary and Discussion
This study reexamines the concept of directional invariance, or isotropy, for distributions on $\mathbb{R}^k$ having location–scale parameters $(\theta, \Sigma)$. Powers of tests for $H_0: \theta = \theta_0$ against $H_1: \theta \neq \theta_0$ often depend on noncentrality parameters of type $(\theta-\theta_0)'\Sigma^{-1}(\theta-\theta_0)$. The spectral decomposition of $\Sigma^{-1}$ supports the identification of directed alternatives in directions determined by the eigenvectors of $\Sigma^{-1}$, to encompass the alternatives most likely and least likely in a given study. Powers of these types are independent of direction if and only if $\Sigma = \sigma^2 I_k$. Applications are drawn in the use of Hotelling [2] $T^2$ in multivariate samples, and of F-tests in linear models. Case studies are given where this approach leads to the discovery of further insight regarding the natural parameters of a problem.
One concept of directional invariance figures prominently in the SPC literature. The following is excerpted from Linna et al. [10].
“It is well known that the ARL performance of multivariate SPC procedures depends heavily on the covariance structure of the observed data. See, for example, Mason et al. [11]. Further, it has been noted by Pignatiello et al. [12] that many multivariate procedures, including the $\chi^2$ chart, Hotelling's $T^2$ chart, and most of the multivariate CUSUM charts, are directionally invariant. The performance of the multivariate EWMA chart proposed in Lowry et al. [13] is also directionally invariant. Lowry et al. [14] and others also note the directional invariance of many of these multivariate control charting methods. Directional invariance means that the performance of a procedure does not depend on the specific direction in p-space of a shift in the mean vector of the process variables being monitored. Instead, performance of a directionally invariant procedure depends only on the statistical (or Mahalanobis) distance between the in-control mean vector
and the out-of-control mean vector under consideration,
.” (Italics supplied.) That is,
.
Unfortunately, this notion of directional invariance is grossly misleading, is antithetical to the very concept of invariance as in our Definition 1, at best is a misnomer, and in any event deserves to be clarified in the SPC literature. In fact, such essentials as power functions and ARLs do indeed depend on directions of alternatives unless the model is isotropic, as the counterexamples of Section 4.2 show.
On the other hand, a disclaimer of Tsui et al. [15] should be noted: “There is no reason in practice, however, for a shift to
to be always considered as important as a shift to
just because the corresponding values of the noncentrality parameters are equal.” Our study represents a substantial elaboration on this point.
A Appendix
Rayleigh Quotients.
At issue are variational properties of quadratic forms of type
known as Rayleigh Quotients; see Bellman [16]. Write
in its spectral form with
so that
. Further partition
of orders
with
and partition
conformably. Then
(11)
where, in particular,
. Essentials follow.
Lemma A.1 Consider the positive definite form
as in expression (11).
(i) Variational properties of
as u varies over
are
(12)
where the lower and upper limits are attained at
and
, respectively.
(ii) Variational properties of
as u varies over
are
(13)
where the lower and upper limits are attained at
and
, respectively.
Proof: Conclusion (i) is given in Bellman [16], where the limits are attained as given since
and
. To see conclusion (ii),
may be represented as
. Then for
, (A.1) gives
. The lower and upper limits for
follow as in conclusion (i) for
, except that now these are attained as given since
and
.
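The variational bounds of Lemma A.1 lend themselves to a quick numerical check. The following is a minimal sketch, assuming a hypothetical positive definite matrix generated at random (not one from this study): Rayleigh quotients over all of the space lie between the smallest and largest eigenvalues, and quotients over the span of a subset of eigenvectors lie between the smallest and largest eigenvalues belonging to that subset.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 5
B = rng.standard_normal((k, k))
A = B @ B.T + k * np.eye(k)        # hypothetical positive definite matrix
gamma, Q = np.linalg.eigh(A)       # eigenvalues ascending, eigenvectors as columns

def rayleigh(A, u):
    """Rayleigh quotient u'Au / u'u."""
    return float(u @ A @ u) / float(u @ u)

# (i) Quotients over the whole space are bounded by the extreme eigenvalues.
U = rng.standard_normal((1000, k))
quotients = np.array([rayleigh(A, u) for u in U])

# (ii) Quotients over the span of three eigenvectors are bounded by the
# extreme eigenvalues of that subset.
Qi = Q[:, 1:4]
coeffs = rng.standard_normal((1000, 3))
sub_quotients = np.array([rayleigh(A, Qi @ c) for c in coeffs])

print(gamma[0], quotients.min(), quotients.max(), gamma[-1])
print(gamma[1], sub_quotients.min(), sub_quotients.max(), gamma[3])
```

The limits are attained at the eigenvectors themselves, which is why the extreme eigenvectors identify the alternatives least likely and most likely to be discerned.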
Acknowledgement
The author is indebted to Professor Donald E. Ramirez for substantial contributions, including computations using the MINITAB and MAPLE software packages.
References
1. Mahalanobis PC (1936) On the generalised distance in statistics. Proceedings National Institute of Science India 2(1): 49-55.
2. Hotelling H (1931) The generalization of Student’s ratio. Ann Math Statist 2: 360-378.
3. Mauchly JW (1940) Significance test for sphericity of a normal n-variate distribution. Ann Math Statist 11(2): 204-209.
4. Rencher AC (2002) Methods of Multivariate Analysis, (2nd edn), Wiley, New York, USA.
5. Kramer CY, Jensen DR (1969) Fundamentals of Multivariate Analysis, Part I. Inference about Means. J Quality Technology 1: 120-133.
6. Hotelling H (1947) Multivariate quality control. In: Eisenhart C, et al. (Eds.) Techniques of Statistical Analysis, McGraw-Hill, New York, USA. pp. 111-184.
7. Quesenberry CP (2001) The multivariate short-run snapshot Q chart. Quality Engineering 13(4): 679-683.
8. Williams JD, Woodall WH, Birch JB, Sullivan JH (2005) Distribution of Hotelling’s $T^2$ statistic based on successive differences covariance matrix estimator. J Quality Technology 38: 217-229.
9. Box GEP, Wilson KB (1951) On the experimental attainment of optimum conditions. J Royal Statist Soc Ser B 13(1): 1-45.
10. Linna KW, Woodall WH, Busby KL (2001) Performance of multivariate control charts in the presence of measurement error. J Quality Technology 33: 349-355.
11. Mason RL, Champ CW, Tracy ND, Wierda SJ, Young JC (1997) Assessment of multivariate process control techniques. J Quality Technology 29: 140-143.
12. Pignatiello JJ, Runger GC (1990) Comparisons of multivariate CUSUM charts. J Quality Technology 22(3): 173-186.
13. Lowry CA, Woodall WH, Champ CW, Rigdon SE (1992) A multivariate exponentially weighted moving average control chart. Technometrics 34(1): 46-53.
14. Lowry CA, Montgomery DC (1995) A review of multivariate control charts. IIE Transactions 27(6): 800-810.
15. Tsui KL, Woodall WH (1993) Multivariate control charts based on loss functions. Sequential Analysis 12(1): 79-92.
16. Bellman R (1960) Introduction to Matrix Analysis. McGraw-Hill, New York, USA.
©2017 Jensen. This is an open access article distributed under the terms of the,
which
permits unrestricted use, distribution, and build upon your work non-commercially.