Research Article Volume 8 Issue 2
1BVEF Research Institute, University of Latvia, Latvia
2Aviation Department, Riga Aeronautical Institute, Latvia
Correspondence: Nicholas Nechval, BVEF Research Institute, University of Latvia, Riga LV-1586, Latvia
Received: March 30, 2024 | Published: April 11, 2024
Citation: Nechval N, Berzins G, Nechval K. Novel approach to the smart constructing adequate predictive or confidence decisions for applied mathematical models under parametric uncertainty via pivotal quantities and ancillary statistics. Aeron Aero Open Access J. 2024;8(2):58-76. DOI: 10.15406/aaoaj.2024.08.00194
The technique used here emphasizes pivotal quantities and ancillary statistics relevant for obtaining statistical predictive or confidence decisions for anticipated outcomes of applied stochastic models under parametric uncertainty and is applicable whenever the statistical problem is invariant under a group of transformations that acts transitively on the parameter space. It does not require the construction of any tables and is applicable whether the experimental data are complete or Type II censored. The proposed technique is based on a probability transformation and pivotal quantity averaging to solve real-life problems in all areas including engineering, science, industry, automation & robotics, business & finance, medicine and biomedicine. It is conceptually simple and easy to use.
Keywords: anticipated outcomes, parametric uncertainty, unknown (nuisance) parameters, elimination, pivotal quantities, ancillary statistics, new-sample prediction, within-sample prediction
Statistical predictive or confidence decisions (under parametric uncertainty) for future random quantities (future outcomes, order statistics, etc.) based on past and current data are the most prevalent form of statistical inference. Predictive inferences for future random quantities are widely used in risk management, finance, insurance, economics, hydrology, material sciences, telecommunications, and many other industries. Predictive inferences (predictive distributions, prediction or tolerance limits (or intervals), confidence limits (or intervals)) for future random quantities on the basis of past and present knowledge represent a fundamental problem of statistics, arising in many contexts and producing varied solutions. The approach used here is a special case of more general considerations applicable whenever the statistical problem is invariant under a group of transformations that acts transitively on the parameter space.1–29
Theorem 1: Let $Y_1 \le \dots \le Y_n$ be a new (future) random sample of n ordered observations from a known distribution with probability density function (pdf) $f_\rho(y)$ and cumulative distribution function (cdf) $F_\rho(y)$, where ρ is the parameter (in general, a vector). Then the adequate mathematical models for the cumulative distribution function of the kth order statistic $Y_k$, $k \in \{1, 2, \dots, n\}$, used to construct one-sided γ-content tolerance limits (or a two-sided tolerance interval) for $Y_k$ with confidence level β, are given as follows:
$$\int_0^{F_\rho(y_k)} f_{k,n-k+1}(r)\,dr = P_\rho(Y_k \le y_k \mid n) = \sum_{j=k}^{n} \binom{n}{j} [F_\rho(y_k)]^{j}\,[1-F_\rho(y_k)]^{n-j}. \tag{1}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_k^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_0^{F_\rho(y_k^U)} f_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma\right)\right\} = \beta, \tag{2}$$
where
$$f_{k,n-k+1}(r) = \frac{1}{\mathrm{B}(k,n-k+1)}\, r^{k-1}(1-r)^{(n-k+1)-1}, \quad 0<r<1, \tag{3}$$
is the probability density function (pdf) of the beta distribution Beta(k, n−k+1) with shape parameters k and n−k+1.
Proof: It follows from (1) that
$$\frac{d}{dy_k}\int_0^{F_\rho(y_k)} f_{k,n-k+1}(r)\,dr = \frac{d}{dy_k} P_\rho(Y_k \le y_k \mid n). \tag{4}$$
This ends the proof.
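Identity (1) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the Beta(k, n−k+1) density (3) with a composite Simpson rule and compares the result with the binomial sum. The helper names (`beta_pdf`, `simpson`, `order_stat_cdf_beta`, `order_stat_cdf_binomial`) and the values k = 5, n = 12, F = 0.4 are illustrative assumptions.

```python
from math import comb, gamma


def beta_pdf(r, a, b):
    """Density of Beta(a, b) at r, as in (3)."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * r ** (a - 1) * (1 - r) ** (b - 1)


def simpson(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


def order_stat_cdf_binomial(p, k, n):
    """Right-hand side of (1), with p standing for F_rho(y_k)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


def order_stat_cdf_beta(p, k, n):
    """Left-hand side of (1): integral of the Beta(k, n-k+1) density over (0, p)."""
    return simpson(lambda r: beta_pdf(r, k, n - k + 1), 0.0, p)


# Illustrative values (assumptions, not taken from the paper).
k, n, p = 5, 12, 0.4
lhs = order_stat_cdf_beta(p, k, n)
rhs = order_stat_cdf_binomial(p, k, n)
print(lhs, rhs)  # the two values should agree to numerical precision
```

Because both sides depend on the distribution only through the value of $F_\rho(y_k)$, the check does not require a specific parametric family.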
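A (γ,β) lower one-sided γ-content tolerance limit $y_k^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(P_\rho(Y_k > y_k^L \mid n) \ge \gamma\right)\right\} = E\left\{\Pr\left(1-\int_0^{F_\rho(y_k^L)} f_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} = \beta. \tag{5}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_k^L}\left(E\{\Pr(P_\rho(Y_k > y_k^L \mid n) \ge \gamma)\} = \beta\right),\ \arg_{y_k^U}\left(E\{\Pr(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_k^L}\left(E\left\{\Pr\left(\int_0^{F_\rho(y_k^L)} f_{k,n-k+1}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_k^U}\left(E\left\{\Pr\left(\int_0^{F_\rho(y_k^U)} f_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_k^L,\ y_k^U]. \end{aligned} \tag{6}$$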
$$\int_{1-F_\rho(y_k)}^{1} f_{n-k+1,k}(r)\,dr = P_\rho(Y_k \le y_k \mid n) = \sum_{j=k}^{n} \binom{n}{j} [F_\rho(y_k)]^{j}\,[1-F_\rho(y_k)]^{n-j}. \tag{7}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_k^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_{1-F_\rho(y_k^U)}^{1} f_{n-k+1,k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma\right)\right\} = \beta, \tag{8}$$
where
$$f_{n-k+1,k}(r) = \frac{1}{\mathrm{B}(n-k+1,k)}\, r^{(n-k+1)-1}(1-r)^{k-1}, \quad 0<r<1, \tag{9}$$
is the probability density function (pdf) of the beta distribution Beta(n−k+1, k) with shape parameters n−k+1 and k.
Proof: It follows from (7) that
$$\frac{d}{dy_k}\int_{1-F_\rho(y_k)}^{1} f_{n-k+1,k}(r)\,dr = \frac{d}{dy_k} P_\rho(Y_k \le y_k \mid n). \tag{10}$$
This ends the proof.
A (γ,β) lower one-sided γ-content tolerance limit $y_k^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(P_\rho(Y_k > y_k^L \mid n) \ge \gamma\right)\right\} = E\left\{\Pr\left(1-\int_{1-F_\rho(y_k^L)}^{1} f_{n-k+1,k}(r)\,dr \ge \gamma\right)\right\} = \beta. \tag{11}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\left[\arg_{y_k^L}\left(E\{\Pr(P_\rho(Y_k > y_k^L \mid n) \ge \gamma)\} = \beta\right),\ \arg_{y_k^U}\left(E\{\Pr(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma)\} = \beta\right)\right] = [y_k^L,\ y_k^U]. \tag{12}$$
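$$\int_0^{\frac{n-k+1}{k}\frac{F_\rho(y_k)}{1-F_\rho(y_k)}} \varphi_{k,n-k+1}(r)\,dr = P_\rho(Y_k \le y_k \mid n) = \sum_{j=k}^{n} \binom{n}{j} [F_\rho(y_k)]^{j}\,[1-F_\rho(y_k)]^{n-j}. \tag{13}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_k^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_0^{\frac{n-k+1}{k}\frac{F_\rho(y_k^U)}{1-F_\rho(y_k^U)}} \varphi_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma\right)\right\} = \beta, \tag{14}$$
where
$$\varphi_{k,n-k+1}(r) = \frac{1}{\mathrm{B}(k,n-k+1)}\, \frac{\left[\frac{k}{n-k+1}\,r\right]^{k-1}}{\left[1+\frac{k}{n-k+1}\,r\right]^{n+1}}\, \frac{k}{n-k+1}, \quad r\in(0,\infty), \tag{15}$$
is the probability density function (pdf) of the F distribution denoted here by F(k, n−k+1), with positive integer parameters k (numerator) and n−k+1 (denominator); in the standard parametrization, (15) is the F density with 2k numerator and 2(n−k+1) denominator degrees of freedom.
Proof: It follows from (13) that
$$\frac{d}{dy_k}\int_0^{\frac{n-k+1}{k}\frac{F_\rho(y_k)}{1-F_\rho(y_k)}} \varphi_{k,n-k+1}(r)\,dr = \frac{d}{dy_k} P_\rho(Y_k \le y_k \mid n). \tag{16}$$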
This ends the proof.
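The F-type representation (13) can be checked in the same spirit. The sketch below, under the assumption that a composite Simpson rule is accurate enough for this smooth integrand, integrates the density (15) up to the limit $\frac{n-k+1}{k}\frac{F}{1-F}$ and compares the result with the binomial sum. The function names and the values k = 5, n = 12, F = 0.4 are illustrative assumptions.

```python
from math import comb, gamma


def simpson(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


def phi_pdf(r, k, n):
    """Density (15) with parameters a = k and b = n-k+1."""
    a, b = k, n - k + 1
    c = a / b
    beta_ab = gamma(a) * gamma(b) / gamma(a + b)  # B(a, b)
    return c * (c * r) ** (a - 1) / (beta_ab * (1 + c * r) ** (a + b))


def cdf_via_phi(p, k, n):
    """Left-hand side of (13), with p standing for F_rho(y_k)."""
    upper = (n - k + 1) / k * p / (1 - p)
    return simpson(lambda r: phi_pdf(r, k, n), 0.0, upper)


def cdf_binomial(p, k, n):
    """Right-hand side of (13)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Illustrative values (assumptions, not taken from the paper).
k, n, p = 5, 12, 0.4
print(cdf_via_phi(p, k, n), cdf_binomial(p, k, n))
```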
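A (γ,β) lower one-sided γ-content tolerance limit $y_k^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(P_\rho(Y_k > y_k^L \mid n) \ge \gamma\right)\right\} = E\left\{\Pr\left(1-\int_0^{\frac{n-k+1}{k}\frac{F_\rho(y_k^L)}{1-F_\rho(y_k^L)}} \varphi_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} = \beta. \tag{17}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_k^L}\left(E\{\Pr(P_\rho(Y_k > y_k^L \mid n) \ge \gamma)\} = \beta\right),\ \arg_{y_k^U}\left(E\{\Pr(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_k^L}\left(E\left\{\Pr\left(\int_0^{\frac{n-k+1}{k}\frac{F_\rho(y_k^L)}{1-F_\rho(y_k^L)}} \varphi_{k,n-k+1}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_k^U}\left(E\left\{\Pr\left(\int_0^{\frac{n-k+1}{k}\frac{F_\rho(y_k^U)}{1-F_\rho(y_k^U)}} \varphi_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_k^L,\ y_k^U]. \end{aligned} \tag{18}$$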
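$$\int_{\frac{k}{n-k+1}\frac{1-F_\rho(y_k)}{F_\rho(y_k)}}^{\infty} \varphi_{n-k+1,k}(r)\,dr = P_\rho(Y_k \le y_k \mid n) = \sum_{j=k}^{n} \binom{n}{j} [F_\rho(y_k)]^{j}\,[1-F_\rho(y_k)]^{n-j}. \tag{19}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_k^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_{\frac{k}{n-k+1}\frac{1-F_\rho(y_k^U)}{F_\rho(y_k^U)}}^{\infty} \varphi_{n-k+1,k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma\right)\right\} = \beta, \tag{20}$$
where
$$\varphi_{n-k+1,k}(r) = \frac{n-k+1}{k\,\mathrm{B}(n-k+1,k)}\, \frac{\left[\frac{n-k+1}{k}\,r\right]^{n-k}}{\left[1+\frac{n-k+1}{k}\,r\right]^{n+1}}, \quad r\in(0,\infty), \tag{21}$$
is the probability density function (pdf) of the F distribution denoted here by F(n−k+1, k), with positive integer parameters n−k+1 (numerator) and k (denominator); in the standard parametrization, (21) is the F density with 2(n−k+1) numerator and 2k denominator degrees of freedom.
Proof: It follows from (19) that
$$\frac{d}{dy_k}\int_{\frac{k}{n-k+1}\frac{1-F_\rho(y_k)}{F_\rho(y_k)}}^{\infty} \varphi_{n-k+1,k}(r)\,dr = \frac{d}{dy_k} P_\rho(Y_k \le y_k \mid n). \tag{22}$$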
This ends the proof.
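A (γ,β) lower one-sided γ-content tolerance limit $y_k^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(1-\int_{\frac{k}{n-k+1}\frac{1-F_\rho(y_k^L)}{F_\rho(y_k^L)}}^{\infty} \varphi_{n-k+1,k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_k > y_k^L \mid n) \ge \gamma\right)\right\} = \beta. \tag{23}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_k^L}\left(E\{\Pr(P_\rho(Y_k > y_k^L \mid n) \ge \gamma)\} = \beta\right),\ \arg_{y_k^U}\left(E\{\Pr(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_k^L}\left(E\left\{\Pr\left(\int_{\frac{k}{n-k+1}\frac{1-F_\rho(y_k^L)}{F_\rho(y_k^L)}}^{\infty} \varphi_{n-k+1,k}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_k^U}\left(E\left\{\Pr\left(\int_{\frac{k}{n-k+1}\frac{1-F_\rho(y_k^U)}{F_\rho(y_k^U)}}^{\infty} \varphi_{n-k+1,k}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_k^L,\ y_k^U]. \end{aligned} \tag{24}$$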
Theorem 2: Let $Y_1 \le \dots \le Y_n$ be a new (future) random sample of n ordered observations from a known distribution with probability density function (pdf) $f_\rho(y)$ and cumulative distribution function (cdf) $F_\rho(y)$, where ρ is the parameter (in general, a vector). Then the adequate mathematical models for the conditional cumulative distribution function (ccdf) of the lth order statistic $Y_l$, $l \in \{2, \dots, n\}$, used to construct one-sided γ-content tolerance limits (or a two-sided tolerance interval) for $Y_l$ ($1 \le k < l \le n$), given $Y_k = y_k$, with confidence level β, are determined as follows:
$$\int_0^{1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}} f_{l-k,n-l+1}(r)\,dr = P_\rho(Y_l \le y_l \mid Y_k = y_k; n) = \sum_{j=l-k}^{n-k} \binom{n-k}{j} \left[1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{j} \left[\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{n-k-j}. \tag{25}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_l^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_0^{1-\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}} f_{l-k,n-l+1}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta, \tag{26}$$
where $\bar F_\rho(z) = 1 - F_\rho(z)$,
$$f_{l-k,n-l+1}(r) = \frac{r^{l-k-1}(1-r)^{(n-l+1)-1}}{\mathrm{B}(l-k,n-l+1)}, \quad 0<r<1, \tag{27}$$
is the probability density function (pdf) of the beta distribution Beta(l−k, n−l+1) with shape parameters l−k and n−l+1.
Proof: It follows from (25) that
$$\frac{d}{dy_l}\int_0^{1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}} f_{l-k,n-l+1}(r)\,dr = \frac{d}{dy_l} P_\rho(Y_l \le y_l \mid Y_k = y_k; n). \tag{28}$$
This ends the proof.
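A (γ,β) lower one-sided γ-content tolerance limit $y_l^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(1-\int_0^{1-\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}} f_{l-k,n-l+1}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta. \tag{29}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_l^L}\left(E\{\Pr(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right),\ \arg_{y_l^U}\left(E\{\Pr(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_l^L}\left(E\left\{\Pr\left(\int_0^{1-\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}} f_{l-k,n-l+1}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_l^U}\left(E\left\{\Pr\left(\int_0^{1-\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}} f_{l-k,n-l+1}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_l^L,\ y_l^U]. \end{aligned} \tag{30}$$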
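$$\int_{\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}}^{1} f_{n-l+1,l-k}(r)\,dr = P_\rho(Y_l \le y_l \mid Y_k = y_k; n) = \sum_{j=l-k}^{n-k} \binom{n-k}{j} \left[1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{j} \left[\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{n-k-j}. \tag{31}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_l^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_{\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}}^{1} f_{n-l+1,l-k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta, \tag{32}$$
where $\bar F_\rho(y) = 1 - F_\rho(y)$,
$$f_{n-l+1,l-k}(r) = \frac{r^{(n-l+1)-1}(1-r)^{l-k-1}}{\mathrm{B}(n-l+1,l-k)}, \quad 0<r<1, \tag{33}$$
is the probability density function (pdf) of the beta distribution Beta(n−l+1, l−k) with shape parameters n−l+1 and l−k.
Proof: It follows from (31) that
$$\frac{d}{dy_l}\int_{\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}}^{1} f_{n-l+1,l-k}(r)\,dr = \frac{d}{dy_l} P_\rho(Y_l \le y_l \mid Y_k = y_k; n). \tag{34}$$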
This ends the proof.
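The two beta representations (25) and (31) of the conditional cdf can be compared numerically with the binomial sum. The sketch below is illustrative only: it assumes an exponential survival function, so that the ratio $\bar F_\rho(y_l)/\bar F_\rho(y_k)$ reduces to $\exp(-(y_l-y_k)/\vartheta)$, and the values k = 2, l = 5, n = 9, $y_k$ = 2, $y_l$ = 3.5, ϑ = 2 are hypothetical.

```python
from math import comb, exp, gamma


def beta_pdf(r, a, b):
    """Density of Beta(a, b) at r."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * r ** (a - 1) * (1 - r) ** (b - 1)


def simpson(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


def ccdf_binomial(g, k, l, n):
    """Binomial sum in (25); g is the survival ratio Fbar(y_l) / Fbar(y_k)."""
    return sum(comb(n - k, j) * (1 - g) ** j * g ** (n - k - j)
               for j in range(l - k, n - k + 1))


def ccdf_beta_lower(g, k, l, n):
    """Integral form (25): Beta(l-k, n-l+1) density over (0, 1-g)."""
    return simpson(lambda r: beta_pdf(r, l - k, n - l + 1), 0.0, 1 - g)


def ccdf_beta_upper(g, k, l, n):
    """Integral form (31): Beta(n-l+1, l-k) density over (g, 1)."""
    return simpson(lambda r: beta_pdf(r, n - l + 1, l - k), g, 1.0)


# Hypothetical inputs: exponential survival function with scale theta.
k, l, n = 2, 5, 9
y_k, y_l, theta = 2.0, 3.5, 2.0
g = exp(-(y_l - y_k) / theta)
print(ccdf_binomial(g, k, l, n), ccdf_beta_lower(g, k, l, n), ccdf_beta_upper(g, k, l, n))
```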
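A (γ,β) lower one-sided γ-content tolerance limit $y_l^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(1-\int_{\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}}^{1} f_{n-l+1,l-k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta. \tag{35}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_l^L}\left(E\{\Pr(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right),\ \arg_{y_l^U}\left(E\{\Pr(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_l^L}\left(E\left\{\Pr\left(\int_{\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}}^{1} f_{n-l+1,l-k}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_l^U}\left(E\left\{\Pr\left(\int_{\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}}^{1} f_{n-l+1,l-k}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_l^L,\ y_l^U]. \end{aligned} \tag{36}$$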
This ends the proof.
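$$\int_0^{\frac{n-l+1}{l-k}\left(1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right)\big/\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}} \varphi_{l-k,n-l+1}(r)\,dr = P_\rho(Y_l \le y_l \mid Y_k = y_k; n) = \sum_{j=l-k}^{n-k} \binom{n-k}{j} \left[1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{j} \left[\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{n-k-j}. \tag{37}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_l^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_0^{\frac{n-l+1}{l-k}\left(1-\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}\right)\big/\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}} \varphi_{l-k,n-l+1}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta, \tag{38}$$
where $\bar F_\rho(y) = 1 - F_\rho(y)$,
$$\varphi_{l-k,n-l+1}(r) = \frac{l-k}{(n-l+1)\,\mathrm{B}(l-k,n-l+1)}\, \frac{\left[\frac{l-k}{n-l+1}\,r\right]^{l-k-1}}{\left[1+\frac{l-k}{n-l+1}\,r\right]^{n-k+1}}, \quad r\in(0,\infty), \tag{39}$$
is the probability density function (pdf) of the F distribution denoted here by F(l−k, n−l+1), with positive integer parameters l−k (numerator) and n−l+1 (denominator); in the standard parametrization, (39) is the F density with 2(l−k) numerator and 2(n−l+1) denominator degrees of freedom.
Proof: It follows from (37) that
$$\frac{d}{dy_l}\int_0^{\frac{n-l+1}{l-k}\left(1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right)\big/\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}} \varphi_{l-k,n-l+1}(r)\,dr = \frac{d}{dy_l} P_\rho(Y_l \le y_l \mid Y_k = y_k; n). \tag{40}$$
This ends the proof.
A (γ,β) lower one-sided γ-content tolerance limit $y_l^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(1-\int_0^{\frac{n-l+1}{l-k}\left(1-\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}\right)\big/\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}} \varphi_{l-k,n-l+1}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta. \tag{41}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_l^L}\left(E\{\Pr(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right),\ \arg_{y_l^U}\left(E\{\Pr(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_l^L}\left(E\left\{\Pr\left(\int_0^{\frac{n-l+1}{l-k}\left(1-\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}\right)\big/\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}} \varphi_{l-k,n-l+1}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_l^U}\left(E\left\{\Pr\left(\int_0^{\frac{n-l+1}{l-k}\left(1-\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}\right)\big/\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}} \varphi_{l-k,n-l+1}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_l^L,\ y_l^U]. \end{aligned} \tag{42}$$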
This ends the proof.
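$$\int_{\frac{l-k}{n-l+1}\,\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\big/\left(1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right)}^{\infty} \varphi_{n-l+1,l-k}(r)\,dr = P_\rho(Y_l \le y_l \mid Y_k = y_k; n) = \sum_{j=l-k}^{n-k} \binom{n-k}{j} \left[1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{j} \left[\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right]^{n-k-j}. \tag{43}$$
In the above case, a (γ,β) upper one-sided γ-content tolerance limit $y_l^U$ with confidence level β can be obtained from
$$E\left\{\Pr\left(\int_{\frac{l-k}{n-l+1}\,\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}\big/\left(1-\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}\right)}^{\infty} \varphi_{n-l+1,l-k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta, \tag{44}$$
where $\bar F_\rho(y) = 1 - F_\rho(y)$,
$$\varphi_{n-l+1,l-k}(r) = \frac{n-l+1}{(l-k)\,\mathrm{B}(n-l+1,l-k)}\, \frac{\left[\frac{n-l+1}{l-k}\,r\right]^{n-l}}{\left[1+\frac{n-l+1}{l-k}\,r\right]^{n-k+1}}, \quad r\in(0,\infty), \tag{45}$$
is the probability density function (pdf) of the F distribution denoted here by F(n−l+1, l−k), with positive integer parameters n−l+1 (numerator) and l−k (denominator); in the standard parametrization, (45) is the F density with 2(n−l+1) numerator and 2(l−k) denominator degrees of freedom.
Proof: It follows from (43) that
$$\frac{d}{dy_l}\int_{\frac{l-k}{n-l+1}\,\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\big/\left(1-\frac{\bar F_\rho(y_l)}{\bar F_\rho(y_k)}\right)}^{\infty} \varphi_{n-l+1,l-k}(r)\,dr = \frac{d}{dy_l} P_\rho(Y_l \le y_l \mid Y_k = y_k; n). \tag{46}$$
This ends the proof.
A (γ,β) lower one-sided γ-content tolerance limit $y_l^L$ with confidence level β can be obtained from
$$E\left\{\Pr\left(1-\int_{\frac{l-k}{n-l+1}\,\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}\big/\left(1-\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}\right)}^{\infty} \varphi_{n-l+1,l-k}(r)\,dr \ge \gamma\right)\right\} = E\left\{\Pr\left(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma\right)\right\} = \beta. \tag{47}$$
A (γ,β) two-sided γ-content tolerance interval with confidence level β can be obtained from
$$\begin{aligned} &\left[\arg_{y_l^L}\left(E\{\Pr(P_\rho(Y_l > y_l^L \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right),\ \arg_{y_l^U}\left(E\{\Pr(P_\rho(Y_l \le y_l^U \mid Y_k = y_k; n) \ge \gamma)\} = \beta\right)\right] \\ &= \left[\arg_{y_l^L}\left(E\left\{\Pr\left(\int_{\frac{l-k}{n-l+1}\,\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}\big/\left(1-\frac{\bar F_\rho(y_l^L)}{\bar F_\rho(y_k)}\right)}^{\infty} \varphi_{n-l+1,l-k}(r)\,dr \le 1-\gamma\right)\right\} = \beta\right),\ \arg_{y_l^U}\left(E\left\{\Pr\left(\int_{\frac{l-k}{n-l+1}\,\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}\big/\left(1-\frac{\bar F_\rho(y_l^U)}{\bar F_\rho(y_k)}\right)}^{\infty} \varphi_{n-l+1,l-k}(r)\,dr \ge \gamma\right)\right\} = \beta\right)\right] \\ &= [y_l^L,\ y_l^U]. \end{aligned} \tag{48}$$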
This ends the proof.
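Let $Y = (Y_1 \le \dots \le Y_m)$ be the first m ordered observations (order statistics) in a sample of size h from the two-parameter exponential distribution with the probability density function
$$f_\rho(y) = \vartheta^{-1}\exp\left(-\frac{y-\upsilon}{\vartheta}\right), \quad y \ge \upsilon,\ \vartheta>0,\ \upsilon\ge 0, \tag{49}$$
and the cumulative probability distribution function
$$F_\rho(y) = 1-\exp\left(-\frac{y-\upsilon}{\vartheta}\right), \quad \bar F_\rho(y) = 1-F_\rho(y) = \exp\left(-\frac{y-\upsilon}{\vartheta}\right), \tag{50}$$
where ρ = (υ, ϑ), υ is the shift parameter and ϑ is the scale parameter. It is assumed that these parameters are unknown. In Type II censoring, which is of primary interest here, the number of survivors is fixed and Y is a random variable. In this case, the likelihood function is given by
$$\begin{aligned} L(\upsilon,\vartheta) &= \prod_{i=1}^{m} f_\rho(y_i)\,\left(\bar F_\rho(y_m)\right)^{h-m} \\ &= \frac{1}{\vartheta^{m}}\exp\left(-\left[\sum_{i=1}^{m}(y_i-\upsilon)+(h-m)(y_m-\upsilon)\right]\Big/\vartheta\right) \\ &= \frac{1}{\vartheta^{m}}\exp\left(-\left[\sum_{i=1}^{m}(y_i-y_1+y_1-\upsilon)+(h-m)(y_m-y_1+y_1-\upsilon)\right]\Big/\vartheta\right) \\ &= \frac{1}{\vartheta^{m-1}}\exp\left(-\left[\sum_{i=1}^{m}(y_i-y_1)+(h-m)(y_m-y_1)\right]\Big/\vartheta\right) \times \frac{1}{\vartheta}\exp\left(-\frac{h(y_1-\upsilon)}{\vartheta}\right) \\ &= \frac{1}{\vartheta^{m-1}}\exp\left(-\frac{s_m}{\vartheta}\right) \times \frac{1}{\vartheta}\exp\left(-\frac{h(s_1-\upsilon)}{\vartheta}\right), \end{aligned} \tag{51}$$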
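where
$$S = \left(S_1 = Y_1,\ S_m = \sum_{i=1}^{m}(Y_i-Y_1)+(h-m)(Y_m-Y_1)\right) \tag{52}$$
is the complete sufficient statistic for ρ. The probability density function of $S = (S_1, S_m)$ is given by
$$\begin{aligned} f_\rho(s_1,s_m) &= \frac{\dfrac{1}{\vartheta^{m-1}}\exp\left(-\dfrac{s_m}{\vartheta}\right) \times \dfrac{1}{\vartheta}\exp\left(-\dfrac{h(s_1-\upsilon)}{\vartheta}\right)}{\dfrac{1}{s_m^{m-2}}\displaystyle\int_0^{\infty}\frac{s_m^{m-2}}{\vartheta^{m-1}}\exp\left(-\frac{s_m}{\vartheta}\right)ds_m \times \dfrac{1}{h}\displaystyle\int_{\upsilon}^{\infty}\frac{h}{\vartheta}\exp\left(-\frac{h(s_1-\upsilon)}{\vartheta}\right)ds_1} \\ &= \frac{\dfrac{1}{\vartheta^{m-1}}\exp\left(-\dfrac{s_m}{\vartheta}\right) \times \dfrac{1}{\vartheta}\exp\left(-\dfrac{h(s_1-\upsilon)}{\vartheta}\right)}{\dfrac{\Gamma(m-1)}{s_m^{m-2}} \times \dfrac{1}{h}} \\ &= \frac{1}{\Gamma(m-1)\vartheta^{m-1}}\,s_m^{m-2}\exp\left(-\frac{s_m}{\vartheta}\right) \times \frac{h}{\vartheta}\exp\left(-\frac{h(s_1-\upsilon)}{\vartheta}\right) = f_\vartheta(s_m)\,f_\rho(s_1), \end{aligned} \tag{53}$$
where
$$f_\rho(s_1) = \frac{h}{\vartheta}\exp\left(-\frac{h(s_1-\upsilon)}{\vartheta}\right), \quad s_1 \ge \upsilon, \tag{54}$$
$$f_\vartheta(s_m) = \frac{1}{\Gamma(m-1)\vartheta^{m-1}}\,s_m^{m-2}\exp\left(-\frac{s_m}{\vartheta}\right), \quad s_m \ge 0. \tag{55}$$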
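$$V_1 = \frac{S_1-\upsilon}{\vartheta} \tag{56}$$
is the pivotal quantity, the probability density function of which is given by
$$f_1(v_1) = h\exp(-hv_1), \quad v_1 \ge 0, \tag{57}$$
$$V_m = \frac{S_m}{\vartheta} \tag{58}$$
is the pivotal quantity, the probability density function of which is given by
$$f_m(v_m) = \frac{1}{\Gamma(m-1)}\,v_m^{m-2}\exp(-v_m), \quad v_m \ge 0. \tag{59}$$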
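The pivotal property of $V_1$ and $V_m$ can be illustrated by simulation. The sketch below is a Monte Carlo check under assumed settings (h = 10, m = 8, two hypothetical parameter pairs, a fixed seed): if (57) and (59) hold, the sample means of $V_1$ and $V_m$ should be close to 1/h and m−1 regardless of υ and ϑ. The function names are illustrative.

```python
import random


def censored_stats(h, m, shift, scale, rng):
    """Draw a Type II censored sample (first m of h) and return (S1, Sm) as in (52)."""
    draws = sorted(shift + rng.expovariate(1.0 / scale) for _ in range(h))
    y = draws[:m]
    s1 = y[0]
    sm = sum(v - s1 for v in y) + (h - m) * (y[-1] - s1)
    return s1, sm


def mean_pivots(shift, scale, h=10, m=8, reps=20000, seed=1):
    """Monte Carlo means of V1 = (S1 - shift)/scale and Vm = Sm/scale."""
    rng = random.Random(seed)
    v1_sum = 0.0
    vm_sum = 0.0
    for _ in range(reps):
        s1, sm = censored_stats(h, m, shift, scale, rng)
        v1_sum += (s1 - shift) / scale
        vm_sum += sm / scale
    return v1_sum / reps, vm_sum / reps


# Two hypothetical parameter settings; the means should be close to
# 1/h = 0.1 and m - 1 = 7 in both cases.
for shift, scale in [(0.0, 1.0), (5.0, 3.0)]:
    print(shift, scale, mean_pivots(shift, scale))
```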
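Theorem 3: Let $Y_1 \le \dots \le Y_m$ be the first m ordered observations from a preliminary sample of size h from the two-parameter exponential distribution defined by the probability density function (49). Then the upper one-sided γ-content tolerance limit (with confidence level β) $y_k^U$ on the kth order statistic $Y_k$ from a set of n future ordered observations $Y_1 \le \dots \le Y_n$, also from the distribution (49), which satisfies
$$E\left\{\Pr\left(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma\right)\right\} = \beta, \tag{60}$$
is given by
$$y_k^U = \begin{cases} S_1 + \dfrac{S_m}{h}\left[1-\left(\dfrac{\Omega_\gamma^{h}}{\beta}\right)^{\frac{1}{m-1}}\right], & \text{if } \left(\dfrac{\Omega_\gamma^{h}}{\beta}\right)^{\frac{1}{m-1}} \le 1, \\[2ex] S_1 + \dfrac{S_m}{h}\left[\left(\dfrac{\Omega_\gamma^{h}}{\beta}\right)^{\frac{1}{m-1}}-1\right], & \text{if } \left(\dfrac{\Omega_\gamma^{h}}{\beta}\right)^{\frac{1}{m-1}} > 1, \end{cases} \tag{61}$$
where
$$\Omega_\gamma = 1-q_{(k,n-k+1),\gamma}, \tag{62}$$
and $q_{(k,n-k+1),\gamma}$ is the γ quantile of Beta(k, n−k+1).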
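Proof: It follows from (2) and (3) that
$$\begin{aligned} &E\left\{\Pr\left(P_\rho(Y_k \le y_k^U \mid n) \ge \gamma\right)\right\} \\ &= E\left\{\Pr\left(\int_0^{F_\rho(y_k^U)} f_{k,n-k+1}(r)\,dr \ge \gamma\right)\right\} \\ &= E\left\{\Pr\left(1-\exp\left(-\frac{y_k^U-\upsilon}{\vartheta}\right) \ge q_{(k,n-k+1),\gamma}\right)\right\} \\ &= E\left\{\Pr\left(\exp\left(-\frac{y_k^U-\upsilon}{\vartheta}\right) \le 1-q_{(k,n-k+1),\gamma}\right)\right\} \\ &= E\left\{\Pr\left(-\frac{y_k^U-\upsilon}{\vartheta} \le \ln\left(1-q_{(k,n-k+1),\gamma}\right)\right)\right\} = E\left\{\Pr\left(\frac{y_k^U-\upsilon}{\vartheta} \ge -\ln\left(1-q_{(k,n-k+1),\gamma}\right)\right)\right\} \\ &= E\left\{\Pr\left(\frac{y_k^U-S_1}{S_m}\,\frac{S_m}{\vartheta}+\frac{S_1-\upsilon}{\vartheta} \ge -\ln\left(1-q_{(k,n-k+1),\gamma}\right)\right)\right\} \\ &= E\left\{\Pr\left(\frac{S_1-\upsilon}{\vartheta} \ge -\frac{y_k^U-S_1}{S_m}\,\frac{S_m}{\vartheta}-\ln\left(1-q_{(k,n-k+1),\gamma}\right)\right)\right\} \\ &= E\left\{\Pr\left(V_1 \ge -\eta_k^U V_m-\ln\Omega_\gamma\right)\right\} = E\left\{1-\Pr\left(V_1 \le -\eta_k^U V_m-\ln\Omega_\gamma\right)\right\} = E\left\{1-\int_0^{-\eta_k^U V_m-\ln\Omega_\gamma} f_1(v_1)\,dv_1\right\}, \end{aligned} \tag{63}$$
where
$$\eta_k^U = \frac{y_k^U-S_1}{S_m}. \tag{64}$$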
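It follows from (57) and (63) that
$$\begin{aligned} &E\left\{1-\int_0^{-\eta_k^U V_m-\ln\Omega_\gamma} f_1(v_1)\,dv_1\right\} \\ &= E\left\{1-\int_0^{-\eta_k^U V_m-\ln\Omega_\gamma} h\exp(-hv_1)\,dv_1\right\} \\ &= E\left\{1-\left[1-\exp\left(-h\left[-\eta_k^U V_m-\ln\Omega_\gamma\right]\right)\right]\right\} \\ &= E\left\{\exp\left(h\eta_k^U V_m\right)\exp\left(\ln\Omega_\gamma^{h}\right)\right\} \\ &= E\left\{\Omega_\gamma^{h}\exp\left(h\eta_k^U V_m\right)\right\} \\ &= \int_0^{\infty}\Omega_\gamma^{h}\exp\left(h\eta_k^U v_m\right) f_m(v_m)\,dv_m \\ &= \int_0^{\infty}\Omega_\gamma^{h}\exp\left(h\eta_k^U v_m\right)\frac{1}{\Gamma(m-1)}\,v_m^{m-2}\exp(-v_m)\,dv_m = \Omega_\gamma^{h}\int_0^{\infty}\frac{1}{\Gamma(m-1)}\,v_m^{m-2}\exp\left(-v_m\left[1-h\eta_k^U\right]\right)dv_m \\ &= \frac{\Omega_\gamma^{h}}{\left[1-h\eta_k^U\right]^{m-1}} = \beta. \end{aligned} \tag{65}$$
It follows from (64) and (65) that
$$\eta_k^U = \frac{y_k^U-S_1}{S_m} = \frac{1}{h}\left(1-\left[\frac{\Omega_\gamma^{h}}{\beta}\right]^{\frac{1}{m-1}}\right). \tag{66}$$
It follows from (66) that
$$y_k^U = S_1+\frac{S_m}{h}\left(1-\left[\frac{\Omega_\gamma^{h}}{\beta}\right]^{\frac{1}{m-1}}\right). \tag{67}$$
Then (61) follows from (67). This ends the proof.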
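The averaging step in (65) rests on the gamma integral $E\{\exp(h\eta V_m)\} = [1-h\eta]^{-(m-1)}$ for $h\eta < 1$. The sketch below checks this step numerically with a composite Simpson rule on a truncated range; the truncation point 200, the step count, and the inputs (Ω = 0.390862, h = 10, m = 8, β = 0.95, matching the orders of magnitude used in this section) are assumptions made for illustration.

```python
from math import exp, gamma


def simpson(func, lo, hi, steps):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


def pivot_pdf(v, m):
    """Density (59) of the pivotal quantity Vm."""
    return v ** (m - 2) * exp(-v) / gamma(m - 1)


def averaged_probability(eta, omega, h, m, upper=200.0, steps=20000):
    """Numerical value of E{omega^h * exp(h*eta*Vm)}, the integral in (65)."""
    return simpson(lambda v: omega ** h * exp(h * eta * v) * pivot_pdf(v, m),
                   0.0, upper, steps)


def closed_form(eta, omega, h, m):
    """Last line of (65): omega^h / (1 - h*eta)^(m-1)."""
    return omega ** h / (1 - h * eta) ** (m - 1)


omega, h, m, beta_level = 0.390862, 10, 8, 0.95
eta = (1 - (omega ** h / beta_level) ** (1 / (m - 1))) / h  # formula (66)
print(averaged_probability(eta, omega, h, m), closed_form(eta, omega, h, m))
```

With η taken from (66), both printed values should be close to β, which is the defining condition (60).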
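Theorem 4: Let $Y_1 \le \dots \le Y_m$ be the first m ordered observations from a preliminary sample of size h from the two-parameter exponential distribution defined by the probability density function (49). Then the lower one-sided γ-content tolerance limit (with confidence level β) $y_k^L$ on the kth order statistic $Y_k$ from a set of n future ordered observations $Y_1 \le \dots \le Y_n$, also from the distribution (49), which satisfies
$$E\left\{\Pr\left(P_\rho(Y_k > y_k^L \mid n) \ge \gamma\right)\right\} = \beta, \tag{68}$$
is given by
$$y_k^L = \begin{cases} S_1 + \dfrac{S_m}{h}\left[1-\left(\dfrac{\Omega_{1-\gamma}^{h}}{1-\beta}\right)^{\frac{1}{m-1}}\right], & \text{if } \left(\dfrac{\Omega_{1-\gamma}^{h}}{1-\beta}\right)^{\frac{1}{m-1}} \le 1, \\[2ex] S_1 + \dfrac{S_m}{h}\left[\left(\dfrac{\Omega_{1-\gamma}^{h}}{1-\beta}\right)^{\frac{1}{m-1}}-1\right], & \text{if } \left(\dfrac{\Omega_{1-\gamma}^{h}}{1-\beta}\right)^{\frac{1}{m-1}} > 1, \end{cases} \tag{69}$$
where
$$\Omega_{1-\gamma} = 1-q_{(k,n-k+1),1-\gamma}, \tag{70}$$
and $q_{(k,n-k+1),1-\gamma}$ is the (1−γ) quantile of Beta(k, n−k+1).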
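Proof: It follows from (3) and (5) that
$$\begin{aligned} &E\left\{\Pr\left(P_\rho(Y_k > y_k^L \mid n) \ge \gamma\right)\right\} \\ &= E\left\{\Pr\left(\int_0^{F_\rho(y_k^L)} f_{k,n-k+1}(r)\,dr \le 1-\gamma\right)\right\} \\ &= E\left\{\Pr\left(\exp\left(-\frac{y_k^L-\upsilon}{\vartheta}\right) \ge 1-q_{(k,n-k+1),1-\gamma}\right)\right\} \\ &= E\left\{\Pr\left(\frac{y_k^L-S_1}{S_m}\,\frac{S_m}{\vartheta}+\frac{S_1-\upsilon}{\vartheta} \le -\ln\left(1-q_{(k,n-k+1),1-\gamma}\right)\right)\right\} \\ &= E\left\{\Pr\left(\frac{S_1-\upsilon}{\vartheta} \le -\frac{y_k^L-S_1}{S_m}\,\frac{S_m}{\vartheta}-\ln\left(1-q_{(k,n-k+1),1-\gamma}\right)\right)\right\} \\ &= E\left\{\Pr\left(V_1 \le -\eta_k^L V_m-\ln\Omega_{1-\gamma}\right)\right\} = E\left\{\int_0^{-\eta_k^L V_m-\ln\Omega_{1-\gamma}} f_1(v_1)\,dv_1\right\}, \end{aligned} \tag{71}$$
where
$$\eta_k^L = \frac{y_k^L-S_1}{S_m}. \tag{72}$$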
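It follows from (57) and (71) that
$$\begin{aligned} &E\left\{\int_0^{-\eta_k^L V_m-\ln\Omega_{1-\gamma}} f_1(v_1)\,dv_1\right\} = E\left\{\int_0^{-\eta_k^L V_m-\ln\Omega_{1-\gamma}} h\exp(-hv_1)\,dv_1\right\} \\ &= E\left\{1-\exp\left(-h\left[-\eta_k^L V_m-\ln\Omega_{1-\gamma}\right]\right)\right\} \\ &= E\left\{1-\exp\left(h\eta_k^L V_m\right)\exp\left(h\ln\Omega_{1-\gamma}\right)\right\} \\ &= E\left\{1-\Omega_{1-\gamma}^{h}\exp\left(h\eta_k^L V_m\right)\right\} \\ &= \int_0^{\infty}\left(1-\Omega_{1-\gamma}^{h}\exp\left(h\eta_k^L v_m\right)\right) f_m(v_m)\,dv_m \\ &= \int_0^{\infty}\left(1-\Omega_{1-\gamma}^{h}\exp\left(h\eta_k^L v_m\right)\right)\frac{1}{\Gamma(m-1)}\,v_m^{m-2}\exp(-v_m)\,dv_m = 1-\Omega_{1-\gamma}^{h}\int_0^{\infty}\frac{1}{\Gamma(m-1)}\,v_m^{m-2}\exp\left(-v_m\left[1-h\eta_k^L\right]\right)dv_m \\ &= 1-\frac{\Omega_{1-\gamma}^{h}}{\left[1-h\eta_k^L\right]^{m-1}} = \beta. \end{aligned} \tag{73}$$
It follows from (72) and (73) that
$$\eta_k^L = \frac{y_k^L-S_1}{S_m} = \frac{1}{h}\left(1-\left[\frac{\Omega_{1-\gamma}^{h}}{1-\beta}\right]^{\frac{1}{m-1}}\right). \tag{74}$$
It follows from (74) that
$$y_k^L = S_1+\frac{S_m}{h}\left(1-\left[\frac{\Omega_{1-\gamma}^{h}}{1-\beta}\right]^{\frac{1}{m-1}}\right). \tag{75}$$
Then (69) follows from (75). This ends the proof.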
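Let us assume that k = 5, m = 8, h = 10, n = 12, γ = β = 0.95, and
$$\begin{aligned} S &= \left(S_1 = Y_1 = 9,\ S_m = \sum_{i=1}^{m}(Y_i-Y_1)+(h-m)(Y_m-Y_1)\right) \\ &= \left(S_1 = 9,\ S_m = 0+1+2+4+6+10+15+23+(10-8)\cdot 23 = 107\right). \end{aligned} \tag{76}$$
Then the (γ = 0.95, β = 0.95) upper one-sided γ-content tolerance limit $y_k^U$ with confidence level β can be obtained from (61), where the γ quantile of Beta(k, n−k+1) is given by
$$q_{(k,n-k+1),\gamma} = 0.609138, \tag{77}$$
$$\Omega_\gamma = 1-q_{(k,n-k+1),\gamma} = 1-0.609138 = 0.390862. \tag{78}$$
It follows from (61), (76) and (78) that
$$y_k^U = S_1+\frac{S_m}{h}\left[1-\left(\frac{\Omega_\gamma^{h}}{\beta}\right)^{\frac{1}{m-1}}\right] = 9+\frac{107}{10}\left[1-\left(\frac{[0.390862]^{10}}{0.95}\right)^{\frac{1}{8-1}}\right] = 9+7.883285 = 16.883285. \tag{79}$$
The (γ = 0.95, β = 0.95) lower one-sided γ-content tolerance limit $y_k^L$ with confidence level β can be obtained from (69), where the (1−γ) quantile of Beta(k, n−k+1) is given by
$$q_{(k,n-k+1),1-\gamma} = 0.181025, \tag{80}$$
$$\Omega_{1-\gamma} = 1-q_{(k,n-k+1),1-\gamma} = 1-0.181025 = 0.818975. \tag{81}$$
It follows from (69), (76) and (81) that
$$y_k^L = S_1+\frac{S_m}{h}\left[\left(\frac{\Omega_{1-\gamma}^{h}}{1-\beta}\right)^{\frac{1}{m-1}}-1\right] = 9+\frac{107}{10}\left[\left(\frac{[0.818975]^{10}}{1-0.95}\right)^{\frac{1}{8-1}}-1\right] = 9+\frac{107}{10}\left[1.15335326-1\right] = 10.64088. \tag{82}$$
The (γ = 0.95, β = 0.95) two-sided γ-content tolerance interval with confidence level β can be obtained by using (6), (79) and (82):
$$[y_k^L,\ y_k^U] = [10.64088,\ 16.883285]. \tag{83}$$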
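The numerical example can be retraced in a few lines. The sketch below is one possible implementation, not the authors' code: it obtains the beta quantiles by bisection on the binomial form of the order-statistic cdf (so no tables or external libraries are needed) and then evaluates the piecewise expressions (61) and (69). The observed sample is reconstructed here from $S_1 = 9$ and the differences listed in (76); the function names are illustrative.

```python
from math import comb


def order_stat_cdf(p, k, n):
    """Binomial form of the Beta(k, n-k+1) cdf, see identity (1)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


def beta_quantile(prob, k, n, tol=1e-12):
    """Quantile of Beta(k, n-k+1) by bisection on the cdf above."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if order_stat_cdf(mid, k, n) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def tolerance_limit(s1, sm, h, m, omega, level):
    """Piecewise expression shared by (61) and (69)."""
    ratio = (omega ** h / level) ** (1.0 / (m - 1))
    step = 1 - ratio if ratio <= 1 else ratio - 1
    return s1 + sm / h * step


# Sample reconstructed from S1 = 9 and the differences in (76).
y = [9, 10, 11, 13, 15, 19, 24, 32]
k, m, h, n = 5, 8, 10, 12
content, confidence = 0.95, 0.95

s1 = y[0]
sm = sum(v - s1 for v in y) + (h - m) * (y[-1] - s1)

omega_upper = 1 - beta_quantile(content, k, n)       # Omega_gamma, (62)
omega_lower = 1 - beta_quantile(1 - content, k, n)   # Omega_{1-gamma}, (70)
y_upper = tolerance_limit(s1, sm, h, m, omega_upper, confidence)
y_lower = tolerance_limit(s1, sm, h, m, omega_lower, 1 - confidence)
print(sm, y_lower, y_upper)
```

The printed limits should agree with (79) and (82) up to rounding of the quantiles.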
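Theorem 5: If $W_1 \sim N(0,1)$ and $W_2 \sim \chi^2(\upsilon)$ are independent random variables, then
$$\frac{W_1}{\sqrt{W_2/\upsilon}} = T(\upsilon), \tag{84}$$
where T(υ) follows the Student's t distribution with υ degrees of freedom,
$$T(\upsilon) \sim f(t) = \frac{\Gamma((\upsilon+1)/2)}{\sqrt{\pi\upsilon}\,\Gamma(\upsilon/2)}\left[1+\frac{t^2}{\upsilon}\right]^{-(\upsilon+1)/2}, \quad -\infty<t<\infty. \tag{85}$$
Proof.
$$w_1 \sim f_1(w_1) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{w_1^2}{2}\right), \quad -\infty<w_1<\infty, \tag{86}$$
where
$$w_1 = t\left[\frac{w_2}{\upsilon}\right]^{1/2}, \quad dw_1 = \left[\frac{w_2}{\upsilon}\right]^{1/2}dt. \tag{87}$$
It follows from (86) and (87) that
$$\begin{aligned} f_1(w_1)\,dw_1 &= \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{w_1^2}{2}\right)dw_1 \\ &= \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{t^2[w_2/\upsilon]}{2}\right)\left[\frac{w_2}{\upsilon}\right]^{1/2}dt = f(t \mid w_2)\,dt, \quad -\infty<t<\infty. \end{aligned} \tag{88}$$
$$w_2 \sim f_2(w_2) = \frac{1}{\Gamma(\upsilon/2)\,2^{\upsilon/2}}\,w_2^{(\upsilon/2)-1}\exp\left(-\frac{w_2}{2}\right), \quad 0<w_2<\infty. \tag{89}$$
It follows from (88) and (89) that
$$\begin{aligned} f(t) &= \int_0^{\infty} f(t \mid w_2)\,f_2(w_2)\,dw_2 \\ &= \int_0^{\infty}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{t^2[w_2/\upsilon]}{2}\right)\left[\frac{w_2}{\upsilon}\right]^{1/2}\frac{1}{\Gamma(\upsilon/2)\,2^{\upsilon/2}}\,w_2^{(\upsilon/2)-1}\exp\left(-\frac{w_2}{2}\right)dw_2 \\ &= \int_0^{\infty}\frac{1}{\sqrt{\pi\upsilon}\,\Gamma(\upsilon/2)\,2^{(\upsilon+1)/2}}\,w_2^{((\upsilon+1)/2)-1}\exp\left(-\frac{w_2}{2}\left[1+\frac{t^2}{\upsilon}\right]\right)dw_2 \\ &= \frac{\Gamma((\upsilon+1)/2)}{\sqrt{\pi\upsilon}\,\Gamma(\upsilon/2)}\left[1+\frac{t^2}{\upsilon}\right]^{-(\upsilon+1)/2}, \quad -\infty<t<\infty. \end{aligned} \tag{90}$$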
This ends the proof.
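The mixture integral in (90) can be compared with the closed form (85) at a few points. The following sketch uses a composite Simpson rule on a truncated range; the truncation point 200, the step count, and the test points (t = 1.3 with υ = 5, t = −0.4 with υ = 8) are assumptions chosen for illustration.

```python
from math import exp, gamma, pi, sqrt


def simpson(func, lo, hi, steps):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


def t_pdf(t, nu):
    """Closed form (85)."""
    return (gamma((nu + 1) / 2) / (sqrt(pi * nu) * gamma(nu / 2))
            * (1 + t * t / nu) ** (-(nu + 1) / 2))


def conditional_pdf(t, w2, nu):
    """f(t | w2) from (88)."""
    return exp(-t * t * (w2 / nu) / 2) * sqrt(w2 / nu) / sqrt(2 * pi)


def chi2_pdf(w2, nu):
    """Chi-square density (89)."""
    return w2 ** (nu / 2 - 1) * exp(-w2 / 2) / (gamma(nu / 2) * 2 ** (nu / 2))


def mixture_pdf(t, nu, upper=200.0, steps=20000):
    """First line of (90): integrate f(t | w2) * f2(w2) over w2."""
    return simpson(lambda w: conditional_pdf(t, w, nu) * chi2_pdf(w, nu),
                   0.0, upper, steps)


for t, nu in [(1.3, 5), (-0.4, 8)]:
    print(t, nu, mixture_pdf(t, nu), t_pdf(t, nu))
```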
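In most applications, two populations are compared using the difference in the means. Let $U_1, U_2, \dots, U_m$ be a sample of size m from a normal population having mean $\mu_m$ and variance $\sigma_m^2$, let $Z_1, \dots, Z_n$ be a sample of size n from a different normal population having mean $\mu_n$ and variance $\sigma_n^2$, and suppose that the two samples are independent of each other. We are interested in constructing a confidence interval for $\mu_m-\mu_n$. To obtain this confidence interval, we need the distribution of $\bar U_m-\bar Z_n$, where
$$\bar U_m = \sum_{i=1}^{m}U_i/m \sim N(\mu_m,\sigma_m^2/m), \quad \bar Z_n = \sum_{i=1}^{n}Z_i/n \sim N(\mu_n,\sigma_n^2/n). \tag{91}$$
It follows from (91) that
$$\bar U_m-\bar Z_n \sim N\left(\mu_m-\mu_n,\ \frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}\right). \tag{92}$$
It follows from (92) that
$$\frac{\bar U_m-\bar Z_n-(\mu_m-\mu_n)}{\sqrt{\sigma_m^2/m+\sigma_n^2/n}} = W_1 \sim N(0,1). \tag{93}$$
This is independent of
$$\sum_{i=1}^{m}(U_i-\bar U_m)^2/\sigma_m^2 = \frac{(m-1)}{\sigma_m^2}\,\frac{\sum_{i=1}^{m}(U_i-\bar U_m)^2}{(m-1)} = \frac{(m-1)S_m^2}{\sigma_m^2} \sim \chi^2_{m-1} \tag{94}$$
and
$$\sum_{i=1}^{n}(Z_i-\bar Z_n)^2/\sigma_n^2 = \frac{(n-1)}{\sigma_n^2}\,\frac{\sum_{i=1}^{n}(Z_i-\bar Z_n)^2}{(n-1)} = \frac{(n-1)S_n^2}{\sigma_n^2} \sim \chi^2_{n-1}, \tag{95}$$
where
$$\frac{(m-1)S_m^2}{\sigma_m^2}+\frac{(n-1)S_n^2}{\sigma_n^2} = W_2 \sim \chi^2(m+n-2). \tag{96}$$
Taking (84), (93) and (96) into account, we have that
$$\begin{aligned} \frac{W_1}{\sqrt{W_2/(m+n-2)}} &= \frac{\bar U_m-\bar Z_n-(\mu_m-\mu_n)}{\sqrt{\sigma_m^2/m+\sigma_n^2/n}} \Bigg/ \sqrt{\left[\frac{(m-1)S_m^2}{\sigma_m^2}+\frac{(n-1)S_n^2}{\sigma_n^2}\right]\Big/(m+n-2)} \\ &= \frac{\left[\bar U_m-\bar Z_n-(\mu_m-\mu_n)\right]\sqrt{m+n-2}}{\sqrt{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}\,\sqrt{\sigma_m^2/m+\sigma_n^2/n}} = T(m+n-2) \sim f(t), \end{aligned} \tag{97}$$
where T(m+n−2) is a t-random variable with m+n−2 degrees of freedom,
$$f(t) = \frac{\Gamma((m+n-1)/2)}{\sqrt{\pi(m+n-2)}\,\Gamma((m+n-2)/2)}\left[1+\frac{t^2}{m+n-2}\right]^{-(m+n-1)/2}, \quad -\infty<t<\infty. \tag{98}$$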
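Using (97) and (98), a 100(1−α)% confidence interval for $\bar U_m-\bar Z_n-(\mu_m-\mu_n)$ can be obtained from
$$\begin{aligned} &P\left(t_1 \le T(m+n-2) \le t_2\right) \\ &= P\left(t_1 \le \frac{\left[\bar U_m-\bar Z_n-(\mu_m-\mu_n)\right]\sqrt{m+n-2}}{\sqrt{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}\,\sqrt{\sigma_m^2/m+\sigma_n^2/n}} \le t_2\right) \\ &= P\left(t_1\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}} \le \bar U_m-\bar Z_n-(\mu_m-\mu_n) \le t_2\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}}\right) = 1-\alpha \end{aligned} \tag{99}$$
by suitably choosing the decision variables $t_1$ and $t_2$. Hence, the statistical confidence interval for $\bar U_m-\bar Z_n-(\mu_m-\mu_n)$ is given by
$$\left[t_1\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}},\ t_2\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}}\right]. \tag{100}$$
The length of the statistical confidence interval for $\bar U_m-\bar Z_n-(\mu_m-\mu_n)$ is given by
$$L\left(t_1,t_2 \,\Big|\, \sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}}\right) = (t_2-t_1)\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\sigma_n^2}{n}}. \tag{101}$$
In order to find the shortest-length confidence interval for $\bar U_m-\bar Z_n-(\mu_m-\mu_n)$, we should find a pair of decision variables $t_1$ and $t_2$ such that (101) is minimized.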
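It follows from (98) and (99) that
$$\int_{t_1}^{t_2} f(t)\,dt = \int_{-\infty}^{t_2} f(t)\,dt-\int_{-\infty}^{t_1} f(t)\,dt = (1-\alpha+p)-p = 1-\alpha, \tag{102}$$
where p (0 ≤ p ≤ α) is a decision variable,
$$\int_{-\infty}^{t_2} f(t)\,dt = 1-\alpha+p \tag{103}$$
and
$$\int_{-\infty}^{t_1} f(t)\,dt = p. \tag{104}$$
Then $t_2$ represents the (1−α+p)-quantile, which is given by
$$t_2 = q_{1-\alpha+p;\,t(m+n-2)}, \tag{105}$$
and $t_1$ represents the p-quantile, which is given by
$$t_1 = q_{p;\,t(m+n-2)}. \tag{106}$$
The shortest-length confidence interval for $\bar U_m-\bar Z_n-(\mu_m-\mu_n)$ can be found as follows:
Minimize
$$(t_2-t_1)^2 = \left(q_{1-\alpha+p;\,t(m+n-2)}-q_{p;\,t(m+n-2)}\right)^2 \tag{107}$$
subject to
$$0 \le p \le \alpha. \tag{108}$$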
The optimal numerical solution minimizing $(t_2-t_1)^2$ can be obtained using the standard "Solver" add-in of Excel 2016. If $\sigma_m^2 = \sigma_n^2$, it follows from (101) that
$$L\left(t_1,t_2 \,\Big|\, \sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{m+n}{mn}}\right) = (t_2-t_1)\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{m+n}{mn}}. \tag{109}$$
If, for example, m = 58, n = 27, α = 0.05, $\bar U_m = 70.7$, $\bar Z_n = 76.13$, $S_m^2 = (1.8)^2$, $S_n^2 = (2.42)^2$, then the optimal numerical solution of (107) is given by
$$p = 0.025, \quad t_1 = q_{p;\,t(m+n-2)} = -1.98896, \quad t_2 = q_{1-\alpha+p;\,t(m+n-2)} = 1.98896 \tag{110}$$
and it follows from (99) and (109) that the 100(1−α)% confidence interval of shortest length (or equal tails) for $\mu_m-\mu_n$ is given by
$$(\mu_m-\mu_n) \in \left((\bar U_m-\bar Z_n)-t_2\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{m+n}{mn}},\ (\bar U_m-\bar Z_n)-t_1\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{m+n}{mn}}\right) = (-6.330947,\ -4.52905) \tag{111}$$
or
$$-6.330947 \le \mu_m-\mu_n \le -4.52905. \tag{112}$$
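The minimization (107)-(108) can also be carried out without a spreadsheet solver. The sketch below is one possible substitute, not the procedure used in the paper: it computes t quantiles by bisection on a Simpson-integrated cdf and searches over p with a ternary search, which assumes the interval length is unimodal in p. The search bounds, tolerances, and function names are illustrative assumptions.

```python
from math import gamma, pi, sqrt


def make_t_pdf(nu):
    """Return the Student t density (98) for nu degrees of freedom."""
    const = gamma((nu + 1) / 2) / (sqrt(pi * nu) * gamma(nu / 2))
    return lambda t: const * (1 + t * t / nu) ** (-(nu + 1) / 2)


def t_cdf(t, pdf, steps=200):
    """CDF as 0.5 plus the Simpson integral of the density from 0 to t."""
    width = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * width)
    return 0.5 + total * width / 3


def t_quantile(prob, pdf, tol=1e-7):
    """Quantile by bisection on the cdf over an assumed bracket [-20, 20]."""
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, pdf) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def shortest_interval(alpha, nu, eps=1e-4, rounds=30):
    """Ternary search over p in [eps, alpha - eps] for the minimum of t2 - t1."""
    pdf = make_t_pdf(nu)

    def length(p):
        return t_quantile(1 - alpha + p, pdf) - t_quantile(p, pdf)

    lo, hi = eps, alpha - eps
    for _ in range(rounds):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if length(a) < length(b):
            hi = b
        else:
            lo = a
    p = (lo + hi) / 2
    return p, t_quantile(p, pdf), t_quantile(1 - alpha + p, pdf)


# Degrees of freedom m + n - 2 with m = 58, n = 27 and alpha = 0.05.
p_opt, t1, t2 = shortest_interval(0.05, 58 + 27 - 2)
print(p_opt, t1, t2)
```

Because the t density is symmetric and unimodal, the search should settle near p = α/2, which corresponds to the equal-tails solution reported in (110).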
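The ratio of the means is used to compare two populations of positive data. Let $U_1, U_2, \dots, U_m$ be a sample of size m from a normal population having mean $\mu_m$ and variance $\sigma_m^2$, let $U_1, \dots, U_n$ be a sample of size n from a different normal population having mean $\mu_n$ and variance $\sigma_n^2$, and suppose that the two samples are independent of each other. We are interested in constructing a confidence interval for the ratio $\kappa = \mu_m/\mu_n$ of the means of two different normal populations. To obtain this confidence interval, we need the distribution of $\bar U_m-\kappa\bar U_n$, where
$$\bar U_m = \sum_{i=1}^{m}U_i/m \sim N(\mu_m,\sigma_m^2/m), \quad \bar U_n = \sum_{i=1}^{n}U_i/n \sim N(\mu_n,\sigma_n^2/n). \tag{113}$$
It can be shown that
$$\bar U_m-\kappa\bar U_n \sim N\left(\mu_m-\kappa\mu_n,\ \frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}\right) \tag{114}$$
or
$$\frac{\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)}{\sqrt{\dfrac{\sigma_m^2}{m}+\dfrac{\kappa^2\sigma_n^2}{n}}} = W_1 \sim N(0,1). \tag{115}$$
This is independent of
$$\sum_{i=1}^{m}(U_i-\bar U_m)^2/\sigma_m^2 = \frac{(m-1)}{\sigma_m^2}\,\frac{\sum_{i=1}^{m}(U_i-\bar U_m)^2}{(m-1)} = \frac{(m-1)S_m^2}{\sigma_m^2} \sim \chi^2_{m-1} \tag{116}$$
and
$$\sum_{j=1}^{n}(U_j-\bar U_n)^2/\sigma_n^2 = \frac{(n-1)}{\sigma_n^2}\,\frac{\sum_{j=1}^{n}(U_j-\bar U_n)^2}{(n-1)} = \frac{(n-1)S_n^2}{\sigma_n^2} \sim \chi^2_{n-1}, \tag{117}$$
where
$$\frac{(m-1)S_m^2}{\sigma_m^2}+\frac{(n-1)S_n^2}{\sigma_n^2} = W_2 \sim \chi^2(m+n-2). \tag{118}$$
It follows from (84), (115) and (118) that
$$\begin{aligned} \frac{W_1}{\sqrt{W_2/(m+n-2)}} &= \frac{\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)}{\sqrt{\dfrac{\sigma_m^2}{m}+\dfrac{\kappa^2\sigma_n^2}{n}}} \cdot \frac{1}{\sqrt{\left[\dfrac{(m-1)S_m^2}{\sigma_m^2}+\dfrac{(n-1)S_n^2}{\sigma_n^2}\right]\Big/(m+n-2)}} \\ &= \frac{\left[\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)\right]\sqrt{m+n-2}}{\sqrt{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}\,\sqrt{\sigma_m^2/m+\kappa^2\sigma_n^2/n}} = T(m+n-2) \sim f(t), \end{aligned} \tag{119}$$
where T(m+n−2) is a t-random variable with m+n−2 degrees of freedom. Taking Theorem 5 into account, we have that
$$f(t) = \frac{\Gamma((m+n-1)/2)}{\sqrt{\pi(m+n-2)}\,\Gamma((m+n-2)/2)}\left[1+\frac{t^2}{m+n-2}\right]^{-(m+n-1)/2}, \quad -\infty<t<\infty. \tag{120}$$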
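Using (119) and (120), a 100(1−α)% confidence interval for $\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)$ can be obtained from
$$\begin{aligned} &P\left(t_1 \le T(m+n-2) \le t_2\right) \\ &= P\left(t_1\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}} \le \bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n) \le t_2\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}}\right) = 1-\alpha \end{aligned} \tag{121}$$
by suitably choosing the decision variables $t_1$ and $t_2$. Hence, the statistical confidence interval for $\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)$ is given by
$$\left[t_1\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}},\ t_2\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}}\right]. \tag{122}$$
The length of the statistical confidence interval for $\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)$ is given by
$$L\left(t_1,t_2 \,\Big|\, \sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}}\right) = (t_2-t_1)\sqrt{\frac{(m-1)S_m^2/\sigma_m^2+(n-1)S_n^2/\sigma_n^2}{m+n-2}}\sqrt{\frac{\sigma_m^2}{m}+\frac{\kappa^2\sigma_n^2}{n}}. \tag{123}$$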
In order to find the shortest-length confidence interval for $\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)$, we should find a pair of decision variables $t_1$ and $t_2$ such that (123) is minimized. It follows from (120) and (121) that
$$\int_{t_1}^{t_2} f(t)\,dt = \int_{-\infty}^{t_2} f(t)\,dt-\int_{-\infty}^{t_1} f(t)\,dt = (1-\alpha+p)-p = 1-\alpha, \tag{124}$$
where p (0 ≤ p ≤ α) is a decision variable,
$$\int_{-\infty}^{t_2} f(t)\,dt = 1-\alpha+p \tag{125}$$
and
$$\int_{-\infty}^{t_1} f(t)\,dt = p. \tag{126}$$
Then $t_2$ represents the (1−α+p)-quantile, which is given by
$$t_2 = q_{1-\alpha+p;\,t(m+n-2)}, \tag{127}$$
and $t_1$ represents the p-quantile, which is given by
$$t_1 = q_{p;\,t(m+n-2)}. \tag{128}$$
The shortest-length confidence interval for $\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n)$ can be found as follows:
Minimize
$$(t_2-t_1)^2 = \left(q_{1-\alpha+p;\,t(m+n-2)}-q_{p;\,t(m+n-2)}\right)^2 \tag{129}$$
subject to
$$0 \le p \le \alpha. \tag{130}$$
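The optimal numerical solution minimizing $(t_2-t_1)^2$ can be obtained using the standard "Solver" add-in of Excel 2016. If $\sigma_m^2 = \sigma_n^2$, it follows from (123) that
$$L\left(t_1,t_2 \,\Big|\, \sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}\right) = (t_2-t_1)\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}. \tag{131}$$
If, for example, m = 6, n = 4, α = 0.05, $\bar U_m = 117.5$, $\bar U_n = 126.8$, $S_m^2 = (9.7)^2$, $S_n^2 = (12)^2$, then the optimal numerical solution of (129) is given by
$$p = 0.025, \quad t_1 = q_{p;\,t(m+n-2)} = -2.306, \quad t_2 = q_{1-\alpha+p;\,t(m+n-2)} = 2.306 \tag{132}$$
and it follows from (121) and (131) that the 100(1−α)% confidence interval of shortest length (or equal tails) for $\mu_m-\kappa\mu_n$ is given by
$$\left(\bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n) \ge t_1\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}},\ \ \bar U_m-\kappa\bar U_n-(\mu_m-\kappa\mu_n) \le t_2\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}\right). \tag{133}$$
If κ = 1, it follows from (133) that
$$\begin{aligned} (\mu_m-\mu_n) &\in \left((\bar U_m-\bar U_n)-t_2\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{1}{n}},\ (\bar U_m-\bar U_n)-t_1\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{1}{n}}\right) \\ &= \left((117.5-126.8)-2.306\times 10.6\sqrt{\frac{1}{6}+\frac{1}{4}},\ (117.5-126.8)+2.306\times 10.6\sqrt{\frac{1}{6}+\frac{1}{4}}\right) = (-25.07,\ 6.47) \end{aligned} \tag{134}$$
or
$$-25.07 < \mu_m-\mu_n < 6.47. \tag{135}$$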
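An analytical expression for determining the confidence limits for κ (the ratio of the means of two different normal populations) can be obtained from (121), where it is assumed that $\sigma_m^2 = \sigma_n^2$ and $\mu_m-\kappa\mu_n = 0$:
$$\begin{aligned} &\left(t_1\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}} \le \bar U_m-\kappa\bar U_n \le t_2\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}\right) \\ &= \left(\kappa\bar U_n+t_1\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}} \le \bar U_m,\ \ \bar U_m \le \kappa\bar U_n+t_2\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}\right) \\ &= \left(\kappa+\frac{t_1}{\bar U_n}\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}} \le \frac{\bar U_m}{\bar U_n},\ \ \frac{\bar U_m}{\bar U_n} \le \kappa+\frac{t_2}{\bar U_n}\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}\right) \\ &= \left(\kappa \le \frac{\bar U_m}{\bar U_n}-\frac{t_1}{\bar U_n}\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}},\ \ \kappa \ge \frac{\bar U_m}{\bar U_n}-\frac{t_2}{\bar U_n}\sqrt{\frac{(m-1)S_m^2+(n-1)S_n^2}{m+n-2}}\sqrt{\frac{1}{m}+\frac{\kappa^2}{n}}\right) \\ &= \left(\kappa \le 0.926656+2.306\,\frac{10.6}{126.8}\sqrt{\frac{1}{6}+\frac{\kappa^2}{4}},\ \ \kappa \ge 0.926656-2.306\,\frac{10.6}{126.8}\sqrt{\frac{1}{6}+\frac{\kappa^2}{4}}\right) \\ &= \left(\kappa \le 0.926656+0.192773\sqrt{0.166667+0.25\kappa^2},\ \ \kappa \ge 0.926656-0.192773\sqrt{0.166667+0.25\kappa^2}\right) \\ &\Rightarrow \left(\text{minimize: } \left(\kappa-0.926656-0.192773\sqrt{0.166667+0.25\kappa^2}\right)^2,\ \left(\kappa-0.926656+0.192773\sqrt{0.166667+0.25\kappa^2}\right)^2,\ \text{subject to: } \kappa \ge 0\right) \\ &= (\kappa \le 1.05526,\ \kappa \ge 0.815431). \end{aligned} \tag{136}$$
Thus, it follows from (136) that
$$\kappa \in (0.815431,\ 1.05526). \tag{137}$$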
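The two bounds in (136) are fixed points of the maps $\kappa \mapsto \bar U_m/\bar U_n \pm c\sqrt{1/m+\kappa^2/n}$ with $c = t_2 \times 10.6/\bar U_n$. The sketch below solves them by simple fixed-point iteration, which is an alternative to the squared-residual minimization stated in (136) and is chosen here only for brevity; the starting value, tolerance, and function name are assumptions, and the pooled standard deviation 10.6 and $t_2$ = 2.306 are taken from the example above.

```python
from math import sqrt


def kappa_bound(ratio, coeff, m, n, sign, start=1.0, tol=1e-10, max_iter=200):
    """Solve kappa = ratio + sign * coeff * sqrt(1/m + kappa^2/n) by iteration."""
    kappa = start
    for _ in range(max_iter):
        updated = ratio + sign * coeff * sqrt(1 / m + kappa ** 2 / n)
        if abs(updated - kappa) < tol:
            return updated
        kappa = updated
    return kappa


# Inputs from the worked example: sample means, pooled standard deviation
# (rounded to 10.6 as in (134)), and the t quantile 2.306 for 8 degrees of freedom.
mean_m, mean_n, pooled_sd, t_crit, m, n = 117.5, 126.8, 10.6, 2.306, 6, 4
ratio = mean_m / mean_n
coeff = t_crit * pooled_sd / mean_n

lower = kappa_bound(ratio, coeff, m, n, -1)
upper = kappa_bound(ratio, coeff, m, n, +1)
print(lower, upper)
```

The iteration converges quickly here because the derivative of each map with respect to κ is well below 1 in the relevant range; the printed bounds should agree with (137) up to rounding.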
The new intelligent computational models proposed in this paper are conceptually simple, efficient, and useful for constructing accurate statistical tolerance or prediction limits and shortest-length or equal-tailed confidence intervals under the parametric uncertainty of applied stochastic models. The methods described above are based on adequate computational models of the cumulative distribution function of order statistics and on constructive use of the invariance principle in mathematical statistics. These methods can be used to solve real-life problems in areas including engineering, science, industry, automation & robotics, machine learning, business & finance, medicine and biomedicine, optimization, planning and scheduling.
None.
The authors declare that there is no conflict of interest.
©2024 Nechval, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.