Correspondence:
Received: January 01, 1970 | Published: ,
Citation: DOI:
Download PDF
Abstract
We propose three efficient sequential designs in the software reliability estimation. The fully sequential design the multistage sequential design and the accelerated sequential design. These designs make allocation decisions dynamically throughout the testing process. We then refine these estimated reliabilities in an iterative manner as we sample. Monte Carlo simulation seems to indicate that these sequential designs are second order optimal.
Keywords: software reliability, partition testing, fully sequential design, multistage sequential design, accelerated sequential design
Introduction
Reliability of a system is an important aspect of any system design since any user of the system would expect some type of guarantee that the system will function to some level of confidence. Failing to meet such guarantee will result in disastrous consequences. On the other hand, overly exceeding such guarantee level may incur additional and unnecessary expense to the developers. Moreover, for any non-trivial software system, an exhaustive testing among the entire input domain can be very expensive. By adopting the partition testing strategy, we attempt to break up the testable input domain of possible test cases into partitions, which must be non-overlapping, such that if test case
belongs to partition
, then no partition other than
will contain
. Sayre and Poore1–11 have given several possible mechanics to partition the domain into finitely many subdomains,
, such that :
which allows us to define the system reliability by a weighted sum of reliabilities of these subdomains, i.e.
Where
denotes the system reliability
and is the reliability of each subdomain
; and
, parameters of the operational profile is the likelihood of this test case belongs to partition
, which are assumed to be known.12 As mentioned above, a complete testing of any software system of non-trivial size is practically impossible,
are usually unknown parameters to us. So as to gain knowledge about
, we must distribute the
test cases among these
partitions, and generate reasonable estimates for each. Specifically, we denote
as sizes of the samples which are taken from sub domain
, respectively, where
We model the outcome of the
taken from the
partition as a Bernoulli random variable
such that:
and each
follows a Bernoulli distribution with parameter
. Then, the estimate of the overall system reliability,
denoted by
can thus be defined as:
where
is the estimate of
after
test cases have been allocated to partition such that:
and
Optimal sampling scheme
Ideally, we would like to execute all possible test paths through the software and determine the true overall reliability of the system. In practice though, resources are often limited, sample test cases must be chosen and allocated strategically to attain the best reliability estimate possible given all kinds of constraints. One of the criteria of distributing test cases among the partitions, which proceeds from rewriting (1.1) as follows:
which is bounded below by the first term:
with equality of (2.1) to hold is and only if:
for all . Hence, the optimal allocation is determined by:
for
, and
and the variance incurred by this optimal allocation is:
Note that the optimal allocation depends on the actual reliabilities
which are unknown. Therefore the optimal sampling scheme is not practical. It is this shortcoming that motivates us to adopt such dynamic allocation approaches that will be discussed in the following three sections.
Fully sequential sampling scheme
By adopting a fully Bayesian approach with Beta priors, Rekab, Thompson and Wei5 proposed a fully sequential design shown to be first order optimal. The fully sequential design is defined as follows;
We first test one case from each of the partitions and estimate the reliability for each of these partitions.
Stage 1:
After cases have been allocated,
the next test will be taken from partition
if for all ,
where
is the cumulative test cases allocated to partition
after
tests have been allocated and the current estimated reliability for partition
is determined by:
Stage 2:
Repeat step 2 sequentially until all the test cases are allocated, and the final estimate of reliability for partition is:
And thus, the estimate of the overall reliability of the system is:
Multistage sequential sampling
By adopting a fully Bayesian approach with Beta priors, Rekab, Thompson & Wei6 proposed a multistage sequential design shown to be first order optimal. Instead of making an allocation decision for each test at a time, the multistage sequential sampling allocates test cases among the partitions in stages in batches, where and are both pre-specified. The multistage sequential design is defined as follows:
Stage 1:
We start with an initial sample of test cases, which are allocated among the partitions with a balanced allocation scheme, such that:
and estimate the reliability for partition by:
Therefore,
Stage 2 through L:
At stage
for partition the test cases are distributed by the following way:
for partition
the test cases are distributed by the following way:
and
where
At the final stage , the cumulative number of tests allocated to partition is:
and
Among several ways of determining the number of cases at each sampling stage, the simplest one is to select:
However, choosing stage sizes, especially the initial stage size, can be done by following some general criteria, a good initial stage size can be
for when a two stage sampling scheme is considered, and more generally, Rekab9 shows that for a two stage procedure, must be chosen such that:
Accelerated sampling scheme
By adopting a fully Bayesian approach with Beta priors, Rekab, Thompson and Wei6 proposed an accelerated sequential design shown to be first order optimal. The accelerated sampling scheme combines the fully sequential sampling scheme and the multistage sampling scheme. It is defined as follows:
Stage 1:
The procedure starts with an initial sample
, which satisfies the conditions specified in (4.1). Then, allocate
equally among partitions
Stage 2 through
At stage
for partition
the test cases are distributed by the following way:
and
where
At each stage,
must satisfy the two conditions as
.
Stage L:
In the final stage, we adopt a fully sequential approach by allocating one test from partition , if for all ,
until all the test cases have been allocated. Note that
,
are the cumulative test cases allocated to partition and after the allocations of a total of test cases, where
Therefore, the estimate for the whole system is finally obtained as:
Optimality of sequential designs
First order optimality of these three sequential designs was established by Rekab, Thompson & Wu,5–7 although the focus here is on minimizing the variance incurred by the sequential designs rather than minimizing the Bayes risk incurred by these designs. For estimating the mean difference of two independent normal populations, Woodroofe and Hardwick12 adopted a quasi-Bayesian approach. They determined an asymptotic lower bound for the integrated risk and proposed a three-stage design that is second-order optimal. For estimating the mean difference of two general one -parameter exponential family, Rekab and Tahir10 adopted a fully Bayesian approach with conjugate priors. They determined an asymptotic second order lower bound for the Bayes risk.
Monte carlo simulations
We consider the case where the test domain is partitioned into two subdomains
and
with reliability
and
respectively with equal usage probabilities
Second order optimality of the three sequential designs is investigated through Monte Carlo simulations.
(R1,R2)
|
N=300
|
N=500
|
N=800
|
N=2000
|
N=5000
|
N=8000
|
0.1,0.9
|
28.1562
|
3.2104
|
0.739
|
2.4225
|
9.304
|
0.7133
|
0.5,0.2
|
29.038
|
6.102
|
0.63
|
7.9375
|
4.67
|
4.05
|
0.5,0.5
|
1.5972
|
0.146
|
0.212
|
0.9125
|
0.512
|
0.4933
|
0.5,0.9
|
80.7747
|
71.9032
|
19.951
|
7.96
|
5.796
|
1.2666
|
0.9,0.3
|
64.0246
|
53.8008
|
5.5497
|
3.2565
|
15.2546
|
2.6517
|
Table 1
by Fully Sequential Scheme
(R1,R2)
|
N=300
|
N=500
|
N=800
|
N=2000
|
N=5000
|
N=8000
|
0.1,0.9
|
10.9835
|
0.0288
|
4.0718
|
55.7962
|
2.332
|
1.8422
|
0.5,0.2
|
11.8285
|
19.172
|
2.0555
|
5.51
|
0.8432
|
7.1258
|
0.5,0.5
|
0.02749
|
0.0459
|
0.175
|
0.4568
|
0.47324
|
0.1064
|
0.5,0.9
|
37.6665
|
31.6257
|
17.6417
|
3.6377
|
6.8483
|
9.2952
|
0.9,0.3
|
20.2034
|
2.86352
|
7.6161
|
3.1589
|
11.7869
|
5.2018
|
Table 2
by Multistage Scheme
(R1,R2)
|
N=300
|
N=500
|
N=800
|
N=2000
|
N=5000
|
N=8000
|
0.1,0.9
|
30.5713
|
6.587
|
6.4568
|
17.4023
|
2.1138
|
1.0098
|
0.5,0.2
|
8.1424
|
6.1346
|
7.1683
|
0.8968
|
0.1462
|
2.1009
|
0.5,0.5
|
0.03801
|
0.23284
|
0.0319
|
0.7025
|
0.064
|
0.0392
|
0.5,0.9
|
37.6665
|
31.6257
|
17.6417
|
3.6377
|
6.8483
|
9.2952
|
0.9,0.3
|
48.4148
|
13.817
|
7.8473
|
8.4656
|
5.8734
|
1.6708
|
Table 3
by Accelerated Scheme
Table I, II, III seem to indicate that the speed
is bounded.
Conclusion
Second optimal designs are more efficient than the first order optimal designs especially when the total number of cases is very large. This is the main argument that led us to investigate the second order optimality of these three dynamic designs. Simulation studies seem to indicate that these designs are second order optimal. We conjecture that second order optimally may be obtained theoretically as well.
It is very common in parametric estimation to use the squared error loss. However, in reliability estimation one should distinguish between the cost of overestimating and underestimating the system reliability. Examples of practical loss functions were presented by Stüger:2
and by Granger:1
where represents the overestimation and underestimation costs, respectively.
Acknowledgments
Conflicts of interest
Authors declare that there are no conflicts of interests.
References
- Sankaran M. The discrete Poisson-Lindley distribution. Biometrics. 1970;26(1):145–149.
- Ghitany ME, Al-Mutairi DK. Estimation Methods for the discrete Poisson-Lindley distribution. Journal of Statistical Computation and Simulation. 2009;79(1):1–9.
- Shanker R, Mishra A. A two-parameter Poisson-Lindley distribution. International Journal of Statistics and Systems. 2014;9(1):79–85.
- Shanker R, Mishra A. A two-parameter Lindley distribution. Statistics in Transition new Series. 2013;14(1):45–56.
- Shanker R, Mishra A. A quasi Poisson-Lindley distribution (submitted). 2015.
- Shanker R, Mishra A. A quasi Lindley distribution. African journal of Mathematics and Computer Science Research. 2013;6(4):64–71.
- Shanker R, Sharma S, Shanker R. A Discrete two-Parameter Poisson Lindley Distribution. Journal of Ethiopian Statistical Association. 2012;21:15–22.
- Shanker R, Sharma S, Shanker R. A two-parameter Lindley distribution for modeling waiting and survival times data. Applied Mathematics. 2013;4:363–368.
- Shanker R, Tekie AL. A new quasi Poisson-Lindley distribution. International Journal of Statistics and Systems. 2014;9(1):87–94.
- Shanker R, Amanuel AG. A new quasi Lindley distribution. International Journal of Statistics and Systems. 2013;8(2):143–156.
- Johnson NL, Kotz S, Kemp AW. Univariate Discrete Distributions, 2nd ed. John Wiley & sons Inc; 1992.
- Fisher RA, Corpet AS, Williams CB. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology. 1943;12(1):42–58.
- Kempton RA. A generalized form of Fisher’s logarithmic series. Biometrika. 1975;62(1):29–38.
- Tripathi RC, Gupta RC. A generalization of the log-series distribution. Comm. in Stat. (Theory and Methods). 1985;14(8):1779–1799.
- Mishra A, Shanker R. Generalized logarithmic series distribution-Its nature and applications. Proceedings of the Vth International Symposium on Optimization and Statistics. 2002:155–168.
- Loeschke V, Kohler W. Deterministic and Stochastic models of the negative binomial distribution and the analysis of chromosomal aberrations in human leukocytes. Biometrische Zeitschrift. 1976;18:427–451.
- Janardan KG, Schaeffer DJ. Models for the analysis of chromosomal aberrations in human leukocytes. Biometrical Journal. 1977;(8):599–612.
- Mc Guire JU, Brindley TA, Bancroft TA. The distribution of European corn-borer larvae pyrausta in field corn. Biometrics. 1957;13:65–78.
- Catcheside DG, Lea DE, Thoday JM. ypes of chromosome structural change induced by the irradiation on Tradescantia microspores. Journal of Genetics. 1946;47:113–136.
- Catcheside DG, Lea DE, Thoday JM. The production of chromosome structural changes in Tradescantia microspores in relation to dosage, intensity and temperature. Journal of Genetics. 1946;47:137–149.
- Lindley DV. Fiducial distributions and Bayes theorem. Journal of Royal Statistical Society Ser. B. 1958;20:102–107.
© . This is an open access article distributed under the terms of the,
which
permits unrestricted use, distribution, and build upon your work non-commercially.