eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 6 Issue 4

New family of time series models and its Bayesian analysis

D Venkatesan,1 Michele Gallo2

1Department of Statistics, Annamalai University, India
2Department of Social Science, University of Naples-L’Orientale, Italy

Correspondence: D. Venkatesan, Department of Statistics, Annamalai University, India

Received: August 23, 2017 | Published: October 10, 2017

Citation: Venkatesan D, Gallo M. New family of time series models and its bayesian analysis. Biom Biostat Int J. 2017;6(4):373-379. DOI: 10.15406/bbij.2017.06.00172


Synoptic Abstract

A new family of time series models, called the Full Range Autoregressive model, is introduced which avoids the difficult problem of order determination in time series analysis. Some of the basic statistical properties of the new model are studied. Further, the paper describes the Bayesian inference and forecasting as applied to the Full Range Autoregressive model. The Canadian lynx data is used to compare the efficiency of the predictive power of the new model with those of some of the existing models in the time series literature.

Keywords: full range autoregressive model, identifiability, stationary condition, posterior distribution, bayesian predictive distribution

Introduction

In the early days of time series analysis, most of the models fitted to real-life data were restricted to low orders because of the non-availability of high-speed computers and related facilities. Now, with high-speed computers widely available, there is no need for this type of restriction in the order determination and estimation of fitted models. Further, most of the work in time series analysis is concerned with series having the property that the degree of dependence between observations separated by a long time span is zero or negligible. However, the empirical studies of Lawrance and Kottegoda1 reveal, particularly in cases arising in economics and hydrology, that the degree of dependence between observations a long time span apart, though small, is by no means negligible. Therefore, there is still a need for a family of models which can fully capture the properties of stationarity, linearity and long range dependence.

Moreover, the existing theory of autoregressive models assumes that the coefficients of the model are not connected in any way with one another. Therefore, it is useful, from a practical point of view, to propose a new model, called the Full Range Autoregressive model (FRAR model, for short), which can accommodate long range dependence and has the property that the coefficients of the past values in the model are functions of a small number of parameters.

Thus, the chief objective of this paper is to introduce a family of new models which would involve only a few parameters and at the same time incorporate long range dependence, which would be an acceptable alternative to the current models representing stationary time series.

The family of models introduced in this paper, called the Full Range Autoregressive (FRAR) model, is defined in such a way that it possesses the following basic features.

  1. The models should be capable of representing long term persistence. This is justified by the fact that the future may not depend on the present and a few past values alone, but may depend on the present and the whole past.
  2. The parameters of the model, which are likely to be large in number due to (1), should exhibit some degree of dependence among themselves.

Therefore, the new models are expected to have an infinite structure with a finite number of parameters, and so completely avoid the problem of order determination.

An outline of this paper is as follows. In Section 2, the FRAR model is defined, the identifiability region is obtained, the stationarity condition is derived, and the asymptotic stationarity is studied. In Section 3, the Bayesian analysis of the FRAR model is discussed and the predictive density of a single future observation is derived. In Section 4 the Canadian lynx data is used for forecasting through the FRAR model. In Section 5 a comparative study is provided to examine the efficiency of FRAR model. In Section 6 the summary and conclusion is given.

The full range autoregressive model

The model

We define a family of models by a discrete-time stochastic process $\{X_t\}$, $t = 0, \pm 1, \pm 2, \ldots$, called the Full Range Autoregressive (FRAR) model, by the difference equation

$X_t = \sum_{r=1}^{\infty} a_r X_{t-r} + e_t$       (1)

where $a_r = k \sin(r\theta)\cos(r\phi)/\alpha^r$ $(r = 1, 2, 3, \ldots)$; $k$, $\alpha$, $\theta$ and $\phi$ are parameters; and $e_1, e_2, e_3, \ldots$ are independent and identically distributed normal random variables with mean zero and variance $\sigma^2$. The initial assumptions about the parameters are as follows.

It is assumed that $X_t$ will influence $X_{t+n}$ for all positive $n$, and that the influence of $X_t$ on $X_{t+n}$ will decrease, at least for large $n$, and become insignificant as $n$ becomes very large: recent observations matter more than older ones. Hence $a_n$ must tend to zero as $n$ goes to infinity. This is achieved by assuming that $\alpha > 1$. The feasibility of $X_t$ having various magnitudes of influence on $X_{t+n}$, when $n$ is small, is made possible by allowing $k$ to take any real value. Because of the periodicity of the circular functions sine and cosine, the domains of $\theta$ and $\phi$ are restricted to the interval $[0, 2\pi)$.

Thus, the initial assumptions are $\alpha > 1$, $k \in \mathbb{R}$, and $\theta, \phi \in [0, 2\pi)$; i.e., $\Theta = (\alpha, k, \theta, \phi) \in S^*$, where $S^* = \{(\alpha, k, \theta, \phi) \mid k \in \mathbb{R},\ \alpha > 1,\ \theta, \phi \in [0, 2\pi)\}$. Further restrictions on the range of the parameters are placed by examining the identifiability of the model.
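The definition above can be sketched in code. The following is a minimal illustration (in Python; the truncation length `n_terms`, the burn-in, and the parameter values are illustrative choices, not taken from the paper): the infinite sum in (1) is truncated, which is harmless for moderate truncation lengths because $|a_r| \le |k|/\alpha^r$ decays geometrically.

```python
import numpy as np

def frar_coeffs(k, alpha, theta, phi, n_terms):
    """Coefficients a_r = k sin(r*theta) cos(r*phi) / alpha**r, r = 1..n_terms."""
    r = np.arange(1, n_terms + 1)
    return k * np.sin(r * theta) * np.cos(r * phi) / alpha ** r

def simulate_frar(k, alpha, theta, phi, sigma, n, n_terms=100, burn=200, seed=0):
    """Simulate X_t = sum_r a_r X_{t-r} + e_t, truncating the infinite sum at n_terms."""
    rng = np.random.default_rng(seed)
    a = frar_coeffs(k, alpha, theta, phi, n_terms)
    x = np.zeros(n + burn)
    e = rng.normal(0.0, sigma, n + burn)
    for t in range(n + burn):
        past = x[max(0, t - n_terms):t][::-1]      # X_{t-1}, X_{t-2}, ...
        x[t] = a[:len(past)] @ past + e[t]
    return x[burn:]
```

All four parameters enter only through the coefficient sequence, which is the sense in which the infinitely many autoregressive coefficients "exhibit dependence among themselves".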

Identifiability condition

Identifiability ensures that there is a one-to-one correspondence between the parameter space and the set of associated probability models. Without identifiability it is meaningless to estimate the parameters of a model from a given data set. In the present context, identifiability is achieved by restricting the parameter space in such a way that no two points in it produce the same time series model.

The coefficients $a_n$ in (1) are functions of $k$, $\alpha$, $\theta$, $\phi$ as well as $n$. That is, $a_n = a_n(k, \alpha, \theta, \phi) = k \sin(n\theta)\cos(n\phi)/\alpha^n$, $\Theta \in S^*$, $n = 1, 2, 3, \ldots$.

Define

$A = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ \pi \le \theta, \phi < 2\pi\}$,

$B = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ 0 \le \theta < \pi,\ \pi \le \phi < 2\pi\}$,    (2)

$C = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ \pi \le \theta < 2\pi,\ 0 \le \phi < \pi\}$,

$D = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ 0 \le \theta, \phi < \pi\}$.

Since $a_n(k, \alpha, \theta, \phi) = a_n(-k, \alpha, 2\pi - \theta, 2\pi - \phi)$ for $\Theta \in S^*$, to each $(\alpha, k, \theta, \phi)$ belonging to $A$ there is an $(\alpha, -k, \theta', \phi')$, with $\theta' = 2\pi - \theta$ and $\phi' = 2\pi - \phi$, belonging to $D$ such that $a_n(k, \alpha, \theta, \phi) = a_n(-k, \alpha, \theta', \phi')$. So $A$ is omitted. Similarly, it can be shown that $B$ and $C$ can also be omitted.
Define

$D_1 = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ \pi/2 \le \theta, \phi < \pi\}$,

$D_2 = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ 0 \le \theta < \pi/2,\ \pi/2 \le \phi < \pi\}$,

$D_3 = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ 0 \le \theta, \phi < \pi/2\}$,

$D_4 = \{(\alpha, k, \theta, \phi) \mid \alpha > 1,\ k \in \mathbb{R},\ \pi/2 \le \theta < \pi,\ 0 \le \phi < \pi/2\}$.

Since

$a_n(k, \alpha, \theta, \phi) = a_n(-k, \alpha, \pi - \theta, \pi - \phi)$ for $k \in \mathbb{R}$, $\alpha > 1$, $0 \le \theta, \phi < \pi$,    (3)

it can be shown, as before, that the regions $D_1$ and $D_2$ can be omitted. Since no further reduction is possible, it is finally deduced that the region of identifiability of the model is $S = \{(\alpha, k, \theta, \phi) \mid k \in \mathbb{R},\ \alpha > 1,\ \theta \in [0, \pi),\ \phi \in [0, \pi/2)\}$.

Stationarity of the FRAR process

The stationarity of the newly developed FRAR time series model is now examined. The model is $X_t = \sum_{r=1}^{\infty} a_r X_{t-r} + e_t$; that is, $(1 - a_1 B - a_2 B^2 - \cdots)X_t = e_t$, where $B$ is the backward shift operator, defined by $B^n X_t = X_{t-n}$. Thus, the model is $\Psi(B) X_t = e_t$, or $X_t = \Psi^{-1}(B) e_t$, where $\Psi(B) = 1 - a_1 B - a_2 B^2 - \cdots$.

Box and Jenkins2 and Priestley3 have shown that a necessary condition for the stationarity of such processes is that the roots of the equation $\Psi(B) = 0$ must all lie outside the unit circle. So, it is now proposed to investigate the nature of the zeros of $\Psi(B)$.

The power series $\Psi(B)$ may be rewritten as $\Psi(B) = 1 - [a_1 B + a_2 B^2 + \cdots] = 1 - \sum_{n=1}^{\infty} a_n B^n$, where $a_n B^n = (k B^n/\alpha^n)\sin(n\theta)\cos(n\phi) = (k' B^n/\alpha^n)[\sin(n\theta_1) + \sin(n\theta_2)]$, with $k' = k/2$, $\theta_1 = \theta + \phi$ and $\theta_2 = \theta - \phi$. Therefore, $\sum_{n=1}^{\infty} a_n B^n = \sum_{n=1}^{\infty} \frac{k' B^n}{\alpha^n}\sin(n\theta_1) + \sum_{n=1}^{\infty} \frac{k' B^n}{\alpha^n}\sin(n\theta_2)$.

The above two series are separately evaluated below.

$\sum_{n=1}^{\infty} \frac{k' B^n}{\alpha^n}\sin(n\theta_1) = \operatorname{Im}\left[\sum_{n=1}^{\infty} \frac{k' B^n}{\alpha^n} e^{in\theta_1}\right] = \operatorname{Im}\left[k' B e^{i\theta_1}\left(\alpha - B e^{i\theta_1}\right)^{-1}\right] = k' B \alpha \sin(\theta_1)/G_1$, where $G_1 = B^2 + \alpha^2 - 2B\alpha\cos(\theta_1)$ and $\operatorname{Im}$ stands for the imaginary part.

Similarly, it can be shown that $\sum_{n=1}^{\infty} \frac{k' B^n}{\alpha^n}\sin(n\theta_2) = k' B \alpha \sin(\theta_2)/G_2$, where $G_2 = B^2 + \alpha^2 - 2B\alpha\cos(\theta_2)$.

Therefore,

$\sum_{n=1}^{\infty} a_n B^n = k' B \alpha\left[(B^2 + \alpha^2)(\sin\theta_1 + \sin\theta_2) - 2B\alpha(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\right]/(G_1 G_2)$.
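The closed form above can be checked numerically for $|B| < \alpha$, where the geometric series converges. A quick sanity check (in Python; the parameter values are arbitrary test values, and $\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2 = \sin(\theta_1 + \theta_2)$ is used in the numerator):

```python
import numpy as np

def series_sum(k, alpha, theta, phi, B, n_terms=3000):
    """Direct partial sum of sum_n a_n B^n with a_n = k sin(n theta) cos(n phi) / alpha^n."""
    n = np.arange(1, n_terms + 1)
    return np.sum(k * np.sin(n * theta) * np.cos(n * phi) * (B / alpha) ** n)

def closed_form(k, alpha, theta, phi, B):
    """k' B alpha [(B^2 + a^2) s1 - 2 B a sin(theta1 + theta2)] / (G1 G2)."""
    kp = k / 2.0
    t1, t2 = theta + phi, theta - phi
    G1 = B * B + alpha * alpha - 2 * B * alpha * np.cos(t1)
    G2 = B * B + alpha * alpha - 2 * B * alpha * np.cos(t2)
    s1 = np.sin(t1) + np.sin(t2)
    return kp * B * alpha * ((B * B + alpha * alpha) * s1
                             - 2 * B * alpha * np.sin(t1 + t2)) / (G1 * G2)
```

For instance, `series_sum(0.7, 1.5, 1.1, 0.4, 0.9)` and `closed_form(0.7, 1.5, 1.1, 0.4, 0.9)` agree to high precision.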

Thus, $\Psi(B) = 1 - \sum_{n=1}^{\infty} a_n B^n = 0$ implies that

$(B^2 + \alpha^2 - 2B\alpha\cos\theta_1)(B^2 + \alpha^2 - 2B\alpha\cos\theta_2) - k' B\alpha\left[(B^2 + \alpha^2)s_1 - 2B\alpha c_2\right] = 0$

where $c_1 = \cos\theta_1 + \cos\theta_2 = 2\cos\theta\cos\phi$, $c_2 = \sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2 = \sin(2\theta)$, $s_1 = \sin\theta_1 + \sin\theta_2 = 2\sin\theta\cos\phi$ and $d_1 = \cos\theta_1\cos\theta_2$. After simplification, the above equation becomes $B^4 - B^3\alpha(2c_1 + k's_1) + B^2\alpha^2(2 + 4d_1 + 2k'c_2) - B\alpha^3(2c_1 + k's_1) + \alpha^4 = 0$. Thus,

$B^4 - B^3\alpha A_1 + B^2\alpha^2 A_2 - B\alpha^3 A_1 + \alpha^4 = 0$    (4)

or $S^4 - A_1 S^3 + A_2 S^2 - A_1 S + 1 = 0$    (5)

where $A_1 = 2c_1 + k's_1 = \cos\phi\,(4\cos\theta + k\sin\theta)$, $A_2 = 2 + 4d_1 + 2k'c_2 = 2\left[1 + 2(\cos^2\theta - \sin^2\phi) + k\sin\theta\cos\theta\right]$, and $S = B/\alpha$. This equation of degree four reduces to $Z^2 - A_1 Z + (A_2 - 2) = 0$, where $Z = S + (1/S)$.

The roots of this equation, say $r_1$ and $r_2$, are given by $Z = \frac{1}{2}\left[A_1 \pm \sqrt{A_1^2 - 4A_2 + 8}\right]$.

Since $Z = S + (1/S)$, one finally gets the four roots of equation (5) as $R_1 = \frac{1}{2}\left[r_1 + \sqrt{r_1^2 - 4}\right]$, $R_2 = \frac{1}{2}\left[r_1 - \sqrt{r_1^2 - 4}\right]$, $R_3 = \frac{1}{2}\left[r_2 + \sqrt{r_2^2 - 4}\right]$ and $R_4 = \frac{1}{2}\left[r_2 - \sqrt{r_2^2 - 4}\right]$; the corresponding roots of equation (4) are $\alpha R_1, \ldots, \alpha R_4$.

Equation (5) implies that if $S_0$ is a root of (5) then $1/S_0$ is also a root. This in turn implies that $\alpha S_0$ and $\alpha/S_0$ are roots of equation (4). Therefore the process is stationary for sufficiently large values of $\alpha$. But when $\alpha$ is small it seems difficult to examine the stationarity of the process by this approach. Hence, it is proposed to study the asymptotic stationarity of the process in the following section.
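For given parameter values, the stationarity condition can also be checked numerically by finding the roots of the quartic (4) directly (a sketch in Python, using `numpy.roots` on the quartic's coefficients; the test values are illustrative):

```python
import numpy as np

def frar_char_roots(k, alpha, theta, phi):
    """Roots of (4): B^4 - A1*a*B^3 + A2*a^2*B^2 - A1*a^3*B + a^4 = 0."""
    kp = k / 2.0
    c1 = 2 * np.cos(theta) * np.cos(phi)          # cos(theta1) + cos(theta2)
    s1 = 2 * np.sin(theta) * np.cos(phi)          # sin(theta1) + sin(theta2)
    d1 = np.cos(theta + phi) * np.cos(theta - phi)
    c2 = np.sin(2 * theta)
    A1 = 2 * c1 + kp * s1
    A2 = 2 + 4 * d1 + 2 * kp * c2
    return np.roots([1.0, -A1 * alpha, A2 * alpha ** 2, -A1 * alpha ** 3, alpha ** 4])

def roots_outside_unit_circle(k, alpha, theta, phi):
    """Necessary condition for stationarity: all roots of (4) outside the unit circle."""
    return bool(np.all(np.abs(frar_char_roots(k, alpha, theta, phi)) > 1.0))
```

For $k = 0$ the quartic factors as $G_1 G_2 = 0$, so all four roots have modulus exactly $\alpha$, consistent with the observation that the process is stationary for sufficiently large $\alpha$.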

Asymptotic stationarity of the FRAR process

In this section we derive the condition for asymptotic stationarity of the FRAR process. For this, one has to solve the difference equation (1) so as to obtain an expression for $X_t$ in terms of $e_t, e_{t-1}, e_{t-2}, e_{t-3}, \ldots$. The precise solution of this equation depends on the initial conditions. So, to investigate the nature of the first and second moments of $X_t$, following Priestley,3 it is assumed that $X_t = 0$ for $t \le -N$, $N$ being the number of observations in the time series. Then, solving (1) by repeated substitution, one obtains

$X_t = e_t + a_{11}X_{t-1} + a_{12}X_{t-2} + a_{13}X_{t-3} + \cdots$,

where $a_{1j} = a_j = (k/\alpha^j)\sin(j\theta)\cos(j\phi)$, $j = 1, 2, \ldots$,

$X_t = e_t + a_{11}e_{t-1} + a_{22}X_{t-2} + a_{23}X_{t-3} + a_{24}X_{t-4} + \cdots$,

where $a_{2j} = a_{11}a_{1,j-1} + a_{1j}$, $j = 2, 3, 4, \ldots$.

Proceeding similarly, one finally gets

$X_t = \left[e_t + a_{11}e_{t-1} + a_{22}e_{t-2} + a_{33}e_{t-3} + a_{44}e_{t-4} + \cdots + a_{pp}e_{t-p}\right] + \left[a_{p+1,p+1}X_{t-(p+1)} + a_{p+1,p+2}X_{t-(p+2)} + \cdots\right]$

where $a_{ij} = a_{i-1,i-1}a_{1,j+1-i} + a_{i-1,j}$ for $j \ge i = 2, 3, 4, \ldots$. Thus, since it is assumed that $X_t = 0$ for $t \le -N$, taking $p = N + t - 1$ gives $X_t = e_t + a_{11}e_{t-1} + a_{22}e_{t-2} + a_{33}e_{t-3} + a_{44}e_{t-4} + \cdots + a_{N+t-1,N+t-1}e_{1-N}$.

Further, writing $a_{00} = 1$, it can be shown that

$E[X_t^2] = \sigma_e^2\left[1 + a_{11}^2 + a_{22}^2 + \cdots + a_{N+t-1,N+t-1}^2\right]$,

$E[X_t X_{t+1}] = \sigma_e^2\left[a_{11} + a_{11}a_{22} + a_{22}a_{33} + \cdots\right]$,

and, in general, for $s \ge 1$,

$E[X_t X_{t+s}] = \sigma_e^2\left[a_{ss} + a_{11}a_{s+1,s+1} + a_{22}a_{s+2,s+2} + \cdots + a_{N+t-1,N+t-1}\,a_{N+t+s-1,N+t+s-1}\right]$

where $a_{ss} = a_{11}a_{s-1,s-1} + a_{s-1,s}$. Therefore, allowing $N \to \infty$, we get $E[X_t] = 0$, $\mathrm{Var}[X_t] = \sigma_e^2\left[1 + a_{11}^2 + a_{22}^2 + \cdots\right]$ and $E[X_t X_{t+s}] = \sigma_e^2\left[a_{ss} + a_{11}a_{s+1,s+1} + \cdots\right]$, provided the series on the right converge. Thus, it is seen that if $E[X_t X_{t+s}]$ exists then it is a function of $s$ only. In order to examine the convergence of $\mathrm{Var}[X_t]$ and $E[X_t X_{t+s}]$, the behaviour of $a_{ij}$, as $j$ tends to infinity, is first investigated. Since $a_{1j} = a_j = (k/\alpha^j)\sin(j\theta)\cos(j\phi)$, $|a_{1j}| \le |k|/\alpha^j$. Similarly, $|a_{2j}| \le |k|(1+|k|)/\alpha^j$ for $j \ge 2$. In general, $|a_{nj}| \le |k|(1+|k|)^{n-1}/\alpha^j$ for $j \ge n$.

Since $\alpha > 1$, the above relation implies that $|a_{nj}| \to 0$ as $j \to \infty$, for any fixed $n$. Thus $\sum_{j=1}^{\infty} a_{jj}^2$ will converge if $(1+|k|)/\alpha < 1$.

If we assume that $1 - \alpha < k < \alpha - 1$, then one can show that $\mathrm{Var}[X_t] = \sigma_X^2 \le \sigma_e^2\, k^2(1+|k|)^2/\left[\alpha^2 - (1+|k|)^2\right]$ and $E[X_t X_{t+s}] \le \sigma_e^2\, k^2(1+|k|)^2\,|k|(1+|k|)^{s-1}/\left(\alpha^s\left[\alpha^2 - (1+|k|)^2\right]\right)$.

Therefore, the auto-correlation function of the process exists and, as shown earlier, it is a function of s only. Finally allowing tt , it is seen that

  1. $\lim_{t\to\infty} E[X_t]$ and $\lim_{t\to\infty} \mathrm{Var}[X_t]$ exist finitely;
  2. $\lim_{t\to\infty} \mathrm{Cov}[X_t, X_{t+s}]$ exists finitely and is a function of $s$ only.

Thus, the condition for $\{X_t\}$ to be asymptotically stationary is $1 - \alpha < k < \alpha - 1$. The above results are summarized in the following theorem.

Theorem 1: The Full Range Autoregressive (FRAR) process $\{X_t\}$ is asymptotically stationary and identifiable if and only if the parameter space is

$S = \{(k, \alpha, \theta, \phi) \mid \alpha > 1,\ 1 - \alpha < k < \alpha - 1,\ \theta \in [0, \pi),\ \phi \in [0, \pi/2)\}$.

Thus, the new FRAR model incorporates long range dependence, involves only four parameters and is totally free from order determination problems.
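The geometric bound $|a_{jj}| \le |k|(1+|k|)^{j-1}/\alpha^j$ underlying Theorem 1 can be verified numerically. The diagonal recursion $a_{ss} = a_{11}a_{s-1,s-1} + a_{s-1,s}$ can be shown to be equivalent to the one-step form $a_{mm} = \sum_{r=1}^{m} a_r a_{m-r,m-r}$ with $a_{00} = 1$, which is convenient to compute (a Python sketch; the parameter values are illustrative and satisfy $|k| < \alpha - 1$):

```python
import numpy as np

def diagonal_weights(k, alpha, theta, phi, M):
    """a_{jj}, j = 0..M, of the representation X_t = sum_j a_{jj} e_{t-j}; a_{00} = 1."""
    j = np.arange(1, M + 1)
    a1 = k * np.sin(j * theta) * np.cos(j * phi) / alpha ** j   # a_{1j} = a_j
    w = np.empty(M + 1)
    w[0] = 1.0
    for m in range(1, M + 1):        # a_{mm} = sum_{r=1}^m a_r a_{m-r,m-r}
        w[m] = a1[:m][::-1] @ w[:m]
    return w

k, alpha, theta, phi = 0.6, 1.9, 1.2, 0.7      # inside the region of Theorem 1
w = diagonal_weights(k, alpha, theta, phi, 300)
j = np.arange(1, 301)
bound = abs(k) * (1 + abs(k)) ** (j - 1) / alpha ** j
assert np.all(np.abs(w[1:]) <= bound + 1e-12)  # geometric decay -> finite variance
```

Since the bound decays geometrically when $(1+|k|)/\alpha < 1$, the series for $\mathrm{Var}[X_t]$ and the autocovariances converge, as claimed.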

Bayesian analysis of the FRAR model

The posterior analysis

The Bayesian approach to the analysis of the new model consists in determining the posterior distribution of the parameters of the FRAR model and the predictive distribution of future observations. From the former, one makes posterior inferences about the parameters of the FRAR model, including the variance of the white noise. From the latter, one may forecast future observations. These techniques are illustrated by Broemeling4 for autoregressive models.

We shall consider the FRAR model and assume that it is asymptotically stationary and identifiable.

The problem is to estimate the unknown parameters $k$, $\alpha$, $\theta$, $\phi$ and $\sigma^2$, using the Bayesian methodology, on the basis of a past random realization of $\{X_t\}$, say $x = (x_1, x_2, \ldots, x_N)$.

The joint probability density of $X_1, X_2, \ldots, X_N$ is given by

$P(X/\Theta) \propto (\sigma^2)^{-N/2}\exp\left[-\frac{1}{2\sigma^2}\sum_{t=1}^{N}\left(x_t - k\sum_{r=1}^{\infty} a_r x_{t-r}\right)^2\right]$    (6)

where $x = (x_1, x_2, \ldots, x_N)$, $\Theta = (k, \alpha, \theta, \phi, \sigma^2)$ and $a_r = \sin(r\theta)\cos(r\phi)/\alpha^r$.

The notation $P$ is used generically for the probability density function of the random variables given within the parentheses following $P$, and $X_0, X_{-1}, X_{-2}, \ldots$ are the past realizations of $X_t$, which are unknown. Following Priestley3 and Broemeling,4 these are assumed to be zero for the purpose of deriving the posterior distribution of $\Theta$. Therefore the range of the index $r$, viz., $1$ through $\infty$, reduces to $1$ through $N$, and so, in the joint probability density function of the observations given by (6), the range of the summation $1$ through $\infty$ can be replaced by $1$ through $N$. By expanding the square in the exponent and simplifying, one gets

$P(X/\Theta) \propto (\sigma^2)^{-N/2}\exp(-Q/2\sigma^2)$    (7)

where $Q = T_{00} + k^2\sum_{r=1}^{N} a_r^2 T_{rr} + 2k^2\sum_{r<s;\,r,s=1}^{N} a_r a_s T_{rs} - 2k\sum_{r=1}^{N} a_r T_{r0}$, with $T_{rs} = \sum_{t=1}^{N} x_{t-r}x_{t-s}$, $r, s = 0, 1, \ldots, N$, and $\Theta \in S$.

To find the posterior distribution of $\Theta$, we first specify the prior distributions of the parameters.

$\alpha$ is given a displaced exponential distribution with parameter $\beta$ (since it is greater than 1); $\sigma^2$ has an inverted gamma distribution with parameters $\nu$ and $\delta$; and $k$, $\theta$ and $\phi$ are uniformly distributed over their domains.

Thus, the joint prior density function of $\Theta$ $(\Theta \in S)$ is given by

$P(\Theta) \propto \beta\exp\left(-\beta(\alpha - 1) - \nu/\sigma^2\right)(\sigma^2)^{-(\delta+1)}$          (8)

Using (7), (8) and Bayes' theorem, the joint posterior density of $k$, $\alpha$, $\theta$, $\phi$ and $\sigma^2$ is obtained as

$P(\Theta/X) \propto (\sigma^2)^{-N/2}\exp(-Q/2\sigma^2)\exp\left[-\beta(\alpha-1) - \nu/\sigma^2\right](\sigma^2)^{-(\delta+1)}$                     (9)

$\propto \exp\left[-\beta(\alpha-1)\right]\exp\left[-\frac{1}{2\sigma^2}(Q + 2\nu)\right](\sigma^2)^{-[(N/2)+\delta+1]}$                     (10)

Integrating $\sigma^2$ out of this joint posterior distribution, we obtain the joint posterior distribution of $k$, $\alpha$, $\theta$ and $\phi$:

$P(k, \alpha, \theta, \phi/X) \propto e^{-\beta(\alpha-1)}\left\{C\left[1 + A_1(k - B_1)^2\right]\right\}^{-d}$                                                                 (11)

where $C = T_{00} - B^2/A + 2\nu$; $B = \sum_{r=1}^{N} a_r T_{0r}$; $A = \sum_{r=1}^{N} a_r^2 T_{rr} + 2\sum_{r<s;\,r,s=1}^{N} a_r a_s T_{rs}$; $A_1 = A/C$; $B_1 = B/A$; $d = \frac{N}{2} + \delta$.

Thus, the posterior distribution of $k$, conditional on $\alpha$, $\theta$ and $\phi$, is a $t$-distribution located at $B_1$ with $(2d-1)$ degrees of freedom.

The joint posterior density function of $\alpha$, $\theta$ and $\phi$ can then be obtained by integrating (11) with respect to $k$. Thus,

$P(\alpha, \theta, \phi/x) \propto \exp(-\beta(\alpha-1))\,C^{-d}A_1^{-1/2}$, with $\alpha > 1$, $0 \le \theta < \pi$ and $0 \le \phi < \pi/2$.      (12)

The above joint posterior density of $\alpha$, $\theta$ and $\phi$ is a very complicated expression and is analytically intractable. One way of solving the problem is to obtain the marginal posterior densities of $\alpha$, $\theta$ and $\phi$ from the joint density (12) by ordinary numerical integration, implemented here in FORTRAN.
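As an illustration of the numerical approach (sketched here in Python rather than the paper's FORTRAN), the unnormalised log of (12) can be evaluated on a grid of $(\alpha, \theta, \phi)$ values; past values before the sample are set to zero, as in the derivation above. The hyperparameter values $\beta$, $\nu$, $\delta$ and the grid are illustrative choices:

```python
import numpy as np

def log_posterior_12(x, alpha, theta, phi, beta=1.0, nu=1.0, delta=1.0):
    """Unnormalised log of (12): -beta(alpha-1) - d*log(C) - (1/2)*log(A1)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    r = np.arange(1, N + 1)
    a = np.sin(r * theta) * np.cos(r * phi) / alpha ** r   # a_r (k factored out)
    z = np.zeros(N)                                        # z_t = sum_r a_r x_{t-r}
    for t in range(N):
        z[t] = a[:t] @ (x[t - 1::-1] if t > 0 else x[:0])  # zero past before the sample
    A = z @ z          # = sum a_r^2 T_rr + 2 sum_{r<s} a_r a_s T_rs
    B = z @ x          # = sum a_r T_0r
    T00 = x @ x
    C = T00 - B * B / A + 2 * nu                           # C > 0 by Cauchy-Schwarz
    d = N / 2 + delta
    return -beta * (alpha - 1) - d * np.log(C) - 0.5 * np.log(A / C)

# crude grid search for the posterior mode over (alpha, theta, phi)
rng = np.random.default_rng(1)
x = rng.normal(size=30)
grid = [(al, th, ph) for al in (1.5, 2.0, 3.0)
        for th in (0.5, 1.5, 2.5) for ph in (0.3, 1.0)]
best = max(grid, key=lambda g: log_posterior_12(x, *g))
```

A finer grid (or MCMC) would be needed in practice; the point is only that (12) is cheap to evaluate once the sums $A$, $B$ and $T_{00}$ are formed.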

One-step-ahead prediction

In order to forecast $x_{N+1}$ from the random realization $x_1, x_2, \ldots, x_N$ of $(X_1, X_2, \ldots, X_N)$, one must find the conditional distribution of $X_{N+1}$ given the past observations. This is the predictive distribution of $X_{N+1}$; it is derived by multiplying the conditional density of $X_{N+1}$ given $x_1, \ldots, x_N$ and $\Theta$ by the posterior density of $\Theta$ given $x_1, \ldots, x_N$, and then integrating with respect to $\Theta$. That is, $P(X_{N+1}/x_1, \ldots, x_N) = \int_{\Theta} P(X_{N+1}/x_1, \ldots, x_N, \Theta)\,P(\Theta/x_1, \ldots, x_N)\,d\Theta$.

Thus, we obtain

$P(x_{N+1}/x_1, x_2, \ldots, x_N, \Theta) \propto (\sigma^2)^{-1/2}\exp\left[-\frac{1}{2\sigma^2}\left(x_{N+1} - k\sum_{i=1}^{N} a_i x_{N+1-i}\right)^2\right]$, $x_{N+1} \in \mathbb{R}$.    (13)

The square in the exponent of the above expression, say $Q_1$, can be rewritten, after expansion, as $Q_1 = x_{N+1}^2 + k^2\sum_{i=1}^{N} a_i^2 P_i^2 + 2k^2\sum_{i<j;\,i=1}^{N} a_i a_j P_{ij} - 2k\sum_{i=1}^{N} a_i P_i x_{N+1}$, where $P_i = x_{N+1-i}$ and $P_{ij} = x_{N+1-i}x_{N+1-j}$. Now, multiplying (13) by the joint posterior density of $\Theta$ and integrating over the parameter space, we obtain

$P(x_{N+1}/x_1, x_2, \ldots, x_N) \propto \int \exp(-\beta(\alpha-1))\,(1/\sigma^2)^{\left[\frac{N}{2}+\delta+\frac{1}{2}+1\right]}\exp\left[-\frac{1}{2\sigma^2}(Q + Q_1 + 2\nu)\right]d\Theta$    (14)

First, integrating out $\sigma^2$ in (14), one gets the joint distribution of $x_{N+1}$, $k$, $\alpha$, $\theta$ and $\phi$ as

$P(x_{N+1}, k, \alpha, \theta, \phi/x_1, \ldots, x_N) \propto \exp(-\beta(\alpha-1))\,(Q + Q_1 + 2\nu)^{-\left(\frac{N+1}{2}+\delta\right)}$                            (15)

where $d_1 = \sum_{i=1}^{N} a_i^2 T_{ii} + 2\sum_{i<j;\,i=1}^{N} a_i a_j T_{ij}$, $d_2 = \sum_{i=1}^{N} a_i^2 P_i^2 + 2\sum_{i<j;\,i=1}^{N} a_i a_j P_{ij}$, $d_3 = \sum_{i=1}^{N} a_i T_{i0}$, $d_4 = \sum_{i=1}^{N} a_i P_i$, and $(Q + Q_1 + 2\nu) = k^2(d_1 + d_2) - 2k(d_3 + d_4 x_{N+1}) + (x_{N+1}^2 + T_{00} + 2\nu)$.

Thus,

$P(x_{N+1}, k, \alpha, \theta, \phi/x_1, \ldots, x_N) \propto \exp(-\beta(\alpha-1))\left\{C_1\left[1 + E_1(k - C_2)^2\right]\right\}^{-d_0}$ (16)

where $C_1 = x_{N+1}^2 + T_{00} + 2\nu - \left[(d_3 + d_4 x_{N+1})^2/(d_1 + d_2)\right]$, $C_2 = (d_3 + d_4 x_{N+1})/(d_1 + d_2)$, $E_1 = (d_1 + d_2)/C_1$ and $d_0 = \frac{N+1}{2} + \delta$.

Further, integrating out $k$ from (16), we get

$P(x_{N+1}, \alpha, \theta, \phi/x_1, \ldots, x_N) \propto \exp(-\beta(\alpha-1))\,C_1^{-d_0}E_1^{-1/2}$                                   (17)

which, for fixed $\alpha$, $\theta$ and $\phi$, is the conditional predictive distribution of $x_{N+1}$. Further elimination of the parameters $\alpha$, $\theta$ and $\phi$ from (17) is not possible analytically, so the marginal posterior density of $x_{N+1}$ cannot be expressed in closed form. Since the distribution in (17) is analytically intractable, a complete Bayesian analysis is possible only by numerical integration or by a simulation-based approach, viz., an MCMC technique.

If one wants a point estimate (the posterior mean) of $x_{N+1}$, one should compute the marginal posterior density of $x_{N+1}$ from (17) and use it to calculate the marginal posterior mean of $x_{N+1}$. Four-dimensional numerical integration would thus be necessary to estimate $x_{N+1}$, which is a very difficult problem.

In practice, to reduce the dimension of the numerical integration, one may substitute the estimates (posterior means) $\hat{\alpha}$, $\hat{\theta}$ and $\hat{\phi}$ in place of $\alpha$, $\theta$ and $\phi$, and then perform a one-dimensional numerical integration to find the conditional mean of $X_{N+1}$. That is, one eliminates as many parameters as possible by analytical methods and then uses conditional estimates of the remaining parameters to compute the marginal posterior mean of the future observation.
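The one-dimensional integration just described can be sketched as follows. Conditional on plug-in values of $\alpha$, $\theta$ and $\phi$, the density (17), viewed as a function of $x_{N+1}$, is proportional to $C_1^{-(N/2+\delta)}$ (since $E_1 = (d_1+d_2)/C_1$ and $d_1 + d_2$ does not involve $x_{N+1}$). It can therefore be normalised and averaged on a grid (a Python sketch; the grid width and the hyperparameters $\nu$, $\delta$ are illustrative choices):

```python
import numpy as np

def predictive_mean_17(x, alpha, theta, phi, nu=1.0, delta=1.0, ngrid=4001, span=6.0):
    """Mean of the conditional predictive density of X_{N+1} given alpha, theta, phi,
    obtained from (17) by one-dimensional numerical integration on a grid."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    r = np.arange(1, N + 1)
    a = np.sin(r * theta) * np.cos(r * phi) / alpha ** r
    z = np.zeros(N)                               # z_t = sum_r a_r x_{t-r}, zero past
    for t in range(N):
        z[t] = a[:t] @ (x[t - 1::-1] if t > 0 else x[:0])
    d1 = z @ z
    d3 = z @ x
    d4 = a @ x[::-1]                              # sum_i a_i x_{N+1-i}
    d2 = d4 * d4                                  # quadratic term of Q1 in k
    T00 = x @ x
    g = np.linspace(x.mean() - span, x.mean() + span, ngrid)
    C1 = g ** 2 + T00 + 2 * nu - (d3 + d4 * g) ** 2 / (d1 + d2)
    dens = C1 ** (-(N / 2 + delta))               # (17) as a function of x_{N+1}
    dens /= dens.sum() * (g[1] - g[0])            # normalise on the grid
    return float(np.sum(g * dens) * (g[1] - g[0]))
```

Note that $C_1 \ge 2\nu > 0$ always (it is the minimum over $k$ of a sum of squares plus $2\nu$), so the density is well defined on the whole grid.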

Numerical example - canadian lynx data

A numerical example is considered to illustrate the one-step-ahead predictive analysis of a future observation, using the Canadian lynx data. These data consist of the annual record of the number of Canadian lynx trapped in the Mackenzie River district of North-West Canada for the period 1821-1934 (both years inclusive), a total of 114 observations. Brockwell and Davis5 (page 501) transformed these data using the logarithmic transformation for the purpose of statistical analysis. The transformed data are used in our Bayesian predictive analysis.

The Bayesian predictive distribution of the (r+1)th observation, based on the first r observations, is obtained, and the mean of this distribution is taken as the (r+1)th predicted value of the lynx data. Since direct evaluation of the mean of the one-step-ahead predictive distribution involves four-dimensional numerical integration, the conditional predictive distribution of $X_{N+1}$ given by (17), obtained by fixing the parameters $k$, $\alpha$, $\theta$ and $\phi$ at their respective estimated values, is used instead of the marginal predictive distribution, and its mean (the posterior mean) is computed numerically in FORTRAN. The prediction is carried out for r = 11, 12, ..., 114, the first 10 observations being used as initial observations for estimating the parameters of the model. Table 1 contains both the true values and the one-step-ahead predicted values of the transformed data. Figure 1 shows the original data together with the one-step-ahead predicted values, and Figure 2 shows the last 14 observations of the original data together with the values predicted by the different methods.

S. No.  Y      Ŷ     | S. No.  Y      Ŷ     | S. No.  Y      Ŷ
1       2.430  -     | 41      2.373  2.283 | 81      2.880  2.963
2       2.506  -     | 42      2.389  2.360 | 82      3.115  3.143
3       2.767  -     | 43      2.742  2.726 | 83      3.540  3.633
4       2.940  -     | 44      3.210  3.292 | 84      3.845  3.881
5       3.169  -     | 45      3.520  3.569 | 85      3.800  3.713
6       3.450  -     | 46      3.828  3.856 | 86      3.579  3.494
7       3.594  -     | 47      3.628  3.542 | 87      3.264  3.249
8       3.774  -     | 48      2.837  2.656 | 88      2.538  2.306
9       3.695  -     | 49      2.406  2.252 | 89      2.582  2.547
10      3.411  -     | 50      2.675  2.614 | 90      2.907  2.917
11      2.718  2.582 | 51      2.554  2.481 | 91      3.142  3.204
12      1.991  1.767 | 52      2.894  2.973 | 92      3.433  3.473
13      2.265  2.181 | 53      3.202  3.248 | 93      3.580  3.562
14      2.446  2.413 | 54      3.224  3.229 | 94      3.490  3.408
15      2.612  2.650 | 55      3.352  3.344 | 95      3.475  3.406
16      3.359  3.482 | 56      3.154  3.062 | 96      3.579  3.539
17      3.429  3.468 | 57      2.878  2.765 | 97      2.829  2.663
18      3.533  3.596 | 58      2.476  2.023 | 98      1.909  1.587
19      3.261  3.182 | 59      2.303  2.255 | 99      1.903  1.833
20      2.612  2.444 | 60      2.360  2.315 | 100     2.033  2.069
21      2.179  1.999 | 61      2.671  2.672 | 101     2.360  2.439
22      1.653  1.461 | 62      2.867  2.934 | 102     2.601  2.621
23      1.832  1.801 | 63      3.310  3.466 | 103     3.054  3.108
24      2.328  2.385 | 64      3.449  3.479 | 104     3.386  3.409
25      2.737  2.839 | 65      3.646  3.684 | 105     3.553  3.528
26      3.014  3.069 | 66      3.400  3.296 | 106     3.468  3.454
27      3.328  3.380 | 67      2.590  2.399 | 107     3.187  3.150
28      3.404  3.405 | 68      1.863  1.806 | 108     2.723  2.518
29      2.981  2.849 | 69      1.591  1.454 | 109     2.686  2.646
30      2.557  2.379 | 70      1.690  1.677 | 110     2.821  2.864
31      2.576  2.500 | 71      1.771  1.766 | 111     3.000  3.053
32      2.352  2.260 | 72      2.274  2.398 | 112     3.201  3.231
33      2.556  2.569 | 73      2.576  2.642 | 113     3.424  3.464
34      2.864  2.895 | 74      3.111  3.241 | 114     3.531  3.512
35      3.214  3.296 | 75      3.605  3.683 |
36      3.435  3.481 | 76      3.543  3.499 |
37      3.458  3.449 | 77      2.769  2.589 |
38      3.326  3.263 | 78      2.021  1.877 |
39      2.835  2.668 | 79      2.185  2.105 |
40      2.476  2.325 | 80      2.588  2.671 |

Y – lynx data (transformed); Ŷ – one-step-ahead predicted value

Table 1  One-step-ahead predicted values of the transformed lynx data

Figure 1 Original data and one-step-ahead predicted values of the same.

Figure 2 Predicted values through different methods.

A comparison of the one-step-ahead predicted values from the FRAR model with those from other models for these data available in the literature is given in the following section.

Comparative study

Lin6 has studied the Canadian lynx data through various time series models, and Nicholls and Quinn7 have used the Canadian lynx data to compare the quality of the predicted values obtained by several methods, viz., (1) Moran-1, (2) Tong, (3) NQ-1, (4) Moran-2 and (5) NQ-2.

Moran-1 refers to the linear predictor obtained from the second-order autoregressive model, Tong to the linear predictor from the autoregressive model of order eleven, and NQ-1 to the linear predictor obtained from the second-order random coefficient model, while Moran-2 and NQ-2 denote non-linear predictors for the lynx data. The models and other details can be found in Nicholls and Quinn.7

Nicholls and Quinn7 used these methods to predict the last 14 values of the Canadian lynx data and calculated the error sum of squares (their Table 8.1, page 146). To compare the predictive efficiency of the new FRAR model with those of the methods stated above, that table is reproduced here as Table 2, with the values predicted by the FRAR model added as an additional column. The error sum of squares for the last 14 predicted values is 0.0637 under the FRAR model, whereas it is 0.2531, 0.2541, 0.2561, 0.2070 and 0.1887, respectively, under the other methods. So, at least in the above context, the superiority of the FRAR model is established.
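The FRAR error sum of squares can be reproduced directly from the columns of Table 2 (a short Python check; the actual and FRAR-predicted values are transcribed from the table):

```python
import numpy as np

# Last 14 values of the transformed lynx data (1921-1934) and the FRAR predictions (Table 2)
actual = np.array([2.3598, 2.6010, 3.0538, 3.3860, 3.5532, 3.4676, 3.1867,
                   2.7235, 2.6857, 2.8209, 3.0000, 3.2014, 3.4244, 3.5309])
frar = np.array([2.4390, 2.6210, 3.1080, 3.4090, 3.5280, 3.4540, 3.1500,
                 2.5180, 2.6460, 2.8640, 3.0530, 3.2310, 3.4640, 3.5120])
ess = float(np.sum((actual - frar) ** 2))
print(round(ess, 4))   # close to the 0.0637 reported in the text
```

The same two-line computation, applied to the other prediction columns, reproduces the remaining error sums of squares in Table 2.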

S. No.  Year  Lynx data  Moran-1  Tong    NQ-1    Moran-2  NQ-2    FRAR
1       1921  2.3598     2.4448   2.4559  2.4596  2.3835   2.3842  2.4390
2       1922  2.6010     2.7971   2.8088  2.8173  2.6271   2.6323  2.6210
3       1923  3.0538     2.8850   2.8991  2.8989  3.1193   3.0955  3.1080
4       1924  3.3860     3.3285   3.2306  3.3474  3.3883   3.3971  3.4090
5       1925  3.5532     3.4471   3.3879  3.4571  3.4955   3.4999  3.5280
6       1926  3.4676     3.4289   3.3321  3.4296  3.4787   3.4781  3.4540
7       1927  3.1867     3.1859   3.0060  3.1759  3.2683   3.2555  3.1500
8       1928  2.7235     2.8628   2.6875  2.8468  2.6405   2.6587  2.5180
9       1929  2.6857     2.4348   2.4286  2.4153  2.3747   2.3650  2.6460
10      1930  2.8209     2.7296   2.7643  2.7299  2.5977   2.6292  2.8640
11      1931  3.0000     2.9440   2.9838  2.9508  3.1277   3.0927  3.0530
12      1932  3.2014     3.0897   3.2169  3.0966  3.1981   3.1762  3.2310
13      1933  3.4244     3.2331   3.3656  3.2390  3.3065   3.2956  3.4640
14      1934  3.5309     3.3896   3.5035  3.3942  3.443    3.4413  3.5120

Error sum of squares         0.2531   0.2541  0.2561  0.2070   0.1887  0.0637

Table 2 One-step-ahead predictors of the transformed lynx data

Summary and conclusion

The Full Range Autoregressive model provides an acceptable alternative to the existing methodology. The main advantage of the new method is that it completely avoids the problem of determining the order of the model that arises in the existing methods.

Thus, it is reasonable to claim that the FRAR model proposed here, together with its Bayesian analysis, provides a viable alternative to existing time series methodology, completely avoiding the problem of order determination.

Acknowledgements

We thank Prof. M. Rajagopalan, the editor and the referees for helpful suggestions that significantly improved the quality of this article.

Conflicts of interest

None

References
