eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 7 Issue 5

The application of Monte Carlo methods for learning generalized linear model

Bochao Jia

Eli Lilly and Company, Lilly Corporate Center, USA

Correspondence: Bochao Jia, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN 46285, USA

Received: September 13, 2018 | Published: September 24, 2018

Citation: Jia B. The application of Monte Carlo methods for learning generalized linear model. Biom Biostat Int J. 2018;7(5):422-428. DOI: 10.15406/bbij.2018.07.00241


Abstract

The Monte Carlo method is a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. Monte Carlo methods are often used in physical and mathematical problems and are most useful when other mathematical methods are difficult or impossible to apply. Statisticians have been increasingly drawn to Monte Carlo methods in three distinct problem classes: optimization, numerical integration, and generating draws from a probability distribution. In this paper, we introduce Monte Carlo methods for estimating regression coefficients in the generalized linear model (GLM), with a focus on logistic regression. Our main methods are the Metropolis-Hastings (MH) algorithm and Stochastic Approximation Monte Carlo (SAMC). For comparison, we also obtain results using the maximum likelihood estimation (MLE) routine in the R software. We then apply the Monte Carlo algorithms to a real data example and compare the efficiency of each method.

Keywords: Monte Carlo method, GLM, MH-algorithms, SAMC

Introduction

Markov chain Monte Carlo (MCMC) methods are a class of algorithms for sampling from a probability distribution by constructing a Markov chain that has the desired distribution as its equilibrium distribution. The Metropolis-Hastings (MH) algorithm is one of the best known of these methods; it was developed by Metropolis et al.1 and subsequently generalized by Hastings.2 The MH algorithm was first used in physics and remained little known to statisticians until Muller3 and Tierney4 provided important applications. Chib & Greenberg5 then gave an accessible account, including the relevant Markov chain theory, after which many more applications appeared in the literature. In the next section, we provide a brief introduction to the MH algorithm and its working principles. We also discuss three variants of the MH algorithm based on different sampling methods.

Stochastic Approximation Monte Carlo (SAMC) was first introduced by Liang et al.6 As is well known, the Metropolis-Hastings (MH) algorithm and the Gibbs sampler (Geman7) are prone to becoming trapped in local energy minima when simulating from a system whose energy landscape is rugged. In physics terms, the negative of the logarithmic density/mass function is called the energy function of the system. To overcome this problem, many advanced Monte Carlo algorithms have been proposed, such as parallel tempering (Geyer,8 Hukushima & Nemoto9), simulated tempering (Marinari & Parisi,10 Geyer & Thompson11), evolutionary Monte Carlo (Liang & Wong12), dynamic weighting (Wong & Liang13), multicanonical sampling (Berg & Neuhaus14), 1/k-ensemble sampling (Hesselbo & Stinchcombe15), the Wang-Landau algorithm (Wang & Landau16), the equi-energy sampler (Mitsutake et al.17), and stochastic approximation Monte Carlo (Liang et al.6), among others. In this paper, we adopt the SAMC algorithm for our problem; it has been proven to be a sophisticated algorithm in both theory and applications. The basic idea of SAMC stems from the Wang-Landau algorithm, extends it to continuum systems, and is designed to simulate data with complex model structures.

The rest of the paper is organized as follows. In Section 2, we introduce three sampling methods based on the Metropolis-Hastings algorithm, as well as the SAMC algorithm, for parameter estimation in logistic regression. In Section 3, we present a simple simulation study of all the methods and compare the results with the MLE method provided in R. In Section 4, we apply the proposed methods to an Adult Intelligence dataset. In Section 5, we conclude the paper with a brief discussion.

Method

Metropolis-Hastings algorithms

The working principle of Markov chain Monte Carlo methods is quite straightforward to describe. Given a target density $f$, we build a Markov kernel $K$ with stationary distribution $f$ and then generate a Markov chain $(X^{(t)})$ using this kernel, so that the limiting distribution of $(X^{(t)})$ is $f$ and integrals can be approximated according to the ergodic theorem. The difficulty thus lies in constructing a kernel $K$ that is associated with an arbitrary density. But, quite miraculously, there exist methods for deriving such kernels that are universal in that they are theoretically valid for any density $f$.

The Metropolis-Hastings algorithm is an example of such a method. Given the target density $f$, it is associated with a working conditional density $q(y|x)$ that, in practice, is easy to simulate. In addition, $q$ can be almost arbitrary: the only theoretical requirements are that the ratio $f(y)/q(y|x)$ is known up to a constant independent of $x$, and that $q$ has enough dispersion to lead to an exploration of the entire support of $f$. For every given $q$, we can then construct a Metropolis-Hastings kernel such that $f$ is its stationary distribution. Here are the general steps of the Metropolis-Hastings algorithm.

Algorithm (Metropolis-Hastings)

  1. Initialize $x^{(0)} \sim q(x)$.
  2. Given the current value $x^{(t)}$, generate a candidate $y \sim q(y|x^{(t)})$.
  3. Calculate the acceptance probability $\alpha(x^{(t)}, y) = \min\left\{ \frac{f(y)\,q(x^{(t)}|y)}{f(x^{(t)})\,q(y|x^{(t)})}, 1 \right\}$.
  4. Take $x^{(t+1)} = y$ with probability $\alpha(x^{(t)}, y)$ and $x^{(t+1)} = x^{(t)}$ otherwise.
  5. Iterate between step 2 and step 4 until convergence.
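The steps above can be sketched generically. The following is a minimal Python implementation; the function name `metropolis_hastings` and the standard-normal toy target are illustrative choices, not part of the paper.

```python
import numpy as np

def metropolis_hastings(log_f, sample_q, log_q, x0, n_iter, seed=0):
    """Generic Metropolis-Hastings sampler for a (possibly unnormalized)
    target with log-density log_f, proposal sampler sample_q(x, rng), and
    log proposal density log_q(y, x) = log q(y|x)."""
    rng = np.random.default_rng(seed)
    x = x0
    chain = [x]
    for _ in range(n_iter):
        y = sample_q(x, rng)
        # Step 3: log acceptance ratio log[ f(y) q(x|y) / (f(x) q(y|x)) ]
        log_alpha = log_f(y) + log_q(x, y) - log_f(x) - log_q(y, x)
        # Step 4: accept with probability min(alpha, 1)
        if np.log(rng.uniform()) < log_alpha:
            x = y
        chain.append(x)
    return np.array(chain)

# Toy check: sample from N(0, 1) with a random-walk proposal. The proposal
# is symmetric, so its two log_q terms cancel; they are kept for generality.
chain = metropolis_hastings(
    log_f=lambda x: -0.5 * x**2,
    sample_q=lambda x, rng: x + rng.normal(scale=1.0),
    log_q=lambda y, x: -0.5 * (y - x)**2,
    x0=0.0, n_iter=5000)
```

Working with log densities avoids numerical underflow when the target is a product of many likelihood terms, as in the regression setting below.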

In this paper, we consider logistic regression. We want to use the Metropolis-Hastings algorithm to fit the parameters of the model, i.e., $\beta_0$ and $\beta_1$ in the following function.

$Y_i \sim \mathrm{Bin}(1, \pi_i), \qquad \log\frac{\pi_i}{1-\pi_i} = \beta_0 + x_i\beta_1, \qquad i = 1, \ldots, n$ (1)

The likelihood function is

$f(y|\beta_0, \beta_1) = \prod_{i=1}^{n} \left( \frac{e^{\beta_0 + x_i\beta_1}}{1 + e^{\beta_0 + x_i\beta_1}} \right)^{y_i} \left( \frac{1}{1 + e^{\beta_0 + x_i\beta_1}} \right)^{1 - y_i}$ (2)

$= \exp\left\{ n\bar{y}\beta_0 + \beta_1 \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} \log(1 + e^{\beta_0 + x_i\beta_1}) \right\}$ (3)

Consider the prior distributions of $\beta_0$ and $\beta_1$ as independent normal distributions,

$\beta_j \sim N(\mu_j, \sigma_j^2), \quad j = 0, 1.$ (4)

Then the posterior distribution is

$f(\beta_0, \beta_1 | y) \propto f(y|\beta_0, \beta_1)\,\pi(\beta_0, \beta_1)$

$\propto \exp\left\{ n\bar{y}\beta_0 + \beta_1 \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} \log(1 + e^{\beta_0 + x_i\beta_1}) - \frac{(\beta_0 - \mu_0)^2}{2\sigma_0^2} - \frac{(\beta_1 - \mu_1)^2}{2\sigma_1^2} \right\}$
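The posterior above, combining the log-likelihood (3) with the normal log-priors (4) up to additive constants, is easy to evaluate directly. A sketch in Python; the default prior means and variances are illustrative assumptions, not values from the paper.

```python
import numpy as np

def log_posterior(beta, x, y, mu=(0.0, 0.0), sigma=(10.0, 10.0)):
    """Log posterior of (beta0, beta1) up to an additive constant:
    log-likelihood (3) plus independent N(mu_j, sigma_j^2) log-priors (4)."""
    b0, b1 = beta
    x, y = np.asarray(x, float), np.asarray(y, float)
    eta = b0 + b1 * x
    loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
    logprior = (-(b0 - mu[0])**2 / (2.0 * sigma[0]**2)
                - (b1 - mu[1])**2 / (2.0 * sigma[1]**2))
    return loglik + logprior
```

Since the MH acceptance ratio only involves posterior ratios, the unknown normalizing constant of the posterior never needs to be computed.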

We then need to generate the Markov sequence of the parameters. We propose to consider three sampling methods based on the general Metropolis-Hastings algorithm, namely independent sampling, dependent sampling, and individual sampling.

Independent sampling: Assuming that the parameters $\beta_0$ and $\beta_1$ are independent, we can generate $\beta^{(t)} = (\beta_0^{(t)}, \beta_1^{(t)})$ from the proposal distribution $N(\beta^{(t-1)}, \mathrm{diag}(\sigma_0^2, \sigma_1^2))$. The independent sampling can then be stated as follows.

Algorithm (Independent sampling)

  1. Initialize $\beta^{(0)} = (\beta_0^{(0)}, \beta_1^{(0)})$.
  2. Generate $\beta' = (\beta_0', \beta_1')$ from the proposal distribution $N(\beta^{(t-1)}, \mathrm{diag}(\sigma_0^2, \sigma_1^2))$.
  3. Calculate the acceptance probability
     $\alpha(\beta^{(t-1)}, \beta') = \min\left\{ \frac{f(y|\beta_0', \beta_1')\,q(\beta_0', \beta_1')}{f(y|\beta_0^{(t-1)}, \beta_1^{(t-1)})\,q(\beta_0^{(t-1)}, \beta_1^{(t-1)})}, 1 \right\}$.
  4. Accept $\beta^{(t)} = \beta'$ with probability $\alpha(\beta^{(t-1)}, \beta')$ or set $\beta^{(t)} = \beta^{(t-1)}$ otherwise.
  5. Iterate between step 2 and step 4 until convergence.
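As an illustration of independent sampling, the following Python sketch runs a random-walk MH chain on data simulated from model (1). The true coefficients (-1, 2), sample size, proposal scale, and chain length are all illustrative assumptions, not values from the paper; the prior terms are folded into the log-posterior, and the symmetric normal proposal cancels from the acceptance ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from model (1) with illustrative true coefficients (-1, 2).
n = 500
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))
y = rng.binomial(1, p)

def log_post(b0, b1, s0=10.0, s1=10.0):
    """Log-likelihood (3) plus independent N(0, s_j^2) log-priors."""
    eta = b0 + b1 * x
    return (np.sum(y * eta - np.log1p(np.exp(eta)))
            - b0**2 / (2.0 * s0**2) - b1**2 / (2.0 * s1**2))

# Independent sampling: random-walk proposal N(beta^(t-1), diag(tau^2, tau^2));
# the symmetric proposal density cancels in the acceptance ratio.
beta = np.zeros(2)
chain = np.empty((5000, 2))
for t in range(5000):
    cand = beta + rng.normal(scale=0.2, size=2)
    if np.log(rng.uniform()) < log_post(*cand) - log_post(*beta):
        beta = cand
    chain[t] = beta

est = chain[1000:].mean(axis=0)  # posterior mean after burn-in
```

The posterior mean of the retained draws then serves as the point estimate of $(\beta_0, \beta_1)$, analogous to the MLE reported by R's `glm` routine.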

Dependent sampling: If the parameters are not independent, the convergence of the stochastic sequence may be less efficient. We therefore consider a multivariate normal prior for $\beta$. In this case, we use the Fisher information matrix $H(\beta)$ to build the proposal distribution.

$\beta' \sim q = N(\beta, c_\beta^2 [H(\beta)]^{-1})$ (5)

where $c_\beta$ is a regulation parameter, which can be adjusted to reach the target acceptance rate. Based on the likelihood function, we get the information matrix

$H(\beta) = X^T \mathrm{diag}(h_i) X + \Sigma_\beta^{-1}$ (6)

where $\Sigma_\beta$ is the prior covariance matrix of $\beta$, $h_i = \exp(\beta_0 + \beta_1 x_i)/(1 + \exp(\beta_0 + \beta_1 x_i))^2$, and $X = (\mathbf{1}_n, x)$ is an $n \times 2$ design matrix.
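Equation (6) is straightforward to compute from the design matrix and the current parameter value. A small Python sketch; the default prior covariance $\mathrm{diag}(100, 100)$ (i.e., $\sigma_j = 10$) is an illustrative assumption.

```python
import numpy as np

def fisher_information(beta, x, prior_cov=np.diag([100.0, 100.0])):
    """H(beta) = X^T diag(h_i) X + Sigma_beta^{-1}, equation (6), where
    h_i = exp(beta0 + beta1*x_i) / (1 + exp(beta0 + beta1*x_i))^2 and
    X = (1_n, x) is the n x 2 design matrix."""
    x = np.asarray(x, dtype=float)
    b0, b1 = beta
    eta = b0 + b1 * x
    h = np.exp(eta) / (1.0 + np.exp(eta))**2   # Bernoulli variance terms
    X = np.column_stack([np.ones_like(x), x])
    return X.T @ (h[:, None] * X) + np.linalg.inv(prior_cov)
```

Because $h_i \le 1/4$ and $\Sigma_\beta^{-1}$ is positive definite, $H(\beta)$ is always invertible, so the proposal covariance in (5) is well defined.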

Then we give the algorithm for this method:

Algorithm (Dependent sampling)

  1. Initialize $\beta^{(0)} = (\beta_0^{(0)}, \beta_1^{(0)})$.
  2. Calculate the Fisher information matrix:
     $\mathrm{diag}(h_i) = \mathrm{diag}\left( \frac{\exp(\beta_0^{(t-1)} + \beta_1^{(t-1)} x_i)}{(1 + \exp(\beta_0^{(t-1)} + \beta_1^{(t-1)} x_i))^2} \right)$
     $H(\beta^{(t-1)}) = X^T \mathrm{diag}(h_i) X + \Sigma_{\beta^{(t-1)}}^{-1}, \qquad S_{\beta^{(t-1)}} = c_{\beta^{(t-1)}}^2 [H(\beta^{(t-1)})]^{-1}$
  3. Generate $\beta' = (\beta_0', \beta_1')$ from the proposal distribution $N(\beta^{(t-1)}, S_{\beta^{(t-1)}})$.
  4. Calculate the acceptance probability
     $\alpha(\beta^{(t-1)}, \beta') = \min\left\{ \frac{f(y|\beta_0', \beta_1')\,\pi(\beta_0', \beta_1')\,q(\beta^{(t-1)} | \beta', S_{\beta'})}{f(y|\beta_0^{(t-1)}, \beta_1^{(t-1)})\,\pi(\beta_0^{(t-1)}, \beta_1^{(t-1)})\,q(\beta' | \beta^{(t-1)}, S_{\beta^{(t-1)}})}, 1 \right\}$
  5. Accept $\beta^{(t)} = \beta'$ with probability $\alpha(\beta^{(t-1)}, \beta')$ or set $\beta^{(t)} = \beta^{(t-1)}$ otherwise.
  10. Iterate between step (ii) and step (v) until converge.
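As an illustration, the Dependent Sampling update can be sketched in Python with NumPy. This is a simplified sketch, not the paper's exact implementation: the prior $\pi(\beta)$ is taken to be flat, and the proposal covariance $S = c_\beta^2\,H^{-1}$ is computed once at the starting point rather than refreshed at every step, which makes the proposal symmetric and lets the $q$ terms cancel in the acceptance ratio. The function name `dependent_mh` and the tuning constant `c` are illustrative.

```python
import numpy as np

def log_post(beta, X, y):
    """Log-posterior of logistic regression under a flat prior (assumption)."""
    eta = X @ beta
    return y @ eta - np.sum(np.log1p(np.exp(eta)))

def dependent_mh(X, y, n_iter=5000, c=1.0, seed=0):
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    # Hessian of the negative log-likelihood at the starting point, used once
    # for the proposal covariance S = c^2 * H^{-1}.  (Simplification: refreshing
    # S at every step would require the q terms in the acceptance ratio.)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    H = X.T @ (X * (p * (1 - p))[:, None])
    S = c**2 * np.linalg.inv(H)
    L = np.linalg.cholesky(S)
    chain = np.empty((n_iter, X.shape[1]))
    for t in range(n_iter):
        prop = beta + L @ rng.standard_normal(beta.size)
        # symmetric proposal: accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < log_post(prop, X, y) - log_post(beta, X, y):
            beta = prop
        chain[t] = beta
    return chain
```

A single Hessian-based covariance gives steps roughly on the scale of the posterior, which is the point of the dependent proposal.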

Individual sampling

In the Dependent Sampling method, we need to tune the regularization parameter $c_\beta$ so that the sampler reaches the target acceptance rate. To avoid this tuning step, we propose another MH sampling algorithm, Individual Sampling, in which each parameter is generated individually from a normal proposal distribution. We give the algorithm only for the two-parameter case: when the dimension is large, an acceptance rate must be computed for each parameter, which becomes too time-consuming.

Algorithm (Individual Sampling)

  1. Initialize $\beta^{(0)} = (\beta_0^{(0)}, \beta_1^{(0)})$.
  2. Generate $\beta_0'$ from the proposal distribution $N(\beta_0^{(t-1)}, \sigma_0^2)$.
  3. Set $\beta' = (\beta_0', \beta_1^{(t-1)})$ and calculate the acceptance rate
  4. $\alpha_0(\beta^{(t-1)},\beta') = \min\left\{\dfrac{f(y\mid\beta_0',\beta_1^{(t-1)})\,\pi(\beta_0',\beta_1^{(t-1)})}{f(y\mid\beta_0^{(t-1)},\beta_1^{(t-1)})\,\pi(\beta_0^{(t-1)},\beta_1^{(t-1)})},\,1\right\}$
  5. Accept $\beta^{(t)} = \beta'$ with probability $\alpha_0(\beta^{(t-1)},\beta')$, or set $\beta^{(t)} = \beta^{(t-1)}$ otherwise.
  6. Generate $\beta_1'$ from the proposal distribution $N(\beta_1^{(t)}, \sigma_1^2)$.
  7. Set $\beta' = (\beta_0^{(t)}, \beta_1')$ and calculate the acceptance rate
  8. $\alpha_1(\beta^{(t)},\beta') = \min\left\{\dfrac{f(y\mid\beta_0^{(t)},\beta_1')\,\pi(\beta_0^{(t)},\beta_1')}{f(y\mid\beta_0^{(t)},\beta_1^{(t)})\,\pi(\beta_0^{(t)},\beta_1^{(t)})},\,1\right\}$
  9. Accept $\beta^{(t+1)} = \beta'$ with probability $\alpha_1(\beta^{(t)},\beta')$, or set $\beta^{(t+1)} = \beta^{(t)}$ otherwise.
  10. Iterate steps (ii)–(vii) until convergence.
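A minimal sketch of the Individual (component-wise) Sampling scheme, again under the simplifying assumption of a flat prior; the step sizes `sigmas` are hypothetical tuning values and `individual_mh` is an illustrative name.

```python
import numpy as np

def log_post(beta, X, y):
    """Log-posterior of logistic regression under a flat prior (assumption)."""
    eta = X @ beta
    return y @ eta - np.sum(np.log1p(np.exp(eta)))

def individual_mh(X, y, n_iter=3000, sigmas=(0.25, 0.25), seed=0):
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    chain = np.empty((n_iter, beta.size))
    for t in range(n_iter):
        for j in range(beta.size):
            # propose a change to coordinate j only, holding the others fixed
            prop = beta.copy()
            prop[j] += sigmas[j] * rng.standard_normal()
            # symmetric normal proposal, so the q terms cancel
            if np.log(rng.uniform()) < log_post(prop, X, y) - log_post(beta, X, y):
                beta = prop
        chain[t] = beta
    return chain
```

Each sweep evaluates the likelihood once per coordinate, which is why the text notes this scheme becomes expensive in high dimensions.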

Stochastic approximation Monte Carlo computation (SAMC)

We have introduced three MH algorithms above for learning the logistic regression. In this part, we illustrate how Stochastic Approximation Monte Carlo (SAMC) can be applied to the same problem.

Consider the sampling distribution $p(x) = c\,p_0(x)$, where $c$ is a constant and $x$ is generated from the sample space $\chi$. Let $E_1,\dots,E_m$ denote disjoint regions that form a partition of $\chi$, which can be made according to any function of $x$, such as the energy function $U(x)$, as follows: $E_1 = \{x : U(x) \le u_1\}$, $E_2 = \{x : u_1 < U(x) \le u_2\}$, $\dots$, $E_m = \{x : U(x) > u_{m-1}\}$. Then let $\theta_{ti}$ denote the estimate of $\log\int_{E_i}\psi(x)\,dx$ at iteration $t$, and rewrite the invariant distribution in the generalized Wang-Landau (GWL) algorithm as

                $p_{\theta_t}(x) \propto \displaystyle\sum_{i=1}^{m} \frac{\psi(x)}{e^{\theta_{ti}}}\, I(x \in E_i)$     (7)

For theoretical simplicity, we assume that $\theta_t \in \Theta$ for all $t$, where $\Theta$ is a compact set. Let $\pi = (\pi_1,\dots,\pi_m)$ be an $m$-vector with $\pi_i > 0$ and $\sum_{i=1}^{m}\pi_i = 1$, which defines the desired sampling frequency for each of the subregions. Henceforth, $\pi$ is called the desired sampling distribution. Let $\{\gamma_t\}$ be a positive, nonincreasing sequence satisfying

(a) $\sum_{t=1}^{\infty}\gamma_t = \infty$  and  (b) $\sum_{t=1}^{\infty}\gamma_t^{\eta} < \infty$     (8)

for some $\eta \in (1,2)$. For example, in this article we set

$\gamma_t = \dfrac{t_0}{\max(t_0, t)}, \quad t = 1,2,\dots,$     (9)

 

for some specified value of $t_0$.

 

In the logistic regression model, the invariant distribution is $p_{\theta_t}(\beta) \propto \exp\left\{ n\bar{y}\,\beta_0 + \beta_1\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n}\log(1+e^{\beta_0+x_i\beta_1}) \right\}$ and the proposal distribution is $q(\beta' \mid \beta^{(t)}) = N(\beta^{(t)}, I_2)$, where $I_2$ is the $2\times 2$ identity matrix. With the foregoing notation, one iteration of SAMC can be described as follows:

Algorithm (SAMC)

  1. Generate a sample $\beta^{(t)} = (\beta_0^{(t)}, \beta_1^{(t)})$ by a single MH update with the target distribution given by

                $p_{\theta_t}(\beta) \propto \exp\left\{ n\bar{y}\,\beta_0 + \beta_1\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n}\log(1+e^{\beta_0+x_i\beta_1}) \right\}$

  2. Generate $\beta'$ according to the proposal distribution $q(\beta' \mid \beta^{(t)}) = N(\beta^{(t)}, I_2)$. If $J(\beta') \notin S$, then update $S$ to $S \cup \{J(\beta')\}$, where $S$ denotes the collection of indices of the subregions from which a sample has been proposed and $J(\beta')$ denotes the index of the subregion containing $\beta'$.

  3. Calculate the ratio

$r = e^{\theta_{tJ(\beta^{(t)})} - \theta_{tJ(\beta')}}\,\dfrac{\psi(\beta')\,q(\beta^{(t)} \mid \beta')}{\psi(\beta^{(t)})\,q(\beta' \mid \beta^{(t)})}$

  4. Accept the proposal with probability $\min(1, r)$. If accepted, set $\beta^{(t+1)} = \beta'$; otherwise, set $\beta^{(t+1)} = \beta^{(t)}$.

  5. For $i = 1,\dots,m$, update $\theta_{t,i}$ to $\theta_{t+1,i}$ as follows:

$\theta_{t+1,i} = \theta_{t,i} + \gamma_{t+1}(e_{t+1,i} - \pi_i)$

where $e_{t+1,i} = 1$ if $\beta^{(t+1)} \in E_i$, and $0$ otherwise.
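The SAMC iteration above can be sketched as follows. The energy cutpoints `cuts`, the gain-factor constant `t0` (as in eq. (9)), the uniform desired distribution $\pi$, and the random-walk step size are all illustrative choices, and `log_psi` plays the role of $\log\psi(\cdot)$, the unnormalized log target. Note that SAMC draws come from the $\theta$-weighted distribution, not from $p$ itself, so expectations under $p$ would require reweighting the draws.

```python
import numpy as np

def samc(log_psi, dim, cuts, n_iter=10000, t0=500, step=1.0, seed=0):
    """One-chain SAMC sketch.  `cuts` are energy cutpoints u_1 < ... < u_{m-1}
    defining subregions E_i via the energy U(x) = -log_psi(x)."""
    rng = np.random.default_rng(seed)
    m = len(cuts) + 1
    pi = np.full(m, 1.0 / m)         # uniform desired sampling distribution
    theta = np.zeros(m)              # working log-weight estimates
    beta = np.zeros(dim)
    lp = log_psi(beta)
    J = np.searchsorted(cuts, -lp)   # index of the current subregion
    chain = np.empty((n_iter, dim))
    for t in range(1, n_iter + 1):
        prop = beta + step * rng.standard_normal(dim)
        lp_prop = log_psi(prop)
        J_prop = np.searchsorted(cuts, -lp_prop)
        # r = exp(theta_J - theta_J') * psi(prop)/psi(beta); proposal symmetric
        log_r = (theta[J] - theta[J_prop]) + (lp_prop - lp)
        if np.log(rng.uniform()) < log_r:
            beta, lp, J = prop, lp_prop, J_prop
        gamma = t0 / max(t0, t)          # gain factor, eq. (9)
        e = np.zeros(m)
        e[J] = 1.0
        theta += gamma * (e - pi)        # stochastic approximation update
        chain[t - 1] = beta
    return chain, theta
```

As a usage example, running the sampler on a toy bivariate Gaussian target, `chain, theta = samc(lambda b: -0.5 * np.sum(b**2), 2, cuts=[1.0, 3.0, 6.0])`, lets one watch the weights `theta` adapt so that high-energy regions are visited at the desired frequency.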

Simulation study

In this part, we generate data from a simple logistic regression model and use the Monte Carlo methods to fit it:

                $\log\dfrac{\pi_i}{1-\pi_i} = \beta_0 + x_i\beta_1, \quad i = 1,\dots,n$              (10)

Based on the model, and given the values of $\beta_0$ and $\beta_1$, we generate $n = 1000$ samples of $(x_i, y_i)$:

                $x_i \sim N(1,1), \qquad \pi_i = \dfrac{\exp(\beta_0 + x_i\beta_1)}{1+\exp(\beta_0 + x_i\beta_1)}, \quad i = 1,\dots,n$          (11)

Then $y_i$ is generated from a Binomial$(1,\pi_i)$ distribution. We run the iterations 1000 times and calculate the mean and variance of $(\hat{\beta}_0, \hat{\beta}_1)$ under each method (Table 1).
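The data-generating design of eqs. (10)–(11) can be reproduced in a few lines; `simulate_logistic` is an illustrative helper, with the sample size defaulting to the $n = 1000$ used in the text.

```python
import numpy as np

def simulate_logistic(beta0, beta1, n=1000, seed=0):
    """Draw (x_i, y_i) per eqs. (10)-(11): x_i ~ N(1,1) and
    y_i ~ Binomial(1, pi_i) with logit(pi_i) = beta0 + x_i * beta1."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=1.0, scale=1.0, size=n)
    pi = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))
    y = rng.binomial(1, pi)
    return x, y
```

For instance, `x, y = simulate_logistic(0.1, 0.2)` generates one replicate of the first parameter setting in Table 1.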

Based on Table 1, we find that both the MH algorithms and SAMC perform well in estimating the parameters of logistic regression, while SAMC has the smallest variance, indicating a better convergence rate and greater robustness. Since the parameters are nearly uncorrelated in this design, all three MH algorithms perform similarly. For the SAMC algorithm, we partition the sample space based on the energy function; the energy regions spread the initial values over the full model space, and therefore SAMC attains a faster convergence rate and lower variance than the other MH algorithms. Finally, we use the MLE method, i.e., the 'glm' function in R, to estimate the parameters for comparison. SAMC even outperforms the MLE in terms of the variance of the estimators. Therefore, both the MH and SAMC algorithms yield consistent estimators in logistic regression, but even for this simple problem, SAMC enjoys better performance than the others.

$(\beta_0, \beta_1)$          (0.1,0.2)       (0.6,0.3)       (1,-3)          (2,0.4)         (-3,2)

Independent   mean            (0.10,0.17)     (0.56,0.27)     (1.22,-3.01)    (1.78,0.35)     (-3.20,1.92)
              variance        (0.09,0.07)     (0.11,0.08)     (0.15,0.23)     (0.13,0.10)     (0.25,0.22)

Dependent     mean            (0.10,0.18)     (0.59,0.25)     (1.12,-2.87)    (2.11,0.38)     (-3.13,2.05)
              variance        (0.09,0.07)     (0.09,0.08)     (0.15,0.23)     (0.13,0.10)     (0.25,0.22)

Individual    mean            (0.10,0.17)     (0.57,0.26)     (1.12,-2.88)    (2.13,0.39)     (-2.98,2.08)
              variance        (0.09,0.06)     (0.11,0.08)     (0.15,0.23)     (0.13,0.10)     (0.25,0.22)

SAMC          mean            (0.09,0.19)     (0.62,0.27)     (1.09,-2.99)    (2.11,0.43)     (-3.19,2.01)
              variance        (0.06,0.05)     (0.11,0.07)     (0.11,0.11)     (0.09,0.10)     (0.24,0.20)

MLE           mean            (0.10,0.19)     (0.59,0.30)     (1.01,-2.97)    (2.01,0.40)     (-3.01,2.01)
              variance        (0.09,0.07)     (0.21,0.16)     (0.19,0.27)     (0.24,0.22)     (0.16,0.24)

Table 1 Comparison of parameter estimates and variances for 5 pairs of $(\beta_0, \beta_1)$

Real example

In this part, we apply the Monte Carlo methods to a real data set. Our data are obtained from the Wechsler Adult Intelligence Scale (WAIS), a test designed to measure intelligence in adults and older adolescents, currently in its fourth edition (WAIS-IV). The original WAIS was published in February 1955 by David Wechsler as a revision of the Wechsler-Bellevue Intelligence Scale, which had been released in 1939. The purpose of this test is to examine the relationship between intelligence and senile dementia among older people. In our score scale, people (with and without senile dementia) were asked to complete a set of tasks individually; upon completing all tasks, each person receives a score ranging from 0 to 20. The response $y_i$ is therefore a binary variable reflecting whether a person has senile dementia, and the WAIS score is the explanatory variable. Based on these data, we build the logistic model and estimate the parameters.

Firstly, we apply the MH Independent Sampling algorithm. We set the chain length to 10,000 and plot the paths of $\beta_0$ and $\beta_1$. As shown in the lower two panels of Figure 1, the two lines do not coincide, which indicates that the Independent Sampling algorithm does not converge. The Independent Sampling algorithm assumes that the parameters are independent; however, this assumption is rarely satisfied in real applications. We calculate the covariance of $\hat{\beta}_0$ and $\hat{\beta}_1$ and find that it is far from zero; therefore, the Independent Sampling method cannot be applied here. We then use the other two MH algorithms, i.e., Dependent Sampling and Individual Sampling, as well as the SAMC algorithm. We set the chain length to 5000 and plot the cumulative mean curves for each method. As shown in Figure 2, all the methods converge as the iteration grows, while SAMC converges faster than the others: the two curves for SAMC coincide at around iteration 2000, while the other methods need more iterations. To assess the accuracy of the estimates of $\beta_0$ and $\beta_1$, we use the results from the MLE method as a benchmark and compare them with those from the Monte Carlo methods. As shown in Table 2, the results from SAMC are closest to the MLE and also have the smallest variance among all the methods. According to the results, $\beta_1$ is significantly less than 0, which means that people with lower scores have a higher risk of senile dementia.
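The convergence diagnostic behind Figures 1 and 2 — overlaying the cumulative mean started at iteration 1 with one started at a later iteration — is simple to compute for any chain of draws; `cumulative_means` is an illustrative helper, and the later start point of 1500 matches the value quoted in the figure captions.

```python
import numpy as np

def cumulative_means(chain, later_start=1500):
    """Running means from iteration 1 and from `later_start`.  If the chain
    has converged, the two curves should coincide for large t."""
    chain = np.asarray(chain, dtype=float)
    t = np.arange(1, chain.size + 1)
    full = np.cumsum(chain) / t                      # solid line in the figures
    tail = chain[later_start:]
    tail_mean = np.cumsum(tail) / np.arange(1, tail.size + 1)  # dashed line
    return full, tail_mean
```

Plotting `full` and (suitably offset) `tail_mean` for each parameter reproduces the style of diagnostic used in the figures.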

Figure 1 The path and mean plots of $\beta_0$ and $\beta_1$ for the MH Independent Sampling algorithm. The upper two plots are the trace plots of the estimators of $\beta_0$ and $\beta_1$ at each iteration. The lower two plots are the cumulative means of $\beta_0$ and $\beta_1$, respectively. The solid line shows the cumulative mean starting at iteration 1, while the dashed line starts at iteration 1500.

Figure 2 Comparison of the cumulative mean plots of $\beta_0$ and $\beta_1$ for multiple Monte Carlo algorithms. The upper two panels show the MH Dependent Sampling method; the middle two panels show the MH Individual Sampling method; the lower two panels show the SAMC method. The solid line shows the cumulative mean starting at iteration 1, while the dashed line starts at iteration 1500.

$(\beta_0, \beta_1)$    Independent     Dependent       Individual      SAMC            MLE

mean                    (2.68,-0.34)    (2.61,-0.34)    (2.62,-0.35)    (2.57,-0.34)    (2.40,-0.32)
variance                (0.19,0.18)     (1.27,0.24)     (1.23,0.10)     (1.13,0.09)     (1.19,0.11)

Table 2 Estimates and variances of $(\beta_0, \beta_1)$ for each method

Discussion

In this paper, we have presented three MH algorithms as well as SAMC to estimate the parameters of a logistic regression. According to the simulation study, when the parameters are uncorrelated, every method performs well. When the parameters are highly correlated, the independent sampling method is less efficient and may even fail to converge. The results show that SAMC can be used in both scenarios and consistently outperforms the others: compared with the other MH algorithms, SAMC needs fewer iterations to converge and yields estimators with smaller variance, even relative to the MLE method.

One interesting direction for further work is to apply Monte Carlo simulation to high-dimensional big-data problems. The challenge comes from two aspects. First, such data have complex structure, so the acceptance rate of most MCMC algorithms can be low, making them inefficient. Although SAMC has been shown to be a good method for complex data and can overcome some local-trap problems, it still needs to be extended to high-dimensional settings where the likelihood function may be intractable. Second, big data come in catastrophically large volumes, so most single-chain Monte Carlo simulations cannot be applied: each iteration must pass through all the data, which requires large memory and a long run time per iteration. For such data volumes, we should consider parallel computing strategies, i.e., generating multiple Monte Carlo chains in parallel and then integrating the chains into a single estimator (Liang and Wu, 2013). High-speed graphics processing units (GPUs) can also help accelerate many MCMC algorithms.
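The parallel strategy described above, running several chains independently and then combining their estimators, can be sketched in Python as follows. This is a toy illustration only: each "chain" here simply draws from its target directly, standing in for a full MCMC run, and the combine step is the simple average of per-chain estimates.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def run_chain(seed, n_iter=2000):
    """Stand-in for one Monte Carlo chain: each 'chain' draws directly
    from the target N(0, 1); in practice this would be a full MCMC run
    on a copy or shard of the data."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(size=n_iter)
    return draws.mean()  # the chain's estimate of the target mean

# Combine step: run the chains in parallel and average their estimators
# (a simple consensus rule; weighted combinations are also used).
with ThreadPoolExecutor() as pool:
    estimates = list(pool.map(run_chain, range(8)))
combined = float(np.mean(estimates))
print(combined)
```

Because the chains share nothing, each worker only needs its own portion of memory, which is the property that makes this strategy attractive for very large data sets.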

In conclusion, we introduced several Monte Carlo simulation methods to estimate the parameters of a generalized linear model, namely logistic regression. The same problem can also be solved by the MLE method, which may be easier to compute. However, our Monte Carlo algorithms produce a chain of estimates rather than a single value, and therefore provide more information about how the estimates approach the true parameter values, which can be useful for further analysis.

Acknowledgements

The author would like to thank the referee and the Editor for their careful reading and comments, which greatly improved the paper.

Conflict of interests

The author declares that there is no conflict of interest.

References

  1. Metropolis N, Rosenbluth AW, Rosenbluth MN, et al. Equation of state calculations by fast computing machines. The Journal of Chemical Physics. 1953;21:1087–1092.
  2. Hastings WK. Monte Carlo sampling methods using Markov chains and their applications. Biometrika. 1970;57(1):97–109.
  3. Müller P. A generic approach to posterior integration and Gibbs sampling. Technical report, Department of Statistics, Purdue University, USA; 1991.
  4. Tierney L. Markov chains for exploring posterior distributions. The Annals of Statistics. 1994;22(4):1701–1728.
  5. Chib S, Greenberg E. Markov chain Monte Carlo simulation methods in econometrics. Econometric Theory. 1996;12:409–431.
  6. Liang F, Liu C, Carroll RJ. Stochastic approximation in Monte Carlo computation. Journal of the American Statistical Association. 2007;102:305–320.
  7. Geman S, Geman D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1984;6(6):721–741.
  8. Geyer CJ. Markov chain Monte Carlo maximum likelihood. 1991.
  9. Hukushima K, Nemoto K. Exchange Monte Carlo method and application to spin glass simulations. Journal of the Physical Society of Japan. 1996;65:1604–1608.
  10. Marinari E, Parisi G. Simulated tempering: a new Monte Carlo scheme. Europhysics Letters. 1992;19:451.
  11. Geyer CJ, Thompson EA. Annealing Markov chain Monte Carlo with applications to ancestral inference. Journal of the American Statistical Association. 1995;90:909–920.
  12. Liang F, Wong WH. Evolutionary Monte Carlo for protein folding simulations. The Journal of Chemical Physics. 2001;115:3374–3380.
  13. Wong WH, Liang F. Dynamic weighting in Monte Carlo and optimization. Proceedings of the National Academy of Sciences. 1997;94:14220–14224.
  14. Berg BA, Neuhaus T. Multicanonical ensemble: a new approach to simulate first-order phase transitions. Physical Review Letters. 1992;68:9–12.
  15. Hesselbo B, Stinchcombe RB. Monte Carlo simulation and global optimization without parameters. Physical Review Letters. 1995;74:2151.
  16. Wang F, Landau DP. Efficient, multiple-range random walk algorithm to calculate the density of states. Physical Review Letters. 2001;86:2050.
  17. Mitsutake A, Sugita Y, Okamoto Y. Replica-exchange multicanonical and multicanonical replica-exchange Monte Carlo simulations of peptides. I. Formulation and benchmark test. The Journal of Chemical Physics. 2003;118:6664–6675.
Creative Commons Attribution License

©2018 Jia. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and building upon your work non-commercially.