Seeds

Seeds: Random effect logistic
regression

This example is taken from Table 3 of Crowder (1978), and concerns the proportion of seeds that germinated on each of 21 plates arranged according to a 2 by 2 factorial layout by seed and type of root extract. The data are shown below, where r _i and n _i are the number of germinated and the total number of seeds on the i th plate, i =1,...,N. These data are also analysed by, for example, Breslow: and Clayton (1993).

The model is essentially a random effects logistic, allowing for over-dispersion. If p _i is the probability of germination on the i th plate, we assume

   r _i ~ Binomial(p _i, n _i)

   logit(p _i) = a ₀ + a ₁x _1i + a ₂x _2i + a ₁₂x _1ix _2i + b _i

   b _i~ Normal(0, t )

where x _1i , x _2i are the seed type and root extract of the i th plate, and an interaction term a ₁₂x _1ix _2i is included. a ₀ , a ₁ , a ₂ , a ₁₂ , t are given independent "noninformative" priors.

Graphical model for seeds example

[seeds2]

BUGS language for seeds example


      model
      {
         for( i in 1 : N ) {
            r[i] ~ dbin(p[i],n[i])
            b[i] ~ dnorm(0.0,tau)
            logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
               alpha12 * x1[i] * x2[i] + b[i]
         }
         alpha0 ~ dnorm(0.0,1.0E-6)
         alpha1 ~ dnorm(0.0,1.0E-6)
         alpha2 ~ dnorm(0.0,1.0E-6)
         alpha12 ~ dnorm(0.0,1.0E-6)
         tau ~ dgamma(0.001,0.001)
         sigma <- 1 / sqrt(tau)
      }

Data ( click to open )

Inits for chain 1     Inits for chain 2    ( click to open )

Results

A burn in of 1000 updates followed by a further 10000 updates gave the following parameter estimates:

[seeds3]

We may compare simple logistic, maximum likelihood (from EGRET), penalized quasi-likelihood (PQL) Breslow and Clayton (1993) with the BUGS results

[seeds4]

Heirarchical centering is an interesting reformulation of random effects models. Introduce the variables

    m _i = a ₀ + a ₁x _1i + a ₂x _2i + a ₁₂x _1ix _2i b _i= m _i + b _ithe model then becomes
   r _i ~ Binomial(p _i, n _i)

   logit(p _i) = b _i

    b _i~ Normal( m _i , t )

The graphical model is shown below
[seeds5]

This formulation of the model has two advantages: the squence of random numbers generated by the Gibbs sampler has better correlation properties and the time per update is reduced because the updating for the a parameters is now conjugate.