Seeds: Random effect logistic
regression
This example is taken from Table 3 of Crowder (1978), and concerns the proportion of seeds that germinated on each of 21 plates arranged according to a 2 by 2 factorial layout by seed and type of root extract. The data are shown below, where r
i
and n
i
are the number of germinated and the total number of seeds on the
i
th plate,
i
=1,...,N. These data are also analysed by, for example, Breslow: and Clayton (1993).
The model is essentially a random effects logistic, allowing for over-dispersion. If p
i
is the probability of germination on the
i
th plate, we assume
r
i
~ Binomial(p
i
, n
i
)
logit(p
i
) =
a
0
+
a
1
x
1i
+
a
2
x
2i
+
a
12
x
1i
x
2i
+ b
i
b
i
~ Normal(0,
t
)
where x
1i
, x
2i
are the seed type and root extract of the
i
th plate, and an interaction term
a
12
x
1i
x
2i
is included.
a
0
,
a
1
,
a
2
,
a
12
,
t
are given independent "noninformative" priors.
Graphical model for seeds example
BUGS
language for seeds example
model
{
for( i in 1 : N ) {
r[i] ~ dbin(p[i],n[i])
b[i] ~ dnorm(0.0,tau)
logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
alpha12 * x1[i] * x2[i] + b[i]
}
alpha0 ~ dnorm(0.0,1.0E-6)
alpha1 ~ dnorm(0.0,1.0E-6)
alpha2 ~ dnorm(0.0,1.0E-6)
alpha12 ~ dnorm(0.0,1.0E-6)
tau ~ dgamma(0.001,0.001)
sigma <- 1 / sqrt(tau)
}
Data
( click to open )
Inits for chain 1
Inits for chain 2
( click to open )
Results
A burn in of 1000 updates followed by a further 10000 updates gave the following parameter estimates:
We may compare simple logistic, maximum likelihood (from EGRET), penalized quasi-likelihood (PQL) Breslow and Clayton (1993) with the
BUGS
results
Heirarchical centering is an interesting reformulation of random effects models. Introduce the variables
m
i
=
a
0
+
a
1
x
1i
+
a
2
x
2i
+
a
12
x
1i
x
2i
b
i
=
m
i
+ b
i
the model then becomes
r
i
~ Binomial(p
i
, n
i
)
logit(p
i
) =
b
i
b
i
~ Normal(
m
i
,
t
)
The graphical model is shown below
This formulation of the model has two advantages: the squence of random numbers generated by the Gibbs sampler has better correlation properties and the time per update is reduced because the updating for the
a
parameters is now conjugate.