While we have been focused on two-level multilevel models, it is straightforward to extend these models to higher levels of nesting. For example, we may be studying students nested in classrooms nested in schools nested in counties.
It is not always the case that groupings will be nested. For example, suppose we are studying university students. A student may be a member of several nonnested groups – for example the student could be taking a stats class, a comp sci class, and a music performance class. We might expect positive correlation in outcomes of students within each class, yet these three classes are not nested within each other. When random effects are not nested, we refer to them as crossed.
Consider a study in the semiconductor industry of variability in manufacture of silicon wafers. For each wafer, the thickness of the oxide layer is measured at three different sites. The wafers themselves are sampled from eight different production lots. In this experiment, sites are nested in wafers, and wafers are nested in lots.
data(Oxide,package="nlme")
head(Oxide,12)
## Source Lot Wafer Site Thickness
## 1 1 1 1 1 2006
## 2 1 1 1 2 1999
## 3 1 1 1 3 2007
## 4 1 1 2 1 1980
## 5 1 1 2 2 1988
## 6 1 1 2 3 1982
## 7 1 1 3 1 2000
## 8 1 1 3 2 1998
## 9 1 1 3 3 2007
## 10 1 2 1 1 1991
## 11 1 2 1 2 1990
## 12 1 2 1 3 1988
The site index repeats across wafers; wafer index repeats across lot.
Let’s consider a random effect for lot and a random effect for wafer in the model
\[Y_{ijk}=\gamma_0+\alpha_i+\beta_{ij}+\varepsilon_{ijk}\]
where \(\alpha_i \overset{iid}\sim N(0,\sigma^2_\alpha) \perp \beta_{ij} \overset{iid}\sim N(0,\sigma^2_\beta) \perp \varepsilon_{ijk} \overset{iid}\sim N(0,\sigma^2_\varepsilon)\).
Here \(i\) indexes the lot, \(j\) indexes the wafer within lot, and \(k\) indexes the site within wafer.
In this model, derive the following quantities.
Using the data ordering, put these values (and others) together to show the form of the full matrix \(\text{Cov}(Y)\).
Using the ordering in the data across the 72 oxide layer thickness measurements, we expect the covariance matrix to have the following block structure in lots and wafers.
The darker, smaller squares (higher correlations) are for measures taken on the same wafers, and the larger squares are for measures taken in the same lot. Measures from different lots are independent.
We need to be careful about specifying the model because the indices are nested. We want a random effect for wafer and a random effect for lot, but we need to be careful about how we specify it because wafer 1 in lot 1 is not the same wafer as wafer 1 in lot 2. To fit a model using the nested indices provided, we use the following code.
#specifying that wafer index is nested in lot
ox1 <- lmer(Thickness ~ 1 + (1|Lot/Wafer), data = Oxide)
summary(ox1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Thickness ~ 1 + (1 | Lot/Wafer)
## Data: Oxide
##
## REML criterion at convergence: 454
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.8746 -0.4991 0.1046 0.5510 1.7922
##
## Random effects:
## Groups Name Variance Std.Dev.
## Wafer:Lot (Intercept) 35.87 5.989
## Lot (Intercept) 129.87 11.396
## Residual 12.57 3.545
## Number of obs: 72, groups: Wafer:Lot, 24; Lot, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2000.153 4.231 472.7
Wow, the lot explains a lot of the variability in the response! There is considerable variability across wafers as well.
What if we used our regular old code?
ox2 <- lmer(Thickness ~ 1 + (1|Lot) + (1|Wafer), data = Oxide)
summary(ox2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Thickness ~ 1 + (1 | Lot) + (1 | Wafer)
## Data: Oxide
##
## REML criterion at convergence: 490.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.6115 -0.4268 0.1087 0.3976 2.2815
##
## Random effects:
## Groups Name Variance Std.Dev.
## Lot (Intercept) 139.019 11.791
## Wafer (Intercept) 1.493 1.222
## Residual 38.348 6.193
## Number of obs: 72, groups: Lot, 8; Wafer, 3
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2000.15 4.29 466.2
Well, the wafer effect went away, and the residual variance got larger. What happened?
The model assumed wafer 1 was repeated in all 8 lots, wafer 2 was repeated in all 8 lots, etc. so that there were only 3 wafers (instead of 24). This watered down the wafer effect (wrong model!) and estimated a correlation that looks more like this.
Yikes, observations from different lots should be independent, but we induced them because of the way the wafer index is coded in the data.
If you don’t like using the nesting coding, we can fix the issue with the index for wafer and use our regular coding. Below we make the index on wafer unique by appending it to the lot – so the first digit of the wafer2 index designates lot number, and the 2nd digit designates the wafer within the lot.
library(tidyverse)
Oxide <- mutate(Oxide, Wafer2 = as.numeric(paste0(Lot, Wafer)))
head(Oxide, 12)
#now we can also use the coding we're used to
ox3 <- lmer(Thickness ~ 1 + (1|Lot) + (1|Wafer2), data = Oxide)
summary(ox3)
## Source Lot Wafer Site Thickness Wafer2
## 1 1 1 1 1 2006 11
## 2 1 1 1 2 1999 11
## 3 1 1 1 3 2007 11
## 4 1 1 2 1 1980 12
## 5 1 1 2 2 1988 12
## 6 1 1 2 3 1982 12
## 7 1 1 3 1 2000 13
## 8 1 1 3 2 1998 13
## 9 1 1 3 3 2007 13
## 10 1 2 1 1 1991 21
## 11 1 2 1 2 1990 21
## 12 1 2 1 3 1988 21
## Linear mixed model fit by REML ['lmerMod']
## Formula: Thickness ~ 1 + (1 | Lot) + (1 | Wafer2)
## Data: Oxide
##
## REML criterion at convergence: 454
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.8746 -0.4991 0.1046 0.5510 1.7922
##
## Random effects:
## Groups Name Variance Std.Dev.
## Wafer2 (Intercept) 35.87 5.989
## Lot (Intercept) 129.87 11.396
## Residual 12.57 3.545
## Number of obs: 72, groups: Wafer2, 24; Lot, 8
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2000.153 4.231 472.7
If indices are nested, you need to be careful to use coding that indicates the nesting.
Caveat: if you did actually move the class across schools, then you would want to code the “usual way” without changing the indices!
If indices are nonnested, you don’t have to worry about codifying the nesting.
We consider data from the 2005 Nature study by Silk et al. on prosocial behavior in chimpanzees. Prosocial behavior is designed to benefit others or society as a whole (as opposed to close friends or family members). An experiment was set up in which a chimpanzee was seated at a table that contained two levers that could be controlled by the chimp: one on the right, and one on the left. Another chimpanzee could be seated at the other side of the table without access to the levers.
Multiple experiments were conducted per chimp, varying whether another chimp was present at the table and also which other chimp would join at the table.
The experiment also involved blocking of trials within a session, so that a random effect for block is needed. The blocks were not nested within chimps, or vice versa – so that chimp effects and block effects are crossed.
The researchers want a model that controls for potential correlation in responses due to blocking and that allows investigation of chimp-specific variability in response and in the tendency to exhibit prosocial behavior.
Variables of interest include the following.
Write down a model that
Then fit the model and describe your results!
#code to read data
library(rethinking)
data(chimpanzees)
d <-
chimpanzees %>%
select(-recipient) %>%
mutate(block_id = block)