m1 <- aov(Posttest ~ Group, data = multitask)
model.tables(m1, type = "effects")
Tables of effects
Group
Group
Control Exp1 Exp2
12.049 -0.703 -11.345
To analyse data collected from a Completely Randomised Design we could use \(t\)-tests and compare the samples two at a time. This approach is problematic for two reasons. First, the test statistic of a \(t\)-test is calculated with a standard deviation based only on the two samples it considers; we want our test statistic to use the variability in all of the samples collected. Second, when we conduct multiple tests the overall Type 1 Error rate increases. That is, when doing many tests, the chance of making at least one wrong conclusion grows with the number of tests (if you want to know more, see the box below). To avoid this, we will use the ANOVA method, which was developed specifically for comparing multiple means.
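To get a feel for the second problem, here is a small sketch (under the simplifying assumption of independent tests, each conducted at \(\alpha = 0.05\)) of how quickly the overall chance of at least one Type 1 Error grows:
alpha <- 0.05
m <- c(1, 3, 5, 10)   # number of independent tests
1 - (1 - alpha)^m     # P(at least one Type 1 Error): 0.050 0.143 0.226 0.401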
When we collect samples, we usually want to learn something about the populations from which they were drawn. To do this, we can develop a model for the observations that reflects the different sources of variation believed to be at play.
For Completely Randomised Designs, we have \(a\) treatments which implies \(a\) population means \(\mu_1, \mu_2, \mu_3, \ldots, \mu_a\). We are interested in modelling the means of the treatments and the differences between them. Ultimately we want to test whether they are equal which we’ll get to in the next section. First, we construct a simple model for each observation \(Y_{ij}\):
\[ Y_{ij} = \mu_{i} + e_{ij}, \]
where
\[ \begin{aligned} i & = 1, \dots, a \quad (a = \text{number of treatments}) \\ j & = 1, \dots, r \quad (r = \text{number of replicates}) \\ Y_{ij} & = \text{observation of the } j^{th} \text{ unit receiving treatment } i \\ \mu_i & = \text{mean of treatment } i \\ e_{ij} & = \text{random error with } e_{ij} \sim N(0, \sigma^2) \end{aligned} \]
That is, each observation is modelled as the sum of its population mean and some random variation, \(e_{ij}\). This random variation represents unexplained differences between individual observations within the same group, and we assume that these differences follow a normal distribution with mean 0 and constant variance across all treatment groups.[1]
We can change the notation slightly by arbitrarily dividing each mean into a sum of two components: the overall mean \(\mu\) (the mean of the entire data set, which is the same as the mean of the \(a\) means[2]) and the difference between the population mean and the overall mean. In symbols, this translates to:
\[ \begin{aligned} \mu_1 &= \mu + (\mu_1 - \mu) \\ \mu_2 &= \mu + (\mu_2 - \mu) \\ &\;\;\vdots \notag \\ \mu_a &= \mu + (\mu_a - \mu) \end{aligned} \]
The difference \((\mu_i - \mu)\) is the effect of treatment \(i\), denoted by \(A_i\). So each population mean is the sum of the overall mean and the part that we attribute to the particular treatment (\(A_i\)):
\[ \mu_i = \mu + A_i, \quad i = 1, 2, \dots, a, \]
where \(\sum A_i = 0\).
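We can check this constraint against the effects estimated at the start of this section; the estimates satisfy it up to rounding:
\[ 12.049 + (-0.703) + (-11.345) = 0.001 \approx 0 \]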
Replacing \(\mu_i\) in the model above leads to the common parameterisation of a single-factor ANOVA model[3]:
\[ Y_{ij} = \mu + A_{i} + e_{ij} \]
where
\[ \begin{aligned} i & = 1, \dots, a \quad (a = \text{number of treatments}) \\ j & = 1, \dots, r \quad (r = \text{number of replicates}) \\ Y_{ij} & = \text{observation of the } j^{th} \text{ unit receiving treatment } i \\ \mu & = \text{overall or general mean} \\ A_i & = \text{effect of the } i^{th} \text{ level of treatment factor A} \\ e_{ij} & = \text{random error with } e_{ij} \sim N(0, \sigma^2) \end{aligned} \]
The model can be interpreted as follows:
Each observation, \(Y_{ij}\), is the sum of the overall mean (\(\mu\)), the effect of the treatment it belongs to (\(A_i\)), and some random error (\(e_{ij}\)). We use two subscripts on the \(Y\): one to identify the group (treatment) and the other to identify the subject (experimental unit) within the group:
\[ \begin{aligned} Y_{1j} &= \mu + A_1 + e_{1j} \\ Y_{2j} &= \mu + A_2 + e_{2j} \\ Y_{3j} &= \mu + A_3 + e_{3j} \\ &\;\;\vdots \notag \\ Y_{aj} &= \mu + A_a + e_{aj} \\ \end{aligned} \]
Okay, so we have a model which we now need to fit to our data, estimating the model parameters in the process. The parameters we want to estimate are \(\mu\) (the overall mean), the treatment effects (\(A_i\)) and \(\sigma^2\) (the error variance). As in regression, we find least squares estimates for the parameters, which minimise the residual or error sum of squares[5]:
\[ \text{SSE} = \sum_i\sum_j e_{ij}^2 = \sum_i\sum_j (Y_{ij} - \hat{Y}_{ij})^2 = \sum_i\sum_j (Y_{ij} - \mu - A_i)^2\]
It turns out that when we solve for the estimates that minimise the SSE[6], we obtain the following estimators:
\[ \begin{aligned} \hat{\mu} &= \bar{Y}_{..} \\ \hat{\mu}_i &= \bar{Y}_{i.} \end{aligned} \]
and
\[\hat{A}_i = \bar{Y}_{i.} - \bar{Y}_{..}\]
From linear model theory we know that the above are unbiased estimates[7] of \(\mu\) and the \(A_i\)'s. What does this tell you? It tells you that we can use the sample means as estimates for the true means. The estimated mean response for treatment \(i\) is the observed sample mean for treatment \(i\), and the observed overall mean estimates the grand mean.
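As a quick check, we can compute these estimates directly (a minimal sketch, assuming the multitask data frame used to fit m1 above, with columns Posttest and Group):
grand_mean  <- mean(multitask$Posttest)                           # mu-hat: grand mean
group_means <- tapply(multitask$Posttest, multitask$Group, mean)  # mu_i-hat: treatment means
group_means - grand_mean                                          # A_i-hat: treatment effects
The last line should reproduce the effects printed by model.tables(m1, type = "effects") at the start of this section.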
For the last parameter, the error variance, an unbiased estimator is found by dividing the minimised SSE (i.e. calculated with the least squares estimates) by its degrees of freedom:
\[ s^2 = \frac{1}{N-a}\sum_{ij}(Y_{ij} - \bar{Y}_{i.})^2 \]
This quantity is called the Mean Square for Error (MSE) or residual mean square. It has \((N-a)\) degrees of freedom since we have \(N\) observations and have estimated \(a\) means. If you look at the formula you'll notice that it is a pooled average of the observed within-group variability across the treatment groups.
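In R we can extract the MSE from the fitted model (a small sketch using the m1 object fitted above):
SSE <- sum(residuals(m1)^2)    # minimised error sum of squares; equivalently deviance(m1)
MSE <- SSE / df.residual(m1)   # divide by the error degrees of freedom, N - a
MSE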
In the previous section we saw how the parameters of the ANOVA model are estimated. We also need a measure of uncertainty for each of these estimates (in the form of a standard error, variance, or confidence interval). Let’s start with the variance of a treatment mean estimate:
Variance, Standard Deviation and Standard Error: what’s all this again? The variance (Var) is a good way of measuring variability. The Standard Deviation (SD) is the square root of the variance of a sample or population. The Standard Error (SE) is the SD of an estimate (read that again).
\[Var(\hat{\mu}_i) = \frac{\sigma^2}{n_i} \]
Remember that the sampling distribution of the mean is \(N(\mu,\frac{\sigma^2}{n})\) and here we assumed that the groups have equal population variances.
If we assume that two treatment means are independent, the variance of the difference between two means is:
\[ Var(\hat{\mu}_i - \hat{\mu}_j) = Var(\hat{\mu}_i) + Var(\hat{\mu}_j) = \frac{\sigma^2}{n_i} + \frac{\sigma^2}{n_j} \]
To estimate these variances we substitute the MSE for \(\sigma^2\) as it is an unbiased estimate of the error variance (the variability within each group). The standard errors of the estimates are found by taking the square root of the variances. The standard error is the standard deviation of an estimated quantity, and is a measure of its precision (uncertainty); how much it would vary in repeated sampling.
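Continuing the sketch above, with the MSE from the fitted model and equal replication of \(r = 40\) per group (the replic. 40 line in the output below):
r <- 40              # replicates per treatment group
sqrt(MSE / r)        # SE of an estimated treatment mean: sqrt(sigma^2 / n_i)
sqrt(2 * MSE / r)    # SE of a difference between two treatment means
Up to rounding, these should correspond to the 2.237 (effects) and 3.163 (differences of means) reported by model.tables() below.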
We can assume normal distributions for our estimates because we have assumed a normal linear model and because they are means (or differences between means). This means that confidence intervals for the population treatment means are of the form:
\[ \text{estimate} \pm t^{\alpha/2}_v \times \text{SE}(\text{estimate})\]
where \(t^{\alpha/2}_v\) is the upper \(\alpha/2\) critical value (i.e. the \(1-\alpha/2\) quantile) of the Student's \(t\) distribution with \(v\) degrees of freedom. The degrees of freedom are the error degrees of freedom, \(N-a\) for a CRD.
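As a worked illustration (not new output), a 95% confidence interval for the Control mean can be built from the model.tables results shown below:
est <- 75.63                         # Control sample mean (see output below)
se  <- sqrt(MSE / r)                 # SE of a treatment mean, about 2.237
dfe <- df.residual(m1)               # error df: N - a = 120 - 3 = 117
est + c(-1, 1) * qt(0.975, dfe) * se # approximately 75.63 +/- 4.43
Note that qt(0.975, dfe) is used because a 95% interval puts \(\alpha/2 = 0.025\) in each tail.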
What are the standard errors associated with the parameter estimates in the social media example? We can easily extract these by specifying an extra argument to the model.tables function.
Standard errors of the effects:
model.tables(m1, type = "effects", se = TRUE)
Tables of effects
Group
Group
Control Exp1 Exp2
12.049 -0.703 -11.345
Standard errors of effects
Group
2.237
replic. 40
and for the treatment means:
model.tables(m1, type = "means", se = TRUE)
Tables of means
Grand mean
63.58527
Group
Group
Control Exp1 Exp2
75.63 62.88 52.24
Standard errors for differences of means
Group
3.163
replic. 40
So, now we have parameter estimates and their standard errors. Equipped with these, we are closer to answering the original question: Does social media multitasking impact the academic performance of students? Based on the model we fitted and the parameters we estimated, how do we test this? The answer is with an ANOVA table.
This chapter introduces the Completely Randomised Design (CRD) model and explains why ANOVA is preferred over multiple \(t\)-tests, which inflate the Type 1 Error rate.
In ANOVA, each observation is modeled as:
\[ Y_{ij} = \mu + A_{i} + e_{ij} \]
where \(\mu\) is the overall mean, \(A_{i}\) is the treatment effect (the difference between treatment mean \(\mu_i\) and the overall mean), and \(e_{ij}\) is random error, which is normally distributed with mean 0 and variance \(\sigma^2\).
Parameters are estimated using least squares, with the mean square for error (MSE) providing an estimate of the error variance \(\sigma^2\).
Applying ANOVA to the social media multitasking study, we estimated treatment means and effects together with their standard errors, setting the stage for hypothesis testing using an ANOVA table.
[1] As opposed to non-constant variance across all treatment groups: \(e_{ij} \sim N(0, \sigma^2_{i})\), where the \(\sigma_i^2\)'s are different.
[2] \(\mu = \frac{\sum\mu_i}{a}\)
[3] Often called Model I.
[4] In statistics, a sum of squares is a measure of variability and refers to squared deviations from a mean or expected value; for example, the residual sum of squares is the sum of squared deviations of the observations from the fitted values.
[5] error = observed - fitted.
[6] Another name for this is the residual sum of squares (RSS).
[7] Unbiased means that the expected value of these statistics equals the parameter being estimated. In other words, the statistic equals the true parameter on average.
[8] Remember: \(\mu_i = \mu + A_i\)