Open Mind Tree: Statistics Chapter 14

Chapter 14: Repeated Measures ANOVA

All the ANOVA stuff we have done so far has had different subjects in the various cells of the experimental design

That kind of experiment is called a between-subjects design

Sometimes, however, we run the same subjects in some or all cells of the design

Such a within-subjects (or repeated measures) design has two advantages:

you get more data per subject
you actually get more power because you can factor between-subject differences out of the error term

Memories of the ANOVA logic

Recall that the purpose of doing an ANOVA is to see if some difference between treatment means is sufficiently large as to be unlikely to occur by chance (less than 5% chance)
When we test that, we get an estimate of the difference we are interested in and divide it by an estimate of variation due to chance. Specifically:

F = MS_treat / MS_error

Notice that this F value will increase if the difference between the means is large OR if the measurement of error is small
As you will see, repeated-measures designs allow us to reduce the error term, thereby resulting in larger Fs (more power)

An example: Within versus Between

This experiment will show the importance of the "articulatory loop" for retaining information in short-term memory
Between-Subjects Version

*Subject*	*bla-bla* X	*bla-bla* X²	*Subject*	*no bla-bla* X	*no bla-bla* X²
1	5	25	1	7	49
2	4	16	2	6	36
3	6	36	3	5	25
4	4	16	4	6	36
5	7	49	5	6	36

Within-Subject Version

*Subject*	*bla-bla* X	*bla-bla* X²	*no bla-bla* X	*no bla-bla* X²
1	5	25	7	49
2	3	9	4	16
3	7	49	7	49
4	2	4	4	16
5	3	9	5	25

Computations for Between-Subject
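As a rough sketch of these computations (not necessarily the exact steps shown in class), the Python below works through the sums of squares for the between-subjects data above. The variable names are illustrative only.

```python
# Between-subjects version: two independent groups of five subjects each.
bla_bla = [5, 4, 6, 4, 7]
no_bla_bla = [7, 6, 5, 6, 6]


def mean(xs):
    return sum(xs) / len(xs)


groups = [bla_bla, no_bla_bla]
all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# SS_treat: squared deviation of each group mean from the grand mean, times n.
ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# SS_error: squared deviations of scores around their own group mean.
ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
# SS_total: squared deviations of all scores around the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

df_treat = len(groups) - 1                 # k - 1
df_error = len(all_scores) - len(groups)   # N - k

ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_ratio = ms_treat / ms_error

print(round(ss_treat, 2), round(ss_error, 2), round(ss_total, 2))
print(round(ms_error, 2), round(f_ratio, 2))
```

Note that subject differences have nowhere to go here except into SS_error, because each subject contributes only one score.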

How is the within-subjects version different from the between-subjects version?

An assumption of the between-subjects ANOVA is that the observations in one level of the treatment are independent of those in the other level(s)
Hopefully you will notice that this assumption does not hold in our within-subjects version of the experiment
The use of the same subject in more than one level of the treatment almost always builds in a dependency because subjects who do well in one level tend to also do well in the other(s)
Can we remove this dependency? In fact we can, and when we do, there is a bonus! (the kind of thing that makes statistics geeks real happy :) )

Getting Rid of the Variability Due to Subjects

The "dependency in observations" is due to some subjects doing better than others
What we are going to do to deal with this is to literally remove the variation due to subjects from the error term
For demonstration purposes only, you can think of this as subtracting each subject's mean from all the scores they contribute
Using the data from our class:

*Subject*	*bla-bla* X	*bla-bla* X′	*no bla-bla* X	*no bla-bla* X′
1	5	-1	7	+1
2	3	-.5	4	+.5
3	7	0	7	0
4	2	-1	4	+1
5	3	-1	5	+1

Where X′ = X - (the mean of that subject's two scores)
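For anyone who wants to check the deviation scores, here is a minimal sketch in Python. It assumes each subject's mean is simply the average of their two scores; the variable names are made up for illustration.

```python
# Within-subject version: the same five subjects in both conditions.
bla_bla = [5, 3, 7, 2, 3]
no_bla_bla = [7, 4, 7, 4, 5]


def mean(xs):
    return sum(xs) / len(xs)


# Each subject's mean across the two conditions.
subject_means = [(a + b) / 2 for a, b in zip(bla_bla, no_bla_bla)]

# Deviation scores: each score minus that subject's own mean.
bla_dev = [a - m for a, m in zip(bla_bla, subject_means)]
no_bla_dev = [b - m for b, m in zip(no_bla_bla, subject_means)]

print(bla_dev)     # [-1.0, -0.5, 0.0, -1.0, -1.0]
print(no_bla_dev)  # [1.0, 0.5, 0.0, 1.0, 1.0]

# Variation left around each condition mean once subject differences are gone.
ss_error_from_dev = sum(
    (x - mean(col)) ** 2 for col in (bla_dev, no_bla_dev) for x in col
)
print(round(ss_error_from_dev, 2))
```

What is left over after subtracting the subject means is the variation that goes into the within-subjects error term.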

Within-Subjects Computations

Another way of doing what is essentially the same thing is to remove the sum of squares due to subjects from the error term.
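A sketch of that second approach, using the within-subject data from the class example: compute SS_total, SS_treat and SS_subjects, and get SS_error by subtraction. This is only one way to organize the arithmetic, and the names are illustrative.

```python
# One row per subject: (bla-bla score, no bla-bla score).
scores = [(5, 7), (3, 4), (7, 7), (2, 4), (3, 5)]

n_subjects = len(scores)
k = len(scores[0])  # number of treatment levels
all_scores = [x for row in scores for x in row]
grand_mean = sum(all_scores) / len(all_scores)

treat_means = [sum(row[j] for row in scores) / n_subjects for j in range(k)]
subject_means = [sum(row) / k for row in scores]

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_treat = n_subjects * sum((m - grand_mean) ** 2 for m in treat_means)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
# The error term is whatever is left once subjects and treatment are removed.
ss_error = ss_total - ss_treat - ss_subjects

df_subjects = n_subjects - 1
df_treat = k - 1
df_error = df_subjects * df_treat

ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_ratio = ms_treat / ms_error

print(round(ss_subjects, 2), round(ss_treat, 2), round(ss_error, 2), round(ss_total, 2))
print(round(ms_error, 2), round(f_ratio, 2))
```

Most of the total variation in these data is due to subjects, which is why pulling it out of the error term makes such a difference to F.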

Source Tables

Between-Subjects

Source	SS	df	MS	F
Treatment	1.60	1	1.60	1.45
Error	8.80	8	1.10
Total	10.40	9

Within-Subjects

Source	SS	df	MS	F
Subject		4
Treatment		1
Error		4
Total		9

The Advantage of Within-Subject Designs

Remember, F values are increased if the difference of interest is larger OR if the measure of variance (MS_error) gets smaller
Removing the sum of squares due to subjects not only deals with the dependency between observations across levels of the treatment variable, it also OFTEN reduces MS_error, thereby resulting in increased power (larger F values)
This only occurs, though, if the reduction in SS_error more than compensates for the loss in df_error, so it is not always true
Note that you cannot remove the variance (sum of squares) due to subjects when using a between-subjects design because you only have one observation per subject, so the variance due to subjects must remain as part of the error term
Moral: Usually, it is better to use within-subject (repeated measures) designs: not only do they let you use fewer subjects, but they are also usually more powerful, statistically speaking

Assumption of Compound Symmetry

Remember when we did between-subject ANOVAs, one of the assumptions was that the variances in our various treatment groups were homogeneous (i.e., roughly equivalent)
A similar but slightly more complex assumption underlies repeated measures designs
Specifically, we need to satisfy the "compound symmetry" assumption, which is that in addition to the variances being equal, the covariances between pairs of variables are also equal
For this to make sense, I think we may have to do a B07 time travel to re-introduce the notion of covariance ...
Imagine any two variables such as ...

Subject	Height (X)	Weight (Y)
1	69	108
2	61	130
3	68	135
4	66	135
5	66	120
6	63	115
7	72	150
8	62	105
9	62	115
10	67	145
11	66	132
12	63	120
Mean	65.42	125.83
	Sum(X) = 785	Sum(Y) = 1510
	Sum (X²) = 51473	Sum(Y²) = 192238

Sum (XY) = 99064

The covariance of these variables is computed as:

cov_XY = Σ(X - X̄)(Y - Ȳ) / (N - 1)

But what does it mean?

The covariance formula should look familiar to you. If all the Ys were exchanged for Xs, the covariance formula would be the variance formula
Note what this formula is doing, however: it is capturing the degree to which pairs of points systematically vary around their respective means
If paired X and Y values tend to both be above or below their means at the same time, this will lead to a high positive covariance
However, if the paired X and Y values tend to be on opposite sides of their respective means, this will lead to a high negative covariance
If there are no systematic tendencies of the sort mentioned above, the covariance will tend towards zero

The Computational Formula for Cov

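Given its similarity to the variance formula, it shouldn't surprise you that there is also a computationally more workable version of the covariance formula:

cov_XY = (ΣXY - (ΣX)(ΣY) / N) / (N - 1)

For our height versus weight example then:

cov_XY = (99064 - (785)(1510) / 12) / (12 - 1) = (99064 - 98779.17) / 11 = 25.89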
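A quick way to check the height and weight covariance is to compute it both ways. The sketch below uses plain Python lists; the function-free layout and variable names are just for illustration.

```python
heights = [69, 61, 68, 66, 66, 63, 72, 62, 62, 67, 66, 63]
weights = [108, 130, 135, 135, 120, 115, 150, 105, 115, 145, 132, 120]
n = len(heights)

sum_x = sum(heights)
sum_y = sum(weights)
sum_xy = sum(x * y for x, y in zip(heights, weights))

# Computational formula: uses only the sums.
cov_computational = (sum_xy - sum_x * sum_y / n) / (n - 1)

# Definitional formula: cross-products of deviations from the means.
mean_x = sum_x / n
mean_y = sum_y / n
cov_definitional = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(heights, weights)
) / (n - 1)

print(sum_x, sum_y, sum_xy)
print(round(cov_computational, 2), round(cov_definitional, 2))
```

The positive value says that taller people in this sample tend to be heavier, which is what you would expect.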

Back to Compound Symmetry

OK, now let's assume we ran a repeated measures study in which we were looking at practice effects on some task over 3 days

	Day 1	Day 2	Day 3
Sub 1	700	650	620
Sub 2	520	450	430
Sub 3	600	540	500
Sub 4	650	630	620
Sub 5	750	700	690
Variance	7930	9830	10970

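With X = Day 1 and Y = Day 2:

ΣX = 700 + 520 + 600 + 650 + 750 = 3220

ΣY = 650 + 450 + 540 + 630 + 700 = 2970

ΣXY = (700 * 650) + ... + (750 * 700) = 1947500

cov(Day 1, Day 2) = (1947500 - (3220)(2970) / 5) / (5 - 1) = 34820 / 4 = 8705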

The Covariance (Variance/Covariance) Matrix

These variances and covariances are often presented in a matrix such as the following:
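As a hedged sketch, the full matrix for the three-day practice data can be computed like this. The `cov` helper is hypothetical (written here for illustration); the covariance of a day with itself is just that day's variance, so the variances land on the diagonal.

```python
days = {
    "Day 1": [700, 520, 600, 650, 750],
    "Day 2": [650, 450, 540, 630, 700],
    "Day 3": [620, 430, 500, 620, 690],
}


def cov(xs, ys):
    """Sample covariance of two equal-length lists (N - 1 in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


labels = list(days)
matrix = [[cov(days[a], days[b]) for b in labels] for a in labels]

# Print the matrix with row and column labels.
print("\t" + "\t".join(labels))
for label, row in zip(labels, matrix):
    print(label + "\t" + "\t".join(f"{value:.0f}" for value in row))
```

Reading the output, the diagonal holds the three variances and the off-diagonal entries hold the covariances between each pair of days.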

So, the assumption of compound symmetry is simply that the variances must all be approximately equal and the covariances must all be approximately equal
The variances need not (and often do not) equal the covariances though

Complicating it all

So far in this chapter, we have been dealing with only one variable that has been manipulated in a within-subject manner
However, as we saw in Chapter 13, studies usually manipulate more than one variable, which raises several possibilities:
2 variables

2 between-subject variables (Chapter 13)

1 within - 1 between

2 within

3 variables

3 between (Chapter 13)

1 within - 2 between

2 within - 1 between

3 within

Computationally, we will only focus on the 2 new "2 variable" situations
However, as was the case with 3 between-subject variables, I will expect you to be able to interpret 3-variable results; we will spend time doing this as well

One Between - One Within

Imagine the following study (raw data are presented in the text, p. 459)
Similar to Siegel's morphine tolerance study, King (1986) was interested in conditioned tolerance to another drug: midazolam

initially midazolam decreases motor activity

however, tolerance develops quickly

3 groups: 2 got 2 injections of midazolam prior to test; the other (the control group) got saline injections

at test, all groups got midazolam, but one of the experimental groups received it in the same context as they had before (the same group) whereas the other received it in a different context (the different group)

motor activity measured in 6 five-minute intervals producing the following data

The Data, Steve Style

	1	2	3	4	5	6	ΣX²	Mean
	150	44	71	59	132	74	55858	88
	335	270	156	160	118	230	301885	212
	149	52	91	115	43	154	71976	101
Control	159	31	127	212	71	224	142532	137
	159	0	35	75	71	34	38328	62
	292	125	184	246	225	170	274786	207
	297	187	66	96	209	74	185907	155
	170	37	42	66	114	81	55946	85
	214	93	97	129	123	130	*1127218*	131

	346	175	177	192	239	140	295255	212
	426	329	236	76	102	232	415417	234
	359	238	183	123	183	30	268532	186
Same	272	60	82	85	101	98	111338	116
	200	271	263	216	241	227	338876	236
	366	291	263	144	220	180	389342	244
	371	364	270	308	219	267	557151	300
	497	402	294	216	284	255	687386	325
	355	266	221	170	199	179	*3063297*	232

	282	186	225	134	189	169	246983	198
	317	31	85	120	131	205	182261	148
	362	104	144	114	115	127	204946	161
Differ	338	132	91	77	108	169	186103	153
	263	94	141	142	120	195	170475	159
	138	38	16	95	39	55	34315	64
	329	62	62	6	93	67	129103	103
	292	139	104	184	193	122	201390	172
	290	98	109	109	124	139	*1355576*	145

	286	153	142	136	148	149	Grand	169

The Dreaded Computations

Just like when we had two between-subject variables, there are three effects of interest in the current experiment:

The main effect of Group

The main effect of Interval

The Group x Interval interaction

However, recall that we can (and do) use a different error term when testing within-subject effects than when testing between-subject effects
SS_total (by the way) = 1432293
So, the first thing we must do is to decide which effects are purely between-subjects, and which have a within-subject component
For this study, Group was manipulated between-subjects, but both Interval and the Group x Interval interaction have a within-subject component (i.e., Interval)
OK, now we separately deal with our between and within-subject effects

Between-Subject Effects

We treat between-subject effects like we always have. We calculate SS_treat as the sum of squared deviations of the treatment means from the grand mean, times the relevant n, and we calculate SS_error from the variability of subjects within each group

df_group = k-1 = 3-1 = 2

df_w/grp = k(n-1) = 3(7) = 21
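Here is a small sketch of the group computation, assuming (as the notes appear to) that the rounded means from the data table are used. Working from the unrounded raw data would give a somewhat different value. Names are illustrative.

```python
# Rounded group means and grand mean, taken from the data table.
group_means = {"Control": 131, "Same": 232, "Differ": 145}
grand_mean = 169

n_subjects_per_group = 8
n_intervals = 6
# Each group mean is based on 8 subjects x 6 intervals = 48 observations.
obs_per_group = n_subjects_per_group * n_intervals

ss_group = obs_per_group * sum(
    (m - grand_mean) ** 2 for m in group_means.values()
)

k = len(group_means)
df_group = k - 1
df_within_group = k * (n_subjects_per_group - 1)

ms_group = ss_group / df_group

print(ss_group, df_group, df_within_group, ms_group)
```

The "relevant n" here is 48 rather than 8 because every subject contributes six scores to their group's mean.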

Within-Subject Effects

OK, for starters, the sums of squares for the Interval and interaction effects are calculated like we did in the 2 between-subject case

SS_{grp * int} = SS_cells - SS_int - SS_grp

= 766368 - 399744 - 287472

= 79152

df_int = k-1 = 6-1 = 5

df_{grp * int} = df_grp * df_int = 2 * 5 = 10

The Within-Subject Error Term

Remember that when we are dealing with within-subject effects, we use a different error term (one that does not include the variability due to subject-by-subject variation)
Given the computations we have done so far, we can get the rest by subtraction:

*Source*	SS	df	MS	F
Between	672198¹
Group	287472	2	143736	7.85
Ss / Group	384726	21	18320

Within	760095²
Interval	399744	5	79949	29.85
Grp X Int	79152	10	7915	2.96
Ss / Grp * Int	281199³	105⁴	2678
Total	1432293	143

¹obtained by adding SS_group and SS_ss/group

²obtained by subtracting SS_between from SS_total
³obtained by subtracting SS_interval and SS_{grp * int} from SS_within
⁴obtained by subtracting df_group, df_ss/group, df_interval and df_{grp * int} from df_total

Critical F's:
F(2,21) = 3.49 F(5,105) = 2.37 F(10,105) = 1.99
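The "by subtraction" bookkeeping in the footnotes can be sketched as follows. The starting values are the ones already given in the source table; everything else is derived from them, and the critical values are the ones listed above.

```python
# Values computed earlier in the notes.
ss_total, df_total = 1432293, 143
ss_group, df_group = 287472, 2
ss_ss_group, df_ss_group = 384726, 21        # subjects within groups
ss_interval, df_interval = 399744, 5
ss_grp_int, df_grp_int = 79152, 10

# Footnotes 1 to 4: everything else follows by addition and subtraction.
ss_between = ss_group + ss_ss_group
ss_within = ss_total - ss_between
ss_error_within = ss_within - ss_interval - ss_grp_int
df_error_within = df_total - df_group - df_ss_group - df_interval - df_grp_int

ms_ss_group = ss_ss_group / df_ss_group
ms_error_within = ss_error_within / df_error_within

# Group is tested against the between-subject error term,
# Interval and Group x Interval against the within-subject one.
f_group = (ss_group / df_group) / ms_ss_group
f_interval = (ss_interval / df_interval) / ms_error_within
f_grp_int = (ss_grp_int / df_grp_int) / ms_error_within

critical = {"group": 3.49, "interval": 2.37, "grp_int": 1.99}
obtained = {"group": f_group, "interval": f_interval, "grp_int": f_grp_int}
for effect, f_value in obtained.items():
    print(effect, round(f_value, 2), f_value > critical[effect])
```

All three obtained F values exceed their critical values, which is what the conclusions below are based on.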

Conclusions from the Anova

Main Effect of Group
We can reject the null hypothesis that there was no effect of group. The F-obtained for the main effect of group was greater than the critical F, suggesting that there are differences among the three group means. From looking at the means it appears that this is mostly due to the mean for the "Same" group being much higher than the other two means.

Main Effect of Interval
We can also reject the null hypothesis that there was no effect of interval. The F-obtained for the main effect of interval was greater than the critical F, suggesting that there are differences among the six interval means. From the means, it appears as though activity was very high in the first interval, then dropped off and stayed relatively constant.

1	2	3	4	5	6
286	153	142	136	148	149

Interaction of Group * Interval
Finally, we can also reject the null hypothesis that the effect of interval was the same for the three groups. The F-obtained for the interaction was greater than the critical F suggesting that the effect of interval is different for the three groups. From the means, it appears as though the "Same" group stayed active longer (across more of the early intervals) than the other groups.

	1	2	3	4	5	6
*Control*	214	93	97	129	123	130
*Same*	355	266	221	170	199	179
*Differ*	290	98	109	109	124	139

*** Chapter 13 Flashback***

	*2 mins*	*5 mins*	*10 mins*
phobic	mean = 7	mean = 8	mean = 9	8
control	mean = 5	mean = 5	mean = 5	5
	6	6.5	7	6.5

Source	df	SS	MS	F
Time	2	8	4	13.79
Group	1	108	108	372.41

T x G	2	8	4	13.79

Within	42	12	0.29
Total	47	136

*** Chapter 13 Flashback***

Simple Effects for the effect of time at each level of group.

*Source*	df	SS	MS	F
T for Phob	2	16	8	27.59
T for Cont	2	0	0	0

Within	42	12	0.29
Total	47	136

So, we could describe the interaction by saying that fear increased over time for phobics, but fear did not change at all over time for the controls

Simple Effects

As was the case when we had two between subject variables, we will often want to do simple-effects analyses to gain a better understanding of the interaction
Recall that there are two ways we could approach these analyses, we could ask

At which intervals was there a significant effect of group (i.e., a difference between the groups)?, or

For which groups was there a significant effect of interval?

Here it makes sense to look at the interaction and consider the experimental predictions to determine which of these approaches is likely to yield the information you want
Since the predictions are focused primarily on potential differences between groups (or lack of differences), the first approach is the one we would want to take in this case
Nonetheless, we will briefly consider both situations

Simple Effects for Within-Subject Variables

We decided that, in our situation, we were not interested in looking at the effect of interval separately for each group
But, if we had been, then we would have been examining the effect of a within-subject variable (interval)
For reasons that are not important, whenever you are doing simple effects that are focused on the effect of a within-subject variable, you cannot use some general error term (like, for example, SS_{Ss/Grp * Int})
Instead, what you do is a separate one-way, repeated measures analysis of variance for each simple effect
So, for example, if you were interested in the effect of interval for the control group, you would run a complete repeated measures ANOVA examining the interval variable but using only the data from the control group
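To make that concrete, here is a minimal sketch of a one-way repeated measures ANOVA on the control group's raw scores only. It works from the unrounded data, so its values are not meant to line up with the rounded-means computations elsewhere in these notes. Variable names are illustrative.

```python
# Control group only: one row per subject, one column per interval.
control = [
    [150, 44, 71, 59, 132, 74],
    [335, 270, 156, 160, 118, 230],
    [149, 52, 91, 115, 43, 154],
    [159, 31, 127, 212, 71, 224],
    [159, 0, 35, 75, 71, 34],
    [292, 125, 184, 246, 225, 170],
    [297, 187, 66, 96, 209, 74],
    [170, 37, 42, 66, 114, 81],
]
n_subjects = len(control)
n_intervals = len(control[0])

grand_mean = sum(sum(row) for row in control) / (n_subjects * n_intervals)
subject_means = [sum(row) / n_intervals for row in control]
interval_means = [
    sum(row[j] for row in control) / n_subjects for j in range(n_intervals)
]

ss_total = sum((x - grand_mean) ** 2 for row in control for x in row)
ss_subjects = n_intervals * sum((m - grand_mean) ** 2 for m in subject_means)
ss_interval = n_subjects * sum((m - grand_mean) ** 2 for m in interval_means)
# Error: what is left of each score after removing its subject and interval effects.
ss_error = sum(
    (control[i][j] - subject_means[i] - interval_means[j] + grand_mean) ** 2
    for i in range(n_subjects)
    for j in range(n_intervals)
)

df_interval = n_intervals - 1
df_error = (n_subjects - 1) * (n_intervals - 1)
f_interval = (ss_interval / df_interval) / (ss_error / df_error)

print(round(ss_subjects, 2), round(ss_interval, 2), round(ss_error, 2))
print(df_interval, df_error, round(f_interval, 2))
```

The error term here comes only from the control subjects, which is the point: each within-subject simple effect gets its own error term rather than a pooled one.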

Simple Effects for Between-Subject Variables

Step 1: Computing sums of squares for the effect of group at each interval

Step 2: Mean Squares for the group effects at each interval

Since there are three groups at each interval, there are 2 degrees of freedom for each contrast

MS = SS/df, so:

MS_{Grp at Int1} = 79688 / 2 = 39844.00

MS_{Grp at Int2} = 155152 / 2 = 77576.00

MS_{Grp at Int3} = 74840 / 2 = 37420.00

MS_{Grp at Int4} = 15472 / 2 = 7736.00

MS_{Grp at Int5} = 30416 / 2 = 15208.00

MS_{Grp at Int6} = 10888 / 2 = 5444.00

Step 3: The error term

OK, here is where we differ from the Chapter 13 way of doing things
The appropriate error term SS is the SS_Ss/Cell
We could calculate that by hand but it would take a lot of work
In the "trust me" category, I give you the following:
SS_Ss/Cell = SS_Ss/Group + SS_{Ss/Grp X Int}, and
df_Ss/Cell = df_Ss/Group + df_{Ss/Grp X Int}
So, for our example:

SS_Ss/Cell = SS_Ss/Group + SS_{Ss/Grp X Int}
= 384726 + 281199 = 665925
df_Ss/Cell = df_Ss/Group + df_{Ss/Grp X Int}
= 21 + 105 = 126
MS_Ss/Cell = SS_Ss/Cell / df_Ss/Cell
= 665925 / 126 = 5285.12

Step 4: Source table depicting results

Source	df	SS	MS	F
Grp at Int1	2	79688	39844	7.54
Grp at Int2	2	155152	77576	14.68
Grp at Int3	2	74840	37420	7.08
Grp at Int4	2	15472	7736	1.46
Grp at Int5	2	30416	15208	2.88
Grp at Int6	2	10888	5444	1.03

Ss / Cell	126	665925	5285.12
Total	143	1432293

F_crit(2,126) = 3.07
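Steps 1 through 4 can be sketched end to end as below. The sketch assumes the sums of squares in Step 1 are built from the rounded cell and interval means in the tables above (n = 8 subjects per cell), and it takes the two error sums of squares straight from the main source table. Names are illustrative.

```python
# Rounded cell means (group x interval) and interval means from the tables above.
cell_means = {
    "Control": [214, 93, 97, 129, 123, 130],
    "Same": [355, 266, 221, 170, 199, 179],
    "Differ": [290, 98, 109, 109, 124, 139],
}
interval_means = [286, 153, 142, 136, 148, 149]
n_per_cell = 8

# Step 1: SS for the effect of group at each interval.
ss_grp_at_int = [
    n_per_cell * sum((means[j] - interval_means[j]) ** 2 for means in cell_means.values())
    for j in range(len(interval_means))
]

# Step 2: mean squares (3 groups, so 2 df per simple effect).
df_grp = len(cell_means) - 1
ms_grp_at_int = [ss / df_grp for ss in ss_grp_at_int]

# Step 3: pooled error term, Ss/Cell = Ss/Group + Ss/Grp X Int.
ss_cell = 384726 + 281199
df_cell = 21 + 105
ms_cell = ss_cell / df_cell

# Step 4: F for each simple effect, compared with the critical F(2, 126).
f_crit = 3.07
f_values = [ms / ms_cell for ms in ms_grp_at_int]
for j, (ss, f_value) in enumerate(zip(ss_grp_at_int, f_values), start=1):
    print(f"Grp at Int{j}", ss, round(f_value, 2), f_value > f_crit)
```

On these numbers the groups differ reliably at the first three intervals but not at the last three, which fits the description of the "Same" group staying active longer.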
