Thursday, January 13, 2011

Much Better Error Bars for Within-Subjects Studies

If you're a scientist who uses within-subjects designs, this will be a revelation. Everyone else can skip this one. A problem came up in our last set of reviewer comments: if you have a within-subjects factorial design, standard error bars or 95% confidence intervals on the bars representing your means do not portray the results of the repeated-measures ANOVA. Basically they're way too big, because they include the between-subjects variance and so don't incorporate the benefit of comparing people to themselves. So the basic trick for comparing two means by eye to determine significance, as described here

http://scienceblogs.com/cognitivedaily/2008/07/most_researchers_dont_understa_1.php

(in a nutshell, if the bars represent standard error, the error intervals have to be separated by about 1/2 interval before the difference between the means is significant at alpha = .05) doesn't work. You lose the very desirable property of being able to tell the story of your results purely in your figures.

To the rescue come the within-subject confidence intervals of Cousineau (2005).
http://www.tqmp.org/Content/vol01-1/p042/p042.pdf
The idea is straightforward and easy to implement: if your data is organized with participants as rows and conditions as columns, simply take the mean of each row and subtract it from the items in that row, making a new table. Then add the overall mean of the original table to all the entries of the new table. Each column will have the same mean as in the original data, but the row means will all be identical to each other and to the overall mean. Now construct your standard error bars or 95% confidence interval bars in the usual manner. The error bars will no longer include the between-subjects variance, and visually comparing any two error bars in the manner described above is the equivalent of doing a paired-samples t-test between the means (I haven't double-checked that). When we did this for our latest paper, the difference was like night and day: all of a sudden nearly all of our significance findings were clear and easy to read off the bar graph.
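Here's a minimal sketch of those steps in Python with NumPy, in case it helps to see them spelled out. The function names and the example scores are made up for illustration; they are not from Cousineau's paper.

```python
import numpy as np

def cousineau_normalize(data):
    """Remove between-subjects variability from a participants x conditions table."""
    data = np.asarray(data, dtype=float)
    row_means = data.mean(axis=1, keepdims=True)  # one mean per participant
    grand_mean = data.mean()                      # overall mean of the original table
    # Subtract each participant's own mean, then add the overall mean back.
    return data - row_means + grand_mean

def standard_errors(table):
    """Ordinary standard error of the mean for each column (condition)."""
    n_participants = table.shape[0]
    return table.std(axis=0, ddof=1) / np.sqrt(n_participants)

# Hypothetical data: 4 participants (rows) x 3 conditions (columns).
scores = np.array([
    [11.0, 12.0, 15.0],
    [20.0, 23.0, 24.0],
    [30.0, 31.0, 36.0],
    [40.0, 43.0, 44.0],
])

normalized = cousineau_normalize(scores)
raw_se = standard_errors(scores)         # includes between-subjects variance
within_se = standard_errors(normalized)  # between-subjects variance removed

print("condition means:", normalized.mean(axis=0))
print("raw SE:         ", raw_se)
print("within SE:      ", within_se)
```

The column means come out the same before and after, so the bars stay where they were; only the error bars change.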

There's a risk here: your readers may not know what the heck you're doing, or may even suspect that you are trying to make your results look better than they are. But the visual pairwise comparisons will be very close to (though not necessarily exactly the same as) the pattern of results from the corresponding ANOVA, and there is a paper to cite for the idea. Besides, at least one reviewer out there is certainly applying that kind of visual test even when it is inappropriate, that is, for a within-subjects design. The paper has now been cited 67 times, so it appears the idea is catching on.

Read the Cousineau paper first, but there is a late-breaking correction to it in this paper, Morey (2008):
http://www.tqmp.org/Content/vol04-2/p061/p061.pdf
It appears that the error bars are slightly too small when done the Cousineau way, but can be fixed by an easy numerical correction.
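As I understand Morey's correction, the variance of the normalized data gets multiplied by M/(M-1), where M is the number of within-subject conditions, so the standard error gets multiplied by the square root of that. A minimal sketch in Python with NumPy, with a hypothetical function name and assuming participants as rows and conditions as columns:

```python
import numpy as np

def morey_corrected_se(data):
    """Within-subject standard errors per condition, with Morey's M/(M-1) correction."""
    data = np.asarray(data, dtype=float)
    n_participants, n_conditions = data.shape
    # Cousineau normalization: remove each participant's mean, add the overall mean.
    normalized = data - data.mean(axis=1, keepdims=True) + data.mean()
    # Standard error of each column in the usual manner.
    se = normalized.std(axis=0, ddof=1) / np.sqrt(n_participants)
    # Scale the variance by M/(M-1), i.e. the standard error by its square root.
    return se * np.sqrt(n_conditions / (n_conditions - 1))
```

With only two conditions, multiplying the corrected standard error by the square root of 2 gives the standard error of the paired differences, which is the quantity a paired-samples t-test uses.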