After my students have studied these two tests separately in the Intro to Statistics class, I present this finding: not only are the two tests based on the same math, but some statistical packages are moving to unify them. For example, in SPSS, there are some ANOVA tests that you can no longer find under the ANOVA tab; you must look under the GLM tab. Additionally, any data that ordinarily seems amenable only to an ANOVA test can be recoded to be amenable to regression, for instance by representing group membership with 0/1 indicator (dummy) variables. At the end of this lecture, I show my students one of my favorite geeky math cartoons, one that sometimes goes around on Pi Day. I explain that, after hearing that ANOVA and linear regression are fundamentally the same test, subsumed under the GLM, their faces, I'm sure, all look like this; that they will rush out to tweet the discovery to all of their friends; and that they will use the information to pick up dates at parties.
Saturday, June 27, 2015
In my Intro to Statistics course, one of the tasks I feel obliged to do is to show the students how two of the main topics of the course, linear regression and ANOVA, are linked. They seem to be completely unrelated, other than the fact that we spend 75% of the semester on these two tests. Regression grew out of Galton's studies of heredity in the 1880s and was formalized by Pearson in the 1890s; similarly, ANOVA (analysis of variance), pioneered by Fisher, was also applied to genetics some decades later, in the 1920s. The two tests have somewhat different assumptions that must be met before they can be applied correctly, and they require different types of data. Because of these and other differences, it isn't obvious that the two tests are based on the same math: linear (matrix) algebra. At a later point in statistical research, their linkages were discovered, and now both are subsumed under the General Linear Model.
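For readers who want to see the link rather than take it on faith, here is a minimal sketch in plain Python. It is not taken from my lecture or from SPSS: the three small groups of numbers are made up for illustration, and the function names (`anova_f`, `regression_f`, `solve`) are my own hypothetical helpers. The sketch computes the one-way ANOVA F statistic the textbook way, from between-group and within-group sums of squares, and then computes the overall F statistic of a least-squares regression on 0/1 dummy variables for group membership. Under those assumptions, the two numbers should agree up to rounding.

```python
def anova_f(groups):
    """One-way ANOVA F statistic from between/within sums of squares."""
    all_vals = [y for g in groups for y in g]
    n, k = len(all_vals), len(groups)
    grand = sum(all_vals) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def regression_f(groups):
    """Overall F statistic of a regression on dummy-coded group labels."""
    k = len(groups)
    # Design matrix: an intercept column plus a 0/1 indicator
    # for every group after the first (the first is the baseline).
    X, y = [], []
    for gi, g in enumerate(groups):
        for val in g:
            X.append([1.0] + [1.0 if gi == j else 0.0 for j in range(1, k)])
            y.append(val)
    n, p = len(y), k
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in X]
    mean_y = sum(y) / n
    ss_reg = sum((f - mean_y) ** 2 for f in fitted)
    ss_res = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    return (ss_reg / (p - 1)) / (ss_res / (n - p))


# Made-up illustrative data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]
f_anova = anova_f(groups)
f_reg = regression_f(groups)
print(f_anova, f_reg)
```

The design choice worth noticing is the dummy coding: the regression's fitted values turn out to be the group means, so its regression and residual sums of squares are the same quantities ANOVA calls between-group and within-group variation.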