LearningLiveCode: Building the Permutation Test of Exact Inference with LiveCode

Yes, with a blog post title like that, I fully expect this post to go viral. I'm sure you'll be seeing me on CNN later this evening.

Ah, well, yes, the truth is that words like "permutation" and "inference" are hardly captivating stuff, unless you are into mathematics and statistics. Well, I'm into both, but in a "mere mortals" kind of way. I've been teaching a MOOC (massive open online course) for about two years now with the title "Statistics in Education for Mere Mortals." I love statistics, but I'm not a statistician or mathematician. Instead, I'm your typical educational researcher who has largely favored quantitative research methods and therefore I've become very acquainted with the practical use of statistics through my research. I've been on a mission for a number of years to bring more statistics into our graduate curriculum in instructional design and development.

Side note: I'm teaching this five-week MOOC right now and it is open and free to anyone. Click here for the enrollment page. We are two weeks into it, but it's a self-paced course, so you could easily catch up. In fact, several (very motivated) people have already finished it.

This summer I had the chance to teach a 3-credit version of this course for the University of Georgia. (Now I can honestly tell my wife - finally - that I got paid to teach this course.) I've had a blast.

As I prepared to teach the UGA course, I came across this very readable and provocative article by George Cobb:

Cobb, G. (2007). The introductory statistics course: A Ptolemaic Curriculum? Technology Innovations in Statistics Education, 1(1). Available Online: http://escholarship.org/uc/item/6hb3k0nz

Cobb argues that introductory statistics courses should abandon the teaching of the standard statistical tool for comparing the average scores for two different groups - the t test - in favor of something called the permutation test, which is a concept that has been around at least since the 1930s. This approach is only now practical due to the easy access to high-speed computers (such as the one you are reading this blog post on right now). The really cool thing is just how simple and straightforward the permutation test is:

You run an experiment comparing two groups. Let's use the example of one group of students (the treatment group) going through "Lloyd's Wonderous Statistics Tutorial" as compared to another group who get a standard statistics tutorial (the control group). Let's say there were 10 students in each group.
You record each student's score on the final exam and compute the average for each group. Let's say that the treatment group's average was 34.4 and the control group's average was 27.9, for a difference of 6.5. Write 6.5 on the top of a piece of paper.
Next, you take all of the individual student scores - from both groups - and write each on a card.
Shuffle the cards and then put 10 of the scores at random into one pile and the remaining 10 scores into a second pile. The first pile will be for the treatment and the second for the control.
You again compute the averages for the two groups and the difference between them. Write that difference score on that piece of paper on the next line under the 6.5.
Repeat this procedure a couple of thousand times.
How many times did 6.5 or greater come up? If the answer is less than 5% of the time, you have what is called a statistically significant difference, which basically just means that you don't think that the original 6.5 difference is due to mere chance. Instead, something else must account for the difference, which we'll conclude is the fact that "Lloyd's Wonderous Statistics Tutorial" is just so much better than the standard variety. By the way, the proportion of times that 6.5 or greater came up is called a probability value, or p value.
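
For readers who would rather see the card-shuffling procedure as code, here is a minimal sketch in Python (not the LiveCode from my app). The individual scores are hypothetical: I made them up so that the group averages come out to 34.4 and 27.9, matching the example above. The function name `shuffled_permutation_test` and its `shuffles` and `seed` parameters are likewise just my choices for illustration. Averages are kept as exact fractions so that "6.5 or greater" is not thrown off by floating-point rounding.

```python
import random
from fractions import Fraction

# Hypothetical exam scores, invented for illustration only. They were chosen
# so that the group averages match the example in the text (34.4 and 27.9).
TREATMENT = [28, 31, 33, 34, 35, 35, 36, 36, 37, 39]
CONTROL = [22, 24, 26, 27, 28, 28, 29, 30, 32, 33]


def mean_difference(group_a, group_b):
    """Average of group_a minus average of group_b, as an exact fraction."""
    return Fraction(sum(group_a), len(group_a)) - Fraction(sum(group_b), len(group_b))


def shuffled_permutation_test(treatment, control, shuffles=2000, seed=None):
    """Return (observed difference, proportion of shuffles at least as large)."""
    rng = random.Random(seed)
    observed = mean_difference(treatment, control)  # the number at the top of the paper
    cards = list(treatment) + list(control)         # one "card" per score
    pile_size = len(treatment)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(cards)
        # First pile plays the treatment group, the rest play the control group.
        diff = mean_difference(cards[:pile_size], cards[pile_size:])
        if diff >= observed:
            hits += 1
    return observed, hits / shuffles


observed, p_value = shuffled_permutation_test(TREATMENT, CONTROL, shuffles=2000, seed=1)
print(f"observed difference: {float(observed)}")
print(f"estimated p value:   {p_value}")
```

With a different seed the estimated p value will wobble a little, which is exactly the "some error built into it" that comes with shuffling rather than listing every possible grouping.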

Even though this is a simple procedure, you can immediately see why the test wasn't embraced back in the 1930s - with only paper and pencil, no one wanted to do it! And there is an even more accurate way of doing the test, namely to use every possible way of dividing the 20 scores into two groups of 10 exactly once (by the way, there are 184,756 different ways of grouping these numbers; check out this web site for a cool calculator). If you compute the difference for each of these unique groupings instead of for random shuffles, your p value will be as exact as it gets, whereas the shuffling version has some sampling error built into it. But, as long as you reshuffle a few thousand times, it's close enough.
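
Here is a rough Python sketch of that exact version, again using hypothetical scores I invented so that the averages are 34.4 and 27.9. Instead of shuffling, it walks through every way of choosing 10 of the 20 scores for the treatment pile, using `itertools.combinations`, and `math.comb` confirms the 184,756 count. The function name `exact_permutation_test` is my own label, not something from the app.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical exam scores, invented for illustration only (averages 34.4 and 27.9).
TREATMENT = [28, 31, 33, 34, 35, 35, 36, 36, 37, 39]
CONTROL = [22, 24, 26, 27, 28, 28, 29, 30, 32, 33]


def exact_permutation_test(treatment, control):
    """Return (observed difference, number of groupings, exact one-sided p value)."""
    pooled = list(treatment) + list(control)
    n_a, n_b = len(treatment), len(control)
    total = sum(pooled)
    observed = Fraction(sum(treatment), n_a) - Fraction(sum(control), n_b)
    groupings = 0
    hits = 0
    # Choose positions rather than values so that tied scores still count
    # as separate cards.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = Fraction(sum_a, n_a) - Fraction(total - sum_a, n_b)
        groupings += 1
        if diff >= observed:
            hits += 1
    return observed, groupings, Fraction(hits, groupings)


observed, groupings, p_exact = exact_permutation_test(TREATMENT, CONTROL)
print(f"groupings checked: {groupings} (math.comb(20, 10) = {comb(20, 10)})")
print(f"observed difference: {float(observed)}")
print(f"exact p value: {float(p_exact)}")
```

Because the original grouping is itself one of the 184,756 possibilities, the exact p value can never be zero, whereas the shuffled estimate can be if no shuffle happens to reach 6.5.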

Lloyd's Permutation Test App

OK, that's a long introduction to the fact that I built just such a Permutation Test app for demonstration purposes in my UGA course using LiveCode. Here's a screenshot of the app:

I used the top hat to symbolize the random shuffling. If you follow the blue columns from left to right, you'll see that this follows the step-by-step procedure described above.

I made a video explaining the permutation test along with a demonstration of my app. But, if you just want to check out the app, fast forward to the 10:42 mark: