Free Statistics Help Book
An Interactive Multimedia introductory-level statistics book.
The book features interactive demos, simulations and case studies.
Testing Means

Robustness Simulation



Questions to be answered before the simulation are not yet implemented in this test version.


Begin by answering the questions, even if you have to guess. The first time you answer the questions you will not be told whether you are correct or not.


Once you have answered all the questions, answer them again using the simulation to help you. This time you will get feedback about each individual answer.


General Instructions


This demonstration allows you to explore the effects of violating the assumptions of normality and homogeneity of variance. When the simulation starts, you see the distributions of two populations. By default, both are normally distributed with a mean of 0 and a standard deviation of 1. The default sample size for the simulations is 5 per group. If you push the “simulate” button, 2,000 simulated experiments are conducted. You can adjust the number of simulations from 2,000 to 10,000. A t-test is computed for each experiment, and the number of tests that were significant, the number that were not significant, and the type I error rate (the proportion significant) are displayed.
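The procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the demonstration: it assumes the t-test is the standard two-tailed independent-groups test with a pooled variance estimate, and the names `pooled_t` and `type_i_error_rate` are hypothetical. The critical values are taken from standard t tables for 8 degrees of freedom (5 + 5 - 2), so the sketch applies only to the default sample size.

```python
import math
import random
import statistics

N_PER_GROUP = 5  # default sample size per group

# Two-tailed critical values of t for df = 8, from standard t tables.
T_CRIT_DF8 = {0.05: 2.306, 0.01: 3.355}

def pooled_t(x, y):
    """Independent-groups t statistic using the pooled variance estimate."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se

def type_i_error_rate(n_sims=2000, alpha=0.05, seed=1):
    """Run n_sims experiments in which the null hypothesis is true.

    Both samples come from a normal population with mean 0 and
    standard deviation 1, so every significant result is a type I error.
    Returns (number significant, number not significant, proportion significant).
    """
    rng = random.Random(seed)
    crit = T_CRIT_DF8[alpha]
    significant = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
        y = [rng.gauss(0, 1) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(x, y)) > crit:
            significant += 1
    return significant, n_sims - significant, significant / n_sims

if __name__ == "__main__":
    sig, not_sig, rate = type_i_error_rate()
    print(f"significant: {sig}, not significant: {not_sig}, type I error rate: {rate:.3f}")
```

Changing `seed` gives a different set of simulated experiments, which mimics pressing the “simulate” button again.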


Since the null hypothesis is true and all assumptions are met with these default values, the type I error rate should be close to 0.05, especially if you run a large number of simulations. It will generally not equal 0.05 exactly because of random variation. However, the larger the number of simulations you run, the closer the type I error rate should come to 0.05.
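How close is "close"? One way to judge is the standard error of a proportion, sqrt(p(1 - p)/N), where p is the true type I error rate and N is the number of simulations. The short calculation below is a sketch under the assumption that the simulated experiments are independent; the function name `monte_carlo_se` is hypothetical.

```python
import math

def monte_carlo_se(p, n_sims):
    """Standard error of a proportion estimated from n_sims independent trials."""
    return math.sqrt(p * (1 - p) / n_sims)

if __name__ == "__main__":
    for n_sims in (2000, 10000):
        se = monte_carlo_se(0.05, n_sims)
        # About 95% of simulation runs should land within 2 standard errors of 0.05.
        print(f"{n_sims:>6} simulations: 0.05 +/- {2 * se:.4f}")
```

With 2,000 simulations, most runs should give a rate between roughly 0.040 and 0.060; with 10,000 simulations, the range narrows to roughly 0.046 to 0.054.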


You can explore the effects of violating the assumptions of the test by making one or both of the distributions skewed and/or by making the standard deviations of the distributions different. You can also explore the effects of sample size and of the significance level used (0.05 or 0.01).


By exploring various distributions, sample sizes, and significance levels, you can get a feeling for how well the test works when its assumptions are violated. A test that is relatively unaffected by violations of its assumptions is said to be “robust.”
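As a rough illustration of a robustness check, the sketch below compares the type I error rate for two normal populations with the rate for two skewed populations, using 5 observations per group. The text does not say how the demonstration produces skew, so the skewed population here is an assumption: an exponential distribution shifted to have mean 0 (its standard deviation is 1), which is a strong rather than slight skew. The test is assumed to be the pooled-variance t-test, and all function names are hypothetical.

```python
import math
import random
import statistics

N_PER_GROUP = 5
T_CRIT = 2.306  # two-tailed 0.05 critical value of t for df = 8, from standard t tables

def pooled_t(x, y):
    """Independent-groups t statistic using the pooled variance estimate."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / n1 + 1 / n2))

def draw_normal(rng):
    """One observation from a normal population with mean 0 and SD 1."""
    return rng.gauss(0, 1)

def draw_skewed(rng):
    """One observation from a positively skewed population with mean 0 and SD 1.

    Assumption for illustration: an exponential distribution shifted left by 1.
    """
    return rng.expovariate(1.0) - 1.0

def error_rate(draw, n_sims=10000, seed=1):
    """Proportion of significant tests when both samples share one population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = [draw(rng) for _ in range(N_PER_GROUP)]
        y = [draw(rng) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(x, y)) > T_CRIT:
            hits += 1
    return hits / n_sims

if __name__ == "__main__":
    print(f"normal populations: {error_rate(draw_normal):.3f}")
    print(f"skewed populations: {error_rate(draw_skewed):.3f}")
```

If the two printed rates are both near 0.05, the test is behaving robustly for this particular kind of skew.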


Step By Step Instructions


1. Using the default settings, click “simulate” and see what the type I error rate is. It should be close to 0.05. Try running more simulations by pressing the “simulate” button again. If the simulations run quickly, try setting the number of simulations to 10,000 before pressing the “simulate” button.


2. Examine the effect of non-normality. Give both distributions a slight skew. Do at least 10,000 simulated experiments. Note how much the type I error rate differs from 0.05 (if at all).

3. Repeat Step 2 using 15 subjects per group instead of the default of 5.


4. Make both distributions normal but make the variance of the second distribution much larger than that of the first. Find the type I error rate for 5 subjects per group and then for 20 subjects per group.


5. Try different combinations of skew and heterogeneity of variance and examine the effects on the type I error rate.


6. Investigate the effects of the combination of unequal sample sizes and heterogeneity of variance. Give one population a standard deviation of 1 and the other a standard deviation of 3. Then make the sample sizes unequal. For one set of simulations, make the sample from the population with a standard deviation of 1 larger than the other sample. Then try it the other way.
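Step 6 can also be tried outside the demonstration. The sketch below is an illustration under stated assumptions, not the demonstration's own code: it assumes a pooled-variance t-test at the 0.05 level, normal populations with mean 0, and sample sizes of 15 and 5 (chosen here only as an example). Because the critical value 2.101 is the table value for 18 degrees of freedom, the hypothetical function `error_rate` accepts only sample sizes that sum to 20.

```python
import math
import random
import statistics

T_CRIT_DF18 = 2.101  # two-tailed 0.05 critical value of t for df = 18, from standard t tables

def pooled_t(x, y):
    """Independent-groups t statistic using the pooled variance estimate."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / n1 + 1 / n2))

def error_rate(n1, sd1, n2, sd2, n_sims=10000, seed=1):
    """Type I error rate for two normal populations that both have mean 0."""
    if n1 + n2 - 2 != 18:
        raise ValueError("this sketch only has the critical value for df = 18")
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0, sd1) for _ in range(n1)]
        y = [rng.gauss(0, sd2) for _ in range(n2)]
        if abs(pooled_t(x, y)) > T_CRIT_DF18:
            hits += 1
    return hits / n_sims

if __name__ == "__main__":
    # Larger sample from the population with SD 1, smaller sample from SD 3.
    print(f"n = 15 (SD 1) vs n = 5 (SD 3):  {error_rate(15, 1.0, 5, 3.0):.3f}")
    # The other way around.
    print(f"n = 5 (SD 1) vs n = 15 (SD 3):  {error_rate(5, 1.0, 15, 3.0):.3f}")
    # Equal sample sizes with the same standard deviations, for comparison.
    print(f"n = 10 (SD 1) vs n = 10 (SD 3): {error_rate(10, 1.0, 10, 3.0):.3f}")
```

Comparing the three printed rates shows which arrangement of sample sizes pushes the type I error rate furthest from 0.05.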


Summary


In general, an independent-groups t test works well even if its assumptions are violated. The test is slightly conservative (a type I error rate lower than the significance level) for skewed distributions. When the homogeneity of variance assumption is violated, there is a slight increase in the type I error rate, and the effect is more pronounced with small sample sizes.


There is a serious inflation of the type I error rate when there is a combination of unequal sample sizes and heterogeneous variances. This occurs when the sample size from the population with the larger variance is smaller than the sample size from the population with the smaller variance.
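A short calculation suggests why this combination is harmful, assuming the test uses a pooled variance estimate. The pooled estimate weights each sample variance by its degrees of freedom, so the larger sample dominates. When the larger sample comes from the population with the smaller variance, the estimated variance of the difference between means is too small and the t statistic is too large. The sketch below compares the expected estimate with the true variance; the function name and the sample sizes of 15 and 5 are hypothetical choices for illustration.

```python
def variance_ratio(n1, sd1, n2, sd2):
    """Expected pooled estimate of the variance of (mean1 - mean2), divided by its true value.

    A ratio below 1 means the test tends to underestimate the standard error
    (too many significant results); a ratio above 1 means it overestimates it.
    """
    true_var = sd1 ** 2 / n1 + sd2 ** 2 / n2
    # Expected value of the pooled variance: each sample variance is unbiased.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    estimated_var = pooled_var * (1 / n1 + 1 / n2)
    return estimated_var / true_var

if __name__ == "__main__":
    print(f"small sample from SD 3: {variance_ratio(15, 1.0, 5, 3.0):.2f}")
    print(f"large sample from SD 3: {variance_ratio(5, 1.0, 15, 3.0):.2f}")
    print(f"equal sample sizes:     {variance_ratio(10, 1.0, 10, 3.0):.2f}")
```

With equal sample sizes the ratio is 1, which is consistent with heterogeneous variances alone producing only a slight increase in the type I error rate.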


Copyright 2011