Below Above 1 0 1 0 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2
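As a quick check on these data, the pooled-variance t statistic comparing the two columns can be computed directly. The following is a minimal R sketch, not code from class; the vector names below and above and the helper names n1, n2, pooled_sd, t_obs, t_check, and t_swapped are illustrative assumptions. It computes the statistic from the formula and then with t.test(), once with each ordering of the two samples.

```r
# Data from the two columns above (vector names are illustrative)
below <- c(1, 1, 1, 2)
above <- c(rep(0, 16), 1, 1, 2)

# Pooled-variance t statistic from the formula: mean(below) - mean(above)
n1 <- length(below)
n2 <- length(above)
pooled_sd <- sqrt(((n1 - 1) * var(below) + (n2 - 1) * var(above)) / (n1 + n2 - 2))
t_obs <- (mean(below) - mean(above)) / (pooled_sd * sqrt(1 / n1 + 1 / n2))

# The same statistic from t.test(); t.test(x, y) uses mean(x) - mean(y)
t_check   <- unname(t.test(below, above, var.equal = TRUE)$statistic)
t_swapped <- unname(t.test(above, below, var.equal = TRUE)$statistic)

print(round(c(t_obs, t_check, t_swapped), 2))
```

The first two values should agree, and the third should be their negative, because swapping the arguments to t.test() reverses the sign of the difference in means.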
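In class, we wrote something like the following code in R: (However, there seems to be a problem. The t statistic for our data was 3.56, computed as the mean of the below column minus the mean of the above column. The same statistic arises whenever the four values in the below column sum to 5, for example one 0, one 1, and two 2's. A more extreme value arises if two 1's and two 2's are in the below column. Our result or something more extreme should come up about 12 times out of 1000, since 110 of the 8855 possible splits qualify. I don't get values of 3.56 or more to come up at all in simulations running up to 100,000. The random number generation is probably not the cause. A more likely explanation is the sign of the statistic: t.test(x, y) reports mean(x) - mean(y), and in the code as written the above sample is passed first. Under that ordering the observed result appears as -3.56, at the low end of the sorted output, and the largest value the simulation can produce is about 1.33.)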
    # simulated permutation distribution of the pooled-variance t statistic
    out <- numeric(10000)
    pool <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,2,2)
    for (i in 1:10000) {
      perm <- sample(pool, 23)   # random reordering of the 23 pooled values
      below <- perm[1:4]
      above <- perm[5:23]
      # t.test(above, below) reports mean(above) - mean(below),
      # so the observed data correspond to -3.56 under this ordering
      out[i] <- t.test(above, below, var.equal=TRUE)$statistic
    }
    out[order(out)]   # sorted statistics, smallest first

In other years, we wrote the code below for a Minitab global macro that performs a simulation version of a permutation test on these data. The following code creates a random permutation, forms the pooled-variance t-statistic, and counts the number of times that statistic was at least as large as that of the original data. The code has to be modified if the test is not of whether the mean of column 1 is greater than the mean of column 2, but is instead a 2-sided test or a test in the other direction.
    GMACRO
    permutation
    # this macro assumes you have 2 samples
    # one stored in column 1 and one in column 2
    # text after a pound sign is a comment and is ignored
    Name k1 "loop"
    Name k2 "SampleSize1"
    Name k3 "SampleSize2"
    Name k4 "Total"
    Name k5 "T"
    Name k6 "RandomT"
    Name k7 "Count"
    Name k8 "pValue"
    Name k9 "pooledSD"
    Let Count = 0
    Let SampleSize1 = N(c1)
    Let SampleSize2 = N(c2)
    Let Total = SampleSize1 + SampleSize2
    Let pooledSD = sqrt(((SampleSize1 -1)*stdev(c1)**2 + (SampleSize2 -1)*stdev(c2)**2)/(SampleSize1 + SampleSize2 -2))
    Let T = (mean(c1) - mean(c2) )/(pooledSD*(sqrt(1/SampleSize1 + 1/SampleSize2)))
    Stack c1 c2 c3;
      subscripts c4.
    Do k1 = 1:1000
      # shuffle the group labels (sampling without replacement)
      Sample Total c4 c5
      Unstack c3 c6 c7;
        Subscripts c5.
      Let pooledSD = sqrt(((SampleSize1 -1)*stdev(c6)**2 + (SampleSize2 -1)*stdev(c7)**2)/(SampleSize1 + SampleSize2 -2))
      Let RandomT = (mean(c6) - mean(c7) )/(pooledSD*(sqrt(1/SampleSize1 + 1/SampleSize2)))
      If RandomT >= T
        Let Count = Count + 1
      ENDIF
    ENDDO
    Let pValue = Count/1000
    Print pValue
    ENDMACRO

A small change in the code gives a bootstrap (sampling with replacement) procedure:
    GMACRO
    permutation
    # this macro assumes you have 2 samples
    # one stored in column 1 and one in column 2
    # text after a pound sign is a comment and is ignored
    Name k1 "loop"
    Name k2 "SampleSize1"
    Name k3 "SampleSize2"
    Name k4 "Total"
    Name k5 "T"
    Name k6 "RandomT"
    Name k7 "Count"
    Name k8 "pValue"
    Name k9 "pooledSD"
    Let Count = 0
    Let SampleSize1 = N(c1)
    Let SampleSize2 = N(c2)
    Let Total = SampleSize1 + SampleSize2
    Let pooledSD = sqrt(((SampleSize1 -1)*stdev(c1)**2 + (SampleSize2 -1)*stdev(c2)**2)/(SampleSize1 + SampleSize2 -2))
    Let T = (mean(c1) - mean(c2) )/(pooledSD*(sqrt(1/SampleSize1 + 1/SampleSize2)))
    Stack c1 c2 c3;
      subscripts c4.
    Do k1 = 1:1000
      # resample the pooled data with replacement, keeping the original group labels
      Sample Total c3 c5;
        Replace.
      Unstack c5 c6 c7;
        Subscripts c4.
      Let pooledSD = sqrt(((SampleSize1 -1)*stdev(c6)**2 + (SampleSize2 -1)*stdev(c7)**2)/(SampleSize1 + SampleSize2 -2))
      Let RandomT = (mean(c6) - mean(c7) )/(pooledSD*(sqrt(1/SampleSize1 + 1/SampleSize2)))
      If RandomT >= T
        Let Count = Count + 1
      ENDIF
    ENDDO
    Let pValue = Count/1000
    Print pValue
    ENDMACRO
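The figure of about 12 in 1000 quoted earlier for the class data can be checked without any random sampling, because there are only choose(23, 4) = 8855 ways to pick which four pooled values land in the below column. The sketch below is one way to enumerate them in R. It is an exact check under the assumption that every split is equally likely, not the code written in class, and the names pooled_t, splits, t_all, and tol are illustrative. It uses mean(below) - mean(above) as the statistic and also shows how the comparison changes for a test in the other direction or a 2-sided test.

```r
# Pooled data: 16 zeros, five 1's, two 2's (the same pool as in the class R code)
pool <- c(rep(0, 16), rep(1, 5), rep(2, 2))
n1 <- 4   # number of values in the "below" column

# Pooled-variance t statistic with mean(x) - mean(y) in the numerator
pooled_t <- function(x, y) {
  sp <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
               (length(x) + length(y) - 2))
  (mean(x) - mean(y)) / (sp * sqrt(1 / length(x) + 1 / length(y)))
}

# Every way of choosing which 4 of the 23 positions go into "below"
splits <- combn(length(pool), n1)   # one column of indices per split
t_all <- apply(splits, 2, function(idx) pooled_t(pool[idx], pool[-idx]))

# Observed statistic: below = 1, 1, 1, 2
t_obs <- pooled_t(c(1, 1, 1, 2), c(rep(0, 16), 1, 1, 2))

tol <- 1e-8   # guards against floating-point differences between tied statistics
p_upper <- mean(t_all >= t_obs - tol)             # is mean(below) greater?
p_lower <- mean(t_all <= t_obs + tol)             # test in the other direction
p_two   <- mean(abs(t_all) >= abs(t_obs) - tol)   # 2-sided test

print(c(splits = ncol(splits), p_upper = p_upper, p_lower = p_lower, p_two = p_two))
```

Run as written, this should find 110 of the 8855 splits at or above the observed statistic, a proportion of 2/161, or about 0.0124. For these data the 2-sided proportion is the same, since no split produces a statistic as low as -3.56.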