Biostatistics Mini-project 2

Write up your answers to these two problems as essays, as though you were writing the methods and results sections of a paper you would submit. Rephrase the question posed, then write a brief summary of your results that makes clear which test you used (including degrees of freedom and other relevant parameters) and whether the test was one- or two-sided. Graphics can be very useful and can make your point better than words; include them when appropriate. Keep your write-ups short. You should NOT run every test you know on the data and discuss all possible outcomes. You SHOULD pick an appropriate test, conduct it, and report its results. NEVER, NEVER include raw Minitab (or other software) output. When appropriate, you can format your own table and include the relevant parts of the Minitab output.

You should also comment on the experimental design and how it affects your conclusions. If the data are observational, numerous factors may influence the significance of the test (or the lack of it), and the results may be due not to the factor that distinguishes the groups but to some uncontrolled factor. True experiments also differ in how well their designs control for extraneous factors. Mention any problems with the design that you notice.

Problem 1: In January 2006, Dan Donato and co-authors J.B. Fontaine, J.L. Campbell, W.D. Robinson, J.B. Kauffman, and B.E. Law published a short Science article (vol. 311, p. 352) reporting, essentially, that measurements of seedlings after a natural fire in Oregon indicated that seedlings fared better where no salvage logging took place. That is, the number of seedlings decreased more between 2004 and 2005 in the logged areas than in the unlogged areas. The fire, which occurred in 2002, was named the Biscuit fire, and the logging took place between the two seedling counts given in the table below.

Later in 2006, after the article was published, Dan was hauled in front of a congressional committee, in part to defend his statistical analysis. Congressman Baird disputed Dan's statistical techniques, claiming that other techniques (2-sample t-tests), more appropriate for the analysis of these data, indicated no significant difference between logged and unlogged areas.

Analyze these data yourself and write a report either defending the claim of Dan and his co-authors that logging hurt seedling production or defending Congressman Baird's contention that there is no significant difference. Think about what the appropriate null and alternative hypotheses should be; these are formed before seeing the data. Note that before these data were published, it was widely believed that logging should be beneficial for seedling rejuvenation. Defend your choice of statistical test, preferably in language that a congressman could understand.

Plot	seedlings_2004	seedlings_2005	treatment
1	298	164	logged
2	471	221	logged
3	767	454	logged
4	576	141	logged
5	407	217	logged
6	1534	224	logged
7	2423	349	logged
8	1697	1388	logged
9	1137	646	logged
10	288	220	unlogged
11	622	747	unlogged
12	300	260	unlogged
13	888	584	unlogged
14	1448	1566	unlogged
15	1425	626	unlogged
16	2349	1924	unlogged
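
As a starting point for exploring the table above, the following is a minimal Python sketch that computes the per-plot change in seedling count (2005 minus 2004) and summarizes it by treatment. It assumes the raw change is the quantity of interest, which is only one possible choice; a proportional change or a transformed count may be more defensible. The names PLOTS and changes_by_treatment are illustrative and not part of the assignment. The sketch is descriptive only and does not select or run a hypothesis test.

```python
from statistics import mean, stdev

# (plot, seedlings_2004, seedlings_2005, treatment), copied from the table above.
PLOTS = [
    (1, 298, 164, "logged"),
    (2, 471, 221, "logged"),
    (3, 767, 454, "logged"),
    (4, 576, 141, "logged"),
    (5, 407, 217, "logged"),
    (6, 1534, 224, "logged"),
    (7, 2423, 349, "logged"),
    (8, 1697, 1388, "logged"),
    (9, 1137, 646, "logged"),
    (10, 288, 220, "unlogged"),
    (11, 622, 747, "unlogged"),
    (12, 300, 260, "unlogged"),
    (13, 888, 584, "unlogged"),
    (14, 1448, 1566, "unlogged"),
    (15, 1425, 626, "unlogged"),
    (16, 2349, 1924, "unlogged"),
]


def changes_by_treatment(plots):
    """Group the per-plot change (2005 count minus 2004 count) by treatment."""
    grouped = {}
    for _plot, before, after, treatment in plots:
        grouped.setdefault(treatment, []).append(after - before)
    return grouped


# Assumption: raw change is used here purely for illustration.
changes = changes_by_treatment(PLOTS)
for treatment, diffs in changes.items():
    print(
        f"{treatment}: n={len(diffs)}, "
        f"mean change={mean(diffs):.1f}, sd={stdev(diffs):.1f}"
    )
```

Note that the two groups have different sizes (9 logged plots, 7 unlogged plots) and that the 2004 counts vary widely from plot to plot; both points bear on which response variable and which test are appropriate.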