Biostatistics/Statistics 578B
Scientific Data Analysis

Dr. McKnight

Week 6 Homework 

Written part 
due noon 
Wednesday, May 9, 2001 

The following assignment is to be turned in over the web at the class turn-in space,

https://catalyst.washington.edu/webtools/secure/esubmit/turnin.cgi?owner=bmck&id=106

Please submit your answers to both questions below as a single, double-spaced file in plain text, HTML, Microsoft Word, or Adobe Acrobat (.pdf) format. Be sure your document has your name on it and clearly labels the answer to question one and the answer to question two.

The final project this quarter is to analyze data from an industrial cohort study of workers in the diatomaceous earth industry. Data were collected on the histories of 2342 workers in a diatomaceous earth mining and processing facility in California over the period from 1909 to 1994. The file cristdoc.txt contains documentation for the data file, which will be made available next week. The main question of interest is whether exposure to crystalline silica predisposes a worker to death from lung cancer or non-malignant respiratory disease. Some background and detailed information about the study design, quoted from the introduction of the scientific article that originally reported these data, are given in the file cristdetail.htm.
 

     
  1. Using the documentation for the cristobalite data available on the class website assignments page, write a double-spaced, one-to-two-page introduction to the considerations you think might have led to the collection of the data. The last paragraph of this section should state, in words and as precisely as possible, the scientific questions that will be asked of the data in the analysis.

  2. Write a one-to-two-page detailed description of the statistical techniques that will be employed to analyze the data and answer the questions set out in question 1 above. This should begin with a description of any plots, tables, or other descriptive calculations that will be performed and an explanation of their purpose. It should also contain a detailed description of any statistical models that will be fit, including the complete definition of outcome variables, explanatory variables, and any parameters in these models. The last paragraph of this section should translate the verbal questions from question 1 into statistical questions in terms of these parameters, and explain what statistical procedures will be performed to answer these questions.

Last update: 5/01/01