Categorical Data Analysis in Epidemiology

Biostatistics/Epidemiology 536

Course Outline, Fall Quarter 2007


Lectures:                T, Th       1:30-3:20             HST T733

Discussion:                T         12:30-1:20            HST T639

Video recording:     Lectures will be recorded using the tablet PC and made available on the course web page.  However, recording depends on technology and time constraints, so we cannot guarantee that a recording will be available for every class.

Handouts:       Copies of the handouts will be available from the class website (see below).

Instructor:      Bill Barlow

                        Research Professor of Biostatistics

                        williamb@crab.org  or  wbarlow@u.washington.edu

                        Telephone:        206-839-1761 (Cancer Research and Biostatistics), with voice mail

                        Office hours:     T 11:00 am - 12:00 pm; Th 12:00 pm - 1:00 pm; or other times

                             by appointment (either at UW; Met Park East (M, W, F); or FHCRC (W))

Guest Instructor:       Mary Lou Thompson,  Research Professor of Biostatistics

Teaching assistants:    Name                     E-mail                         Office hours*

                        Charlotte Gard           gardc@u.washington.edu

                        Colleen Sitlani          cmg22@u.washington.edu

                        Britton Trabert          brittont@u.washington.edu

Required text:  Hosmer D and Lemeshow S. Applied Logistic Regression, 2nd Edition.  New York: Wiley, 2000.

Recommended text:   Breslow N and Day N.  Statistical Methods in Cancer Research, Volume 1:  The Analysis of Case-Control Studies.  Lyon, France: IARC Scientific Publications No 32, 1980.

Required software:  Stata 10.  It can be purchased for $48 (small version, Getting Started manual) and up.  The Intercooled version with a permanent license ($155) is recommended.  Details are available at http://www.stata.com/order/new/edu/gradplan.html .  Order directly from Stata.

Computing:     Access is provided to Stata on computers in departmental computing labs as well as the Health Sciences library computing laboratory.  Most students will use their own personal computers.

Website:   The website will be used to distribute data and lecture notes and to make important announcements.  The website also allows you to send e-mail to me, the TAs, or the entire class.  Our class-specific website is http://courses.washington.edu/b536 .  You will need your UW NetID to access the website.  Most handouts are in Adobe pdf format, so they can be printed directly from any computer with Acrobat Reader.

E-mail: A classlist containing e-mail addresses of all registered class members is being constructed for e-mail contact. You are encouraged to direct questions to the instructor and/or teaching assistant. When of general interest, these will be edited, rendered anonymous as to the sender and forwarded together with the response to the classlist.

Grading:         25%  Assignments

                          5%  Class participation

                        20%  Midterm         (In class on October 30)

                        25%  Project           (Two parts: first part due Nov. 8; second part due Nov. 29)

                        25%  Final               (In class on the last day of lectures, December 6)

These percentages are used to determine your rank in the class, and the ranks are used to assign the final grade.

Lectures: The lectures are prepared in advance, with hardcopies of the lecture notes distributed in class and posted to the website. Questions from registered students are encouraged.  The questions often clarify points on which several students may share the same uncertainty. If you believe your question is not of general interest, feel free to ask your question before or after class. Auditors may not submit assignments or exams for grading.

Discussion section:   The discussion section will be used to discuss Stata examples, homework, and outstanding questions. 

Assignments will generally be distributed one week in advance and be due in class the following Thursday.  Computer output should be edited to eliminate all irrelevant material and should clearly indicate the answer to the question posed.  Late assignments will not be accepted.  Homework keys prepared by the teaching assistants will be posted to the class website.  Assignments are not meant to prepare you for exams, but to prepare you for real-world application of the methods to data.

Difficulty: Most students will find this course demanding.  The homework is time-consuming, and the text needs to be read several times for it to make sense.  We expect you to talk to your classmates about the material and homework to gain further insight.  If you need help, please ask the TAs or me.  I welcome feedback during the course, so feel free to let me know what you think by e-mail, before or after class, or by anonymous note.

Learning Objectives

It is assumed that when entering BIOST/EPI 536, you have completed a course in linear regression and have been exposed to logistic regression and some categorical data analysis.  You should understand the basic statistical concepts of sampling variation, parameter estimation, confidence limits, and statistical hypothesis testing.  You should know simple statistical techniques for analyzing data from a binomial distribution, including odds ratio estimation in 2 x 2 tables and in a series of 2 x 2 tables.  You should be familiar with the Mantel-Haenszel test and with testing for trend in a 2 x K table.  By the end of this course, you should be able to do the following using unconditional or conditional logistic regression, or using generalized estimating equations (GEE):

1.      Perform regression analyses with multiple predictors.

2.      Perform tests that indicate which covariates should be included in the model.

3.      Determine if there is a linear trend for ordinal level (or better) covariates.

4.      Use graphical and other methods for assessing adequacy of the fitted model.

5.      Interpret each coefficient in the model.

6.      Describe the methods and results to a non-statistical reader.

Books at the Health Sciences Reserve Desk (3 day):

Hosmer D and Lemeshow S. Applied Logistic Regression, 2nd Edition.  New York: Wiley, 2000.

Breslow N and Day N.  Statistical Methods in Cancer Research, Volume 1: The Analysis of Case-Control Studies.  Lyon, France: IARC Scientific Publications No 32, 1980.

Clayton D and Hills M. Statistical Models in Epidemiology. New York: Oxford University Press, 1993.

Kleinbaum DG and Klein M. Logistic Regression: A Self-Learning Text, 2nd Edition. New York: Springer, 2002.

Special Dates

October 2                                   Discussion section (12:30-1:20) will meet in the HS Library Computer Labs A and B

October 11, 16, 18, 23               Lectures by Dr. Mary Lou Thompson

October 30                                 Midterm exam (in class)

November 8                               Part 1 of the project due

November 29                             Part 2 of the project due

December 6                                Final exam on the last day of lectures (in class)