Empirical Study

Data Analysis and Interpretation

Look before you leap. Begin analyzing your data by listing all the research questions you would like to answer using the variables you have. Start with the most important questions and then go to the minor ones. If you have collected information on a lot of variables, don't try to include them all. Keep in mind the main purpose of your study and stick to the questions that pertain to it. It can also be helpful at this time to rough out the tables and graphs you plan to include and to start planning the "story line" for your results section. This will provide a roadmap for your analysis.

Choosing statistical methods. Once you have a clear idea of what questions you want to ask, choosing the right statistical technique will depend on characteristics of your experimental design and the kinds of variables you will be using. Answers to the following questions will steer you toward the correct analysis:

  1. What is the scale of measurement for each variable?
    • Nominal variables classify subjects into one of several mutually exclusive, non-ordered categories; the data are the number of subjects who fall into each category. Race, sex, and cause of death are examples of nominal variables.
    • Ordinal variables represent measures that are ordered along a single dimension, but where the step size of the measure is undetermined. Tumor stage is an example of an ordinal variable.
    • Interval variables represent measures ordered along a single dimension where the step size of the measure is constant. Examples are temperature, parity and hospitalization costs.
  2. If the variable is on an interval scale, is it normally distributed?
    • Some variables, such as age at first childbirth, body weight, and many physiological measures, are approximately normally distributed: most individuals cluster around the middle values of the range, with fewer at the extremes. Other variables, such as parity or PSA, are characteristically skewed: most individuals cluster at one extreme of the range, and numbers dwindle progressively toward the other extreme.
  3. Are observations independent or correlated?
    • Observations are independent when different study groups have undergone different treatments, e.g., in a case-control study or a parallel-groups study.
    • Observations are correlated either when a single group undergoes two or more treatments at different times (pre-post study) or when study and control groups are matched for certain characteristics.
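
The three questions above can be turned into a rough decision rule. The sketch below is only an illustration, not a substitute for statistical advice: the function name `suggest_test` and its arguments are hypothetical, it covers only the simple case of comparing two groups, and the tests it names are common textbook defaults rather than recommendations made on this page.

```python
def suggest_test(scale, normal=None, paired=False):
    """Suggest a common two-group test (hypothetical helper, textbook defaults).

    scale  -- "nominal", "ordinal", or "interval"
    normal -- for interval data only: True if roughly normally distributed
    paired -- True for correlated observations (pre-post or matched designs)
    """
    if scale == "nominal":
        # Counts in categories: compare proportions.
        return "McNemar test" if paired else "chi-square test"
    if scale == "ordinal" or (scale == "interval" and not normal):
        # Ordered or skewed data: rank-based (non-parametric) tests.
        # Interval data of unknown normality (normal=None) also lands here.
        return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"
    if scale == "interval":
        # Roughly normal interval data: compare means.
        return "paired t-test" if paired else "two-sample t-test"
    raise ValueError(f"unknown scale of measurement: {scale!r}")


# Example: body weight measured before and after treatment in the same group.
print(suggest_test("interval", normal=True, paired=True))
```

Designs with more than two groups, or with several predictor variables at once, need other methods (for example analysis of variance or regression) and are outside this sketch.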

Performing the analyses. Although Excel has the tools for many statistical operations, it's easier and less error-prone to use a statistics program instead. SPSS can read Excel files directly and does analyses quickly and easily. The library computers have SPSS, or you can buy it yourself from the bookstore for about $75. You can get statistical help from UW Statistical Consulting Services (http://www.stat.washington.edu/consulting/) or from the Student Resource Office in the Medical School.

Links to statistical sites:

  • Selecting Statistics (http://trochim.human.cornell.edu/selstat/ssstart.htm) Directs you to an appropriate statistical procedure via a series of questions.
  • HyperStat (http://www.davidmlane.com/hyperstat/) Searchable stats textbook.
  • Statsoft Electronic Statistics Textbook (http://www.statsoft.com/textbook/stathome.html) Another searchable text. Includes lots of sophisticated stuff, but the basics are there, too.
  • Statistics at Square One (http://www.bmj.com/collections/statsbk/index.shtml) An introductory stats course from BMJ.

How to interpret p-values. A low p-value (<.05) is usually considered "significant". The p-value is the probability of seeing a difference at least as large as the one observed if chance alone (the luck of the draw) were at work and there were no systematic difference between study groups. A low p-value therefore means that chance alone is an unlikely explanation for the observed effect. Two caveats are in order in interpreting significant p-values. First, pay attention to the size of the effect (the measured difference between groups) as well as the p-value. Ask yourself whether the observed difference is also clinically significant, that is, large enough to really matter. Second, before you can attribute the observed difference to a putative cause, you need to consider the contribution of confounding variables.
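
The first caveat can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions: the function `two_sample_z` is hypothetical, it uses a normal approximation with equal group sizes and a known common standard deviation (a simplification of the t-test), and the blood pressure figures are invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_diff, sd, n_per_group):
    """Return (z, two-sided p) for a difference between two group means.

    Assumes equal group sizes, a common known standard deviation, and a
    normal approximation; a simplification used here for illustration only.
    """
    se = sd * sqrt(2 / n_per_group)          # standard error of the difference
    z = mean_diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return z, p


# Invented numbers: blood pressure differences in mmHg, SD of 15 mmHg.
_, p_small_effect = two_sample_z(mean_diff=1, sd=15, n_per_group=10_000)
_, p_large_effect = two_sample_z(mean_diff=10, sd=15, n_per_group=20)
print(f"1 mmHg difference, 10,000 per group: p = {p_small_effect:.2g}")
print(f"10 mmHg difference, 20 per group:    p = {p_large_effect:.2g}")
```

Under these assumptions both comparisons fall below .05, and the clinically trivial 1 mmHg difference has by far the smaller p-value, simply because the groups are so large. The p-value alone does not tell you whether a difference matters.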

Non-significant results can be just as interesting as significant results. Results can fail to achieve significance for two reasons: either the treatment being tested has no effect, or the study included too few subjects to demonstrate one. The first possibility is informative; the second is not. To distinguish between the two, you need to examine the power of your study, that is, the probability that it would have detected an effect of a given size if one really existed. If you are trying to interpret negative results, this paper may help you: Detsky, AS and Sackett, DL (1985) When was a "negative" clinical trial big enough? Arch Int Med 145, 709-712.
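
A rough power calculation for comparing two group means might look like the sketch below. It is an approximation under stated assumptions, not the method of the paper cited above: the function names are hypothetical, the formulas use the normal approximation with equal group sizes and a known common standard deviation, and the numbers in the example are invented.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def power_two_groups(delta, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a true mean difference `delta`.

    Normal approximation, two-sided test, equal group sizes; the tiny
    contribution of the opposite tail is ignored.
    """
    z_crit = _norm.inv_cdf(1 - alpha / 2)
    z_effect = abs(delta) / (sd * sqrt(2 / n_per_group))
    return _norm.cdf(z_effect - z_crit)


def n_per_group_needed(delta, sd, power=0.80, alpha=0.05):
    """Approximate subjects per group needed to reach the requested power."""
    z_crit = _norm.inv_cdf(1 - alpha / 2)
    z_power = _norm.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) * sd / abs(delta)) ** 2)


# Invented example: a 10-unit difference matters clinically, SD is 15.
print(f"Power with 10 per group: {power_two_groups(10, 15, 10):.2f}")
print(f"Subjects per group for 80% power: {n_per_group_needed(10, 15)}")
```

If the power to detect a clinically important difference turns out to be low, a non-significant result says little about whether the treatment works.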