|
III HOME
OVERVIEW
FIRST STEPS
EMPIRICAL STUDY
LITERATURE REVIEW
RESEARCH
FORMS
|
|
Dealing with Data
Spreadsheet or database? The first thing
to do once you have collected your data is to enter it into a database
or spreadsheet. What's the difference? For almost all III purposes,
the two are equivalent for information retrieval and storage, and both
can be easily imported to SPSS for analysis. Databases are a little more
time-consuming to set up but offer the great advantage of preventing
some kinds of data entry and manipulation errors. In a spreadsheet,
each cell is independent, so if you accidentally hit the up arrow during
data entry, you could overwrite existing data. The independence of each
cell also means that if you do any kind of sorting, calculating, etc.
its possible to leave cells out of the operation if you're not careful.
A database minimizes opportunities for these kinds of errors by treating
rows or columns as units rather than as strings of independent cells.
Use a spreadsheet when the dataset is relatively small (<100
subjects, <20 variables). In a small dataset, errors are easier to catch
and correct. If you are working with large amounts of data, and/or if
your data may be used by other investigators in the future, it may be
worth the extra effort to use a database. If you use a spreadsheet,
avoid any manipulations other than data entry. Manipulations to the
data are best done in SPSS where there's no chance of them corrupting
the raw data. Save the data as an Excel 4.0 worksheet. At this time
SPSS can't open workbook files.
Shortcut to creating
an Access database
Data format. Regardless of which kind of data storage
system you choose, the format will be the same:
- 1 subject per row
- 1 piece of information per column
- Use the topmost row of a spreadsheet for the column heading
- Use the first column for subject IDs
- If a measurement is repeated several times with the same subject,
record each occasion of measurement in a different column using a unique
column heading
- If more than 1 response to a survey question is an option, give every
possible response its own column and treat each one as a "yes or no"
question. Here's an example:
- I take a daily dose of (choose all that apply):
- NSAIDs
- nutritional supplements
- prescription drugs
This question would end up occupying 3 columns: NSAIDs - yes or no,
nutritional supplements - yes or no, and prescription drugs - yes or
no
- Enter a value for each variable, if possible. For example, if a column
is headed "history of hypoglycemia", enter the "no"s not just the "yes"s.
Leave cells blank only if the information is missing.
- Categorical variables can be recorded as words or numbers. Words are
more meaningful and can usually be analyzed just fine in SPSS. Both
Access and Excel are smart enough to fill in frequently used words once
the first letter is entered, so text entries don't mean more typing.
- If a column contains numerical data, don't include any text. SPSS
will think this is a mistake and treat it as missing data.
- Be consistent in how data is entered. SPSS won't know that "male"
and "man" mean the same thing.
A good example:
| Subject ID |
Sex |
Age |
Group |
Pain-pretreatment |
Pain-posttreatment |
Satisfied? |
| 1 |
M |
43 |
control |
5.5 |
4.2 |
yes |
| 2 |
M |
75 |
control |
6.8 |
6.1 |
yes |
| 3 |
F |
68 |
control |
5.2 |
4.9 |
no |
| 4 |
M |
72 |
experimental |
6.3 |
5.4 |
yes |
| 5 |
M |
49 |
experimental |
7.2 |
5.9 |
yes |
| 6 |
M |
55 |
control |
6.7 |
6.8 |
no |
| 7 |
M |
47 |
experimental |
5.2 |
5.1 |
yes |
| 8 |
F |
70 |
experimental |
6.3 |
5.9 |
no |
| 9 |
F |
43 |
experimental |
6.6 |
5.3 |
yes |
| 10 |
M |
56 |
control |
5.9 |
5.5 |
no |
A bad example:
| |
|
|
|
Pain |
|
| Subject ID |
Sex |
Age |
Group |
pre-post |
Satisfied? |
| 1 |
1 |
43? |
cont |
5.5 4.2 |
yes |
| 2 |
1 |
75 |
control |
6.8 6.1 |
yes |
| 3 |
2 |
68 |
C |
5.2 4.9 |
no |
| 4 |
1 |
72 |
exp |
6.3 5.4 |
yes |
| 5 |
1 |
49 |
ex |
7.2 5.9 |
yes |
| 6 |
1 |
55 |
control |
6.7 6.8 |
no |
| 7 |
m |
47 |
experimental |
5.2 5.1 |
yes |
| 8 |
1 |
70 |
E |
6.3 5.9 |
no |
| 9 |
2 |
43 |
experimental |
6.6 5.3 |
yes |
| 10 |
1 |
56 |
control |
5.9 5.5 |
no |
Preparing the Petition
| UW Human Subjects Review | Sample
Petition
Data Management | Data Analysis | The
Final Report
|