Psych 218 –
Third SPSS Tutorial
You are researching treatments for substance abuse, and your main outcome of interest is the number of days in which participants use substances (per month). You randomly assign 40 participants to 4 different treatment conditions:
1) a wait-list control (they later receive a treatment, but you measure their use before they do)
2) 12-step group, either Alcoholics Anonymous or Narcotics Anonymous
3) harm reduction group therapy, geared toward reducing the harms associated with use (abstinence from drugs is not necessarily the goal)
4) individual psychodynamic psychotherapy, focusing on the unconscious drives to use substances
The data above (download by clicking here, then save to a folder on your computer) are the number of days that each participant has used substances in the most recent month (assessed at 3 months into the study). The possible range of days is 0 to 31 days; this dependent variable is “daysuse.” The “group” variable represents the therapy that each participant receives (1-4 represent the conditions 1-4 listed above).
You can look at the data in boxplot form by selecting
GRAPHS / BOXPLOT / SIMPLE (summaries for groups of cases)
and defining your variables (“daysuse” is your variable of interest, and “group” is the category axis).
Run the analysis of variance (overall F test) in SPSS by selecting
ANALYZE / COMPARE MEANS / ONE WAY ANOVA
The one-way ANOVA command (which you used in the last tutorial) requires you to specify a dependent variable (daysuse) and a factor (group). You also want to select under the Options that you want both descriptive and homogeneity of variance statistics. Your output follows:
You have an effect! (Somewhere, that is…) And there are no homogeneity-of-variance (HOV) violations: Levene’s statistic is not significant.
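If you want to see the same overall F test and Levene check outside of SPSS, they can be run in Python with SciPy. The tutorial’s data file is not reproduced here, so the numbers below are simulated stand-ins with roughly the pattern described (harm reduction lowest):

```python
import numpy as np
from scipy.stats import f_oneway, levene

rng = np.random.default_rng(1)
# Simulated stand-in for the "daysuse" data: 10 participants per condition,
# group means chosen purely for illustration (wait list, 12-step,
# harm reduction, psychodynamic).
groups = [rng.normal(m, 5, 10) for m in (15, 12, 6, 13)]

lev_stat, lev_p = levene(*groups)   # homogeneity-of-variance check
f_stat, f_p = f_oneway(*groups)     # overall one-way ANOVA F test
print(f"Levene p = {lev_p:.3f}, ANOVA F = {f_stat:.2f}, p = {f_p:.4f}")
```

A nonsignificant Levene p alongside a significant ANOVA p corresponds to the SPSS output described above: some effect exists, and the equal-variances assumption is tenable.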
OK, so you know that there is some effect of the independent variable. Everything that you’ve done in SPSS up to this point is review. However, let’s assume several different scenarios in which you want to do different types of tests…
SCENARIO 1: You have two questions in mind before you begin the study:
1) Is harm reduction superior to all other methods of therapy?
2) Does group therapy differ from the other conditions (both 12-step and harm reduction are in groups in this study)? That is, does the average of the wait-list control and the individual psychodynamic therapy differ from the average of 12-step group and harm reduction group?
What is the appropriate type of testing to do here? You have two comparisons that you plan to make before the study begins. The null hypotheses are:
First comparison: μ_harm r = (μ_wait list + μ_12-step + μ_psychodyn) / 3, or μ_harm r − (μ_wait list + μ_12-step + μ_psychodyn) / 3 = 0
Second comparison: (μ_harm r + μ_12-step) / 2 = (μ_wait list + μ_psychodynam) / 2 (can also set the difference = 0)
The contrast coefficients for these two comparisons are (with the means in order of wait list control, 12-step group, harm reduction group, and individual psychodynamic therapy):
First comparison: (-1/3, -1/3, 1, -1/3)
Second comparison: (-1/2, 1/2, 1/2, -1/2)
Note that the sum of the contrast coefficients, in both cases, is 0. This satisfies the rule for constructing contrasts. Also note that the sum of the absolute value of the coefficients is 2; this satisfies the suggested way of constructing contrasts.
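Both construction rules are easy to verify mechanically. A small Python check, using exact fractions so that 1/3 is not truncated to a two-decimal approximation:

```python
from fractions import Fraction as F

# The two planned contrasts, in group order:
# wait list, 12-step, harm reduction, psychodynamic.
contrast1 = [F(-1, 3), F(-1, 3), F(1), F(-1, 3)]    # harm reduction vs. the rest
contrast2 = [F(-1, 2), F(1, 2), F(1, 2), F(-1, 2)]  # group vs. non-group therapy

for c in (contrast1, contrast2):
    assert sum(c) == 0                   # rule: coefficients sum to zero
    assert sum(abs(x) for x in c) == 2   # suggestion: absolute values sum to 2
print("both contrasts satisfy the construction rules")
```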
What type of comparisons are these? Should you use Bonferroni, Tukey’s HSD, or Scheffé?
So, how do you do these in SPSS? SPSS allows you to specify which contrasts you would like to make in the ANOVA command. Let’s re-run the ANOVA, but this time indicate which contrasts we’re testing.
Select:
ANALYZE / COMPARE MEANS / ONE WAY ANOVA
Enter your dependent and independent variables as before, and also select Contrasts.
To enter your contrasts, enter each coefficient into the “Coefficients” box and click “Add.” When you have finished entering one contrast, click “Next” to go to the second contrast. Here is how your dialog box will look after you have entered each contrast:
[Screenshots: the Contrasts dialog box after entering Contrast 1 and Contrast 2]
Click on “Continue,” then OK to run the tests. (The output is below; descriptives and HOV tests are omitted since they are the same as in the F test run previously)
The value of F and the overall significance are still the same.
Notice that the tests of the contrasts give you the value of each contrast (computed the same way that you would compute ψ-hat), the standard error of the contrast, and the t-statistic that tests the contrast.
Because Levene’s statistic was not significant, we fail to reject the null hypothesis that the population variances are the same, and will use the values in the “assume equal variances” row.
Also note that for the first contrast, SPSS states that the “sum of the contrast coefficients is not zero.” This is because SPSS only allows us to enter decimals for each coefficient (and only two decimal places), so .33 is not interpreted as exactly 1/3. Don’t worry about the fact that the contrast value will be slightly off because of this.
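To see where SPSS’s numbers come from, the contrast value ψ-hat, its standard error, and the t-statistic can be computed directly from the group means and MS-within. The summary statistics below are purely illustrative placeholders (the actual SPSS output is not reproduced here):

```python
import math

# Illustrative summary statistics only; substitute the values
# from your own descriptives and ANOVA table.
group_means = [15.0, 12.0, 6.0, 13.0]  # wait list, 12-step, harm red., psychodyn.
n_per_group = 10                        # 40 participants / 4 groups
ms_within = 25.0                        # mean square within (illustrative)

coeffs = [-1/3, -1/3, 1.0, -1/3]        # first contrast: harm reduction vs. the rest

# psi-hat = sum of (coefficient * group mean)
psi_hat = sum(c * m for c, m in zip(coeffs, group_means))
# SE(psi-hat) = sqrt( MS_within * sum(c_j^2 / n_j) )
se_psi = math.sqrt(ms_within * sum(c**2 for c in coeffs) / n_per_group)
t_stat = psi_hat / se_psi               # t with N - p = 36 degrees of freedom
print(f"psi-hat = {psi_hat:.3f}, SE = {se_psi:.3f}, t = {t_stat:.3f}")
```

With these made-up numbers, ψ-hat is negative because harm reduction (fewer days of use) is being compared against the average of the other three groups.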
Now, what about the p-values (significance levels) of each contrast? SPSS does not take into consideration the fact that you may have more than one planned comparison, so it does not do the Bonferroni adjustment. This adjustment is accomplished in one of two ways:
1) Compare your obtained p-values to the desired alpha level DIVIDED BY the number of tests you planned in advance. For example, since you planned two tests here, you compare each p-value to alpha (.05) divided by 2, or .025, to determine whether or not the tests are significant.
2) Multiply each p-value by the number of tests (two here) and compare to the alpha that you have set. For these contrasts, your adjusted p-values would be (.001 * 2) = .002 and (.020 * 2) = .040, respectively. These are both less than .05, of course.
Regardless of the way you choose to do the adjustment (either to the alpha level or to the p-value), you will obtain the same result.
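Both versions of the adjustment can be written out in a few lines of Python, using the two contrast p-values reported above:

```python
# Bonferroni adjustment for two planned comparisons, done both ways.
alpha = 0.05
p_values = [0.001, 0.020]   # the two contrast p-values from the output
m = len(p_values)           # number of planned tests

# Way 1: compare each p-value to alpha / m
sig_way1 = [p < alpha / m for p in p_values]

# Way 2: multiply each p-value by m (capped at 1) and compare to alpha
sig_way2 = [min(p * m, 1.0) < alpha for p in p_values]

# Either way, the decisions are identical.
assert sig_way1 == sig_way2
print(sig_way1)   # both contrasts significant at the .05 family-wise level
```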
Now, let’s think of a different scenario:
SCENARIO 2: Your primary question of interest is whether any of the four conditions differ from each other in the experiment.
This results in p(p - 1) / 2 comparisons, where p is the number of groups or conditions. For a 4-group experiment like the one here, there are 6 pairwise comparisons.
What test should you use to account for the fact that you are doing all pairwise comparisons? You planned these comparisons in advance, remember…
(ANSWER: Tukey’s HSD test is appropriate for all pairwise comparisons, whether they are planned beforehand or conducted afterward.)
You can compute Tukey’s HSD test in SPSS by running the usual command:
ANALYZE / COMPARE MEANS / ONE WAY ANOVA
(SPSS will remember the a priori contrasts that you just entered; you can clear those by selecting “Contrasts” again and removing the contrast coefficients you entered.)
Select “POST HOC.” Note that this label is a misnomer here; you planned these pairwise comparisons in advance, but SPSS only offers Tukey’s HSD test under the “Post Hoc” option.
Select that you want the Tukey test.
Run the test, and you will get the following as part of your output:
First, note that SPSS yields 12 pairwise comparisons (that is because each pair is duplicated; e.g., ‘12-step group and harm reduction’ is the same comparison as ‘harm reduction and 12-step group’). The value of the difference between the group means is given, along with the significance of each comparison. SPSS has already adjusted for the multiple pairwise comparisons using Tukey’s HSD, so you can take these significance levels as they are listed.
Notice that there are significant differences between wait list control and harm reduction, and between 12-step group and harm reduction. In both cases, harm reduction is superior.
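If you ever need Tukey’s HSD outside of SPSS, SciPy provides `scipy.stats.tukey_hsd`. The data below are simulated stand-ins for the tutorial dataset (which is not reproduced here). Like SPSS, the result reports every pair in both directions: the adjusted p-values come back as a symmetric 4 × 4 matrix, so the 6 distinct comparisons each appear twice:

```python
import numpy as np
from scipy.stats import tukey_hsd

rng = np.random.default_rng(0)
# Simulated days-of-use data, 10 participants per condition
# (illustrative only; substitute the real "daysuse" values by group).
wait_list      = rng.normal(15, 5, 10)
twelve_step    = rng.normal(12, 5, 10)
harm_reduction = rng.normal(6, 5, 10)
psychodynamic  = rng.normal(13, 5, 10)

result = tukey_hsd(wait_list, twelve_step, harm_reduction, psychodynamic)
# result.pvalue[i, j] is the Tukey-adjusted p-value for comparing groups
# i and j; the adjustment for all 6 pairwise tests is already built in,
# so no further correction is needed.
print(result)
```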
SCENARIO 3: You had no planned comparisons before you looked at the data. After you ran the overall ANOVA, you wondered if the wait list control was less effective than the average of the other three conditions (any therapy at all) in reducing the number of days of substance use.
In doing this, you had no planned comparisons, and only one post-hoc comparison.
What type of test is appropriate to use?
ANSWER: Because you conducted this test post-hoc, you would need to use either Tukey’s HSD or Scheffé… however, because this is more than a pairwise comparison, you must use Scheffé (the only test that is appropriate for complex comparisons done post-hoc).
Your null hypothesis for this test is: μ_wait list = (μ_12-step + μ_harm r + μ_psychodyn) / 3
Your contrast coefficients should be (1, -1/3, -1/3, -1/3).
You can run this contrast as you did earlier in SPSS, using the following values:
If you do that, you would get the following:
Unfortunately, this is not Scheffé’s test. It is the t-test corresponding to a planned comparison. If you instead check Scheffé in the post-hoc box of the ANOVA command, you get the following output:
Note here that you are encountering a limitation of SPSS. When you perform a Scheffé test in SPSS, it gives you the pairwise tests, just as it did for Tukey. It does not give you the complex comparisons (of course, this would be difficult, as there are an infinite number of possible complex comparisons!). What you should note from this is that the significance levels of the Scheffé tests are greater than the significance levels of the Tukey’s HSD tests (e.g., the difference between harm reduction and 12-step group is no longer significant with Scheffé).
Why is this? Since you are accounting for the possibility of doing all possible comparisons (pairwise and complex), you are implicitly doing many more tests than with Tukey, so the test of each comparison is more conservative. That is, it is more difficult to reject the null with each of your tests using Scheffé than using Tukey.
Unfortunately, if you want the Scheffé test statistic as it is listed in your text, you must compute it by hand; SPSS will not give it to you for complex comparisons.
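Computing it by hand is straightforward, though: square the contrast’s t-statistic to get an F, and compare it to (p − 1) times the critical value of F with (p − 1, N − p) degrees of freedom. The contrast value and standard error below are illustrative placeholders for the numbers in your own output:

```python
from scipy.stats import f

# Scheffé criterion for a complex comparison, computed by hand.
p_groups, n_total, alpha = 4, 40, 0.05
df_between = p_groups - 1      # 3
df_within = n_total - p_groups  # 36

psi_hat = -7.33   # illustrative contrast value (use your own output)
se_psi = 1.83     # illustrative standard error of the contrast
f_psi = (psi_hat / se_psi) ** 2   # squared t for the contrast

# Scheffé critical value: (p - 1) * F_crit(alpha; p - 1, N - p)
scheffe_crit = df_between * f.ppf(1 - alpha, df_between, df_within)

significant = f_psi > scheffe_crit   # reject H0 only if F exceeds the criterion
print(f"F_psi = {f_psi:.2f}, Scheffé criterion = {scheffe_crit:.2f}")
```

With these placeholder numbers the contrast clears the Scheffé criterion, but the point of the sketch is the criterion itself: (p − 1) · F_crit, not the ordinary per-comparison critical value.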
SCENARIO 4: You had no planned comparisons before you looked at the data. After looking at the data, you decide to test all pairwise comparisons, but stop there.
You will be conducting all pairwise comparisons post-hoc.
What test do you use? This isn’t meant to be a trick; you still use Tukey’s HSD. That is because this test is appropriate for all pairwise comparisons, whether planned or post-hoc.
SCENARIO 5: You had no planned comparisons before you looked at the data. After looking at the data, you wonder if harm reduction is significantly different from individual psychodynamic therapy, but don’t want to answer any other questions.
You had no planned comparisons, and were interested in only one pairwise comparison post-hoc.
What test do you use? Again, it’s Tukey’s HSD! That’s because when you conduct post-hoc tests, you need to adjust your overall alpha to account for the number of tests that you would need to have conducted in order to do the test of interest (here, you would have needed to do ALL pairwise tests in order to do this pairwise test). Tukey’s test is what you would use, since you did not do a complex comparison and you did not plan this comparison in advance.