Quasi-Experimental Designs and Classic Confounds


Classic Confounds

  1. History - occurs when some event happens that could plausibly affect participants' performance. The event could concern just a particular participant (e.g., winning the lottery), or it could involve the whole society (e.g., a war or an economic depression).

  2. Maturation - occurs when participants' behavior differs from the beginning to the end of a study simply as a result of intrinsic changes (such as developmental changes related to aging, the natural progression of illnesses or disorders, and changes as a function of greater experience).

  3. Testing - occurs when taking a pretest influences participants' behavior on a posttest (e.g., by causing them to be more relaxed on the posttest because it is now a more familiar testing situation, by fatiguing them, or by allowing them to have practice with the particular types of items on the test).

  4. Instrumentation - occurs when the measuring instrument changes in some way from pretest to posttest.

  5. Regression - the statistical phenomenon whereby participants who score in the extremes on a pretest may score closer to the mean on the posttest, even if there is no treatment effect.

  6. Mortality/Attrition - occurs when some participants begin a study, but fail to complete it for any reason.

  7. Selection - This confound applies to quasi-experimental designs that involve more than one group, such as a pretest-posttest design with a non-equivalent control group or a time series design with a non-equivalent control group. A selection confound occurs when the method of selecting participants for the conditions of an experiment causes those various groups of participants to differ at the outset of the study in important ways that could affect their performance on the dependent variable.
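Of the seven confounds above, regression (item 5) is the least intuitive, and a small simulation can make it concrete. The sketch below is only an illustration under assumed numbers: each simulated participant has a stable "ability" plus independent random luck on each test, and there is no treatment at all. The function name, the score scale, and the 10% cutoff are all hypothetical choices, not part of the course materials.

```python
import random
import statistics

def simulate_regression(n=10_000, seed=0):
    """Simulate pretest and posttest scores with NO treatment effect."""
    rng = random.Random(seed)
    # Score on each test = stable ability + independent luck on that test.
    ability = [rng.gauss(50, 10) for _ in range(n)]
    pretest = [a + rng.gauss(0, 10) for a in ability]
    posttest = [a + rng.gauss(0, 10) for a in ability]

    # Pick out the participants in roughly the top 10% of pretest scores.
    cutoff = sorted(pretest)[int(0.9 * n)]
    extreme = [i for i in range(n) if pretest[i] >= cutoff]

    pre_mean = statistics.mean(pretest[i] for i in extreme)
    post_mean = statistics.mean(posttest[i] for i in extreme)
    return pre_mean, post_mean

pre_mean, post_mean = simulate_regression()
print(f"Extreme scorers, pretest mean:  {pre_mean:.1f}")
print(f"Same people, posttest mean:     {post_mean:.1f}")
```

Under these assumptions, the extreme pretest scorers should land closer to the overall mean of 50 on the posttest, even though nothing was done to them. A researcher who selected participants for their extreme pretest scores could mistake this drift for a treatment effect.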

Practice Exercises

First, identify the type of quasi-experimental design. Second, try to answer the other
questions posed for each exercise.

 

 

Exercise #1: The Gun Ban


Based upon reports of increased violent crime in other parts of Oklahoma, the newly incorporated town of Hillsborough creates a law banning people from bringing handguns within the town limits. After passing the law, the town council also starts to collect data on violent crime that occurs within the town limits. For the year following the handgun ban, they find that the likelihood of a Hillsborough resident being a victim of violent crime is less than one-in-a-thousand. The town council members all seek re-election, claiming that their law has made Hillsborough a safer place.

What type of experimental design was used?

What is the most basic reason why we cannot accept the council members' conclusion that the gun ban made Hillsborough a safer place?

 

See the discussion of Exercise #1 in the answers section below.

 

 

***********************************************************

 

 

Exercise #2: Dirty River


People have been noticing that the water in the Ookaville River looks dirty and oily. Starting in May, scientists take measurements of the pollution in the river for 5 months, and then new, stricter water pollution regulations are enacted. Scientists then take measurements of pollution in the river for the 5 months following the new regulations. The pollution was high in all the measurements before the new regulations. Since the new regulations, pollution in the river has been decreasing each month. The scientists conclude that the new regulations have led to less pollution in the river. Another group of scientists examines the procedures that were used to assess pollution in the river. They find that the scientific field kits used to measure pollution were gradually becoming less and less sensitive to pollution. Thus, the pollution level was being underestimated toward the end of the study.

What type of experimental design was used?

What one confound is most directly illustrated by this example?

What other confounds might exist?

 

See the discussion of Exercise #2 in the answers section below.

 

 

***********************************************************

 

 

Exercise #3: Another Foul River

In a different part of the United States, the residents of Allentown have been noticing that the water in the Allen River looks dirty and oily, and it smells. Scientists take measurements of the pollution in the river for 5 months, and then new, stricter water pollution regulations are enacted. Scientists then take measurements of pollution in the river for the 5 months following the new regulations. The pollution was high in all the measurements before the new regulations. Since the new regulations, pollution in the river has been decreasing each month. The scientific field kits used to measure the pollution worked properly at each measurement period; no problem with that here. However, just two weeks after Allentown introduced its stricter water pollution regulations, a national "Save the Environment" media campaign was launched by a group of environmental organizations. The campaign featured daily TV, radio, and newspaper advertisements attempting to get people to recycle more goods and create less pollution (such as by reducing pesticide use and runoff). This media campaign lasted for several months.

What type of experimental design was used?

What one confound is most directly illustrated by this example?

 

See the discussion of Exercise #3 in the answers section below.

 

***********************************************************

 

 

Exercise #4: Don't Be So Anxious

Dr. Bowers has developed a new treatment for generalized anxiety disorder (GAD) and he has designed a test to assess its effectiveness. Unfortunately, Dr. Bowers' new treatment involves the use of very painful electric shocks. Dr. Bowers finds 40 people who have been clinically diagnosed with GAD and enrolls them (with their consent) in his treatment program. First, he measures their level of general anxiety. Then, over the course of the next 10 weeks, the participants are given powerful and painful electric shocks several times a week. Finally, Dr. Bowers measures the participants' anxiety levels at the end of the 10-week treatment period. Although only 29 of the 40 participants actually stay in the study for all 10 weeks, these 29 participants have lower anxiety levels than they did before the treatment. Dr. Bowers concludes that his treatment is effective.

 

What type of experimental design was used?

What one confound is most directly illustrated by this example?

What other confounds might exist?

 

See the discussion of Exercise #4 in the answers section below.

 

 

 

***********************************************************

 

 

Exercise #5: There's No Place Like Home


Fifty pregnant women respond to a newspaper advertisement placed by a researcher, which asks them to participate in a study on childbirth. The researcher asks each woman whether she is willing to volunteer to participate in a home birthing program (treatment condition), or instead wishes to undergo normal hospital procedures for childbirth (control group condition). Twenty-one of the women volunteer for the home birthing program, and the rest of the women choose the normal hospital procedure. Subsequently, during childbirth, women in the home birthing condition spend an average of 6 hours in labor, while those in the hospital birthing control group spend an average of 9 hours in labor.

 

What type of experimental design was used?

What one confound is most directly illustrated by this example?

What other confounds might exist?

 

See the discussion of Exercise #5 in the answers section below.

 

 

 

***********************************************************

 

 

 

Exercise #6: Restrain Your Child


Alarmed at the large number of children in their state being injured in car crashes during the first six months of 1997, state legislators debate a tough child restraint law during the fall of 1997. The law passes and goes into effect on January 1 of 1998. Injury rates are recorded every six months (January and June) and compared to limited existing data (prior two years only) from before the law. Although more data will continue to be collected, the researchers must file an initial report to the state legislature by the end of 1999. The dependent variable is the number of children aged 1-10 injured in car accidents, per thousand. On the basis of the following data, should the researchers conclude that the program had a significant effect?


 

 

What type of quasi-experimental design was used?

What should the researchers conclude?

What factors might account for the decline?

 

See the discussion of Exercise #6 in the answers section below.

 

 

 

***********************************************************

 

 

 

Exercise #7: Divide and Conquer

 

Ms. Connor’s third-grade class at Emerald Elementary School is having a difficult time learning how to divide numbers. On homework problems and class exercises, they seem to be struggling with division. So Ms. Connor develops a new approach to teaching division that relies heavily on real-world examples rather than abstract numbers on the chalkboard. She decides to test the effectiveness of this new method. In the first week of April, she gives her entire class a math exercise to perform that consists entirely of division problems. Overall, the children can only answer about 45% of the questions correctly. For the next month, Ms. Connor uses her new method to teach division to the children. Then she gives them a posttest that consists of division problems similar in difficulty to the problems on the pretest exercise. (Ms. Connor did not go over the answers to the pretest with the children, and it’s unlikely that the children would remember the specific questions a month later.) She finds on the posttest that the children, on average, answer 61% of the items correctly. Ms. Connor is pleased and concludes that her new teaching method works well.

 

What type of experimental design was used?

 

What confound provides the greatest threat to the internal validity of Ms. Connor’s “study”?

 

What other confounds might have influenced the results?

 

 

 



See the discussion of Exercise #7 in the answers section below.

 

***********************************************************


Discussion of examples (answers)

 

 

Exercise #1 Discussion: The Gun Ban


What type of experimental design was used?
Posttest-only design (quasi-experiment)

What is the most basic reason why we cannot accept the council members' conclusion that the gun ban made Hillsborough a safer place?
There is no information about the violent crime rate in Hillsborough for the year before the gun ban, and thus no basis for comparison. Because Hillsborough is newly incorporated, such data do not exist. The council could, however, take the same unincorporated geographic area (i.e., same boundaries) for the prior year and attempt to get the crime statistics from the County or Counties that had jurisdiction over that area.


 

 

 

***********************************************************

 

 

 

 

Exercise #2 Discussion: Dirty River


What type of experimental design was used?
Simple Interrupted time series design (quasi-experiment)

What one confound is most directly illustrated by this example?
Instrumentation (the accuracy of the field kits decreased over time and underestimated the level of pollution).

What other confounds might exist?
History (e.g., residents may be more likely to use pesticides in the spring and summer, when more gardening occurs; they may wash their cars more or change their own car oil more at this time of year…all of these actions could create more runoff). Regression to the mean is possible but if regression were operating, this should start to show up in the series of pretests (i.e., pollution levels should decline over the pretests).
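
To see how an instrumentation confound can mimic a treatment effect, consider the following rough sketch. All the numbers are hypothetical: the true pollution level is held constant at 100 units, and the field kit is assumed to lose 5 percentage points of sensitivity each month. The function name is made up for illustration, and the exercise does not say how fast the real kits degraded.

```python
def measured_pollution(true_level, sensitivity):
    """Reading from a kit that detects only a fraction of the true pollution."""
    return true_level * sensitivity

# HYPOTHETICAL values: the river's true pollution never changes,
# but the kit detects 5 percentage points less of it each month.
true_level = 100.0
sensitivities = [1.0 - 0.05 * month for month in range(10)]
readings = [measured_pollution(true_level, s) for s in sensitivities]

for month, reading in enumerate(readings, start=1):
    print(f"Month {month:2d}: measured pollution {reading:.0f}")
```

In this sketch the readings fall every month although the river is exactly as polluted at the end as at the start, so a declining series of measurements cannot, by itself, show that the regulations worked.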

 

 

 

 

***********************************************************

 

 

 

 

 

Exercise #3 Discussion: Another Foul River


What type of experimental design was used?
Simple Interrupted time series design (quasi-experiment)

What one confound is most directly illustrated by this example?
History: the "Save the Environment" media campaign may be causing the reduced pollution levels.

(Intentionally have not asked "What other confounds might exist," as this example overlaps with the previous one.)

 

 

 

 

***********************************************************

 

 

 

 

 

Exercise #4 Discussion: Generalized Anxiety Disorder


What type of experimental design was used?
One-group pretest-posttest design (quasi-experiment)

What one confound is most directly illustrated by this example?
Subject mortality/attrition (Just over 25% of the participants dropped out. Perhaps these participants felt that the treatment wasn't helping them, or was making them more anxious. Had they remained in the study, it is possible that the overall change in pretest-posttest scores might not have shown a significant drop.)

What other confounds might exist?
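Regression: If these people had clinical GAD, then it is likely that their scores on the pretest would be quite high relative to the population as a whole. Scores that extreme would be expected to move closer to the mean on the posttest even without any treatment.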
History: Other events during the 10 weeks might have caused participants to be less anxious overall, although there is no specific information to suggest this.
Testing: Taking the pretest may have influenced scores on the posttest.
Maturation: Unlike depression, GAD tends to be a more chronic disorder, so while it's possible that some of the improvement could be due to a maturational effect, it's not likely (I would not expect students to know this on an exam question. If there is an exam question that involves maturation, the scenario would use an example where maturation would stand out).
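
The attrition problem can be made concrete with a little arithmetic. The sketch below uses the 29 completers and 11 dropouts from the exercise, but the change scores themselves are invented: completers are assumed to improve by 5 points, and dropouts are assumed to have gotten 8 points worse had they stayed. Nothing in the exercise tells us what the dropouts' scores would actually have been.

```python
import statistics

# HYPOTHETICAL change scores (posttest minus pretest; negative = less anxious).
# The 29 completers are assumed to improve by 5 points each.
completers = [-5.0] * 29
# The 11 dropouts, had they stayed, are ASSUMED to worsen by 8 points each.
dropouts_if_stayed = [8.0] * 11

# What Dr. Bowers sees: only the people who finished.
observed_change = statistics.mean(completers)
# What he would have seen if nobody had dropped out.
full_sample_change = statistics.mean(completers + dropouts_if_stayed)

print(f"Change among completers only:  {observed_change:.2f}")
print(f"Change among all 40 enrollees: {full_sample_change:.2f}")
```

Under these assumed numbers the apparent improvement shrinks from 5 points to about 1.4 points once the dropouts are counted, which is why results based only on those who finish can overstate a treatment's benefit.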

 

 

 

 

***********************************************************

 

 

 

 

Exercise #5 Discussion: The Home Birthing Program


What type of experimental design was used?
Non-equivalent control group design (quasi-experiment)

What one confound is most directly illustrated by this example?
Selection. The researcher did not randomly assign the women to the treatment versus control conditions. Rather, the women chose the procedure that they preferred. The women who volunteered for home birthing may be different in some significant way (i.e., more adventurous and willing to try new things; less anxious in general, more physically hardy, eating a healthier diet) than the women who chose the typical hospital procedure.

What other confounds might exist?
History is the most likely other confound. It is possible that the two groups of women, on average, had different experiences leading up to childbirth that might have influenced the time they spent in labor.
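
A short simulation can show how self-selection alone might produce a gap between the groups. Everything here is an assumption made for illustration: a single made-up trait ("hardiness") shortens labor and also makes a woman more likely to choose home birth, while the place of birth has no effect at all on labor time. The function name, the coefficients, and the sample size are hypothetical.

```python
import random
import statistics

def simulate_self_selection(n=5_000, seed=1):
    """Labor times when birthing location has NO effect but choice is not random."""
    rng = random.Random(seed)
    home, hospital = [], []
    for _ in range(n):
        hardiness = rng.gauss(0, 1)  # hypothetical pre-existing trait
        # Labor time depends on the trait only, never on where the birth happens.
        labor_hours = 8.0 - 1.0 * hardiness + rng.gauss(0, 1)
        # Hardier women are more likely to volunteer for home birth.
        chooses_home = hardiness + rng.gauss(0, 1) > 0.3
        (home if chooses_home else hospital).append(labor_hours)
    return statistics.mean(home), statistics.mean(hospital)

home_mean, hospital_mean = simulate_self_selection()
print(f"Mean labor, home birth group:     {home_mean:.1f} hours")
print(f"Mean labor, hospital birth group: {hospital_mean:.1f} hours")
```

In this sketch the home birth group should show shorter labor on average simply because of who chose it. Random assignment would break the link between the trait and the condition, which is exactly what a non-equivalent control group design lacks.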

 

 

 

 

***********************************************************

 

 

 

 

Exercise #6 Discussion: Restrain Your Child


What type of quasi-experimental design was used?
Simple interrupted time series design

What should the researchers conclude?
The number of injuries clearly declines after the law is enacted, but this merely continues the same rate of decline that was occurring during the two years before the law was enacted. There is some variability in both the pre- and post-periods, but in general the rate of decline over the 4-year period is steady. It does not look as if the law, by itself, is producing any change to this declining rate.

What factors might account for the decline?
Regression to the mean might account for some of the decline in the early years, but as the numbers keep dropping such regression would be weaker and weaker. Yet, the decline remains steady.

History effects could include steady improvement in automotive technology that makes cars more crashworthy, so that fewer children and adults are injured in car accidents (perhaps especially in low-speed accidents).

It is also possible that the child restraint law did have an effect, but that the effect was masked by history or other confounds. Suppose that at the start of the time series, regression played a significant role in the drop in injuries from January '96 to June '97. As regression became weaker in the post-law period, the drop in injuries would be expected to level off. But perhaps, from Jan '98 to June '99, more parents are indeed using child restraints in cars, and thus the continued steady decline is really due to the child restraint law. Of course, if some major improvements in car safety technology occurred during the posttest period but not the pretest period, then this history confound might explain why the injury rate continued to decline after regression to the mean might have ceased being a factor.
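
One way to check whether a post-law decline "merely continues" a pre-law trend is to fit a line to the pre-law measurements and project it forward. The sketch below does this with ordinary least squares. The injury rates are HYPOTHETICAL values invented for illustration (they are not the data from the exercise), and the helper name fit_line is also made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a simple trend line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# HYPOTHETICAL injury rates per thousand, one value per six-month period.
pre_periods = [0, 1, 2, 3]            # the four pre-law measurements
pre_rates = [20.0, 18.5, 17.5, 15.5]
post_periods = [4, 5, 6, 7]           # the four post-law measurements
post_rates = [14.5, 12.5, 11.5, 10.0]

# Fit the trend on pre-law data only, then project it into the post-law period.
slope, intercept = fit_line(pre_periods, pre_rates)
projected = [intercept + slope * t for t in post_periods]
gaps = [observed - expected for observed, expected in zip(post_rates, projected)]

for t, observed, expected in zip(post_periods, post_rates, projected):
    print(f"Period {t}: observed {observed:.1f}, projected from pre-law trend {expected:.1f}")
```

With these made-up numbers the post-law observations sit close to the projected line, which is the pattern described in the answer above: the decline continues, but there is no visible break at the point where the law took effect. A clear drop below the projection would be the kind of evidence that favors a real effect, subject to the confounds already discussed.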


 

 

 

***********************************************************

 

 

 

 

 

Exercise #7 Discussion: Divide and Conquer

 

What type of experimental design was used?
One-group pretest-posttest design (quasi-experiment)

 

What confound provides the greatest threat to the internal validity of Ms. Connor’s “study”?
Maturation: The children are gaining more experience with division problems during the month-long intervention period, regardless of whether Ms. Connor uses her new method or the old method. This gain in experience falls under the “maturation” category and could easily explain the results. Note also that the children are maturing biologically; their brain and cognitive development is progressing in general, and while we couldn’t possibly know how much, if any, of the 16-percentage-point improvement (from 45% to 61% correct) is due to this biological maturation, it certainly is a possible contributor (imagine if she had used the new method for 4 months rather than 1 month, which would give biological maturation an even greater possible role).

 

What other confounds might have influenced the results?
History: It is possible that other events took place, overall, in the children’s lives that might have influenced their performance, including on the motivational side. Ms. Connor would need to be sure, for example, that the intervention period didn’t come just before or after a holiday break.

 

Testing?: Perhaps you said “testing” was the greatest confound, in your answer to the previous question. It is the case that the pretest and posttest were similar, and thus the experience of taking the pretest might have influenced children’s posttest performance (e.g., maybe they felt more relaxed). However, we must take the context into account. The fact that these children have been doing division problems in their homework assignments and in class exercises – even before the pretest – reduces the possibility that testing is a confound, though it is still possible. If the pretest was the first “test” or exercise in which the children had done division problems, then it would be a candidate for a major confound.

 

Regression? On average, the initial scores (45% correct) are not extreme, and it is unlikely that regression is operating here. But suppose that in Ms. Connor’s class (25 students), she chose the 5 worst performers and used her new method with them and her regular method with the other students. If these 5 students improved more than the rest of the class, then in this case regression would be a likely confound.