Glossary
|
|
A-B-A reversal design | A single-subject design in which measurements are made of the target behavior during a baseline period (A), then an experimental treatment (B) is given and the target behavior is measured again, and finally the experimental treatment is removed (return to A) and the target behavior measured. Example: You may want to use this design when you are interested in residual effects of the treatment that may still be in place once the treatment is removed. A study of substance abuse treatment may use an A-B-A reversal design to establish a baseline level of use, to see how the use changes during treatment, and to determine if the treatment effects persist once the individual is no longer receiving treatment. |
A-B-A-B reversal design | An extension of the single-subject reversal design (see above) in which the final phase is a reintroduction of the experimental treatment (B). This design is often employed when it is desirable to reinstate a beneficial treatment at the end of the study. The ABAB reversal design should not be confused with ABBA counterbalancing (see below). |
ABBA counterbalancing | Complete within-subjects counterbalancing used in within-subjects designs. Conditions A and B are administered to each participant in the ABBA or BAAB order. Not to be confused with the ABAB reversal design. |
Abscissa | The horizontal axis (or x-axis) in a graph |
Abstract | Summary of approximately 100 words at the beginning of a research article. A good abstract provides a definition of the problem being studied, the primary hypothesis (or hypotheses), a summary of results, and a brief statement of the implications of these results. |
Accidental or convenience sampling | Selection of participants on the basis of how readily "available" and willing they are to respond. This method does not involve probability sampling. Example: Asking willing volunteers at a super mall to fill out a questionnaire on product preferences. |
Alpha level | Acceptable level of Type I error. The probability of rejecting the null hypothesis when the null hypothesis is true. The researcher sets the alpha level at the beginning of the experiment. Example: If you set the alpha level at .05 before a study of treatments for depression, you are accepting the 5% possibility that you will make an error in concluding that the treatment has an effect, when in fact it has no effect. |
Alternative hypothesis (H1 or HA) | The hypothesis that expresses the researcher's hunch. When evidence against the null hypothesis is strong enough, the alternative hypothesis is said to be supported. |
ANCOVA | The "analysis of covariance" is a statistical methodology allowing researchers to compare the means of two or more groups when it is thought that differences among participants on a third variable might obscure the effect of the independent variable. In this case, the values of the third variable are "covaried out" of the analysis, and the analysis of variance then proceeds on the adjusted means of the groups. |
ANOVA F-Test | The "analysis of variance" statistical methodology. This method allows researchers to determine whether any differences among group means are reliable in order to evaluate the effect of the independent variable. The analysis of variance method can also be used to test for interactions between two or more independent variables in a factorial design. |
APA format | Publication style and format specified by the American Psychological Association (APA) |
Applied research | Research to provide solutions to practical problems. Applied research is often contrasted with basic research. |
A priori tests | Statistical tests that are planned before the experiment is performed and therefore are part of the design of the experiment; contrasted with post hoc tests, which are decided upon after the data have been examined. |
Archival research | The systematic investigation of documents or records that are accumulated and maintained by individuals or organizations. |
Attrition | Potential confounding variable and threat to internal validity in research. Attrition is loss of research participants from a study before it is completed. Also called subject mortality. See also selective subject loss. |
|
|
Baseline | A measure of the dependent variable as it occurs without the experimental manipulation. Used as a standard of comparison in single-subject designs. |
Basic research | Fundamental or pure research. Basic research is carried out to add to knowledge but without immediate applied or practical goals. Basic research is often contrasted with applied research. |
Between-subjects design | Also known as a between-groups design. An experimental design in which each participant is tested under only one level of each independent variable. Two or more groups of participants are formed at random from a pool of available participants. The independent-groups design is a type of between-subjects design. |
Box-plot | A graphical summary of data using a rectangle to indicate the first and third quartiles. A line inside the rectangle indicates the median. Lines extending from the sides of the rectangle ("whiskers") commonly represent the limits of the data within 1.5 interquartile range units from the nearest quartile. |
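The quantities a box plot displays can be computed directly. The following is a minimal sketch, assuming Python's standard `statistics` module and its "inclusive" quartile convention (other conventions give slightly different quartiles); the function name `box_plot_summary` and the scores are hypothetical.

```python
import statistics

def box_plot_summary(data):
    """Return the values a box plot displays for a list of numbers."""
    # Quartiles by the "inclusive" method (an assumption; conventions differ).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers reach the most extreme data points within 1.5 IQR of the box.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "beyond_whiskers": [x for x in data if x < lower_fence or x > upper_fence],
    }

scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
summary = box_plot_summary(scores)
print(summary)
```

With these scores the box spans 4 to 8, the median line sits at 6, the whiskers end at 2 and 9, and the score of 30 falls beyond the upper whisker.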
|
|
Carryover effect | In within-subject designs, the undesirable effect that testing participants in one condition has on their later behavior in another condition. Carryover effects are a primary reason that researchers use counterbalancing. See also practice effect. |
Case study | An intensive investigation of the current and past behaviors and experiences of a single person, family, group, or organization. |
Ceiling effect | An undesirable measurement outcome occurring when the dependent measure puts an artificially low ceiling on how high a participant may score. Example: A memory test that assesses how many words a participant can recall has a total of five words that each participant is asked to remember. Because most individuals can remember all five words, this measure has a ceiling effect. See also "floor or basement effect." |
Central tendency | Average or typical score in a distribution of scores. Three measures of central tendency are the mean, median, and mode. |
Chi square test for independence | The chi-square test for independence is a statistical method for the comparison of proportions or frequencies across two or more groups. |
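As an illustration of the chi-square test for independence, the sketch below computes the test statistic from a table of observed frequencies. Expected frequencies are derived from the row and column totals. The function name `chi_square_independence` and the frequencies are hypothetical, and the sketch stops at the statistic and its degrees of freedom rather than computing a p-value.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a frequency table.

    `table` is a list of rows; each row is a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical frequencies: rows are two groups, columns are two response categories.
observed = [[30, 10],
            [20, 40]]
chi2, df = chi_square_independence(observed)
print(round(chi2, 2), df)
```

For this table the statistic is about 16.67 with 1 degree of freedom, which exceeds the critical value of 3.84 for an alpha level of .05.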
Closed ended question | Item in a survey or questionnaire which provides specific response alternatives (e.g., multiple choice answers). Also known as "forced choice items." |
Cohort | The people (and society in general) living at the time a given individual is developing. Example: If you grew up in the '80s, you and your cohort probably have some association with the phrase, "Where's the beef?" |
Cohort effect | The effect of belonging to a given generation (e.g., the '60s generation). Sometimes people mistakenly assume that differences between people of different age groups are the result of biological aging when the differences are really due to the groups having different backgrounds because they grew up in different eras. |
Compensation | In social science research, the behavior of participants in a control group that is directed toward making up for being deprived of a desired treatment. May be a problem in field research when the treatment is a training program or some other desired treatment. |
Complex design | See factorial design. |
Confidence interval | A range of values around a sample statistic (e.g., the sample mean) for which there is a specified degree of confidence that the population parameter (e.g., the population mean) falls within the interval. The 95% and 99% confidence intervals are most often used. |
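A confidence interval for a mean can be sketched as follows. This example assumes a normal approximation (a z multiplier from the standard library's `NormalDist`); for small samples a t multiplier would give a somewhat wider interval. The function name `mean_confidence_interval` and the scores are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) limits around the sample mean, normal approximation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean, using the sample standard deviation.
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided multiplier, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

scores = [98, 102, 101, 97, 100, 103, 99, 100, 104, 96]  # hypothetical data
low_95, high_95 = mean_confidence_interval(scores, 0.95)
low_99, high_99 = mean_confidence_interval(scores, 0.99)
print(round(low_95, 2), round(high_95, 2))
print(round(low_99, 2), round(high_99, 2))
```

The 99% interval is wider than the 95% interval: greater confidence that the interval contains the population mean comes at the cost of a less precise range.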
Confounded variables | Variables that covary with the independent variable of interest and that thereby "confound" the conclusion as to the effect of the independent variable on the dependent variable. Example: A researcher conducts a study of how the neighborhood in which one grows up influences performance on an intelligence test. This researcher has overlooked many confounding variables: to name just two, one's neighborhood is related to one's socioeconomic status and schooling opportunities. |
Construct | Generalized concept, such as anxiety, that the investigator wishes to measure or manipulate. |
Construct validity | The degree to which a study measures (or manipulates) the construct that the researcher claims it does. |
Contamination | Problem that occurs if there is communication about an experiment between groups of participants assigned to different conditions. |
Contingency table | A cross-classification of variables that are discrete (or categorical). The entries in a cell of the table correspond to the number of observations that share the characteristics defined by the given row and column. |
Control group | Participants who don't receive the experimental treatment. These participants are compared to the treatment group to determine whether the treatment had an effect. See also "placebo." |
Convenience sampling | See accidental sampling. |
Convergence principle | We base research conclusions on evidence from a number of slightly different sources. Convergence lets us reach reasonably strong conclusions despite the flaws in individual research studies. |
Correlation | A statistical technique for determining the degree of association between two or more variables. |
Correlation coefficient | A number that can vary from -1.00 to +1.00 and indicates the degree and direction of relation between two variables. |
Correlational design | Designs that are used to establish the relationship between two variables without the ability to infer causal relationships. |
Counterbalancing | Refers to any technique used to vary systematically the order of conditions in an experiment to distribute the effects of time of testing (e.g., practice and fatigue) so they are not confounded with the conditions. |
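One way to counterbalance is to use every possible order of the conditions equally often (complete counterbalancing). The sketch below generates those orders and cycles participants through them. The function names and participant labels are hypothetical, and in practice participants would also be assigned to orders at random.

```python
from itertools import permutations

def complete_counterbalancing(conditions):
    """Return every possible order of the given conditions."""
    return list(permutations(conditions))

def assign_orders(participants, conditions):
    """Cycle through the orders so that each is used equally often."""
    orders = complete_counterbalancing(conditions)
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

orders = complete_counterbalancing(["A", "B", "C"])
schedule = assign_orders(["P1", "P2", "P3", "P4", "P5", "P6"], ["A", "B", "C"])
for participant, order in schedule.items():
    print(participant, "".join(order))
```

With three conditions there are six orders, so six participants (or a multiple of six) are needed for each condition to appear equally often in each position.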
Covary | To vary or change together |
Covariance | Covariance is a measure of how two variables or two data sets vary with respect to each other. Where the variance is the average of the squared deviations of data points from their mean, the covariance is the average of the products of the deviations of pairs of data points from their means. |
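The two averages described in this entry can be written out directly. This is a minimal sketch that divides by the number of observations, as the definition states; estimates from a sample usually divide by one less than that number. The function names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Average of the squared deviations from the mean."""
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def covariance(xs, ys):
    """Average of the products of paired deviations from the two means."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

xs = [1, 2, 3, 4, 5]  # hypothetical paired observations
ys = [2, 4, 5, 4, 5]
print(variance(xs))        # 2.0
print(covariance(xs, ys))  # 1.2
print(covariance(xs, xs))  # 2.0, the covariance of a variable with itself is its variance
```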
Cross-tabulation | Procedure for organizing frequency data that displays the relationship between two or more nominal variables. A cross-tabulation table contains individual cells, with the number in each cell representing the frequency of participants who show that particular combination of characteristics. |
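A cross-tabulation can be built by counting how many participants show each combination of two nominal variables. The sketch below is one way to do this with the standard library; the function name `cross_tabulate`, the variable names, and the records are all hypothetical.

```python
from collections import Counter

def cross_tabulate(records, row_key, col_key):
    """Count records for each combination of two nominal variables."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    row_labels = sorted({rec[row_key] for rec in records})
    col_labels = sorted({rec[col_key] for rec in records})
    # Each cell holds the frequency for one row/column combination.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

# Hypothetical survey records.
records = [
    {"year": "first-year", "housing": "on-campus"},
    {"year": "first-year", "housing": "on-campus"},
    {"year": "first-year", "housing": "off-campus"},
    {"year": "senior", "housing": "off-campus"},
    {"year": "senior", "housing": "off-campus"},
    {"year": "senior", "housing": "on-campus"},
    {"year": "senior", "housing": "off-campus"},
]
table = cross_tabulate(records, "year", "housing")
for row_label, cells in table.items():
    print(row_label, cells)
```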
|
|
Data | Plural noun that refers to information gathered in research (singular form: datum). |
Debriefing | Explanation provided by the researcher to human participants after they complete a study. |
Deduction | Reasoning from the general to the particular. In deductive reasoning specific predictions are made about future events. |
Demand characteristics | Any aspect of the situation created by the researcher that suggests to the participants what behavior is expected. |
Demographics | Characteristics of a group, such as sex, age, and social class. |
Demoralization | An effect from participants knowing that they are being denied the preferred treatment. Participants may then perform/behave differently than if they had not been aware that they were denied the preferred treatment. |
Dependent variable | The dependent variable is the one that is evaluated/measured as a result of an experiment. It is the specific behavior of participants that is observed and measured. (The word "behavior" in this context can be understood at many different levels: for example, the behavior under study might be a reaction time, a blood level of a certain hormone, the percent of items answered correctly on a test, the number of times someone smiles, or the person's score on a pencil and paper measure of anxiety.) |
Descriptive statistics | Statistics that serve to describe the properties of a particular set of scores. The mean and the standard deviation are examples that describe, respectively, the central tendency and the variability of a group of measurements. |
Differential transfer | A problem in within-subject designs if exposure to one condition may produce persistent consequences that influence a participant's response to subsequent conditions. The problem arises when performance in one condition differs depending on the specific condition that precedes it. See also carryover effect. |
Double-barreled question | Several questions embedded into a single question; e.g., Are you happy and completely satisfied? |
Double-blind | Situation in which both the data collector and the participant are unaware of the research hypothesis and the participant's group membership in the experiment. |
|
|
Ecological validity | The extent to which research is conducted in a situation that is similar to the everyday life experiences of the participants. |
Empirical | Relying upon or derived from observation or experiment. |
Epidemiology | Branch of medical science that deals with the incidence, distribution, and control of disease in a population. |
Ethology | The systematic study of behavior; usually animal behavior in natural settings. |
Experimenter bias | Any effect that the expectations of the researcher might have on the measurement and recording of the dependent variable. Uncontrolled experimenter bias threatens the validity of research. |
External validity | The degree to which the results of a study can be generalized to other participants, settings, and times. A study with external validity is generalizable. |
Extraneous variable | Any uncontrolled factor that is not of interest to the researcher but could affect the results. |
|
|
Face validity | When a measure intuitively seems to measure what it is supposed to measure. A measure has face validity to the extent it is a plausible measure. |
Factorial design | A research design in which there is more than one independent variable and each independent variable is present at every level of every other independent variable. Also called complex design. |
Field experiment | An experiment performed in a non-laboratory setting. |
Fixed-alternative item | Test or questionnaire items in which the response alternatives have been predetermined by the researcher; e.g., multiple-choice and true/false items. |
Floor or basement effect | The dependent measure artificially restricts how low scores can be. See also "ceiling effect." |
|
|
Generalizability | The ability to extend a set of findings observed in one research setting to other situations and groups. A study which has external validity is one that is generalizable to other situations or participants. |
|
|
Hawthorne effect | Situation in which participants' behavior is influenced by their knowledge that they are in an experiment and therefore of some importance to the researcher. Changes in a participant's behavior brought about by interest shown in the person rather than, or in addition to, any research manipulation. |
Heterogeneous | Dissimilar; varied or mixed |
Histogram | A graphical summary used to represent the distribution of data. Data values are generally grouped into intervals of equal width, and bars above these intervals represent the number of observations in each interval. |
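History | Potential confounding variable and threat to internal validity in research. History represents any change in the dependent variable that is caused by events occurring outside the study during its course rather than by the independent variable. Example: [...] to encourage helmet use. |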
Homogeneous | Similar. |
Hypothesis | A testable prediction. |
Hypothesis testing | The process in scientific research of choosing between the null and alternative hypotheses. |
|
|
Idiographic approach | The intensive study of an individual (contrast with nomothetic approach). |
Independent variable | The variable manipulated by the experimenter. |
Independent-groups design | See between-subjects design. |
Induction | Process of reasoning from a part to a whole as might be performed when data from a particular study are used to develop a general theory. |
Inferential statistics | Mathematical and logical methods for determining the properties of one or more populations from an inspection of samples drawn from the populations. Inferential statistics is based on probability theory and descriptive statistics. |
Informed consent | Potential participants must be in a position to decide whether or not to participate in an experiment. They consent to take part (or don't consent) based on prior information about the research. |
Instrumentation | Potential confounding variable and threat to internal validity in research. Instrumentation occurs when the measuring instrument changes over time. Example: A researcher tests a four-week program for encouraging weight loss. The researcher measures the participants' body weights with a scale that is calibrated at the start of the study but, unknown to the researcher, weighs "5 pounds too light" during the final test of body weight. At the end of the program the researcher finds that participants average 5 pounds lighter than at the start and the researcher concludes the weight-loss program is a success. |
Interaction effect | In a factorial design an interaction occurs between two independent variables when the effect of one independent variable on the dependent variable depends on the level of the other independent variable. |
Internal validity | The extent to which changes in the independent variable can be deemed responsible for changes in the dependent variable. |
Interquartile range | A measure of dispersion. Equal to the difference between the score in the distribution at the 75th percentile and the score at the 25th percentile. |
Interval scale | Scale of measurement in which the distance between any two adjacent scores is the same as the distance between any other two adjacent scores, but zero is not a true zero. An example of an interval scale is temperature measured in either centigrade or Fahrenheit. |
Interrater reliability | Correlation between ratings of two or more raters in a research study. |
|
|
Leading question | Questions structured to lead respondents to the answer the researcher wants; e.g., "You like this university, don't you?" |
Likert scale item | A question in a survey which requires a Likert-scaled response. A Likert scale measures the extent to which a person agrees or disagrees with a statement. The most common scale is 1 to 5. Often the scale will be 1=strongly disagree, 2=disagree, 3=not sure, 4=agree, and 5=strongly agree. |
Loaded question | A questionnaire item that contains emotionally charged words that bias the respondent's interpretation of the item; e.g., "Do you HATE early morning classes?" |
Longitudinal study | Measurement of a single group of people with the same or similar instruments on successive occasions. |
|
|
Main effect | In a factorial design, a main effect is the effect of one of the factors, independent of the effect of any other factor or interaction between factors. In a two-way factorial design, there are main effects of the row factor (commonly designated Factor A) and of the column factor (commonly designated Factor B). |
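Main effects and an interaction can be read from the cell means of a two-way design. The sketch below uses a hypothetical 2 x 2 table of cell means and assumes equal numbers of participants per cell, so that marginal means are simple averages. It is descriptive only; an analysis of variance would be needed to judge whether the effects are reliable.

```python
def marginal_mean(cell_means, factor_index, level):
    """Average the cell means that share one level of one factor."""
    values = [m for levels, m in cell_means.items() if levels[factor_index] == level]
    return sum(values) / len(values)

# Hypothetical cell means, keyed by (level of Factor A, level of Factor B).
cell_means = {
    ("a1", "b1"): 10.0, ("a1", "b2"): 20.0,
    ("a2", "b1"): 30.0, ("a2", "b2"): 60.0,
}

# Main effect: difference between the marginal means of one factor.
main_effect_a = marginal_mean(cell_means, 0, "a2") - marginal_mean(cell_means, 0, "a1")
main_effect_b = marginal_mean(cell_means, 1, "b2") - marginal_mean(cell_means, 1, "b1")

# Interaction: the effect of B differs across the levels of A.
effect_of_b_at_a1 = cell_means[("a1", "b2")] - cell_means[("a1", "b1")]
effect_of_b_at_a2 = cell_means[("a2", "b2")] - cell_means[("a2", "b1")]
interaction_contrast = effect_of_b_at_a2 - effect_of_b_at_a1

print(main_effect_a, main_effect_b, interaction_contrast)
```

Here the effect of Factor B is 10 points at a1 but 30 points at a2, so the contrast is nonzero and the pattern of means suggests an interaction.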
Manipulation check | Measures included in an experiment to test the effectiveness of the independent variable. A way to check whether or not the different conditions operated in the way the researcher expected that they would. |
Matched-subjects design | Experimental design in which participants are matched on some variable assumed to be correlated with the dependent variable and then randomly assigned to conditions. Also called matched-groups design. |
Maturation | Potential confounding variable and threat to internal validity in research. Maturation involves changes in participants on the dependent measure during the course of the study that result from normal growth processes associated with the passage of time. Example: A researcher tests arm strength at the beginning of sixth grade, runs a fitness program during the sixth grade school year, and again tests arm strength at the end of sixth grade. Arm strength is higher at the end than the beginning of sixth grade. The researcher wants to claim that the fitness program caused increased arm strength but cannot rule out maturation as the reason. |
Mean | A measure of central location determined by adding all of the observations and dividing by the number of observations. |
Median | A measure of central location, determined by ordering the observations in the sample and then choosing the middle value or the average of the middle two values. |
Mixed designs (between- and within-subjects variables) | Factorial design in which at least one of the factors is a between-subjects factor and at least one of the factors is a within-subjects factor. |
Mixed designs (manipulated and non-manipulated variables) | Also known as mixed factorial design. Factorial design in which at least one of the factors represents a nonmanipulated independent variable and at least one of the factors represents a manipulated independent variable. The distinction between manipulated and nonmanipulated variables does not change the data analysis. However, the interpretation of the results must take into account the fact that some of the factors are nonmanipulated factors. These designs are sometimes called true-mixed designs. Also called mixed experimental and subject-variable designs. |
Mode | Most frequent score in a distribution. A measure of central tendency commonly used with nominally scaled variables. |
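The three measures of central tendency defined in this glossary (mean, median, and mode) can be compared on a single set of scores. This is a small sketch using Python's standard `statistics` module; the scores are hypothetical.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data

mean_score = statistics.fmean(scores)     # sum of the scores divided by their number
median_score = statistics.median(scores)  # average of the two middle values here
mode_score = statistics.mode(scores)      # most frequent score

print(mean_score, median_score, mode_score)
```

For these scores the mean is 5.0, the median is 4.0, and the mode is 3, which shows that the three measures need not agree.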
Mortality | Potential confounding variable and threat to internal validity in research. Subject mortality results from participants dropping out of a research study before it is completed either because they refuse to participate or because they cannot participate. Also called attrition. |
Multiple-baseline design | A single-case experimental design that involves recording two or more observations across time, but staggering the baselines. Baseline measures are established and treatment then is introduced at different times. There are three variations of a multiple baseline design. Multiple baseline across behaviors (different behaviors are observed), across participants (different participants are observed), or across settings (different settings are used). |
|
|
Natural-groups design | A multi-group study in which the grouping variable consists of a naturally occurring distinction (e.g., gender, personality). |
Naturalistic observation | A type of systematic observation in which the researcher unobtrusively records the behavior of unaware participants in their natural environments. Typically observation without researcher intervention. |
Negative correlation | Relationship between two variables in which an increase in one variable is generally associated with a decrease in the other variable. |
Nominal scale | Scale of measurement in which only categories are produced as scores. Examples are gender and political affiliation. |
Nomothetic approach | An approach to research that seeks to establish broad generalizations or laws that apply to large groups of individuals; typical performance of a group is emphasized (contrast with idiographic approach). |
Nonequivalent control group design | Similar to a natural groups design. In quasi-experiments, a comparison group that isn't determined by random assignment. |
Nonparametric statistics | Inferential statistical procedures that do not rely on estimating population parameters such as the mean and variance. |
Nonprobability sampling | In contrast to probability sampling, nonprobability sampling does not involve random selection of participants from a population. Accidental (convenience) sampling and purposive sampling are examples of nonprobability sampling methods. |
Null hypothesis | Generally, the hypothesis of "no effect/no difference" or "no relationship." The null hypothesis is directly tested in the null hypothesis significance test. When evidence against it is "strong enough" one rejects the null hypothesis. Otherwise, one fails to reject the null hypothesis. |
Null hypothesis significance testing | The process by which a researcher determines whether a finding can plausibly be explained as due solely to random variation (chance). If results can not plausibly be explained as due solely to chance, the null hypothesis is rejected in favor of an alternative hypothesis. |
Null results | Results that fail to disprove the null hypothesis. |
|
|
Obscuring factor | In a research study, these factors prevent the researcher from finding an effect when in fact the effect is a "real one" (Type II error). They may make the results so unclear that the researcher simply cannot draw conclusions. Obscuring factors include ineffective manipulations, ceiling and floor effects, and too much variability in the data produced by individual differences, measurement error, lack of control, and distractions in testing situations. |
Observer bias | Bias created by an observer seeing what the observer wants or expects to see. |
Observer effects | Observers, by being present, may influence and change the behavior of the people or animals that they observe. Those being "watched" may not behave as they normally would. The observer, whose goal is to see "typical behavior," sees something else instead. |
Open-ended items | Questions that don't provide fixed response alternatives. |
Operational definition | A definition that presents a construct in terms of observable operations that can be measured and utilized in research. |
Ordinal scale | A measurement scale in which objects or attributes are ordered but in which the intervals between points are not equal; an example is first, second, and third place finishers in a race. |
Ordinate | The vertical axis (or y-axis) in a graph. |
Outliers | Values of observations in a set of data that lie far from the other observations. Formal criteria exist for determining whether observations are outliers. These criteria can be mean based (i.e., values lying beyond 2.5 standard deviations from the mean) or median based (i.e., values lying beyond 2 interquartile range distances from the median). |
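The two criteria in this entry can be sketched as follows. The example assumes the sample standard deviation and the "inclusive" quartile convention of Python's `statistics` module; the function names and data are hypothetical.

```python
import statistics

def mean_based_outliers(data, k=2.5):
    """Values lying more than k standard deviations from the mean."""
    m = statistics.fmean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - m) > k * sd]

def median_based_outliers(data, k=2.0):
    """Values lying more than k interquartile ranges from the median."""
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return [x for x in data if abs(x - med) > k * iqr]

data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 40]  # hypothetical data
print(mean_based_outliers(data))    # [40]
print(median_based_outliers(data))  # [40]
```

The two criteria agree here, but they can differ: an extreme value inflates the mean and standard deviation, whereas the median and interquartile range are much less affected by it.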
|
|
P-Value | The probability of observing a test statistic as extreme as or more extreme than the observed value, assuming that the null hypothesis is true. P-values are commonly compared to a selected alpha (significance) level to determine whether results are statistically significant. |
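The idea of "as extreme as or more extreme than the observed value" can be made concrete with an exact permutation test, used here only as an illustration and not as the method this glossary prescribes. Under the null hypothesis the group labels are interchangeable, so the p-value is the proportion of all possible regroupings whose mean difference is at least as large as the observed one. The function name and data are hypothetical.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    as_extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        chosen = set(indices)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total

treatment = [12, 14, 15, 16]  # hypothetical scores
control = [8, 9, 10, 11]
p_value = permutation_p_value(treatment, control)
print(round(p_value, 4))
```

Of the 70 possible ways to split these eight scores into two groups of four, only 2 give a difference as large as the observed one, so the p-value is 2/70 (about .029), which is below an alpha level of .05.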
Parametric statistics | Statistical tests that make assumptions about the distribution of scores (e.g., normally distributed); these tests require interval or ratio data. |
Partial correlation | Shows what the value of the Pearson correlation coefficient between two variables would be in the absence of one or more other variables. For example, across a sample of children ranging in age from 6 to 12, height might correlate .65 with verbal ability. However, removing the influence of age on these two variables, the partial correlation between height and verbal ability is likely to decrease to nearly zero. |
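The height and verbal ability example can be worked through with the standard first-order partial correlation formula, which combines the three pairwise Pearson correlations. The correlation of .65 comes from the entry above; the two correlations with age (.80 each) are assumed values chosen purely for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z removed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

r_height_verbal = 0.65  # from the example in the entry above
r_height_age = 0.80     # assumed for illustration
r_verbal_age = 0.80     # assumed for illustration

r_partial = partial_correlation(r_height_verbal, r_height_age, r_verbal_age)
print(round(r_partial, 3))
```

Under these assumed values the partial correlation is about .03, close to zero, which matches the pattern the entry describes.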
Participant observation | Observation of behavior as it occurs by a researcher in a natural setting in which the researcher is an active participant. The researcher's role may be disguised or undisguised. |
Pearson product-moment correlation | Index of the degree of linear relationship between two variables where each variable represents interval or ratio data. |
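The Pearson coefficient can be computed from paired scores as the sum of the products of deviations divided by the square root of the product of the two sums of squared deviations. This is a minimal sketch; the function name `pearson_r` and the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between paired scores."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # hypothetical paired observations
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
r_perfect_negative = pearson_r(xs, [10, 8, 6, 4, 2])
print(round(r, 3), round(r_perfect_negative, 3))
```

The first pair of variables gives r of about .77 (a fairly strong positive linear relationship); the second gives -1.00 because the points fall exactly on a descending straight line.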
Pilot study | A type of practice run that precedes the actual study. |
Placebo | (Latin: to please) Although the term is used in a variety of ways, it is common to speak of a placebo control, which controls for the effects of suggestibility (e.g., in a drug study, one group [the placebo control] receives an inert pill but is treated in every other way like the group that receives the pill with the active ingredients). |
Plagiarism | Presenting another person's material as though it were your own. |
Population | Any clearly defined set of objects or events (people, occurrences, and animals). Populations usually represent all events in a particular class (e.g., all college students; all headache sufferers). |
Positive correlation | Relationship between two variables where one variable increases as the other variable increases. |
Post hoc analyses | Secondary analyses that evaluate relationships between variables not specifically hypothesized by the researcher prior to the study. |
Power (of a statistical test) | The probability of rejecting the null hypothesis in a statistical test when it is in fact false. |
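Power can be calculated exactly in simple cases. The sketch below assumes a one-tailed, one-sample z-test with a known population standard deviation, which is a simplification chosen so that only the standard library is needed. The function name `z_test_power` is hypothetical, and `effect_size` is the true mean difference expressed in standard deviation units.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-tailed, one-sample z-test (known standard deviation)."""
    standard_normal = NormalDist()
    # Critical value that the test statistic must exceed to reject the null.
    z_critical = standard_normal.inv_cdf(1 - alpha)
    # When the effect is real, the test statistic is centered on effect_size * sqrt(n).
    return 1 - standard_normal.cdf(z_critical - effect_size * math.sqrt(n))

power_n25 = z_test_power(effect_size=0.5, n=25)
power_n50 = z_test_power(effect_size=0.5, n=50)
print(round(power_n25, 3), round(power_n50, 3))
```

With a true effect of half a standard deviation, 25 participants give power of about .80. Increasing the sample size raises power, and when the effect size is zero the rejection probability equals the alpha level.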
Practice effect | A change that participants undergo with repeated testing. Practice effects may be positive (performance improves with familiarity) or negative (performance worsens with fatigue). A common concern in repeated measures designs. |
Predictive validity | How closely the measure we are considering is related to, or correlated with, the variable we want to estimate. |
Pretest-posttest design | Research designs in which participants are tested at two points in time: before the administration of the independent variable and again after the administration of the independent variable. |
Probability sampling | A probability sampling method is any method of sampling that utilizes some form of random selection of participants from a population. Each possible participant in the population has a known, nonzero chance of being selected to be in the sample; in simple random sampling, that chance is equal for everyone. Simple random sampling, stratified random sampling, cluster sampling, and systematic sampling are examples of probability sampling methods. Drawing names from a hat is also an example of probability sampling. |
Protocol | A recipe to be followed exactly in conducting a research project. |
Purposive sampling | Hand-picking the individuals to be sampled because they have special characteristics or (more commonly) because they are apt to provide needed information. This method does not involve probability sampling. |
|
|
Quartile | A value corresponding to a 25% point in a data set. The first quartile is the 25th percentile, the second quartile is the 50th percentile, and the third quartile is the 75th percentile. |
Quasi-experimental design | Quasi-experimental designs were developed to be used in situations in which true experiments are not feasible. They attempt to control as many threats to internal validity as possible. Quasi-experiments, like true experiments, involve the manipulation of an independent variable. That is, the institution of an experimental treatment. However, quasi-experimental designs lack at least one of the other properties that characterize true experiments, such as random assignment of participants to conditions and control over confounding variables. |
Quota sampling | Sampling the desired numbers of (meeting quotas for) certain types of people (certain age groups, certain ethnic groups, etc.). This method doesn't involve random sampling and usually gives a less representative sample than random sampling would. |
|
|
Random assignment | The process by which participants are allocated to treatments by a random process and not by any subjective or possibly biased approach. The use of random assignment makes it more likely that particular characteristics of the participants are randomly distributed over the groups of the experiment. |
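Random assignment can be carried out by shuffling the list of participants and then dealing them into conditions in turn. This is a minimal sketch; the function name `randomly_assign`, the participant labels, and the condition names are hypothetical, and the seed is fixed only so that the example is reproducible.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, participant in enumerate(shuffled):
        # Dealing in turn keeps group sizes as equal as possible.
        groups[conditions[i % len(conditions)]].append(participant)
    return groups

participants = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]
groups = randomly_assign(participants, ["treatment", "control"], seed=42)
print(groups)
```

Because chance alone decides who ends up in which group, no characteristic of the participants or preference of the researcher can systematically favor one condition.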
Random sampling | The selection of participants in an unbiased manner so that each potential participant has an equal possibility of being selected for the experiment. |
Range | Measure of dispersion reflecting the difference between the largest and smallest scores in a set of data. |
Ratio scale | With ratio scale numbers, the difference between any two consecutive numbers is the same (see interval scale). But in addition to having interval scale properties, in ratio scale measurement, a zero score means the total absence of a quality. With ratio scale numbers, you can meaningfully form ratios between scores. If IQ scores were ratio (they are not), you could say that someone with a 180 IQ was twice as smart as someone with a 90 IQ (a ratio of 2 to 1) but we cannot make this claim. However, for height measured in inches, which is a ratio scale, you may say that somebody 60 inches tall is twice as tall as somebody 30 inches tall. |
Regression line | A regression line is the straight line (y = slope*x + y-intercept) drawn through a scatterplot of data points (paired x,y values) that "best" describes the linear relationship between the x observations and the y observations. The most commonly used criterion for "best" fit is that the line minimizes the sum of the squared distances between the y values of the points and the y values on the regression line. This criterion is called the "least squares criterion." |
Reliability | Requirement that a measure be consistent and reproducible. |
Replication | To repeat a study with either no changes at all in the procedure (direct or exact replication) or carefully planned changes in the procedure (systematic replication). Researchers may also ask the same question but with different procedures (conceptual replication). |
(Sample) Response bias | The threat to the representativeness of a sample. Occurs when some participants selected to be in the sample systematically fail to take part or respond. A survey that claims to represent the opinions of all undergraduates because it was sent to all students but includes only responses from students who remember and are able to take the completed form to the administration building illustrates sample response bias. |
Restriction of range | To observe a sizeable correlation between two variables, both must be allowed to vary widely. Occasionally, investigators fail to find a relationship between variables, when in fact there is a relationship, because they study one or both variables only over a highly restricted range; e.g., considering just NFL offensive linemen and saying that weight has nothing to do with playing offensive line on the basis of the finding that great NFL offensive tackles don't weigh much more than poorer NFL offensive tackles. |
Retrospective study | A type of study by which a researcher examines data available prior to the beginning of the study to answer the research question. For example, if a researcher wanted to know whether certain characteristics of children with autism were recognizable before the age at which most children are diagnosed (2-3 years old), she might look at videotapes of their first birthday parties. |
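The least squares criterion described in the Regression line entry can be sketched in a few lines of Python. This is a minimal illustration, not a full regression analysis; the data values and the names `least_squares_line` and `sum_squared_errors` are hypothetical and chosen only for this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical paired observations (x = hours studied, y = quiz score)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(hours, scores)
best_sse = sum_squared_errors(hours, scores, slope, intercept)

# Any other straight line gives a larger sum of squared errors
other_sse = sum_squared_errors(hours, scores, slope + 0.1, intercept)
print(slope, intercept, best_sse, other_sse)
```

Comparing `best_sse` with `other_sse` shows what "best" means under this criterion: nudging the slope away from the fitted value increases the sum of squared errors.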
|
|
Sample | The people and situations that the researcher actually observes and measures; a subset of a population. |
Sampling | Process of drawing a sample from a population. Many sampling techniques are available including simple random sampling, stratified random sampling, and various nonrandom sampling techniques such as quota sampling. |
Sampling frame | The listing of sampling units (e.g., people, schools, hospitals) that might be sampled. |
Scatterplot | A graph made by plotting the scores of individuals on two variables (e.g., plotting each participant's height and weight). This graph provides a visual idea of what kind of relationship (e.g., positive, negative, zero, U-shaped) exists between the two variables. Each point plotted in a scatterplot represents a pair of scores for one participant. |
Selection | Potential confounding variable and threat to internal validity in research. Selection includes any process that may create groups not equivalent at the beginning of the study. |
(Sample) Selection bias | Threat to the representativeness of a sample. Occurs when the procedures used to select a sample result in the overrepresentation of a segment of the population or the under-representation of a segment of the population. A survey that claims to represent the opinions of all undergraduate students, but includes only responses from students living in dorms, sororities, and fraternities and ignores students who commute and live off campus illustrates sample selection bias. |
Selective subject loss | Happens when participants drop out differentially across conditions in an experiment. A particular concern if subject loss is the result of some characteristic of the participant that is related to the outcome of the study. |
Sensitivity | The degree to which a measure is capable of distinguishing between participants having different amounts of a construct. For example, if individuals really differ in "happiness" the measure should pick up and reflect the differences. |
Sensitization | After getting several different treatments and performing a task several times, participants may realize (become sensitive to) what the researcher's hypothesis is. |
Significance test | See null hypothesis significance testing |
Simple main effect | In a factorial design, the effect of one independent variable at one level of a second independent variable. |
Simple random sampling | Sampling procedure in which each possible sample of a specified size in the population has an equal chance of being selected. |
Single blind | In a single-blind experiment either the participant (if you're most concerned about subject bias) or the person running participants (if you're most concerned about experimenter bias) is unaware of who's receiving what level of the treatment. |
Single-subject designs | Designs that require only a single participant or single unit of observation (e.g., a single family or a single organization). |
Social desirability | A bias resulting from participants giving responses that they believe will make them look "good" rather than giving honest responses. |
Spearman rank-order correlation | Correlation coefficient that indexes the degree of relationship between two variables, each of which is measured on an ordinal scale of measurement. |
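Standard deviation | A measure of spread or variability, computed as the square root of the average squared difference between each observed value and the mean; that is, the sum of squared differences is divided by the sample size (or by the sample size minus one when a sample is used to estimate the population standard deviation) and the square root of the result is taken. See also "variance." |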
Standard error of the mean | A measurement of error when a sample mean is used to estimate the mean of the population from which the sample was drawn. The standard error of the mean is computed as the standard deviation of the original population divided by the square root of the sample size, $\sigma_{\bar{X}} = \sigma / \sqrt{n}$. One can see from this formula that the larger the sample size, the smaller the standard error of the mean. |
Statistical conclusion validity | The degree to which the statistical conclusions we reach about relationships in our data are reasonable. The most common threat to statistical conclusion validity is a lack of power in the statistical analysis. |
Statistics | A field of study dealing with the collection, analysis, interpretation, and presentation of numerical data. |
Stem and leaf display | A graphical summary representing the distribution of data. The stem and leaf display is similar to the histogram, but it retains the quantitative information from the entire sample. Stems represent groupings of numerical values, and leaves represent individual observations. |
Structured observation | Observation of behavior in a researcher-structured setting; for example, observing a mother explore a new toy with her child for 5 minutes, or observing a married couple discuss for 5 minutes a topic, such as finances, that the couple identifies as an area of conflict in the marriage. |
Stratified random sampling | Dividing the population into groups (strata) on certain characteristics (e.g., gender) and then randomly sampling from each of these groups, so that the sample is similar to the population in those respects (e.g., percentage of women). |
Subject biases (Subject effects) | Undesirable ways in which the participant can influence the results of a study (guessing the hypothesis and playing along, giving the socially correct response, etc.). |
Subject variable | Characteristics of people that can be measured or described but cannot be varied experimentally (e.g., ethnicity, gender). Also called an individual-differences variable. |
Systematic sampling | A variant of probability sampling. Useful when no sampling frame (list of possible participants) is available. Systematic sampling requires using every nth participant, starting from a randomly selected point. For example, a researcher studying traffic patterns during rush hour might record the type of vehicle and number of visible occupants for the 11th vehicle entering a freeway from an on-ramp and then record the same information for every 20th vehicle thereafter. |
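The computations described in the Standard deviation and Standard error of the mean entries can be worked through as follows. This is a small sketch on hypothetical scores. Note one assumption: the glossary formula uses the population standard deviation, which is rarely known in practice, so the sketch estimates the standard error from the sample standard deviation (divisor n - 1) instead.

```python
import math
import statistics

# Hypothetical scores for eight participants
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n

# Squared difference between each observed value and the mean
squared_devs = [(x - mean) ** 2 for x in scores]

# Dividing by n describes the spread of these scores themselves
population_sd = math.sqrt(sum(squared_devs) / n)

# Dividing by n - 1 estimates the spread of the population sampled from
sample_sd = math.sqrt(sum(squared_devs) / (n - 1))

# Standard error of the mean, estimated from the sample (assumption:
# the true population standard deviation is unknown)
sem = sample_sd / math.sqrt(n)

# The standard library gives the same two standard deviations
assert math.isclose(population_sd, statistics.pstdev(scores))
assert math.isclose(sample_sd, statistics.stdev(scores))

print(mean, population_sd, sample_sd, sem)
```

Because the divisor in `sem` is the square root of `n`, quadrupling the sample size (with the same spread) would halve the standard error.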
|
|
T-Test for dependent measures | The t-test is a statistical method that allows researchers to determine whether an observed difference in condition means is reliable in order to evaluate the effect of the independent variable. When the groups are sampled in such a way as to connect observations across the groups (such as with repeated measures on the same sample), the t-test is called the paired groups t-test or the dependent means t-test. |
T-Test for independent measures | The t-test is a statistical method that allows researchers to determine whether an observed difference between the two group means is reliable in order to evaluate the effect of the independent variable. When the groups are independently sampled, the t-test is called the independent groups t-test or the independent means t-test. |
Testing | Potential confounding variable and threat to internal validity in research. Testing represents any change in a participant's score on a dependent measure that is a function of the participant having been tested previously in the research project. |
Theory | A statement of a general principle which relates one aspect of nature to another. A way of explaining a phenomenon. |
Third variable problem | The possibility that some unobserved variable is responsible for the correlation between variables X and Y. |
Time-series design | A type of experimental strategy in which some target behavior of the same subject or subjects is measured at many different times. Example: A study of obsessive-compulsive disorder may ask participants to record their compulsive behaviors with a daily diary and use these results for analysis. |
True experiment | Research procedure in which research participants in two or more conditions are compared on a dependent measure with participants assigned without bias to each of the conditions. Within-subjects designs and between-subjects designs are true experiments. In between-subjects designs the participants are randomly assigned to conditions. In within-subjects designs participants are randomly assigned to the order in which they complete the conditions (and order is counterbalanced). |
True-mixed design | See mixed designs (manipulated and nonmanipulated variables). |
Type I error | Rejecting the null hypothesis when the null hypothesis is true. The probability of making this error is called alpha (the significance level). |
Type II error | Failing to reject the null hypothesis when the null hypothesis is false. The probability of making this error is called beta. |
Type III error | This type of hypothesis testing error occurs when a researcher conducts a non-directional (two-tailed) inference test and correctly rejects H0 (no difference between population parameters), but incorrectly concludes that the direction of the difference is reflected in the direction of the difference between the sample statistics. For example, this type of error would occur when the researcher decides that the mean of population 1 is greater than the mean of population 2 because the mean of sample 1 is greater than the mean of sample 2, when actually the mean of population 1 is less than the mean of population 2. |
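The difference between the two t-test entries above can be seen by computing both t statistics on the same numbers. This is a rough sketch on hypothetical data; the function names are illustrative, the independent-groups version assumes equal population variances (a pooled variance estimate), and no p-values are computed.

```python
import math
import statistics


def independent_t(group1, group2):
    """t statistic and degrees of freedom for two independent groups
    (assumes equal population variances; uses a pooled estimate)."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, n1 + n2 - 2


def dependent_t(first, second):
    """t statistic and degrees of freedom for paired observations,
    based on the difference score for each participant."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical scores: five participants measured before and after treatment
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]

# Correct analysis for repeated measures on the same participants
t_dep, df_dep = dependent_t(before, after)

# Same numbers treated as if they came from two separate groups
t_ind, df_ind = independent_t(after, before)

print(t_dep, df_dep, t_ind, df_ind)
```

With these numbers the dependent-measures t is much larger than the independent-groups t, because pairing removes the stable differences between participants from the error term.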
|
|
Validity | Accuracy of ideas and research; to what degree are these true and capable of support? |
Variance | Descriptive statistic that indicates a measure of the degree of variability among research participants for a given variable. The variance is essentially the average squared deviation from the mean and is the square of the standard deviation. |
|
|
Within-subjects design | Experimental design in which each participant receives all levels of the independent variable. Also called repeated-measures design. |
Within-subjects design with Concurrent Measures | This repeated-measures design has no treatment group. Each participant is confronted with all values of the independent variable and is allowed to choose among them. For example, a researcher might offer a baby monkey the choice of two mothers (wire with food and cloth with no food) and record the time spent with each mother. |