Chapter 1: Introduction

A few terms:

Data

Populations

Samples

Parameters

Statistics

Branches of Statistics:

Descriptive

Inferential

Classifying Data

Two types:

Qualitative

Quantitative

Levels of Measurement

Nominal

Ordinal

Interval

Ratio

Experimental Design

Data Collection

Experiment

Simulation

Census

Sampling

Sampling Techniques

Simple random sampling

Stratified sampling

Cluster sampling

Systematic sampling

Convenience sampling


Chapter 2: Frequency Distributions and Their Graphs

Frequency distribution key words:

Classes

Class size

Frequency (f)

Class width

Upper and lower class limits

Midpoints

Relative frequency

Cumulative frequency - ogive

Frequency histogram

Frequency polygon

Questions:

How do you find the mean from a frequency distribution? The standard deviation?
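One way to answer this question is to treat each class midpoint as if it occurred f times. The sketch below assumes sample data (dividing by n – 1) and uses made-up classes; the function name and the numbers are illustrative only.

```python
import math

def grouped_mean_and_sd(midpoints, freqs):
    """Approximate sample mean and standard deviation from class midpoints and frequencies."""
    n = sum(freqs)  # total number of data entries
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    # Sum of squared deviations, each weighted by its class frequency
    ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    sd = math.sqrt(ss / (n - 1))  # sample formula (assumption): divide by n - 1
    return mean, sd

# Hypothetical classes 0-9, 10-19, 20-29 with midpoints 4.5, 14.5, 24.5
mean, sd = grouped_mean_and_sd([4.5, 14.5, 24.5], [2, 5, 3])
print(mean, sd)
```

The result is only an approximation, because every entry in a class is replaced by the class midpoint.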

Other types of graphs

Stem-and-leaf plot

Pie chart

Pareto chart

Scatter plot

Measures of central tendency

Mean

Median

Mode

Weighted mean (relate to mean of a frequency distribution)

Shapes of distributions

Symmetric

Uniform

Skewed (left or right/negative or positive)

Measures of variation

Range

Variance

Standard deviation (population versus sample – calculate differently!)

What does the standard deviation mean?!?

Empirical rule

Chebychev’s theorem
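A short sketch of two items from this list, using Python's standard `statistics` module: the population versus sample standard deviation, and the Chebychev lower bound 1 – 1/k². The data values are made up for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration

pop_sd = statistics.pstdev(data)   # population formula: divides by N
samp_sd = statistics.stdev(data)   # sample formula: divides by n - 1

def chebychev_bound(k):
    """Minimum proportion of any data set within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2

print(pop_sd, samp_sd)
print(chebychev_bound(2))  # at least 75% of the data lies within 2 standard deviations
```

The sample value is always slightly larger than the population value for the same data, because it divides by a smaller number.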

Measures of position

Quartiles and the Interquartile Range

Percentiles


Chapter 3: Probability

Terms:

Probability

Outcome

Sample space

Event

Classical probability vs. Empirical probability (f/n)

Law of large numbers

Complement of an event

Conditional probability

What is P(B|A)?

Independence: P(B|A) = P(B)

Multiplication Rule

P(A and B) = P(A) · P(B|A)

Mutually exclusive events

Addition Rule: P(A or B) = P(A) + P(B) – P(A and B)
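The multiplication and addition rules can be checked with exact fractions. The sketch below assumes a standard 52-card deck; the card events are examples chosen here, not taken from the course notes.

```python
from fractions import Fraction

# Multiplication rule: P(A and B) = P(A) * P(B|A)
# A = first card is an ace, B = second card is an ace (drawn without replacement)
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
# A = card is a king, B = card is a heart (not mutually exclusive: the king of hearts)
p_king = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_and_heart = Fraction(1, 52)
p_king_or_heart = p_king + p_heart - p_king_and_heart

print(p_two_aces, p_king_or_heart)
```

If the two events were mutually exclusive, P(A and B) would be 0 and the addition rule would reduce to P(A) + P(B).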

Counting and Factorials!

Permutations

Combinations

With replacement and without replacement
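A quick sketch of the three counting cases, using `math.perm` and `math.comb` from the Python standard library (Python 3.8 or later). The values n = 5 and r = 3 are arbitrary.

```python
import math

n, r = 5, 3  # choose 3 items from 5 (arbitrary example values)

permutations = math.perm(n, r)     # order matters, no replacement: n! / (n - r)!
combinations = math.comb(n, r)     # order does not matter, no replacement: n! / (r! (n - r)!)
ordered_with_replacement = n ** r  # order matters, each item may be reused

print(permutations, combinations, ordered_with_replacement)  # 60 10 125
```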

 

 

 

 

Chapter 4: Discrete Probability Distributions

Mean, standard deviation, and expected value of a discrete probability distribution
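As a sketch, the mean (expected value) of a discrete distribution is the sum of x·P(x), and the variance is the sum of (x – μ)²·P(x). The distribution below is made up for illustration.

```python
import math

# Hypothetical discrete probability distribution: value -> probability
distribution = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

mean = sum(x * p for x, p in distribution.items())  # expected value E(x)
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
sd = math.sqrt(variance)

print(mean, variance, sd)
```

For this made-up distribution the mean is 1.7 and the standard deviation is 0.9, up to floating-point rounding.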

Discrete Distributions:

Binomial distribution

 

 

 

Mean, variance and standard deviation
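A minimal sketch of the binomial formulas, assuming the usual notation n (number of trials), p (probability of success), and q = 1 – p. The function name and the example values are illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.5  # arbitrary example values
mean = n * p                 # mu = np
variance = n * p * (1 - p)   # sigma^2 = npq
sd = math.sqrt(variance)     # sigma = sqrt(npq)

print(binomial_pmf(5, n, p), mean, variance, sd)
```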

 

 

Geometric Distribution


Poisson distribution
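The geometric and Poisson probabilities can be sketched the same way. This assumes the common textbook forms P(x) = p·q^(x–1) for the geometric distribution (first success on trial x) and P(x) = μ^x·e^(–μ)/x! for the Poisson distribution; the function names are illustrative.

```python
import math

def geometric_pmf(x, p):
    """P(first success occurs on trial x), for x = 1, 2, 3, ..."""
    return p * (1 - p) ** (x - 1)

def poisson_pmf(x, mu):
    """P(exactly x occurrences in an interval) when the mean number of occurrences is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)

print(geometric_pmf(3, 0.5))  # two failures, then a success
print(poisson_pmf(0, 2.0))    # no occurrences when the mean is 2
```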

 

 

Chapter 5: Normal Probability Distributions

A continuous distribution

Bell-shaped

 

Mean, median and mode are equal

Empirical rule

68% of area lies within 1σ of μ

95% of area lies within 2σ of μ

99.7% of area lies within 3σ of μ

 

Z scores (standard normal)

 

Finding area under the curve (to the left of Z)
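Instead of a printed z table, the area to the left of z can be looked up with `statistics.NormalDist` from the Python standard library. The sketch below assumes a made-up population with μ = 100 and σ = 15.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def area_left_of(z):
    """Area under the standard normal curve to the left of z."""
    return NormalDist().cdf(z)

z = z_score(130, 100, 15)  # hypothetical values: x = 130, mu = 100, sigma = 15
print(z, area_left_of(z))
```

The area to the right of z is 1 minus the area to the left, and the area between two z values is the difference of their left areas.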

 

Sampling distribution

Central limit theorem

Applying the Central limit theorem:

Normal approximation to binomial distribution

Rule of thumb for using it?

μ = np

σ = √(npq)

Continuity Correction
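A sketch of the normal approximation with a continuity correction: to approximate P(X ≤ x) for a binomial X, use the normal curve with μ = np and σ = √(npq), evaluated at x + 0.5. The exact binomial sum is included only for comparison; n = 100 and p = 0.5 are arbitrary.

```python
import math
from statistics import NormalDist

def binomial_cdf_normal_approx(x, n, p):
    """Approximate P(X <= x) for binomial X using a normal curve and continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(x + 0.5)  # include the whole bar for x

def binomial_cdf_exact(x, n, p):
    """Exact P(X <= x), summed from the binomial formula."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

approx = binomial_cdf_normal_approx(55, 100, 0.5)
exact = binomial_cdf_exact(55, 100, 0.5)
print(approx, exact)
```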


Chapter 6: Confidence Intervals

 

Point estimate, interval estimate, level of confidence (c)

 

E: Maximum error of the estimate

E = z_c σ/√n

 

Confidence interval for μ (n ≥ 30)
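A minimal sketch of a large-sample confidence interval for the mean, assuming the standard deviation is known (or estimated by s when n ≥ 30). The critical value z_c comes from `NormalDist().inv_cdf`; the function name and sample values are made up.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, c=0.95):
    """Return (lower, upper) endpoints: x_bar - E and x_bar + E."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)  # critical value for confidence level c
    margin = z_c * sigma / math.sqrt(n)      # maximum error of the estimate, E
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, standard deviation 10, n = 100
low, high = z_confidence_interval(50, 10, 100)
print(low, high)
```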

 

 

 

Confidence interval for μ (n < 30, σ unknown)

 

 

 

Confidence interval for population proportion, p

Rule of thumb for use

 

 

 

Finding minimum sample size to estimate μ

 

 

 

Finding minimum sample size to estimate p
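Both minimum-sample-size calculations can be sketched as follows, assuming the usual formulas n = (z_c·σ/E)² for a mean and n = p̂·q̂·(z_c/E)² for a proportion, rounded up to the next whole number. Using p̂ = 0.5 when no preliminary estimate exists is a common convention and is treated as an assumption here.

```python
import math
from statistics import NormalDist

def min_n_for_mean(sigma, E, c=0.95):
    """Smallest n so the margin of error for the mean is at most E."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    return math.ceil((z_c * sigma / E) ** 2)

def min_n_for_proportion(E, c=0.95, p_hat=0.5):
    """Smallest n so the margin of error for a proportion is at most E."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    return math.ceil(p_hat * (1 - p_hat) * (z_c / E) ** 2)

print(min_n_for_mean(10, 2))       # hypothetical sigma = 10, E = 2
print(min_n_for_proportion(0.03))  # hypothetical E = 0.03, no prior estimate of p
```

Always round up, never to the nearest integer, so the margin of error does not exceed E.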

 

 

 

When to use z, when to use t?

Confidence interval for σ and σ²

What are the guidelines?


Chapter 7: Hypothesis Testing with One Sample

Null and Alternative hypotheses

Equality is ALWAYS included in H0

Claim can be in either H0 or Ha

Hypotheses are ALWAYS about population parameters (NEVER about sample statistics)

Type I and II errors

Level of significance

Right-tailed, left-tailed and two-tailed tests (and determining critical values)

 

 

 

 

Rejection region and making a decision to reject or not reject H0

 

 

 

 

P-value

 

Finding critical values for Z, t, and χ² distributions

Hypothesis testing for proportions

 

Rule of thumb:

 

Z = (p̂ – p) / √(pq/n)
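A sketch of a one-sample z test for a proportion, using the standard test statistic and a two-tailed P-value. The function name and the sample counts are illustrative, and the rule of thumb for using the test is not checked by this code.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(x, n, p0):
    """Return (z, two-tailed P-value) for H0: p = p0, given x successes in n trials."""
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 60 successes in 100 trials, H0: p = 0.5
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(z, p_value)
```

Reject H0 when the P-value is less than or equal to the level of significance.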

 

 

Hypothesis testing for σ and σ²


Chapter 8: Hypothesis Testing with Two Samples

Two-sample z test for difference between means (n ≥ 30 for both samples)

z = [(x̄₁ – x̄₂) – (μ₁ – μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)

 

Two-sample t test for difference between means (n < 30 for at least one sample)

Equal variances

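t = [(x̄₁ – x̄₂) – (μ₁ – μ₂)] / [σ̂ · √(1/n₁ + 1/n₂)], where σ̂ = √{[(n₁ – 1)s₁² + (n₂ – 1)s₂²] / (n₁ + n₂ – 2)} and d.f. = n₁ + n₂ – 2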

 

Unequal variances

t = [(x̄₁ – x̄₂) – (μ₁ – μ₂)] / √(s₁²/n₁ + s₂²/n₂)

 

Testing the difference between means (dependent samples)

Use paired difference

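t = (d̄ – μ_d) / (s_d/√n), where d̄ and s_d are the mean and standard deviation of the paired differences and d.f. = n – 1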
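A sketch of the paired-difference test statistic for dependent samples: take the difference within each pair, then apply a one-sample t statistic to those differences. The before/after values are made up, and the hypothesized mean difference defaults to 0.

```python
import math
import statistics

def paired_t_statistic(before, after, mu_d=0.0):
    """Return (t, degrees of freedom) for the mean of the paired differences."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)  # mean of the differences
    s_d = statistics.stdev(diffs)   # sample standard deviation of the differences
    t = (d_bar - mu_d) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements on the same four subjects
t, df = paired_t_statistic([10, 12, 9, 11], [8, 11, 7, 10])
print(t, df)
```

The critical value still has to come from a t table with n – 1 degrees of freedom; the standard library has no t distribution.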

 

Difference between proportions

z = [(p̂₁ – p̂₂) – (p₁ – p₂)] / √[p̄q̄(1/n₁ + 1/n₂)], where p̄ = (x₁ + x₂)/(n₁ + n₂) and q̄ = 1 – p̄

Chapter 9: Correlation and Regression

Correlation coefficient r = [nΣxy – (Σx)(Σy)] / [√(nΣx² – (Σx)²) · √(nΣy² – (Σy)²)]

Correlation is not causality!

t-test for correlation coefficient

t = r / √[(1 – r²)/(n – 2)], with d.f. = n – 2

 

Linear regression

Regression line y=mx+b

Residuals sum to zero

m = [nΣxy – (Σx)(Σy)] / [nΣx² – (Σx)²]     b = ȳ – m·x̄

 

Coefficient of determination r² = explained variation (by line)/total variation
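The slope, intercept, and correlation coefficient can all be computed from the same five sums. The sketch below uses the standard sum-based least-squares formulas and made-up data; the function name is illustrative.

```python
import math

def linear_regression(xs, ys):
    """Return (m, b, r) for the least-squares line y = mx + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    b = sy / n - m * sx / n                        # intercept: y-bar minus m times x-bar
    r = (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    )
    return m, b, r

# Hypothetical data
m, b, r = linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(m, b, r, r ** 2)
```

For these made-up points r² is about 0.6, so the line explains roughly 60% of the variation in y.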

 

Standard error of estimate s_e = √[Σ(y – ŷ)² / (n – 2)]

 

Prediction interval for y at a particular value x = x₀: ŷ – E < y < ŷ + E

 

E = t_c · s_e · √[1 + 1/n + n(x₀ – x̄)² / (nΣx² – (Σx)²)], with d.f. = n – 2


Chapter 10: Chi-square and F tests

Goodness of fit tests

χ² = Σ (O – E)² / E

where O = observed values from the sample, E = expected values under the claim (null hypothesis)

This is a right tailed chi-square test with k-1 degrees of freedom (# categories –1)
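A minimal sketch of the goodness-of-fit statistic, summing (O – E)²/E over the categories. The observed and expected counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [20, 30, 50]  # hypothetical sample counts
expected = [25, 25, 50]  # hypothetical counts claimed by the null hypothesis

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1   # k - 1 degrees of freedom

print(chi2, df)  # 2.0 2
```

Compare the statistic with the right-tail critical value from a chi-square table for the chosen level of significance.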

Test for independence. Just like the goodness of fit test, except degrees of freedom = (r-1)(c-1), where r and c are the numbers of rows and columns. Each E value is (row sum)(column sum)/(total sample size). Also a right-tailed chi-square test.
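A sketch of the independence test for a contingency table stored as a list of rows. Expected frequencies come from row sum times column sum over the total; the 2×2 table is made up and the function names are illustrative.

```python
def expected_frequencies(table):
    """Expected count for each cell: (row sum)(column sum) / (total sample size)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]

def independence_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for row_o, row_e in zip(table, expected)
        for o, e in zip(row_o, row_e)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

table = [[10, 20], [30, 40]]  # hypothetical observed counts
print(expected_frequencies(table))
print(independence_chi_square(table))
```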

 

Comparing two variances. The ratio of sample variances from two normally distributed populations follows an F distribution.

F = s₁²/s₂²

Put the larger variance in the numerator. Can be either a one-tailed or two-tailed test. Critical F depends on the degrees of freedom of the numerator (n₁ – 1) and of the denominator (n₂ – 1).
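A short sketch of the two-variance F statistic that always places the larger sample variance in the numerator and reports the matching degrees of freedom. The function name and the sample values are made up.

```python
def f_statistic(var_a, n_a, var_b, n_b):
    """Return (F, numerator d.f., denominator d.f.) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Hypothetical samples: variance 4.0 with n = 10, variance 9.0 with n = 16
F, df_num, df_den = f_statistic(4.0, 10, 9.0, 16)
print(F, df_num, df_den)  # 2.25 15 9
```

Because the larger variance is on top, F is always at least 1, and the degrees of freedom follow whichever sample ended up in the numerator.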

Analysis of Variance

F= MSB/MS