MATH/STAT 394/5B, Autumn 2006/Winter 2007
Introduction to Probability Theory and its Applications


394 Assignments

I. Read Ross Chapters 1 and 2.

Chapter 1 is a review of elementary combinatorics, i.e., permutations and combinations, etc. The class lectures will begin with Chapter 2, covering the basics of a probability model: sample space, elementary outcomes, events, probability measure.

HW#1: due Wednesday Oct. 4.

Required (hand in):

Ch. 1 Problems 1,4,8,10,13,19, 21,22,24,30,31.
Exercises 5,8,12a,b.

Recommended (do not hand in):

Ch.1 Problems 2,3,12,14,26
Exercises 10,11,13,14,18,20,21.


II. Read Ross Chapter 2.

Notes:
1. Axiom 3 on page 30 is usually stated for both finite and countably infinite sequences of events E_1, E_2,.... Then the proof in the middle of page 30 that P(emptyset)=0 can be simplified by considering only two events, E_1=S and E_2=emptyset. (A short version of this argument is sketched after these notes.)
2. I will not discuss Section 2.7.
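
For reference, here is the two-event version of the argument in Note 1 (my phrasing, not Ross's): S and emptyset are disjoint and S = S union emptyset, so by additivity

P(S) = P(S union emptyset) = P(S) + P(emptyset),

and since P(S) = 1 is finite, it follows that P(emptyset) = 0.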

HW#2: due Wednesday Oct. 11.

Required (hand in):

Ch. 2 Problems 1,5,9,11,13,15,18,23,29,32,33,37,41,43,45,49,54. (Most of these are brief.)
Exercises 11,12,15.

Recommended (do not hand in):

Ch.2 Problems 2,3,8,10,12,14,16,17,19,20,25,28,31,34,35,40, 42,46,50,51,52,53,56.
Exercises 1-7 (these constitute a review of Boolean algebra, i.e., simple set-theoretic operations), 8,13,16,21.


III. Read Ross Chapter 3, Sections 3.1 - 3.3 (Conditional Probability).

Chapter 3 deals with the extremely important topic of conditional probability. Perhaps the most important concept in probability and statistics is dependence (also, correlation) among random events/variables, for the purpose of understanding and even predicting random phenomena. Conditional probability helps to measure the dependence between events. You will begin to notice a gradual increase in the level of scientific and mathematical sophistication in this Chapter and its problems/exercises.

Notes:

1. Equation (3.4) is called the Law of Total Probability. Equation (3.1) is a special case.
2. Equation (3.5) is Bayes' Formula, which is extremely important and useful. (Both it and the Law of Total Probability are restated in general form after these notes.)
3. A few of the Examples in Sections 3.2 and 3.3 are somewhat artificial, trivial, and/or verbose, obscuring the basic mathematical structure that we are studying (e.g., Examples 2a, 2e, 3l, 3n). Just look for the occurrences of conditional probability, the law of total probability, and Bayes' Formula - these are the main concepts here.
4. Example 3l (page 83) is an extension (perhaps overly complicated) of Example 2b (page 68). I would state the question simply as follows: If a couple is known to have exactly two children, what is the conditional probability that they are both boys, given that at least one is a boy? This is now identical to Example 2b(b).
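
For reference, here are general statements of the two results named in Notes 1 and 2 (my phrasing; they should agree with Ross's (3.4) and (3.5) up to notation). If B_1,...,B_n is a partition of the sample space S with each P(B_i) > 0, and A is any event, then

P(A) = P(A|B_1)P(B_1) + ... + P(A|B_n)P(B_n)    (Law of Total Probability),

and, provided P(A) > 0,

P(B_j|A) = P(A|B_j)P(B_j) / [P(A|B_1)P(B_1) + ... + P(A|B_n)P(B_n)]    (Bayes' Formula).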

HW#3: due Wednesday Oct. 18.
Most of these are again brief, but some require more thought.

Required (hand in):

Ch. 3 Problems 1,5,8,11,15,17,18,22,23,27,28,37,40,47,50
Exercises 1,3,7,8.
Also: Extra Exercise 3.1: (a) If A is independent of B and B is independent of C, is A independent of C?
(b) If A is independent of C and B is independent of C, is AB independent of C?
(This is a simplified version of Ross's Exercise 5.)

Comments:

(i) Ross Exercise 1 becomes more transparent if we define C=A\B, D=AB, E=B\A. Then C,D,E are mutually disjoint, and the desired inequality becomes: P[D | C ∪ D] > P[D | C ∪ D ∪ E].
(ii) The Hint in Exercise 3 is of interest in its own right. It is equivalent to the inequality (EX)(E(1/X)) ≥ 1, where X is a random variable with range {1,...,k} and probabilities proportional to n_1,...,n_k. This inequality holds for every positive random variable - can you prove it?

Recommended (do not hand in):

Ch.3: All the unassigned Problems from 1 through 50. All the unassigned Exercises from 1 through 8.


IV. Ross Chapter 3, Sections 3.4 - 3.5 (Independence)

Independence is best defined in terms of conditional probability: Events A and B are called independent if P[A|B]=P[A]. That is, the information that B occurs does not change the probability that A occurs. This definition is equivalent to the product formula: P[A and B]=P[A]P[B]. It is also equivalent to P[B|A]=P[B] (verify). Furthermore, A and B are independent
iff A and B^c (the complement of B) are independent
iff A^c and B are independent
iff A^c and B^c are independent (verify).
We can restate these facts as follows: the EXPERIMENTS E_A and E_B are independent, where E_A and E_B denote the binary experiments with outcomes {A, A^c} and {B, B^c} respectively.
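
One of the "verify" steps above, written out (my sketch, using only the product formula): if P[A and B] = P[A]P[B], then

P[A and B^c] = P[A] - P[A and B] = P[A] - P[A]P[B] = P[A](1 - P[B]) = P[A]P[B^c],

so A and B^c are independent; applying the same step twice more gives the remaining two equivalences.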

Independence of three or more events is best described directly in terms of independent experiments (or independent random variables). For example, experiments E_1, E_2, E_3 (with sample spaces S_1, S_2, S_3) are defined to be independent if, for any choices of events A_1 in S_1, A_2 in S_2, A_3 in S_3,
P[A_1 and A_2 and A_3] = P[A_1]P[A_2]P[A_3].
Note that this also implies pairwise independence, for example by taking A_3 = S_3 we get
P[A_1 and A_2] = P[A_1]P[A_2]
for any choices of events A_1 in S_1, A_2 in S_2. More generally, if E_1,...,E_n are n independent experiments, then they are also pairwise independent, triple-wise independent, etc.

We will also discuss conditional independence for three events A,B,C: A and B are said to be conditionally independent given C if P[A|B,C] = P[A|C]. Equivalent definitions are:
P[A and B|C] = P[A|C]P[B|C], and also
P[B|A,C] = P[B|C] (verify).
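
Here is a short verification of these equivalences (my sketch; assume P[B and C] > 0): the multiplication rule gives P[A and B|C] = P[A|B,C] P[B|C], so P[A and B|C] = P[A|C]P[B|C] holds if and only if P[A|B,C] = P[A|C]; interchanging the roles of A and B gives the remaining form P[B|A,C] = P[B|C].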

Notes:

1. Some of the topics in Sections 3.4 and 3.5, such as the Gambler's Ruin problem (Example 4k), will be treated more thoroughly in 396.
2. The conditional independence assumption stated by Ross in the "Solution" to Example 5f (pp. 109-110) is incorrect, in the sense that it is not needed: the result is valid for any events H, E_1 and E_2. [The proof is simple - try it.]

HW#4: due Wednesday Oct. 25.

Required (hand in):

Ch. 3 Problems 53,56,57,58,63,64,70,74,76,83,86,90.
Hint for Problem 74: In the actual game under analysis, A goes first: let P_A = Prob[A wins] in the actual game. Also consider the hypothetical game in which B goes first: let P_B = Prob[B wins]. Note that P_B is not necessarily 1-P_A. In the actual game, condition on {A rolls a 9 on the first trial} to get a linear equation in the unknowns P_A and P_B. In the hypothetical game, condition on {B rolls a 6 on the first trial} to get a second such equation. Solve these two simultaneous equations for P_A and P_B.

Exercises 9,10,14,15 (omit the last part),28.
Hint for Exercise 10: start with the case n=3.
Hint for Exercise 14: Simply let N -> infinity in Ross eqn. (4.5).

Recommended (do not hand in):

All the non-required problems in Ch.3.


V. Read Ross Chapter 4, Sections 4.1-4.6 (Discrete Random Variables, Expected Value, Variance)

Chapter 4 deals with discrete random variables, their expected value and higher moments, especially the variance, and the expected value of a function of a random variable (which is just another random variable). Continuous random variables are treated in Chapter 5, but I will treat expected values for discrete rvs in parallel with continuous rvs. The important Poisson distribution is introduced in Section 4.7, derived as an approximation to the Binomial(n,p) distribution when n is large and p is small.
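
As a quick illustration of that approximation (a small numerical check of my own, not part of Ross), the following Python snippet compares the Binomial(n,p) pmf with the Poisson(np) pmf when n is large and p is small:

    from math import comb, exp, factorial

    n, p = 500, 0.01            # large n, small p, so lam = n*p = 5
    lam = n * p
    for k in range(10):
        binom = comb(n, k) * p**k * (1 - p)**(n - k)
        poisson = exp(-lam) * lam**k / factorial(k)
        print(k, round(binom, 5), round(poisson, 5))

The two printed columns are very close for every k.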

Notes:

1. Examples 4c and 6g are somewhat confusing, so can be omitted. Example 7d is new to this edition and somewhat sophisticated, although interesting. You could skip the second half of this example (pp. 168-9).

2. The Poisson process on pp. 170-172 is very important. I give a more transparent treatment in my class notes.

3. Example 8h discusses the "capture-recapture" (or "mark-recapture") method often used to estimate the size of wildlife populations.

Typos: Ross p.146, three lines after (4.1), "Equation (5.1)" should be "Equation (4.1)".
p.168, last line: 1/p should be 1/(p-1).

HW#5: due Wednesday Nov. 1.

Required (hand in):

Ch. 4 Problems:
1: also find E(X).
2: also find E(X).
5: also find the probability distribution of X and find E(X). Note that these are closely related to a binomial distribution (how?)
13: also find E(X).
20,21,22a (not b),26,28,32,36,38,
40: also find E(X), where X is the total number of correct guesses.
41.

Exercises:
3 (Hint: the answer involves a limit).
4,6,8,11 ("sequential" = "Bernoulli". Note that the answer does not depend on p.)
13 (Hint: it is easier and equivalent to maximize log P[X=k]. This also applies to #18.)
17b (not a), 18.

Also: MDP Class Notes p.8, Exercise 1.2.

Recommended (do not hand in):

Ch.4 Problems 4,11,14,17,18,19,23,27,30,35,37,39.
Exercises 2,5 (consider the cases \alpha>0 and \alpha<0 separately), 7,9 (\sigma>0),10,14.


VI. Read Ross Chapter 4, Sections 4.6-4.9 (Poisson and other Distributions)

Comments:

1. Ross Example 8f, p.177: The mean and variance of a negative binomial random variable can be easily obtained by representing it as the sum of i.i.d. (independent, identically distributed) geometric rvs.

2. Ross Example 8j, p.180: Similarly (see MDP Notes Example 3.2, p.31) the mean and variance of a hypergeometric rv can be obtained by representing it as a sum of identically distributed (but not independent) Bernoulli rvs X1,...,Xn and applying the formula

Var(X_1+...+X_n) = Var(X_1)+...+Var(X_n) + 2[Cov(X_1,X_2) + Cov(X_1,X_3) + ... + Cov(X_{n-1},X_n)]
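
To sketch how this works in the hypergeometric case (my own summary; the MDP notes presumably carry out the details): suppose n items are drawn without replacement from N items of which m are "special", and let X_i = 1 if the i-th draw is special, 0 otherwise. Each X_i is Bernoulli with p = m/N, and by symmetry every pair (X_i,X_j), i < j, has the same covariance, so the displayed formula reduces to

Var(X_1+...+X_n) = n Var(X_1) + n(n-1) Cov(X_1,X_2),

with Var(X_1) = p(1-p) and Cov(X_1,X_2) = P[X_1=1, X_2=1] - p^2 = m(m-1)/[N(N-1)] - (m/N)^2.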

HW#6: due FRIDAY Nov. 17.

Required (hand in):

Ch. 4 Problems:
42,46,50,51,55,57,60,
62 (Assume that all 2n persons are seated at random and the table has 2n seats. This problem is based on the ideas in Ross, pp.164-5.),
73,
75: also find E(X). (Hint: relate X to a negative binomial rv.)

Exercises:
20,25,
26 (Note: this result will be related to the Poisson process. It says that the waiting time to the occurrence of the nth event has a certain "gamma" distribution.),
27,
28 (This result implies that for the Bernoulli process - already discussed in class - the distribution of the waiting time to the rth success has a negative binomial distribution. This is the discrete analog of #26 above.)
35 (Hint: part (c) follows from (a) and Exercise 6.)

Recommended (do not hand in):

Problems 43,44,45,52,54,56,59,61,63,65,66,72,73, 78: also find the expected number of selections.
Exercises 19,21,22,23,25,29,30,31,32,34.


VII. Read Ross Chapter 5.

Comments:

Many of the examples in this Chapter are special cases of the method in Section 5.7 (Theorem 7.1). (See Chapter 2 of my Notes for a more thorough treatment.) As you read Ross, see if you can find such special cases. One is given at the bottom of p.219. Another is found on p.240.

In Example 1a, p. 206, it should be verified that f(x) > 0 on the stated interval 0 < x < 2.

In Example 3d, p. 217, there is a third possible interpretation of "random chord". Let U and V be independent random points, uniformly distributed on the circle. Consider the random chord from U to V. What is the probability now?

The calculation on p.238 is essentially equivalent to Exercise 26 of Chapter 4.
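
Assuming Theorem 7.1 is the usual monotone change-of-variable result (stated here from memory, so check it against Ross): if Y = g(X) with g strictly monotone and differentiable, then

f_Y(y) = f_X(g^{-1}(y)) |d g^{-1}(y)/dy|

for y in the range of g. For example, if X is uniform on (0,1) and Y = -log X, then g^{-1}(y) = e^{-y}, so f_Y(y) = 1 * e^{-y} = e^{-y} for y > 0; that is, Y has the exponential distribution with rate 1.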

HW#7: due Wednesday Nov. 29.

Required (hand in):

Ch. 5 Problems:
1, 2 (first find "C"),
4,
6 (in parts a and c, first convince yourself that f is a valid pdf),
10, 11,
14 (for the second part, first compute the pdf of Y=X^n, then compute E(Y) from this pdf),
15 (use Table 5.1, p.222),
18, 19, 20, 22, 23.

Exercises:
5 (recall Lemma 2.1, p.211) , 7, 9, 11, 12, 14,
18 (add to the Hint: apply Exercise 5.)

Recommended (do not hand in):

Problems 3,7,8,16,17,21.
Exercises 2,3,13.


VIII. Read Ross Chapter 5.

HW#8: due Wednesday Dec. 6.

Required (hand in):

Ch. 5 Problems: 25, 26, 27, 31, 32, 34, 37,
39 (first, state the range of Y),
40 (state the range of Y),
41.

Exercises: 19, modified as follows: (a) Find a general formula for the kth moment E(X^k), k=1,2,... . (b) Find Var(X).
25, 26, 28, 30 (state the range of Y).

Extra Credit: MDP Notes, p.22, Exercise 2.2.

Recommended (do not hand in):

Problems 24, 28, 33, 38.
Exercises 20 (to be done in class),
24, 27, 29 (See Ross, pp.219-220).


395 Assignments

I. Read Ross Chapter 6, Sections 6.1 - 6.5 (Joint Distributions, Conditional Distributions, the Multinomial Distribution).

This chapter is devoted to the joint distribution of two or more random variables. Because the nature of the relationships among rv's is the most central feature of probability and statistics, this chapter is possibly the most interesting and important of those we have encountered. Most (not all, as noted below) of the examples in this Chapter are non-trivial, interesting, and important, as are the homework questions. For this reason we will devote three assignments to Chapter 6 (followed by three assignments on the equally important Chapter 7).

Many of the examples in this Chapter are special cases of the method in Section 6.7, especially equation (7.1), which deserves to be featured as a theorem. Examples 7b, 7c, 7e, 8a, 8b, 8d are particularly important and/or interesting. As you read the earlier sections, try to see how the method of Section 6.7 could apply.

Notes on Sections 6.1 - 6.5:

1. p.262, Example 1c: verify that f(x,y) is a pdf (note that X and Y are independent).

2. p.265-6, Example 1e: Find E(X/Y) and show that E(X/Y) is not equal to E(X)/E(Y).

3. p.266-7, Example 1f: the important multinomial distribution is treated at length in Section 7 of my MDP notes. (Also see Ross Example 4c, p.290.)

4. p.271-2, Example 2e: there is a preferable derivation that does not require the assumption that the pdfs are differentiable. Recall the argument at the bottom of p.232.

5. p.273, Example 2f: sketch the range of (X,Y).

6. p.274-6, Example 2g: from a probabilistic viewpoint, the method presented here is more complicated than necessary (although it may be computationally efficient). See the Remark on p.276 for a simple method - this requires that we know how to generate a random permutation of 1,2,...,n such that all n! permutations are equally likely. Can you see how to do this by generating i.i.d. rvs U_1,...,U_n, each uniformly distributed on (0,1)? (One way is sketched after these notes.)

7. p.280, Example 3a: see Example 2.5 on p.23 of MDP notes.

8. p.282, Example 3b: this relates to the discussion of the Poisson process in Section 3.6 of MDP notes. In particular, see MDP p.46 Proposition 3.1 and MDP p.80 Exercise 6.4.

9. p.286: Example 3d is unmotivated - I would skip it.

10. p. 291: there is a typo in the first line of Example 4d: delete the solitary "d".

11. p. 292, Example 5b: first verify that f(x,y) is a pdf, i.e., integrates to 1.

12. p.293-6: the discussion at the bottom of page 292 is incomplete - see MDP p.52-3 for a better treatment. In particular, the formula at the bottom of Ross p.293 is stated more clearly in (4.14) of MDP p.52 as a version of Bayes Formula for the case of a mixed joint (bivariate) distribution where one variable is continuous and one is discrete. (Also, the order of Ross Examples 5c and 5d should be reversed: 5d relates to Bayes formula at the bottom of p.292, whereas 5c deals with the bivariate normal distribution, studied in Ross Section 7.8 and MDP Section 8.3.)
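
Two supplements to the notes above. First, regarding Note 6: one answer (a sketch of mine, not Ross's) is to generate U_1,...,U_n i.i.d. uniform on (0,1) and take the permutation that sorts them; ties occur with probability 0 and the U_i are exchangeable, so each of the n! orderings is equally likely. In Python:

    import random

    def random_permutation(n):
        # n i.i.d. Uniform(0,1) values; return the permutation of 1,...,n
        # given by their ranks. All n! orderings are equally likely.
        u = [random.random() for _ in range(n)]
        order = sorted(range(n), key=lambda i: u[i])
        return [i + 1 for i in order]

    print(random_permutation(10))

Second, regarding Note 12: the mixed-case Bayes formula referred to there, as I would write it (the notation of (4.14) in the MDP notes may differ), is, for X discrete and Lambda continuous with pdf f,

f(lambda | X = x) = P[X = x | lambda] f(lambda) / Integral of P[X = x | t] f(t) dt,

the integral being taken over the range of Lambda.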

HW#1: due WEDNESDAY Jan. 10.

Required (hand in):

Ch. 6 Problems:

1c (The pair of dice is rolled only once. Be sure to specify the range of (X,Y).)

6: Clarification: define N_1 = the number of tests needed up to and including the first defective, N_2 = the number of additional tests needed up to and including the second defective. The transistors are drawn without replacement. (Be sure to specify the joint range of (N_1,N_2). Are N_1 and N_2 independent?)

7: Clarification: define X_1 = the number of trials needed up to and including the first success, X_2 = the number of additional trials needed up to and including the second success. The trials are independent. (Be sure to specify the joint range of (X_1,X_2). Are X_1 and X_2 independent?)

8 (plot the range of (X,Y)).

10: Note that X and Y are i.i.d. exponential rv's. CHANGE part b as follows: let T=X/Y. Find F_T(t) and f_T(t) for 0< t < infinity.

12 (also see Ex.2b, p.268).

14 (also find E[distance] ).

15.

17 (assume that X1, X2, X3 are i.i.d continuous rv's).

19.

Exercises:

5 (can you evaluate the integral in (b)? - I can't), 6 (second part only), 7, 9, 10,

11 (for the Hint, recall Exercise 28, p.253),

Recommended (do not hand in):

Problems 1a,b (The pair of dice is rolled only once. Be sure to specify the joint range of (X,Y).)

2, 3, 4, 5 (Be sure to specify the joint ranges in these problems.)

9, 13, 16, 18.

Exercises 1, 3, 4, 8, 12 (this extends Prop. 2.1, p.272).


II. Read Ross Chapter 6, Sections 6.6 - 6.8 (Order Statistics, the Jacobian Method for Transformations, Exchangeable Random Variables.)

Notes on Sections 6.6 - 6.8:

1. The distribution of order statistics is also discussed in MDP Section 9, where some of the treatment is more accurate. For example, the statement "and 1 of them equal to x" in Ross, p.298, line 5, should be "and one of them to lie in an interval (x, x+dx)". (The resulting density is recorded after these notes.)

2. p.305: in the next-to-last line of Example 7c, change "this work" to "the time".

3. p.306: the result of Example 7d can be derived more easily using (8.65) of MDP notes, p.110.

4. p.306: relate Example 7e to the distributions of the waiting times in a Poisson process.

5. p.310-311, Example 8d: in fact, Y_1,...,Y_n,Y_{n+1} are exchangeable, where Y_{n+1} = 1 - (Y_1 + ... + Y_n).
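
For reference, the heuristic in Note 1 carried one step further (my statement; compare with MDP Section 9): if X_1,...,X_n are i.i.d. with cdf F and pdf f, then the density of the j-th order statistic X_(j) is

f_{X_(j)}(x) = [n!/((j-1)!(n-j)!)] F(x)^{j-1} [1-F(x)]^{n-j} f(x),

obtained by asking for j-1 observations below x, n-j observations above x, and one observation in a small interval (x, x+dx).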

HW#2: due FRIDAY Jan. 19.

Required (hand in):

Ch. 6 Problems: 11, 20,

21: change (b) to: find f_X(x) and f_Y(y); change (c) to: find E(X) and E(Y).

23 (a), (b), (c) only. Add to (b): find f_X(x). Add to (c): find f_Y(y).

24 (a), (b), (c) only.

25 (the arrival times are uniform over (0, 10^6) hours),

28, 31, 33,

39 (specify the range of (X,Y)).

Exercises: 2, 14, 15, 17.

Recommended (do not hand in):

Problems 22, 26, 29, 30, 32, 34-38, 40 (specify the range of (X,Y); recall Problem 1c).

Exercises 13, 16.


III. Continue Ross Chapter 6.

HW#3: due FRIDAY Jan. 26.

Required (hand in):

Ch. 6 Problems 42,

44: Let X and Y denote the number of accidents in the first and second years, resp. You are to assume that X and Y are conditionally independent given lambda. The first part of the question asks you to find the conditional pdf of lambda given X; for this you may use Bayes formula for pdfs in the mixed case: eqn. (4.14) on p. 52 of MDP notes. The last sentence of the problem asks you to find E[Y|X]: condition further on lambda and apply the iteration formula for conditional expectation, eqn. (4.6) on p.50 of MDP notes. Finally, note that X and Y are NOT unconditionally independent, since they both depend positively on the same random variable lambda: show that Cov(X,Y)=[1/(1+\alpha)]Var(X)>0. (Hint: one method is to use (4.17), p.53 MDP Notes.)

45 (this can be solved either by direct integration or by applying the result of Example 8d).

46, 48, 51, 53, 56 (a and c only).

Exercises

18, 19 (see Problem 44), 20, 26,

29 - note the typo: the final "t" should be "1" - (one approach is to apply exchangeability),

30 (think of X as the (n+1)st observation).

Recommended (do not hand in):

Ch.6 Problems 41, 43, 47, 49, 50, 54,

58: also show that U,V,W are exchangeable but not independent; for the latter, show that Cov(U,V)>0.

59, 60.

Exercises 23, 24, 25, 27, 28, 31, 32, 33.


IV. Read Ross Chapter 7, Sections 7.1 - 7.4 (Expectation, Variance, Covariance, Correlation). (We will devote three assignments to Chapter 7.)

In addition to covering fundamental concepts such as expectation, variance, covariance, conditional expectation, moment generating functions, and the multivariate normal distribution, Chapter 7 contains many interesting examples, of increasing specialization and complexity. You are welcome, even encouraged, to read the specialized examples such as 2m, 2q, 3f, 5e, and 6c, but we will not cover these in detail. (Some of these may be encountered in 396.)

The first key results concern the expectation and variance of a sum of random variables (Sections 7.2 and 7.4).
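
As a reminder, the two results in question (my summary; Ross derives them in those sections) are, for random variables X_1,...,X_n with finite variances,

E[X_1+...+X_n] = E[X_1]+...+E[X_n]   (no independence assumption needed),

Var(X_1+...+X_n) = Var(X_1)+...+Var(X_n) + 2 [sum of Cov(X_i,X_j) over all pairs i < j],

so the covariance terms drop out when the X_i are pairwise independent (or merely uncorrelated).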

Notes on Sections 7.1 - 7.4:

1. p.329: Example 2a can be handled via order statistics and spacings, as in Example 8d, Ch. 6, p.310.

2. p.333: For Example 2g, also see Example 8j pp.180-2; also see p.361.

3. p.334: In Example 2i, note that the final expression for E[Y] is approximately Nlog(N).

4. p.336: In Example 2l note that E[D^2]=n implies that D is approximately sqrt(n). As an exercise, find Var[D^2].

5. p.341: In Example 2o, eqn. (2.7) is important - recall Exercise 6, Ch. 4, p.197.

6. p.352: The heading of Example 3e should be "The negative hypergeometric distribution." There is a typo in the second-to-last equation on p.353: a + sign is missing before the third "m" (just after the fraction).

7. p.354: The "N" in line 2 should be "n". Note that the sum for E[X] is approximately log (n).

8. p.361: recall that the hypergeometric distribution was also discussed on p.182 and p.333.

9. p.364: Example 4f deals with the multinomial distribution. (So does Example 5g, p. 373.)

HW#4: due FRIDAY Feb. 2.

Required (hand in):

Ch. 7 Problems:

1 (if the coin lands heads, then she wins twice the value that appears on the die, and if tails, she wins half this value).

3 (order statistics can be used).

8 (also, show that (a) as p approaches 0, the expected number of occupied tables approaches N; (b) as N approaches infinity with p fixed, the expected number approaches 1/p).

11, 15, 16, 17 (this is an interesting set of results).

22 (see Example 2i, p.334).

25, 26, 30, 33.

Exercises:

1, 5, 10, 11, 13 (see Ross p. 279, second paragraph),

15, 17, 19.

Recommended (do not hand in):

Problems 2, 4, 5, 6, 7, 9, 10, 12, 13, 14, 18

19 (in (b), show that if all types are equally likely, then the mean number is r/2.)

20, 21, 23, 24, 31, 32.

Exercises 2, 4, 6, 7, 8 (6, 7, and 8 are useful results), 9, 12, 14, 18 (this appears in Section 7 of the Class Notes on the multinomial distribution.)


V. Read Ross Chapter 7, Sections 7.5 and 7.6 (Conditional Expectation and Variance, Prediction).

Notes on Sections 7.5 and 7.6:

1. We have already encountered the important iteration formula for conditional expectations and probabilities (Section 7.5.2, Proposition 5.1, and Section 7.5.3 eqn. (5.6)). (It is restated, together with the conditional variance formula, after these notes.)

2. p.369: in Example 3.68, can you use this conditioning method to find E(X^2) and thence Var(X)?

3. p.372: In Example 5f, the pdf f(x,y) is best expressed in terms of the bivariate "covariance matrix" of (X,Y). See Section 8.3 of MDP class notes.

4. p.375: In Example 5i, note that the derivation and final answer m(x) = e^x are valid only for x ≤ 1.

5. p.377: Example 5j is very amusing. Note, however, that n must be known for the analysis to hold. What strategy might you use if n were infinite? For this, assume that the successive prizes are a sequence of i.i.d. continuous random variables - you have to decide when to stop and accept the current prize. (This is called the "Secretary Problem".)

6. p.379: In Example 5l, show that when X and Y have the same distribution (pdf), the integral expression for P{X less than Y} reduces to 1/2.

7. p.380: In Example 5m, show that the result is equivalent to eqn. (3.2) on p.280.

8. p. 384: Example 6b is an important application of Bayes formula for pdfs. It is essentially equivalent to Example 5.1 of MDP notes, p.65.
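
For reference, the two facts from Section 7.5 that we will use most often (my wording): for random variables X and Y with finite second moments,

E[X] = E[ E[X|Y] ]   (the iteration formula),

Var(X) = E[ Var(X|Y) ] + Var( E[X|Y] )   (the conditional variance formula),

where E[X|Y] and Var(X|Y) are themselves random variables, being functions of Y.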

HW#5: due WEDNESDAY Feb. 14.

Required (hand in):

Ch. 7 Problems:

36, 38, 41 (hypergeometric distribution),

42, 45, 47,

51, but altered as follows: (a) Find the conditional distribution of X|Y. (b) Find E[X|Y]. (c) Find the marginal distribution of Y. (It appears to me that the marginal distribution of X is not simple.) (d) Find Cov(X,Y).

53 (hint: condition on the initial door chosen)

56 (also find the limit of the expected number of stops as N approaches infinity),

57, 58,

64 (in (b), show that Var(X) depends on \mu_1 and \mu_2 only through \mu_1-\mu_2).

Exercises:

21, 23, 27,

34 (change "run" to "string" to agree with Exercise 33),

38.

Recommended (do not hand in):

Problems:

34, 35, 37 (see Exercise 19), 39, 40, 43, 44, 46, 48, 49, 50 (compare to 40), 52, 55, 59, 61, 62, 63,

Exercises:

20 (a useful fact), 22, 28 (in Class Notes), 29, 30 (in Class Notes),

31 (use induction on t and condition on N(t-1)),

33, 35, 37,

39 (compare to 38 - this is a special case, where X_1=X and X_2 = X^2).


VI. Read Ross Chapter 7, Sections 7.7 and 7.8 (Moment Generating Functions; the Multivariate Normal Distribution ).

See Sections 3.3 and 3.4 of the Class Notes for a discussion of moment generating functions (mgf).

See Section 8 of the Class Notes for a review of vectors and matrices (S8.1), random vectors and covariance matrices (S8.2), the multivariate normal distribution (S8.3), and the chi-square distribution (S8.4). The bivariate normal distribution is a special case - see S8.3.7 in the Class Notes, p.112.

Notes on Sections 7.7 - 7.8:

1. p.392: Ross briefly mentions a basic property of the mgf, namely, that it uniquely determines the probability distribution. A proof is given for a special case in S3.3.1 of the Class Notes, p.40.

2. p.394: Example 7h should be compared to Proposition 3.2, p.283.

3. p.395: In Example 7i, note that the mgf of the chi-square distribution with n degrees of freedom is the same as that of the Gamma distribution with parameters alpha=n/2 and lambda=1/2, hence these two distributions coincide.

4. p.396: The expression for M_Y(t) in the fourth equation can be expressed in terms of M_X and M_N as follows: M_Y(t) = M_N(log M_X(t)). [verify - a sketch is given after these notes]

5. p.398: Compare Example 7m to Ross Exercise 2, Ch. 6, p.319, and to Example 1.17, Class Notes, p.16.

6. p.399: The system of linear equations can be written more simply via vectors and matrices as X=AZ. The material in Ross Section 7.8 is treated via vectors and matrices in S8.2-8.4 of the Class Notes.

7. p.402: This section treats the joint distribution of the sample mean and sample variance for a sample from the univariate normal distribution N(mu, sigma^2). Also see S8.4.2 of the Class Notes, p.115.
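
Regarding the [verify] in Note 4 (my sketch, assuming Y = X_1+...+X_N where the X_i are i.i.d. with mgf M_X and N is an independent nonnegative integer-valued rv with mgf M_N, which is how I read that example): conditioning on N,

M_Y(t) = E[e^{tY}] = E[ E[e^{tY} | N] ] = E[ (M_X(t))^N ] = E[ e^{N log M_X(t)} ] = M_N(log M_X(t)).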

HW#6: due FRIDAY Mar. 2.

Required (hand in):

Ch. 7 Problems:

65, 69 (here it is assumed that lambda is itself an exponential random variable),

70 (add (c): one head and one tail occur),

71, 72, 73 (add (e): find Corr(R,S)),

75,

77 (add (c): Find the conditional distribution of X|Y - this does not require moment generating functions. Then let Z = X - Y and find the conditional distribution of Z|Y. Since this does not depend on Y, conclude that Z is independent of Y. Last, find E(X) and Var(X)),

79.

Exercises:

40 (part (a) is already solved in Example 5c, p.294, so skip this. Part (b) is already solved in Example 5f, p.373, but give an alternative proof using the result of (a) and the relation (4.17) in the Class Notes, p.53.)

44, 46, 48, 50.

Recommended (do not hand in):

Problems

66, 67, 68, 76, 78 (this is very interesting).

Exercises 41, 42, 43, 45, 47, 49, 51, 52, 53, 54, 55.


VII. Read Ross Chapter 8 (Laws of Large Numbers, Central Limit Theorem, Probability Inequalities).

See Section 10 of the Class Notes for a more general discussion of probabilistic convergence, especially "propagation of error" in Section 10.3.

Comments on Chapter 8:

This chapter contains the most well-known probabilistic limit theorems: the Weak and Strong Laws of Large Numbers, and the Central Limit Theorem. Earlier, we saw the Poisson and normal approximations to the binomial distribution. These are both limit theorems: the first describes the limiting form of the binomial distribution as n -> infinity and p -> 0 such that np remains constant, the second describes the limiting form as n -> infinity but p remains fixed. This latter is a special case of the Central Limit Theorem.

The proofs of limit theorems usually involve bounds on "tail probabilities" of random variables, such as Chebyshev's inequality. As Ross mentions, such bounds are of mainly theoretical use, and do not give close approximations in practice. By contrast, the Poisson and normal approximations are often quite good.

Ross's proof of the Central Limit Theorem on pp. 434-5 is basically nice, but the final steps can be improved notationally by setting x = 1/n and letting x -> 0. Apply L'Hospital's rule to the resulting functions of x (rather than of n).

Ross includes a proof of the Strong Law of Large Numbers on pp.443-5. This is an advanced result that is not usually proved in introductory texts. Ross makes the simplifying assumption that the fourth moment is finite (the SLLN requires only finiteness of the first moment) and uses this to give a relatively simple proof.

In Example 5b, pp.449-450, the X_i are Bernoulli variables with p = 100/199, which is approximately 1/2. Also, the X_i are exchangeable and approximately independent, so the distribution of X = X_1 + ... + X_100 is approximately Binomial(100, 0.5). Hence EX is approximately 50 and Var(X) is approximately 25 (this agrees with the value 25.126 on p.450). Thus X can, in turn, be approximated by the normal distribution N(50,25), which has standard deviation 5, and the desired probability P[X <= 30] is approximately the probability that a normal rv falls 4 standard deviations below its mean. This probability is tiny, which shows that the upper bound .058 that Ross obtains using the one-sided Chebyshev inequality is very crude.
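
A quick numerical check of that comparison (mine, not Ross's), using the N(50,25) approximation in Python:

    from statistics import NormalDist

    # P[X <= 30] under the N(50, 25) approximation, i.e. 4 standard deviations below the mean
    approx = NormalDist(mu=50, sigma=5).cdf(30)
    print(approx)   # roughly 3.2e-05, far smaller than the Chebyshev bound 0.058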

The bound in Example 5c, p.451, is not as tight as the bound in the Extra Problem that I have assigned in HW#7.

p.452: There is a typo in the display following "Therefore". The inequality should be reversed, i.e., it should be <=.

The general Poisson approximation in Section 8.6 is very interesting but can be skipped.

HW#7: due FRIDAY Mar.9.

Required (hand in):

Ch. 8 Problems 5, 9, 13,

20 (Hint: First use Jensen's inequality to show that for a nonnegative random variable Y, [E(Y)]^a <= E(Y^a) for any a > 1.)

Exercises 1, 6 (f denotes the density function), and

Extra Problem: Let f and F denote the pdf and cdf, respectively, of the standard normal distribution N(0,1). For x > 0, show that

1 - F(x) < f(x)/x.

Notice that this is a tighter bound than that given in Ross's Exercise 12, p.460.

Recommended (do not hand in):

Problems 6, 7, 8, 10, 11, 14, 15.

Exercises 5, 7, 8, 9, 13.
