In this lab we will use R to simulate drawing marbles from a bag (i.e., what you just did by hand). Simulation is one of the foundations of this course, so it is important that you get comfortable with simulating data in the first week or two. You will be given some simple code that implements the simulations, and you will then modify this code to answer some questions.
We start by defining a vector of strings (40 "red" and 40 "clear") using the rep function.
bag <- rep(c("red","clear"),40)
We can draw a random sample from this object using the sample function.
four.marbles <- sample(bag,4,replace=FALSE)
Because the argument replace is FALSE, sampling occurs without replacement (i.e., a marble is not put back in the bag before the next marble is drawn).
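As a quick sanity check, the sketch below tabulates the contents of the bag and draws a sample that can be reproduced exactly. The call to set.seed and the seed value 1 are assumptions added here for reproducibility; they are not part of the lab code.

```r
# Build the bag as in the lab: "red" and "clear" alternate, repeated 40 times.
bag <- rep(c("red", "clear"), 40)

# Count how many marbles of each colour are in the bag.
table(bag)

# Fix the random number generator so the same draw can be repeated.
# The seed value is arbitrary (an assumption, not part of the lab).
set.seed(1)
four.marbles <- sample(bag, 4, replace = FALSE)
four.marbles
```

Rerunning the last three lines with the same seed should return the same four marbles; removing set.seed gives a fresh draw each time.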
To check for an extreme event (i.e., all red or all clear) we can use the all function and the or operator (|). This returns TRUE or FALSE.
all(four.marbles=="red")|all(four.marbles=="clear")
## [1] FALSE
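To see how the check behaves, it can help to apply it to samples whose contents are known in advance. The sketch below does this with two hand-built vectors; the helper name isExtreme and the vectors all.red and mixed are hypothetical names introduced only for this illustration.

```r
# Hand-built samples, so the result of each check is known in advance.
all.red <- c("red", "red", "red", "red")
mixed   <- c("red", "clear", "red", "red")

# The comparison is element-wise: one logical value per marble.
mixed == "red"

# Hypothetical helper: all() collapses the element-wise comparison into a
# single TRUE or FALSE, and | combines the two colour checks.
isExtreme <- function(x) all(x == "red") | all(x == "clear")

isExtreme(all.red)  # TRUE: every marble is red
isExtreme(mixed)    # FALSE: the colours are mixed
```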
If we want to simulate drawing multiple samples of size four from a bag, then we need some way of repeating the sample command. We accomplish this using the for(...){} control construct. Here, everything inside the braces, {}, is repeated 10 times. Notice that 1:10 is shorthand for the integers from 1 to 10.
for(i in 1:10){
  four.marbles <- sample(bag,4,replace=FALSE)
}
Finally, we add code that checks for an extreme event on every iteration and tallies the total number of extreme events. The if(...){...} control construct executes the commands within the braces, {}, if the logical statement within the parentheses, (), is TRUE.
extremeEvents <- 0
for(i in 1:10){
  four.marbles <- sample(bag,4,replace=FALSE)
  if(all(four.marbles=="red")|all(four.marbles=="clear")){
    extremeEvents <- extremeEvents + 1
  }
}
extremeEvents/10
## [1] 0.3
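With only 10 draws the simulated proportion is noisy. For comparison, the exact probability of an extreme event can be found by counting: there are choose(80,4) equally likely samples of four marbles, and choose(40,4) of them are all red (the same number are all clear). This is a minimal sketch, assuming the bag of 40 red and 40 clear marbles defined above; the variable name pExtreme is introduced here for illustration.

```r
# Ways to pick 4 marbles of a single colour (40 marbles of that colour),
# divided by the ways to pick any 4 marbles from the 80 in the bag.
# Multiply by 2 because the sample can be all red or all clear.
pExtreme <- 2 * choose(40, 4) / choose(80, 4)
pExtreme
```

The result is roughly 0.116, so a proportion based on just 10 draws can easily land some distance from it.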
We can generalize this code by defining the variables draws and sampleSize.
draws <- 10
sampleSize <- 7
extremeEvents <- 0
for(i in 1:draws){
  marbles <- sample(bag,sampleSize,replace=FALSE)
  if(all(marbles=="red")|all(marbles=="clear")){
    extremeEvents <- extremeEvents + 1
  }
}
extremeEvents/draws
## [1] 0.1
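One way to make the generalized code easier to reuse is to wrap it in a function, so that draws and sampleSize become arguments. The sketch below is one possible arrangement, not part of the lab code; the function name propExtreme is hypothetical.

```r
# Hypothetical helper (not part of the lab code): the proportion of draws
# in which every marble in the sample has the same colour.
propExtreme <- function(bag, draws, sampleSize) {
  extremeEvents <- 0
  for (i in 1:draws) {
    marbles <- sample(bag, sampleSize, replace = FALSE)
    if (all(marbles == "red") | all(marbles == "clear")) {
      extremeEvents <- extremeEvents + 1
    }
  }
  extremeEvents / draws
}

bag <- rep(c("red", "clear"), 40)

# Same settings as above, then a smaller sample size for comparison.
propExtreme(bag, draws = 10, sampleSize = 7)
propExtreme(bag, draws = 10, sampleSize = 2)
```

Because the draws are random, the two proportions will change from run to run.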
sum(marbles=="red").
Now, instead of simulating the results for a single group, simulate the results for multiple groups and look at the distribution of extreme events (i.e., extremeEvents). This can be accomplished by nesting the for loop above in another for loop and saving the results from each simulation.
simulations <- 10000
draws <- 10
sampleSize <- 4
extremeEventsVect <- rep(NA,simulations)
for(j in 1:simulations){
  for(i in 1:draws){
  }
}
To assign a value to location j in a vector you can use: extremeEventsVect[j] <- value.
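The pattern of pre-allocating a vector with rep(NA, ...) and filling one location per iteration can be seen in isolation in the toy example below. It is only an illustration of indexed assignment, not the solution to the exercise; the names n and squares are made up for this sketch.

```r
# Toy example (not the lab solution): store one result per iteration.
n <- 5
squares <- rep(NA, n)   # pre-allocate a vector of missing values

for (j in 1:n) {
  squares[j] <- j^2     # assign this iteration's result to location j
}

squares                 # now holds 1, 4, 9, 16, 25
```

Pre-allocating the vector and assigning by position keeps each result in the slot that matches its loop index, which is the same role extremeEventsVect plays in the skeleton above.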
r) to repeat the simulation above in question 3.