In this lab we will use R to simulate drawing marbles from a bag (i.e. what you just did by hand). Simulation will be one of the foundations of this course, so it is important that you get comfortable with simulating data in the first week or two. You will be provided with some simple code that implements the simulations, and you will then modify this code to answer some questions.
We start by defining a vector of strings (40 "red" and 40 "clear") using the rep function.
bag <- rep(c("red","clear"),40)
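As a quick sanity check, the sketch below confirms what the bag contains. The functions length and table are base R; using them here is our suggestion and not part of the code provided for the lab.

```r
# Rebuild the bag exactly as above: c("red","clear") repeated 40 times
bag <- rep(c("red","clear"),40)

# Total number of marbles in the bag
length(bag)

# Number of marbles of each color
table(bag)
```

Note that rep(c("red","clear"),40) alternates the two colors. The order does not matter here because the marbles are drawn at random.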
We can draw a random sample from this object using the sample function.
four.marbles <- sample(bag,4,replace=FALSE)
Because the parameter replace is FALSE, sampling occurs without replacement (i.e. a marble is not put back in the bag before another marble is drawn).
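To make the difference concrete, here is a minimal sketch using a tiny illustrative bag. The names tinyBag, noReplace and withReplace are ours, and the seed is arbitrary; none of this is part of the lab's provided code.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible

# A tiny illustrative bag (not the lab's bag): one red marble, three clear
tinyBag <- c("red","clear","clear","clear")

# Without replacement each marble can be drawn at most once,
# so a sample of all four marbles contains exactly one red
noReplace <- sample(tinyBag,4,replace=FALSE)
sum(noReplace=="red")

# With replacement the red marble can come up anywhere from 0 to 4 times
withReplace <- sample(tinyBag,4,replace=TRUE)
sum(withReplace=="red")
```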
To check for an extreme event (i.e. all red or all clear) we can use the all function and the or operator (|). This returns TRUE or FALSE.
all(four.marbles=="red")|all(four.marbles=="clear")
## [1] FALSE
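Because four.marbles is random, it can help to see the same check on samples whose answer is known in advance. The vectors allRed and mixed below are hand-made for illustration and are not random draws.

```r
# Hand-made samples (hypothetical) so the answers are known
allRed <- c("red","red","red","red")
mixed  <- c("red","clear","red","clear")

# The comparison is element-wise: one TRUE/FALSE per marble
mixed=="red"

# all() collapses those values to a single TRUE/FALSE
all(allRed=="red")   # TRUE: every marble is red
all(mixed=="red")    # FALSE: some marbles are clear

# | is TRUE when at least one side is TRUE
extremeAllRed <- all(allRed=="red")|all(allRed=="clear")
extremeMixed  <- all(mixed=="red")|all(mixed=="clear")
```

Here extremeAllRed is TRUE and extremeMixed is FALSE.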
If we want to simulate drawing multiple samples of size four from a bag, then we need some way of repeating the sample command. We accomplish this using the for(...){} control construct. Here, everything inside the braces, {}, is repeated 10 times. Notice that 1:10 is shorthand for the integers from 1 to 10.
for(i in 1:10){
  four.marbles <- sample(bag,4,replace=FALSE)
}
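Note that the loop above overwrites four.marbles on every pass, so only the last sample is kept. To see the mechanics of for on their own, here is a minimal sketch with a known answer; the variable total is ours, introduced only for illustration.

```r
total <- 0
for(i in 1:10){
  # i takes the values 1, 2, ..., 10 in turn
  total <- total + i
}
total  # 1 + 2 + ... + 10 = 55
```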
Finally, we add code that checks for an extreme event on every iteration and tallies the total number of extreme events. The if(...){...} control construct executes the commands within the braces, {}, if the logical statement within the parentheses, (), is TRUE.
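extremeEvents <- 0  # running tally of extreme events
for(i in 1:10){
  four.marbles <- sample(bag,4,replace=FALSE)
  # Count this draw if all four marbles share one color
  if(all(four.marbles=="red")|all(four.marbles=="clear")){
    extremeEvents <- extremeEvents + 1
  }
}
extremeEvents/10  # proportion of the 10 draws that were extreme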
## [1] 0.3
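With only 10 draws the simulated proportion is noisy. As a rough check, and assuming the bag of 40 red and 40 clear marbles defined earlier, the exact probability of an extreme event in a sample of four can be computed by counting with choose. This calculation and the names oneColor, anyFour and pExtreme are our addition, not part of the lab's provided code.

```r
# Number of ways to pick 4 marbles of one given color out of 40
oneColor <- choose(40,4)

# Number of ways to pick any 4 marbles out of 80
anyFour <- choose(80,4)

# There are two colors, so two ways for a sample to be extreme
pExtreme <- 2*oneColor/anyFour
pExtreme  # roughly 0.116
```

A simulated proportion based on many more draws should settle near this value.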
We can generalize this code by defining the variables draws and sampleSize.
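draws <- 10        # number of samples to draw
sampleSize <- 7    # number of marbles in each sample
extremeEvents <- 0
for(i in 1:draws){
  marbles <- sample(bag,sampleSize,replace=FALSE)
  # Count this draw if every marble in the sample shares one color
  if(all(marbles=="red")|all(marbles=="clear")){
    extremeEvents <- extremeEvents + 1
  }
}
extremeEvents/draws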
## [1] 0.1
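One way to make the generalized code easier to reuse is to wrap it in a function. The sketch below is only one possible arrangement: the function name extremeProportion is hypothetical, and the seed is arbitrary.

```r
# Hypothetical helper: proportion of extreme events among `draws` samples
extremeProportion <- function(bag,draws,sampleSize){
  extremeEvents <- 0
  for(i in 1:draws){
    marbles <- sample(bag,sampleSize,replace=FALSE)
    if(all(marbles=="red")|all(marbles=="clear")){
      extremeEvents <- extremeEvents + 1
    }
  }
  extremeEvents/draws
}

bag <- rep(c("red","clear"),40)
set.seed(42)  # arbitrary seed for reproducibility
extremeProportion(bag,draws=10,sampleSize=7)

# Sanity check: a sample of size 1 is always all one color
extremeProportion(bag,draws=10,sampleSize=1)
```

The last call returns 1, since a single marble is trivially all red or all clear.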
sum(marbles=="red")
Now, instead of simulating the results for a single group, simulate the results for multiple groups and look at the distribution of extreme events (i.e. extremeEvents). This can be accomplished by nesting the for loop above in another for loop and saving the results from each simulation.
simulations <- 10000
draws <- 10
sampleSize <- 4
extremeEventsVect <- rep(NA,simulations)
for(j in 1:simulations){
  for(i in 1:draws){
  }
}
To assign a value to location j in a vector you can use: extremeEventsVect[j] <- value.
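As a small illustration of this pattern, the sketch below fills a pre-allocated vector one slot at a time. It stores the number of red marbles in each sample rather than the extreme-event proportion, and the name redCounts, the seed and the choice of 5 simulations are all ours.

```r
set.seed(3)  # arbitrary seed for reproducibility
bag <- rep(c("red","clear"),40)
simulations <- 5

# Pre-allocate with NA so that unfilled slots are easy to spot
redCounts <- rep(NA,simulations)

for(j in 1:simulations){
  marbles <- sample(bag,4,replace=FALSE)
  # Store the number of red marbles from simulation j in slot j
  redCounts[j] <- sum(marbles=="red")
}
redCounts
```

After the loop every slot holds a count between 0 and 4, and no NA values remain.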
r
) to repeat the simulation above in question 3.