Introduction
The term "simulation" refers to the use of a model to imitate a system in action — to perform what is sometimes called "what if" analysis. Monte Carlo simulation simply means using randomly generated values for uncertain variables in such a model.
At each step in a calculation we basically roll the dice to see what happens, and we repeat this many times to generate a distribution of possible scenarios and a sense of the average expected result.
Example: The Emergency Family Housing Agency
You have been given the task of analyzing a program that provides stable housing for homeless families. The protocol followed by the program involves an intake process that can take a few days, during which families are put up at a nearby shelter/hotel at the expense of the program. Once the intake process has been completed, the families are matched with resources and an apartment (and so do not require further resources from your organization). The intake and resource matching processes are relatively labor intensive: one staff member can generally handle only one family per week. The organization has been collecting data for the last two years on how many families show up needing services each week.
Your job is to provide the organization with forward-looking estimates of how many staff members they should have on the payroll and how much needs to be in the budget to cover the shelter/hotel costs.
As noted above, the typical approach in Monte Carlo simulation is to "run" the simulation many times and then look at the pattern of outcomes. In each run there is some event that can happen in different ways; which way it happens in a given run depends on a "roll of the dice."
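To make the "run it many times" idea concrete, here is a minimal sketch for the housing agency. The weekly counts in `HISTORY` are hypothetical stand-ins for the agency's two years of data, and the function names are made up for illustration. Each simulated week draws its number of arriving families at random from the historical counts, and each run covers one year. The sketch ignores backlog: families not served in one week are not carried over to the next.

```python
import random

# HYPOTHETICAL weekly arrival counts; the agency's real two-year record would go here.
HISTORY = [3, 5, 4, 6, 2, 4, 5, 7, 3, 4, 6, 5]

def simulate_year(rng, history, weeks=52):
    """One run: each week's arrivals are drawn at random from the history."""
    return [rng.choice(history) for _ in range(weeks)]

def weeks_over_capacity(arrivals, staff):
    """Count weeks with more arriving families than staff (one family per staff member)."""
    return sum(1 for families in arrivals if families > staff)

def average_shortfall_weeks(history, staff, runs=1000, seed=0):
    """Average, over many runs, of the number of weeks per year that exceed capacity."""
    rng = random.Random(seed)
    total = sum(weeks_over_capacity(simulate_year(rng, history), staff)
                for _ in range(runs))
    return total / runs

# Compare staffing levels by how often each one falls short in a simulated year.
for staff in range(min(HISTORY), max(HISTORY) + 1):
    avg = average_shortfall_weeks(HISTORY, staff)
    print(f"{staff} staff: about {avg:.1f} weeks per year over capacity")
```

Looking at how the shortfall changes as staff are added is one way to read the "pattern of outcomes." A fuller model would also track how many nights each family spends in the shelter/hotel.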
Not JUST Rolling the Dice
Here we (re-)introduce the idea of a distribution from which a value is chosen at random: uniform distribution, normal distribution, "custom" distribution, Poisson distribution. Here are a few distribution shapes (note that what is here called "rectangular" we call "uniform").
Remember all the examples of an urn of balls of different colors: the likelihood of a given event, say, "picking a red ball," depended on the proportion of red balls in the urn. That is, we are picking a ball AT RANDOM, but the probability of a given color depends on the distribution of colored balls in the urn.
Example. My MP3 player has three Beatles albums (32 songs), two Talking Heads (21 songs), and four Jimmy Buffett albums (47 songs). What does the distribution of songs look like?
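One way to answer the question is to turn the song counts into proportions. The sketch below does that arithmetic and then picks a song's artist at random using those proportions as weights. The variable and function names are illustrative only.

```python
import random

# Song counts from the example above.
songs = {"Beatles": 32, "Talking Heads": 21, "Jimmy Buffett": 47}

total = sum(songs.values())  # 100 songs in all
distribution = {artist: count / total for artist, count in songs.items()}

for artist, share in distribution.items():
    print(f"{artist}: {share:.2f}")

def pick_artist(rng):
    """Pick the artist of one randomly chosen song, weighted by song counts."""
    artists = list(songs)
    weights = [songs[a] for a in artists]
    return rng.choices(artists, weights=weights, k=1)[0]

print(pick_artist(random.Random()))
```

With 100 songs in total, the shares are simply 0.32, 0.21, and 0.47, so a randomly picked song is a Jimmy Buffett song a little less than half the time.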
Suppose, for example, that we are simulating a party to which we will invite people who like punk and people who like hip-hop. We plan on a random playlist on our i-device. Research has shown that the more songs in a row of the type of music you don't like, the more likely you are to leave a party. How long will the party last?
One part of the simulation is going to be to generate the next song. If songs are picked at random, whether it is hip-hop or punk depends on how many of each there are on our playlist.
Another way to say that is that it depends on the DISTRIBUTION of songs on the playlist — how many of each type. Since the songs are picked at random, the order on the device does not matter. If there are 40% punk and 60% hip-hop, then the next song always has a 40% chance of being punk.
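As a rough sketch of the "generate the next song" step, the code below draws each song independently with a 40% chance of punk, then looks at the longest streak of each genre. The playlist length of 1000 and the helper names are assumptions chosen for illustration. How streak length affects guests leaving is not modeled here.

```python
import random

def next_song(rng, p_punk=0.4):
    """Return the genre of the next randomly chosen song."""
    return "punk" if rng.random() < p_punk else "hip-hop"

def longest_run(playlist, genre):
    """Length of the longest unbroken streak of `genre` in the playlist."""
    best = current = 0
    for song in playlist:
        current = current + 1 if song == genre else 0
        best = max(best, current)
    return best

rng = random.Random(42)  # fixed seed so the run is repeatable
playlist = [next_song(rng) for _ in range(1000)]

share_punk = playlist.count("punk") / len(playlist)
print(f"share of punk: {share_punk:.2f}")
print("longest punk streak:", longest_run(playlist, "punk"))
print("longest hip-hop streak:", longest_run(playlist, "hip-hop"))
```

Because every draw uses the same 40/60 split, the share of punk settles near 0.4 over a long playlist, even though individual streaks can be surprisingly long.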
Example 2. Suppose I have a class made up of three types of students. One type is always on time. Another type is quite likely to be late. Over the course of the semester, what patterns can I expect? Let's assume there are 4 students in each group.
Suppose the distribution of lateness for group 2 is this:

| minutes late | probability | cumulative probability |
|---|---|---|
| 0 | 0.1 | 0.1 |
| 3 | 0.3 | 0.4 |
| 6 | 0.4 | 0.8 |
| 9 | 0.2 | 1.0 |

Let's start by simulating one student. Suppose we have a spinner like this with areas of each sector proportional to the probabilities in the table:
Instead of a spinner, we can use a random number chart. What we are going to do is generate (pick) a random number, and if it is less than 0.1 we will say the student is on time. If it is at least 0.1 but less than 0.4 we will say she is 3 minutes late; at least 0.4 but less than 0.8, 6 minutes late; and 0.8 or greater, 9 minutes late.
If we examine the probabilities in order we can ask "is the random number less than 0.1?" and if it is not we can then ask "is the random number less than 0.4?" and so on. Let's try it.
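The "ask in order" procedure can be sketched as a short function that walks down the cumulative probabilities and stops at the first threshold the random number falls below. The names `CUMULATIVE` and `minutes_late` are made up for this sketch.

```python
import random

# (minutes late, cumulative probability) for group 2.
CUMULATIVE = [(0, 0.1), (3, 0.4), (6, 0.8), (9, 1.0)]

def minutes_late(u):
    """Map a random number u in [0, 1) to minutes late by checking thresholds in order."""
    for minutes, threshold in CUMULATIVE:
        if u < threshold:
            return minutes
    return CUMULATIVE[-1][0]  # only reached if u is 1.0 or more

print(minutes_late(0.05))             # below 0.1, so on time
print(minutes_late(0.25))             # not below 0.1, but below 0.4
print(minutes_late(random.random()))  # one simulated day
```

Checking the thresholds in increasing order is what lets each question be a simple "less than" test: by the time we ask about 0.4, we already know the number is at least 0.1.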
Grab a random number chart. Find a random starting point (how about the row and column of your birthday?). Now read the first digit. We'll treat it as if there is a decimal point in front. What is the number? What is the corresponding amount of tardiness on day 1? Now repeat for the next random number. And so on. Keep two weeks' worth of data. Then arrange the results in a frequency table and draw a histogram.
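For anyone who wants to check a hand tally, here is a small sketch of the same exercise. It assumes that two weeks means ten class days, and it uses a seeded random generator in place of a starting point on the chart. Both are assumptions for illustration, as are the names used.

```python
import random
from collections import Counter

def minutes_late_from_digit(digit):
    """Read digit d as 0.d: 0 -> on time, 1-3 -> 3 min, 4-7 -> 6 min, 8-9 -> 9 min."""
    if digit < 1:
        return 0
    if digit < 4:
        return 3
    if digit < 8:
        return 6
    return 9

rng = random.Random(2024)  # stands in for a starting point on a random number chart
DAYS = 10                  # ASSUMPTION: two weeks of class taken as ten class days

digits = [rng.randrange(10) for _ in range(DAYS)]
record = [minutes_late_from_digit(d) for d in digits]

# Frequency table with a text histogram, one row per possible outcome.
frequency = Counter(record)
print("minutes late | days | histogram")
for minutes in (0, 3, 6, 9):
    count = frequency[minutes]
    print(f"{minutes:>12} | {count:>4} | {'#' * count}")
```

With only ten days the histogram will usually look lumpy compared with the table of probabilities. Rerunning with a larger `DAYS` shows the shape settling toward 0.1, 0.3, 0.4, 0.2.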
Summary and Example for Lab
This is the same fundamental approach we saw with stock and flow: the system is modeled as a discrete process in which, at each "tick" of the clock, the variables of the system update. The added feature is that some of the variables are random numbers.
References and Readings
Microsoft. Introduction to Monte Carlo simulation
Describes technique and how to implement in Excel. Includes downloadable spreadsheet and problem set.
Sedgewick & Wayne. Monte Carlo Simulation, Section 9.8 in Introduction to Programming in Java.
Wolfram Mathworld. Monte Carlo Method
Herbold, Keith D. 2000. "Using Monte Carlo Simulation for Pavement Cost Analysis." Public Roads, Nov/Dec 2000, Vol. 64, No. 3
Zhang, Junfu. 2003. "Revisiting Residential Segregation by Income: A Monte Carlo Test." International Journal of Business and Economics, Vol. 2, No. 1, 27-37
Distribution Chart
Denton Peterson, et al. 1990. "Monte Carlo Simulation of HIV Infection in an Intravenous Drug User Community." Journal of Acquired Immune Deficiency Syndromes 3:1086-1095.
Glossary
- deterministic
- Said of a process when there are contingencies but these depend only on measurable or observable conditions rather than on chance. Opposite, obviously, of non-deterministic, but also, sometimes, of "stochastic" or "probabilistic."