Hello all - need some help!

#1
Hey everyone.

I never would have thought I would be in a statistical forum before, but in the search for answers here I am!

Why I am here... I work in the field of risk, specifically safety risk. I have found that when I present risk to the people who need to understand it and make decisions, the typical methods are overly simplistic and subjective. A colleague suggested that Monte Carlo simulation may be a way to show the range of scenarios and their probabilities.

In plain English that is what I would like to demonstrate. The risk of you being hit by a falling tree while at work is between X and Y and the outcome likelihood between first aid and death is A to B.

So my question is whether Monte Carlo is the best way forward, and how I would even begin to compile what I need to run the simulation.

The things I have or could have at my disposal are...
Subjective risk assessments in a typical heat map.
Incident data for if and how often these events occur and what the outcome was.
Some human resource data such as number of people, hours worked and the like to maybe identify exposure.

Any help is much appreciated!
M
 

fed2

Active Member
#2
overly simplistic and subjective
No such thing. I feel like when you're trying to 'boil down the options' for someone, this would be exactly what you want to do. Chances are it's just in one ear and out the other if you say a bunch of techno blah blah, and they will probably appreciate you just giving them some simple options.

The risk of you being hit by a falling tree while at work is between X and Y and the outcome likelihood between first aid and death is A to B.
I don't know if Monte Carlo is it, so much as you would need some data to run stats on. Monte Carlo is just a generic way of referring to a simulation, so I guess the question is: what are you simulating, and to what end? What are the inputs to the simulation?
 
#3
Hey,

Sorry, maybe I didn't explain things well enough... When you manage risk, it's not good enough to provide simplified information, because leadership needs to understand as many scenarios as possible to make decisions.

For example, in many places where risk is managed in simplistic and overly subjective ways, they would say things like...

The risk of getting into a motor vehicle accident and dying is low. That statement alone will drive behaviours and decision making. I would like to provide a wider lens, e.g.

Motor vehicle accidents
No incident - 80%
Incident with first aid only - 5%
Incident with temporary disability - 2%
Incident with permanent disability - 3%
Incident with fatality - 10%

Or

Motor vehicle accidents
Collision with other motorists - 25%
Collision with stationary fixed items - 35%
Loss of control with no collision - 40%

M
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
Yeah, Monte Carlo simulations directed by existing data and assumptions are likely what you need, and they help account for doubt. Not sure if actuaries use these much, but I know the business sector does. I don't have a good book recommendation, but I would start simple and build up. This can be done in Excel, Excel add-ins or R.

I took a class once that used Statistics, Data Analysis, and Decision Modeling by James Evans. I am not saying it was a good book, but similar books may give you the initial steps. If you find better resources, please share.

Once you figure this out the next step is usually agent-based models which drill down to levels of interactions.

PS, welcome to the forum. I would also get familiar with playing around with your own data, all of the time. I would be happy to help once you get going.
 
#6
OK, so let's get to simulatin'. You got some data, or will it be whole-cloth fabrication?
Hey,

So I can source some data, as I mentioned in my original post, but part of my question is: what data or variables would I need to produce, say, the probabilities of those incident outcomes (death, serious injury, etc.)?

Some of my "data" will blur the lines of objective and subjective... Especially for the extreme rare events (which are probably the ones I am most interested in).

I guess I am not looking for someone to give me the answer, more to teach me how to fish/think.

M
 

katxt

Active Member
#7
I have an Excel workbook which is a simple Monte Carlo spreadsheet, and an associated paper which goes through some of the applications. Let me know if you are interested and I'll post it. The original paper is still on the net but the spreadsheet seems to have disappeared. I like Excel because you can see what is going on and it is easy to program. It is slower than R but if you are prepared to wait 20 seconds rather than 2 it's very useful. kat
 
#8
Yes please!

Can you attach it here, or would you need an email address?

M
 
#11
Wow. That was quick. Don't worry about the coding. Section 6.1 might be the sort of thing you're after.
I think I am slowly getting closer to understanding what I need... Let me know if I am understanding...

I am dealing with non-numerical variables for the most part. So with my above scenario of a motor vehicle accident, if I assign a number to each value, would I then run the simulation to see the frequency?

e.g.

Least likely - First aid treatment = 0
Likely - Temporary Disability = 1
Most likely - Fatality = 2

Or am I way off.... keeping in mind I have ZERO statistical education LOL
 

katxt

Active Member
#12
It's hard to say without knowing more, but I imagine something like associating a frequency to each outcome. You need a way of picking each case at the appropriate frequency.
Each outcome has a cost of some sort which can be translated into cash, say (the universal measure). The cost for each outcome can vary so perhaps Temporary Disability may range from $2000 to $500 000, but mostly about $20 000. You need data or expert opinion to decide this distribution. And so on for the other possibilities.
Then you run the simulation. Each iteration is a plausible scenario. From the outputs you can get the most likely cost, and things like the probability of having to pay out more than a certain amount.
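
A minimal Python sketch of the idea above, in case it helps to see it run. Almost everything in it is an assumption for illustration: the two-outcome split (90% / 10%), the first aid cost range, the $100,000 threshold and the names `outcomes` and `simulate_incident_cost` are hypothetical. Only the Temporary Disability range ($2000 to $500 000, mostly about $20 000) comes from the post. Costs are drawn from a triangular distribution, which is one common choice and not the only one.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# outcome: (assumed frequency, (low, most likely, high) cost in dollars)
# All figures are illustrative placeholders, not real data.
outcomes = {
    "first aid": (0.90, (100, 300, 1_000)),
    "temporary disability": (0.10, (2_000, 20_000, 500_000)),
}
names = list(outcomes)
weights = [outcomes[name][0] for name in names]


def simulate_incident_cost():
    """One iteration: pick an outcome at its frequency, then draw a cost."""
    name = rng.choices(names, weights=weights, k=1)[0]
    low, likely, high = outcomes[name][1]
    # Note the argument order: random.triangular(low, high, mode)
    return rng.triangular(low, high, likely)


costs = [simulate_incident_cost() for _ in range(100_000)]

threshold = 100_000  # hypothetical payout level of interest
p_exceed = sum(cost > threshold for cost in costs) / len(costs)
mean_cost = sum(costs) / len(costs)

print(f"Mean cost per incident: ${mean_cost:,.0f}")
print(f"P(cost > ${threshold:,}): {p_exceed:.3f}")
```

Each entry in `costs` is one plausible scenario. Summaries such as the share of scenarios above a threshold are read straight off that list.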
 
#13
Ok, so if I wanted to figure out the probability of each of the different "outcomes", I would need to assign some range of values:

Low, Likely, High

First aid - 100, 300, 1000
Temp Disability - 2000, 20000, 500000
Perm Disability - 200000, 500000, 5000000
Fatality - 750000, 3000000, 15000000

Then I could just "translate" the "amount" to a probability, as I am not trying to show a financial cost to the organisation. I am trying to show the likely outcomes, to highlight the severity of a risk event and the possible risks to human life.

P.S. I would probably need to reverse these numbers though, wouldn't I, as this would make fatality a "more likely" outcome?
 

katxt

Active Member
#14
I think we are on different paths. There is no way to decide the likelihood of each of the four cases using just Monte Carlo. These probabilities can only be estimated from historical company data, or from the experience of similar companies, or from government departments, or perhaps from the opinion of experts. Perhaps you decide that 97% need first aid, 2% are temporary, 0.3% are permanent and 0.7% are fatal.
Then you can go further, if you want to, using Monte Carlo. What I outlined above is what can be added to quantify in some way the overall range of risk. This might be useful, say, in negotiating an insurance scheme.
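
To make the first step concrete, here is a rough Python sketch of estimating the outcome shares from incident records. The counts are invented so that they reproduce the example percentages above; they are not real data, and the names `incident_counts` and `shares` are made up for the example. The resampling at the end (a simple bootstrap, added here as an illustration rather than something proposed in the post) gives a feel for how uncertain a share is when it rests on only a handful of events.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical incident counts by outcome (invented for illustration)
incident_counts = {
    "first aid": 970,
    "temporary disability": 20,
    "permanent disability": 3,
    "fatality": 7,
}
total = sum(incident_counts.values())
shares = {name: count / total for name, count in incident_counts.items()}

for name, share in shares.items():
    print(f"{name:22s} {share:.1%}")

# Bootstrap: resample the incident records with replacement many times
# and see how much the fatality share moves around.
records = [name for name, count in incident_counts.items() for _ in range(count)]
fatality_shares = sorted(
    rng.choices(records, k=total).count("fatality") / total
    for _ in range(1_000)
)
# Roughly the 5th and 95th percentiles of the resampled shares
low, high = fatality_shares[50], fatality_shares[949]
print(f"fatality share: {shares['fatality']:.1%}, "
      f"plausible range roughly {low:.1%} to {high:.1%}")
```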
 
#15
So Monte Carlo is only validating things that are already known or suspected?

So let's say I examined the company data and it showed that

97% of vehicle incidents result in first aid
2% Temp
0.3% Perm
0.7% Fatality

If I ran Monte Carlo, would it just spit out those same figures? Or is there something I am missing, where "the law of something or other" would actually make the results

85% of vehicle incidents result in first aid
7% Temp
3% Perm
5% Fatality

I really appreciate your help on this by the way!
 

katxt

Active Member
#16
Yes, you would just get those figures back.
Imagine that you are considering a scenario where you are calculating something. The answer (output) may have some uncertainty because the figures used to calculate the outcome (inputs) are themselves uncertain. Monte Carlo is a way of modelling the uncertainty in your inputs to estimate the uncertainty in the output. It does this by generating many plausible sets of inputs and recording the matching plausible outputs.
There is no law of something or other. The idea is simple and common sense once you have seen it (like so many things, I guess.)
Your figures above could be one input in estimating something more complicated and could even incorporate uncertainties in those figures. kat
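
A quick Python sketch of the "you would just get those figures back" point. It draws a large number of simulated incidents using the four example percentages as fixed inputs and tallies the outcomes. The variable names and the number of draws are arbitrary choices for the illustration.

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed so the run is repeatable

# The example percentages from the thread, used as fixed inputs
probs = {
    "first aid": 0.97,
    "temporary disability": 0.02,
    "permanent disability": 0.003,
    "fatality": 0.007,
}

n = 200_000
draws = rng.choices(list(probs), weights=list(probs.values()), k=n)
counts = Counter(draws)
simulated = {name: counts[name] / n for name in probs}

for name in probs:
    print(f"{name:22s} input {probs[name]:.2%}   simulated {simulated[name]:.2%}")
```

The simulated shares differ from the inputs only by random noise, which shrinks as `n` grows. The simulation becomes informative once the inputs themselves are uncertain or are combined into some other quantity.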
 
#17
Hey,

Ok so maybe I am thinking about this from the wrong point...

So using what you said I want to calculate the probability of those "outcomes" and they are uncertain... I need to think about the precursors to the output and look at those values/uncertainties?

fake stats but for e.g.
Employees driving cars - between 100 and 200
MVAs / year - between 15 and 30
MVAs that result in an injury - 60%
97% of vehicle incidents result in first aid
2% Temp
0.3% Perm
0.7% Fatality

This would then adjust the outcome amounts, as there is some uncertainty about how many are using a company car for work purposes and some uncertainty about the MVAs per year?

Am I closer now?
 

katxt

Active Member
#18
Yes, closer I think. Can we first try an actual calculation using particular figures to get a point estimate of the number of deaths per year?
MVAs/year = 20. % that result in an injury = 60%. % fatalities = 0.7%. Fatalities = 20 x 60% x 0.7% = 0.084/year, or about 1 every 12 years.
Now we try to estimate how accurate our inputs are.
MVAs/year: between 15 and 35, say, most likely about 25, according to 10 years of workshop records.
% that result in an injury: between 50 and 70%, according to the company nurse.
% fatalities: between 0.2 and 1.0%, depending on which source you use.
These input ranges are put into the model in place of the original single estimates, with suitable distributions, and we get something like one fatality every 8 to 15 years (just guessing).
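
The two steps above can be sketched in Python as follows. The point estimate uses the single figures from the post. For the simulation, the choice of distributions is an assumption: triangular for MVAs/year (15, 25, 35) and uniform for the two percentages, since the post gives only ranges for those. The function name `simulate_rate` is made up for the example.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Step 1: point estimate from the single figures
point_estimate = 20 * 0.60 * 0.007  # expected fatalities per year
print(f"Point estimate: {point_estimate:.3f} per year, "
      f"about 1 every {1 / point_estimate:.0f} years")


# Step 2: replace each single figure with a range and a distribution
def simulate_rate():
    """One plausible scenario for expected fatalities per year."""
    mvas = rng.triangular(15, 35, 25)        # argument order: low, high, mode
    injury_share = rng.uniform(0.50, 0.70)   # assumed uniform over the range
    fatal_share = rng.uniform(0.002, 0.010)  # assumed uniform over the range
    return mvas * injury_share * fatal_share


rates = [simulate_rate() for _ in range(100_000)]
cuts = statistics.quantiles(rates, n=20)  # 19 cut points: 5%, 10%, ..., 95%
p5, p95 = cuts[0], cuts[-1]
mean_rate = statistics.fmean(rates)

print(f"Simulated mean: {mean_rate:.3f} per year")
print(f"90% of scenarios fall between {p5:.3f} and {p95:.3f} per year,")
print(f"i.e. roughly one fatality every {1 / p95:.0f} to {1 / p5:.0f} years")
```

The width of the resulting range depends heavily on the assumed distributions, so it is worth rerunning with different shapes to see how sensitive the answer is.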
 
#19
Ok

That makes sense now. When you say suitable distributions, what does that mean?
 

katxt

Active Member
#20
Ideally, the random inputs generated will match reality. If the average number of accidents per year is very unlikely to be less than 15 in real life, then the number generated in the simulation should be very unlikely to be less than 15. Sometimes you know that the distribution may be normal with known mean and SD. If you don't know the exact distribution, then a common compromise is the triangular. You set the minimum, maximum and most likely values. Then the simulation picks a random input from that range, most often near the most likely value. So you might, after due consideration and consultation with experts, decide that the average MVA/year is probably about 25 but, because of our lack of hard data, could possibly be as low as 20, say, or as high as 30. We could then use a triangular 20, 25, 30, and plausible scenarios would have MVA/y mainly around 25 but going as low as 20 and as high as 30. You are in effect modelling your ignorance of the real value.
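
For anyone who wants to see what a triangular 20, 25, 30 looks like, here is a small Python sketch using the standard library. The sample size, the 23 to 27 band and the text histogram are arbitrary choices for the illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable

# random.triangular takes its arguments in the order (low, high, mode)
samples = [rng.triangular(20, 30, 25) for _ in range(100_000)]

mean_mva = sum(samples) / len(samples)
share_near_mode = sum(23 <= s <= 27 for s in samples) / len(samples)

print(f"min {min(samples):.2f}, max {max(samples):.2f}, mean {mean_mva:.2f}")
print(f"share of scenarios between 23 and 27 MVA/y: {share_near_mode:.0%}")

# Crude text histogram: one row per unit-wide band, one '#' per 1000 samples
for lower in range(20, 30):
    count = sum(lower <= s < lower + 1 for s in samples)
    print(f"{lower}-{lower + 1}: {'#' * (count // 1000)}")
```

The histogram rises in a straight line to the most likely value and falls away in a straight line after it, with nothing generated outside the minimum and maximum.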