Ambulance utilisation: help choosing a method


I need some advice on selecting a statistical test, please.

I am researching ambulance utilisation rates at a population level, with populations defined by geographical area.

For each geographical area I have the dependent variable: the number of emergency calls made. I also have the population size, so I can standardise across areas by computing a calls per 1,000 population figure.
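
A minimal sketch of that standardisation, in case it helps to see the arithmetic. The area names and population sizes are made up for illustration; only the "calls per 1,000 population" idea comes from the post.

```python
def calls_per_1000(calls: int, population: int) -> float:
    """Crude call rate per 1,000 residents for a single area."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * calls / population


# Hypothetical areas: (name, annual calls, resident population).
areas = [("area_A", 100, 5000), ("area_B", 5000, 10000)]

# One standardised rate per area, comparable across areas of different size.
rates = {name: calls_per_1000(calls, pop) for name, calls, pop in areas}
print(rates)
```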

I am interested in understanding how a number of independent variables, taken from the census for each geographical area, relate to the ambulance call rate.
The independent variables, which are all categorical, are:
General Health

The variables are likely to interact with each other (possibly confounding).

I am interested in understanding:
1. the interaction of the variables in explaining the utilisation rate

2. where populations are similar, i.e. the independent variables are the same, how much variation remains

Any advice on which methods to look into would be helpful.

Many Thanks



TS Contributor
"For each geographical area"
How many are there in your sample?
"Any advice on which methods to look into would be helpful."
Multiple linear regression. Whether you want to include interactions between predictors
depends on your theoretical model.
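
To make the suggestion concrete, here is a rough sketch of a multiple linear regression with one interaction term, fitted by solving the normal equations in plain Python. Everything about the data is an assumption for illustration: the two predictors (percentage over 65 and a deprivation score), their values, and the noise-free "true" coefficients are invented. In practice a statistics package would be used, which also gives standard errors.

```python
def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical areas: (percent over 65, deprivation score).
data = [(x1, x2) for x1 in (10, 20, 30) for x2 in (1, 2, 3)]
# Invented, noise-free call rates so the fit can be checked by eye.
y = [50 + 2 * x1 + 3 * x2 + 0.5 * x1 * x2 for x1, x2 in data]

# Design matrix: intercept, two main effects, and their product (the interaction).
X = [[1.0, x1, x2, x1 * x2] for x1, x2 in data]
coefs = fit_ols(X, y)
print(coefs)  # intercept, over-65 effect, deprivation effect, interaction
```

Dropping the last column of `X` gives the model without the interaction, which is the choice the theoretical model has to justify.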

With kind regards

Thanks K.

I was intending to split the area I am studying into 733 geographical areas using a national system (Middle Layer Super Output Areas, MSOAs).
I have the complete data set of emergency calls made, spanning 5 years.
The range per year is from 100 calls per area to 5,000 calls per area.


TS Contributor
If you collapse the data from the 5 years, then multiple linear regression would be an option. If you consider each year separately, then consider multilevel regression (which takes into account that observations are clustered within areas), or maybe generalized estimating equations (GEE).
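
A small sketch of the two data layouts this choice implies. The area labels, year indices, and rates are hypothetical. It only shows the reshaping, not the multilevel or GEE fit itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (area_id, year_index, calls_per_1000).
records = [
    ("A", 1, 20.0), ("A", 2, 22.0), ("A", 3, 24.0),
    ("B", 1, 50.0), ("B", 2, 48.0), ("B", 3, 52.0),
]

by_area = defaultdict(list)
for area, _year, rate in records:
    by_area[area].append(rate)

# Collapsed layout: one row per area, as an ordinary multiple regression expects.
collapsed = {area: mean(rates) for area, rates in by_area.items()}

# Long layout: several rows per area. Rows sharing an area are not independent,
# which is the clustering that multilevel regression or GEE accounts for.
obs_per_area = {area: len(rates) for area, rates in by_area.items()}

print(collapsed, obs_per_area)
```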

With kind regards



Omega Contributor
How will you enter age into your model? Are you able to enter it as a breakdown of age groups?

Your study would fall into an ecological study design, which is good for generating hypotheses but not for testing them. I am guessing you don't know who actually used the ambulances or called (e.g., their demographics).

For each area the number of people is recorded by single year of age, so you could combine them into age groups if needed,
i.e. number of 1 year olds, number of 2 year olds, number of 3 year olds, etc.
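
As a rough sketch, single-year counts could be combined into bands like this. The band boundaries and the flat count of 10 people per age are assumptions for illustration, not anything from the census data.

```python
def band_counts(counts_by_age, bands):
    """Sum single-year-of-age counts into (label, low, high) inclusive bands."""
    return {
        label: sum(n for age, n in counts_by_age.items() if low <= age <= high)
        for label, low, high in bands
    }


# Hypothetical area: 10 residents at every age from 0 to 90.
counts = {age: 10 for age in range(0, 91)}

# Assumed bands; the open-ended top band uses a large upper limit.
bands = [("0-15", 0, 15), ("16-64", 16, 64), ("65+", 65, 200)]

banded = band_counts(counts, bands)
share_65_plus = banded["65+"] / sum(counts.values())
print(banded, round(share_65_plus, 3))
```

Either the banded counts or a share such as `share_65_plus` could then be entered as an area-level predictor.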

I was intending to make inferences at the geographical rather than the individual level, to avoid a possible ecological fallacy, i.e. if you have a population that looks like this (x many over 65s and a deprivation score of x), then you can expect emergency call utilisation of y.

I don't have the individual characteristics of the callers, hence the ecological correlation with the census data for the area.



Omega Contributor
I was just trying to point out that, say, region 1 has 45% elderly and region 2 has 45% elderly, but we have absolutely no idea how age affects ambulance usage. Age may have nothing at all to do with usage, and rurality or urbanization, something not measured, may account for it instead. You can't make any direct links at all, since you may not be accounting for something, and you don't know who actually used the ambulances (e.g., it could be the same person over and over again in one region and every utilization completely unique in another). All kinds of goofy stuff could or could not be happening.

You just need to disclose that if you disseminate any results at all.

Thanks - good point.

So is there a way of minimising the chance of that through the quantity of data, or can you not do that because it will always be an ecological correlation?
i.e. one person over 65 may be responsible for all the calls in an area, but if across 733 geographical areas and over 5 years of ambulance calls the proportion of over 65s living in an area was consistently correlated with call rates, does this improve the chances of it being true?
I could control for more factors like urban/rural etc., but there are lots, so I was restricting them based on previous literature about the types of people who utilise ambulances more.


TS Contributor
If in many areas just a few persons > 65 years are responsible for all calls,
then across areas you will have a low correlation between the proportion (or
number) of persons > 65 and the frequency of emergency calls. And seemingly
that is what you want to find out, whether there's an association? If you want
to develop a model to predict ambulance workload at the area level, then
IMHO individual-level considerations are not necessary in the first step.
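
To illustrate the point about correlation across areas, here is a toy sketch with two invented scenarios over five hypothetical areas: one where the call rate rises with the share of over 65s, and one where the rate is driven by something unrelated to that share (for example a handful of frequent callers). All numbers are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical share of residents over 65 in five areas.
share_over_65 = [0.10, 0.15, 0.20, 0.25, 0.30]

# Scenario 1 (invented): call rate rises steadily with the over-65 share.
rate_tracks_age = [30.0, 40.0, 50.0, 60.0, 70.0]
# Scenario 2 (invented): call rate varies for reasons unrelated to the share.
rate_unrelated = [70.0, 30.0, 50.0, 30.0, 70.0]

r_tracks = pearson(share_over_65, rate_tracks_age)
r_unrelated = pearson(share_over_65, rate_unrelated)
print(round(r_tracks, 3), round(r_unrelated, 3))
```

The area-level correlation is high in the first scenario and close to zero in the second, which is all an area-level analysis can distinguish.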

With kind regards