I haven’t looked at a statistics book for a good 5 years, since leaving uni and moving into the job world. However, I have come across a real-world problem where I feel there is a statistical method that could help me, but after searching high and low I cannot find it.

I’ve looked at using a two-tailed test to determine a confidence interval, but I cannot seem to apply it to my particular real-world example, as there are 4 possible outcomes.

**Real world example:**

We have a set of 9,163 applications which will be transitioned to new environments; there are 4 possible places, or servers, where they might be moved to or ‘land’. Let’s call them servers 0, 1, 2 and 3. Each of the 4 servers has a different support cost associated with it. If too many applications land on a server which is too expensive to maintain, then the transition will be seen as economically unviable. It’s quite costly to assess each application to determine which server it will land on, so we are thinking of taking a sample of applications to determine whether there is a case for continuing with the rest.

We were thinking of assessing 5% of applications (458 applications) and then taking that and extrapolating the results out to apply to the remaining, unassessed applications, to decide whether there is a business case to move them.

**Question**: is a sample of 458 of the 9,163 applications enough to provide statistically relevant results at a confidence level of 95%? In terms of the results, the margin of error or confidence interval should be 2.5%. That is to say, we expect the results of the application assessment to be wrong very rarely.

**Maths orientated example.**

Number of applications = 9,163

Total number of outcomes, or servers they might land on once assessed = 4

Required confidence level = 95%

Margin of error, i.e. the percentage of applications which will be incorrectly assessed = 2.5%

Sample size = ?????
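For reference, the closest thing I have found is the textbook sample-size formula for a single proportion with a finite population correction. Below is a rough sketch of that calculation in Python. It rests on assumptions I am not sure are valid here: it treats each server's share as a separate yes/no proportion, uses the worst case p = 0.5, uses z = 1.96 for 95% confidence, and makes no adjustment for estimating four shares at once. The function names are my own, purely for illustration.

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Half-width of the normal-approximation interval for one proportion,
    with a finite population correction (assumption: simple random sample)."""
    fpc = math.sqrt((N - n) / (N - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc

def required_sample_size(N, e, p=0.5, z=1.96):
    """Smallest n whose margin of error is at most e, under the same assumptions."""
    n0 = z ** 2 * p * (1 - p) / e ** 2          # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / N))   # shrink for a finite population

N = 9163  # total number of applications
print("margin of error with n = 458:", round(margin_of_error(458, N), 4))
print("n needed for a 2.5% margin:", required_sample_size(N, 0.025))
```

If this approach applies, a sample of 458 would give a margin of roughly 4.5 percentage points rather than 2.5, and a 2.5% margin would need something in the region of 1,300 applications. What I cannot tell is whether a single-proportion formula is appropriate when there are 4 possible outcomes.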

**Question**

Is there any help, guidance or insight you can provide in order to help me determine an appropriate sample size which will ensure that I can extrapolate out the results with a high degree of statistical confidence?

Many thanks in advance,