I have run tests with the system 13,000 times and got the following frequencies for the number of defective items in a sample:

x     f
0     64
1     270
2     745
3     1429
4     2001
5     2169
6     1995
7     1635
8     1231
9     716
10    386
11    212
12    86
13    44
14    13
15    0
16    4

This gives a mean of 5.59 and an SD of 2.37 for the number of defects found in a sample. I'm puzzled as to how to use this data to either accept or reject the black box's 99% claim. What hypothesis test should I use? I thought of using a test for the difference of two proportions, i.e. the proportion of samples having one or more defective items is 0.99 according to the black box claim and 0.995 in my testing. However, because I don't know the sample sizes behind the black box figure, I can't calculate the standard deviation.
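For reference, the summary figures quoted here can be reproduced from the frequency table with a short Python sketch. The variable names are my own, and I am assuming the population formula for the SD (with 13,000 runs the n-1 version gives the same value to two decimals):

```python
import math

# Frequency table from the post: number of defective items -> number of runs
freq = {
    0: 64, 1: 270, 2: 745, 3: 1429, 4: 2001, 5: 2169, 6: 1995,
    7: 1635, 8: 1231, 9: 716, 10: 386, 11: 212, 12: 86, 13: 44,
    14: 13, 15: 0, 16: 4,
}

n_runs = sum(freq.values())  # total number of test runs

# Weighted mean and (population) standard deviation of defects per sample
mean = sum(x * f for x, f in freq.items()) / n_runs
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n_runs
sd = math.sqrt(variance)

# Proportion of samples with at least one defective item
p_any_defect = 1 - freq[0] / n_runs

print(f"runs = {n_runs}")
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"proportion with >= 1 defect = {p_any_defect:.3f}")
```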

How can I use this to prove or disprove the claim?

Can I use fewer tests?

Can I put a confidence interval on the 99% claim?
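To make this question concrete, here is a sketch of the kind of interval I have in mind. It puts a 95% interval on my observed proportion rather than on the claim itself, and it assumes the 13,000 runs are independent. The Wilson score interval and the 95% level are my own choices for illustration, not something the black box specifies:

```python
import math
from statistics import NormalDist

n = 13000               # total runs (sum of the frequency table)
zero_defect_runs = 64   # runs with no defective item (x = 0 row)
p_hat = (n - zero_defect_runs) / n  # observed proportion with >= 1 defect

# Two-sided 95% critical value of the standard normal
z = NormalDist().inv_cdf(0.975)

# Wilson score interval for a binomial proportion
denom = 1 + z**2 / n
center = (p_hat + z**2 / (2 * n)) / denom
half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
lo, hi = center - half_width, center + half_width

print(f"p_hat = {p_hat:.4f}")
print(f"95% Wilson interval = ({lo:.4f}, {hi:.4f})")
```

If an interval like this is a valid approach, I would compare it with the claimed 0.99, but I am not sure this is the right way to frame the test.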

Thanks in advance for any enlightenment that can be offered.

PS: Population sizes are very large (hundreds of millions) if this makes any difference.