Sample Size. Binomial

#1
Hello everyone,

I am trying to do an exercise where I have 30 batches of 200 units with different numbers of faulty parts, as you can see in the attached picture.

Could anyone help me calculate the minimum sample size needed to get at least one defective part?

I have calculated the probability of a faulty part, which is: 162 faulty pieces / 6000 units = 0.027

I guess I should use the Bernoulli equation with P(faulty pieces >= 1) = 0.99 or 1 and then solve for the sample size, but I am having problems calculating it with this equation.

Could anyone tell me if I am doing it correctly, or if there is a better way to calculate it?

Thanks
 
Dason

Ambassador to the humans
#2
Can you explain more about what you mean by "minimum sample size in order to get at least one defective part"? Are you picking from the 6000 units directly? Or are you going to be picking new parts and you want to guarantee a faulty part is in there?
 
#3
Yes Dason, I would have to pick new ones and ensure that there is at least 1 defective part in that new sample, with a probability of 99%.
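
For reference, here is a minimal sketch of the calculation as stated so far. It assumes every new part is faulty independently with the same probability p = 0.027, which may not hold for these batches. Under that assumption P(at least 1 faulty in n parts) = 1 - (1 - p)^n, so the smallest n is the ceiling of ln(1 - target) / ln(1 - p). A target of exactly 1 cannot be reached with any finite n, so only 0.99 is used. The function name min_sample_size is just illustrative.

```python
import math

def min_sample_size(p, target):
    """Smallest n with 1 - (1 - p)**n >= target, assuming independent parts."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must be strictly between 0 and 1")
    # Solve (1 - p)**n <= 1 - target for n and round up.
    n = math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
    # Guard against floating-point rounding at the boundary.
    while 1.0 - (1.0 - p) ** n < target:
        n += 1
    return n

p_hat = 162 / 6000                     # 0.027, pooled estimate from the 30 batches
print(min_sample_size(p_hat, 0.99))    # 169
```

With 168 parts the probability is still slightly below 0.99, so 169 is the minimum under this simple model.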
 

Dason

Ambassador to the humans
#4
The "sizes" listed only sum to 5940. Also how were the batches chosen? It doesn't seem like the proportion of faulty parts is distributed the way we would expect if each batch was independent and identically distributed.
 
#5
Sorry, I made a mistake when I created the table. All batch sizes are 200. All batches are manufactured in a row using the same process and machine. I am not even sure if it is possible to calculate that. I am really confused now.
 

Dason

Ambassador to the humans
#6
Well, there is definitely overdispersion. So it seems like there are times when the machine is more likely to spit out faulty parts. That complicates the matter a bit, although we might be able to use something like a beta-binomial model to account for the batch effect if we treated consecutive batches as independent of one another.
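
A rough sketch of what the beta-binomial idea would look like for the sample size question, under stated assumptions: all n sampled parts share one latent fault rate p drawn from Beta(a, b), so P(no faulty part in n) = B(a, b + n) / B(a, b). The sketch ignores that a real batch has only 200 units. The values a = 2.7 and b = 97.3 are purely illustrative (chosen so the mean is 0.027); they are not estimated from the data in the picture, and the function names are made up for this example.

```python
import math

def prob_no_faulty(n, a, b):
    """P(0 faulty in n parts) when p ~ Beta(a, b): B(a, b + n) / B(a, b)."""
    log_p0 = (math.lgamma(b + n) - math.lgamma(b)
              + math.lgamma(a + b) - math.lgamma(a + b + n))
    return math.exp(log_p0)

def min_n_beta_binomial(a, b, target=0.99, max_n=1_000_000):
    """Smallest n with P(at least 1 faulty) >= target, searched up to max_n."""
    for n in range(1, max_n + 1):
        if 1.0 - prob_no_faulty(n, a, b) >= target:
            return n
    raise ValueError("target not reached within max_n parts")

# Illustrative shape parameters only (mean a / (a + b) = 0.027).
print(min_n_beta_binomial(2.7, 97.3))
```

For a fixed mean, more batch-to-batch variation (smaller a + b) pushes the required n above the plain binomial answer, because some batches will have a very low fault rate.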
 
#8
"All batches sizes are 200. All batches are manufactured in a row by using the same process and machine."
Then it sounds like a time series. If one part is faulty then probably the next part is also faulty.

Could one imagine a model where the number of defective parts is Poisson distributed (or binomial, since there is a batch size of 200) with a parameter p?

But suppose the (unobservable, latent) parameter p_t varies over time like an AR(1) (autoregressive) model:

p_t = a*p_(t-1) + eps_t

where eps_t is normal and independent with zero mean.
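
To make the idea concrete, here is a small simulation sketch, not a fitted model. It adds two things that are not in the equation above and are my own assumptions: an intercept c = (1 - a) * mean_p so that p_t fluctuates around 0.027 instead of around zero, and clipping of p_t to [0, 1] so it stays a valid probability. The values of a and sigma are arbitrary.

```python
import random

def simulate_batches(n_batches=30, batch_size=200, mean_p=0.027,
                     a=0.7, sigma=0.01, seed=1):
    """Simulate faulty counts per batch with a latent AR(1) fault rate p_t."""
    rng = random.Random(seed)
    c = (1.0 - a) * mean_p          # assumed intercept: long-run mean of p_t is mean_p
    p = mean_p                      # start the latent rate at its long-run mean
    counts = []
    for _ in range(n_batches):
        p = c + a * p + rng.gauss(0.0, sigma)    # AR(1) step with normal noise
        p = min(max(p, 0.0), 1.0)                # assumed clipping to keep 0 <= p <= 1
        # Binomial draw: each of the batch_size parts is faulty with probability p.
        faulty = sum(rng.random() < p for _ in range(batch_size))
        counts.append(faulty)
    return counts

print(simulate_batches())
```

With a close to 1, runs of bad batches cluster together, which is the kind of pattern one could compare with the observed table.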
 
#9
Hello GretaGarbo,

I don't understand that equation. Could you please explain to me how you would resolve it? Or what is the result so that I can try to do it by myself?

Thank you
 
#10
"I don't understand that equation. Could you please explain to me how you would resolve it? Or what is the result so that I can try to do it by myself?"
I just made that suggestion to see if someone else would agree. (Then maybe software suggestions could come later.) Sometimes a question like this is of interest to other people as well, not just the original poster.

Maybe it is easier if one thinks of it as a mixed model, so that one can "borrow information" from the other batches when estimating each batch's proportion of faults, using a beta distribution as the prior and the binomial as the likelihood. (And maybe this is just what Dason intended above.) I believe there is standard software for this. We don't know what the original poster is using.

(Sorry, I am not writing a textbook of explanations. Let's see if someone agrees.)