Distributing labeled balls in unlabel! Statistics for Biological problem, help needed?

fz2life

New Member
Hello everyone,
I am a biology student and I am stuck on a problem, hence this post. I have a tube containing a large number of viruses of 10 different kinds, and I need to infect cells with them. The infection has to be done so that each cell gets only one virus (this part is not the problem). I know that if I infect just 100 cells, the likelihood of getting at least one virus of each kind is not great. If I instead increase the number of infected cells to, say, 1000, with each cell still infected by only 1 virus, the likelihood of getting at least one virus of each kind increases. Is there a formula that describes this, and perhaps a graphical way to illustrate that increasing the number of infected cells increases the chance of getting each virus at least once?

I think my problem can be generalised as follows: I have loads of balls in 10 different colors and I need to put them into boxes. All boxes are identical and each box can contain only one ball. By increasing the number of filled boxes at the end of the experiment, I can make it more likely that there is at least one ball of each color in a box. I know a bit of R, so it would be helpful if I could create some graphs in R. Can anyone give any input on this? Any help is much appreciated.

Thanks very much

Con-Tester

Member
Re: distributing labeled balls in unlabel! Statistics for Biological problem,help nee

You will also need to know a few other things such as the initial distribution of balls/viruses in the reservoir from which you draw. For example, if ball/virus A occurs 200,000 times as frequently as ball/virus B, there’s obviously a much greater chance of missing ball/virus B. Also, you need to know whether that reservoir has limited capacity or not—that is, every time you draw a ball/virus, whether the proportion of balls/viruses of the same type in the reservoir is significantly affected.
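To get a feel for how much unequal frequencies matter, here is a rough simulation sketch in Python. The function name, the weights, and the trial count are illustrative assumptions, not values from the actual experiment. It assumes the reservoir is large enough that every draw is independent, and it estimates the chance that every type shows up at least once.

```python
import random

def estimate_prob_all_types(weights, draws, trials=20000, seed=0):
    """Monte Carlo estimate of P(every type appears at least once).

    weights: relative frequency of each type in the reservoir (hypothetical values).
    draws:   number of balls/viruses drawn, i.e. number of infected cells.
    Assumes independent draws (an effectively infinite reservoir).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    types = range(len(weights))
    hits = 0
    for _ in range(trials):
        # Draw `draws` types with replacement, weighted by relative frequency.
        seen = set(rng.choices(types, weights=weights, k=draws))
        if len(seen) == len(weights):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    equal = [1] * 10
    skewed = [200000] + [1] * 9  # one type vastly more common than the rest
    for label, w in (("equal", equal), ("skewed", skewed)):
        print(label, estimate_prob_all_types(w, draws=100))
```

With the skewed weights the estimate should come out far lower than with equal weights for the same number of draws, which is the effect described above.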

With the assumptions that (1) each ball/virus type occurs with the same relative frequency, and (2) the reservoir has (practically) infinite capacity, the problem becomes much simpler.

The first thing to realise is that if you’ve got n different ball/virus types, you must draw at least n separate times to have a non-zero probability of drawing one of each of them. I suggest that you start with a simple case of, say, three different ball/virus types using the simplifying assumptions above and compute the probabilities of getting all three in three, four, five, six, etc. draws. Such an exercise will give you insight into the principles involved, which you can then generalise to the case where you have n different ball/virus types and r draws (r ≥ n).
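For checking the hand calculations, one possible sketch of the general case is below, written in Python (the same arithmetic carries over to R). It relies on the two simplifying assumptions above, equal frequencies and an effectively infinite reservoir, and uses the standard inclusion-exclusion count of outcomes that miss at least one type. The function name `prob_all_types` is made up for this illustration.

```python
from math import comb

def prob_all_types(n, r):
    """Exact P(all n types appear at least once in r independent draws).

    Inclusion-exclusion: sum over k of (-1)^k * C(n, k) * ((n - k) / n)^r,
    where k counts the types assumed to be missing.
    Assumes equal frequencies and an effectively infinite reservoir.
    """
    return sum((-1) ** k * comb(n, k) * ((n - k) / n) ** r for k in range(n + 1))

if __name__ == "__main__":
    # The small case suggested above: three types, three to eight draws.
    for r in range(3, 9):
        print(f"n=3, r={r}: {prob_all_types(3, r):.4f}")
    # The ten-type case from the original question.
    for r in (10, 20, 50, 100):
        print(f"n=10, r={r}: {prob_all_types(10, r):.4f}")
```

As a sanity check, for n = 3 and r = 3 the only successful outcomes are the 3! = 6 orderings of the three types out of 27 equally likely sequences, so the function should return 6/27. Plotting the returned values against r gives the kind of curve asked about in the first post.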