# Thread: Estimate number of urns based on drawn balls

1. ## Estimate number of urns based on drawn balls

Assume there are a number of labeled urns, and every urn holds at least one ball. You know there are n balls in total, but you know neither how many urns there are nor how the balls are distributed among them. You can blindly draw a fraction of the balls (let's say 10%) and see which urn each drawn ball came from. How do you estimate how many urns there are in total?

2. ## Re: Estimate number of urns based on drawn balls

I think the question may lack some assumptions/information.

If you assume that each ball is equally likely to have been placed in any of the urns,
then the number of balls observed from each urn follows a multivariate hypergeometric distribution.
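
To make the sampling model concrete, here is a minimal sketch of drawing balls without replacement and counting how many came from each urn. The urn sizes and the helper name `draw_urn_counts` are made up for illustration; in the real problem the sizes are exactly what you do not know.

```python
import random
from collections import Counter

def draw_urn_counts(urn_sizes, n_draws, rng):
    """Draw n_draws balls without replacement; return observed count per urn index."""
    # One list entry per ball, labeled with the index of the urn it sits in.
    balls = [urn for urn, size in enumerate(urn_sizes) for _ in range(size)]
    drawn = rng.sample(balls, n_draws)  # sampling without replacement
    return Counter(drawn)

rng = random.Random(0)
urn_sizes = [50, 30, 15, 4, 1]  # hypothetical sizes, n = 100 balls in total
n_draws = 10                    # 10% of the balls
counts = draw_urn_counts(urn_sizes, n_draws, rng)
observed_urns = len(counts)     # urns that showed up at least once
print(dict(counts), observed_urns)
```

Given the urn sizes K_i, the vector of counts produced this way is multivariate hypergeometric, with expected count n_draws * K_i / n for urn i. Small urns are often missed entirely, which is why the number of observed urns tends to undercount the true number.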

3. ## Re: Estimate number of urns based on drawn balls

To bring some intuition to the problem, the real-life application is the following. Assume you have a number of customers, and each customer can place orders (the number of orders per customer most likely follows a power law). Now you only see a subset of the orders (e.g. 1,000,000 out of 10,000,000 in total). Based on the orders you see, you have to estimate how many customers you have in total.
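
A rough simulation of this setup, just to show the effect. Everything here is an assumption for illustration: the function name, the Pareto shape `alpha`, and the small customer count (much smaller than the real data so it runs quickly).

```python
import random

def simulate_observed_customers(n_customers, alpha, fraction, seed=0):
    """Return (total orders, orders seen, distinct customers seen) for one simulated run."""
    rng = random.Random(seed)
    # Hypothetical heavy-tailed order counts: floor of a Pareto draw,
    # so every customer has at least one order.
    orders_per_customer = [int(rng.paretovariate(alpha)) for _ in range(n_customers)]
    # One entry per order, labeled with the customer who placed it.
    orders = [c for c, k in enumerate(orders_per_customer) for _ in range(k)]
    n_seen = max(1, round(fraction * len(orders)))
    seen = rng.sample(orders, n_seen)  # the visible subset of orders
    return len(orders), n_seen, len(set(seen))

total, n_seen, distinct = simulate_observed_customers(2000, alpha=1.5, fraction=0.1)
print(total, n_seen, distinct)
```

The number of distinct customers in the visible subset is only a lower bound on the true customer count, since customers with few orders are likely to be missing from a 10% sample.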

You might assume that the number of orders per customer follows a power law. But I would actually prefer not to assume a distribution, and instead infer it from the sample and go from there.
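
One standard distribution-free option from the species-richness literature is the Chao1 estimator, which uses only how many urns (customers) were seen exactly once and exactly twice. This is a sketch of that technique, not something derived in this thread; the function name and the toy sample are made up.

```python
from collections import Counter

def chao1(sample_labels):
    """Bias-corrected Chao1 estimate of the number of distinct labels in the population."""
    per_label = Counter(sample_labels)   # observed balls per urn
    freq = Counter(per_label.values())   # freq[k] = number of urns seen exactly k times
    s_obs = len(per_label)               # urns seen at least once
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    # Bias-corrected form, which stays defined when no urn was seen exactly twice.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

sample = ["a", "a", "a", "b", "b", "c", "d", "e"]  # toy sample of urn labels
print(chao1(sample))
```

Keep in mind that Chao1 is usually described as a lower bound on the true count, and it does not use the fact that the total number of balls n is known, so treat the result as a starting point rather than a final answer.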

4. ## Re: Estimate number of urns based on drawn balls

Originally Posted by axs
you have a number of customers, and each customer can make orders
What all determines a customer's decision to make one or more orders?

5. ## Re: Estimate number of urns based on drawn balls

Originally Posted by Outlier
What all determines a customer's decision to make one or more orders?
Since we're talking about real-life data, the answer is: I don't know. I only know that the number of orders per customer follows a power law.

