# Order statistics - German Tank Problem

#### beer4me

##### New Member
Hi everyone,

This problem is simply stated but has me stumped:
In WW2 the Allies noticed that the German tanks had sequential serial numbers on them. They were able to capture some tanks, and note their serial numbers.

The problem is this: given m serial numbers from captured tanks, what is a suitable estimator for N, the total number of German tanks?

I think N is so large and m so small that I can treat the tanks as drawn with replacement (is this assumption reasonable?). In this case, the samples are independent.

Hence the probability of choosing any particular tank is 1/N. The serial numbers follow a discrete uniform distribution on 1, ..., N. The cumulative probability of choosing a tank with a number less than or equal to x is F(x) = x/N.

This is how far I've got in the solution: let the m observed tank serial numbers be x1, x2, ..., xm. Let Xmax be the random variable representing the maximum serial number in the sample, and xmax be the observed maximum. I now try to find the probability distribution of Xmax. Because the samples are independent, the probability that Xmax is less than or equal to any value x is F(x)^m = (x/N)^m (let's call this Fmax(x)).

I can find the probability that Xmax=x by simply taking the difference: P(Xmax=x) = Fmax(x)-Fmax(x-1).
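As a quick check on the two formulas above, here is a minimal Python sketch under the with-replacement assumption. The function names `cdf_max` and `pmf_max` and the values N = 10, m = 3 are made up for illustration. Exact fractions are used so the probabilities can be verified to sum to 1.

```python
from fractions import Fraction

def cdf_max(x, N, m):
    """Fmax(x) = P(Xmax <= x) for m independent draws, uniform on 1..N."""
    x = max(0, min(x, N))          # clamp so the CDF is 0 below 1 and 1 above N
    return Fraction(x, N) ** m

def pmf_max(x, N, m):
    """P(Xmax = x) = Fmax(x) - Fmax(x - 1)."""
    return cdf_max(x, N, m) - cdf_max(x - 1, N, m)

# Illustrative values only: N = 10 tanks, m = 3 captured.
N, m = 10, 3
probs = [pmf_max(x, N, m) for x in range(1, N + 1)]
total = sum(probs)                 # should be exactly 1
print(total, pmf_max(N, N, m))     # prints: 1 271/1000
```

The differences telescope, so the total is Fmax(N) - Fmax(0) = 1 for any N and m.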

Now I don't know how to proceed. Somehow I need an equation that features N and can be rearranged to solve for N. I thought about taking the mean of Xmax, but I'm not sure that would work.

Can anybody please give me a direction for pursuing this problem?

#### beer4me

##### New Member
I thought I should give a bit more info on what I was thinking of doing.
I'm thinking of deriving an expression for the expectation of Xmax and then setting it equal to xmax, the observed maximum, because with only one sample the observed value is the best estimate of the mean. Do you think this idea should work? The only problem is that I get a complicated expression for the expectation, and I don't think I can solve it for N.
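Even if the expectation has no convenient closed form in N, the equation E[Xmax] = xmax can still be solved numerically. The sketch below assumes the with-replacement model from the first post and uses the identity E[Xmax] = sum over x = 0..N-1 of P(Xmax > x) = N - sum over x = 1..N-1 of (x/N)^m. The names `expected_max` and `solve_for_N` and the sample values are hypothetical. Since E[Xmax] grows with N, a simple upward search is enough.

```python
def expected_max(N, m):
    """E[Xmax] for m independent draws, uniform on 1..N (with replacement)."""
    # E[Xmax] = sum_{x=0}^{N-1} P(Xmax > x) = N - sum_{x=1}^{N-1} (x/N)^m
    return N - sum((x / N) ** m for x in range(1, N))

def solve_for_N(x_max, m):
    """Smallest integer N >= x_max whose E[Xmax] reaches the observed maximum."""
    N = x_max                      # N cannot be smaller than the largest serial seen
    while expected_max(N, m) < x_max:
        N += 1
    return N

# Illustrative values only: largest observed serial 60 among m = 4 tanks.
x_max, m = 60, 4
n_hat = solve_for_N(x_max, m)
print(n_hat, expected_max(n_hat, m))
```

This is only one way to turn the idea into a number. The integer it returns is where the expectation first reaches xmax, so it is a rounded-up solution of the moment equation rather than an exact one.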

#### beer4me

##### New Member
Just an update: through extensive internet searching and piecing information together, I've solved the problem. More details later, but my assumption of sampling with replacement was wrong; it's okay to sample without replacement.

I was on the right track trying to get the expectation and equating that to the observed value.
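The thread does not include the final derivation, so the following is the standard textbook result for sampling without replacement, which may or may not be exactly what was found. In that model E[Xmax] = m(N + 1)/(m + 1), and equating this to the observed xmax gives the estimate N ≈ xmax(1 + 1/m) - 1. The Python sketch below uses hypothetical function names and made-up serial numbers, and adds a small simulation to check that the estimate averages out close to the true N.

```python
import random

def estimate_total(serials):
    """Estimate N from distinct serials drawn without replacement from 1..N."""
    m = len(serials)
    x_max = max(serials)
    # Solve E[Xmax] = m (N + 1) / (m + 1) for N with E[Xmax] replaced by x_max.
    return x_max * (m + 1) / m - 1

def simulate_mean_estimate(N, m, trials, seed=0):
    """Average the estimate over many simulated captures of m tanks out of N."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(range(1, N + 1), m)   # without replacement
        total += estimate_total(sample)
    return total / trials

# Made-up serial numbers, for illustration only.
print(estimate_total([19, 40, 42, 60]))          # prints: 74.0
print(simulate_mean_estimate(300, 5, 20000))     # should land close to 300
```

Any single estimate can be far from N when m is small; the simulation only shows that the estimates are centred on the true value.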