# Sample design problem

#### Shotokan

##### New Member
A healthcare provider has been paid on N claims for certain health services, some of which may have been unauthorized. I must draw a sample of n claims to estimate the total payments to this provider for unauthorized claims:
$$Y = \sum_{k=1}^{N} I_k X_k$$
where $I_k = 0$ if claim $k$ is authorized and $I_k = 1$ if it is not, and $X_k$ is the (known) payment for claim $k$.
I also must calculate a 90% one-sided (lower) confidence limit for my estimate of Y. Conventionally, this 90% confidence limit is the amount that the provider is asked to repay.
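
To make the target concrete, here is a minimal sketch of the estimate and its one-sided lower limit under plain simple random sampling without replacement, using the expansion estimator and a normal approximation. This is only a baseline for comparison, not a recommended design. The function name `srs_lower_limit` and its arguments are hypothetical, and the `unauthorized` list stands in for audit results that in practice would be known only for the sampled claims. The normal approximation may be poor when few sampled claims turn out to be unauthorized.

```python
import random
from statistics import NormalDist, variance


def srs_lower_limit(payments, unauthorized, n, confidence=0.90, seed=0):
    """Expansion estimate of Y and a one-sided lower limit (hypothetical helper).

    Assumes simple random sampling without replacement and a normal
    approximation. `unauthorized[k]` plays the role of I(k); it is only
    read for the sampled claims.
    """
    N = len(payments)
    rng = random.Random(seed)
    sample = rng.sample(range(N), n)  # n distinct claim indices

    # z_k = I(k) * X(k): unauthorized payment on each sampled claim
    z = [unauthorized[k] * payments[k] for k in sample]

    # Expansion estimator: scale the sample mean up to the population
    y_hat = N * sum(z) / n

    # Standard error with the finite population correction (1 - n/N)
    se = N * ((1 - n / N) * variance(z) / n) ** 0.5

    # One-sided lower limit: subtract the normal quantile times the SE
    lower = y_hat - NormalDist().inv_cdf(confidence) * se
    return y_hat, lower
```

With a sample of all N claims the correction factor is zero, so the estimate and the lower limit both equal Y exactly, which is the finite-population behavior I would expect any design to reproduce.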

I am struggling because this is a finite-population problem and payment varies substantially across the N claims, so I think I should be sampling with probability proportional to payment (probability proportional to size). For example, 25% of claims might account for 75% of total payments; if the risk that a claim is unauthorized is constant and independent of payment, those same claims would also account for 75% of unauthorized payments, on average. Including the highest-payment claims in my sample will reduce the error in my estimate of Y, because the error variance is an increasing function of $\sum X_k^2$ taken over the unsampled claims (the total unauthorized payment for the sampled claims will be known with certainty).
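
One textbook way to sample proportional to payment is to draw claims with replacement and use the Hansen-Hurwitz estimator. The sketch below illustrates that option only. It is not the without-replacement design described above, where sampled claims are known with certainty, so it has no finite population correction. The name `pps_lower_limit` and its arguments are hypothetical, and `unauthorized` again stands in for audit findings on the drawn claims. With draw probability p(k) = X(k) / total payments, each drawn claim contributes I(k) * X(k) / p(k) = I(k) * total, so the estimate reduces to total payments times the fraction of draws found unauthorized.

```python
import random
from statistics import NormalDist, variance


def pps_lower_limit(payments, unauthorized, n, confidence=0.90, seed=0):
    """Hansen-Hurwitz estimate of Y under PPS sampling with replacement.

    Hypothetical helper: draws n claims with probability proportional to
    payment and returns (estimate, one-sided lower limit) using a normal
    approximation.
    """
    total = sum(payments)
    rng = random.Random(seed)

    # Draw with replacement, probability proportional to payment
    draws = rng.choices(range(len(payments)), weights=payments, k=n)

    # z_k / p_k = I(k) * X(k) / (X(k) / total) = I(k) * total
    ratios = [unauthorized[k] * total for k in draws]

    # Hansen-Hurwitz estimator: mean of z_k / p_k over the n draws
    y_hat = sum(ratios) / n

    # Standard error from the sample variance of the ratios
    se = (variance(ratios) / n) ** 0.5

    lower = y_hat - NormalDist().inv_cdf(confidence) * se
    return y_hat, lower
```

Because every ratio is either 0 or the total payment, the variance here depends only on the share of payment dollars that are unauthorized, not on how unevenly payments are spread across claims. That is the appeal of sampling proportional to size when payments are highly skewed.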

I am at a loss to design a sampling plan that will estimate the lower 90% confidence limit with a given level of precision. Any ideas how I should proceed?