Hi all,
I've searched the web and these forums but couldn't find a thread that resembles my problem. If I missed something, I apologize in advance.
My problem is this. I have a probability p, supplied by an outside agency, and a number of successes n, taken from a data source I own. I'm interested in the population total N and in a confidence interval for N.
Background: I know how many passengers a specific bus company carried in 2013, because I have that company's (micro)data. Assumption: this is my n. I also know from transportation figures (from an outside agency) that this bus company had a market share, measured in passengers, of 30% in 2013. Assumption: this is my p. Treating the market share as a probability brings certain other assumptions with it; that is fine.
So now the question is: how many passengers were there in total (assuming this cannot be taken from the transportation figures)? I.e., what is N, and what is the 95% confidence interval around N?
So my questions are:
1) Can I just use the relation p=n/N, so that the point estimate for N is N=n/p? I'd assume so.
2) Can I just use the Clopper-Pearson interval to calculate an interval for p, and then use that to get an interval for N?
3) Suppose the 95% lower bound for p is plb. Is the 95% lower bound for N then n/plb? Same logic for the 95% upper bound.
4) When calculating the Clopper-Pearson interval, can I just use: n=n, p=p, N=n/p?
5) If any of the answers to my questions 2-4 is 'No', then what is the correct approach?
I realize that using this approach, N is not necessarily an integer. Is that problematic?
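To make questions 1-4 concrete, here is a minimal Python sketch of the calculation I have in mind. The passenger count n=300 is made up for illustration (only the 30% share comes from my actual problem), the function names are my own, and the Clopper-Pearson bounds are found by simple bisection on the binomial CDF rather than with a statistics library. Whether this procedure is statistically valid is exactly what I'm asking; the sketch only shows the mechanics.

```python
import math


def binom_cdf(k, m, p):
    """Return P(X <= k) for X ~ Binomial(m, p), summing the pmf in log space."""
    if k < 0:
        return 0.0
    if k >= m:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(m + 1) - math.lgamma(i + 1)
                   - math.lgamma(m - i + 1) + i * log_p + (m - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def bisect_unit(f, target, increasing, iters=60):
    """Solve f(x) = target on [0, 1] by bisection, for a monotone f."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, m, alpha=0.05):
    """Exact (Clopper-Pearson) interval for a proportion: k successes in m trials."""
    lower = 0.0 if k == 0 else bisect_unit(
        lambda p: 1.0 - binom_cdf(k - 1, m, p), alpha / 2, True)
    upper = 1.0 if k == m else bisect_unit(
        lambda p: binom_cdf(k, m, p), alpha / 2, False)
    return lower, upper


n = 300   # hypothetical passenger count for the bus company
p = 0.30  # market share from the outside agency

n_hat = n / p                                    # question 1: point estimate of N
p_lo, p_hi = clopper_pearson(n, round(n_hat))    # question 4: trials = n/p, rounded

# Question 3: n/p shrinks as p grows, so the ends swap when mapping to N.
n_lo, n_hi = n / p_hi, n / p_lo

print(f"N estimate: {n_hat:.1f}")
print(f"p interval: ({p_lo:.4f}, {p_hi:.4f})")
print(f"N interval: ({n_lo:.1f}, {n_hi:.1f})")
```

One thing the sketch makes visible: since N=n/p gets larger as p gets smaller, dividing n by the lower bound of p gives the larger of the two values for N, not the smaller one.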
Thanks for any help,
Jasper