# Random Sample Generation w/out Knowing P

#### stats.notmajor

##### New Member
I'm trying to solve the following problem: I have two decks of cards. One deck has 40 cards and the other has 30. The makeup of both decks is unknown to the tester; however, the tester knows that neither deck contains duplicate cards within itself. The ultimate goal of the test is to determine the percentage of duplicates in the entire population (30 + 40 = 70). Given a confidence level and margin of error, I have been using the following formula to compute the sample size required to estimate the percentage of duplicate cards in the total population:

$$n = \frac{\dfrac{z^2\,p\,(1-p)}{e^2}}{1 + \dfrac{z^2\,p\,(1-p)}{e^2\,N}}$$

where $z$ is the z-score, $e$ is the margin of error, and $N$ is the population size.

How is one to determine the value of p, given the tester does not know the makeup of either deck of cards?

Any clarification would be much appreciated!!
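For reference, the sample-size formula in this post can be written as a short Python sketch. The function name `sample_size` and the example inputs (z of about 1.96 for 95% confidence, a 5% margin of error, N = 70) are assumptions chosen for illustration, not values given in the thread. The loop shows how the result depends on p: since p(1-p) is largest at p = 0.5, that value gives the largest, most conservative sample size, which is why it is a common default when p is unknown.

```python
import math

def sample_size(z, p, margin, population):
    """Sample size for estimating a proportion, with finite population correction."""
    # Sample size if the population were infinite
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction, matching the formula in the post
    return n0 / (1 + n0 / population)

# Illustrative inputs (assumptions): z ~ 1.96, 5% margin of error, N = 70
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    n = sample_size(1.96, p, 0.05, 70)
    print(f"p={p:.1f} -> required sample size {math.ceil(n)}")
```

With a population of only 70, the correction term dominates, so the required sample is a large share of the population for any p that is not close to 0 or 1.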

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Are these samples representative of random samples of standard playing card decks? So 30 cards randomly selected from 52 and 40 cards randomly selected from 52 cards?

#### stats.notmajor

##### New Member
Great question - they are not. Assume the cards are not standard playing cards.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
It feels like there needs to be a superpopulation that both samples are taken from. If there is an infinite number of possibilities, then the probability of overlap is negligible. If there is a finite number of possibilities, then the overlap probability can be high.

Or is the question just about writing out what you would do without knowledge of the number of potential values? You could also solve this for a small finite set (a toy set you create) to ensure you have the equation right, then plug in a placeholder for the unknown.
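The toy-set idea can be sketched as a small simulation. This assumes, purely for illustration, that both decks are drawn uniformly at random and without replacement from a common pool of `pool_size` distinct card values; the pool, the function name `simulate_overlap`, and the pool sizes tried are all hypothetical. Under that assumption the expected number of shared values is 40 × 30 / pool_size, which the simulation should roughly reproduce.

```python
import random

def simulate_overlap(pool_size, deck_a=40, deck_b=30, trials=2000, seed=0):
    """Average number of card values shared by two decks drawn from one pool."""
    rng = random.Random(seed)
    pool = range(pool_size)
    total = 0
    for _ in range(trials):
        # Sampling without replacement: no duplicates inside either deck
        a = set(rng.sample(pool, deck_a))
        b = set(rng.sample(pool, deck_b))
        total += len(a & b)  # values that appear in both decks
    return total / trials

# Hypothetical pool sizes, from a standard deck up to a very large pool
for k in (52, 100, 1000, 100000):
    simulated = simulate_overlap(k)
    expected = 40 * 30 / k
    print(f"pool={k}: simulated overlap {simulated:.2f}, expected {expected:.2f}")
```

Small pools force a large overlap, while the overlap shrinks toward zero as the pool grows, which matches the finite versus infinite distinction above.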

#### stats.notmajor

##### New Member
It is more about writing out what I would do without knowledge of the number of potential values. Are you suggesting I pull an arbitrary sample from each deck, determine how many duplicates those samples contain, and use that information for my value of p?

To clarify: neither of the decks is pulled from a larger population.

Thanks so much!