Thread: This might be a simple statistics question, or a really odd statistics question.

  1. #1

    I am not a statistics person, so I am not sure if this is a statistics type question or a question that has an answer. I am not sure how best to formulate my question, but I will try:

    Let's say that there is a population size that I will call N, and let's say that there is a subset of the population N with an attribute (a) that I will call Na.

    I want to sample the population N with the smallest sample size (Ns) that will likely give me at least one member of the sample that has the attribute a.

    I assume that the answer will depend on the population size, on the percentage of the population that has the attribute a, and on how sure I want to be that my sample has at least one member in it with the attribute a.

    So an example might be as follows:

    Let's say that there are 10^9 people (that's N), and that 10^5 of them have some particular genetic trait a (that's Na).

    Let's say that I want to be 95% sure that my sample Ns includes at least one person with the genetic trait a.

    How many people would I have to test for the genetic trait to be 95% sure that I will find at least one person in my sample with that genetic trait?

    Say that everything is random and there are no sampling errors, keeping it as simple(?) as possible.

  2. #2
    rogojel (TS Contributor)

    hi,
    I would do it this way:

    1. Let q = 1 - Na/N be the probability of NOT seeing the trait when you pick one person.
    2. The probability that none of the n people in a sample has the trait is q^n.
    3. You want to keep this probability at or below a limit L, say 5%, which is equivalent to a probability of at least 95% of having one or more people carrying the trait in your sample, so you can set L = q^n.

    So you just need to solve that equation, which gives n = log L / log q, rounded up to the next whole number.

    Regards
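
The three steps above can be sketched in Python as follows. This is only an illustration under the same assumption of independent draws; the function name `min_sample_size` and its parameter names are made up for this example and do not come from the thread. It also assumes 0 < Na < N and a confidence strictly between 0 and 1.

```python
import math

def min_sample_size(population, with_trait, confidence=0.95):
    """Smallest n such that P(at least one carrier in n independent draws) >= confidence."""
    # Step 1: log of q = 1 - Na/N, the chance that one draw misses the trait.
    log_q = math.log1p(-with_trait / population)
    # Step 3: L is the allowed probability of seeing no carrier at all.
    limit = 1.0 - confidence
    # Step 2 and the final equation: q**n <= L  <=>  n >= log(L) / log(q),
    # because log(q) is negative. Round up to a whole number of people.
    return math.ceil(math.log(limit) / log_q)

print(min_sample_size(10**9, 10**5))  # 29956
print(min_sample_size(10**9, 10**6))  # 2995
```

The second value is 2,995 rather than 2,994 because the unrounded result is about 2,994.2, and rounding down would leave the chance of finding a carrier slightly under 95%.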

  3. #3

    Thanks rogojel.

    So if the population (N) is 10^9, and 10^5 people have trait a (Na is 10^5), and I want a 95% chance that one or more of the people in my sample (call it Ns or n) will have trait a, then:

    Sample size Ns or n = (log 0.05) / log (1 - (10^5/10^9)) ≈ 29,956

    And if 10^6 people have trait a, then the sample size Ns or n would be about 2,994.

    Is that right?

  4. #4
    rogojel (TS Contributor)

    I did not check the numbers, but it seems to be right. Of course, we assume independent random sampling.
    Kind regards
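
To see how much the independence assumption matters here, one can compare against drawing people without replacement, where each non-carrier already tested slightly raises the chance that the next person is a carrier. The sketch below is not part of the original answer; the function name `min_sample_size_no_replacement` is hypothetical, and it assumes a simple random sample from a finite population with 0 < Na < N.

```python
import math

def min_sample_size_no_replacement(population, with_trait, confidence=0.95):
    """Smallest n such that a sample of n distinct people contains
    at least one carrier with probability >= confidence."""
    log_limit = math.log(1.0 - confidence)
    log_p_none = 0.0  # log of P(no carrier among the first n people drawn)
    n = 0
    while log_p_none > log_limit:
        remaining = population - n
        if remaining <= with_trait:
            # Only carriers are left, so the next draw must find one.
            return n + 1
        # Chance that the next person also lacks the trait,
        # given that n non-carriers have already been removed.
        log_p_none += math.log1p(-with_trait / remaining)
        n += 1
    return n

print(min_sample_size_no_replacement(10**9, 10**5))
```

For a population as large as 10^9 the result should differ from the independent-draw answer by at most a person or so, since a sample of about 30,000 barely changes the makeup of who is left. The difference becomes noticeable only when the sample is a sizeable fraction of the population.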
