
Thread: Very large samples

  1. #1

    Very large samples




    Hello all,

    I am about to receive a very large sample of around 10 million observations. Let me explain. I have been asked to statistically analyze some characteristics of cells (protein levels and so on). A plate of cells, or a few plates, will be sampled and examined. Cells are tiny, so each "sample" can contain up to 10 million cells. How would you approach this problem? If I compute CIs or run any hypothesis test, it's useless. If I go for control charts, I have the same problem: the limits will be very narrow.

    Would you go in a sub-sampling direction, use some sort of bootstrap, or is there another way?

    Thanks.

  2. #2
    TS Contributor

    Re: Very large samples

    "If I do CI, or any hypothesis testing, it's useless."

    Why?

  3. #3

    Re: Very large samples

    Because n is huge, any null hypothesis will be rejected, whether or not H0 is true.

    CIs will be extremely narrow due to the large sample size.
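
    To put a number on "extremely narrow", here is a minimal sketch. It assumes a normal-approximation 95% interval for a mean and a made-up standard deviation of 1; the function name `ci_half_width` and all values are illustrative, not taken from the cell data.

```python
import math

def ci_half_width(sigma, n, z=1.96):
    """Half-width of a normal-approximation CI for a mean: z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

sigma = 1.0  # assumed standard deviation, for illustration only
for n in (100, 10_000, 10_000_000):
    # The half-width shrinks in proportion to 1 / sqrt(n).
    print(f"n = {n:>10,}  half-width = {ci_half_width(sigma, n):.5f}")
```

    Going from 100 to 10 million observations shrinks the interval by a factor of about 316 under these assumptions.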

  4. #4
    Omega Contributor
    hlsmith

    Re: Very large samples

    You can adjust your significance level. I'm not sure whether there is any literature out there on overpowered tests.

  5. #5
    TS Contributor

    Re: Very large samples

    Hypothesis tests remain valid even in samples this large. But ask yourself: is it of any practical importance whether the differences are significant, or is it far more important to determine the effect size? You could of course carry on with significance tests or CIs just to give the reader evidence that differences exist. But in my opinion, the effect size, or practical significance, is far more interesting, especially in this case.
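
    As a rough illustration of the gap between statistical and practical significance, the sketch below runs a two-sample z-test on invented summary statistics (two groups of 5 million cells, means 100.1 and 100.0, common SD 10) and reports Cohen's d alongside the p-value. The helper `two_sample_z` and every number in it are assumptions for the example, not results from any real experiment.

```python
import math

def two_sample_z(mean1, mean2, sd, n):
    """z statistic, two-sided p-value and Cohen's d for two equal-size
    groups of n observations each, assuming a known common SD."""
    se = sd * math.sqrt(2.0 / n)             # standard error of the difference
    z = (mean1 - mean2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    d = (mean1 - mean2) / sd                 # Cohen's d (standardized difference)
    return z, p, d

# Hypothetical summary statistics, chosen only to illustrate the point.
z, p, d = two_sample_z(mean1=100.1, mean2=100.0, sd=10.0, n=5_000_000)
print(f"z = {z:.2f}, p = {p:.3g}, Cohen's d = {d:.3f}")
```

    With these made-up inputs the p-value is vanishingly small while d is about 0.01, a difference that most people would consider negligible in practice.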

  6. The Following User Says Thank You to Englund For This Useful Post:

    hlsmith (05-10-2013)

  7. #6
    Devorador de queso
    Dason

    Re: Very large samples


    Originally posted by NN_STAT:
    Would you go for a sub-sampling direction, some sort of bootstrap, or is there another way ?

    thanks.
    None of this matters until you figure out what you want to learn from the data. If you don't know that, it is hopeless, so figure that out first! If you do know, tell us, because that information would help us help you.
