
Thread: comparing three sets

  1. #1
    comparing three sets




    Dear all,

    I have three sets, A (495 elements), B (1130 elements) and C (812 elements).

    The elements are biological entities bound by proteins, and for each set I have the count of elements occupied by 1, 2, 3, ... proteins.

    For instance, I have 226 elements bound by 1 protein in set A vs 258 in Set B...

    In set C, we see more elements bound by many proteins (e.g., 15 elements bound by 11 proteins) than in either set A or set B.

    My question: I need a statistical test to see whether set C has significantly more elements bound by several proteins than set B. What I would like is a p-value for each pair of sets telling me whether the count distributions differ significantly between the two sets.

    Any help would be appreciated,

    Best.


    Set A
    Proteins bound:    1   2   3   4   5   6   7   8
    Elements:        226 143  75  31  12   4   3   1

    Set B
    Proteins bound:    1   2   3   4   5   6   7   8   9  10  11  12  13
    Elements:        258 205 181 152 113  77  63  32  30   6   9   2   2

    Set C
    Proteins bound:    1   2   3   4   5   6   7   8   9  10  11  12  13  14
    Elements:        142 168  99  80  73  80  43  44  25  28  15   8   4   3

  2. #2
    GretaGarbo

    Re: comparing three sets


    At first I thought of using the Poisson distribution, conditional on the value being larger than zero (a zero-truncated Poisson). (Just divide the probability mass function by the probability of a non-zero value, 1 - P(0).) Then do a likelihood ratio test between set A and set B, and so on for the other pairs.
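    A minimal sketch of that likelihood ratio idea, using only the Python standard library. The function names (pooled, ztp_loglik, ztp_mle, lr_test) are made up for illustration, and the sketch assumes a zero-truncated Poisson describes the counts adequately, which is not checked here. The null hypothesis is one common rate for both sets, the alternative is a separate rate per set, and the statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math

# Frequency tables from the first post: {proteins bound: number of elements}
SET_A = dict(zip(range(1, 9), [226, 143, 75, 31, 12, 4, 3, 1]))
SET_B = dict(zip(range(1, 14), [258, 205, 181, 152, 113, 77, 63, 32, 30, 6, 9, 2, 2]))
SET_C = dict(zip(range(1, 15), [142, 168, 99, 80, 73, 80, 43, 44, 25, 28, 15, 8, 4, 3]))


def pooled(*tables):
    """Add frequency tables together (used for the common-rate null hypothesis)."""
    out = {}
    for table in tables:
        for k, n in table.items():
            out[k] = out.get(k, 0) + n
    return out


def ztp_loglik(freq, lam):
    """Log-likelihood of a zero-truncated Poisson:
    P(k) = Poisson(k; lam) / (1 - exp(-lam)) for k >= 1."""
    log_p_nonzero = math.log(-math.expm1(-lam))  # log(1 - exp(-lam))
    return sum(
        n * (k * math.log(lam) - lam - math.lgamma(k + 1) - log_p_nonzero)
        for k, n in freq.items()
    )


def ztp_mle(freq):
    """Maximum likelihood estimate of lam: solve
    lam / (1 - exp(-lam)) = sample mean by bisection (needs mean > 1)."""
    total = sum(freq.values())
    mean = sum(k * n for k, n in freq.items()) / total
    lo, hi = 1e-9, mean  # left-hand side is increasing in lam; root is below the mean
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid / -math.expm1(-mid) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lr_test(f1, f2):
    """Likelihood ratio test of a common lam against separate lam per set."""
    lam1, lam2, lam0 = ztp_mle(f1), ztp_mle(f2), ztp_mle(pooled(f1, f2))
    ll_alt = ztp_loglik(f1, lam1) + ztp_loglik(f2, lam2)
    ll_null = ztp_loglik(f1, lam0) + ztp_loglik(f2, lam0)
    stat = max(0.0, 2 * (ll_alt - ll_null))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square upper tail, 1 df
    return stat, p_value


for name, (f1, f2) in {"A vs B": (SET_A, SET_B),
                       "A vs C": (SET_A, SET_C),
                       "B vs C": (SET_B, SET_C)}.items():
    stat, p_value = lr_test(f1, f2)
    print(f"{name}: LR statistic = {stat:.2f}, p = {p_value:.3g}")
```

    With counts this large the p-value can underflow to 0.0 in floating point; that only means it is smaller than the machine can represent.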

    But why make it complicated? The sample sizes are large, so the means will be approximately normal by the central limit theorem. An ordinary z-test on the difference in means can be used (it will give the same result as a t-test here, since with samples this large the degrees of freedom are very large).

    And of course it will be statistically significant.

    (My main problem was going from frequencies to values)
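    As a rough sketch of the z-test route, here is one way to go from frequencies to values and compare two means, using only the Python standard library. The helper names (expand, z_test) are made up for illustration, the variances are estimated separately for each set (unequal-variance form), and the observations are assumed to be independent.

```python
import math
import statistics

# Element counts from the first post; position i holds the number of
# elements bound by i + 1 proteins.
SET_A = [226, 143, 75, 31, 12, 4, 3, 1]
SET_B = [258, 205, 181, 152, 113, 77, 63, 32, 30, 6, 9, 2, 2]
SET_C = [142, 168, 99, 80, 73, 80, 43, 44, 25, 28, 15, 8, 4, 3]


def expand(counts):
    """Frequencies to values: repeat the value k once per element that has it."""
    return [k for k, n in enumerate(counts, start=1) for _ in range(n)]


def z_test(counts_x, counts_y):
    """Large-sample z-test for the difference in means (x minus y)."""
    x, y = expand(counts_x), expand(counts_y)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    # Standard error of the difference, with a separate sample variance per set
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    z = (mean_x - mean_y) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| > |z|)
    p_greater = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z > z): is mean_x larger?
    return z, p_two_sided, p_greater


for name, (cx, cy) in {"B vs A": (SET_B, SET_A),
                       "C vs A": (SET_C, SET_A),
                       "C vs B": (SET_C, SET_B)}.items():
    z, p2, p1 = z_test(cx, cy)
    print(f"{name}: z = {z:.2f}, two-sided p = {p2:.3g}, one-sided p = {p1:.3g}")
```

    The one-sided p-value answers "does the first set have the larger mean?", which matches the question about set C versus set B; the two-sided value only asks whether the means differ.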


    But I am not sure I understand the problem. What are the "entities"?

    I am pretending that the elements correspond to persons who are asked how many coins they have in their pockets. So in group A there would be 226 persons with just one coin, 143 persons with two coins, etc.

    Are these data statistically independent? Is there any pseudo-replication in this? I could have misunderstood the situation completely.
