
Thread: Enrichment analysis between two sets of proteins

  1. #1
    I'm trying to calculate a statistical measure describing the overlap of two sets of proteins. The issue is that the sets come from different organisms, each with its own complement of proteins, and only some of the proteins have defined homologs in the other organism.

    Example for what I mean:

    species #1
    100 total proteins

    species #2
    200 total proteins

    overlap
    50 homologs

    Say I choose 30 random proteins from set 1 and 40 random proteins from set 2, and the overlap between the two random groups is 10 homologs.

    How could I describe this overlap with some sort of statistical value?

    Thanks a lot.

  2. #2

    I will try to propose something, but I am not sure whether it makes sense for your case.

    Let A and B be the numbers of proteins in the two sets, and assume A > B.
    Let C be the number of common proteins you get.
    I am trying to define a quantity that varies from 0 to 1, where 0 indicates no overlap and 1 indicates the maximum possible overlap (C = B):
    \frac{C}{A+B-C} \cdot \frac{A}{B}

    Basically, the first fraction gives the overlap as a fraction of the union (the Jaccard index), and the second fraction rescales it so that the index lies within [0,1].
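
    As a quick check of the index above, here is a minimal Python sketch. The function name `scaled_overlap_index` is made up for illustration, and it is applied to the two sampled groups from the first post (A = 40, B = 30, C = 10), which is an assumption about how the index would be used here.

```python
def scaled_overlap_index(a: int, b: int, c: int) -> float:
    """Jaccard index C / (A + B - C), rescaled by larger / smaller set size."""
    if a <= 0 or b <= 0:
        raise ValueError("both sets must be non-empty")
    if not 0 <= c <= min(a, b):
        raise ValueError("overlap must lie between 0 and the smaller set size")
    larger, smaller = max(a, b), min(a, b)
    jaccard = c / (a + b - c)          # overlap as a fraction of the union
    return jaccard * larger / smaller  # equals 1 when c == smaller


# Groups of 40 and 30 proteins sharing 10 homologs.
index = scaled_overlap_index(40, 30, 10)
print(round(index, 4))  # 0.2222
```

    Note that the index only describes the size of the overlap. It says nothing about how likely that overlap is by chance.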

  3. #3

    Thanks for the reply. I've thought about something similar, but don't I somehow have to take into account the total number of proteins in each set and the maximal possible overlap?

    If I can ignore that, maybe something like your proposal would work. What I was thinking about doing was the following:

    Randomly take X number of proteins from set 1 and Y number of proteins from set 2
    Determine the stat for the overlap between these 2 groups of proteins
    Repeat N times
    Plot a histogram for the results of this simulation
    Determine where my experimental group of proteins from set 1 and set 2 of size X and Y, respectively, fall in the histogram
    Calculate the probability of getting this amount of overlap based on the simulation
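
    The steps above could be sketched roughly as follows in Python. This is only a sketch under stated assumptions: homologs are taken to be one-to-one (protein i in species 1 pairs with protein i in species 2 for the first 50 indices), the statistic is the raw count of homolog pairs with both members drawn, and the function names are invented for illustration. Any other overlap statistic could be substituted.

```python
import random


def simulate_overlaps(n1, n2, n_homologs, x, y, n_iter, seed=0):
    """Overlap counts from n_iter random draws of x and y proteins.

    Assumption: homologs are one-to-one, so indices 0..n_homologs-1 in
    species 1 pair with the same indices in species 2.
    """
    rng = random.Random(seed)
    overlaps = []
    for _ in range(n_iter):
        draw1 = set(rng.sample(range(n1), x))
        draw2 = set(rng.sample(range(n2), y))
        # A homolog pair counts only if both of its members were drawn.
        overlaps.append(
            sum(1 for i in range(n_homologs) if i in draw1 and i in draw2)
        )
    return overlaps


def empirical_p_value(simulated, observed):
    """Upper-tail p-value, with the +1 correction so it is never exactly 0."""
    hits = sum(1 for s in simulated if s >= observed)
    return (hits + 1) / (len(simulated) + 1)


# Numbers from the first post: 100 and 200 proteins, 50 homologs,
# groups of 30 and 40, observed overlap of 10.
sims = simulate_overlaps(100, 200, 50, 30, 40, n_iter=10_000)
print("mean simulated overlap:", sum(sims) / len(sims))
print("empirical p-value:", empirical_p_value(sims, observed=10))
```

    The list `sims` is what would go into the histogram. The p-value is the fraction of simulated overlaps at least as large as the observed one.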

  4. #4
    Dason

    That's a reasonable approach. I've been wondering about this, though:
    "Say I choose 30 random proteins from set 1 and 40 random proteins from set 2, and the overlap of the two groups of random proteins is 10 homologs."
    Are you really choosing these proteins at random, or is there a specific reason you chose these 30 and these 40?
    I don't have emotions and sometimes that makes me very sad.

  5. #5

    I have two networks in my analysis, one from species 1 and one from species 2. Each of these networks is associated with a different number of proteins from their respective organisms (30 and 40, respectively, in my example). So, when I do my random simulation, I wanted to choose the same number of associated proteins from each species set.

  6. #6
    Dason
    Ok. Well your method sounds good to me. You're essentially doing a randomization test.
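
    As a possible cross-check on the randomization test, the null distribution of the overlap count can also be computed exactly, assuming one-to-one homologs and independent random draws (both are assumptions, not something stated in the thread). The number of sampled species-1 proteins that have a homolog is hypergeometric, and given that number j, the count of their partners landing in the species-2 sample is hypergeometric again. The function names below are illustrative. Under these assumptions the expected overlap is 50 × (30/100) × (40/200) = 3.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(k successes) when drawing `draws` items without replacement."""
    if k < 0 or k > draws:
        return 0.0
    # math.comb returns 0 when the lower index exceeds the upper one.
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


def overlap_pmf(k, n1, n2, n_homologs, x, y):
    """P(overlap == k): mix over j, the homolog-bearing proteins in draw 1."""
    return sum(
        hypergeom_pmf(j, n1, n_homologs, x) * hypergeom_pmf(k, n2, j, y)
        for j in range(k, min(x, n_homologs) + 1)
    )


def exact_p_value(observed, n1, n2, n_homologs, x, y):
    """Upper-tail probability P(overlap >= observed) under the null."""
    max_overlap = min(x, y, n_homologs)
    return sum(
        overlap_pmf(k, n1, n2, n_homologs, x, y)
        for k in range(observed, max_overlap + 1)
    )


# Same numbers as the example in the first post.
print(exact_p_value(10, 100, 200, 50, 30, 40))
```

    If the simulated p-value and this exact value disagree badly, that would suggest either too few iterations or a mismatch between the simulation and the assumptions above.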