Thread: Correlation analysis

  1. #1

    Correlation analysis

    First I have to apologise for what is probably a stupid question, but my statistics knowledge is pretty much zero, and after searching for an answer for hours I gave up and thought I'd rather ask here, although I'm sure a lot of people have asked the same question before.
    I have two datasets: one contains expression values for 20 different cell lines, and the other contains the IC50 (proliferation) values for the same cell lines. Depending on their IC50 values, the cell lines are grouped into two phenotypes, sensitive (low IC50) and resistant (high IC50). A few genes show a correlation (by eye) with the phenotype, i.e. their expression values are higher or lower in the sensitive cell lines than in the resistant ones.
    How can I test whether there's a significant correlation between the expression of several genes (one at a time) and either the phenotype or the IC50 values (whichever is possible/easier/better)? In other words, what type of test can I use (maybe Chi-squared)?
    I use R for all my analysis, so a function/package in R that does the stats analysis I need would be amazing.
    Hope you can help me!

  2. #2

    Re: Correlation analysis

    I would try cluster analysis, e.g. Ward's minimum variance method or K-means. I think this article can help you.

  3. #3

    Re: Correlation analysis

    I am not familiar with the terms specific to your field (IC50/proliferation). I think in your dataset1 you have cell lines (a categorical variable) with expression values (continuous?), IC50/proliferation values (also continuous?), and a categorical variable designating either high or low IC50. I understand that your expression data corresponds to specific genes, but I'm not sure how many genes you have measured within each of the cell lines. Can you clarify what your data looks like? Also, you never described dataset2.
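
If the data do look the way this reply guesses (a continuous expression value per gene and cell line, a continuous IC50 per cell line, and a two-level sensitive/resistant label), one possible way to test a single gene in base R is sketched below. It is only a sketch under those assumptions: all values are simulated, the names `ic50`, `expression`, `phenotype` and `expr_mat` are made up for illustration, and the median split stands in for whatever cut-off actually defines the phenotypes. A chi-squared test compares two categorical variables, so it does not fit continuous expression values directly; a rank correlation (against IC50) or a two-group rank test (against phenotype) are swapped in here instead.

```r
# Simulated stand-in data for 20 cell lines; replace with the real values.
set.seed(1)
ic50       <- exp(rnorm(20))                              # hypothetical IC50 values
expression <- 5 + 0.8 * log(ic50) + rnorm(20, sd = 0.5)   # hypothetical expression of one gene
# Assumed grouping rule: above-median IC50 = resistant, otherwise sensitive
phenotype  <- factor(ifelse(ic50 > median(ic50), "resistant", "sensitive"))

# Option 1: expression vs. continuous IC50, rank-based (Spearman) correlation
spearman <- cor.test(expression, ic50, method = "spearman")

# Option 2: expression compared between the two phenotype groups
wilcox <- wilcox.test(expression ~ phenotype)

print(spearman$estimate)   # Spearman's rho
print(spearman$p.value)
print(wilcox$p.value)

# Several genes, one at a time: rows = genes, columns = cell lines (simulated)
expr_mat <- matrix(rnorm(5 * 20), nrow = 5,
                   dimnames = list(paste0("gene", 1:5), NULL))
p_values <- apply(expr_mat, 1, function(g) {
  cor.test(g, ic50, method = "spearman")$p.value
})

# Adjust for testing many genes (Benjamini-Hochberg false discovery rate)
p_adj <- p.adjust(p_values, method = "BH")
print(p_adj)
```

Testing against the continuous IC50 keeps more information than the sensitive/resistant split, and the p-value adjustment matters once more than a handful of genes are tested.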
