
Thread: Linear regression using compositional data as independent variables

  1. #1

    Linear regression using compositional data as independent variables




    I have a set of ratios that sum to 100% (compositional data, in other words). These 7 variables need to enter a linear regression model as independent variables (I am using a Heckman model, but the issue is the same). I can't add them all, since that violates the logic of regression, in which one coefficient describes a change in its variable while the others remain constant.

    What are my options?
    - Can I use just 6 of the 7, and would that be okay?
    - I know that Aitchison has contributed a lot of work on the matter. It seems that one choice is to use the centered log-ratio (clr) transformation for the compositional data, but I have no idea how this can be implemented. I am using R, and the compositions package has a clr function, but it produces a variable type that I've never seen before.

    Any help would be greatly appreciated.
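
To illustrate the first option, here is a minimal sketch in base R, assuming simulated data rather than the poster's own. The names `shares`, `p1` to `p7`, `y`, `fit_all` and `fit_six` are hypothetical. It shows that when all 7 parts enter together with an intercept, one coefficient cannot be estimated, while a model with 6 parts is full rank.

```r
# Hypothetical simulated data: 7 shares per row that sum to 100.
set.seed(42)
n <- 40
raw <- matrix(runif(n * 7), ncol = 7,
              dimnames = list(NULL, paste0("p", 1:7)))
shares <- as.data.frame(100 * raw / rowSums(raw))
shares$y <- rnorm(n)  # placeholder response, unrelated to the shares

# All 7 parts plus an intercept: the parts add up to a constant,
# so lm() reports NA for one coefficient.
fit_all <- lm(y ~ p1 + p2 + p3 + p4 + p5 + p6 + p7, data = shares)

# Dropping one part removes the exact linear dependence.
fit_six <- lm(y ~ p1 + p2 + p3 + p4 + p5 + p6, data = shares)

coef(fit_all)
coef(fit_six)
```

With one part left out, each remaining coefficient is read relative to the omitted part: raising one share by a point necessarily lowers the omitted share by the same amount when the others are held fixed.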

  2. #2

    Re: Linear regression using compositional data as independent variables

    In an attempt to answer the second option: I tried using clr in a linear regression and it seems to work fine. Note: this is just for testing; I know that Site is not a numerical variable in this dataset. Can anyone verify that this is a proper and valid usage? I know that the results cannot be interpreted in the same way, since the variables have been transformed. I only want to show that they have an effect on my model.

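    library(compositions)
    cdata <- Hydrochem[,8:19]   # columns 8 to 19 of Hydrochem
    ddata <- clr(cdata)         # centered log-ratio transform
    # Regression on the untransformed parts
    summary(lm(Site~K+Mg+Ca+Sr+Ba+NH4+Cl+NO3+PO4+SO4+HCO3+TOC, data=Hydrochem))
    # Regression on the clr-transformed parts
    summary(lm(Site~ddata, data=Hydrochem))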

  3. #3

    Re: Linear regression using compositional data as independent variables


    Hi micdhack,

    I might be too late.
    Use
    ddata = ilr(cdata) or
    ddata = alr(cdata)
    instead of ddata = clr(cdata).
    In the clr transformation each row sums to zero, so the transformed columns are linearly dependent. In the ilr and alr transformations they are not.
    Then run the regression again.
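
The point about row sums can be checked with a small sketch in base R. It assumes simulated data and hand-written versions of the two transforms, not the compositions package's own `clr()` and `alr()`. The names `comp`, `clr_x`, `alr_x`, `y`, `fit_clr` and `fit_alr` are hypothetical.

```r
# Hypothetical simulated 4-part composition; each row sums to 1.
set.seed(1)
n <- 50
raw <- matrix(rexp(n * 4), ncol = 4,
              dimnames = list(NULL, c("A", "B", "C", "D")))
comp <- raw / rowSums(raw)

# Centered log-ratio: log of each part minus the row mean of the logs.
clr_x <- log(comp) - rowMeans(log(comp))

# Additive log-ratio: log of each part relative to the last part (D).
alr_x <- log(comp[, 1:3] / comp[, 4])

# Placeholder response built from the first alr coordinate plus noise.
y <- 1 + 0.5 * alr_x[, 1] + rnorm(n, sd = 0.1)

max(abs(rowSums(clr_x)))   # numerically zero: clr rows sum to zero

fit_clr <- lm(y ~ clr_x)   # one clr coefficient is reported as NA
fit_alr <- lm(y ~ alr_x)   # all alr coefficients are estimated

coef(fit_clr)
coef(fit_alr)
```

This would also explain why the clr regression appeared to work: `lm()` does not stop on a rank-deficient design, it sets the aliased coefficient to NA and fits the rest.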
