Thread: Combining variables prior to performing regression analysis

  1. #1

    Hello to all. I am running a regression analysis to evaluate whether analysing costs and benefits jointly explains medicinal plant selection better than benefits alone. My 'benefit' variables are 'perceived efficiency' and 'perceived taste', and my 'cost' variable is 'difficulty of acquisition'. My response variable is 'use'. I chose a health problem (constipation) and asked people in a local community to name plants known to treat it. Then I asked them to rank the plants from (a) the most used to the least used, (b) the most efficient to the least efficient, (c) the most difficult to acquire to the least difficult to acquire, and (d) the tastiest to the least tasty. Analyses were based on average ranks. The average rank for 'use' was the dependent variable. Ten plants were used for constipation, and their average ranks ranged from 1.8 (most used) to 6.4 (least used). The independent variables were calculated the same way.
    After that I did two different things that gave different results:
    1) As all variables are of the same nature, with the same minimum, maximum and mean values, I combined them a priori and ran a simple linear regression. I tried several combinations (using sums to preserve linearity) and found that “efficiency + taste - difficulty of acquisition” gives a much higher R (0.8, whereas the best single variable has an R of 0.38). I compared the regression lines for this combination and for the individual variables and found significant differences.
    2) I performed a traditional multiple linear regression. Although each of the three variables on its own is significantly related to the response variable, when they are together in the model none of them has a significant relationship with “use”. With a stepwise approach, only one variable (taste) is left in the model, with an R of 0.38.
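    The two approaches above can be sketched roughly as follows. This is only an illustration under stated assumptions: the average ranks are invented (they are not the data from this thread), and the helper names `solve` and `r_squared` are hypothetical. The point it shows is that a fixed-weight sum such as “efficiency + taste - difficulty” is a special case of the multiple regression with coefficients tied to +1, +1, -1, so the full model's R² on the same data cannot be lower than the composite's.

```python
import itertools

# HYPOTHETICAL average ranks for 10 plants, invented purely for illustration.
use        = [1.8, 2.5, 3.1, 3.6, 4.0, 4.4, 4.9, 5.3, 5.9, 6.4]
efficiency = [2.2, 3.0, 2.6, 4.1, 3.5, 5.0, 4.2, 5.6, 5.1, 6.0]
taste      = [2.9, 2.1, 4.0, 3.3, 4.8, 3.9, 5.5, 4.6, 6.1, 5.2]
difficulty = [5.8, 5.1, 5.5, 4.2, 4.6, 3.4, 3.9, 2.8, 3.1, 2.0]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Approach 1: simple regression on a fixed-weight composite, for every +/-1 sign pattern.
composite_r2 = {}
for signs in itertools.product([1, -1], repeat=3):
    composite = [signs[0] * e + signs[1] * t + signs[2] * d
                 for e, t, d in zip(efficiency, taste, difficulty)]
    composite_r2[signs] = r_squared([composite], use)

# Approach 2: multiple regression with the three predictors entered separately.
full_r2 = r_squared([efficiency, taste, difficulty], use)

for signs, r2 in composite_r2.items():
    print(signs, round(r2, 3))
print("full model:", round(full_r2, 3))
```

    Note that the sketch compares R² values only; it says nothing about significance tests, and picking the best of several composites after looking at the fit is a separate issue from the nesting shown here.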
    I don't know if I am forcing the data, but to me it is difficult to conclude that only taste explains use when I found an R of 0.8 with an a priori combination of variables.
    The thing is: is there something wrong with what I did in the first set of analyses? What would be the best way to find the combination of variables that best explains “use”? Is there another way to reveal the best combination?
    I apologize for my superficial knowledge of the subject. Can anyone help me?
    Last edited by patri; 05-08-2015 at 11:02 AM.

  2. #2
    CowboyBear

    Your dependent variable is a set of 4 ordered categories, not a continuous variable. So multiple regression isn't really appropriate here. Ordinal logistic regression might be a better choice.

  3. #3

    Actually, the values of my dependent variable are average ranks. I have 10 plant species and each of them has an average rank for use (e.g. 4.1, 5.2, 1.8). If a plant was cited as the most used by many people, its average rank is close to 1. That is why I cannot use logistic regression.
    All independent variables were calculated in a similar way.
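
    As a minimal sketch of how such average ranks could be computed: the rankings below are invented for illustration, the plant labels A to D are hypothetical, and the sketch assumes every informant ranked every plant, which the thread does not state.

```python
# HYPOTHETICAL rankings: one row per informant, one column per plant.
# A rank of 1 means "most used". Assumes each informant ranked all plants.
plants = ["A", "B", "C", "D"]
rankings = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
    [1, 2, 4, 3],
    [1, 3, 2, 4],
]

# Average rank per plant: mean of that plant's column across informants.
average_rank = {
    plant: sum(row[j] for row in rankings) / len(rankings)
    for j, plant in enumerate(plants)
}

# Plants ordered from most used (lowest average rank) to least used.
ordered = sorted(plants, key=lambda p: average_rank[p])

for plant in ordered:
    print(plant, round(average_rank[plant], 2))
```

    A plant ranked first by most informants ends up with an average close to 1, as described above.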

  4. #4
    noetsi


    How do you create an average of a rank? Since ranks are not formally on an interval scale, you can't average them (although admittedly this is done fairly often, I would guess, and I do it myself).
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
