1. ## Link function for proportional outcome

I have a data set where the outcome variable is percent passing (ELA and Math tests) for school districts. I will use a two-level multilevel model with various predictors/covariates at levels one and two.

The outcome variable is percent passing. Obviously the outcome is limited to between 0 and 1, so it is not sensible to assume a normal distribution for it (the underlying scores are likely normally distributed); using a Gaussian link could result in predictions above 1 or below 0. A logit (binomial family) might make sense, as this is used in logistic regression (0/1 outcomes), but it seems wrong because my outcome can take any value between 0 and 1.

Poisson deals with count data. I don't have counts.

So what link function is appropriate here and why?

If more details are needed I can furnish them.

2. ## Re: Link function for proportional outcome

Logistic regression works for general binomial data (n > 1); you don't need to have just 0/1 outcomes. Do you have the value of n for each observation?

3. ## Re: Link function for proportional outcome

Dason, I didn't quite follow you. I assume you're saying that I can treat it the same as though the outcome were 1/0 and use the link function as binomial. This actually seems (seemed) sensible, but I had seen that the binomial link was for 1/0 outcomes only. Maybe that was my misinterpretation.

I have observational data. The lowest level I have is district-level information on the percent of students who passed. I also have aggregated demographic characteristics for each district. I have percent passed, but I also have the n for the school districts, so recovering the actual number passed is doable:

Code:
# percent_passed is a proportion between 0 and 1; n is the number of students tested
n_passed = round(percent_passed * n)

4. ## Re: Link function for proportional outcome

@vict I'd be inclined to agree except that assumption will give predicted values > 1 and < 0. This is not possible.

5. ## Re: Link function for proportional outcome

how come there is no love here for beta regression??

7. ## Re: Link function for proportional outcome

Originally Posted by trinker
Dason, I didn't quite follow you. I assume you're saying that I can treat it the same as though the outcome were 1/0 and use the link function as binomial. This actually seems (seemed) sensible, but I had seen that the binomial link was for 1/0 outcomes only. Maybe that was my misinterpretation.
Yeah, that's your misinterpretation; this is fine for logistic regression. By the way, it's a logit link (not a binomial link) with a binomial family. Basically you're saying that, conditional on your covariates, the response follows a binomial distribution. The logit link function is how you 'link' the covariates to the success probability: it's what models the form of the relationship between x and p.

I have observational data. The lowest level I have is district-level information on the percent of students who passed. I also have aggregated demographic characteristics for each district. I have percent passed, but I also have the n for the school districts, so recovering the actual number passed is doable:

Code:
n_passed = round(percent_passed * n)
Yeah you can do logistic regression with that data.
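
A minimal sketch of the logit link described in this reply, to show why it avoids predictions outside 0 and 1. The coefficients `b0` and `b1` are made-up values for illustration only; nothing here is fitted to the thread's data.

```python
import math

def logit(p):
    """Map a probability in (0, 1) to the whole real line (log-odds)."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Map any real-valued linear predictor back to a probability."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -0.5, 0.8

for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    eta = b0 + b1 * x      # linear predictor: unbounded
    p = inv_logit(eta)     # implied success probability: between 0 and 1
    print(f"x={x:6.1f}  eta={eta:7.2f}  p={p:.4f}")
```

However extreme the covariate gets, the implied pass probability stays between 0 and 1, which is the property a Gaussian model with an identity link lacks.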

9. ## Re: Link function for proportional outcome

@spunky I'll let the discussion go on a bit before I decide, but this seems to be exactly what I'm after. I also have to do this in the HLM program, since my multilevel course requires it. Do you know if this is available in HLM? I have never heard of it (which means next to nothing), so maybe it's not a commonly used link function yet?

10. ## Re: Link function for proportional outcome

Originally Posted by spunky
how come there is no love here for beta regression??
It is more difficult and in the case where you actually have the counts it makes more sense to do something like logistic regression. There isn't really much motivation behind using beta regression in this type of case in my opinion. Plus logistic regression is hard enough for non-math people to interpret and understand but it's a lot easier to understand than beta regression (binomial distribution is pretty simple compared to the beta distribution...)

12. ## Re: Link function for proportional outcome

Originally Posted by trinker
I have never heard of it (which means next to nothing) so maybe it's not a commonly used link function yet?
I think you have a misunderstanding when it comes to the link function. Beta regression is using the beta distribution as the response distribution (what we call the 'family' in glm) - this doesn't directly specify the link function. The link function is how you "link" the covariates to the mean of the response at those values of the covariates.
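
To make the family/link distinction concrete, here is a rough sketch that pairs a beta response distribution with the same logit link used in logistic regression. It assumes the mean-precision parameterization of the beta distribution (shape parameters `mu * phi` and `(1 - mu) * phi`), which is one common choice for beta regression; the values of `eta` and `phi` are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: linear predictor -> mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_logpdf(y, mu, phi):
    """Log-density of a beta distribution written in terms of mean mu and precision phi."""
    a = mu * phi
    b = (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)

eta = 0.4              # hypothetical linear predictor (b0 + b1 * x)
mu = inv_logit(eta)    # the link connects the covariates to the mean
phi = 20.0             # hypothetical precision; belongs to the family, not the link
print(beta_logpdf(0.55, mu, phi))
```

The link (`inv_logit`) and the family (`beta_logpdf`) are separate pieces: swapping the beta density for a binomial one would leave the link untouched.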

14. ## Re: Link function for proportional outcome

Originally Posted by Dason
Yeah, that's your misinterpretation; this is fine for logistic regression. By the way, it's a logit link (not a binomial link) with a binomial family. Basically you're saying that, conditional on your covariates, the response follows a binomial distribution. The logit link function is how you 'link' the covariates to the success probability: it's what models the form of the relationship between x and p.
Thanks for the help on using the correct language. Great explanation.

Can I use the percent passed directly with a logit link and the binomial family, or are you saying to use n_passed (n_passed = round(percent_passed * n))? Using n_passed makes less sense to me because I don't have actual data on individual students. I could make up arbitrary ids for them and assign pass/fail so that the totals match n_passed, but I don't see what that buys me.

15. ## Re: Link function for proportional outcome

Originally Posted by Dason
Plus logistic regression is hard enough for non-math people to interpret and understand but it's a lot easier to understand than beta regression (binomial distribution is pretty simple compared to the beta distribution...)
this is *exactly* why beta regression needs to be used MORE often. it helps you leave people puzzled and unable to criticize your work. when faced with their own ignorance, they have little option but to think along the lines of "well, this seems complicated enough so it must be right".

but you do have a point though. i assumed the emphasis was on the percentages and not on the counts themselves but if you have the counts then go for logistic regression.

16. ## Re: Link function for proportional outcome

You don't need data for individual students. Did I say something that implied that you did? You need the total count and the total number who passed (the outcome from the 'binomial' experiment), but you don't need the outcomes for each student individually.
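
A quick numerical check of this point, as a sketch: the grouped binomial log-likelihood and the log-likelihood of made-up individual 0/1 records differ only by a constant that does not depend on p, so both carry the same information about p. The district size and pass count below are hypothetical.

```python
import math

def binomial_loglik(n_passed, n, p):
    """Log-likelihood of one aggregated observation: n_passed successes out of n."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(n_passed + 1)
                  - math.lgamma(n - n_passed + 1))
    return log_choose + n_passed * math.log(p) + (n - n_passed) * math.log(1 - p)

def bernoulli_loglik(outcomes, p):
    """Log-likelihood of individual 0/1 outcomes that share the same p."""
    return sum(math.log(p) if y == 1 else math.log(1 - p) for y in outcomes)

# Hypothetical district: 900 of 2000 students passed.
n, n_passed = 2000, 900
students = [1] * n_passed + [0] * (n - n_passed)  # made-up individual records

# The gap between the two log-likelihoods is the same for every p,
# so both are maximised by the same value of p.
gaps = [binomial_loglik(n_passed, n, p) - bernoulli_loglik(students, p)
        for p in (0.2, 0.45, 0.7)]
print(gaps)
```

Inventing student ids therefore adds rows to the data file but no information to the fit.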

17. ## Re: Link function for proportional outcome

Originally Posted by Dason
I think you have a misunderstanding when it comes to the link function.
Yes, this is true. I think it's clearer now. I was thinking the link actually transforms the 0/1 outcomes, but it doesn't; it works on the aggregated outcomes (the percent passed/failed). Is this correct?

18. ## Re: Link function for proportional outcome

Originally Posted by Dason
You don't need data for individual students. Did I say something that implied that you did? You need the total count and the total number who passed (the outcome from the 'binomial' experiment), but you don't need the outcomes for each student individually.
No, but my thinking is: if I supply counts, how will it know what the counts mean? Say I tell it that 900 students in District A passed and 1230 in District B passed. How will it (the HLM program) know what those numbers mean without either individual data (passed or not passed) or a way to say 900 out of 2000 students?

I mean, it's sensible that you can work this out with equations, but I have to give it a data file.
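
One way to picture the data file, as a sketch only: each district row carries both the number who passed and the number tested, so "900 out of 2000" is spelled out by two columns. The column names and District B's total are placeholders, and how a particular program (HLM included) wants the trials column declared is not covered in this thread.

```python
import csv
import io

# Hypothetical district-level records: proportion passing and number tested.
districts = [
    {"district": "A", "percent_passed": 0.45, "n": 2000},
    {"district": "B", "percent_passed": 0.82, "n": 1500},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["district", "n_passed", "n_tested"])
writer.writeheader()
for d in districts:
    writer.writerow({
        "district": d["district"],
        "n_passed": round(d["percent_passed"] * d["n"]),  # successes
        "n_tested": d["n"],                                # trials
    })

print(buffer.getvalue())
```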

19. ## Re: Link function for proportional outcome

Originally Posted by trinker
Yes, this is true. I think it's clearer now. I was thinking the link actually transforms the 0/1 outcomes, but it doesn't; it works on the aggregated outcomes (the percent passed/failed). Is this correct?
No - it doesn't do anything to the data itself. It models the relationship between the covariates and the mean of the response. You don't transform the predictors.

For logistic regression you're assuming that

$$Y_i \sim \text{Binomial}(n_i, p_i)$$

which says that the response $Y_i$ has a binomial distribution with parameters $n_i$ (the number of observations/students observed for this response) and $p_i$ (the success probability for each observation/student).

That seems simple enough but the logistic regression part adds the assumption that we can additionally model the $p_i$ as a function of the covariates. This is what allows us to think things like "the success probability increases as the covariates increase". How we actually 'link' the $p_i$ with the covariates depends on ... you guessed it - the link function. For logistic regression we assume

$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

So we are saying that if we apply the link function to $p_i$ we get a linear function with respect to the covariates. Notice we don't apply the link function to the covariates - we apply it to $p_i$.
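
As a rough illustration of fitting such a model to grouped data, here is a sketch of a Newton-Raphson fit with a logit link, a binomial family, and a single covariate. It is a single-level fit only (the random effects of the multilevel model are left out), the function name is made up for this example, and all data values are hypothetical.

```python
import math

def fit_binomial_logit(x, n_passed, n, iterations=25):
    """Newton-Raphson fit of logit(p_i) = b0 + b1 * x_i for grouped binomial data."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        u0 = u1 = 0.0            # score vector
        i00 = i01 = i11 = 0.0    # Fisher information (symmetric 2x2 matrix)
        for xi, yi, ni in zip(x, n_passed, n):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = yi - ni * p              # observed minus expected passes
            w = ni * p * (1.0 - p)           # binomial variance weight
            u0 += resid
            u1 += resid * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + inverse(information) * score
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

# Hypothetical districts: a standardized covariate, students tested, students passed.
x = [-1.5, -0.5, 0.5, 1.5]
n = [2000, 1500, 1800, 1200]
n_passed = [900, 930, 1350, 1020]

b0, b1 = fit_binomial_logit(x, n_passed, n)
for xi in x:
    p_hat = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
    print(f"x={xi:5.1f}  fitted p={p_hat:.3f}")
```

The function only ever sees the pair (number passed, number tested) for each district, never individual student records, and the link is applied to the success probability rather than to the covariate.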
