1. Re: Link function for proportional outcome

Originally Posted by trinker
No, but my thinking is: if I supply counts, how will it know what the counts mean? Say I give it 900 students in District A passed and 1230 in District B passed. How will it (the HLM program) know what those numbers mean without either individual data (passed or not passed) or a way to say 900 out of 2000 students?

I mean it's sensible you can do this with equations and figure it out that way but I have to give it a data file.
You would need to tell it what the total count is for each school. Your response for each observation is essentially a vector of length 2 that specifies the number of students who passed and the total number of students who took it.

I don't know how you do this in HLM though - I've never used that program.
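
As a rough illustration of the "vector of length 2" idea above, here is a minimal Python sketch of an aggregated data file with one row per district and the binomial log-likelihood that such a row feeds into. The 900 out of 2000 figure is the hypothetical one from the question; the total for District B and all field and function names are assumptions made up for this example, and nothing here reflects how HLM itself reads data.

```python
import math

# One row per district: successes plus the number of trials.
# District A (900 of 2000) is the hypothetical from the question;
# District B's total of 3000 is an assumed value for illustration only.
districts = [
    {"district": "A", "passed": 900, "total": 2000},
    {"district": "B", "passed": 1230, "total": 3000},
]

def binomial_loglik(passed, total, p):
    """Log-likelihood of `passed` successes in `total` trials at pass probability p."""
    log_choose = (math.lgamma(total + 1)
                  - math.lgamma(passed + 1)
                  - math.lgamma(total - passed + 1))
    return log_choose + passed * math.log(p) + (total - passed) * math.log(1 - p)

for row in districts:
    # With no predictors, the observed proportion maximises the likelihood for that row.
    row["p_hat"] = row["passed"] / row["total"]
    row["loglik_at_p_hat"] = binomial_loglik(row["passed"], row["total"], row["p_hat"])
    print(row)
```

The point is only that the pair (passed, total) carries everything the binomial likelihood needs; the individual pass/fail records are not required for that part.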

3. Re: Link function for proportional outcome

Gotcha. It all becomes appallingly clear.

4. Re: Link function for proportional outcome

Originally Posted by spunky
how come there is no love here for beta regression??
There is a lot of "love" for beta regression. Maartenbuis gave (thanks again for that!) a great link to beta regression and also suggested the possibility of fractional logit in this thread.

But I guess that beta regression is mainly for variables that are not built from 0/1 variables, but instead from variables like: fraction of income spent on food, fraction of time spent on talkstats etc.

6. Re: Link function for proportional outcome

Thanks for everyone's help. For future searchers, I found a similar question asked on R help that basically rehashes a lot of what was said here: http://r.789695.n4.nabble.com/regres...td4484928.html

7. Re: Link function for proportional outcome

@Dason there was one response there that was interesting and I think would work with HLM:

Originally Posted by Peter Ehlers
Yes, and you can also use the proportions directly; just specify
the corresponding vector of number of trials as the 'weights'
Thoughts?
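
A small sketch of why the quoted suggestion is equivalent to supplying counts: the binomial log-likelihood written from raw counts equals the one written from the proportion multiplied by the number of trials as a weight (the constant term that does not depend on p is dropped in both). The function names are invented for this example, and whether HLM accepts weights in this way is not something the thread confirms.

```python
import math

def loglik_counts(passed, total, p):
    # Binomial log-likelihood kernel from raw counts (constant term dropped).
    return passed * math.log(p) + (total - passed) * math.log(1 - p)

def loglik_weighted_proportion(prop, weight, p):
    # Same kernel from a proportion y, weighted by the number of trials:
    # weight * [y*log(p) + (1 - y)*log(1 - p)].
    return weight * (prop * math.log(p) + (1 - prop) * math.log(1 - p))

passed, total = 900, 2000   # hypothetical figures from earlier in the thread
prop = passed / total

for p in (0.30, 0.45, 0.60):
    from_counts = loglik_counts(passed, total, p)
    from_weights = loglik_weighted_proportion(prop, total, p)
    print(p, from_counts, from_weights)
```

The two columns agree for every candidate p, so a fit based on either form would give the same estimates.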

8. Re: Link function for proportional outcome

I am sorry, but maybe you need some individual data. If the data is about pass and fail, then there might be individual explanatory factors like the parents' education. Then you would need data like "y/n" (number passed and number of pupils) for, say, "university education" as an explanatory factor, and then the corresponding counts for, say, "high school education", etc.

If you omit level 1 explanatory factors, I believe that you might get strange results like in the "classical" Robinson study from the 1950s in the USA. (A "classical" study is something that a lot of people have heard of, but nobody has read!) There they had investigated the summary statistics of the proportion who could read in each municipality (or maybe county) and the so-called proportion "black". That turned out to give a high correlation. But when looking at it individually, the "correlation" was much lower.

10. Re: Link function for proportional outcome

@Greta, useful info. I think I'll continue with this as it's for class. I had planned on trying to publish results but after your comment that sounds problematic at best. Perhaps this wasn't the best data set. But I've learned a ton (I could have used pre-made national data sets but where was the fun in that?) As far as getting individual data, that's not possible. So I think I need to cut my losses here and call it a learning exercise. Thanks for that info.

11. Re: Link function for proportional outcome

Originally Posted by trinker
@Dason there was one response there that was interesting and I think would work with HLM:

Thoughts?
Yes, but that is essentially what logit (or probit or something similar) is doing. The different variances are used as weights in weighted least squares (WLS). But then you would have a linear link function, not an S-shaped one as in logit. This does not matter much if all the proportions are between, say, 0.20 and 0.80.
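
To give a feel for the "between 0.20 and 0.80" remark, here is a minimal sketch that compares the logistic curve with a straight line over two ranges of proportions. The straight line used is simply the tangent at p = 0.5, which is an assumption chosen for convenience; a weighted least squares fit would pick its own line, so the numbers are only indicative.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def tangent_line(x):
    # Straight line touching the logistic curve at x = 0 (where p = 0.5, slope 1/4).
    return 0.5 + x / 4.0

def max_gap(p_low, p_high, steps=200):
    """Largest gap between the S-curve and the line for proportions in [p_low, p_high]."""
    lo, hi = logit(p_low), logit(p_high)
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(logistic(x) - tangent_line(x)) for x in xs)

gap_middle = max_gap(0.20, 0.80)   # proportions in the middle of the range
gap_wide = max_gap(0.02, 0.98)     # proportions close to 0 and 1
print("max gap for 0.20-0.80:", round(gap_middle, 3))
print("max gap for 0.02-0.98:", round(gap_wide, 3))
```

In the middle range the line and the S-curve stay within a few percentage points of each other; near 0 and 1 the line runs far outside the unit interval.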

- - - -

Also, mostly for Spunky's enjoyment, I would add that you can have a link function like "cauchit" (just like logit and probit).

[Logit uses the logistic distribution, probit uses the normal distribution, and cauchit uses the Cauchy distribution. (Maybe Spunky could use a distribution from a copula and call that link function "Spunkit"!)]
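
For reference, the three links mentioned above differ only in which cumulative distribution function maps the linear predictor to a probability. A minimal sketch using the standard form of each distribution (the function names are just labels chosen for this example):

```python
import math

def inv_logit(x):
    # Logistic CDF.
    return 1.0 / (1.0 + math.exp(-x))

def inv_probit(x):
    # Standard normal CDF, written with the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def inv_cauchit(x):
    # Standard Cauchy CDF.
    return 0.5 + math.atan(x) / math.pi

for x in (-3, -1, 0, 1, 3):
    print(x, round(inv_logit(x), 4), round(inv_probit(x), 4), round(inv_cauchit(x), 4))
```

All three pass through 0.5 at zero; the Cauchy version approaches 0 and 1 most slowly because of its heavy tails, and the normal version most quickly.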

13. Re: Link function for proportional outcome

@Greta can you give more direction on the Robinson study from the 1950s in the USA?

14. Re: Link function for proportional outcome

About the Robinson study from the 1950s in the USA:

As I told you: a "classical" study is something that a lot of people have heard of, but nobody has read! I have not read it, just heard about it from other sources. This is what I believed it was about:

Robinson had made a study in the 1950s of data from the 1930 census in the USA on illiteracy among people aged 10 and older. I thought that the study was about “black” and “non-black” people. In 1930 there must have been a lot of people born in the 19th century, even born during the slave period in the US. Also, there must have been a lot of immigrants from all over the world who had grown up where the education system was not very good.

The data had been aggregated into districts, I am not sure if it was to counties or states within the USA.

When the proportions of illiterate and “black” people were computed by district, a very high correlation between proportion “black” and proportion illiterate was shown, something like 0.94. But when the correlation was calculated on individual data, it was only 0.20. I believed that this effect arose because in areas with many “black” people there were also many poor “white” people, so that instead of a “race” issue it seemed to be about rich and poor people. That was my interpretation and what I remembered from what I had read and heard from friends.

Now that I have looked, I see that Robinson also investigated whether the person had immigrated to the USA. The Wikipedia article on “Ecological fallacy” gives this description:

A 1950 paper by William S. Robinson computed the illiteracy rate and the proportion of the population born outside the US for each of the 48 states + District of Columbia in the US as of the 1930 census. He showed that these two figures were associated with a negative correlation of −0.53; in other words, the greater the proportion of immigrants in a state, the lower its average illiteracy. However, when individuals are considered, the correlation was +0.12; immigrants were on average more illiterate than native citizens. Robinson showed that the negative correlation at the level of state populations was because immigrants tended to settle in states where the native population was more literate. He cautioned against deducing conclusions about individuals on the basis of population-level, or "ecological" data.

So even the sign of the correlation could be reversed.
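
The sign reversal is easy to reproduce with toy numbers. The sketch below uses three invented "states" (the counts are made up for illustration and are not Robinson's data): within every state immigrants are more often illiterate than natives, but states with more immigrants have much more literate natives. The state-level correlation then comes out negative while the individual-level correlation is positive.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented counts per state:
# (natives, illiterate natives, immigrants, illiterate immigrants)
states = {
    "A": (900, 180, 100, 30),
    "B": (800, 80, 200, 40),
    "C": (700, 35, 300, 30),
}

state_immigrant_share, state_illiteracy = [], []
person_immigrant, person_illiterate = [], []

for natives, ill_nat, immigrants, ill_imm in states.values():
    total = natives + immigrants
    # Aggregated ("ecological") data: one pair of proportions per state.
    state_immigrant_share.append(immigrants / total)
    state_illiteracy.append((ill_nat + ill_imm) / total)
    # Individual data: one 0/1 pair per person.
    person_immigrant += [0] * natives + [1] * immigrants
    person_illiterate += ([1] * ill_nat + [0] * (natives - ill_nat)
                          + [1] * ill_imm + [0] * (immigrants - ill_imm))

state_r = pearson(state_immigrant_share, state_illiteracy)
individual_r = pearson(person_immigrant, person_illiterate)
print("state-level correlation:     ", round(state_r, 2))
print("individual-level correlation:", round(individual_r, 2))
```

The mechanism mirrors the one in the quoted description: the aggregate pattern is driven by where immigrants settle, not by the individual relationship.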

So, what I thought had been described as a “race” issue, I had believed was really a poverty issue. Now I see that it could also be an immigration issue.

How wrong we can be!

The mixed model/multilevel aspect is that if we don't analyse at the individual level, or at the appropriate level, we will be misled.

- - -

I believe that this is a reprint of Robinson's paper:

http://ije.oxfordjournals.org/content/38/2/337.full

(but that can also be a misunderstanding!)
