Linear test for trend adjusted for cluster group?

Hi, any help appreciated with this question.

I have 8 time periods over which I wish to see if usage of a service changes using several demographics (age group, sex, HIV status, etc). There are about 2000 attendances in each time period from about 50 clusters. I can get proportions with 95% CIs using the 'proportion' command for each time period, however when I try to see if there are temporal changes using a linear trend test (eg vwls) I can't find anything that takes the clustering into account.

Any suggestions for how to generate a linear trend test accounting for clustering?

Many thanks!


One option would be to use one of the panel data estimators, eg -xtreg-, -xtlogit- or -xtprobit- (first you'd -xtset clustervar-)


My impression was that time was a covariate rather than your outcome; I thought your outcome was service utilisation, which you were measuring with proportions. If this is the case then you can definitely use xtlogit (and you can add age, sex etc. as additional covariates).


Thanks for your help again. Unfortunately, although time could be a covariate, the primary outcome isn't service utilisation per se, as there are no denominator values.

Thus in time period 1 (6 months) there are 2000 people using the service of which x are male, y are aged 15-30, z have symptom1, etc. In period 2 1900 people attended.....

Therefore all I can do is look at the proportions attending to see the change. There is a mild overall decrease in usage over the time period, but from crosstabulation it seems that younger males attend less the longer the service is available. I could simply do a chi-squared test for trend or vwls on this, but that does not take the clustering effect into account.

I could get this from regression but do not have a primary variable coded 0/1.

Any further thoughts are much appreciated.


So are you trying to describe the change in proportions over time (eg x% male in the first period, y% in the second period etc)? Or are you trying to describe the overall change in the number of service utilisations, and study the impact of various covariates on the total utilisations?

If the former then I would still go with logistic regression. Male vs female can and should be coded as a 0/1 variable.
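As a rough illustration of that first route, here is a minimal sketch on simulated data with one row per attendance. The variable names (clustervar, timeperiod, male) and all the simulated numbers are made up for the example, so treat it as a starting point rather than a recipe. Entering timeperiod as a continuous term makes its coefficient a linear trend on the log-odds scale, and the clustering is handled either with cluster-robust standard errors or with a random intercept.

```stata
* Sketch only: simulated attendance-level data, one row per attendance.
* All variable names and parameter values here are hypothetical.
clear
set seed 12345
set obs 50
generate clustervar = _n
generate u = rnormal(0, 0.3)             // cluster-level effect
expand 8
bysort clustervar: generate timeperiod = _n
expand 40                                // 40 attendances per cluster-period
generate male = runiform() < invlogit(0.2 - 0.05*timeperiod + u)

* Linear trend in the log-odds of an attender being male,
* with standard errors that allow for clustering
logit male c.timeperiod, vce(cluster clustervar)
test timeperiod

* Alternative: random intercept for each cluster
xtset clustervar
xtlogit male c.timeperiod, re
```

The z-test (or the -test- result) on timeperiod plays the role of the trend test here. With real data you would drop the simulation lines and use your own 0/1 variable.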

If the latter then you're describing count data, so you should be using some kind of count model: Poisson or negative binomial regression. Each of these has a panel equivalent in Stata (xtpoisson, xtnbreg).


Many thanks for your reply again. It is indeed the former; however, the reason I didn't use xtlogit was that a) apart from sex, the other variables are categorical (age has 4 bands, socio-economic status has 3, etc.) and thus cannot be coded 0/1, and b) it doesn't strike me as statistically correct to suggest that the sex of the participant is somehow changed by time period (though I appreciate it would give me the answer I want).

I'd be happy to forgo the last point though, if I could find a way to use non-binary, categorical variables. Any suggestions?

As always, many thanks in advance.


You could, theoretically, do a multilevel multinomial logistic regression. This is quite hard to do (at least in Stata) and probably quite over the top.
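For what it's worth, a simpler stand-in (not a multilevel model, so a different technique from the one just mentioned) is an ordinary multinomial logit with cluster-robust standard errors. The sketch below uses simulated data; agegrp, timeperiod and clustervar are made-up names and the numbers are arbitrary.

```stata
* Sketch only: simulated data, hypothetical variable names.
* Ordinary -mlogit- with cluster-robust SEs, NOT a multilevel model.
clear
set seed 777
set obs 50
generate clustervar = _n
expand 8
bysort clustervar: generate timeperiod = _n
expand 40                                // 40 attendances per cluster-period

* Age band 1-4; the youngest band becomes less common over time
generate p1 = 0.40 - 0.02*timeperiod
generate r = runiform()
generate agegrp = 1 + (r > p1) + (r > p1 + 0.2) + (r > p1 + 0.4)

* Does the age-band mix shift linearly (on the log-odds scale) with time?
mlogit agegrp c.timeperiod, baseoutcome(1) vce(cluster clustervar)

* Joint test of the time coefficient across all outcome equations
test timeperiod
```

The joint -test- asks whether the distribution across age bands changes with time period at all, which is close in spirit to a trend test for a multi-category variable.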

Others may have better advice but I would be tempted to model it as a count variable using Poisson regression - so the outcome would be the number of attendances and the predictors would be time period, sex, age, socioeconomic status etc. You could take account of the clustering by using a clustered sandwich estimator (option cluster(clustervar)) or by using a panel estimator (xtpoisson after typing -xtset clustervar-)

Then the question of whether, for example, the proportion of males has changed over time can be re-phrased as whether the impact of time on attendance was different for men and women - which is a simple interaction. So the command may look something like:
xtpoisson attendance timeperiod sex ses ...
and for the interaction:
xtpoisson attendance c.timeperiod##i.sex ses ...
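To make that concrete, here is a minimal sketch on simulated counts, one row per cluster, time period and sex. The variable names and the simulated effect sizes are invented for the example. With real attendance-level data you would first need to collapse to counts, for instance with something like -contract clustervar timeperiod sex, freq(attendance)-.

```stata
* Sketch only: simulated counts, hypothetical variable names.
clear
set seed 2024
set obs 50
generate clustervar = _n
generate u = rnormal(0, 0.3)             // cluster-level effect
expand 8
bysort clustervar: generate timeperiod = _n
expand 2
bysort clustervar timeperiod: generate sex = _n - 1   // 0 = female, 1 = male

* Male attendance declines over time; female attendance stays flat
generate attendance = rpoisson(exp(3 + u - 0.05*timeperiod*sex))

* Clustered sandwich estimator, with the time-by-sex interaction
poisson attendance c.timeperiod##i.sex, vce(cluster clustervar)
test 1.sex#c.timeperiod

* Panel (random-effects) version
xtset clustervar
xtpoisson attendance c.timeperiod##i.sex, re
```

The test on the interaction term asks whether the time trend in attendance differs between men and women, which is the re-phrased version of "has the proportion of males changed over time".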