0-3 bounded continuous dependent variable

Hello all,

I am working on a dataset that is difficult (for my abilities). The dependent variable is continuous and bounded on [0, 3], and it was measured between 2000 and 2019 at several locations (not necessarily the same locations each year). So there is a spatial component that I would like to account for.

My independent variables are year (as categorical, plus year_lin as its continuous version) and catvar (categorical).

It is an observational study; there is no replication or randomization.
The goal is to examine how Y has changed over the past two decades across the entire region, controlling for catvar.
So I am trying different variations (different distributions and covariance structures) of this mixed model:

proc glimmix data=data plots=all;
class state year catvar;
model Y = year_lin|catvar / dist=lognormal ddfm=satterth solution;
random intercept / subject=state*year type=vc;
random intercept / subject=state*year type=sp(sph)(long lat) residual;
run;

The second random statement (R-side) never worked (cannot find good starting values).

I have also tried transforming Y to lie between 0 and 1 and fitting a beta distribution. No matter what I do, there are always issues with the residuals that I cannot solve.
The dataset is not very large (~650 datapoints).
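
For reference, the rescaling to the (0, 1) range mentioned above can be sketched as follows. This is only a sketch under assumptions: the "squeeze" step is one commonly used way to pull exact 0s and 1s inside the open interval (the beta distribution needs 0 < y < 1), the random-effects structure is simply copied from the first RANDOM statement, and beta_ready, y01 and y_beta are hypothetical names.

```sas
/* Sketch only: rescale Y from [0, 3] to (0, 1) before a beta fit. */
data beta_ready;
  set data nobs=n;                     /* n = number of observations in the input */
  y01    = Y / 3;                      /* map [0, 3] onto [0, 1] */
  y_beta = (y01 * (n - 1) + 0.5) / n;  /* squeeze exact 0s and 1s into (0, 1) */
run;

proc glimmix data=beta_ready plots=all;
  class state year catvar;
  model y_beta = year_lin|catvar / dist=beta link=logit solution;
  random intercept / subject=state*year;   /* G-side random intercept, as in the model above */
run;
```

With about 650 observations the squeeze moves an exact 0 to roughly 0.0008 and an exact 1 to roughly 0.9992, so results may be sensitive to that choice when there are boundary values.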

Any ideas of what might work?

proc glimmix is a royal pain. If you're going lognormal, why not just log-transform Y and head on over to proc mixed? What about the 0's in [0, 3]? You can't take the logs of those, can you? GEE is another option, and probably going to be a lot easier. Doesn't glimmix go to 'GEE mode' under some covariance structures?

Good luck!

Thank you both for the replies. There are only a handful of zeros. I can add 0.0001 to every value so I won't lose any datapoints when using the log transformation. I was using glimmix to fit a beta distribution after converting everything to the (0-1) range. I will try proc mixed too.
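
The log-transform route could be sketched roughly like this. It is a sketch under assumptions, not a tested model for these data: log_ready and logY are hypothetical names, and the fixed and random effects are copied from the glimmix model in the first post.

```sas
/* Sketch only: shift by a small constant, log-transform, fit with proc mixed. */
data log_ready;
  set data;
  logY = log(Y + 0.0001);   /* zeros become log(0.0001), about -9.21 */
run;

proc mixed data=log_ready;
  class state year catvar;
  model logY = year_lin|catvar / ddfm=satterth solution;
  random intercept / subject=state*year;
run;
```

One thing worth checking with this approach: a value such as 0.34 maps to about -1.08 on the log scale, while the shifted zeros sit near -9.21, so the handful of zeros may show up as extreme points in the residual plots, and the results may depend on the constant chosen.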

The variable is continuous, with values such as 0.34, 1.27, 2.94, etc., so I don't think ordered logistic regression is a valid approach.

Hello again,

I am still working on the same data and I have a question about the residual plots. Everything looks good (histogram, QQ-plot and box-plot) apart from the fitted vs residual plot (top left). I have tried many different models and I keep seeing this tilted rectangle.

This model works best so far.

proc glimmix data=data plots=all plots=boxplot;
class A B location year;
model Y = A|B|year_lin / ddfm=kr2 solution dist=n;
random location / subject=year;
random residual / subject=year type=ar(1);
run;

I include year as a fixed continuous variable (year_lin) and year (as categorical) as the subject in the random residual statement to account for any R-side correlated errors.
I have also tried to account for possible spatial R-side correlation using the coordinates of each location, but that resulted in zero covariance parameter estimates.

I believe there is something I am not accounting for in the model.

Any thoughts on this?
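
One check that may help in reading the tilted rectangle: because Y is bounded on [0, 3], every raw residual (Y minus fitted) has to fall between 0 - fitted and 3 - fitted, which are two parallel lines with slope -1 in the residual vs. fitted plot. The sketch below overlays those two lines on the residuals. It is a minimal sketch under assumptions: diag, fitted and res are hypothetical names, and the R-side statement is left out to keep the example short.

```sas
/* Sketch only: save fitted values and residuals, then overlay the bounds implied by 0 <= Y <= 3. */
proc glimmix data=data;
  class A B location year;
  model Y = A|B|year_lin / ddfm=kr2 solution dist=normal;
  random location / subject=year;
  output out=diag pred=fitted resid=res;
run;

proc sgplot data=diag;
  scatter x=fitted y=res;
  lineparm x=0 y=0 slope=-1;   /* lower limit: residual = 0 - fitted */
  lineparm x=0 y=3 slope=-1;   /* upper limit: residual = 3 - fitted */
  refline 0 / axis=y;
run;
```

If the points fill the band between the two lines, the tilted shape may come from the bounded response itself rather than from a term missing in the model.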