I am working on a difficult (for my abilities) dataset. The dependent variable is continuous [0-3] bounded and was measured between 2000 to 2019 in several locations (not necessarily the same location in each year). So I have a spatial component that would like to account for.

My independent variables are year (as categorical and year_lin as continuous) and catvar (as categorical).

It is an observational study, there is no replication/randomization.

The goal is to examine how Y changed the past 2 decades and control for catvar across the entire region.

So I am trying different variations (different distributions and covariance structures) of this mixed model:

proc glimmix data = data plots=all;

class state year catvar;

model Y= year_lin|catvar/dist=lognormal ddfm=satterth solution;

random intercept/subject=state*year type=vc ;

random intercept/subject=state*year type=sp(sph)(long lat) residual;

run;

The second random statement (R-side) never worked (cannot find good starting values).

I have also tried transformation to bound Y between 0-1 and use beta dist. No matter what I do, there are always issues with the residuals that I cannot solve.

The dataset is not very large (~650 datapoints).

Any ideas of what might work?

Thanks