Not sure if ANCOVA or multivariable linear regression is best

I’m using SAS to analyze data for a quality of life (QoL) project and I’m trying to answer two research questions. I want to present something simple for a poster presentation and a little more detailed for a manuscript. Initially I thought about using ‘Proc GLM’ since it does ANOVA, ANCOVA, and linear regression but now I’m not so sure.

The QoL scores have 5 different well-being domains (u2 urinary, bowel, sexual, hormonal). Preliminary examination of the data showed poor reliability, with low Cronbach's alpha coefficients (0.52-0.66). Due to incomplete responses, 8.5%-15% of the responses could not be scored, depending on the domain. Skewness of the QoL scores ranges from -0.2 to -2.9. In the univariate statistics, the normal probability plots take on a concave shape.
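Checks of this kind can be run roughly as follows. This is only a sketch: the item variables famwb1-famwb5 are placeholder names (the real item names and item count are not shown here), and famwb is the scored domain used in the models below.

```sas
/* Sketch only: famwb1-famwb5 are placeholder item names for one domain */
proc corr data=qol alpha nomiss;      /* ALPHA requests Cronbach's alpha */
  var famwb1-famwb5;
run;

/* Skewness and a normal probability plot for the scored domain */
proc univariate data=qol normal;
  var famwb;
  probplot famwb / normal(mu=est sigma=est);
run;
```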

Data
DV = (continuous QoL scores – each will be examined in a separate model)
Main effect = (categorical race/ethnicity)
IVs = (age, marital status, BMI, income) combination of continuous and categorical


Research Questions
1) Are there racial differences in QoL scores after controlling for multiple confounders? (Address this in a poster & manuscript)

Approach for Research Question #1
I think of this as the ANCOVA portion of the ‘Proc GLM’ procedure, since I’m only reporting adjusted means and adjusted mean differences.
(Questions) Is this an appropriate way of stating this? Can I really call it that, since ‘raceth’ is not a continuous covariate?
Proc glm data=qol order=internal;
Class raceth (other categorical IVs);
Model famwb = raceth IVs /solution ss3;
Lsmeans raceth/pdiff;
Run; Quit;


2) What is the effect of race/ethnicity on QoL scores after controlling for multiple confounders? (Address this in a manuscript only)

Approach for Research Question #2
I think of this as a multivariable linear regression.
(Question) If I reported parameter estimates would this be considered the regression portion of the ‘Proc GLM’ procedure?

Proc glm data=qol order=internal;
Class raceth (other categorical IVs);
Model famwb = raceth IVs /solution ss3;
Run; Quit;

(Question) – I want ‘Caucasian’ to be my reference and compare the other racial groups to the reference. What’s the syntax for this? I know there is a contrast option but I’ve never used it.
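One possible way to set this up, as an untested sketch: recent SAS/STAT releases accept a REF= option on the CLASS statement in Proc GLM, and the LSMEANS statement accepts PDIFF=CONTROL to compare every level against one chosen level. The covariate names age, bmi, and marital are placeholders, and 'Caucasian' is assumed to be the formatted value of that level of raceth.

```sas
/* Sketch only: age, bmi, marital are placeholder covariate names;      */
/* 'Caucasian' is assumed to be the formatted value of the raceth level */
proc glm data=qol order=internal;
  class raceth (ref='Caucasian') marital;
  model famwb = raceth age bmi marital / solution ss3;
  /* adjusted mean differences of each group versus the reference group */
  lsmeans raceth / pdiff=control('Caucasian') cl;
run; quit;
```

With REF= in place, the raceth parameter estimates from the SOLUTION option should be differences from the reference group, so a separate CONTRAST statement may not be needed.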


(Final questions)
I plotted the residuals against some of the continuous IVs, but the patterns were not random. I tried several transformations, but they didn’t make a huge difference. Would it be better to categorize these variables or not use them at all?
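A residual plot of this kind can be produced roughly as follows. This is a sketch under assumptions: age, bmi, and marital are placeholder variable names, and resid, res, and pred are arbitrary names for the output data set and its columns.

```sas
/* Sketch only: age, bmi, marital are placeholder covariate names */
proc glm data=qol order=internal;
  class raceth marital;
  model famwb = raceth age bmi marital / solution ss3;
  output out=resid r=res p=pred;     /* save residuals and predicted values */
run; quit;

/* Residuals against one continuous IV, with a smoother to show curvature */
proc sgplot data=resid;
  scatter x=age y=res;
  loess x=age y=res / nomarkers;
  refline 0 / axis=y;
run;
```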

After adjusting for all confounders, the R^2 values were low (~0.04). Doesn’t this raise a red flag? Could this be because of the large percentage of unscored data?

I apologize for the lengthiness. I’d appreciate any light you can shed on these issues.