I compute the mean of the variable Y in 4 groups (A, B, C, D) that differ in age, gender and body mass index (BMI).
In order to understand whether these confounding factors affect the mean, I fit a linear regression model that tells me:
The model is significant (p < 0.05), but it explains very little of the variance in my data (adjusted R-squared = 0.0168).
From the output I conclude that:
1) Gender affects the means -> the means should be corrected so that they represent groups with the same proportion of males and females.
2) Group D has a lower mean value of Y than the other groups.
3) Age affects the mean of Group D -> the means should be corrected so that they represent groups of the same age.
4) BMI does not affect the means, so I can ignore it and need not compensate for the BMI imbalance between the groups.
Could you please tell me whether my reasoning is correct and how to adjust the means, possibly using MATLAB?
Code:
       | GROUP A    | GROUP B     | GROUP C     | GROUP D    |
n      | 89         | 375         | 302         | 166        |
MALE   | 46 (51.6%) | 241 (64.2%) | 202 (66.8%) | 113 (68%)  |
FEMALE | 43 (48.3%) | 134 (35.7%) | 100 (33.1%) | 53 (31.9%) |
AGE    | 67         | 66.7        | 66.8        | 64.4       |
BMI    | 26.8       | 27.4        | 26.4        | 23.3       |
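One possible common reference for the correction is the pooled gender mix and pooled age over all four groups, which can be computed from the table. This is only a minimal sketch: it assumes the AGE row gives the mean age of each group (the table does not say), and the variable names are my own.

```matlab
% Group sizes and male counts copied from the table (order: A B C D)
n       = [89 375 302 166];
nMale   = [46 241 202 113];

% Assumption: the AGE row holds the mean age of each group
ageMean = [67 66.7 66.8 64.4];

% Pooled proportion of males across all subjects
pMaleRef = sum(nMale) / sum(n);

% Pooled mean age, weighting each group mean by its group size
ageRef = sum(n .* ageMean) / sum(n);

fprintf('Reference: %.1f%% male, mean age %.2f years\n', 100*pMaleRef, ageRef);
```

Any other reference (for example 50% male) would serve equally well, as long as the same values are used for every group.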
Code:
Linear regression model:
Y ~ 1 + Gender + Age*Group
Estimated Coefficients:
                  Estimate     SE         tStat       pValue
(Intercept)       463.11       65.509     7.0694      3.0685e-12
Age               -0.68755     0.97163    -0.70762    0.47936
Gender_Male       -16.569      5.609      -2.954      0.0032168
Group B           -64.942      72.462     -0.89622    0.37037
Group C           -113.96      76.663     -1.4866     0.13747
Group D           -237.59      88.448     -2.6863     0.0073556
Age:Group B       0.98802      1.0734     0.92049     0.35756
Age:Group C       1.8465       1.1359     1.6256      0.10437
Age:Group D       3.7766       1.3354     2.828       0.0047853
Number of observations: 932, Error degrees of freedom: 923
Root Mean Squared Error: 80.5
R-squared: 0.0252, Adjusted R-Squared 0.0168
F-statistic vs. constant model: 2.98, p-value = 0.00264
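Using the coefficients above, the model-predicted mean of Y for each group at a common age and gender mix can be worked out by hand. The following is a minimal sketch, not a definitive method: it assumes Group A and Female are the baseline levels (as the coefficient names suggest), takes the pooled sample values (about 66.35 years, 64.6% male) as the reference, and uses variable names of my own. It gives point estimates only, without standard errors.

```matlab
% Coefficients copied from the regression output
% (assumed baselines: Group A and Female)
b0        = 463.11;                       % (Intercept)
bAge      = -0.68755;                     % Age, i.e. the age slope in Group A
bMale     = -16.569;                      % Gender_Male
bGroup    = [0 -64.942 -113.96 -237.59];  % Group A..D offsets at age 0
bAgeGroup = [0 0.98802 1.8465 3.7766];    % Age:Group A..D slope changes

% Assumed common reference: pooled values over all 932 subjects
ageRef   = 66.35;   % years
pMaleRef = 0.646;   % proportion of males

% Age slope within each group = Age + Age:Group
ageSlope = bAge + bAgeGroup;

% Predicted mean of Y per group at the common age and gender mix
adjMean = b0 + bMale*pMaleRef + bGroup + ageSlope*ageRef;

% Age at which the fitted lines for Group D and Group A cross
ageCross = -bGroup(4) / bAgeGroup(4);

labels = 'ABCD';
for k = 1:4
    fprintf('Group %s: age slope %7.3f, adjusted mean %7.2f\n', ...
            labels(k), ageSlope(k), adjMean(k));
end
fprintf('Fitted D and A lines cross at age %.1f\n', ageCross);
```

Because the model contains the Age*Group interaction, the Group D coefficient is the difference between D and A at age 0, not at a typical age, so the ordering of the adjusted means depends on the reference age that is chosen.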