Correlation coefficients as a variable in a multiple regression?


I would like to do a multiple regression using correlation coefficients as my dependent variable - is this possible?

Specifically, the project is about whether factors like personality, age, gender, etc. will predict how good a person is at assessing their own performance on a task. So the dependent variable is the correlation between how well they did and how well they thought they did.

I am a bit worried, though, that the correlations will be clustered at the high end: a preliminary glance at the data collected so far suggests people are actually pretty good at assessing their own performance. If so, I won't have a normally distributed set of data. Does that matter?

If so, what can I do about it? Would you recommend converting the r values to z scores or any other kind of transformation before using in the multiple regression?

Thank you for your help.


TS Contributor
It would be hard, wouldn't it?

Using a correlation as the dependent variable is not incorrect in theory, but it can be a little tricky. First, keep in mind that inference on Pearson's correlation assumes roughly bivariate normal data, and that the estimate is noisy when it is computed from a small sample. Perhaps you can try a nonparametric (rank-based) correlation such as Spearman's; that, of course, will depend on your data. The main point is that you will have to check the assumptions of the correlations first and then the assumptions of the regression.
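As a rough illustration of the per-person step, here is a minimal sketch that computes both a Pearson and a Spearman (rank) correlation for one participant. The scores and the function names are made up for the example, and it assumes each participant has several trials with an actual and a self-estimated score.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data for ONE participant: six trials.
actual = [55, 62, 70, 81, 90, 48]      # how well they did
estimated = [60, 58, 75, 80, 95, 50]   # how well they thought they did

r_pearson = pearson(actual, estimated)
r_spearman = spearman(actual, estimated)
print(round(r_pearson, 3), round(r_spearman, 3))
```

With only a handful of trials per person, either coefficient will be a noisy estimate, which is the small-sample concern mentioned above.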

Another issue is that the interpretation can become complicated. Regression studies the association between the predictors and the expected value of your response variable. So you would be relating the predictors to the expected value of the correlation for a given individual. That concept can be somewhat confusing.

Ironically, you shouldn't be too worried about the distribution of the response. Linear regression makes no normality assumption about the response variable itself; the assumption concerns the errors (residuals). Just be sure to test the model assumptions, since skewness in the response can lead to outliers, which can become a problem. Either way, there are robust regression techniques and estimators to deal with that.
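On the transformation asked about in the question: "converting r to z" usually means Fisher's z-transformation, z = arctanh(r), which stretches out values bunched near 1. The sketch below is only an illustration on simulated data. The predictor names (`age`, `extraversion`), the coefficients, and the sample size are all assumptions made up for the example, and ordinary least squares via NumPy stands in for whatever regression software you actually use.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of participants

# Hypothetical predictors.
age = rng.uniform(18, 65, n)
extraversion = rng.normal(0, 1, n)

# Simulate per-person correlations that cluster toward the high end.
z_true = 1.2 + 0.01 * (age - 40) + 0.2 * extraversion + rng.normal(0, 0.3, n)
r = np.tanh(z_true)

# Fisher z-transform of the correlations (undefined at r = +/-1,
# so exact +/-1 values would need clipping first).
z = np.arctanh(r)

# Ordinary least squares on the transformed response.
X = np.column_stack([np.ones(n), age, extraversion])
beta, *_ = np.linalg.lstsq(X, z, rcond=None)

# The residuals are what the model assumptions are checked on.
residuals = z - X @ beta

# Back-transform fitted values to the correlation scale for reporting.
fitted_r = np.tanh(X @ beta)

print("coefficients (intercept, age, extraversion):", beta)
print("residual mean and std:", residuals.mean(), residuals.std())
```

Coefficients are then on the z scale, so predictions need `tanh` to get back to correlations. Note also that the precision of each person's r depends on how many trials it is based on, which the plain least-squares fit here ignores.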

In brief, the analysis is possible, although it will be harder than a "common regression". In my opinion, it could be easier if you could measure a difference score, or find some other way to estimate a person's ability to assess their own performance.
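One possible reading of the "difference" suggestion is a per-person error score between actual and self-estimated performance, which could then serve as the response instead of a correlation. This is only a sketch of that idea with made-up scores; the function names are hypothetical, and it assumes both scores are on the same scale.

```python
def mean_absolute_error(actual, estimated):
    """Average size of the gap between estimate and actual score (0 = perfect)."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(e - a) for a, e in zip(actual, estimated)) / len(actual)

def mean_signed_error(actual, estimated):
    """Average of (estimate - actual); positive means overestimation."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(e - a for a, e in zip(actual, estimated)) / len(actual)

# Hypothetical data for one participant: six trials.
actual = [55, 62, 70, 81, 90, 48]
estimated = [60, 58, 75, 80, 95, 50]

accuracy_gap = mean_absolute_error(actual, estimated)
bias = mean_signed_error(actual, estimated)
print(accuracy_gap, bias)
```

The absolute version measures how far off a person is, while the signed version separates over- from underestimation; which one fits depends on the research question.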

If I can be of any further assistance, please feel free to ask. Good luck.