Regression or Correlation?


New Member
I have collected a range of genetic and non-genetic factors from a series of natural populations and I want to know:
a) how close the correlation is between different measures of genetic diversity (e.g. plot He against I). I know that different measures weight richness and evenness differently, and I want to see what the level of association can tell me about my populations, e.g. if He and I are highly correlated then maybe infrequent alleles are not important
b) whether non-genetic factors, e.g. density and area, are associated and whether they characterise habitat type
c) whether genetic and non-genetic variables are associated.

I feel that none of these are clearly dependent and I find that I want to swap the dependents around: is genetic diversity dependent on size, or is size dependent on genetic diversity? Should I use Model II regression? What I really want to know is whether they are associated, without making any assumption about why. If so, can I just use correlation?

I would love any feedback on this.

Hi Penny,

It sounds like you are just interested in whether the variables are associated, without committing to a direction for the relationship, so a correlation analysis will suffice. I always find it helpful to remember, though, that a regression gives no more indication of the causal relationship between variables than a correlation does. Causation can only be established with a true experimental design. A regression is really just a technique for building a statistical model that predicts an outcome. You can have an idea about the direction, though, and importantly your predictors should be chosen on the basis of some a priori hypothesis that they are useful in predicting the outcome variable (although the regression still will not tell you whether this is the true direction of the causal relationship). As such, if you have no a priori hypotheses about what is happening, or no need to predict the outcome variable, there is no need for regression: it will be difficult to interpret, and a correlation will be fine.
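
To make the symmetry point concrete, here is a minimal Python sketch. The numbers and the variable names `area` and `he` are made up for illustration and are not from the study. It shows that the correlation coefficient is the same whichever variable comes first, that the ordinary least-squares slope depends on which variable is treated as the dependent one, and that a reduced major axis slope (one common form of the Model II regression mentioned in the question) treats both variables symmetrically.

```python
import math


def _sums(x, y):
    """Return the sums of squares and cross-products about the means."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation coefficient; swapping x and y gives the same value."""
    sxx, syy, sxy = _sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(x, y):
    """Ordinary least-squares slope for predicting y from x."""
    sxx, _, sxy = _sums(x, y)
    return sxy / sxx


def rma_slope(x, y):
    """Reduced major axis slope of y on x: sign(r) * sd(y) / sd(x)."""
    sxx, syy, sxy = _sums(x, y)
    return math.copysign(math.sqrt(syy / sxx), sxy)


# Hypothetical values, for illustration only
area = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
he = [0.42, 0.40, 0.55, 0.52, 0.66, 0.61]

r_xy = pearson_r(area, he)
r_yx = pearson_r(he, area)
b_he_on_area = ols_slope(area, he)
b_area_on_he = ols_slope(he, area)

print("r(area, he) =", r_xy, " r(he, area) =", r_yx)
print("OLS slope, he on area:", b_he_on_area)
print("Implied he-per-area slope from regressing area on he:", 1 / b_area_on_he)
print("RMA slope, he on area:", rma_slope(area, he))
```

The two OLS lines disagree unless the correlation is perfect, which is why the choice of dependent variable matters for regression but not for correlation.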

As soon as more theoretical evidence for the direction of the relationships arises, you could then perform a regression. Hope this helps :)



Thanks George. I really appreciate your feedback. It's hard to determine what to use.

Initially, I used a whole series of pairwise correlation analyses to feel out the directionality of the relationships, if there were any.
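
A pairwise screening step of this kind might look roughly like the sketch below. The variable names and values are hypothetical placeholders, not the real measurements, and only the Python standard library is used.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-population measurements, for illustration only
variables = {
    "He": [0.41, 0.58, 0.50, 0.35, 0.66, 0.49],
    "I": [0.70, 1.05, 0.88, 0.61, 1.20, 0.90],
    "density": [12.0, 30.0, 25.0, 8.0, 40.0, 18.0],
    "area": [1.5, 4.0, 2.2, 0.8, 5.1, 3.0],
}

# Correlate every unordered pair of variables once
pairwise = {
    (a, b): pearson_r(variables[a], variables[b])
    for a, b in combinations(variables, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:.3f}")
```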

Then, because I wanted managers to be able to predict the levels of genetic diversity within a population by investigating its physical characteristics, I performed a multiple regression analysis to determine whether a good prediction of genetic diversity could be achieved and which factors gave the best indication.
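
For readers who want to see what such a prediction model involves, here is a minimal sketch of a multiple regression fitted by least squares through the normal equations. The data, the variable names (`density`, `area`, `he`) and the helper functions are all hypothetical. It uses only the standard library so that it runs anywhere; a real analysis would normally use a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(predictors, response):
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    n = len(response)
    rows = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(rows, response)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in rows]
    mean_y = sum(response) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical per-population values, for illustration only
density = [12.0, 30.0, 25.0, 8.0, 40.0, 18.0, 33.0, 22.0]
area = [1.5, 4.0, 2.2, 0.8, 5.1, 3.0, 3.6, 2.9]
he = [0.41, 0.58, 0.50, 0.35, 0.66, 0.49, 0.60, 0.52]

coef, r_squared = fit_multiple_regression([density, area], he)
print("intercept, density, area coefficients:", coef)
print("R squared:", r_squared)
```

Here genetic diversity is explicitly placed in the dependent role because the stated goal is prediction, which is the one situation where choosing a direction is justified without claiming causation.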

I used correlation first because I felt it would have been inappropriate to use regression analysis when the purpose of the work was to see whether there were any correlations between the factors. Although there is literature on the relationships between some of my variables, and correlations have been established in other species, I suspected that my species would not conform to what we expect based on other species.

I have been asked to change my correlation analysis to regression analysis...

But given your answer I feel quite okay about my choice of analysis (?)

Thanks again,