Comparing Slopes of Best Fit Lines Between 2 Groups

So let's say I have 2 data sets. Both sets contain various types of information about a person, including age as well as other factors relating to health (e.g. cholesterol, blood pressure, etc.).

The difference between the sets is that set 1 is sampled from the general population and set 2 is sampled from only healthy individuals with no risk factors.

Now say I run various bivariate correlations between age and the factors I am investigating, and I find tests that have a valid correlation with age in both groups. I then plot the best fit line for each test (note: that is one plot per set for each test).

How would I statistically compare the slopes of the best fit lines for each individual test, and not the tests as a whole?

I'm aware that if I just make a table of the slopes for set 1 and set 2 and then do a chi-squared test, I could see how the sets compare to each other as a whole, but I want to see if there is any way I can say:

In this one test, there is a very significant difference between the slope of the best fit line in set 1 and set 2.

I can't seem to wrap my head around this one, any help would be much appreciated!

Edit: If you know of a method that could be done using SPSS that would be even more helpful.


Cookie Scientist
Combine your two datasets into a single dataset, with a factor (perhaps named "set") indicating which set each observation came from. Then you want to run a model like DV ~ IV + set + IV:set, and you want to test the IV:set interaction term. This will tell you if the slope for IV differs between sets.
I have no idea how to evaluate this statement:
"DV ~ IV + set + IV:set"

I get what DV and IV mean, but I don't understand what saying DV ~ IV means or what you mean by IV:set.

So I went ahead and created a third set in SPSS, which is the combined sets.
There is a new variable for each data point called group, and its two values are 0 and 1.
Value 0 is the general-population group, value 1 is the healthy group.

So what should I do next exactly? (You don't need to put it in SPSS terms, but could you phrase it in words this time?)


Cookie Scientist
Okay, I guess I was a little too wrapped up in familiar jargon :p. "DV ~ IV + set + IV:set" is a piece of the syntax that you would use for running this model in R, but basically it reads as a regression equation, except with ~ in place of the equal sign. So in this model we are regressing the dependent variable (DV) on the independent variable (IV), the set indicator, and the interaction between IV and set, which is what "IV:set" is supposed to represent. Is this more clear?
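Written out as an ordinary regression equation, the formula corresponds roughly to the following. The coefficient names b_0 through b_3 and the error term are notation introduced here for illustration; they are not part of the R syntax.

```latex
\mathrm{DV} = b_0 + b_1\,\mathrm{IV} + b_2\,\mathrm{set} + b_3\,(\mathrm{IV}\times\mathrm{set}) + \varepsilon
```

If set is coded 0/1, the slope of IV is b_1 when set = 0 and b_1 + b_3 when set = 1, so b_3 is the difference between the two slopes.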

The interaction term ("IV:set") is what you are interested in testing in this model. As a word of caution I would not recommend interpreting the coefficients for IV and set in this model that includes the interaction term. If you want to interpret these coefficients, I would recommend doing this in the context of a model without an interaction. (Alternatively you could recode your IV and set variables to make the coefficients more interpretable in the interaction model, but I am not sure if you want to bother with this.)
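As a rough illustration of what the interaction test computes, here is a minimal sketch in plain Python (not SPSS or R). It relies on the fact that, with a two-level set factor, the coefficient for IV:set equals the difference between the slopes of the two separately fitted lines, and its standard error uses the residual variance pooled across both groups. The function names and the age/cholesterol numbers are made up for the example.

```python
import math

def fit_line(x, y):
    """Least-squares line for one group; returns slope, Sxx, and residual sum of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sxx, rss

def slope_difference_t(x0, y0, x1, y1):
    """Slope difference (group 1 minus group 0), its t statistic, and the degrees of freedom."""
    b0, sxx0, rss0 = fit_line(x0, y0)
    b1, sxx1, rss1 = fit_line(x1, y1)
    df = len(x0) + len(x1) - 4          # two intercepts + two slopes are estimated
    pooled_var = (rss0 + rss1) / df     # residual variance of the combined model
    se = math.sqrt(pooled_var * (1 / sxx0 + 1 / sxx1))
    return b1 - b0, (b1 - b0) / se, df

# Hypothetical data: group 0 = general population, group 1 = healthy individuals
age_general = [30, 40, 50, 60, 70]
chol_general = [180, 195, 212, 224, 241]
age_healthy = [30, 40, 50, 60, 70]
chol_healthy = [170, 176, 179, 186, 190]

diff, t_stat, df = slope_difference_t(age_general, chol_general, age_healthy, chol_healthy)
print(f"slope difference = {diff:.3f}, t = {t_stat:.2f}, df = {df}")
```

The t statistic would then be compared against a t distribution with df degrees of freedom to get a p-value; a statistics package does that step for you when it reports the interaction term.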


New Member
Hi guys

I am also trying to perform the same analysis, that is, comparing regression lines between two datasets. I have read this can be done using an ANCOVA, but I am struggling with how to arrange the data as well as how to actually run the test.

I have combined my two datasets into a single dataset, with a factor (named "group") indicating which group is which. In addition, I have my combined dependent and independent variables.

Now - how should I run the test? Should I go to Analyze → General linear model → Univariate and then what? I have the dependent variable, fixed factors, random factors and covariate boxes available. Should I put my dependent variable into the dependent variable box, my factor ("group") into the fixed factor box, and the independent variable into the covariate box? If all of the latter is correct, should I take the result shown for my factor ("group") in the between-subjects effects table of the SPSS output as the stats result (i.e. the statistical difference between regression lines)?

Sorry - but my knowledge in stats is very limited!!! Help, please!!!

Many thanks!!!