# Multiple Regression with Multiple Groups

##### New Member
Hello,
I am looking at how predictor variables (scores on partner perception measures) relate to a criterion variable (relationship satisfaction), in order to see how scores on multiple measures predict the participants' overall relationship satisfaction. I believe that multiple regression is the best statistical method for this (but I'm open to suggestions). The challenge I'm facing is that I'm looking at multiple populations (differing age ranges, genders, races). In order to compare these groups across the regressions, do I need to run a multiple regression for each group that I am examining? If so, should I use a Bonferroni correction for the many analyses I am running? Is there a better test I should be using that will help me examine how multiple predictors predict an outcome for multiple groups? Thanks for any help you can provide!

#### Philyuko

##### New Member
If your DV (that which you are trying to predict here) is categorical then you could run a discriminant analysis (2 or more groups) or a binary logistic regression (2 groups).

Many things to consider ...

For the DA, try to make sure you have a ratio of 20:1 (at least 20 cases per IV).
For the BLR, try to make sure you have a ratio of 40:1 (at least 40 cases per IV).

The DA assumes a normal distribution for the IVs, the BLR doesn't - hence the trade-off.

Now - before anything else - does this sound like it's on the right lines?

##### New Member
Thank you!

Yes, this sounds like it's on the same lines. Thanks for your help. I just want to reflect back what I read: it sounds like I should run the multiple regression first on the entire population, and then run a discriminant analysis and/or BLR after the initial analysis to compare across different groups. Would the DA/BLR qualify as post hoc analyses then? Am I understanding correctly? Thanks again.

#### Philyuko

##### New Member
First question is ...

Is it categorical or is it continuous?

If it's categorical - there is no MR to be run ...
If it is continuous - there is no DA or BLR to be run

so ... you choose ;-)
Can't have both
Must be one or the other ...

we'll take it from there ...

##### New Member
The DV is categorical. It is the level of relationship satisfaction.
So I would use a DA or BLR... is that right?

##### New Member
I have multiple independent variables- scores on 6 different measures...1 outcome variable, relationship satisfaction, and I also want to examine outcomes of different groups.

#### Philyuko

##### New Member
"I have multiple independent variables- scores on 6 different measures"

OK 6 ...

"1 outcome variable, relationship satisfaction"

How many groups? 1, 2, 3???
How many in each group?

"and I also want to examine outcomes of different groups."

which groups?

##### New Member
I want to examine results across age (3 groups), gender (2 groups), race (5 groups), and cohabitation status (2 groups). Total of 12. Each group should have at least 50 persons per category. (I haven't collected data yet, but I will be posting my study online and expect a relatively large n... goal of 400 people.)

#### Philyuko

##### New Member
OK - well each of those can become predictors too ... right?

So - as it is such a mixed bunch - I would def go for a DA here.

With 400 cases, and about 6 predictors, you have a nice ratio.

I STRONGLY recommend that you divide your data (randomly) into a training set and a testing set ... that is, if you're trying to find a model for predicting the DV. Split them at 67%/33% (Witten & Frank, 2005). Use the 67% for everything, and once you have made your model, use it to see how well it does on the remaining 33% ... that final 33% (the test set) tells you how good your model is.
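
To make the split concrete, here is a minimal sketch in plain Python. The function name `train_test_split`, the fixed seed, and the case IDs are all made up for illustration; they simply stand in for the planned sample of 400 respondents, and any stats package will have its own way of doing this.

```python
import random

def train_test_split(cases, train_fraction=0.67, seed=0):
    """Randomly split cases into a training set and a held-out test set."""
    shuffled = list(cases)
    # A fixed seed makes the random split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical case IDs standing in for 400 respondents.
case_ids = list(range(400))
train, test = train_test_split(case_ids)
print(len(train), len(test))
```

The model is built only on `train`; `test` is left untouched until the very end.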

Ummm ... what else ...
Report findings in terms of precision, recall, and F1.
You can load variables one by one if you have strong theoretical reasons for one being more important than another ... or you can do them all at once ... with this much data the training/test is good.
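
For reference, precision, recall, and F1 for one category can be computed from the test-set predictions roughly like this. It is only a sketch: the labels "high" and "low" and the two short lists are invented examples, not real data.

```python
def precision_recall_f1(actual, predicted, positive):
    """Precision, recall, and F1 for one category treated as 'positive'."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Invented test-set labels, purely for illustration.
actual = ["high", "high", "low", "low", "high", "low", "high", "low"]
predicted = ["high", "low", "low", "high", "high", "low", "high", "low"]
precision, recall, f1 = precision_recall_f1(actual, predicted, positive="high")
print(precision, recall, f1)
```

With more than two satisfaction levels, the same calculation would be repeated with each level in turn as the `positive` category.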

Do an ANOVA on the training data first because any non-sig variable should not be included in the DA
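
As a rough sketch of that screening step, the one-way ANOVA F statistic for a single predictor across the DV categories can be computed as below. The scores are made up, the function name is hypothetical, and the sketch stops at F and its degrees of freedom; the p-value would come from an F table or a stats package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up scores on one predictor, grouped by three DV categories.
scores_by_category = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(scores_by_category)
print(f_stat, df_between, df_within)
```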

A bunch of papers here
https://umdrive.memphis.edu/pmmccrth/public/Phil's papers.htm?uniq=-v6s2vs
do DAs
That should give you a good idea

good luck!

#### noetsi

##### Fortran must die
If your DV is categorical, ordinal logistic regression would make sense. If you think that subgroups influence predictions, then a dummy variable to address group as a predictor makes sense (although the more complex multilevel model would make more). If group matters, if there are random effects, then a robust standard error (White's, I think) would probably be recommended.
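
To show what dummy coding a group variable looks like, here is a small sketch. The age-band labels and the `dummy_code` helper are hypothetical; the point is only that a variable with k levels becomes k - 1 indicator columns, with one level left out as the reference.

```python
def dummy_code(values, reference):
    """Turn one categorical variable into 0/1 indicator columns.

    The reference level gets no column of its own, so a variable with
    k levels produces k - 1 dummies.
    """
    levels = sorted(set(values) - {reference})
    return [{f"is_{level}": int(value == level) for level in levels}
            for value in values]

# Hypothetical age bands for four respondents.
age_group = ["18-29", "30-49", "50+", "30-49"]
rows = dummy_code(age_group, reference="18-29")
for label, row in zip(age_group, rows):
    print(label, row)
```

Each dummy's coefficient is then read as the difference between that group and the reference group.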

If group matters, if an effect is nested inside a group, then at the least an ICC (intraclass correlation) statistic for the empty model will be useful. It's pretty simple to run. It shows how much group matters. You should have at least 30 groups to do this (there are bootstrapping methods to do it with fewer, but of course that is more complex).
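
A minimal sketch of the idea, assuming equal group sizes and using the one-way ANOVA estimate of the ICC rather than a fitted multilevel model (multilevel software would estimate it from the empty model's variance components instead). The data are invented and use only three groups to keep the example short, far fewer than the 30 suggested above.

```python
from statistics import mean

def icc_one_way(groups):
    """ANOVA-based ICC: share of total variance that lies between groups."""
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    n_groups = len(groups)
    grand_mean = mean([x for g in groups for x in g])
    ms_between = (k * sum((mean(g) - grand_mean) ** 2 for g in groups)
                  / (n_groups - 1))
    ms_within = (sum((x - mean(g)) ** 2 for g in groups for x in g)
                 / (n_groups * (k - 1)))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Invented outcome scores for three groups of three people each.
scores_by_group = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
icc = icc_one_way(scores_by_group)
print(icc)
```

A value near 0 means group membership explains little of the variation in the outcome; a value near 1 means most of the variation is between groups.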