# Regression analysis sample size

#### Statstuck

##### New Member
Hello,
I am undertaking a study looking at patients with AF (a heart rhythm disorder).
I'm looking at how a continuous baseline variable (X) predicts improvement in a continuous measure of function (Y) after a discrete treatment.
I plan on using regression analysis after collecting the before and after data in my cohort.

My question is-
I want to look at lots of other independent variables in the same cohort (e.g. hair colour, height, age, blood tests) for association with Y.
1. Can I do this by simply replacing X with the new variable and plotting a new regression chart? (Assuming independence.)
2. Do I need to increase my sample size if I am doing this additional variable analysis?

Thanks very much for your help with this,

Nick

#### obh

##### Member
Have you thought about multiple regression, i.e. using several predictors (Xi) in the model?

#### Statstuck

##### New Member
Hi @obh
Yeah, I considered that, but I figured a model with several Xi would mean I'd need a much greater sample size (for 80% power).
If there was a way I could assess additional variables without increasing sample size significantly, it would allow me to analyse many more features.

Is that possible?

#### obh

##### Member
I assume for better prediction, you need to use multiple regression.

What is the point of running many single regressions? What do you want to achieve?

#### Statstuck

##### New Member
I want to try to find associations with the outcome, and so compare several variables simultaneously.
As far as I understand, multiple regression needs larger sample sizes. My query is: is the significance of any findings different if I analyse the variables as single regressions individually vs. one multiple regression together? I assume that for the former I can use a smaller sample size.
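
To make the single vs. multiple regression question concrete, here is a minimal simulated sketch (not data from the study). It assumes two hypothetical predictors, `x1` and `x2`, that are correlated with each other, and an outcome `y` that truly depends only on `x1`. All names, coefficients and the sample size of 100 are illustrative assumptions.

```python
import numpy as np

# Simulated data only: x2 is correlated with x1, but y depends on x1 alone.
rng = np.random.default_rng(0)
n = 100  # assumed cohort size, chosen for illustration
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)


def ols(columns, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Single regression: y on x2 only.
slope_x2_alone = ols([x2], y)[1]

# Multiple regression: y on x1 and x2 together.
slope_x2_adjusted = ols([x1, x2], y)[2]

print(f"x2 slope, single regression:   {slope_x2_alone:.2f}")
print(f"x2 slope, multiple regression: {slope_x2_adjusted:.2f}")
```

When predictors are correlated like this, the single regression tends to credit `x2` with an effect that really belongs to `x1`, while the multiple regression estimates the `x2` slope with `x1` held fixed. So the two approaches can answer different questions, not just the same question with different sample sizes.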

#### obh

##### Member
Is your goal only to understand which IVs (Xi) correlate with your DV (Y)?
Or do you want to predict Y?

#### obh

##### Member
PS: even if you run a lot of single tests, some may come out significant purely by chance.
That's why you need to use a smaller significance level, which lowers power, so you need a bigger sample size.
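
A rough sketch of the arithmetic behind this, assuming the tests are independent and using a Bonferroni correction as one common (conservative) way to shrink the significance level. The number of tests, 20, is a made-up figure for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05
n_tests = 20  # hypothetical number of single regressions

fwer_uncorrected = familywise_error_rate(alpha, n_tests)
per_test_alpha = bonferroni_alpha(alpha, n_tests)
fwer_corrected = familywise_error_rate(per_test_alpha, n_tests)

print(f"Uncorrected chance of >=1 false positive: {fwer_uncorrected:.3f}")
print(f"Bonferroni per-test alpha:                {per_test_alpha:.4f}")
print(f"Corrected chance of >=1 false positive:   {fwer_corrected:.3f}")
```

With 20 independent tests at 0.05 each, the chance of at least one spurious "significant" result is roughly 64%. Testing each at 0.0025 brings that back under 5%, but each test is then harder to pass, which is where the need for a larger sample comes from.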

#### Statstuck

##### New Member
Predicting Y is the goal. I guess another way to look at it is:
If I want to run a multiple regression model and the maximum number of patients I'll be able to recruit is 100,
how many IVs will I be able to look at?