# Thread: Multivariate analysis?

1. ## Multivariate analysis?

I am working on designing a research project and I am trying to determine the appropriate statistical analysis.

My research project (simplified version) is the following:

There are 4 groups of people: 75 with breast cancer, 50 with lung cancer, and 25 with renal cancer. The 4th group is all cancers for 150 people.

Assume all groups receive the same anti-cancer drug. We then measure three dependent variables: improvement (yes/no), recurrence of cancer (yes/no), and other drug complications (yes/no).

Also, let us assume there are 3 independent variables for each of the groups: age (years), sex (male/female), and length of cancer before treatment (months).

Also, let us assume that after treatment it seems like people with breast cancer have higher improvement rates, lower recurrence rates, and lower complication rates than people with lung or renal cancer. However, that could be misleading if people with breast cancer were also younger, female, and had cancer for a shorter duration than those with lung or renal cancer.

So, I am having trouble determining which statistical analysis to use to study the same anti-cancer drug across 4 groups (breast, lung, renal, and all cancers) for 3 dependent variables (improvement, recurrence, and complications).

Then, I would like to see if there is a correlation/statistically significant relationship between the 3 independent variables (age, sex, length of cancer before drug) and the same 3 dependent variables (improvement, recurrence, and complications) across the 4 groups.

Thank you!

2. ## Re: Multivariate analysis?

I have not run it myself, but this seems like a situation for multiple multivariate logistic regression.
multiple: multiple explanatory variables
multivariate: multiple dependent variables
logistic regression: binary dependent variables.

Just from an experience standpoint, many red flags go off.
The first and most important is whether you will have enough power given the small sample sizes. For instance, renal cancer represents 25 cases; say that, at most, an even 50% of them have an outcome. That would mean you would be predicting for approximately 13 people with an event (this is optimistic). Now add age, sex, and length of cancer. You may have 7 who are female, 4 of those who are old, and 2 of those who have had cancer for a longer duration. Can you say with confidence (given sampling variation) that those 2 people represent all other like persons? You are also potentially making 6 pairwise comparisons (1v2, 1v3, 1v4, 2v3, 2v4, 3v4) and have to correct your level of significance to address false discovery. So now you have 2 women you are comparing across multiple groups, and your p-value has to land below a 0.008 (0.05/6) level of significance. All of this gets pretty dicey.
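To make that arithmetic concrete, here is a rough Python sketch. It counts the pairwise comparisons among the 4 groups, applies the 0.05/6 correction, and shows how quickly 25 renal cases thin out. The assumption that each split (event, female, older, longer duration) roughly halves the cell is only an illustration, not something known from the data, and `halve_repeatedly` is a made-up helper name.

```python
from itertools import combinations
from math import ceil

# The four groups from the original post.
groups = ["breast", "lung", "renal", "all"]

# Every unordered pair of groups: 1v2, 1v3, 1v4, 2v3, 2v4, 3v4.
pairs = list(combinations(groups, 2))

# Divide the usual 0.05 level by the number of comparisons.
alpha = 0.05
corrected_alpha = alpha / len(pairs)


def halve_repeatedly(n, splits):
    """Return the cell size after each of `splits` successive halvings.

    Assumption for illustration only: every extra binary split
    keeps about half of the remaining people (rounded up).
    """
    cells = []
    for _ in range(splits):
        n = ceil(n / 2)
        cells.append(n)
    return cells


# 25 renal cases -> with event -> female -> older -> longer duration
cells = halve_repeatedly(25, 4)

print(len(pairs), "comparisons, corrected alpha =", round(corrected_alpha, 4))
print("cell sizes after each split:", cells)
```

Under these assumptions the last cell holds only a couple of people, which is the concern about sampling variation described above.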

You can run the bivariate comparisons of the covariates between groups to see if they are potential confounders. But if your tests find no difference, you still need to make sure they were adequately powered, because you would just be failing to reject the null hypothesis of no difference. So you will need to make sure that any differences that are not statistically significant are also not clinically significant.
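One way to look at this, sketched in Python below, is to compute a confidence interval for the difference in a covariate (here, mean age) between two groups and check whether the interval excludes every difference you would consider clinically relevant. The ages and the 5-year margin are invented purely for illustration, the function names are hypothetical, and the interval uses a normal approximation, which is rough for samples this small.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(a, b, level=0.95):
    """Difference in means (a minus b) with an approximate CI.

    Uses unpooled variances and a normal critical value, so treat
    it as a rough approximation for small samples.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


def rules_out_relevant_difference(low, high, margin):
    """True only if the whole interval lies inside (-margin, +margin)."""
    return -margin < low and high < margin


# Hypothetical ages in years, NOT data from the study.
breast_ages = [52, 58, 61, 49, 66, 55, 60, 57]
renal_ages = [63, 59, 70, 54, 68, 61, 66, 58]

diff, low, high = mean_difference_ci(breast_ages, renal_ages)
not_significant = low < 0 < high
# Hypothetical margin: a 5-year age gap is assumed to matter clinically.
reassuring = rules_out_relevant_difference(low, high, margin=5)

print(f"difference = {diff:.2f} years, 95% CI ({low:.2f}, {high:.2f})")
print("not statistically significant:", not_significant)
print("clinically relevant difference ruled out:", reassuring)
```

With these made-up numbers the interval contains zero, so a test would fail to reject the null, yet it also contains age gaps larger than the chosen margin. That is the situation where "no significant difference" should not be read as "no confounding".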

All cancer = any cancer that is not breast, lung, renal?
