We have a data set with 11 columns: 3 species and 8 physicochemical indicators of water quality. Two of the species are zooplankton that are well known as biological indicators of water quality. We therefore initially assumed that we had only one response variable (zebra mussel density) and 10 independent variables (the 8 water-quality indicators plus the two zooplankton species). To check this, we ran a correlation/regression analysis of zebra mussel density against each of the two zooplankton species separately; neither species had a significant effect on zebra mussel density. We therefore believe that it does not make much sense to include them as independent variables, and that we can treat them as response variables instead.

So we have 3 response variables (the species) and 8 independent variables (water quality). Which kind of analysis would be best here? As far as I know, multiple regression usually has a single response variable. Should we run multiple regression three times, once for each response variable, or is there something better?

Secondly, it is not clear to us whether the variables themselves should be normally distributed, or only the residuals of the model.
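
To make the "one regression per response variable" option concrete, here is a minimal sketch in Python using NumPy. Everything in it is an assumption for illustration only: the data are synthetic placeholders (not our measurements), and the function name `fit_per_response` is hypothetical. It fits an ordinary least-squares model with an intercept for each of the 3 response columns on the same 8 predictors and returns the residuals, which is the quantity one would inspect for normality.

```python
import numpy as np

def fit_per_response(X, Y):
    """Fit one ordinary least-squares regression per column of Y.

    X: (n, p) array of predictors; Y: (n, k) array of responses.
    Returns (coef, residuals): coef has shape (p + 1, k) with the
    intercept in row 0; residuals has shape (n, k).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Prepend a column of ones so that each model has an intercept.
    design = np.column_stack([np.ones(len(X)), X])
    # With a 2-D right-hand side, lstsq solves each column separately,
    # which is the same as running k separate regressions.
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    residuals = Y - design @ coef
    return coef, residuals

# Synthetic placeholder data: 60 samples, 8 predictors, 3 responses.
rng = np.random.default_rng(0)
n = 60
X = rng.normal(size=(n, 8))
true_coef = rng.normal(size=(8, 3))
Y = 1.0 + X @ true_coef + rng.normal(scale=0.5, size=(n, 3))

coef, residuals = fit_per_response(X, Y)
print(coef.shape)       # (9, 3): intercept + 8 slopes, per response
print(residuals.shape)  # (60, 3): one residual series per response
```

Each column of `residuals` could then be examined (for example with a histogram or a Q-Q plot) separately for each species.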