I'm asking for advice. The topic of my research is brain iron accumulation. My task is the following: I have one dependent variable that is categorical and binomial (the patient either has the pathology or does not), and six independent variables that are continuous and nonparametric.
The question is: how can I predict the dependent variable using my data?
Thanks
Search for logistic regression or logit regression (the same thing). Probit regression is very similar.
(That is the most natural choice. But it is not non-parametric analysis.)
"And six independent variables that are continuous and nonparametric."
A thing like "nonparametric variables" does not exist. Instead, there are nonparametric statistical analyses (analyses which do not make certain assumptions).
With kind regards
Karabiner
»Jetzt kann mich der Führer mal am Arsch lecken.« (Ernst Kuzorra, 1941)
ondansetron (11-14-2017)
What do you mean by "data"? Do you mean the independent variables (predictors)? No "parametric" test assumes them to be normally distributed.
If you mean the dependent variable (DV): there is no need for a dependent variable to be normally distributed. For parametric models, instead it is the distribution of the model's prediction errors (residuals) which matters, not the distribution of the DV itself.
And if your sample size is large enough (say, n > 30 or 40 or so), then even normally distributed residuals are not necessary for "parametric" analyses.
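To illustrate the point about residuals versus the DV itself, here is a minimal sketch on simulated data. The group names, means, and noise level are made up for illustration and are not from this thread. The DV is clearly bimodal, yet the residuals around the fitted group means form a single, roughly normal clump.

```python
import random
import statistics

random.seed(0)

# Hypothetical data: two groups with very different means and normal noise within each group.
group_means = {"control": 10.0, "treated": 30.0}
data = [(g, random.gauss(mu, 2.0)) for g, mu in group_means.items() for _ in range(200)]

y = [value for _, value in data]

# Simplest parametric model: each observation is predicted by its own group's mean.
fitted = {g: statistics.mean(v for gg, v in data if gg == g) for g in group_means}
residuals = [value - fitted[g] for g, value in data]

# The DV has two clumps (around 10 and 30), so its overall spread is large...
sd_y = statistics.stdev(y)
# ...while the residuals are centered on zero with roughly the within-group noise SD.
sd_resid = statistics.stdev(residuals)

print(f"SD of DV: {sd_y:.2f}, SD of residuals: {sd_resid:.2f}")
```

A histogram of `y` would look nothing like a bell curve here, while a histogram of `residuals` would, and it is the latter that the usual checks refer to.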
With kind regards
Karabiner
There are no (distributional) assumptions about the independent variables in regression. The independent variables are assumed to be fixed values (and thus have no distribution).
In logistic regression the dependent variable is assumed to be binomially distributed. There is no normality assumption and no need to try to transform anything to a normal distribution.
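Written out, the logistic model for six predictors is usually stated as follows. This is a sketch in standard notation; the symbols p, x_j and \beta_j are introduced here and do not come from the thread.

```latex
\Pr(Y = 1 \mid x_1, \dots, x_6) = p
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_6 x_6)}},
\qquad
\log\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \dots + \beta_6 x_6 .
```

Under this model, e^{\beta_j} is the odds ratio for a one-unit increase in x_j with the other predictors held fixed.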
In general:
Some people seem to believe (after having taken an elementary course) that there are only two possibilities: either normal-distribution methods or non-parametrics. That is wrong. There are many parametric distributions (skewed and so on) that do not look like the normal distribution (e.g. the binomial, Poisson, and exponential distributions).
You are absolutely right about my statistics skills,
BUT when I compared the two groups (one with the pathology, the other without) using the Mann-Whitney test, I got 3 independent variables that differed between the groups, and the difference was statistically significant (in the beginning I had six independent variables).
So now I want to analyze how strongly each of those statistically significant independent variables influences the dependent variable (or find the correlation coefficient, or express it as odds ratios or some other mystic ****). And I think the influence will be rather small, because there are at least 10 more independent variables that can also influence the dependent variable I study. For example, as my study is connected with brain iron deposition, I have some patients (there were far fewer of them) who had a lot of iron in their brain but no signs of pathology, because of other independent variables that I don't have (and I think that's the reason for the skewness).
So, if you can give me some advice or just a link that will help me dig some gems out of all the mud I'm digging in, I'll be very grateful.
The idea is that the dependent variable (DV) is explained by the independent variables (IV1, IV2, ..., IV6). So that the "arrow" goes from the IVs to the DV.
DV <--- IV1, IV2, ...,IV6
So that pathology or non-pathology is explained by e.g. age and exercise etc. But if you do a Mann-Whitney test then you investigate how the two groups (pathology or non-pathology) influence age. That does not make sense. Mann-Whitney is just irrelevant here. (It is, by the way, sensitive to "spread", so it certainly has its assumptions, and they are often violated.)
By the way, a correlation is a parameter. If you want to estimate one, you are doing parametric estimation.
Go ahead and do a multiple logistic regression. Then you will also get an odds ratio.
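As a rough sketch of what that looks like, the following fits a multiple logistic regression by maximum likelihood and converts the coefficients to odds ratios. Everything here is an assumption made for illustration: the data are simulated, the predictor names (iv1 to iv3) and true coefficients are invented, and the plain gradient-ascent fitter stands in for whatever routine your statistics package provides.

```python
import math
import random

random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: three standardized predictors; the true coefficients are made up.
true_beta = [-0.5, 1.0, 0.0, -0.7]  # intercept, iv1, iv2, iv3
rows = []
for _ in range(400):
    x = [1.0] + [random.gauss(0.0, 1.0) for _ in range(3)]
    p = sigmoid(sum(b * xi for b, xi in zip(true_beta, x)))
    y = 1 if random.random() < p else 0  # binary outcome: pathology yes/no
    rows.append((x, y))

def fit_logit(rows, steps=500, lr=1.0):
    """Maximum likelihood via plain gradient ascent on the mean log-likelihood."""
    k = len(rows[0][0])
    n = len(rows)
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for x, y in rows:
            err = y - sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j in range(k):
                grad[j] += err * x[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

beta_hat = fit_logit(rows)
# exp(coefficient) is the odds ratio per one-unit increase in that predictor.
odds_ratios = [math.exp(b) for b in beta_hat[1:]]
for name, b, o in zip(["iv1", "iv2", "iv3"], beta_hat[1:], odds_ratios):
    print(f"{name}: coef={b:+.2f}  odds ratio={o:.2f}")
```

An odds ratio above 1 means higher values of that predictor go with higher odds of pathology, adjusted for the other predictors in the model. A real analysis would also report confidence intervals, which this sketch omits.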
I performed the M-W test just to detect whether there are any differences between the two groups, because at first it was only a hypothesis that the parameters I used to reveal intergroup differences can differ. What's wrong with that? (For example, I could compare the length of fingers among people with and without heart disease; that would be nonsense.)
In all the medical scientific articles I've read, the pathological state is presented as the dependent variable.
Last edited by Mykola; 11-14-2017 at 05:20 PM.
"But if you do a Mann-Whitney test then you investigate how the two groups (pathology or non-pathology) influence age."
But if you do a Mann-Whitney test here, you just investigate whether the two groups differ with respect to age, i.e. whether there's an association. The test itself does not say anything about influences. That is a matter of design and of interpretation.
E.g. one can conduct a randomized experiment with, say, 7 groups receiving different dosages of a toxic agent, and measure whether subjects (plants) are killed or not during the experiment. Then an M-W test can be used to investigate whether those plants which were killed had received higher dosages than the survivors. If yes, then the interpretation is straightforward (IMO): higher dosages here led to more deaths.
With kind regards
Karabiner
Here is a general comment, not particular to your post. Medical literature doesn't exactly use the right methods at the right time or in the right way (nor do they recognize statistics as something that does not follow a cookbook approach). So, the argument that medical publications use one method or do something one way is a poor argument. There is a lot of "oh, this group published with this analysis, that must be the right way to do it."
To piggyback on ondan's comment; many times reviewers of submitted papers will encourage authors to change their analyses to something the reviewer is more familiar with. A heads-up, you do not have to change your analyses if you can justify their appropriateness. But if two approaches are comparable in generated output, at times it may not be a bad idea to mirror existing literature to make your methods a little closer to theirs in order to be able to compare them better. But as ondan wrote, there are many ways to do things and not all are correct.
Stop cowardice, ban guns!
I agree that there is something in what Karabiner says. I guess that in many cases the Mann-Whitney test would give the same result (significant or not) as a logit model.
But the null hypothesis in Mann-Whitney is P(x1 > x0) = 0.5, where x0 is the age of those who are not sick and x1 is the age of those who are sick. According to the OP, the age (or the relevant IV in this case) is not normal, and it is possibly skewed and heteroscedastic, and Mann-Whitney is sensitive to that (search for Fagerland-Sandvik).
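The quantity in that null hypothesis can be estimated directly from the data by comparing every sick/not-sick pair. A small sketch follows; the ages are invented for illustration, and the function name is my own. Ties are counted as half a win, which is the usual convention.

```python
def prob_superiority(x1, x0):
    """Estimate P(X1 > X0) + 0.5 * P(X1 == X0) over all pairs; this equals U / (n1 * n0)."""
    wins = 0.0
    for a in x1:
        for b in x0:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(x1) * len(x0))

# Hypothetical ages, made up for illustration only.
age_sick = [61, 70, 58, 66, 73]
age_not_sick = [52, 60, 49, 66, 55]

p_hat = prob_superiority(age_sick, age_not_sick)
u_stat = p_hat * len(age_sick) * len(age_not_sick)  # the Mann-Whitney U for the sick group
print(p_hat, u_stat)
```

A value near 0.5 is what the null hypothesis predicts. The test asks whether the observed value is further from 0.5 than chance would explain, and that judgment is where the sensitivity to unequal spread comes in.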
Suppose that there is just one designed variable and it is only on high/low levels. That would be a very unnatural Mann Whitney test.
Note that the OP said:
"I have one dependent variable that is categorical and binomial (patient has the pathology or does not). And six independent variables"
In contrast, one can evaluate all six IVs together with the dependent variable sick/not sick in a logit model.
But this is about optimal inference. It is known that the dependent variable is binomial. Logit is estimated with maximum likelihood (ML). ML gives consistent and (asymptotically) efficient estimates. How could anything be better than maximum likelihood? And by the Neyman-Pearson lemma, the likelihood-based test would be the most powerful one.