my main variable of interest is the blood level of a protein, continuous, normally distributed (not completely sure though, should check graphically since in my populations there are only cases, no controls).

I step of analysis

my outcome is the difference between time 1 and time 0 of a variable that expresses whether the person has a symptom or not.

basically it can have 4 values:

- presence symptom at T1 and T0,

- absence symptom at T1 and T0,

- presence symptom at T1 (worsening),

- presence symptom at T0 (improvement).

(I might decide to collapse the first two in one single category since they mean that there has been no change in the situation globally, so it would become a categorical variable with three different values).

I shall run the analysis at unadjusted level and consequently considering few covariates.

I basically would like to investigate whether or not high values of the protein can be predictor of the difference in symptoms at T1 considering T0. My first thought was a multinomial logistic regression (putting as ref level the worsening of situation, since it is a logical evolution of the pathology). But I still have doubts whether or not anova would be a better idea.

II step of the anlysis:

symptoms can be evaluated with a score that goes from 0 to 10 so in the second level of the analysis my outcome is the difference between scores:

outcome: score T1 - score T0 (it goes from 10 to -10). 10 stands for the stronger worsening, - 10 for the stronger improvement. somehow the variable can be considered ordinal but I am not completely sure.

mean and median are both 0 (where median is 0 and mean is 0.2 or something like that) so there is a distribution quite equal at both sides.

My though here was to use tertile of the outcome ( tertile with the greater improvement, middle and greater worsening) and run another multinomial but I have a lot of doubts on this last one. What do you think? I excluded the linear regression (or some form of correlation) as well as the ordinal linear regression because there are quite few values in the outcome, but at same time too many to treat the variable as a categorical at first.

Thank you in advance for your help