I'm trying to finish my thesis but, unsurprisingly, am having trouble with the statistics.

The study I'm conducting evaluates an intervention for children that seeks to foster two dependent variables (knowledge and help-seeking), which are considered protective factors against psychopathology. There is no control group; all students (200+) received the treatment.

All variables are measured via questionnaire.

My first hypothesis was that the treatment is effective for the whole sample, and paired-samples t-tests indicate that it was.

In my second hypothesis, I want to know whether the treatment is as effective in four a priori defined (high-risk) subgroups as it is in the whole sample. I expect to find no difference, as the high-risk subgroups should benefit at least as much as the rest.

I have continuous questionnaire scores underlying the four subgroups: depressed, suicidal, impulsive and avoidant. I plan to define cut-off values to transform these scores into categorical variables (e.g. depressed / not depressed).
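To make the dichotomization concrete, here is a minimal sketch of what I have in mind, assuming a data-driven cut-off at the top quartile. The scores, the function names (`top_quartile_cutoff`, `dichotomize`) and the choice of the 75th percentile are placeholders for illustration, not my real data or a validated cut-off.

```python
import statistics

def top_quartile_cutoff(scores):
    """Return the 75th percentile of the scores (one possible data-driven cut-off)."""
    # statistics.quantiles with n=4 returns the three quartile boundaries.
    _q1, _q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q3

def dichotomize(scores, cutoff):
    """Code each child as 1 (high risk) if the score exceeds the cut-off, else 0."""
    return [1 if s > cutoff else 0 for s in scores]

# Made-up questionnaire scores, for illustration only.
depression_scores = [3, 5, 8, 12, 4, 15, 7, 9, 20, 6, 11, 2]

cutoff = top_quartile_cutoff(depression_scores)
at_risk = dichotomize(depression_scores, cutoff)

print("cut-off:", cutoff)
print("children above cut-off:", sum(at_risk))
```

A published clinical cut-off would replace `top_quartile_cutoff` wherever one exists; the quartile rule is only my fallback for the questionnaire without one.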

My plan is now to perform a mixed ANOVA with the within-subject factor "time" and the between-subject factor "risk group".
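As far as I understand, with only two time points (pre/post) and one dichotomous risk factor, the time × group interaction of this mixed ANOVA is equivalent to a pooled-variance t-test on the gain scores (F = t²). The sketch below shows that comparison under those assumptions; the data are made up, the helper names (`gain_scores`, `pooled_t`) are my own, and it computes only the test statistic and degrees of freedom, not a p-value.

```python
import math
import statistics

def gain_scores(pre, post):
    """Post minus pre for each child (lists must be in the same order)."""
    return [b - a for a, b in zip(pre, post)]

def pooled_t(x, y):
    """Student's t with pooled variance for two independent samples.

    Returns (t, df). Assumes roughly equal variances in both groups.
    """
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / df
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se, df

# Made-up knowledge scores, for illustration only.
risk_pre, risk_post = [10, 12, 9, 11], [14, 15, 13, 16]
other_pre, other_post = [13, 14, 12, 15, 13], [16, 18, 15, 17, 17]

t, df = pooled_t(
    gain_scores(risk_pre, risk_post),
    gain_scores(other_pre, other_post),
)
interaction_F = t ** 2  # equals the time x group F in the 2 x 2 mixed design

print("t =", round(t, 3), "df =", df, "F =", round(interaction_F, 3))
```

This shortcut would not apply with more than two measurement occasions or with a risk factor that has more than two levels; those cases would need the full mixed ANOVA.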

The following questions now arise for me:

1) Is a mixed ANOVA the go-to procedure?

2) At least one of my subgroup variables is not normally distributed, and there is no pre-existing cut-off value for its questionnaire. How do I define a cut-off, and what follows from the violation of the normality assumption? For information, the highest quartile still contains 30+ subjects.

3) It's safe to assume that the subgroups are not independent, as shown by the considerable overlap (e.g. between depressed and suicidal). Are there any problems that result from this? I assume I could still perform the mixed ANOVA just the same, right?
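To quantify the overlap mentioned in question 3, something like the following sketch could be used, assuming each subgroup has already been coded as a 0/1 indicator per child. The indicator values and the function name `overlap_summary` are hypothetical, and the Jaccard index is just one possible way to summarize overlap.

```python
def overlap_summary(group_a, group_b):
    """Count children in both subgroups and return the Jaccard overlap.

    group_a and group_b are 0/1 indicators in the same child order.
    Jaccard = (in both) / (in at least one); 0.0 if nobody is in either.
    """
    both = sum(1 for a, b in zip(group_a, group_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(group_a, group_b) if a == 1 or b == 1)
    return both, (both / either if either else 0.0)

# Made-up indicators, for illustration only.
depressed = [1, 1, 0, 0, 1, 0, 1, 0]
suicidal = [1, 0, 0, 0, 1, 0, 1, 1]

n_both, jaccard = overlap_summary(depressed, suicidal)
print("in both subgroups:", n_both)
print("Jaccard overlap:", jaccard)
```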

Thanks so much for your help. I hope this is my last run-in with statistics of this kind, as I'm really not fond of doing research.

Best regards