I would like to compare our ideal sample distribution to that from other samples in a study examining the age at onset (AAO) in schizophrenia, which is typically in late adolescence. In this study, our aim is to investigate the influence of patients' ethnicity and seasonality of birth on the AAO in schizophrenia. We will also compare our AAO distribution with those found in other published studies using a two-sample Kolmogorov–Smirnov test.
I have several questions:
What would be the cut-off value for the maximum data range? For instance, the oldest subject in our sample is 55 years old, but when comparing to other samples we don't always have the maximum value, or sometimes the maximum value is larger than our sample's maximum value. Therefore, what maximum data range should we use? Should we always use our own, or the highest value?
What statistical software is best suited for this analysis? I have been using STATA version 11.0 for this analysis so far, but if there is a more appropriate program please let me know.
I'd recommend you use expert knowledge and some judgement when deciding the highest value.
Schizophrenia in its current form is a relatively new phenomenon, and most of the people affected are young adults rather than elderly people (this is from my experience).
I would recommend that you either pick a limit and censor values greater than it, or decide on the maximum based on expert knowledge and your own judgement (from your own experiences).
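To make the cut-off idea concrete, here is a minimal Python sketch of one way it could be done. It assumes you apply the same upper age limit to both samples by simply dropping onset ages above it (truncation, which is only one reading of "censoring"), and then compute the two-sample Kolmogorov–Smirnov statistic with an asymptotic p-value. The function names, the cut-off of 55, and the ages are all made up for illustration; in practice a library routine such as `scipy.stats.ks_2samp` would be the more usual choice.

```python
import math
from bisect import bisect_right


def truncate_at(ages, cutoff):
    """Keep only onset ages at or below the cutoff (hypothetical helper)."""
    return [a for a in ages if a <= cutoff]


def ks_two_sample(x, y):
    """Return (D, p) for the two-sample Kolmogorov-Smirnov test.

    D is the largest absolute gap between the two empirical CDFs.
    p uses the asymptotic Kolmogorov distribution, so it is only a
    rough guide for small samples or data with many ties.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # Evaluate both empirical CDFs at every observed value.
    d = max(
        abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
        for v in xs + ys
    )

    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.3:
        # The series below converges slowly here; p is effectively 1.
        return d, 1.0
    p = 2 * sum(
        (-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    # Made-up onset ages, for illustration only.
    ours = [17, 18, 19, 21, 22, 24, 25, 29, 34, 55]
    published = [16, 18, 20, 23, 26, 27, 31, 38, 47, 62, 68]

    cutoff = 55  # assumption: use our own sample's maximum as the limit
    d, p = ks_two_sample(truncate_at(ours, cutoff),
                         truncate_at(published, cutoff))
    print(f"D = {d:.3f}, asymptotic p = {p:.3f}")
```

Rerunning the comparison with a few different cut-offs is a cheap way to see how sensitive the conclusion is to that choice.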
You might want to do a literature check to get a few statistical attributes of schizophrenia patients, and to consider both the population you are analyzing and what your sample represents.