I am very new to statistical analysis, but am very excited to be learning it. I am working on finding reference ranges for some mammalian blood work. I hope this is the appropriate place to post my query; if not, my apologies.

I have successfully found my reference ranges for all my normally distributed material. Now I am working on my non-parametric categories. I am under the impression that I will find the reference ranges by arranging my data points in order and marking the 2.5th and 97.5th percentiles, followed by finding the 95% confidence interval (IC95%) for each of those values (I may have misinterpreted this). Then, using the upper bound of the IC95% for the 2.5th percentile and the lower bound of the IC95% for the 97.5th percentile, I have developed a presumed reference range. I would be grateful for any guidance on the appropriate way to calculate these values.

Additionally, everywhere I look to find a calculation for IC95% I get a slightly different equation. Could someone offer an accurate formula?

I have also been led to believe that this may be an appropriate time to use bootstrapping for my non-parametric data. Unfortunately, I am struggling to understand how this would help or how this works.
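To make the question concrete, here is a minimal Python sketch of how I currently understand the bootstrap version of this. The data are made up, the function names (`percentile`, `bootstrap_ci`) are my own, and the 2000 resamples and the linear interpolation between ranks are assumptions on my part rather than a recommended standard:

```python
import random

def percentile(sorted_values, q):
    """Percentile q (0-100) of pre-sorted data, by linear interpolation between ranks."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

def bootstrap_ci(values, q, n_boot=2000, level=95.0, seed=0):
    """Percentile-bootstrap confidence interval for the q-th percentile."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        # Resample the data with replacement and recompute the percentile.
        resample = sorted(rng.choices(values, k=n))
        estimates.append(percentile(resample, q))
    estimates.sort()
    tail = (100.0 - level) / 2.0
    return percentile(estimates, tail), percentile(estimates, 100.0 - tail)

# Hypothetical, made-up right-skewed values standing in for one blood analyte.
data_rng = random.Random(42)
values = [data_rng.lognormvariate(1.0, 0.5) for _ in range(120)]

data = sorted(values)
lower_limit = percentile(data, 2.5)
upper_limit = percentile(data, 97.5)
lower_ci = bootstrap_ci(values, 2.5)
upper_ci = bootstrap_ci(values, 97.5)

print(f"2.5th to 97.5th percentile: {lower_limit:.2f} to {upper_limit:.2f}")
print(f"95% CI of lower limit: {lower_ci[0]:.2f} to {lower_ci[1]:.2f}")
print(f"95% CI of upper limit: {upper_ci[0]:.2f} to {upper_ci[1]:.2f}")
```

Whether the reported range should be the two percentiles themselves or the inner confidence bounds is exactly the part I am unsure about.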

Thank you for all your help and time!

I'm working on a default probability model, which I am fitting with logistic regression. The Pearson residual and deviance residual plots are attached. Can anyone give me some idea of how to interpret these two plots? Do they indicate heteroskedasticity, serial correlation, overdispersion, curvature, etc.?
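For reference, this is a minimal Python sketch of how I understand the two residuals to be defined for a 0/1 outcome. The outcomes and fitted probabilities are hypothetical values for illustration, not output from my model, and the function names are my own:

```python
import math

def pearson_residual(y, p):
    """Pearson residual for a binary outcome y (0 or 1) and fitted probability p."""
    return (y - p) / math.sqrt(p * (1.0 - p))

def deviance_residual(y, p):
    """Deviance residual: signed square root of the observation's deviance contribution."""
    log_lik = y * math.log(p) + (1 - y) * math.log(1.0 - p)
    sign = 1.0 if y > p else -1.0
    return sign * math.sqrt(-2.0 * log_lik)

# Hypothetical outcomes (1 = default) and fitted probabilities strictly between 0 and 1.
outcomes = [0, 0, 1, 0, 1, 1]
fitted = [0.05, 0.20, 0.30, 0.60, 0.80, 0.95]

for y, p in zip(outcomes, fitted):
    print(f"y={y} p={p:.2f} pearson={pearson_residual(y, p):+.3f} "
          f"deviance={deviance_residual(y, p):+.3f}")
```

As far as I understand, with ungrouped 0/1 outcomes each fitted probability can produce only two possible residual values, so plots like these tend to fall on two curves.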

Thanks!!

For example, take the following tables of observed and expected counts:

Observed

      x     y     z   total
a    10     5    20      35
b    20    25    30      75
c    30    15    20      65

Expected

             x             y      z
a           12             9     14
b  25.71428571   19.28571429     30
c  22.28571429   16.71428571     26

This gives 4 degrees of freedom and an overall chi-square of 11.88, which is significant at p < .05.

But why can one not simply analyse each column separately, using the total column to supply the theoretical proportions?

The expected values come out exactly the same for each cell, which makes it much easier to pin down which cell is driving the significance.

In the above example, the chi-square test of independence is significant, with the largest individual cell contributions at (c,x) and (a,z). If each column is analysed separately, with the proportions given as suggested and 2 dof, then no column comes out as significant.
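Here is a minimal Python sketch of the two calculations I am comparing, using the observed counts from the example. The helper `chi2_sf_even_dof` is my own: it uses a closed-form p-value that only holds for even degrees of freedom, which is enough here (4 and 2) and avoids depending on a statistics library:

```python
import math

rows = ["a", "b", "c"]
cols = ["x", "y", "z"]
observed = [
    [10, 5, 20],
    [20, 25, 30],
    [30, 15, 20],
]

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

# Per-cell contribution (O - E)^2 / E.
contrib = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

def chi2_sf_even_dof(stat, dof):
    """Upper-tail p-value of a chi-square statistic; closed form valid for even dof only."""
    if dof % 2 != 0:
        raise ValueError("closed form used here needs an even dof")
    half = stat / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))

# Test of independence on the whole table.
overall = sum(sum(row) for row in contrib)
overall_dof = (len(rows) - 1) * (len(cols) - 1)
overall_p = chi2_sf_even_dof(overall, overall_dof)
print(f"independence: chi2={overall:.2f}, dof={overall_dof}, p={overall_p:.4f}")

# Column-by-column goodness of fit against the row-total proportions.
# The expected counts are the same cells, so each column statistic is the
# sum of that column's cell contributions, tested on (rows - 1) dof.
column_stats = [sum(col) for col in zip(*contrib)]
column_dof = len(rows) - 1
column_p = [chi2_sf_even_dof(stat, column_dof) for stat in column_stats]
for name, stat, p in zip(cols, column_stats, column_p):
    print(f"column {name}: chi2={stat:.2f}, dof={column_dof}, p={p:.4f}")
```

Written this way, the three column statistics add up to the overall statistic, while their degrees of freedom add up to 6 rather than 4.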

I understand that these have different usages in practice, but I am trying to work out whether doing this is valid or not.

Thanks in advance