# Thread: Comparing the differences between two samples

1. ## Comparing the differences between two samples

Okay, it has been quite a few years since I have really used any statistics concepts so I am puzzled on what is probably a simple problem that is work related.

We have a model that rates a profile with a number ranging from 1 to 30, with 30 being awesome and 1 being terrible. Anything below 10 we don't like. Now we are looking to build a new model based on different criteria, and the range of scores would be from 1 to 40 this time. I want to know what the new threshold for a minimum score should be.

We took a sample of 40 profiles (all 40 scored above the minimum of 10 on the original model) and scored them with both models to see what the difference was. The new model tends to score higher, mostly because the range of possible scores increased on the high end. Basically, the new model scores about 5 points higher on average. So the question becomes: should we just make our new minimum 15 points now (the original 10 plus the 5-point higher average)? Or should I be looking at standard deviation?

Say I want 95% of the profiles that passed the first model (scored over 10, which is true for all 40 in the sample) to also pass the second model. I would think that is just 2 standard deviations?

So to do this, would I simply solve for the standard deviation of the residuals? Basically, square the difference in scores between the two models for each of the 40 samples, sum them, divide by 39 (40 - 1), and take the square root of that number? The result would be one standard deviation. If we wanted to capture 95% in the new model, we would just increase our minimum score of 10 by 2x that standard deviation.

I don't know if any of this made sense, but if anyone can offer suggestions to get on the right track it would be much appreciated.

2. ## Re: Comparing the differences between two samples

Hi Cheeseburger.
It sounds like you have two sets of scores for each of the 40 profiles. Here is one suggestion: in Excel, make a scatter plot of the data with the 30-point scale on the horizontal axis and the 40-point scale on the vertical axis. The points will probably be reasonably straight(ish). If they are reasonably straight, put a linear trendline on the chart and include the equation on the graph. Then use the equation to find the y value that matches x = 10. This is your new cutoff point.
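
If you would rather do the trendline step outside Excel, here is a minimal Python sketch of the same idea. It assumes an ordinary least-squares line, which is what a linear trendline fits. The scores are made up purely for illustration, and `fit_line` is a hypothetical helper name. Since every sampled profile scored above 10, x = 10 sits at the edge of the data, so treat the result as a rough guide.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up paired scores for illustration only (not the real 40 profiles).
old_scores = [12, 15, 18, 20, 22, 25, 27, 29]  # 1-30 scale
new_scores = [17, 19, 24, 25, 28, 30, 33, 35]  # 1-40 scale

slope, intercept = fit_line(old_scores, new_scores)

# Read the fitted line at the old cutoff of 10 to get a candidate new cutoff.
old_cutoff = 10
new_cutoff = slope * old_cutoff + intercept

print(f"new = {slope:.3f} * old + {intercept:.3f}")
print(f"candidate cutoff on the new scale: {new_cutoff:.1f}")
```

Swap in the real paired scores for `old_scores` and `new_scores`; the printed equation should agree with the one Excel shows on the chart.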

Be wary of the normal data idea unless the data is normal. Incidentally, about 95% of a normal distribution lies within 2 sd of the mean, so only about 2.5% falls more than 2 sd below it.
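
To put numbers on the tail areas, here is a small sketch using Python's standard `statistics` module. It also computes the standard deviation of the paired differences. Note that `stdev` subtracts the mean difference before squaring, which the recipe in the first post leaves out. The last step is only an illustration under strong assumptions: the differences are roughly normal and do not depend on the old score, and the differences themselves are made up.

```python
from statistics import NormalDist, mean, stdev

std_normal = NormalDist()  # mean 0, sd 1

# Share of a normal distribution within 2 sd, and below -2 sd.
within_2sd = std_normal.cdf(2) - std_normal.cdf(-2)
below_2sd = std_normal.cdf(-2)

# z-value that leaves exactly 5% in the lower tail (one-sided).
z_5pct = std_normal.inv_cdf(0.05)

print(f"within 2 sd:         {within_2sd:.4f}")
print(f"below -2 sd:         {below_2sd:.4f}")
print(f"z for 5% lower tail: {z_5pct:.3f}")

# Made-up differences (new score - old score), for illustration only.
diffs = [5, 4, 6, 5, 6, 5, 6, 6]
d_mean = mean(diffs)
d_sd = stdev(diffs)  # sample sd: deviations from the mean, divided by n - 1

# Assumption: a profile sitting right at the old cutoff of 10 gets a new
# score of 10 + d, with d roughly normal(d_mean, d_sd). The cutoff that such
# a borderline profile clears about 95% of the time is then:
old_cutoff = 10
cutoff_95 = old_cutoff + d_mean + z_5pct * d_sd

print(f"mean difference: {d_mean:.2f}, sd of differences: {d_sd:.2f}")
print(f"cutoff a borderline profile passes ~95% of the time: {cutoff_95:.1f}")
```

Under these assumptions the variation pulls the cutoff below 10 plus the average difference, not above it, and the one-sided multiplier is about 1.645 rather than 2. If the differences do not look normal, taking the 5th percentile of the observed differences directly is a reasonable alternative.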

Cheers, kat
