# Thread: how do i approach this problem?

1. ## Re: how do i approach this problem?

I work with customers in a work placement/medical program with high variability. It reflects that different environments lead to very different results in terms of normality. The common wisdom I have seen in the social sciences is that normal data is rare. The fact that this is not the case in industry is fascinating to me. I think it reflects the reality that statistical methods are a lot easier to use in industry than in services (I know a lot of six sigma theorists disagree with that point).

Dason, the central limit theorem doesn't make the mean a good descriptor of a distribution with strong outliers. Moreover, statisticians disagree on whether the CLT works as advertised with strong outliers in the tails. One book I read a decade ago (I no longer remember the author's name) argued forcefully that, regardless of the sample size, strong outliers could invalidate ANOVA (and, logically, other GLM methods). Additionally, robust regression was developed specifically to deal with outliers. If the central limit theorem addressed extreme outliers or skew, why would you need robust regression?
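The distinction between those two claims is easy to see numerically. The CLT concerns the sampling distribution of the mean, not whether the mean summarizes a given data set well. A toy illustration with made-up numbers: one extreme value drags the sample mean far away from where the bulk of the data sit, while the median barely moves.

```python
import statistics

# Hypothetical data: a tight cluster around 10 plus one extreme outlier
data = [10, 11, 9, 10, 12, 10, 11, 500]

mean_val = statistics.mean(data)      # dragged far from the bulk of the data
median_val = statistics.median(data)  # barely affected by the outlier

print(f"mean   = {mean_val}")
print(f"median = {median_val}")
```

With these numbers the mean lands around 71 while the median stays near 10, which is the sense in which the mean can be a poor descriptor even when the CLT still applies to its sampling distribution.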

2. ## Re: how do i approach this problem?

Originally Posted by noetsi
I work with customers in a work placement/medical program with high variability. It reflects that different environments lead to very different results in terms of normality. The common wisdom I have seen in the social sciences is that normal data is rare. The fact that this is not the case in industry is fascinating to me. I think it reflects the reality that statistical methods are a lot easier to use in industry than in services (I know a lot of six sigma theorists disagree with that point).
I think the key to having normal data is the element of randomness. Manufacturing processes tend to vary randomly about a target setting. The less common exceptions have common themes such as a physical restriction (cannot be less than zero, machining to a hard stop) or an artifact of the measurement method (Geometric dimensioning that converts Cartesian measurements into polar and using absolute values instead of actual deviations).

Transactional business processes lack this element of randomness. Human behavior around deadlines impacts cycle/delivery/lead times. Inter-arrival times are not independent throughout the work day because people have patterns of behavior, and so on. Outliers can also be caused by behavior: if you are already a week late on a delivery, what's another week?

3. ## The Following User Says Thank You to Miner For This Useful Post:

noetsi (03-28-2014)

4. ## Re: how do i approach this problem?

I agree that the randomness is likely different. I think there is more order and predictability in an industrial process than in transactional processes (at least those that involve social services), and that external factors outside the process have greater impact in a service environment than in an industrial one on average. I suspect outliers are far more common, and have greater impact, when the key drivers are human behavior rather than industrial production. At least for a non-start-up process, anyhow.

A fascinating issue in any case.

5. ## Re: how do i approach this problem?

I'm pretty confused about how to do (a). Can someone guide me through it?

6. ## Re: how do i approach this problem?

Look up how to make a t-based confidence interval. It should be in your notes; otherwise there are plenty of examples online. Once you give that a try, if you're still stuck, let us know what exactly is giving you trouble.
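For reference, here is a minimal sketch of the standard t-based confidence interval computation Dason describes, using hypothetical sample numbers (not the data from the original question):

```python
import math
from scipy import stats

def t_confidence_interval(data, confidence=0.95):
    """Two-sided t-based confidence interval for the population mean."""
    n = len(data)
    mean = sum(data) / n
    # Sample standard deviation (n - 1 in the denominator)
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    # Critical value from the t distribution with n - 1 degrees of freedom
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

sample = [4.2, 5.1, 4.8, 5.5, 4.9, 5.0, 4.6, 5.3]  # hypothetical data
lo, hi = t_confidence_interval(sample)
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

The same steps work by hand with a t-table: compute the sample mean and standard deviation, look up the critical value for n - 1 degrees of freedom, and add/subtract the half-width.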

7. ## Re: how do i approach this problem?

Originally Posted by Dason
Look up how to make a t-based confidence interval. It should be in your notes; otherwise there are plenty of examples online. Once you give that a try, if you're still stuck, let us know what exactly is giving you trouble.
Is this right for question 1(a)?

8. ## Re: how do i approach this problem?

Originally Posted by noetsi
None of the training I had in six sigma, or the associated materials, mentioned that distinction, Miner, but I take your word for it.

I don't understand whether you mean a sample taken 100 times or a single sample with a hundred points. I have data sets with hundreds, if not thousands, of points that are still highly skewed because of huge outliers. They are not remotely normal.
Regarding using the Z distribution rather than the t distribution, wouldn't the non-normality be just as big a problem when using the t?

9. ## Re: how do i approach this problem?

Can anyone please check whether my work is right? I posted my work.

10. ## Re: how do i approach this problem?

Originally Posted by boundless constraint
Regarding using the Z distribution rather than the t distribution, wouldn't the non-normality be just as big a problem when using the t?
Commonly it is not treated that way in the literature. I think the correct answer is that it depends on the specific nature and magnitude of the departure from normality, although I have not seen any specific comments on this. Discussions of the t and z distributions tend to present a binary choice: use either z or t. Personally, given that you rarely know the population standard deviation, I would just use t.
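A quick numerical look at why "just use t" is the safe default: the t critical values are wider than the z value for small samples (compensating for estimating the standard deviation) and converge to z as the sample size grows, so using t rarely costs anything.

```python
from scipy import stats

# Two-sided 95% critical values: z is fixed, t widens for small samples
z_crit = stats.norm.ppf(0.975)  # roughly 1.96
for df in (5, 30, 1000):
    t_crit = stats.t.ppf(0.975, df)
    print(f"df={df:4d}  t={t_crit:.3f}  z={z_crit:.3f}")
```

For small degrees of freedom the t value is noticeably larger; by df = 1000 the two are nearly indistinguishable.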