# how do i approach this problem?

#### titanlord1

##### New Member
how do i approach this problem? thanks in advance.

#### Dason

Hi! :welcome: We are glad that you posted here! This looks like a homework question though. Our homework help policy can be found here. We mainly just want to see what you have tried so far and that you have put some effort into the problem. I would also suggest checking out this thread for some guidelines on smart posting behavior that can help you get better answers much more quickly.

#### titanlord1

##### New Member
this is what i did

first find the standard error

standard error = standard deviation divided by square root of sample size
SE = 5 / Square root of 60
SE = 5/ 7.745
SE = 0.645

next, use the z table
for the 90% confidence interval, z= 1.645

1.645 x standard error
1.645 x 0.645 = 1.061

665 – 1.061 = 663.939
665 + 1.061 = 666.061

the 90% confidence interval is 663.939 to 666.061
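
The same z-based calculation can be sketched in Python as a quick check. This is only a sketch using the figures from this post (mean 665, standard deviation 5, n = 60, 90% confidence); the function name `z_interval` is made up for illustration. It carries full precision throughout, so the last decimal can differ slightly from hand-rounded figures.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sd, n, confidence):
    """Two-sided z-based confidence interval for a mean."""
    se = sd / sqrt(n)                                # standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.645 for 90%
    margin = z * se
    return mean - margin, mean + margin

# Figures taken from the post above.
low, high = z_interval(665, 5, 60, 0.90)
print(f"90% CI: {low:.3f} to {high:.3f}")
```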

#### Dason

Your general approach is, I think, what they're looking for in part (b). Note that you're given the sample standard deviation though (not the population standard deviation). Have you learned about the t-distribution yet?

#### titanlord1

##### New Member
yea we learned about t distribution.

#### Dason

You probably want to use the t-distribution in part (a).

#### titanlord1

##### New Member
what makes you say to use t distribution?

#### Dason

You're given the *sample* standard deviation - not the population standard deviation.
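
For what it's worth, here is a minimal sketch of the t-based interval, assuming the figures quoted earlier in the thread (mean 665, sample standard deviation 5, n = 60, 90% confidence). Python's standard library has no t quantile, so the helpers `t_pdf`, `t_cdf`, and `t_quantile` are illustrative names that integrate the density numerically and bisect for the critical value. A statistics package would normally do this step for you.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Invert the CDF by bisection on a wide bracket."""
    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(mean, s, n, confidence):
    """Two-sided t-based confidence interval with n - 1 degrees of freedom."""
    se = s / sqrt(n)
    t_crit = t_quantile(0.5 + confidence / 2, n - 1)
    margin = t_crit * se
    return mean - margin, mean + margin, t_crit

# Figures taken from earlier in the thread.
low, high, t_crit = t_interval(665, 5, 60, 0.90)
print(f"t critical value (df = 59): {t_crit:.3f}")
print(f"90% CI: {low:.3f} to {high:.3f}")
```

The only change from the z-based version is the critical value: it comes from the t-distribution with n - 1 = 59 degrees of freedom, so the interval comes out slightly wider.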

#### noetsi

##### Fortran must die
An interesting aside is that while some texts suggest that you use the t distribution whenever you don't know the population standard deviation (which is almost always the case in practice, since you rarely have it), other texts do not. They suggest using the z distribution as long as the population can be assumed to be normal.

Which I always found confusing. It is common in six sigma to use the z distribution with sample data, which is likely wrong.

#### Miner

##### TS Contributor
> It is common in six sigma to use the z distribution with sample data which is likely wrong.

Six Sigma only uses the z-distribution with process capability data, which typically has a sample size of 100. At 100 samples the t-distribution should be very close to the normal distribution. For hypothesis tests, t-tests and ANOVA are used.
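
For a rough sense of scale, with 100 observations (99 degrees of freedom) the two-sided 95% critical values compare as follows:

$$
t_{0.975,\,99} \approx 1.984, \qquad z_{0.975} \approx 1.960
$$

so a t-based interval would be only about 1% wider than a z-based one at that sample size.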

#### noetsi

##### Fortran must die
None of the training I had in six sigma, or the associated materials, mentioned that distinction, Miner, but I take your word for it.

I don't understand if you mean a sample done 100 times or a single sampling with a hundred points. I have data that has hundreds if not thousands of points that is still highly skewed because of huge outliers. It is not remotely normal.

#### Dason

> I don't understand if you mean a sample done 100 times or a single sampling with a hundred points. I have data that has hundreds if not thousands of points that is still highly skewed because of huge outliers. It is not remotely normal.

The data doesn't need to be normal. CLT to the rescue.
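
A small simulation can illustrate the point. This is only a sketch: it assumes an exponential distribution as a stand-in for right-skewed data (not anyone's actual data from this thread), and the helper names `skewness` and `sample_means` are made up for illustration. It draws many samples of size n and measures how skewed the resulting sample means are.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Population skewness: the average cubed z-score."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n from a right-skewed exponential."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 10, 100)}
for n, sk in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample means = {sk:.2f}")
```

For exponential data the theoretical skewness of the sample mean is 2 divided by the square root of n, so it shrinks toward zero as n grows, though only gradually.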

#### Miner

##### TS Contributor
> None of the training I had in six sigma, or the associated materials, mentioned that distinction, Miner, but I take your word for it.
>
> I don't understand if you mean a sample done 100 times or a single sampling with a hundred points. I have data that has hundreds if not thousands of points that is still highly skewed because of huge outliers. It is not remotely normal.

I'm coming from a manufacturing and business process paradigm. In manufacturing, most processes do follow a normal distribution. Data are typically collected in small subgroups over an extended period of time (e.g., SPC), then estimates are made of the short- and long-term variation using within-subgroup variation and overall variation, after determining whether the process is in a state of statistical control. The automotive industry forced sample sizes of 100 on their supply base and that practice has spread throughout the manufacturing world. In business processes, we deal with cycle times, which are rarely normally distributed, so nonparametric methods are used instead.

#### noetsi

##### Fortran must die
I work with customers in a work placement/medical program with high variability. It shows that different environments lead to very different results in terms of normality. The common wisdom I have seen, in social sciences, is that normal data is rare. The fact that this is not the case in industry is fascinating to me. It reflects, I think, the reality that statistical methods are a lot easier to use in industry than in services (I know a lot of six sigma theorists disagree with that point).

Dason, the central limit theorem doesn't make a mean a good descriptor of a distribution with strong outliers. Moreover, statisticians disagree on whether the CLT works as advertised with strong outliers in the tail. One book I read a decade ago, I no longer remember the author's name, argued forcefully that regardless of the size of the sample, strong outliers could invalidate ANOVA (and logically other GLM methods). Additionally, robust regression was developed specifically to deal with outliers. So if the central limit theorem addresses extreme outliers or skew, why do you need robust regression?

#### Miner

##### TS Contributor
> I work with customers in a work placement/medical program with high variability. It shows that different environments lead to very different results in terms of normality. The common wisdom I have seen, in social sciences, is that normal data is rare. The fact that this is not the case in industry is fascinating to me. It reflects, I think, the reality that statistical methods are a lot easier to use in industry than in services (I know a lot of six sigma theorists disagree with that point).

I think the key to having normal data is the element of randomness. Manufacturing processes tend to vary randomly about a target setting. The less common exceptions have common themes such as a physical restriction (cannot be less than zero, machining to a hard stop) or an artifact of the measurement method (geometric dimensioning that converts Cartesian measurements into polar and uses absolute values instead of actual deviations).

Transactional business processes lack this element of randomness. Human behavior around deadlines impacts the cycle/delivery/lead times. Inter-arrival times are not independent throughout the work day because people have patterns of behavior, and so on. Outliers can also be caused by behavior. If you are already a week late on a delivery, what's another week?

#### noetsi

##### Fortran must die
I agree that randomness is likely different. I think that there is more order and predictability in an industrial process than in transactional processes (at least those that involve social services), and that external factors outside the process have greater impact in a service environment than in an industrial process on average. Outliers are, I suspect, far more common and have greater impact when the key drivers are human behavior rather than industrial production. At least for a non-startup process anyhow.

A fascinating issue in any case.

#### titanlord1

##### New Member
I'm pretty confused about how to do (a). Can someone guide me through it?