# Calculating Ppk and Cpk. Am I doing this right?

#### CABAL

##### New Member
Hi all.

I am trying to make sure that I am doing this the right way. I hope you guys will look through my calculations.

First, I have a list of 42 measurements. They are presented lower on the page. LSL = 4.070, USL = 4.090. Now, I want to calculate Ppk from the data. I take the mean and SD of the complete data (42):

Ppk_upper = (USL-Xbar)/(3*sd)
Ppk_upper = (4.090-4.079476)/(3*0.002640095)
Ppk_upper = 1.32879

Ppk_lower = (Xbar-LSL)/(3*sd)
Ppk_lower = (4.079476-4.070)/(3*0.002640095)
Ppk_lower = 1.19646

Ppk = 1.19646, as this is the lowest
-------------------------------------------------

Cpk_upper = (USL-Xbar)/(3*sigma)
Cpk_upper = (4.090-4.079476)/(3*0.001224)
Cpk_upper = 2.86601

Cpk_lower = (Xbar-LSL)/(3*sigma)
Cpk_lower = (4.079476-4.070)/(3*0.001224)
Cpk_lower = 2.58061

Cpk = 2.58061, as this is the lowest

For Cpk I group the values below into pairs of two, as this is the subgroup size. I take the mean of each pair, and then the mean of those means.

I find sigma for Cpk from the mean range: I subtract the two measurements in each pair from each other, then take the mean of those ranges.

Sigma = mean of ranges/D2
Sigma = 0.001380952/1.128
Sigma = 0.001224

4.078
4.079
4.076
4.076
4.081
4.079
4.082
4.084
4.080
4.080
4.081
4.084
4.077
4.079
4.078
4.080
4.079
4.079
4.077
4.078
4.080
4.081
4.078
4.078
4.077
4.078
4.078
4.077
4.082
4.080
4.075
4.080
4.081
4.080
4.083
4.082
4.087
4.085
4.077
4.074
4.079
4.079
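Putting the whole calculation above into code (a minimal sketch using only Python's standard library; note that `statistics.stdev` uses the n-1 divisor, and the subgroup-20 range comes out as 0.003 here, so Ppk and Cpk will differ slightly from the hand results in this post):

```python
import statistics

# The 42 measurements from this post, in order (subgroup size = 2).
data = [
    4.078, 4.079, 4.076, 4.076, 4.081, 4.079, 4.082, 4.084, 4.080, 4.080,
    4.081, 4.084, 4.077, 4.079, 4.078, 4.080, 4.079, 4.079, 4.077, 4.078,
    4.080, 4.081, 4.078, 4.078, 4.077, 4.078, 4.078, 4.077, 4.082, 4.080,
    4.075, 4.080, 4.081, 4.080, 4.083, 4.082, 4.087, 4.085, 4.077, 4.074,
    4.079, 4.079,
]
LSL, USL = 4.070, 4.090
D2 = 1.128  # d2 constant for subgroups of size 2

xbar = statistics.mean(data)

# Ppk: overall (long-term) standard deviation of all 42 points.
sd_overall = statistics.stdev(data)  # n-1 divisor
ppk = min(USL - xbar, xbar - LSL) / (3 * sd_overall)

# Cpk: within-subgroup sigma estimated as Rbar / d2.
subgroups = [data[i:i + 2] for i in range(0, len(data), 2)]
rbar = statistics.mean(max(g) - min(g) for g in subgroups)
sd_within = rbar / D2
cpk = min(USL - xbar, xbar - LSL) / (3 * sd_within)

print(f"Xbar = {xbar:.6f}, Rbar = {rbar:.6f}")
print(f"Ppk = {ppk:.3f}, Cpk = {cpk:.3f}")  # roughly 1.18 and 2.49
```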

#### Attachments

(image attachment, 803.5 KB)

#### hlsmith

##### Less is more. Stay pure. Stay poor.
A lot of acronyms and no descriptions. I am thinking Cpk is an engineering stats term, @Miner.

#### Miner

##### TS Contributor
Cabal,

Welcome to Talkstats. I am afraid that you may have been waiting a long time for a response. I am probably the only person that has a clue what you are asking. I can tell that you are using the correct formulas for Ppk and Cpk, but cannot confirm that you are using the correct standard deviations for each without the entire data set. You mentioned that you were using 42 data points, but there are only 32 shown, so I cannot duplicate your results. Provide the entire data set and I will confirm your calculations. I also noted that the subgroups provided were not in a state of control, so Cpk would not be valid.

For future reference, if you have quality specific questions, I recommend the following discussion forums:
Disclaimer: I am a moderator on both forums.


#### Miner

##### TS Contributor
> A lot of acronyms and no descriptions. I am thinking Cpk is an engineering stats term, @Miner.
These are process capability indices. Essentially a ratio of the available tolerance divided by the process variation. One index looks at the short-term within-subgroup variation, the other looks at the overall long-term between-subgroup variation.
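Written out in standard notation (the hats mark estimates, and the "k" indices take the worse of the two one-sided ratios, as in the first post):

```latex
C_p = \frac{USL - LSL}{6\hat\sigma_{\text{within}}}, \qquad
P_p = \frac{USL - LSL}{6\hat\sigma_{\text{overall}}}, \qquad
\hat\sigma_{\text{within}} = \frac{\bar{R}}{d_2}

C_{pk} = \min\!\left(\frac{USL - \bar{\bar{X}}}{3\hat\sigma_{\text{within}}},\;
\frac{\bar{\bar{X}} - LSL}{3\hat\sigma_{\text{within}}}\right)
```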


#### CABAL

##### New Member
Miner,

Thanks for your reply - I will check out the other forums, and thanks for agreeing to check my calculations. I have edited the first post; there should now be 42 measurements. I also attached a picture to visualize the groupings (samples of 2) and the ranges. I am very much looking forward to reading your findings!

hlsmith, yes - a lot of fancy acronyms for an old microbiologist such as myself.

Thanks,
Nicholas

#### Miner

##### TS Contributor
I found one error in your calculation. The correct range for subgroup 20 is 0.003. This makes the correct Rbar = 0.001429.

I am including the results as calculated using Minitab for your comparison.

The Xbar/R control chart is showing indications of over-dispersion in the Xbar chart. This may be an indication that you are not using rational subgroups. If you can provide more information about this process, I may be able to determine the cause.
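A quick sanity check on the corrected Rbar, without redoing all 21 ranges (a sketch; the 0.002 below is the subgroup-20 range implied by the original Rbar of 0.001380952, not a value stated in the thread):

```python
n_subgroups = 21
rbar_old = 0.001380952               # Rbar from the first post
r20_wrong = 0.002                    # inferred: range used in error for subgroup 20
r20_right = round(4.077 - 4.074, 3)  # actual subgroup-20 range = 0.003

# Swap the wrong range for the right one in the average.
rbar = rbar_old + (r20_right - r20_wrong) / n_subgroups
print(round(rbar, 6))  # 0.001429, matching the correction above
```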

#### CABAL

##### New Member
@Miner Thank you very much for this! I hope it is OK for me to ask some questions - I am trying to understand, not just rely on automation, which is where I believe my company goes wrong. This is not my area; I just got hooked on trying to figure it out.

My target value is 4.080. Regarding Minitab, where do you tell it to use the sample groups? It says 42 samples in all on your graph, but I cannot see where it is stated that it should use them as 21 groups of 2 samples each.

Is the only difference between calculating Ppk and Cpk the SD/sigma? For a day I was under the impression that Ppk used the mean of the entire dataset (not sample groups), and Cpk used the mean of the means of the samples. I now believe that both Ppk and Cpk utilize the mean of the means of the samples. Am I correct?

Thanks,
Nicholas

#### Miner

##### TS Contributor
1. I entered a subgroup size of 2 in the screens below.
2. That is correct. Both use XdoubleBar as the mean. The difference is in the standard deviations. Cpk uses the within subgroup variation, StDev (Within), which is estimated using Rbar. Ppk uses StDev (Overall). The formulas used by Minitab may be found here.

#### noetsi

##### Fortran must die
This reminds me of the wonderful years of Sick Sigma. My dissertation was on the implementation of TQM in government (not that this actually went anywhere in the public sector).

Good people like Miner actually use it. Too bad it is not used much in government.

#### Miner

##### TS Contributor
Use it and teach it. I just finished teaching a green belt class yesterday, and will be teaching another green belt class, plus a black belt class next month. Been doing this for my company for over 12 years.

I don't know if they still practice it, but the US Army was big into Six Sigma, particularly in Logistics.

#### noetsi

##### Fortran must die
They were in the nineties. Probably still are in logistics, but I would guess not elsewhere.

I took a black belt class but never got a license associated with it. We gave up on TQM because of political realities. There is not a lot of interest anymore in formal improvements. Sadly no one with influence pushes change and we get the same amount regardless of how well we do. Without that there are not going to be difficult efforts to improve in service industries. At least not very often.

#### CABAL

##### New Member
Hi again

I was wondering about the UCL for the range chart - what is the reasoning behind that level? If I understand correctly, it is a calculated level, yes? Or should it be specified?

Thanks!


#### Miner

##### TS Contributor
The upper control limit (UCLR) is a calculated value based on Rbar. This is a link to the formula.

The UCLR is the upper limit for expected within-subgroup variation. Variation above this limit indicates a change in within-subgroup process variation. There is typically no LCLR unless the subgroup size is large.

The UCLXbar and LCLXbar likewise set limits for the expected between-subgroup variation.

It is important to note that while these control limits were set at 3 standard deviations, they were established as an economic tradeoff between the expense of missing a true process change vs. the expense of chasing a false alarm. 95% limits (2 StDev) were initially considered, but were rejected because the cost of chasing false alarms was too high.
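For reference, these limits can be recomputed from the standard Shewhart constants for subgroup size 2 (A2 = 1.880, D3 = 0, D4 = 3.267). This is a sketch using the grand mean and corrected Rbar quoted earlier in the thread, not Minitab's exact output:

```python
# Shewhart control-chart constants for subgroup size n = 2.
A2, D3, D4 = 1.880, 0.0, 3.267

xdbar = 4.079476  # grand mean (XdoubleBar) from the thread
rbar = 0.001429   # corrected mean range from Miner's post

ucl_r = D4 * rbar          # range-chart upper control limit
lcl_r = D3 * rbar          # 0 for n = 2, i.e. effectively no LCLR
ucl_x = xdbar + A2 * rbar  # Xbar-chart upper control limit
lcl_x = xdbar - A2 * rbar  # Xbar-chart lower control limit

print(f"UCL_R = {ucl_r:.6f}")
print(f"Xbar limits: {lcl_x:.6f} to {ucl_x:.6f}")
```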

#### Miner

##### TS Contributor
> That is really interesting, Miner.
Industrial statistics is all about reducing costs. Variation causes rejects, and rejects cost money. Therefore, reduce variation. Control charts separate the unusual variation (worth expending resources to reduce) from background variation. If the background variation is still too great, that is where DOE and Six Sigma projects come into play.

#### noetsi

##### Fortran must die
Many years ago I did a lot of research on TQM (it is what my dissertation was on). I always thought a higher quality product was what industrial statistics was about.

#### Miner

##### TS Contributor
You are right that it should be about quality, and manufacturers are slowly coming around to realizing that quality and cost go together: trying to inspect quality in increases costs, but designing quality in and taking variation out of the process reduces costs.

#### noetsi

##### Fortran must die
I think the logic the US business community uses is to create new markets and exploit them rather than to create high-quality products. My own view - and I am certainly not an engineer - is that US culture is generally not well suited to pursuing quality, or for that matter industrial engineering. This was particularly obvious in the TQM phenomenon.