# SPC - normal distribution?

Discussion in 'SPC - Statistical Process Control' started by essegn, Mar 9, 2019.

1. ### essegn (Member)
2. ### Bev D (Moderator, Staff Member)
Control charts do not require a Normal distribution. Subgroup averages do not need to form a Normal distribution; the Central limit theorem does not apply.
Anyone who teaches this is not knowledgeable about SPC.

If you really want to understand SPC, read the works of Donald Wheeler, W. Edwards Deming, Davis Balestracci, and Walter Shewhart (the creator of SPC). You can find Donald's and Davis's works at Quality Digest for free.

3. ### Golfman25 (Well-Known Member)
Maybe there was a misunderstanding between SPC and Cpk.

4. ### essegn (Member)
There is no need for a normal distribution in order to calculate a Cpk; normality is required only for the predicted (expected) ppm rate.
Is this true?

If so, what kind of misunderstanding between Cpk and SPC do you mean?

5. ### Golfman25 (Well-Known Member)
My understanding is that traditional Cpk formulas assume a normal distribution. So if your data isn't normally distributed, you may need a different analysis.

6. ### Bev D (Moderator, Staff Member)
essegn is correct. As originally implemented, Cpk did not take the shape of the distribution into account. It was a simple, rough ratio of the process spread to the tolerance range, and in that context the index had some informative value. Later, pseudo-statisticians ruined the index by trying to 'predict' the process defect rate. THAT does require knowledge of the actual distribution, AND it suffers from the gap between theoretical infinite tails and the reality that real processes do not have infinite tails.
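The "simple, rough ratio" reading of Cp/Cpk described above can be sketched in a few lines of Python (a minimal illustration; the data and spec limits below are invented):

```python
# Minimal sketch of the classic capability ratios for a two-sided tolerance.
import statistics

def cp_cpk(data, lsl, usl):
    """Cp compares the tolerance range to the 6-sigma process spread;
    Cpk additionally penalizes an off-center mean (worst side only)."""
    mean = statistics.fmean(data)
    sigma = statistics.stdev(data)  # sample standard deviation
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)
    return cp, cpk

# Example: a roughly centered process
data = [9.8, 10.1, 9.9, 10.2, 10.0, 9.95, 10.05, 10.1, 9.9, 10.0]
cp, cpk = cp_cpk(data, lsl=9.0, usl=11.0)
```

Because the example data are centered between the limits, Cp and Cpk coincide; an off-center mean lowers Cpk but not Cp. Note that nothing in this ratio assumes or checks a distribution shape.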

7. ### essegn (Member)
As I wrote earlier: there is no need for a normal distribution in order to calculate a Cpk; normality is required only for the predicted (expected) ppm rate.

I am confused now because of Minitab - please look at the attached picture. It seems to me that there is no difference in Minitab between the Cpk value and the "Expected Within" performance: when I converted the Cpk of 1.31, I got 86.76 ppm - the same as the "Expected Within" value. This would mean that only a "predictive" (potential) Cpk is calculated, and what I wrote above is not correct. Am I right?

----------------------

Pp and Ppk should give information about past performance. In my example Minitab gives a Ppk of 1.19 - after conversion, 362.25 ppm, which is the same value as "Expected Overall". So it is also a prediction, or potential, of the process.

The actual / past performance is given by "Observed". Why do Pp and Ppk in Minitab not show the actual performance?
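For reference, the Cpk-to-ppm conversion discussed here - valid only under a normality assumption, and only for the worse specification side - can be sketched as follows (a hypothetical helper, not Minitab's internal code):

```python
# Sketch of the usual Cpk -> expected-ppm conversion under a normality
# assumption. Only the worse spec side is counted, because Cpk only
# reflects the worse side.
from statistics import NormalDist

def cpk_to_ppm(cpk):
    """Expected defect rate in ppm on the worse spec side, assuming a
    normal distribution: the tail area beyond z = 3 * Cpk."""
    tail = 1.0 - NormalDist().cdf(3.0 * cpk)
    return tail * 1_000_000

# e.g. Cpk = 1.33 predicts roughly 33 ppm on the worse side
```

This conversion says nothing about a non-normal process, and it cannot reproduce a two-tailed "Expected" figure when both spec limits are in play.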

(Attachment: Minitab capability analysis chart, 51 KB)
8. ### Miner (Moderator, Staff Member)
@essegn

Several comments and observations:
• See Minitab Help. There are 5 links on the left bar that you should read for a better explanation of what Minitab is trying to show. For more detailed information, see Methods and formulas.
• The chart you attached is generated under Capability Analysis > Normal. Therefore, Minitab uses the normal distribution assumptions.
• Your chart only shows defects on the lower spec side, so there is a direct link between the Cpk/Ppk and Expected performance. However, if your distribution showed greater spread such that you had defects on both ends, the expected performance would now reflect the defects on both sides, which Cpk/Ppk do not. They only reflect the worst side.
• Expected is a theoretical performance based on the normal distribution and on the capability index. Observed performance is the actual performance regardless of any normality assumptions.
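The distinction in the bullets above - Cpk/Ppk reflect only the worse side, while "Expected" performance sums defects on both sides - can be sketched as a rough model under the same normality assumption (the function name and numbers are illustrative, not Minitab's internals):

```python
# Sketch: "expected" performance counts BOTH tails of a fitted normal
# distribution, unlike Cpk/Ppk, which only reflect the worse side.
from statistics import NormalDist

def expected_ppm_total(mean, sigma, lsl, usl):
    """Expected total ppm outside both spec limits under a normal model."""
    nd = NormalDist(mean, sigma)
    below = nd.cdf(lsl)          # tail area below the lower spec limit
    above = 1.0 - nd.cdf(usl)    # tail area above the upper spec limit
    return (below + above) * 1_000_000
```

When virtually all defects fall on one side - as in the attached chart - the two-tailed total is dominated by that tail and tracks the Cpk conversion; with enough spread to touch both limits, the total exceeds what Cpk alone implies.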

9. ### essegn (Member)
Hi Miner,

Is this summary correct?

Pp - Performance - a prediction - process stability required - normal distributions only - within & between subgroups - all measurements
Cp - Capability - a prediction - process stability required - normal distributions only - within subgroups - the last x measurements

Pp - a prediction from all data (within & between subgroups) - NOT actual performance
Cp - a prediction from the last x measurements (within subgroups) - NOT actual performance

The actual / current / past performance is given only by "Observed performance" in Minitab.
- When a process is not stable, do not calculate any C or P index.

10. ### Miner (Moderator, Staff Member)
@essegn Your summary is essentially correct in terms of how Minitab handles these metrics. The only exception I will note is "...of x last measurements...": Minitab uses all of the data that you provide and does not restrict itself to the last x measurements. You may be thinking of the Capability Sixpack, where Minitab displays the last 25 data points on a control chart.

11. ### essegn (Member)
Do Pp and Ppk require process stability?

12. ### Miner (Moderator, Staff Member)
Yes. A stable process is predictable. An unstable process is not predictable. The capability/performance indices are predictions of the future and require a stable process.

13. ### ncwalker (Well-Known Member)
I'm going to disagree with Miner a little bit. I don't think they require a stable process, but be aware that an unstable process will give worse results. Capability studies are a prediction of what a process can do. Feed in an unstable process, and your capability study will predict that your process performs far worse than it actually may. And it will (typically) magnify the effect: a small decrease in stability results in a large decrease in capability. Moreover, the indices in the capability study won't give you a clue that the problem is stability - they will just give you a bright, red BAD. The REAL information is always in the plots.
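This magnifying effect - instability inflating the estimated spread and dragging the index down without flagging stability as the cause - can be illustrated with a toy simulation (all numbers invented):

```python
# Toy illustration: a mid-study mean shift (instability) inflates the
# overall standard deviation and sharply lowers Cpk, but the index alone
# gives no hint that stability is the problem.
import random
import statistics

def cpk(data, lsl, usl):
    m, s = statistics.fmean(data), statistics.stdev(data)
    return min(usl - m, m - lsl) / (3 * s)

random.seed(1)
stable = [random.gauss(10.0, 0.1) for _ in range(200)]
# Same process, but the mean shifts by +0.3 halfway through the study:
shifted = stable[:100] + [x + 0.3 for x in stable[100:]]

# cpk(shifted, 9, 11) comes out far lower than cpk(stable, 9, 11),
# even though the piece-to-piece variation is unchanged.
```

A time-ordered plot of `shifted` makes the shift obvious at a glance, which is exactly the "the REAL information is always in the plots" point.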

14. ### Bev D (Moderator, Staff Member)
I think it's important to understand the difference between stability and homogeneity. When an out-of-control condition is detected by a control chart, it is either a 'real change' (shift, trend, or drift) OR simply a stable non-homogeneous state. Non-homogeneity is when the primary factor(s) controlling the average are not the same as the primary factor(s) controlling the standard deviation.

Control charts work because of homogeneity: the within-subgroup variation is related to the population standard deviation, adjusted by c4 or d2, and the subgroup average then varies according to the formula for the standard error of the mean. All is well when the process is homogeneous. (Another way of thinking of a homogeneous process is one where the dominant component of variation is piece to piece; hence the rule of thumb of subgrouping by some number of sequential pieces.) Today's processes are rarely homogeneous, which is why we have rational subgrouping: when a process is stable yet non-homogeneous, we use rational subgrouping.

If you have a non-homogeneous process and you use a subgrouping scheme that is not rational (for example, sequential pieces), then your control chart will appear to be unstable because of all of the out-of-control points, where the chart has correctly detected the non-homogeneity. HOWEVER, your process may be completely stable.

Unfortunately, of course, the non-homogeneity will also throw off a simple capability index calculation, because the variation is not a straightforward standard deviation calculation.
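The within-versus-overall split described here can be sketched numerically: estimate the "within" sigma from the average subgroup range (Rbar / d2, with d2 = 2.326 for subgroups of five) and compare it to the overall standard deviation when a between-subgroup component is present (a toy example with invented numbers):

```python
# Sketch: "within" sigma from the average range (Rbar / d2) versus the
# overall sigma, when a between-subgroup component is present.
import random
import statistics

random.seed(2)
D2_N5 = 2.326  # standard d2 constant for subgroups of size 5

# 40 subgroups of 5; each subgroup has its own mean (e.g. batch-to-batch
# differences), on top of piece-to-piece noise of sigma = 0.1.
subgroups = []
for _ in range(40):
    batch_mean = random.gauss(10.0, 0.3)   # between-subgroup variation
    subgroups.append([random.gauss(batch_mean, 0.1) for _ in range(5)])

rbar = statistics.fmean(max(s) - min(s) for s in subgroups)
sigma_within = rbar / D2_N5            # estimates piece-to-piece only (~0.1)
sigma_overall = statistics.stdev([x for s in subgroups for x in s])
# sigma_overall also absorbs the batch component, so it is much larger.
```

An index computed from `sigma_within` and one computed from `sigma_overall` give very different answers for the same data - which is the point about the variation not being a straightforward standard deviation calculation.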

Real life is much more complicated than the 'simple' formulas you find in books, standards, and Google.

15. ### ncwalker (Well-Known Member)
Then I shall attempt to put forth a non-homogeneity example:

If I were making chocolate chip cookies and one of my outgoing parameters were "chips melted into the cookies well" - in other words, I want the chocolate to merge with the dough well - then:

The mean would be primarily affected by the oven temperature: a cool oven wouldn't melt them in well, a hot oven would.

But the standard deviation would be affected (maybe not primarily) by the initial temperature of the chocolate chips - if in my small dough batches I had some frozen chips and some room-temperature chips. Or perhaps chips in contact with the feed bin get warmed up while chips in the center do not; if that were not controlled, I would expect it to introduce noise in my output variable - the standard deviation.

So that would be non-homogeneous: oven temperature drives the mean, chip load temperature affects the standard deviation.

Does that describe it?

And if so: would we say it's non-homogeneous if I were ONLY controlling oven temperature? What if I were control charting both? Would I then have two separate but interacting homogeneous processes?