SPC - normal normal distribution?

essegn · Mar 9, 2019

Hi,

I was taught that SPC charts need a normal distribution.
However, I have always thought that this is not strictly necessary.
Then I found the following article:
https://www.qualitydigest.com/inside/lean-column/control-charts-keep-it-simple-121818.html#

What do you think? What are your thoughts?

Best Regards
Peter

Bev D · Mar 10, 2019

Control charts do not require a Normal distribution. Subgroup averages do not need to form a Normal distribution; the Central limit theorem does not apply.
Anyone who teaches this is not knowledgeable about SPC.

If you really want to understand SPC, read the works of Donald Wheeler, W. Edwards Deming, Davis Balestracci, and Walter Shewhart (the creator of SPC). You can find Donald's and Davis's works at Quality Digest for free.
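
One way to see why three-sigma limits do not depend on normality is to look at how much of a distribution falls within the mean plus or minus three standard deviations. The sketch below is only an illustration: it uses the true mean and standard deviation of three textbook distributions, not limits estimated from chart data, and the function names are made up for this example.

```python
import math
from statistics import NormalDist

def coverage_normal(k=3.0):
    """Fraction of a normal distribution within mean +/- k standard deviations."""
    nd = NormalDist()
    return nd.cdf(k) - nd.cdf(-k)

def coverage_uniform(k=3.0):
    """Same fraction for Uniform(0, 1): mean 0.5, sd 1/sqrt(12)."""
    mean, sd = 0.5, 1 / math.sqrt(12)
    lo = max(0.0, mean - k * sd)
    hi = min(1.0, mean + k * sd)
    return hi - lo

def coverage_exponential(k=3.0):
    """Same fraction for Exponential(rate=1): mean 1, sd 1, cdf 1 - exp(-x)."""
    lo = max(0.0, 1.0 - k)
    hi = 1.0 + k
    return (1 - math.exp(-hi)) - (1 - math.exp(-lo))

for name, f in [("normal", coverage_normal),
                ("uniform", coverage_uniform),
                ("exponential", coverage_exponential)]:
    print(f"{name:12s} {f():.4f}")
```

Even for the heavily skewed exponential case, roughly 98% of the values fall inside the three-sigma band, so a point outside it is still a reasonable signal to investigate.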

Golfman25 · Mar 10, 2019

Maybe there was a misunderstanding between SPC and Cpk.

essegn · Mar 11, 2019

There is no need for a normal distribution in order to calculate Cpk.
Normality is required only for the predicted (expected) ppm rate.
Is this true?

If so, what kind of misunderstanding between Cpk and SPC do you mean?

Golfman25 · Mar 11, 2019

essegn said:

There is no need for a normal distribution in order to calculate Cpk.
Normality is required only for the predicted (expected) ppm rate.
Is this true?

If so, what kind of misunderstanding between Cpk and SPC do you mean?

My understanding is that traditional Cpk formulas assume a normal distribution. So if your data isn't normally distributed, you may need a different analysis.

Bev D · Mar 12, 2019

essegn is correct. As originally implemented, Cpk did not take the shape of the distribution into account. It was a simple, rough ratio of the process spread in relation to the tolerance range. In that context, the index had some informative value. Later, pseudo-statisticians ruined the index by trying to ‘predict’ the process defect rate. THAT does require knowledge of the actual distribution AND suffers from the gap between the theoretical infinite tails and the reality that real processes do not have infinite tails.
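
To make the "simple ratio" concrete, here is a minimal sketch of the usual textbook formulas for Cp and Cpk. Nothing in it refers to a distribution shape; it only compares a spread of six (or three) sigma with the tolerance. The process numbers are invented for illustration and are not taken from this thread.

```python
def cp(lsl, usl, sigma):
    """Tolerance width divided by a six-sigma process spread."""
    return (usl - lsl) / (6 * sigma)

def cpk(mean, sigma, lsl, usl):
    """Distance from the mean to the nearer spec limit, in units of three sigma."""
    return min(usl - mean, mean - lsl) / (3 * sigma)

# Hypothetical process: mean 10.2, sigma 0.1, spec limits 9.7 to 10.5
example_cp = cp(9.7, 10.5, 0.1)
example_cpk = cpk(10.2, 0.1, 9.7, 10.5)
print(f"Cp  = {example_cp:.2f}")
print(f"Cpk = {example_cpk:.2f}")
```

Because the mean sits closer to the upper limit, Cpk comes out lower than Cp; the index only reports the worse side.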

essegn · Apr 23, 2019

As I wrote earlier:
There is no need for a normal distribution in order to calculate Cpk.
Normality is required only for the predicted (expected) ppm rate.

I am now confused by Minitab; please look at the attached picture.
It seems to me that there is no difference in Minitab between the Cpk value and the "Expected Within".
When I converted the Cpk of 1.31, I got 86.76 ppm, the same as the "Expected Within".
This means that only the "predicted" (potential) Cpk is calculated, and what I wrote above is not correct.
Am I right?

----------------------

Pp and Ppk should give information about past performance.
In my example, Minitab gives a Ppk of 1.19; after conversion that is 362.25 ppm, which is the same value as "Expected Overall".
So it is also a prediction, or potential, of the process.

The actual / past performance is given by "Observed".
Why do Pp and Ppk in Minitab not show the actual performance?

Miner · Apr 23, 2019

@essegn

Several comments and observations:

See Minitab Help. There are 5 links on the left bar that you should read to get a better explanation of what Minitab is trying to show. For more detailed information see Methods and formulas

The chart you attached is generated under Capability Analysis > Normal. Therefore, Minitab uses the normal distribution assumptions.

Your chart only shows defects on the lower spec side, so there is a direct link between the Cpk/Ppk and Expected performance. However, if your distribution showed greater spread such that you had defects on both ends, the expected performance would now reflect the defects on both sides, which Cpk/Ppk do not. They only reflect the worst side.

Expected is a theoretical performance based on the normal distribution and on the capability index. Observed performance is the actual performance regardless of any normality assumptions.
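
The link between the index and the expected performance can be sketched with the normal distribution directly. This is an illustration of the general idea, not Minitab's actual code, and it does not try to reproduce the figures in the attached picture; the means, sigmas, and spec limits are hypothetical.

```python
from statistics import NormalDist

def expected_ppm(mean, sigma, lsl, usl):
    """Expected ppm below LSL and above USL, assuming a normal distribution."""
    nd = NormalDist(mean, sigma)
    below = nd.cdf(lsl) * 1e6
    above = (1 - nd.cdf(usl)) * 1e6
    return below, above

def ppm_from_cpk(cpk_value):
    """Expected ppm beyond the nearer spec limit only (normal assumption)."""
    return NormalDist().cdf(-3 * cpk_value) * 1e6

# Case 1: only the lower limit is close (Cpk = 1.31 on the lower side)
below_1, above_1 = expected_ppm(mean=10.0, sigma=1.0, lsl=6.07, usl=20.0)

# Case 2: centered process, both limits 3.93 sigma away (same Cpk = 1.31)
below_2, above_2 = expected_ppm(mean=10.0, sigma=1.0, lsl=6.07, usl=13.93)

print(f"from Cpk alone      : {ppm_from_cpk(1.31):.1f} ppm")
print(f"case 1 total        : {below_1 + above_1:.1f} ppm")
print(f"case 2 total        : {below_2 + above_2:.1f} ppm")
```

In the first case the total expected ppm matches what the index implies, because only one tail contributes. In the second case the index is the same, but the expected total is about twice as large, since both tails contribute and Cpk only reflects one of them.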

essegn · Apr 24, 2019

Hi Miner,

thank you for your reply.

Is this summary correct?
Pp - performance - prediction - process stability required - normal distributions only - within & between subgroups - all measurements
Cp - capability - prediction - process stability required - normal distributions only - within subgroups - last x measurements

Pp - prediction from all data - within & between subgroups, NO ACTUAL PERFORMANCE
Cp - prediction from the last x measurements - within subgroups, NO ACTUAL PERFORMANCE

The actual / current / past performance is given only by "Observed performance" in Minitab.
- When a process is not stable, do not calculate any C or P index.

Miner · Apr 24, 2019

@essegn Your summary is essentially correct in terms of how Minitab handles these metrics. The only exception that I will note is "...of x last measurements..." Minitab does use all of the data that you provide, and does not restrict it to the last x measurements. You may be thinking of the capability 6 pack where Minitab displays the last 25 data points on a control chart.
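
The "within" versus "within & between" distinction in the summary can be sketched as two different sigma estimates computed from the same data. This is a rough illustration with made-up subgroups of size 5. It assumes the R-bar/d2 estimator for the within-subgroup sigma, which is one common textbook choice; Minitab offers several estimators and its default may differ.

```python
from statistics import mean, stdev

D2_N5 = 2.326  # control chart constant d2 for subgroups of size 5

# Hypothetical data: tight within each subgroup, but the subgroup means move
subgroups = [
    [9.9, 10.0, 10.1, 10.0, 10.0],
    [10.4, 10.5, 10.6, 10.5, 10.5],
    [9.4, 9.5, 9.6, 9.5, 9.5],
]

def sigma_within(groups, d2=D2_N5):
    """Within-subgroup sigma from the average range (R-bar / d2)."""
    r_bar = mean([max(g) - min(g) for g in groups])
    return r_bar / d2

def sigma_overall(groups):
    """Overall sigma: sample standard deviation of all values pooled together."""
    return stdev([x for g in groups for x in g])

lsl, usl = 8.5, 11.5
s_w = sigma_within(subgroups)
s_o = sigma_overall(subgroups)
cp_value = (usl - lsl) / (6 * s_w)   # uses within-subgroup variation only
pp_value = (usl - lsl) / (6 * s_o)   # uses within and between variation

print(f"sigma within  = {s_w:.3f}, Cp = {cp_value:.2f}")
print(f"sigma overall = {s_o:.3f}, Pp = {pp_value:.2f}")
```

Both indices use every data point. They differ only in which sources of variation the sigma estimate picks up, which is why Cp is much larger than Pp when the subgroup averages wander.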

essegn · Apr 24, 2019

Do Pp and Ppk require process stability?

Miner · Apr 25, 2019

Yes. A stable process is predictable. An unstable process is not predictable. The capability/performance indices are predictions of the future and require a stable process.

ncwalker · Apr 30, 2019

I'm going to disagree with Miner a little bit. I don't think they require a stable process, but be aware that an unstable process will have worse results. Capability studies are a prediction of what a process can do. Feed in an unstable process, and your capability study is going to predict that your process will perform far worse than it actually may. And it will (typically) magnify the effect. A small decrease in stability will result in a large decrease in capability. And your indices in the capability study won't give you a clue that the problem is stability. They will just give you a bright, red BAD. The REAL information is always in the plots.
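
A small simulation can illustrate how an index reacts to instability. This is a sketch under stated assumptions: the data are simulated, the spec limits and the size of the shift are invented, and Ppk is computed with the plain sample standard deviation of all points.

```python
import random
from statistics import mean, stdev

def ppk(data, lsl, usl):
    """Ppk from the overall mean and the overall sample standard deviation."""
    m, s = mean(data), stdev(data)
    return min(usl - m, m - lsl) / (3 * s)

rng = random.Random(1)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 0.1) for _ in range(200)]

# Stable process: constant mean of 10.0
stable = [10.0 + e for e in noise]

# Unstable process: same piece-to-piece noise, but the mean shifts
# by 0.3 (three sigma) halfway through the run
shifted = [10.0 + e + (0.3 if i >= 100 else 0.0) for i, e in enumerate(noise)]

lsl, usl = 9.5, 10.5
ppk_stable = ppk(stable, lsl, usl)
ppk_shifted = ppk(shifted, lsl, usl)

print(f"Ppk, stable process : {ppk_stable:.2f}")
print(f"Ppk, shifted process: {ppk_shifted:.2f}")
```

The index drops sharply in the shifted case because the shift both moves the average toward a limit and inflates the overall standard deviation. The number alone does not say that a shift is the cause; a run chart of the same data would show it immediately.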

Bev D · May 1, 2019

I think it's important to understand the difference between stability and homogeneity. When an out-of-control condition is detected by a control chart, it is either a 'real change' (shift, trend or drift) OR it is simply a stable non-homogenous state. Non-homogeneity is when the primary factor(s) controlling the average are not the same as the primary factor(s) controlling the standard deviation. Control charts work because of homogeneity: the within-subgroup variation is related to the population standard deviation, adjusted by c4 or d2. Then the subgroup average will vary by the formula for the standard error of the mean. All is well when the process is homogenous. (Another way of thinking of a homogenous process is one where the dominant component of variation is piece to piece; hence the rule of thumb of subgrouping by some number of sequential pieces.) Today's processes are rarely homogenous, which is why we have rational subgrouping. So when a process is stable yet non-homogenous, we utilize rational subgrouping.

If you have a non-homogenous process and you use a subgrouping scheme that is not rational (for example sequential pieces) then your control chart will appear to be unstable because of all of the out of control points, where the control chart correctly detected the non-homogeneity. HOWEVER, your process may be completely stable.
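
The effect described above can be sketched with a simulated X-bar chart. The assumptions are loud ones: each subgroup is five sequential pieces, piece-to-piece noise has sigma 0.1, and in the non-homogenous case every subgroup also gets a random batch offset drawn from the same unchanging distribution (so the process is stable over time). Limits use the textbook constant A2 for n = 5. All names and values are hypothetical.

```python
import random
from statistics import mean

A2_N5 = 0.577  # X-bar chart constant A2 for subgroups of size 5

def simulate(batch_sd, rng, n_subgroups=50, n=5, piece_sd=0.1):
    """Subgroups of sequential pieces; batch_sd adds a per-subgroup offset."""
    groups = []
    for _ in range(n_subgroups):
        offset = rng.gauss(0.0, batch_sd)
        groups.append([10.0 + offset + rng.gauss(0.0, piece_sd) for _ in range(n)])
    return groups

def points_outside_xbar_limits(groups, a2=A2_N5):
    """Count subgroup averages outside grand mean +/- A2 * R-bar."""
    xbars = [mean(g) for g in groups]
    grand = mean(xbars)
    r_bar = mean([max(g) - min(g) for g in groups])
    ucl, lcl = grand + a2 * r_bar, grand - a2 * r_bar
    return sum(1 for x in xbars if x > ucl or x < lcl)

rng = random.Random(7)  # fixed seed so the run is repeatable
out_homogenous = points_outside_xbar_limits(simulate(0.0, rng))
out_non_homogenous = points_outside_xbar_limits(simulate(0.3, rng))

print(f"homogenous process    : {out_homogenous} of 50 points outside limits")
print(f"non-homogenous process: {out_non_homogenous} of 50 points outside limits")
```

In the non-homogenous run, the limits are built from within-subgroup ranges that know nothing about the batch-to-batch component, so many subgroup averages land outside them even though nothing in the process changes over time.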

Unfortunately of course the non-homogeneity will also throw off a simple capability index calculation because the variation is not a straightforward standard deviation calculation.

Real life is much more complicated than the 'simple' formulas you find in books, standards and Google.

ncwalker · May 2, 2019

Then I shall attempt to put forth a non-homogeneity example:

Suppose I were making chocolate chip cookies and one of my outgoing parameters were "Chips melted into the cookies well"; in other words, I want the chocolate to merge with the dough well. Then:

The mean would be primarily affected by the oven temperature - a cool oven wouldn't melt them in well, a hot oven would.

But the standard deviation would be affected (maybe not primarily) by the initial temperature of the chocolate chips. Suppose that in my small dough batches I had some frozen chips and some room-temperature chips, or that chips in contact with the feed bin get warmed up and chips in the center do not. If this was not controlled, I would expect it to introduce noise in my output variable: the standard deviation.

So that would be non-homogenous: oven temp drives mean, chip load temp affects standard deviation.

Does that describe it?

And if so: would we say it's non-homogenous if I were ONLY controlling oven temp? What if I were control charting both? Do I then have two separate but interacting homogenous processes?
