Non normal data - process stability

Discussion in 'SPC - Statistical Process Control' started by essegn, May 26, 2019.

1. essegn (Member)

Joined:
Feb 5, 2016
Messages:
36
3
Trophy Points:
7
Hi,

Control charts are used to determine whether a process is in statistical control (consistent and predictable).
Stability should be checked first. After that, the distribution is checked.

Let's consider an actual process that is not stable and has a natural lower bound (the values cannot be less than 0, for instance delivery times).
Non-normal data increase the chance of false signals. I have read that the increase can be up to 4 times.

If the process is out of control, the process needs to be improved; there is no point in bothering with transformations, as they would be meaningless.

If every process with a naturally non-normal distribution has to be checked for stability first (just like processes with a normal distribution), such processes are at a big disadvantage, because the probability that they will appear unstable is up to 4 times higher.

How do you decide whether OOC points are false signals or are really there because the process is unstable?

How do you handle such cases?

2. Bev D (Moderator, Staff Member)

Joined:
Jul 30, 2015
Messages:
419
438
Trophy Points:
62
Location:
Maine
First, let’s remember that Normality and stability are NOT related. I don’t know where you got the 4 times number, but it’s wrong; there is no derivation or empirical data that supports it. Non-normal processes are not more likely to be unstable than a Normal process, nor are they more likely to appear unstable if the subgrouping and control chart selection are appropriate. Also remember that most processes are not actually Normally distributed. They may be symmetrical and roughly bell shaped, but the “Normal distribution” is a man-made construct or model that is merely convenient; it is not a physical truth.

There is no need to check the distribution after determining stability for control charts. Perhaps you are thinking about the admonition to check the distribution for capability indices? Capability indices have been conflated with defect rate predictions, which do require knowledge of the distribution to apply the ‘right’ mathematical formula, but this is mostly a useless exercise in mathematical manipulation...

Also remember that the control chart is a detector of non-homogeneity not a detector of non-normality.

When you have a non-normal distribution of the individual values, the distribution of subgroup averages will be more ‘symmetrical’ and the traditional Shewhart chart will work just fine. Even if you have individual values, the I-MR chart typically works just fine. In some cases you might need a different chart OR a different subgrouping scheme. For example, with very low defect rates I use a p’ (p prime) chart.
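The point about subgroup averages can be checked with a small simulation. The sketch below is only an illustration: the exponential distribution (a stand-in for something like delivery times), the subgroup sizes, the seed, and the function names are all assumptions, not anything from this thread. It estimates the skewness of the subgroup averages, which should shrink as the subgroup size grows.

```python
import random
import statistics


def sample_skewness(values):
    """Population skewness: mean cubed deviation divided by sigma cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def skewness_of_subgroup_averages(subgroup_size, n_subgroups=20000, seed=1):
    """Simulate exponential data (assumed example distribution) and
    return the skewness of the resulting subgroup averages."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_subgroups):
        subgroup = [rng.expovariate(1.0) for _ in range(subgroup_size)]
        averages.append(statistics.fmean(subgroup))
    return sample_skewness(averages)


if __name__ == "__main__":
    # Individual values (n=1) versus averages of 5 and 25 values.
    for n in (1, 5, 25):
        print(n, round(skewness_of_subgroup_averages(n), 2))
```

For an exponential distribution the theoretical skewness of an average of n values is 2/sqrt(n), so the printed values should fall roughly from 2 toward 0.4.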

You are correct that transformations are not valuable or necessary.

I approach all OOC events in the same way: I look for the assignable causes. Occasionally an OOC point is a false alarm, or a brief transient OOC condition that goes away quickly, and the process returns to within the control limits with the next event. This is to be expected on a few occasions. Remember that the original book on control charts was entitled “Economic Control of Quality of Manufactured Product”, not “the statistically precise analysis of process distribution”.

3. Miner (Moderator, Staff Member)

Joined:
Jul 30, 2015
Messages:
312
239
Trophy Points:
42
Location:
Greater Milwaukee USA
4. essegn (Member)

Thank you for your replies. There is still something I do not understand.
You wrote: Normality and stability are NOT related.

However, this mixes the topics of SPC and capability a little.

The Minitab site states the following:
Capability indices cannot be estimated the same way for normal and nonnormal data because their distributions are different. For example, the shapes of nonnormal distributions are most likely asymmetric, and the distribution coverage of a nonnormal distribution cannot be represented by the number of standard deviations (a parameter unique to the normal distribution). To calculate capability indices for nonnormally distributed data, equivalent methods are necessary that are analogous to the normal case.

In Minitab there are, in general, two ways to calculate capability for continuous data:
• Normal
• Non-normal

This means that the program I use distinguishes between these two groups, and that knowing the distribution is important for capability.

Cp requires a normal distribution and is calculated as follows:
Cp = (USL - LSL) / (6 * σwithin)

σwithin is the estimated within-subgroup standard deviation, derived from the average range.
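As a rough sketch of the formula above, the following computes σwithin as the average subgroup range divided by the standard d2 constant, and then Cp. The function names, the measurements, and the specification limits are invented for illustration; this is not a description of how Minitab implements it.

```python
import statistics

# d2 bias-correction constants for subgroup sizes 2 to 5 (standard SPC tables).
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}


def sigma_within(subgroups):
    """Estimate the within-subgroup standard deviation as R-bar / d2."""
    size = len(subgroups[0])
    if any(len(s) != size for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    ranges = [max(s) - min(s) for s in subgroups]
    return statistics.fmean(ranges) / D2[size]


def cp(subgroups, lsl, usl):
    """Cp = (USL - LSL) / (6 * sigma_within)."""
    return (usl - lsl) / (6 * sigma_within(subgroups))


if __name__ == "__main__":
    # Hypothetical measurements in three subgroups of three.
    data = [[10.1, 9.9, 10.0], [10.2, 10.0, 9.8], [9.9, 10.1, 10.2]]
    print(round(cp(data, lsl=9.0, usl=11.0), 3))
```

Nothing in this arithmetic uses the shape of the distribution; it only compares the specification width with six estimated standard deviations.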

Control limits for I-MR charts are calculated with σwithin as well.
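For reference, a minimal sketch of the usual I-MR limit calculation, using the standard constants for moving ranges of two points (d2 = 1.128, D4 = 3.267). The function name and the delivery-time values are made up for the example.

```python
import statistics

D2_N2 = 1.128  # d2 constant for ranges of two consecutive points
D4_N2 = 3.267  # upper-limit factor for the moving-range chart


def imr_limits(values):
    """Return center line and control limits for an I-MR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    center = statistics.fmean(values)
    sigma_within = mr_bar / D2_N2
    return {
        "center": center,
        "lcl": center - 3 * sigma_within,
        "ucl": center + 3 * sigma_within,
        "mr_bar": mr_bar,
        "mr_ucl": D4_N2 * mr_bar,
    }


if __name__ == "__main__":
    delivery_days = [3.0, 5.0, 4.0, 6.0, 2.0]  # hypothetical data
    limits = imr_limits(delivery_days)
    flagged = [v for v in delivery_days
               if not limits["lcl"] <= v <= limits["ucl"]]
    print(limits, flagged)
```

For a measure bounded at zero, such as delivery time, the computed lower limit can come out negative. Such a limit can never be breached, so only the upper limit is informative.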

---

Conclusion:
Capability is calculated only for normally distributed data (see Minitab). For non-normal distributions, the data could be transformed and the capability then calculated.
Capability is calculated with σwithin, the estimated standard deviation.
I-MR charts also use σwithin to calculate their control limits.
This would mean that I-MR charts also require a normal distribution.

Is that correct?

5. Bev D (Moderator, Staff Member)

No, it isn’t correct. I-MR charts do not require the distribution to be Normal, nor do most control charts. Control charts and capability indices are not related.
The reason the traditional capability indices (Cpk, Ppk, etc.) are said to require Normality is that they conflate defect rate with process spread. Current practitioners try to ‘calculate the potential defect rate’ by translating the Cpk index into the % of parts that might exceed the specification limits, using the Z statistic, which is based on the Normal distribution.
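To make the translation described above concrete, here is a sketch of the Z-based calculation: the nearer specification limit sits 3 * Cpk standard deviations from the mean, and the Normal model gives the tail area beyond it. The function name is hypothetical, and the result is only as good as the Normal assumption, which is exactly the weak point under discussion.

```python
from statistics import NormalDist


def normal_model_ppm_beyond_nearest_limit(cpk):
    """Predicted ppm beyond the nearer spec limit, ASSUMING a Normal model.

    The nearer limit lies z = 3 * Cpk standard deviations from the mean,
    so the predicted tail fraction is the standard Normal CDF at -z.
    """
    z = 3 * cpk
    return NormalDist().cdf(-z) * 1_000_000


if __name__ == "__main__":
    for cpk in (1.0, 1.33, 1.67):
        print(cpk, normal_model_ppm_beyond_nearest_limit(cpk))
```

The Cpk value itself is just a ratio; only this last step, turning it into a predicted defect rate, depends on the assumed distribution.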

The original intent of the capability index was simply to describe the variation vs. the specification limits. Absolutely NO inferences were to be made regarding the defect rate. Subsequent hacks decided to add the defect rate calculation. The index itself does not require any specific distribution. (See “Reducing Variability - A New Approach to Quality” by L.P. Sullivan, Quality Progress, July 1984. Available on the ASQ website)

Also, MINITAB is incorrect if it states that the standard deviation cannot represent a non-normal distribution or that the standard deviation is unique to the Normal distribution. This is patently and completely untrue. The standard deviation is representative of any distribution. It is a standard statistical estimate of the parameter. What is unique to the Normal distribution is the ‘coverage’ assigned to various numbers of standard deviations. For example, under the Normal model 95% of the distribution is contained in the interval mean +/- 1.96*SD. This is not a law of physics but a property of a man-made ideal model...
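The coverage point can be illustrated with exact calculations for a few textbook distributions. In the sketch below, the choice of a uniform(0, 1) and an exponential (rate 1) distribution as comparison cases, and the function names, are assumptions made for the example. The standard deviation is well defined for all three; what changes is the fraction of the distribution inside mean +/- k*SD.

```python
import math
from statistics import NormalDist


def coverage_normal(k):
    """Fraction of a Normal distribution within mean +/- k*SD."""
    nd = NormalDist()
    return nd.cdf(k) - nd.cdf(-k)


def coverage_uniform(k):
    """Same fraction for uniform(0, 1): mean 0.5, SD 1/sqrt(12)."""
    half_width = k / math.sqrt(12)
    return min(1.0, 2 * half_width)


def coverage_exponential(k):
    """Same fraction for exponential with rate 1: mean 1, SD 1."""
    lower = max(0.0, 1 - k)  # the distribution has no mass below zero
    upper = 1 + k
    return math.exp(-lower) - math.exp(-upper)


def chebyshev_lower_bound(k):
    """Guaranteed minimum coverage for ANY distribution with finite variance."""
    return max(0.0, 1 - 1 / k ** 2)


if __name__ == "__main__":
    for k in (1.0, 1.96, 3.0):
        print(k, coverage_normal(k), coverage_uniform(k),
              coverage_exponential(k), chebyshev_lower_bound(k))
```

At k = 1 the three coverages differ clearly (about 68%, 58%, and 86%), while at k = 3 they are all above 98%.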

6. essegn (Member)


Is this what you are saying?
Here is a summary of the cases in which the capability indices could be calculated.

Index                  Process in statistical control required?   Normal distribution required?
Cp, Cpk                no                                         no
Pp, Ppk                no                                         no
Expected Overall ppm   yes                                        yes
Expected Within ppm    yes                                        yes
Observed ppm           no                                         no

The indices Cp, Cpk, Pp, and Ppk do not necessarily require a stable process, but an unstable process will give worse results.

Do you use data transformation only when the process is stable (in statistical control) and you would like to have the Expected Overall ppm and the Expected Within ppm calculated?

7. essegn (Member)

I tried to make a table, but it was merged together. Hopefully you will understand it.

8. Bev D (Moderator, Staff Member)

Statistical Control (Stability) is required before calculating any process capability index. Cp, Cpk, Pp, Ppk all require stability before calculating them.

In order to use the Normal distribution formula for calculating the ‘potential’ defect rate, your process must be in Statistical Control AND have a Normal distribution.
MINITAB and other software can determine what distribution you likely have and calculate the ‘potential’ defect rate using the appropriate formulas without transformation. However, these processes must also be in statistical control before calculating the defect rate.

Mathematically, of course, you can transform the data (AFTER you are in statistical control).

I never transform the data and I never try to calculate a potential defect rate. The issue with the potential defect rate is that no real process distribution has infinite tails, and so these calculations often overstate the potential defect rate. I simply calculate the ACTUAL defect rate. That is much more useful and believable.
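Calculating the actual defect rate needs no distribution at all; it is a count. A minimal sketch follows, with a hypothetical function name, made-up delivery times, and an assumed upper specification limit of 10 days.

```python
def observed_ppm(values, lsl=None, usl=None):
    """Observed defects per million: count of values outside the limits.

    Either limit may be omitted for a one-sided specification.
    """
    if not values:
        raise ValueError("no data")
    defects = sum(
        1 for v in values
        if (lsl is not None and v < lsl) or (usl is not None and v > usl)
    )
    return defects * 1_000_000 / len(values)


if __name__ == "__main__":
    delivery_days = [3, 12, 5, 7, 11, 4, 6, 8, 9, 2]  # hypothetical data
    print(observed_ppm(delivery_days, usl=10))
```

With only a handful of observations the resolution of such a count is coarse, so in practice it would be computed over a long enough run of data to be meaningful.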

Of course, I also never calculate any process capability indices, as they too are relatively useless. I simply plot the process results in time sequence, use control charts and multi-vari charts, and continually improve my processes...