
CpK data for averaged results and single values different on same data

Discussion in 'SPC - Statistical Process Control' started by Alan Charles, Jun 25, 2019.

  1. Alan Charles

    Alan Charles New Member

    Hi All,
    The calculation in the title gives me two different results on the same data. To explain: if I have, say, 1000 data points and run a Cpk calculation on them, I get, say, 1.6. But these 1000 points are actually made up of subsets of 10 data points each, so there are 100 means. If I run Cpk on these 100 means, my result is 1.3. Can someone please explain how the calculation and the averaging affect my results, given that it is the same data, just treated differently?
    thanks
     
  2. Miner

    Miner Moderator Staff Member

    It is difficult to say without the raw data, since Cpk uses both the mean and the standard deviation. The mean should not differ much whether you average or not. However, when you average data, the standard deviation of the averages is the standard deviation of the individual measurements divided by SQRT(n), or in your case by SQRT(10). That should increase the Cpk, not decrease it, so without the data it is difficult to assess the actual cause. What do the measurements look like over time? Are they stable and in control? If not, the results will be unpredictable. Do you have mixtures of different process streams?
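To illustrate the SQRT(n) effect described above, here is a minimal simulation sketch. Everything in it is assumed for illustration and none of it comes from the original poster's data: a stable, normally distributed process with mean 50 and sigma 1, spec limits of 45 and 55, and a hypothetical `cpk` helper that uses the overall sample standard deviation.

```python
import numpy as np

def cpk(values, lsl, usl):
    """Cpk from the overall sample standard deviation (ddof=1)."""
    mean = np.mean(values)
    sd = np.std(values, ddof=1)
    return min(usl - mean, mean - lsl) / (3 * sd)

rng = np.random.default_rng(0)

# Hypothetical stable process: 100 subgroups of 10 readings each.
data = rng.normal(loc=50.0, scale=1.0, size=(100, 10))
LSL, USL = 45.0, 55.0  # assumed specification limits

cpk_individuals = cpk(data.ravel(), LSL, USL)   # all 1000 points
cpk_means = cpk(data.mean(axis=1), LSL, USL)    # the 100 subgroup means

# With equal subgroup sizes the two means are identical, so the ratio
# of the two Cpk values is just the ratio of the standard deviations.
ratio = cpk_means / cpk_individuals
print(f"Cpk individuals: {cpk_individuals:.2f}")
print(f"Cpk of means:    {cpk_means:.2f}")
print(f"ratio: {ratio:.2f}  (sqrt(10) = {np.sqrt(10):.2f})")
```

For a stable process the ratio should land near SQRT(10), i.e. the Cpk of the means comes out roughly three times larger than the Cpk of the individuals, not smaller.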
     
  3. Bev D

    Bev D Moderator Staff Member

    What Miner said.

    You should never "Cpk" subgroup means. Process capability indices are based on the spread of the individual points, all 1000 of them in your case.

    The result you describe is counterintuitive, as Miner said, unless your process is not homogeneous (if it has large shifts, drifts or cycling, that would increase the variation of the subgroup means beyond what would be expected from simple 'sampling error'). Hence the request for the data.
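A rough sketch of the non-homogeneous case, under assumed numbers: each subgroup gets its own random shift (sigma 1.2, hypothetical) on top of a within-subgroup sigma of 1. The code compares the observed standard deviation of the subgroup means with what simple sampling error alone would predict, i.e. the pooled within-subgroup standard deviation divided by SQRT(n).

```python
import numpy as np

rng = np.random.default_rng(1)

n_subgroups, n = 100, 10
sigma_within = 1.0    # assumed short-term (within-subgroup) spread
sigma_between = 1.2   # assumed subgroup-to-subgroup shift (hypothetical)

# Each row is one subgroup; every subgroup is shifted by its own offset.
shifts = rng.normal(0.0, sigma_between, size=(n_subgroups, 1))
noise = rng.normal(0.0, sigma_within, size=(n_subgroups, n))
data = 50.0 + shifts + noise

# Pooled within-subgroup standard deviation (equal subgroup sizes).
sd_within = np.sqrt(data.var(axis=1, ddof=1).mean())

expected_sd_means = sd_within / np.sqrt(n)         # sampling error only
observed_sd_means = data.mean(axis=1).std(ddof=1)  # what the means show
inflation = observed_sd_means / expected_sd_means

print(f"expected sd of means: {expected_sd_means:.3f}")
print(f"observed sd of means: {observed_sd_means:.3f}")
print(f"inflation factor:     {inflation:.2f}")
```

An inflation factor well above 1 is the signature of shifts, drifts or cycling between subgroups. Note that in this sketch the overall standard deviation of the 1000 individuals is still larger than that of the 100 means, so how the software estimates sigma (not stated in the thread) matters for reproducing the 1.6 versus 1.3 figures.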

    In reality the math is nowhere near as insightful as the time series plot of the data. Mathematical manipulation of data is only useful if you understand the underlying requirements for any given formula to be 'correct'; in other words, understanding of the variation precedes mathematics. And understanding of the formula must precede its use.
     
