
Identify the Most Important Predictor Variables in Regression Models

Discussion in 'Capability - Process, Machine, Gage …' started by essegn, Mar 14, 2020.

  1. essegn

    essegn Member

    Joined:
    Feb 5, 2016
    Messages:
    47
    Likes Received:
    5
    Trophy Points:
    7
    Hi,
    I am having trouble identifying the most important predictor in a regression model.
    There are three inputs that are statistically significant (p-value = 0.000), and there is no multicollinearity.

    What is the difference between:
    - F-Value
    - Contribution
    - Coef or coded Coef

    Which values should be considered when assessing the most significant predictor? I mean, which input should be adjusted first in order to optimize the process?
    Could you please explain the results in the attached files?
     

    Attached File(s): 1. Scan for viruses before using. 2. Report any 'bad' files by reporting this post. 3. Use at your own Risk.:

  2. G650ER

    G650ER Member

    Joined:
    Feb 22, 2019
    Messages:
    7
    Likes Received:
    0
    Trophy Points:
    1
  3. essegn

    essegn Member

    Hi G650ER,

    thank you for your reply.
    I expected an answer like that, but I am still unable to interpret which predictor variable is the most significant.
    I have made a summary statistic of the output; please look at the attached file.

    From my point of view, Input Y is the most important (because of the correlation, the F-Value, and a PLS regression, which is not included here).
    But here Input X seems to be the most important.

    The coded coefficients differ depending on which coding of the continuous predictors is used; see the attached file.
    I have read all the explanations of coded and uncoded coefficients, Contribution, the F-test and the F-Value.
    It seems to me that the coded coefficients are the key, but some points are still unclear.

    Could someone help with interpreting the results?
     

    Attached File(s): 1. Scan for viruses before using. 2. Report any 'bad' files by reporting this post. 3. Use at your own Risk.:

  4. Miner

    Miner Moderator Staff Member

    Joined:
    Jul 30, 2015
    Messages:
    577
    Likes Received:
    493
    Trophy Points:
    62
    Location:
    Greater Milwaukee USA
    It will depend on your goal. Do you want to minimize variation or move the average?

    Percent contribution identifies which variable contributed the most toward the variation observed in the response. However, this is influenced by how much variation was observed in each input during the course of the study. If you want to reduce the variation in the response, I recommend using percent contribution.

    Coded coefficients explain how much the response will change given a one-unit change in the coded input variable. Therefore, the input variable with the largest coded coefficient (in absolute value) will move the average of the response the most. If you want to adjust the average response, I recommend using the coded coefficients.
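To make the distinction above concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the inputs `input_a` and `input_b`, their ranges, and the true slopes are made up, and "percent contribution" is taken to mean each term's sequential sum of squares as a percentage of the total sum of squares, which is how I understand the Minitab output. It is not a reproduction of the attached analysis.

```python
import numpy as np

def fit(X, y):
    """Least-squares fit with an intercept; returns (coefficients, SSE)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, float(resid @ resid)

def percent_contribution(X, y):
    """Sequential SS of each column (entered in order) as % of total SS."""
    ss_total = float(((y - y.mean()) ** 2).sum())
    prev_sse = ss_total
    shares = []
    for k in range(1, X.shape[1] + 1):
        _, sse = fit(X[:, :k], y)
        shares.append(100.0 * (prev_sse - sse) / ss_total)
        prev_sse = sse
    return shares

def code_low_high(X):
    """Code each column so its observed min maps to -1 and max to +1."""
    center = (X.max(axis=0) + X.min(axis=0)) / 2.0
    half_range = (X.max(axis=0) - X.min(axis=0)) / 2.0
    return (X - center) / half_range

# Hypothetical data: input_a has a large slope but a narrow range,
# input_b has a small slope but a wide range.
rng = np.random.default_rng(0)
n = 200
input_a = rng.uniform(0.0, 1.0, n)
input_b = rng.uniform(0.0, 100.0, n)
response = 10.0 * input_a + 0.5 * input_b + rng.normal(0.0, 1.0, n)

X = np.column_stack([input_a, input_b])
uncoded_coef = fit(X, response)[0][1:]               # drop the intercept
coded_coef = fit(code_low_high(X), response)[0][1:]
contrib = percent_contribution(X, response)

for name, u, c, p in zip(["input_a", "input_b"], uncoded_coef, coded_coef, contrib):
    print(f"{name}: uncoded={u:.3f}  coded={c:.3f}  contribution={p:.1f}%")
```

In this made-up data the uncoded coefficient of `input_a` is the larger one, yet `input_b` has the larger coded coefficient and the larger contribution, because it was varied over a much wider range. That is the reason uncoded coefficients alone should not be used to rank predictors.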
     
  5. essegn

    essegn Member

    Thank you for the great explanation! It makes sense now.
    I am still confused about standardized vs. non-standardized coefficients.

    When should continuous variables be standardized?
    Is there any reason other than avoiding multicollinearity?

    In Minitab there are three ways:
    - Specify low and high levels to code as -1 and +1
    - Subtract the mean, then divide by the standard deviation
    - Subtract the mean

    As can be seen in the attached picture (my last post), they give different results.
    Which one should be used?

    Once a promising model has been built (with uncoded coefficients and low VIF), should the analysis be run again with standardized continuous variables in order to have a 1:1 comparison?
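The three coding options listed above can be sketched as follows. This is only an illustration on synthetic data, assuming a single hypothetical input `temp` with low and high levels of 150 and 250; the helper names are mine, not Minitab's. For a model with only a linear term, the coding changes the scale of the coefficient but not the fitted values.

```python
import numpy as np

def code_low_high(x, low, high):
    """Map the chosen low level to -1 and the high level to +1."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (x - center) / half_range

def code_standardize(x):
    """Subtract the mean, then divide by the sample standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)

def code_center(x):
    """Subtract the mean only."""
    return x - x.mean()

def fit_line(x, y):
    """Simple least-squares line; returns ([intercept, slope], fitted values)."""
    A = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

# Hypothetical data for illustration only.
rng = np.random.default_rng(1)
temp = rng.uniform(150.0, 250.0, 50)
y = 3.0 + 0.2 * temp + rng.normal(0.0, 1.0, 50)

codings = {
    "uncoded": temp,
    "low/high -> -1/+1": code_low_high(temp, 150.0, 250.0),
    "standardized": code_standardize(temp),
    "centered": code_center(temp),
}
results = {name: fit_line(x, y) for name, x in codings.items()}

for name, (beta, _) in results.items():
    print(f"{name:20s} intercept={beta[0]:9.3f}  slope={beta[1]:9.3f}")
```

The slopes differ because each coding defines "one unit" differently: half the low-to-high range, one standard deviation, or one original unit (centering alone only shifts the intercept). So the results are not contradictory; coefficients are only comparable across predictors when all predictors use the same coding.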
     
  6. Miner

    Miner Moderator Staff Member

  7. Miner

    Miner Moderator Staff Member

    Correlation just indicates whether there is a statistically significant linear relationship between two variables, and how strong it is. Regression provides a mathematical model that describes the relationship and allows you to make a prediction of the response given a specific level of the input variable.
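A small sketch of that difference, using made-up data (the variables `x`, `y` and the level `x_new` are assumptions for illustration): the correlation coefficient is a single unitless number, while the regression fit gives an equation that can be evaluated at a chosen input level.

```python
import numpy as np

# Hypothetical data for illustration only.
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 10.0, 40)
y = 1.5 + 2.0 * x + rng.normal(0.0, 1.0, 40)

# Correlation: strength and direction of the linear relationship.
r = np.corrcoef(x, y)[0, 1]

# Regression: a model y = intercept + slope * x that can make predictions.
slope, intercept = np.polyfit(x, y, 1)
x_new = 7.0
y_pred = intercept + slope * x_new

print(f"correlation r = {r:.3f}")
print(f"regression: y = {intercept:.3f} + {slope:.3f} * x")
print(f"predicted y at x = {x_new}: {y_pred:.3f}")
```

For a single predictor the two are linked: the slope equals r times the ratio of the standard deviations of y and x. So correlation and simple regression agree on direction and strength, but only the regression tells you how far the response moves for a given change in the input.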
     