Experiments without replications

Discussion in 'DOE - Design of Experiments' started by Jawad, Oct 6, 2021.

  1. Jawad

    Jawad New Member

    Joined:
    Oct 6, 2021
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    1
    Hello everyone,

    I am trying to assess the effect of several scheduling policies (treatments) on the waiting times for patients (response). These are computer experiments.

    For each factor-level combination, I draw n patients at random, schedule them, and record their waiting times. There is no reason to assume that the n patients for each factor-level combination constitute n replications as the patients are independent and require different operations, due dates, resources, etc.

    Analyzing the waiting times data for each treatment shows a few things:
    1. There is no known distribution that characterizes the waiting-time data for any treatment.
    2. I cannot characterize any distribution with a single measure, so I am basing the comparisons on multiple measures for both the location and spread parameters.
    3. The data do not meet the assumptions of non-parametric tests, and the tests that come close compare only one measure of the data, such as the median, which I know is not representative.

    My question is, how do I set up an experiment where I cannot replicate, but can observe a large number of responses for each factor-level combination?

    Regards,
     
  2. Miner

    Miner Moderator Staff Member

    Joined:
    Jul 30, 2015
    Messages:
    577
    Likes Received:
    493
    Trophy Points:
    62
    Location:
    Greater Milwaukee USA
    You have a lot going on here, and without knowing a lot more information, I cannot comment on your specific situation, so I will start with the basics.

    Let's start with the definition of a replicate and the term it is often confused with, a repeat.

    • REPEAT - multiple measurements taken under the same experimental conditions during the same (or consecutive) experimental run
    • REPLICATE - multiple measurements taken under the same experimental conditions during non-consecutive experimental runs that are often randomized
    Repeat measurements are typically summarized as a mean and standard deviation, each of which may then be analyzed separately. Repeat measurements are usually taken to minimize measurement variation (noise) in order to see treatment-to-treatment variation more clearly. They are also useful as a means of reducing subject-to-subject variation, similar to a paired t-test.

    Replicates are used to increase the power of an experiment as well as obtain an estimate of pure experimental error, which allows you to calculate the lack of fit for the regression model. Without replicates, you will have a saturated model and cannot test for interactions between factors and possibly for the factors themselves. To overcome this, you must pool at least one term with the lowest Sum of Squares to provide an error term. The more terms that you pool, the more powerful the analysis.
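
    As a rough illustration of the pooling idea described above, here is a minimal sketch for an unreplicated 2^3 factorial. The function name, the factor labels A, B, C and the eight response values are all hypothetical and not taken from this thread. Because the terms pooled are chosen for being the smallest, the resulting error estimate tends to be optimistic, so the F ratios should be read with caution.

```python
import itertools
import numpy as np

def unreplicated_factorial_anova(y, n_pool):
    """Sums of squares and pooled-error F ratios for an unreplicated 2^3 design.

    y      : 8 responses, ordered like itertools.product([-1, 1], repeat=3)
    n_pool : how many of the smallest-SS terms to pool into the error term (1..6)
    """
    y = np.asarray(y, dtype=float)
    if y.shape != (8,):
        raise ValueError("expected exactly 8 responses for a 2^3 design")
    if not 1 <= n_pool <= 6:
        raise ValueError("n_pool must be between 1 and 6")

    levels = np.array(list(itertools.product([-1, 1], repeat=3)))
    a, b, c = levels.T
    columns = {
        "A": a, "B": b, "C": c,
        "AB": a * b, "AC": a * c, "BC": b * c, "ABC": a * b * c,
    }
    # In a two-level orthogonal design each term has 1 df and
    # SS = (contrast)^2 / (number of runs).
    ss = {name: (col @ y) ** 2 / len(y) for name, col in columns.items()}

    ranked = sorted(ss, key=ss.get)          # smallest SS first
    pooled = ranked[:n_pool]
    ms_error = sum(ss[t] for t in pooled) / n_pool
    # F ratio of each remaining term against the pooled error (1 and n_pool df)
    f_ratios = {t: ss[t] / ms_error for t in ranked[n_pool:]}
    return ss, pooled, f_ratios

# Hypothetical data: one run per factor-level combination, no replicates
y_example = [10, 12, 11, 13, 20, 22, 21, 24]
ss, pooled, f_ratios = unreplicated_factorial_anova(y_example, n_pool=3)
print("pooled into error:", pooled)
print("F ratios:", f_ratios)
```

    With all seven terms fitted there are no degrees of freedom left for error, which is why at least one term has to be pooled before any F ratio can be formed.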
     
  3. Jawad

    Jawad New Member

    Thanks Miner, but repeats are not relevant here, and I do not see why you think I confused them with replicates. In any case, pooling is a last resort, and I doubt it will clear up any differences between main effects or even lower-order interactions.

    My question is basic, and I think it is fundamental: if I cannot replicate an experiment, but can observe many instances under each treatment, can I draw any conclusions about treatment effects?

    To simplify the experiment, I'll consider one factor at two levels; that is: scheduling policy A vs. B. I draw 1000 patients at random from a larger pool and schedule their operations according to policy A, and record their waiting times. Then, I draw the same 1000 patients, but schedule them with policy B now and record their waiting times. Are the waiting times of both policies the same or not? I am reluctant to answer such a question because each patient is independent and I cannot quote an expected waiting time for patient 1001 even if there is a test to compare such populations.
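
    Since the same 1000 patients are scheduled under both policies in the setup described above, one possible way to look at the data is patient by patient. The sketch below is only an illustration under that assumption: the function name and the generated waiting times are hypothetical, and it does nothing about the dependence between waiting times that arises from shared resources.

```python
import numpy as np

def paired_difference_summary(wait_a, wait_b):
    """Summarize per-patient differences (B minus A) for the same patients."""
    wait_a = np.asarray(wait_a, dtype=float)
    wait_b = np.asarray(wait_b, dtype=float)
    if wait_a.shape != wait_b.shape:
        raise ValueError("both policies must be evaluated on the same patients")

    diff = wait_b - wait_a                    # negative means B waited less
    q1, q2, q3 = np.percentile(diff, [25, 50, 75])
    return {
        "mean_diff": diff.mean(),
        "q1_diff": q1,
        "median_diff": q2,
        "q3_diff": q3,
        "share_shorter_under_b": np.mean(diff < 0),
    }

# Hypothetical waiting times standing in for the scheduler output
rng = np.random.default_rng(0)
wait_a = rng.lognormal(mean=7.0, sigma=1.0, size=1000)
wait_b = wait_a * rng.uniform(0.9, 1.1, size=1000)
print(paired_difference_summary(wait_a, wait_b))
```

    Looking at several quantiles of the differences, rather than one number, keeps the comparison from resting on a single summary statistic.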

    Hope this clarifies my question.
     
  4. Bev D

    Bev D Moderator Staff Member

    Joined:
    Jul 30, 2015
    Messages:
    606
    Likes Received:
    664
    Trophy Points:
    92
    Location:
    Maine
    Jawad - Miner is a highly experienced expert in this field. His response is very meaningful and I would suggest that you really think about what he is trying to say.

    I am also confused about what you are trying to do. I understand that your patients may be independent, but wait times are not themselves necessarily independent. The physics of the system precede and dictate the statistical design and analysis. Also, simulations can be troublesome. Can you explain in more detail what the simulation is and what the differences between the policies are?
     
  5. Jawad

    Jawad New Member

    Hi Bev and thanks for joining. I am sure Miner, and yourself, are highly experienced, but guess what, others are too :)

    As for Miner's reply, I just don't see how it answers my question. Could you, or he, state it another way?

    The patients are independent and the waiting times are dependent since patients share resources. There is significant autocorrelation up to 40 lags.
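
    For readers who want to check autocorrelation of this kind themselves, a minimal sketch of a sample autocorrelation function out to 40 lags follows. The function name and the generated series are hypothetical, and the series is only a stand-in for waiting times listed in scheduling order.

```python
import numpy as np

def sample_acf(x, max_lag=40):
    """Sample autocorrelation of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    centered = x - x.mean()
    denom = centered @ centered
    return np.array(
        [(centered[: n - k] @ centered[k:]) / denom for k in range(max_lag + 1)]
    )

# Hypothetical autocorrelated series (AR(1)-like), not real waiting times
rng = np.random.default_rng(1)
noise = rng.normal(size=1000)
series = np.empty(1000)
series[0] = noise[0]
for t in range(1, 1000):
    series[t] = 0.9 * series[t - 1] + noise[t]

acf = sample_acf(series, max_lag=40)
band = 1.96 / np.sqrt(len(series))   # rough reference band for white noise
print("lags outside the band:", np.flatnonzero(np.abs(acf[1:]) > band) + 1)
```

    The reference band is only the usual large-sample approximation for an uncorrelated series, so it is a screening aid rather than a formal test.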

    The simulation is a heuristic scheduler that assigns patients to time slots according to time and health constraints. The scheduler is deterministic.
    One factor of interest is how patients are dispatched to the scheduler: one-at-a-time as they arrive, or batched every hour and then dispatched to the scheduler. So for 1000 random patients, I get two sets of 1000 waiting times and I want to know if these two sets of waiting times are different. As I stated in the first message, there is nothing to suggest that the two sets of waiting times can be compared on a single summary statistic, or that I can attribute whatever differences I observe to the treatment itself rather than the random set of 1000 patients.

    Treatment   Av.    SE Mean   StDev   Min   Q1     Q2     Q3     Max
    A           4085   202       6326    504   1110   1320   4272   38719
    B           3989   200       6247    504   1107   1320   4237   40074

    From the summary statistics above it seems that treatment B is slightly better in terms of average but treatment A is better in terms of max. waiting time, although this could be due to the sample itself, not the treatments.
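
    The columns in a table like the one above can be computed along the following lines. This is a sketch with a hypothetical function name and generated data, not the software actually used here. Quartile conventions differ between packages, and the SE Mean formula assumes independent observations, so with autocorrelated waiting times it probably understates the uncertainty.

```python
import numpy as np

def summarize(waits):
    """Location and spread measures matching the table's column names."""
    w = np.asarray(waits, dtype=float)
    q1, q2, q3 = np.percentile(w, [25, 50, 75])   # NumPy's default (linear) method
    sd = w.std(ddof=1)                             # sample standard deviation
    return {
        "Av.": w.mean(),
        "SE Mean": sd / np.sqrt(len(w)),           # valid only for independent data
        "StDev": sd,
        "Min": w.min(),
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "Max": w.max(),
    }

# Hypothetical right-skewed waiting times for one treatment
rng = np.random.default_rng(2)
waits = rng.lognormal(mean=7.5, sigma=1.2, size=1000)
for name, value in summarize(waits).items():
    print(f"{name:8s} {value:10.1f}")
```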

    Given that the patients are independent, and that there is nothing to suggest that the waiting times are consistent, is there a way to assess the effect of a treatment? Or more importantly, is it possible to quote an expected waiting time for new patients? So far, it seems that the waiting time depends more on the patient's requirements and resource availability rather than how the patients are scheduled. This needs further testing though.

    Hope this clarifies the situation.