Batch Means Methods

When studying stochastic (random) processes, deriving statistical properties analytically is often infeasible and sometimes impossible. The batch means method works around this problem by using direct observations of the stochastic system to estimate its long-run average value, so the statistician does not need a closed form, or simple equation, to work with.
    Warm Up

    • When a stochastic process starts, the first values will not be representative of the process' steady state. This is because the statistician must seed the process with an initial value. The stochastic process then uses that initial value to generate a second value, uses the second value to generate a third, and so on. Eventually, the influence of the initial value fades and the generated values settle into a stable statistical pattern. The process is then said to be in a steady state.

      Since the batch means method is concerned with the steady state, the warm-up period, sometimes called the transient state, needs to be ignored. This can be done in one of two ways: by removing some initial interval from the data set before collecting batch means, or by running such a long simulation that the initial period becomes insignificant by comparison.
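
      The first option, removing an initial interval, can be sketched in Python as follows. This is only an illustration under stated assumptions: the process simulate_process, the function drop_warm_up and the cutoff of 500 observations are all hypothetical, and a suitable cutoff depends on the process being studied.

```python
import random

def simulate_process(n, phi=0.9, start=50.0, seed=1):
    """Hypothetical stand-in process: each value is built from the previous one.

    The seed value `start` is deliberately far from the long-run average (0),
    so the first observations are not representative of the steady state.
    """
    rng = random.Random(seed)
    x = start
    values = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        values.append(x)
    return values

def drop_warm_up(values, warm_up):
    """Return the observations that remain after discarding the first `warm_up`."""
    if warm_up < 0 or warm_up >= len(values):
        raise ValueError("warm_up must be between 0 and len(values) - 1")
    return values[warm_up:]

observations = simulate_process(10_000)
steady = drop_warm_up(observations, warm_up=500)  # assumed cutoff, not a rule
```

      The second option needs no code at all: the run is simply made long enough that the first few hundred values have little effect on the overall average.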

    Batches

    • After the simulation has been run and the transient effects dealt with, the remaining data points should be divided into consecutive batches, usually of equal size. The average value of the data points within each batch is a batch mean, and these batch means give the method its name. The spread of the batch means can then be used to estimate the variance of the overall average, a measure of how much that average would change from one run to another. The batch means can also be used to calculate a confidence interval, a range around the overall average that is expected to contain the process' true steady-state average with a stated probability, such as 95 percent.
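
      The batching step and the confidence interval can be sketched in Python as below. This is a minimal sketch, not a definitive implementation: the function names batch_means and confidence_interval are hypothetical, the batches are assumed to be equal in size, and the caller is assumed to supply the Student's t critical value for the chosen confidence level and number of batches (for example, about 2.045 for 95 percent confidence with 30 batches).

```python
import math
import statistics

def batch_means(values, num_batches):
    """Split `values` into equal-size consecutive batches; return each batch's mean."""
    batch_size = len(values) // num_batches
    if num_batches < 2 or batch_size < 1:
        raise ValueError("need at least 2 batches with at least 1 point each")
    # Points left over at the end that do not fill a whole batch are dropped.
    return [
        statistics.fmean(values[i * batch_size:(i + 1) * batch_size])
        for i in range(num_batches)
    ]

def confidence_interval(means, t_critical):
    """Return (low, high) around the average of the batch means.

    Treats the batch means as independent samples, which is the
    assumption the batch means method relies on.
    """
    k = len(means)
    grand_mean = statistics.fmean(means)
    # Sample variance of the batch means, divided by the number of batches,
    # estimates the variance of the overall average.
    std_error = math.sqrt(statistics.variance(means) / k)
    half_width = t_critical * std_error
    return grand_mean - half_width, grand_mean + half_width

data = [1, 2, 3, 4, 5, 6, 7, 8]         # toy data, already past warm-up
means = batch_means(data, 4)            # [1.5, 3.5, 5.5, 7.5]
low, high = confidence_interval(means, t_critical=3.182)  # 95%, 3 degrees of freedom
```

      With real data the batches would be far longer than two points each; the toy data only keeps the arithmetic easy to follow by hand.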

    Considerations

    • The biggest advantage of the batch means method is that it allows researchers to gather statistical information on problems that cannot be solved with pen and paper. Also, running a single extremely long simulation instead of multiple, shorter simulations is more efficient because there is only one warm-up period. For instance, if 30 batches are taken from one long run, the researcher discards only one warm-up period, instead of the 30 that would be discarded across 30 separate runs.

      The main problem with the batch means method is that it assumes the batches are statistically independent of one another, and this is not strictly true: the first values in each batch are generated from the last values of the batch before it. If the batches are long and the process is truly in a steady state, the correlation between batch means should be small, but a slight error still remains.
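
      One rough way to see how much correlation remains is to measure how strongly each batch mean tracks the one before it. The Python sketch below computes the sample lag-1 autocorrelation of a list of batch means. The function name lag1_autocorrelation is hypothetical, and this check is offered only as an illustrative diagnostic, not as part of the batch means method itself.

```python
import statistics

def lag1_autocorrelation(means):
    """Sample lag-1 autocorrelation of a sequence of batch means.

    Values near 0 are consistent with nearly independent batches;
    values well above 0 suggest the batches are too short.
    """
    k = len(means)
    if k < 3:
        raise ValueError("need at least 3 batch means")
    center = statistics.fmean(means)
    denom = sum((m - center) ** 2 for m in means)
    if denom == 0:
        raise ValueError("batch means are all identical")
    # Products of deviations for each pair of neighbouring batch means.
    numer = sum(
        (means[i] - center) * (means[i + 1] - center) for i in range(k - 1)
    )
    return numer / denom

trending = lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0])  # neighbours move together
```

      If the result is far from zero, using fewer but longer batches is the usual remedy, at the cost of having fewer batch means to work with.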
