Convergence by Testing Percentiles
Why do different percentiles take different numbers of iterations to converge? And why do percentiles sometimes converge more quickly than the mean, even though the mean should be more stable?
There can definitely be some surprises when you use percentiles as your criterion for convergence, and you can also get very different behavior from different distributions.
First, an explanation of how @RISK tests for convergence. In Simulation Settings, on the Convergence tab, you can specify a convergence tolerance and a confidence level, or use the default settings of 3% tolerance and 95% confidence. Setting 3% tolerance at 95% confidence means that @RISK keeps iterating until there is better than a 95% probability that the true percentile of the distribution is within ±3% of the corresponding percentile of the simulation data accumulated so far. (See also: Convergence Monitoring in @RISK.)
Example: You're testing convergence on P99 (the 99th percentile). N iterations into the simulation, the 99th percentile of those N iterations is 3872. A 3% tolerance is 3% of 3872, or about 116. @RISK computes the chance that the true P99 of the population is within 3872 ± 116. If that chance is better than 95%, @RISK considers that P99 has converged. If that chance is less than 95%, @RISK uses the sample P99 (from the N iterations so far) to estimate how many iterations will be needed to get that chance above 95%. In the Status column of the Results Summary window, @RISK displays the percentage of the necessary iterations that @RISK has performed so far.
Technical details: @RISK computes the probabilities by using the theory in Distribution-Free Confidence Intervals for Percentiles (accessed 2013-07-24). The article gives an exact computation using the binomial distribution and an approximate calculation using the normal distribution; @RISK uses the binomial calculation.
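To make the binomial idea concrete, here is a minimal Python sketch of a distribution-free coverage calculation. It is not @RISK's actual code: the function names, the choice of sample-percentile convention, and the way the tolerance band is bracketed by order statistics are all assumptions made for illustration. It relies on the standard result that, for a continuous distribution, the number of sample values at or below the true p-th percentile follows a Binomial(n, p) distribution.

```python
import bisect
import math

def binom_pmf(n, k, p):
    """P(B = k) for B ~ Binomial(n, p), 0 < p < 1, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def coverage_lower_bound(sorted_sample, p, rel_tol):
    """Conservative chance that the true p-th percentile lies within
    estimate +/- rel_tol * |estimate| (hypothetical helper, not @RISK's API)."""
    n = len(sorted_sample)
    # Sample percentile: the ceil(n*p)-th order statistic (one of several conventions).
    idx = max(1, math.ceil(n * p))
    estimate = sorted_sample[idx - 1]
    tol = rel_tol * abs(estimate)
    lo, hi = estimate - tol, estimate + tol
    # r: 1-based index of the first sample value >= lo
    # s: 1-based index of the last sample value <= hi
    r = bisect.bisect_left(sorted_sample, lo) + 1
    s = bisect.bisect_right(sorted_sample, hi)
    # P(X_(r) <= true percentile < X_(s)) = P(r <= B <= s - 1), B ~ Binomial(n, p).
    # Because [X_(r), X_(s)] sits inside [lo, hi], this is a lower bound.
    return sum(binom_pmf(n, k, p) for k in range(r, s))

# Illustrative data only: 1000 evenly spaced values standing in for simulation output.
sample = list(range(1, 1001))
print(coverage_lower_bound(sample, 0.50, 0.03))
print(coverage_lower_bound(sample, 0.50, 0.10))
```

If the returned probability is below the confidence level (95% by default), the percentile would not yet count as converged under this sketch.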
Now, an explanation of anomalies, including those mentioned above.
P1 (first percentile) takes many more iterations to converge than P99, or vice versa.
At first thought, you might expect P1 and P99 to converge with the same number of iterations, P5 and P95 with the same, P10 and P90 with the same, and so on. But it usually does not work out that way. Let's take just the first and 99th percentiles as an example.
The tolerance for declaring convergence complete is expressed as a percentage of the target. If the values are all positive, then the first percentile is a smaller number than the 99th percentile, and therefore the tolerance for P1 is a smaller number than the same percentage tolerance for P99. The difference is greater if the distribution has a wide range, or if the low end of the distribution is at or near zero.
For an extreme example, consider a uniform continuous distribution from 0 to 100. P1 is about 1 and P99 is about 99, so a 3% tolerance for P1 is only about ±0.03, which is quite small and will take about 406,000 iterations to achieve. By contrast, a 3% tolerance for P99 is about ±2.97, which is relatively much larger and is achieved in only 30 iterations.
On the other hand, if the values are all negative then P99 will have a smaller magnitude than P1, and will therefore converge more slowly.
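The contrast in the uniform example can be reproduced roughly with a few lines of Python. This sketch uses the normal approximation rather than the exact binomial calculation that @RISK uses, so the iteration counts it produces will not match the figures quoted above; they only show the same orders of magnitude. The function name and its parameters are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% point of the standard normal distribution

def approx_iterations(p, rel_tol, low=0.0, high=100.0, z=Z_95):
    """Rough iteration count for the p-th percentile of Uniform(low, high)
    to fall within +/- rel_tol of its own value (normal approximation)."""
    true_value = low + p * (high - low)
    half_width = rel_tol * abs(true_value)   # tolerance in output units
    d = half_width / (high - low)            # same tolerance in probability units
    # Standard sample-size formula for a proportion: n = z^2 * p * (1 - p) / d^2
    return math.ceil(z * z * p * (1.0 - p) / (d * d))

n_p1 = approx_iterations(0.01, 0.03)
n_p99 = approx_iterations(0.99, 0.03)
print("P1 :", n_p1, "iterations (approx.)")
print("P99:", n_p99, "iterations (approx.)")
```

The only thing that differs between the two calls is the size of the tolerance in output units, which is why the same 3% setting leads to such different iteration counts.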
Convergence happens too quickly and is very poor.
This occurs when the distribution has a narrow range, so that there is little difference between one percentile and another.
Consider the uniform continuous distribution from 10,000 to 10,100. Every percentile is in the neighborhood of 10,050, so a 3% tolerance is about 10,050 ± 302 = 9,748 to 10,352. That range is actually larger than the data range. Therefore, the very first sample value for every percentile will be within that range, so convergence of any percentile happens on the first iteration, but that "convergence" is meaningless.
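A quick way to spot this situation is to compare the tolerance band with the range of the output. The following Python sketch does that arithmetic for the example just given; the helper names are made up for illustration and are not part of @RISK.

```python
def tolerance_band(value, rel_tol):
    """Return (lower, upper) limits of value +/- rel_tol * |value|."""
    half = rel_tol * abs(value)
    return value - half, value + half

def band_covers_range(value, rel_tol, low, high):
    """True if the tolerance band around value spans the whole output range,
    in which case a convergence test on that value tells you nothing."""
    lo, hi = tolerance_band(value, rel_tol)
    return lo <= low and hi >= high

low, high = 10_000.0, 10_100.0
for pct in (1, 50, 99):
    value = low + pct / 100.0 * (high - low)   # percentile of Uniform(low, high)
    print(pct, tolerance_band(value, 0.03), band_covers_range(value, 0.03, low, high))
```

For this distribution the check comes back true for every percentile, which is the signal that a 3% tolerance is far too loose here.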
P50 takes more iterations to converge than P1 or P99, and also more than the mean.
This is expected for many distributions. The percentile convergence is based on a binomial distribution, with p = the percentile being tested. The binomial distribution is fairly broad for p = 50%, and so the margins of error are greater and convergence takes more iterations. But as p gets closer to 0 or to 100%, the distribution gets narrower, margins of error get smaller, and convergence happens in fewer iterations.
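The spread of the binomial distribution can be checked directly. This short Python sketch prints the standard deviation of the sample proportion, sqrt(p(1 - p)/n), for a few percentiles; the iteration count of 1000 is an arbitrary choice for illustration.

```python
import math

def proportion_sd(p, n):
    """Standard deviation of a binomial proportion with n trials."""
    return math.sqrt(p * (1.0 - p) / n)

n = 1000  # arbitrary number of iterations, for illustration only
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"p = {p:.2f}: sd = {proportion_sd(p, n):.5f}")
```

The spread is largest at p = 0.50 and shrinks symmetrically toward either tail. Note that this is the spread in probability terms; how it translates into output units depends on the shape of the distribution.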
As for convergence of a mean versus convergence of the 50th percentile, percentiles use the binomial distribution, but the confidence interval for the mean uses Student's t. Margins of error are usually narrower for Student's t than for the binomial, so convergence of the mean happens faster than convergence of the 50th percentile, even for a symmetric distribution.
Advice: In @RISK's simulation settings you have to set convergence tolerance as a percentage of the tested statistic (mean, standard deviation, or percentile), but the appropriate percentage is not always obvious. To help you make the decision, run a simulation with a few iterations, say 100, just to get a sense of what the output distribution looks like. Then, if you expect the percentile value to be close to zero, specify a higher tolerance or choose a different statistic. Also check your tolerance against the expected range of the output, and if necessary specify a smaller tolerance.
Last edited: 2015-06-24