Discrete Density Data Treated as Continuous
Applies to: @RISK 5.x–7.x
My data set is as follows:
x | p |
---|---|
0 | 0.14 |
50 | 0.35 |
100 | 0.30 |
200 | 0.15 |
500 | 0.06 |
I calculate the mean in Excel by summing the product of each data value multiplied by its probability, and I get 107.5. But if I do a fit on this data, the Input column in the Fit tab shows the mean as 183.74. Have I used a correct method to calculate the mean for density data? If not, what is the correct way to do this?
Probability works quite differently for discrete and continuous distributions. A continuous distribution has infinitely many possible values (not just 0 and 1, for instance, but also every value in between), so the probability of getting any one exact value is zero. That is why, for a continuous distribution, we always look at the probability that a value falls within a certain range, not the probability that it equals a single value. For a discrete distribution, the probability of each possible outcome is nonzero; for example, a coin toss has only two possible outcomes, not infinitely many, so we can talk about the probability of a single value. When you do a fit, one thing you tell @RISK is the data type, so that it can apply the proper rules for probability.
The way you have the fit set up, you are specifying 5 points on a continuous density curve, which is not the same as specifying the probability at those points. Since it doesn't have any more information, @RISK assumes a linear change in density between each of these points (it connects the dots with straight lines). In effect, it treats the data as describing a RiskGeneral distribution. The mean of 183.74 in the Input column is the mean of that continuous curve, which spreads probability over every value from 0 to 500, including the long stretch between 200 and 500.
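To see where 183.74 comes from, the calculation can be reproduced outside @RISK. The following is only a sketch: it assumes the density is exactly piecewise linear between the five points, as described above, and normalizes by the area under that curve. The function name `piecewise_linear_mean` is made up for this illustration and is not part of @RISK.

```python
# Points from the question, read as (x, density height) pairs
# rather than (x, probability) pairs.
xs = [0, 50, 100, 200, 500]
heights = [0.14, 0.35, 0.30, 0.15, 0.06]


def piecewise_linear_mean(xs, heights):
    """Mean of a density that is linear between consecutive (x, height) points.

    Hypothetical helper for illustration; the heights need not integrate to 1
    because the result is divided by the total area under the curve.
    """
    area = 0.0    # integral of f(x) over all segments
    moment = 0.0  # integral of x * f(x) over all segments
    for a, b, fa, fb in zip(xs, xs[1:], heights, heights[1:]):
        width = b - a
        # Trapezoid area for a linear segment.
        area += width * (fa + fb) / 2
        # Closed-form integral of x * f(x) for a linear segment.
        moment += width * (fa * (2 * a + b) + fb * (a + 2 * b)) / 6
    return moment / area


continuous_mean = piecewise_linear_mean(xs, heights)
print(round(continuous_mean, 2))  # 183.74
```

Most of the weight in this calculation comes from the segment between 200 and 500, which a discrete reading of the data ignores entirely.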
When you manually calculated the mean, however, you assumed a discrete distribution: one that takes only the values 0, 50, 100, 200, and 500, and nothing else. If this is what you intended, then select a data type of "Discrete Sample Data (Counted Format)" on the first tab of the fitting dialog. As the name "counted format" suggests, the second column must contain whole numbers, so you need to multiply all your probabilities by the same number. In this case, since the probabilities all have two decimal places, multiply them all by 100 to get whole numbers in the same proportion.
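As a quick check that scaling does not change the answer, here is a minimal sketch that converts the probabilities to counts and recomputes the discrete mean. The variable names are illustrative only.

```python
xs = [0, 50, 100, 200, 500]
ps = [0.14, 0.35, 0.30, 0.15, 0.06]

# Scale every probability by the same factor (100) to get whole-number counts.
# round() guards against floating-point noise in the multiplication.
counts = [round(p * 100) for p in ps]

# Weighted mean using counts; dividing by the total count keeps the proportions.
discrete_mean = sum(x * c for x, c in zip(xs, counts)) / sum(counts)

print(counts)         # [14, 35, 30, 15, 6]
print(discrete_mean)  # 107.5
```

The counts column (14, 35, 30, 15, 6) is what would go alongside the x values for the counted format, and the mean matches the 107.5 calculated by hand in Excel.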
Last edited: 2015-06-19