
# Problems with deterministic models

People who espouse risk-analysis methods are sometimes challenged by skeptics to justify the effort required to implement these methods. We must answer the question: “Why bother with risk analysis?” Put another way, “What’s wrong with deterministic methods?” A short answer appears in Murtha:

“We do risk analysis because there is uncertainty in our estimates of capital, reserves, and such economic yardsticks as NPV (net present value). Quantifying that uncertainty with ranges of possible values and associated probabilities (i.e., with probability distributions) helps everyone understand the risks involved. There is always an underlying model, such as a volumetric reserves estimate, a production forecast, a cost estimate, or a production-sharing economics analysis. As we investigate the model parameters and assign probability distributions and correlations, we are forced to examine the logic of the model.
The language of risk analysis is precise; it aids communication, reveals assumptions, and reduces mushy phrases and buzz words. This language requires study and most engineers have little exposure to probability and statistics in undergraduate programs.”

Beyond that, we indicate some shortcomings of deterministic methods.

## Aggregating base cases: adding modes and medians (reserves, cost, time)

Deterministic reserve estimates are often described in terms of:

• Low-side possibilities
• Medium-side possibilities
• High-side possibilities

Some people think in terms of extremes (worst and best cases) together with a base case, which may be the:

• Mean
• Mode
• Median

Others report values in terms of:

• P10
• P50
• P90

Sometimes, these cases are linked to such categories as:

• Proved
• Proved plus probable
• Proved plus probable plus possible

While there is nothing wrong with any of these notions, the logic of obtaining the cases is often flawed. Again, from Murtha:

“Total capital cost is often estimated by adding the base costs for the various line items. A simple exercise shows how far off the total cost can be. Take ten identical triangular distributions, each having 100, 200, and 350 for low, most-likely (mode), and high values, respectively. While the mode of each is 200, the mean is 216.7. Summing these ten triangles gives, as usual, a new distribution that is approximately normal—this one with a mean of 2,167 and a standard deviation of approximately 165. The original mode, 200, is approximately P40. The sum of the modes is approximately P15, far from what might be expected as a ‘representative value’ for the distribution. In a 2,000-trial simulation, the P1 and P99 values are about 1,790 and 2,550.
If the distributions represented 10 line-item cost estimates, in other words, while there would be a 60% chance of exceeding the mode for any single estimate, there is an 85% chance—about 6 times out of 7—of exceeding the sum of the modes. If we added 100 items instead of just 10, the chance of exceeding the sum of modes is more than 99%. We must be careful how we use most-likely (modes) estimates for costs and reserves. Of course, if there is significant positive correlation among the items, the aggregate distribution will be more dispersed and the above effect less pronounced.”
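Murtha's triangle exercise is easy to reproduce. The sketch below is a minimal Monte Carlo check in Python (the seed and trial count are arbitrary choices, not part of the original exercise): it sums ten triangular(100, 200, 350) distributions and estimates the chance of exceeding the sum of the modes.

```python
import random

random.seed(42)

LOW, MODE, HIGH = 100, 200, 350
N_ITEMS, N_TRIALS = 10, 20_000

# The mean of a triangular distribution is (low + mode + high) / 3,
# which here is 216.7 -- already above the mode of 200.
single_mean = (LOW + MODE + HIGH) / 3

# Sum ten independent line items on each trial.
totals = [
    sum(random.triangular(LOW, HIGH, MODE) for _ in range(N_ITEMS))
    for _ in range(N_TRIALS)
]

sum_of_modes = MODE * N_ITEMS  # 2,000, the "base case" total

# Fraction of trials whose total exceeds the sum of the modes.
p_exceed = sum(t > sum_of_modes for t in totals) / N_TRIALS

print(f"mean of totals           : {sum(totals) / N_TRIALS:.0f}")  # ~2,167
print(f"P(total > sum of modes)  : {p_exceed:.2f}")                # ~0.85
```

The simulated mean lands near 2,167 and roughly 85% of trials exceed 2,000, consistent with the quoted figures: the sum of the modes sits near P15 of the aggregate distribution.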

## Multiplying base cases or P10s (factors to yield reserves or resources)

When volumetric products are used to obtain reserves estimates, there is a temptation to build the low-side reserves estimate by blithely taking the product of low estimates for the various factors. This is a dangerous business at best. The product of P10 estimates for area, pay, and recovery factor, for example, is approximately P1. For special cases (all distributions log-normal, no correlations), one can find an exact answer, but if you use a different distribution type, include any correlations between inputs (larger area tends to be associated with thicker pay), or change the number of factors (breaking out recovery factor into porosity, saturation, formation volume factor, and efficiency), there are no simple rules of thumb to predict just how extreme the product of P10 values is.

Less obvious is the fact that neither the P50 value nor the modes for the inputs yield either P50 or mode, respectively, for the output, except in very special cases. The mean values of inputs will yield the mean of the product distribution, but only if there is no correlation among inputs. In other words, even the “base-case” reserves estimate, generally, should not be obtained from a product of base-case inputs, except in very special cases.

## Including pilot projects with P(S) > 0.5

Imagine the following situation: In preparation for a court resolution of ownership, an operating company wishes to estimate the value of its extensive holdings in a major oil field. Complicating factors include several possible programs, some involving new technology. Five of the programs require successful completion of a pilot project. Based on laboratory analysis and somewhat similar development procedures in less harsh environments, the pilots all have high estimates of success, ranging from 65 to 80%. In the deterministic version of the global model, each pilot is treated in isolation and assumed to succeed (because each success probability exceeds 50%), and the corresponding program is automatically included. From a probabilistic standpoint, however, the chance of all pilots being successful is quite small (roughly 0.7^5 ≈ 0.17, or about 5 to 1 against). The actual Monte Carlo model is so inconsistent with the deterministic model that the first-pass results show the deterministic estimate (or better) to have only about a 5% chance of happening. Note that the Monte Carlo simulation uses the more realistic scenario: on each iteration, the pilot either succeeds, and the follow-up program is included, or fails, and no contribution is included for the follow-up.

A more correct deterministic method would include P(S) × (value of pilot + value of corresponding program), but even this would not shed any light on the range of possibilities. In short, it is difficult to account properly for stages of development when each stage has an uncertain chance of success.
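The gap between the two treatments can be sketched numerically. The success probabilities and program values below are hypothetical stand-ins (the original field data are not given); the simulation includes each program only on iterations where its pilot succeeds.

```python
import random

random.seed(1)

# Hypothetical pilot success probabilities in the 65-80% range.
P_SUCCESS = [0.65, 0.70, 0.70, 0.75, 0.80]
# Hypothetical value (in $MM) each follow-up program adds if its pilot succeeds.
PROGRAM_VALUE = [120, 200, 150, 300, 90]

# Deterministic shortcut: every P(S) > 0.5, so include every program.
deterministic = sum(PROGRAM_VALUE)

# Chance that *all* pilots succeed -- the scenario the shortcut assumes.
p_all = 1.0
for p in P_SUCCESS:
    p_all *= p
print(f"P(all pilots succeed) = {p_all:.3f}")  # ~0.19, about 4 to 1 against

# Monte Carlo: each pilot independently succeeds (program included)
# or fails (no contribution).
N = 20_000
hits = 0
for _ in range(N):
    value = sum(v for p, v in zip(P_SUCCESS, PROGRAM_VALUE)
                if random.random() < p)
    if value >= deterministic:
        hits += 1
print(f"P(outcome >= deterministic estimate) = {hits / N:.3f}")
```

Even in this stripped-down sketch, where program values are fixed, the deterministic total is achieved only about one iteration in five; once the program values themselves carry uncertainty, as in the full model described above, that chance falls further.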

## Multiple definitions of contingency

Cost engineers add contingency to line items or to the total base estimate to account for some uncertainty. Within a company, the rules and guidelines are generally well known and consistently applied. Nonetheless, there are different interpretations among companies. One of the standard definitions says: “Cost contingency is the amount of additional money, above and beyond the base cost, that is required to ensure the project’s success. This money is to be used only for omissions and the unexpected difficulties that may arise. ...Contingency costs are explicitly part of the total cost estimate.”

By adding contingency to each line item, the total cost estimate contains the sum of these quantities. However, we have seen above the danger of summing deterministic variables. In effect, setting aside some additional funds for each line item tends to generate a larger aggregate contingency than is necessary because it emphasizes the unlikely prospect that all line items will simultaneously exceed their estimates. An alternative use of contingency is to apply a percent to the total cost. This, at least, recognizes the sum of line items as its own distribution.
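The over-funding effect can be quantified with the same ten-triangle example used earlier. The sketch below (arbitrary seed and trial count; P80 chosen as an illustrative contingency level, not a standard) compares funding each line item at its own P80 against funding the total at its P80.

```python
import random

random.seed(3)

LOW, MODE, HIGH = 100, 200, 350
N_ITEMS, N_TRIALS = 10, 20_000

# One row per trial: a sampled cost for each of the ten line items.
samples = [[random.triangular(LOW, HIGH, MODE) for _ in range(N_ITEMS)]
           for _ in range(N_TRIALS)]

def p80(values):
    """Empirical 80th percentile."""
    s = sorted(values)
    return s[int(0.80 * len(s))]

# Contingency per line item: fund each item at its own P80, then add...
per_item_p80 = sum(p80([trial[i] for trial in samples])
                   for i in range(N_ITEMS))

# ...versus funding the total at its P80.
total_p80 = p80([sum(trial) for trial in samples])

print(f"sum of per-item P80s : {per_item_p80:.0f}")  # ~2,630
print(f"P80 of the total     : {total_p80:.0f}")     # ~2,300
```

With independent items, the sum of per-item P80s exceeds the P80 of the total by several hundred, because simultaneous overruns on all ten items are rare; this is the sense in which line-item contingency sets aside more money than the target confidence level requires.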

## Not knowing how likely the most-likely case is

Even if a deterministic method could generate a most-likely case (by some method other than simply applying the model to the most-likely inputs), we would not know how likely it would be to achieve that case or better. Monte Carlo outputs let us estimate the likelihood of achieving any given outcome, so we avoid the surprise of discovering, for example, that the odds against bettering our most-likely case are 7 to 2.

## Not identifying driving variables

Deterministic models do not tell us which of the inputs are important. Sensitivity analysis is the general term for finding out how the outputs of a model vary with changes in the inputs. Attempting to answer “what if” questions (What if the oil price goes to U.S. \$80? What if the project is delayed by a year? What if the rig rate is twice what we budgeted?) gave rise to two forms of sensitivity analysis, tornado charts and spider diagrams, which have been discussed (and their shortcomings mentioned) earlier in this chapter.