# Lesson 11 - Representing Uncertainty in our Simple Model

As was discussed in Unit 3, Lesson 8, for most real-world systems, at least some of the controlling parameters, processes and events are often uncertain (i.e., poorly understood or quantified) and/or stochastic (i.e., inherently temporally variable). For the reasons discussed in that Lesson, there can be very large uncertainties (in some cases, several orders of magnitude) in many of the parameters, processes and events associated with contaminant transport models. This, of course, results in uncertainty in any simulated results. It is for this reason that GoldSim was specifically designed as a powerful probabilistic simulator.

In this Lesson, we will look at a version of the model in which we have added some uncertainty to a handful of the input parameters (in reality, most if not all would likely have at least some uncertainty). In particular, we have added uncertainty to the partition coefficients for Sand and Sediment, as well as the discharge rate from the pipeline.

Let’s open the model now so we can begin to look at it. Go to the “Examples” subfolder of the “Contaminant Transport Course” folder you should have downloaded and unzipped to your Desktop, and open a model file named ExampleCT3_ContaminatedPond_Probabilistic.gsm. Let’s look inside the Inputs Container:

You will see that Sediment_Kd and Sand_Kd are now represented by probability distributions (Stochastic elements). Both are represented using Log-Normal distributions (in which the mean is specified as the constant value from the non-probabilistic simulation). Here, for example, is the distribution specified for Sediment_Kd:

You will also notice a new Stochastic element named Discharge_Uncertainty.  It is defined as follows:

This is a dimensionless value whose most likely value is 1, with a minimum of 0 and a maximum of 4. In the model, it simply multiplies the Pipeline_Mass_Discharge wherever it is used. Hence, rather than always discharging 250 g/day for the first 100 days, the discharge rate is uncertain (but constant for any given realization), ranging from 0 to 1000 g/day over that period (with the most likely value being 250 g/day). This is a simplistic approach, but it effectively serves the purpose of adding uncertainty to this important input.
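Outside of GoldSim, the way these three inputs are sampled once per realization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives a minimum, most likely value and maximum for the multiplier but not the shape of the distribution, so a triangular shape is assumed here; the Kd mean and standard deviation passed in the usage lines are hypothetical placeholders, not the values used in the example model; and the function names are invented for this sketch.

```python
import math
import random


def lognormal_params(mean, sd):
    """Convert the true mean and standard deviation of X to (mu, sigma) of ln(X)."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


def sample_kd(rng, mean, sd):
    """Draw one Log-Normal partition coefficient with the given true mean and sd."""
    mu, sigma = lognormal_params(mean, sd)
    return rng.lognormvariate(mu, sigma)


def sample_discharge_rate(rng, base_rate=250.0):
    """Draw one discharge rate (g/day), held constant for the whole realization."""
    # ASSUMPTION: triangular shape; the lesson only states min 0, most likely 1, max 4.
    multiplier = rng.triangular(0.0, 4.0, 1.0)  # arguments are (low, high, mode)
    return multiplier * base_rate


rng = random.Random(1)
# HYPOTHETICAL Kd mean and sd, chosen only to exercise the functions.
print(sample_kd(rng, mean=10.0, sd=5.0))
print(sample_discharge_rate(rng))
```

Note that if the multiplier were indeed triangular, its mean would be (0 + 1 + 4)/3, or about 1.67, so the mean discharge rate (about 417 g/day) would be noticeably higher than the most likely value of 250 g/day.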

Other than these three elements, the only other changes to the model are as follows:

• We will run a Monte Carlo simulation. Hence, the number of realizations has been changed from 1 to 500. The number of realizations required for a probabilistic simulation is discussed in Unit 12, Lesson 8 of the Basic Course. This number depends specifically on which statistics you are interested in. We stated earlier that we were interested in computing the peak concentration in the river. For a probabilistic simulation, however, we also need to specify which statistic of this output we are interested in. The mean? The 95th percentile? We did not specifically identify this, but 500 realizations would allow us to have high confidence up to about the 98th or 99th percentile.
• Because we are interested in computing the peak concentration, we need to ensure that we run the simulation long enough to see the peak concentration appear in the stream.  Making the partition coefficients uncertain means that in some realizations their value will be higher (causing the peak to be delayed).  To ensure we run long enough to see the peak, we have extended the duration from 10 years to 20 years.
• The initial model was run with a timestep of 1 day. We noted in the last Lesson that increasing the timestep a bit does not appreciably change results (e.g., changing it to 10 days introduced a 1% error). As we shall soon see, the uncertainty introduced by simply changing the three parameters discussed above results in uncertainty in the result far in excess of 1%. Hence, we will use a 10-day timestep (which speeds up the simulation by a factor of 10).
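A rough way to see why 500 realizations support statements up to about the 98th or 99th percentile (but not much beyond) is to count how many realizations are expected to fall above a given percentile. The sketch below is a simple rule-of-thumb calculation, not the confidence-bound calculation that GoldSim itself reports, and the function names are invented for illustration.

```python
def expected_tail_count(n_realizations, percentile):
    """Expected number of realizations that fall above the given percentile (0-1)."""
    return n_realizations * (1.0 - percentile)


def prob_any_above(n_realizations, percentile):
    """Probability that at least one realization falls above the percentile."""
    return 1.0 - percentile ** n_realizations


for p in (0.95, 0.98, 0.99, 0.999):
    print(p, expected_tail_count(500, p), prob_any_above(500, p))
```

With 500 realizations, about 10 are expected to lie above the 98th percentile and about 5 above the 99th, which is enough to locate those percentiles reasonably well. Only about half a realization is expected above the 99.9th percentile, so that statistic would be poorly resolved.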

Run the model now and go to the Results Container. Click on the Log Concentrations Result element, and scroll through the realizations (using the control at the top of the window):

You will note that there is a large variation in the breakthrough curves. In some realizations breakthrough occurs early, and in others it occurs late. The magnitude also changes appreciably between realizations.

This model includes two new Result elements that were not in the original deterministic model.  The first is a Distribution Result element named Distribution of Peak.  Click on that now.  It displays the probability distribution of the output of interest, the peak concentration in the stream:

As can be seen, the peak concentration varies over two orders of magnitude, from just over 1E-05 mg/l to just over 1E-03 mg/l. The median value is about 3E-04 mg/l (recall that the deterministic value was just over 2E-04 mg/l). The probability of the peak exceeding 1E-03 mg/l is about 3%.
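Statistics like these are simple to compute once the peak from each realization is available as a list of numbers. The sketch below shows the arithmetic; the function names are invented, and the short list of peak values is a hypothetical placeholder rather than output from the example model.

```python
import statistics


def exceedance_probability(values, threshold):
    """Fraction of realizations whose value is strictly above the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


def summarize_peaks(peaks, threshold):
    """Minimum, median, maximum and exceedance probability of the peak values."""
    return {
        "min": min(peaks),
        "median": statistics.median(peaks),
        "max": max(peaks),
        "p_exceed": exceedance_probability(peaks, threshold),
    }


# HYPOTHETICAL peak concentrations (mg/l), one per realization.
example_peaks = [2e-5, 9e-5, 2.5e-4, 3e-4, 4e-4, 7e-4, 1.1e-3]
print(summarize_peaks(example_peaks, threshold=1e-3))
```

In a 500-realization run, an exceedance probability of about 3% corresponds to roughly 15 realizations above the threshold.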

There is also a Multi-Variate Result element named Sensitivity.  As discussed in Unit 11, Lesson 10 of the Basic Course, Multi-Variate Result elements allow you to analyze and compare multiple outputs when running probabilistic simulations.

One of the things they can do is carry out a number of sensitivity analyses to help determine which inputs are most responsible for the uncertainty in the output.  Click on the Sensitivity element now:

This table displays measures of the sensitivity of the Result (Peak_Concentration_Stream) to selected input variables (in this case, the three uncertain inputs). If you are interested, you can read about the details of these various measures in GoldSim Help.  For our purposes, however, it is sufficient to note the following:

• The result is positively correlated to the value of Discharge_Uncertainty. That is, as Discharge_Uncertainty increases, Peak_Concentration_Stream increases. This is to be expected, as this input simply scales the contaminant discharge rate.
• The result is negatively correlated to the values of the two Kds.  That is, as the Kds increase, Peak_Concentration_Stream decreases.  This is to be expected, as a larger Kd delays the peak and results in more spreading, leading to lower concentrations.
• The result is most sensitive to the Discharge_Uncertainty and the Sand_Kd, and less sensitive to the Sediment_Kd.  Again, this is to be expected, as the contaminant has a much greater distance to travel through the Sand than the Sediment.
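To make the idea of a correlation-based sensitivity measure concrete, the sketch below computes a rank (Spearman) correlation between each sampled input and an output. It is not a reproduction of the measures GoldSim reports, and the `toy_peak` function is a hypothetical stand-in invented for this illustration, not the contaminant transport model. It was chosen only so that the output rises with the discharge multiplier and falls with both Kds, more strongly with Sand_Kd. The triangular shape and the Log-Normal parameters are likewise assumptions.

```python
import random


def ranks(values):
    """Rank of each value (0 = smallest); assumes no ties, as for continuous samples."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def toy_peak(discharge_multiplier, sand_kd, sediment_kd):
    """HYPOTHETICAL output function; a stand-in, not the transport model."""
    return discharge_multiplier / (sand_kd * sediment_kd ** 0.5)


def run_toy_study(n=500, seed=1):
    """Sample the inputs n times and correlate each one with the toy output."""
    rng = random.Random(seed)
    mult = [rng.triangular(0.0, 4.0, 1.0) for _ in range(n)]  # assumed shape
    sand = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]   # assumed parameters
    sed = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]    # assumed parameters
    peak = [toy_peak(m, s, d) for m, s, d in zip(mult, sand, sed)]
    return {
        "Discharge_Uncertainty": rank_correlation(mult, peak),
        "Sand_Kd": rank_correlation(sand, peak),
        "Sediment_Kd": rank_correlation(sed, peak),
    }


print(run_toy_study())
```

A positive coefficient means the output tends to rise with the input, a negative one means it tends to fall, and a larger magnitude indicates a stronger influence on the spread of the output.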

So how would one use such results? If we were interested in reducing the uncertainty in our simulated result, we should spend most of our effort on reducing the uncertainty in the discharge rate (represented by Discharge_Uncertainty) and in Sand_Kd (if possible). Reducing the uncertainty in Sediment_Kd would not significantly reduce the uncertainty in our result, and hence it would not be worthwhile to spend effort doing so.

We will return to the issue of dealing with uncertainty in contaminant transport models in Unit 12, toward the end of the Course. Until then, however, this is the last that we will deal with uncertainty. Instead, we will spend the intervening Units discussing the details of building contaminant transport models (in a deterministic manner).