Lesson 2 - Key Decisions Required Before Starting to Build a Model
Before starting to use GoldSim (or any tool) to build a contaminant transport model, there are some fundamental decisions you must make. In fact, these decisions actually apply to any kind of simulation modeling application you may consider. They are of particular importance, however, for the complex task of contaminant transport modeling.
Before we discuss these, however, let’s briefly take a step back and consider what we actually mean by the term “model”. Basically, as used here, a “model” is an abstract digital representation or facsimile of a system. By definition, any such model is a simplification of reality, with the goals being to include those aspects that are assumed to be important and omit those that are considered to be nonessential, so as to derive useful predictions of system performance in an efficient manner.
In this definition, we immediately see some decisions we will need to make:
- What is the extent of the “system” we are modeling?
- Which aspects of that system are important to include in the model, and which are not?
But before we can discuss the two decisions noted above, we have an even more fundamental decision to make: Why should we build a model in the first place? That is, if we are going to build a model, we first have to specify why we are building the model. What are our objectives? What specific questions are we trying to answer with the model?
In general, although building a model will very often provide understanding and insight into system behavior, ultimately, modeling is about supporting decision-making. In most cases, we are interested in making predictions so that we can evaluate performance and compare alternatives, and thereby make decisions regarding designs and/or policies. In addition, however, we may also want to use the model for a much broader set of goals: to provide a systematic framework for organizing and evaluating the available information about the system, and to then use the model as a management tool to aid decision-making with regard to data collection and resource allocation (what should be studied, when, and in what detail?).
Based on this, you may therefore be tempted to state that your objective is to “understand the behavior of the system”. Although environmental models can in fact be very valuable for understanding a system, defining your modeling objectives in this very open-ended way will almost certainly lead to serious difficulties.
For some types of simple systems such an approach may indeed be possible. However, it is generally not viable when modeling most systems, and is certainly not the approach you should take for building a contaminant transport model. For reasons that should become clear below, a contaminant transport model designed to address such an open-ended objective (“understanding the behavior of the system”) would be very difficult to build, and the model itself would likely be poorly defined and hence difficult to interpret, use and defend.
The correct approach is to build models that are designed to answer specific quantitative questions (e.g., What is the peak concentration of a particular contaminant at a specific location? What is the mass transfer rate across a particular boundary and how does it change with time? What is the average contaminant release rate from a specific source over the first ten years? What is the peak impact on a particular human or environmental receptor?). This requires you to specify well-defined outputs that the model must calculate (e.g., peak or average concentrations at a specific location, mass transfer rates across specified boundaries). As we will discuss below, specifying the objectives of the model in this way and defining the specific outputs you need to calculate will also help you determine the appropriate level of detail in the model.
If your model is able to produce such quantitative outputs, you will find that you subsequently will be able to answer more qualitative questions about the general behavior of the system and use the model as a management tool.
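To make the idea of “well-defined outputs” concrete, the following is a minimal Python sketch (not GoldSim functionality) of how two such outputs, a peak concentration and a time-averaged concentration, could be computed from a model's time history. The function names and the sample data are hypothetical and are used for illustration only.

```python
def peak(times, values):
    """Return (time_of_peak, peak_value) for a sampled time series."""
    i = max(range(len(values)), key=lambda j: values[j])
    return times[i], values[i]


def time_average(times, values, t_start, t_end):
    """Trapezoidal time-average of a sampled series over [t_start, t_end].

    Assumes t_start and t_end coincide with sample times.
    """
    area = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 >= t_start and t1 <= t_end:
            area += 0.5 * (values[i] + values[i + 1]) * (t1 - t0)
    return area / (t_end - t_start)


# Hypothetical model output: concentration (mg/L) at one location, sampled yearly
years = [0, 1, 2, 3, 4, 5]
conc = [0.0, 2.0, 4.0, 3.0, 2.0, 1.0]

t_peak, c_peak = peak(years, conc)              # when and how high is the peak?
avg_first_4 = time_average(years, conc, 0, 4)   # average over the first four years
```

Each quantity here corresponds to a specific question of the kind listed above; a model that cannot produce the underlying time history at the required location and resolution cannot answer it.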
Defining the System
Once we have defined our objectives, our next step is to define the extent of the actual system we wish to model. The key point is that the “system” we are going to model is a subunit of the actual environment. That is, we are not modeling the entire world; we are just modeling a part of it (e.g., a lake, a hazardous waste site, a waste disposal site). As such, the system we are modeling has a boundary.
The system inside the boundary can change with time due to external influences and/or internal influences. External influences on the model are referred to as being exogenous. Exogenous behavior is something that comes from “outside” the boundary of the model and is not due to or explained or influenced in any way by the model itself. It is simply an external “forcing function”. For example, if you were modeling a lake, the mass input into the lake from a pipeline would be an exogenous factor. The model of the lake would not explain or impact the input from the pipeline; the mass input from the pipeline would just be an external forcing function.
Internal influences on the evolution of a model are referred to as being endogenous. Endogenous behavior is something that comes from “inside” the model and is explained by the processes within the model. That is, if a system evolves endogenously, it means that the model structure itself is causing the model variables to change with time. The changes are not being driven from something “outside” the model.
Remembering these specific terms (exogenous and endogenous) is not important. What is critical is that you understand the concept: systems change with time due to external influences and/or internal influences. Real-world systems that you will model will be controlled by a combination of endogenous and exogenous factors. A little thought will convince you that this must indeed be the case. Because the system we are simulating will always be a very small subset of the world, and this system is unlikely to be completely isolated from the rest of the world, exogenous factors will always exist. Moreover, any system worth taking the time to model would by definition have sufficient complexity to have at least some internal endogenous behavior.
In a contaminant transport model (where we are concerned about tracking contaminant mass), the exogenous forcing functions will generally be in the form of mass input rates, as well as other inputs that impact the behavior of mass in the system, such as rainfall rates or flow rates entering from outside the boundary of the model. The mass in your model can, of course, cross the boundary (i.e., leave the system). Once it does so, however, it cannot return. If it does need to return, it means that you need to extend your boundary outward.
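The lake example can be sketched in a few lines of Python. This is an illustrative sketch only, assuming a single well-mixed lake, a constant pipeline input, and a simple explicit Euler timestep; all names and parameter values are hypothetical, and this is not a description of how GoldSim solves the equations. The pipeline input is exogenous (it is supplied to the model and is unaffected by it), while the loss through the outflow is endogenous (it depends on the mass currently in the lake). Mass that leaves through the outflow crosses the boundary and never returns.

```python
def simulate_lake(input_rate, outflow, volume, dt, n_steps, mass0=0.0):
    """Explicit Euler solution of dM/dt = input_rate(t) - (outflow / volume) * M.

    input_rate : function of time returning mass per time (exogenous forcing)
    outflow    : water flow rate leaving the lake (volume per time)
    volume     : lake volume (assumed constant and well-mixed)
    """
    mass = mass0
    history = [mass]
    for step in range(n_steps):
        t = step * dt
        loss = (outflow / volume) * mass       # endogenous: depends on model state
        mass += dt * (input_rate(t) - loss)    # input_rate(t) is exogenous
        history.append(mass)
    return history


def pipeline_input(t):
    """Hypothetical constant mass input from the pipeline, in kg/day."""
    return 10.0


# Hypothetical lake: 1e6 m3 volume, 1e4 m3/day outflow, simulated for 2000 days
masses = simulate_lake(pipeline_input, outflow=1.0e4, volume=1.0e6,
                       dt=1.0, n_steps=2000)

# At steady state, input equals loss: M = input_rate * volume / outflow
steady_state = 10.0 * 1.0e6 / 1.0e4
```

In this sketch the simulated mass approaches the steady-state value at which the exogenous input is balanced by the endogenous loss.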
So in all models, before we can start to build the model, we need to define the boundary of the system. In some cases, the boundary will be obvious (e.g., for physical reasons). In others, it may not be. In general, however, the boundary is selected by considering the following:
- Can I easily define my external forcing functions at the boundary I have selected?
- At which point do I no longer need to track mass as it moves outward (i.e., where does mass “leave” the system of interest)?
- Is my boundary defined such that it is a good approximation to assume that there are no feedbacks between what leaves the system and what enters it?
When considering boundaries, it is important to consider not only boundaries in space, but also boundaries in time. That is, when will the simulation start and when will it stop? The simulation start time, of course, impacts the initial conditions that you specify for the model. The simulation could start at the present time (e.g., for an existing site whose behavior you wish to predict going forward) or at some time in the future (for an engineered site that does not yet exist). In other cases, the simulation may need to start in the past. This is often required for two reasons. First, in order to make predictions about the future, if data are available to do so, you will typically want to calibrate the model by simulating what was observed in the past. Second, in some cases you may not know what the conditions are today but may know what they were at some point in the past. For example, you may know how much mass was disposed 30 years ago (but don’t know where that mass is today). In this case, you may then want to start your simulation 30 years in the past.
The end time for your simulation is a function of your modeling objectives (and the nature of the system). If you are trying to predict a peak concentration at some distance from a contaminant source, you obviously need to simulate the system for a sufficient duration to see that peak. In some models that may be days or months, while in others (e.g., radioactive waste disposal systems) it could be tens of thousands of years!
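The dependence of the required duration on the objective can be illustrated with a small Python sketch. It assumes a hypothetical system of two well-mixed compartments in series, with a pulse of mass placed in the first compartment and first-order transfer rates k1 (first to second) and k2 (out of the second); for this simple case the mass in the second compartment has a known analytical solution. The check shown is only a simple heuristic: if the largest value occurs at the final sample, the run was probably too short to see the peak.

```python
import math


def downstream_mass(t, m0=1.0, k1=0.1, k2=0.05):
    """Mass in the second of two compartments in series after a pulse m0
    in the first (valid for k1 != k2; rates in 1/yr, t in years)."""
    return m0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))


def peak_is_captured(values):
    """Heuristic: True if the largest value occurs before the final sample."""
    i_max = max(range(len(values)), key=lambda j: values[j])
    return i_max < len(values) - 1


def run(duration_years, dt=0.5):
    """Sample the downstream mass from time zero to duration_years."""
    n = int(round(duration_years / dt))
    return [downstream_mass(i * dt) for i in range(n + 1)]


# Analytical time of the peak for the default rates (about 13.9 years)
t_peak_exact = math.log(0.1 / 0.05) / (0.1 - 0.05)

short_run_ok = peak_is_captured(run(10))   # ends before the peak arrives
long_run_ok = peak_is_captured(run(50))    # long enough to see the peak
```

With these hypothetical rates, a 10-year simulation would report a “peak” that is simply the last value computed, whereas a 50-year simulation captures the true peak.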
What Processes Need to be Considered and in How Much Detail?
In the next four Lessons, we will provide an overview of the key physical and chemical processes controlling mass transport. Right now, however, we are not going to talk about specific processes, but how you actually go about determining which processes to include in your model.
The level of detail and the processes that need to be considered in your model are strongly determined by your modeling objectives and the specific outputs you need to calculate. This is one of the reasons why we emphasized the importance of defining quantitative objectives. Specifying your objective as “understanding the behavior of the system” would make it very difficult to determine which processes needed to be included and in what detail.
Even if you do have a good understanding of which processes need to be considered, you will still need to determine the level of detail in which you represent those processes. There are two aspects to consider when speaking about the level of detail:
- One simply involves how far you “drill down” into representing that process. Can you approximate the behavior with a fairly simple representation (e.g., a response surface or lookup table) or do you need to represent the process in greater detail (e.g., solving a complex set of equations)?
- The second is often (but does not have to be) related to the first. What level of discretization (resolution in both space and time) is required to represent the process? As noted in Unit 1, when mass transport equations are solved numerically, it is necessary to discretize space into discrete volumes or compartments. Mass is spread out equally throughout any particular discretized volume instantaneously (i.e., the volume is instantaneously well-mixed). Temporal discretization refers to the length of the timestep (discussed in detail in Unit 6, Lesson 3 of the Basic GoldSim Course).
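The effect of temporal discretization can be illustrated with a short Python sketch. It assumes a single well-mixed volume losing mass by first-order decay, solved with a simple explicit Euler scheme and compared against the exact solution; the scheme and the parameter values are chosen for illustration only and are not a description of GoldSim's solution method.

```python
import math


def euler_decay(k, dt, t_end, m0=1.0):
    """Explicit Euler solution of dM/dt = -k * M, returning M at t_end."""
    n_steps = int(round(t_end / dt))
    m = m0
    for _ in range(n_steps):
        m += dt * (-k * m)
    return m


k, t_end = 0.5, 10.0              # hypothetical decay rate (1/day) and duration (days)
exact = math.exp(-k * t_end)      # analytical solution for m0 = 1

# Absolute error at t_end for progressively shorter timesteps
errors = {dt: abs(euler_decay(k, dt, t_end) - exact) for dt in (1.0, 0.1, 0.01)}
```

In this sketch the error shrinks as the timestep is shortened, at the cost of more computation. Whether the coarser result is acceptable depends on the output you need to calculate.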
To illustrate how the modeling objectives strongly determine what processes need to be considered (and at what level of detail), consider the example of predicting the concentration of a contaminant in two different lakes. In one lake, the lake can be assumed to always be very well-mixed spatially, and concentrations change slowly over time (the mass in the system has lots of “inertia”). Our objective is to compute the annual average concentration over the next five years. In another lake, the lake has significant spatial variations in concentration, and these can change fairly rapidly over time. Our objective is to predict the monthly fluctuations in concentrations in different regions of the lake over the next year. The models for these two lakes would necessarily be different in several ways:
- The processes that need to be considered would be different. Both models would need to represent the loss mechanisms for the contaminant (e.g., advection out of the lake, volatilization, biological transformations). However, the second model would also need to consider the detailed hydrodynamic processes occurring within the lake (driven by wind, temperature, etc.) that produce spatial variations in the concentration.
- In order to represent the temporal and spatial variation of properties in the lake, the level of discretization (in both space and time) would need to be much finer in the second model.
- Some processes may need to be represented in more detail in the second model to properly capture aspects associated with these spatial and temporal variations in the properties of the lake. For example, temperature variations in volatilization or degradation rates might need to be included, whereas simpler models (using average properties) could be used for the first model.
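The difference in spatial discretization between the two lake models can be sketched in Python as follows. This is a deliberately simplified illustration: it assumes the lake can be represented as a chain of equal, well-mixed cells with flow passing from one cell to the next, which is far simpler than the hydrodynamic processes a real model of the second lake would need. All names and values are hypothetical.

```python
def simulate_cells(n_cells, total_volume, flow, input_rate, dt, n_steps):
    """Chain of well-mixed cells solved with explicit Euler.

    Mass enters cell 0 at input_rate (kg/day) and is advected from cell to
    cell by flow (m3/day). Returns the final concentration (kg/m3) per cell.
    """
    cell_volume = total_volume / n_cells
    mass = [0.0] * n_cells
    for _ in range(n_steps):
        conc = [m / cell_volume for m in mass]
        new_mass = mass[:]
        for i in range(n_cells):
            inflow = input_rate if i == 0 else flow * conc[i - 1]
            outflow = flow * conc[i]
            new_mass[i] += dt * (inflow - outflow)
        mass = new_mass
    return [m / cell_volume for m in mass]


# Same hypothetical lake, represented at two levels of spatial detail
params = dict(total_volume=1.0e6, flow=1.0e4, input_rate=10.0, dt=1.0, n_steps=50)
one_cell = simulate_cells(1, **params)     # a single concentration for the lake
four_cells = simulate_cells(4, **params)   # a concentration in each of four regions
```

The single-cell model can only ever report one concentration for the whole lake, which may be adequate for an annual average. The four-cell model shows concentrations that decrease with distance from the input, which is the kind of spatial variation the second objective requires.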
There is one other critical factor that needs to be accounted for when determining the appropriate level of detail you should use in your model (and it is unfortunately often completely ignored). How much uncertainty exists with regard to the input parameters and processes you are trying to model? This is such an important topic (and so greatly informs GoldSim’s entire approach to modeling contaminant transport) that we will take an entire Lesson to discuss this topic later in this Unit, and will revisit the topic in the final Unit of the Course. For the purposes of the present discussion, it is sufficient to note that generally the greater your uncertainty, the less detail you should include in your model (and in fact including more detail than is justified by the level of uncertainty would actually be inappropriate).
Although these considerations will provide initial insight into which processes need to be included in your model and the level of detail (including the level of discretization) in which they need to be represented, there is no “magic bullet” to determine what needs to be included or excluded. To a large extent, this requires experience. Most importantly, most models will evolve over time in this regard. The basic concept is that as new data are obtained about the system you are modeling (e.g., through a data collection program or research) and/or as new insights into the behavior of the system are obtained (based on preliminary model results), you will reevaluate and refine the model. You may find, for example, that your model simply cannot reproduce observed results. In that case, perhaps you need to consider some processes that you previously excluded, or you may need to model existing processes in greater detail. This “top-down” modeling process (which applies to any kind of modeling) was discussed in Unit 17, Lesson 4 of the Basic GoldSim Course.
Building simulation models in such a way that they are continuously modified in response to new information is what allows us to move beyond only providing predictions of performance; they can provide a systematic framework for organizing and evaluating the available information about the system, and can act as a management tool to aid decision-making with regard to data collection and resource allocation (what should be studied, when, and in what detail?). We will discuss this further towards the end of this Unit.
In the next four Lessons, we are going to very briefly discuss the key physical and chemical processes controlling mass transport. These are the processes we are going to learn to represent using the Contaminant Transport Module. Obviously, entire textbooks have been written describing these processes (in Unit 1, we mentioned one such text, Chemical Fate and Transport in the Environment by Hemond and Fechner). Moreover, a basic understanding of fundamental contaminant transport concepts is assumed if you are going to try to build contaminant transport models. Our goal here is to simply provide a very quick overview to review the fundamental concepts and definitions used throughout the remainder of the Course. This then will allow you to understand how GoldSim represents these processes and solves the associated equations.