VVUQ: Verification, Validation and Uncertainty Quantification in Modelling
If you work with models or simulations, it's crucial to know how reliable and accurate your results are. That’s where Verification, Validation, and Uncertainty Quantification (VVUQ) come in. These methods help you build confidence in your simulations by checking that they’re working correctly, represent the real world well, and account for uncertainty. VVUQ provides a clear, structured way to ensure your predictions are trustworthy. If you plan to use the SEAVEA toolkit, understanding VVUQ will help you get the most credible and useful results from your models.
Why VVUQ matters: building trust in scientific simulations
Even experienced researchers often struggle to answer key questions about their simulations. For example:
They might not know how many variables or assumptions are really built into their models.
They may be unsure which parts of the model have the biggest effect on the results or introduce the most uncertainty.
They often can’t say clearly when the model’s outputs can be trusted — or when they can’t.
Without a clear, structured way to tackle these issues, even the most advanced simulations can be misleading. Unverified or unchecked models can lead to incorrect conclusions, wasted resources, or poor decisions.
For example:
If a climate model doesn’t show how uncertain its forecasts are, it could mislead policy decisions about the environment.
In drug discovery, relying on a model that misjudges a molecule’s effectiveness could mean ignoring promising treatments or spending money on ones that won’t work.
This is where Verification, Validation, and Uncertainty Quantification (VVUQ) come in. VVUQ helps you check whether your model works as intended, whether it reflects reality, and how reliable its predictions are. It moves you beyond simply getting a result to understanding how much you can trust that result.
It enforces transparent and reproducible science:
VVUQ makes you clearly explain your assumptions, data quality, and methods. This makes it easier for others to understand, trust, and reproduce your results.
You can understand risk better:
VVUQ shows the range of possible outcomes and how likely each is, helping you judge the risks in your predictions — whether it's the chance of a flood, how a material might behave, or how a cell might respond.
Smarter use of time and resources:
It helps you focus on the most important inputs. If some don’t affect results much, you can avoid spending time and money measuring them too precisely. If others are crucial, you’ll know to investigate them further.
It might give you deeper insights and new questions:
Looking closely at how your model works can reveal unexpected relationships or missing knowledge. VVUQ often leads to new scientific ideas and sharper research questions.
Helpful for policymaking and regulation:
When simulation results are used to inform real-world decisions — in health, safety, or the environment — VVUQ helps show that the results are backed by solid evidence.
Using VVUQ means doing better science. It helps you trust your model’s results, and explain that trust to others.
In-depth discussion of VVUQ
1. Verification: Ensuring the Code Does What It's Meant To
Verification is the process of determining that a computational model accurately represents the developer's conceptual description and the mathematical model. In simpler terms, it's about checking for bugs in the code and ensuring that the numerical algorithms are implemented correctly. It answers the question: "Did we build the model right?"
There are two primary aspects to verification:
Code Verification: This focuses on ensuring that the computer code correctly implements the mathematical model. Techniques often include:
Analytical Solutions: Comparing the code's output for simplified problems against known analytical solutions, where exact answers can be derived mathematically.
Method of Manufactured Solutions (MMS): This technique starts from a chosen ("manufactured") analytical solution, substitutes it into the governing partial differential equations to derive the source terms that make it an exact solution, and then adds those source terms to the code. The code's ability to reproduce the manufactured solution at the expected rate of convergence verifies its implementation of the equations.
Code-to-Code Comparisons: Running the same problem on different, independently developed codes and comparing their results. Significant discrepancies can highlight potential issues in one or both codes.
Input Data and Type Verification: Ensuring that the model correctly processes and interprets all input data, including checking for correct data types, formats, ranges, and consistency. This involves verifying that the code handles valid inputs as expected and provides appropriate feedback or error handling for invalid or unexpected inputs. This is a general software quality practice that is particularly relevant for computational models where input errors can silently propagate and lead to incorrect results.
Software Engineering Practices: Adhering to good programming practices, including rigorous code reviews, debugging, and comprehensive testing throughout the development lifecycle.
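As a minimal sketch of the analytical-solution approach, the hypothetical snippet below verifies a simple forward-Euler solver for the decay equation dy/dt = -k·y against its exact solution y(t) = y0·exp(-k·t). The solver and problem are illustrative stand-ins for a real model code; the verification pattern (error shrinking as the numerical resolution increases) is the point.

```python
import math

def euler_decay(y0, k, t_end, n_steps):
    """Forward-Euler solver for dy/dt = -k*y (stand-in for a model code)."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y += dt * (-k * y)
    return y

def exact_decay(y0, k, t):
    """Known analytical solution used as the verification reference."""
    return y0 * math.exp(-k * t)

# Verification check: the error should shrink as the step count grows
errors = []
for n in (10, 100, 1000):
    err = abs(euler_decay(1.0, 0.5, 2.0, n) - exact_decay(1.0, 0.5, 2.0))
    errors.append(err)
    print(f"n={n:5d}  error={err:.2e}")
```

If the error failed to decrease (or decreased at the wrong rate for the scheme), that would point to an implementation bug rather than a modelling problem.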
Solution Verification: This addresses the accuracy of the numerical solution for a specific calculation, ensuring that numerical errors (such as those from discretisation or iterative convergence) are sufficiently small. Common techniques include:
Grid Convergence Studies (or Mesh Refinement Studies): Systematically reducing the size of the discretisation elements (e.g., mesh cells in a CFD simulation) and observing how the solution changes. If the solution converges to a stable value as the grid is refined, it indicates that discretisation errors are being reduced and the solution is approaching the "grid-independent" answer.
Time-Step Convergence Studies: Similar to grid convergence, but applied to the size of the time steps in transient simulations.
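A convergence study can go one step further and estimate the observed order of accuracy: halve the discretisation size and check that the error falls at the rate the scheme promises. The hypothetical sketch below does this for a second-order central-difference derivative, where halving the step should cut the error by roughly a factor of four.

```python
import math

def central_diff(f, x, h):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

exact = math.cos(1.0)  # exact derivative of sin(x) at x = 1

# Halve the step size and estimate the observed order of accuracy
e_h  = abs(central_diff(math.sin, 1.0, 0.1)  - exact)
e_h2 = abs(central_diff(math.sin, 1.0, 0.05) - exact)
observed_order = math.log2(e_h / e_h2)
print(f"observed order ≈ {observed_order:.2f}")  # should be close to 2
```

The same idea applies to mesh cells or time steps in a full simulation: if the observed order does not match the theoretical order of the scheme, the numerical solution has not yet reached its asymptotic range, or there is an implementation error.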
Without proper verification, even a perfectly conceived mathematical model can yield incorrect results due to errors in its implementation.
2. Validation: Confirming the Model Reflects Reality
Validation is the process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model. This is about answering: "Did we build the right model?"
Validation involves comparing the model's predictions against empirical data obtained from physical experiments, field observations, or historical records. Key aspects include:
Experimental Data Comparison: This is the cornerstone of validation. Model outputs (e.g., temperatures, stresses, flow velocities) are compared directly against measurements from carefully designed experiments.
Calibration: Sometimes, model parameters may be adjusted within physically plausible ranges to improve agreement with experimental data. However, it's important that calibration is not used to "force" agreement in a way that sacrifices the model's predictive capability beyond the calibrated conditions.
Discrepancy Assessment: Quantifying the differences between model predictions and experimental data. This can involve statistical metrics and visual comparisons.
Hierarchical Validation: Beginning with simpler, fundamental experiments that isolate specific physical phenomena, and progressively moving towards more complex, integrated system-level tests. This allows for a modular approach to understanding where model discrepancies might arise.
Face Validity: Having domain experts review the model's inputs, outputs, and overall behaviour to determine if it appears to be a reasonable representation of the real system, even before rigorous quantitative comparisons.
Assumptions Validation: Critically examining the assumptions made during model development (e.g., simplified geometries, ideal material properties, neglected phenomena) and, where possible, conducting targeted experiments or analyses to determine their impact and validity.
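Discrepancy assessment in particular lends itself to a concrete sketch. Assuming hypothetical paired model predictions and measurements, a few standard statistical metrics (RMSE, maximum error, relative error) summarise how far the model is from reality:

```python
import math

# Hypothetical model predictions and experimental measurements
# taken under the same conditions (illustrative values only)
predicted = [10.2, 11.8, 13.1, 14.9, 16.4]
measured  = [10.0, 12.1, 13.0, 15.3, 16.1]

residuals = [p - m for p, m in zip(predicted, measured)]
rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
max_err = max(abs(r) for r in residuals)
# Relative error against the scale of the measurements
rel_rmse = rmse / (sum(measured) / len(measured))
print(f"RMSE={rmse:.3f}  max error={max_err:.3f}  relative RMSE={rel_rmse:.1%}")
```

Whether a given RMSE is "good enough" depends on the intended use of the model and on the measurement uncertainty of the experimental data itself.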
It's important to remember that a model is rarely "perfectly" validated; rather, it is validated to a certain "degree of accuracy" for a specific "intended use." The level of validation required depends heavily on the consequences of incorrect predictions.
3. Uncertainty Quantification (UQ): Gauging Confidence in Predictions
Uncertainty Quantification is the science of characterising, quantifying, and propagating uncertainties in model inputs, parameters, and model form, to assess their impact on model outputs and, ultimately, the confidence in predictions. No model is perfect, and every real-world system involves some level of variability and incomplete knowledge. UQ helps us understand not just what the model predicts, but how certain that prediction is.
Sources of uncertainty typically fall into two categories:
Aleatory Uncertainty (Irreducible Randomness): This is inherent variability in a system that cannot be reduced, even with more data or knowledge. Examples include manufacturing tolerances, variations in environmental conditions, or truly random phenomena. This type of uncertainty is typically described using probability distributions.
Epistemic Uncertainty (Lack of Knowledge): This arises from a lack of knowledge about the system, its inputs, or the model itself. Examples include unknown material properties, unmeasured initial conditions, or simplifications made in the model's mathematical representation. Epistemic uncertainty can often be reduced by collecting more data, refining the model, or improving measurement techniques.
Key UQ activities include:
Input Uncertainty Characterisation: Defining the probability distributions or ranges for uncertain input parameters and boundary conditions.
Uncertainty Propagation: Using various computational methods to determine how these input uncertainties translate into uncertainty in the model's outputs. Common methods include:
Monte Carlo Simulation: Repeatedly running the model with randomly sampled input values (drawn from their defined distributions) to generate a distribution of output results.
Polynomial Chaos Expansion (PCE): Building a surrogate model that represents the output as a polynomial function of the uncertain inputs, allowing for more efficient uncertainty propagation.
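The Monte Carlo approach can be sketched in a few lines. Assuming a hypothetical model (a deflection that scales as load divided by stiffness) with normally distributed inputs, sampling the inputs many times yields a distribution of outputs from which a mean and a 95% interval can be read off:

```python
import random
import statistics

random.seed(42)  # fixed seed for reproducible sampling

def beam_deflection(load, stiffness):
    """Hypothetical model: deflection proportional to load / stiffness."""
    return load / stiffness

# Input uncertainties described as probability distributions
n_samples = 20_000
outputs = []
for _ in range(n_samples):
    load = random.gauss(100.0, 10.0)      # mean 100, standard deviation 10
    stiffness = random.gauss(50.0, 2.0)   # mean 50, standard deviation 2
    outputs.append(beam_deflection(load, stiffness))

# Empirical mean and 95% interval from the sampled output distribution
outputs.sort()
mean = statistics.fmean(outputs)
lo, hi = outputs[int(0.025 * n_samples)], outputs[int(0.975 * n_samples)]
print(f"mean ≈ {mean:.2f}, 95% interval ≈ [{lo:.2f}, {hi:.2f}]")
```

For expensive simulations each "model run" may take hours, which is exactly why surrogate models and HPC ensemble tooling (discussed below) matter for UQ in practice.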
Sensitivity Analysis: This is an important component of Uncertainty Quantification, but also has applications in Verification and Validation. Sensitivity analysis systematically investigates how the variation in the output of a model can be attributed to different sources of variation in its inputs. It helps identify which input parameters or assumptions have the most significant influence on the model's predictions.
There are generally two types of sensitivity analysis:
Local Sensitivity Analysis (One-at-a-Time, OAT): This involves changing one input parameter at a time while holding all others constant and observing the effect on the output. While simple to implement and interpret, it doesn't account for interactions between parameters.
Global Sensitivity Analysis (GSA): This considers the simultaneous variation of all input parameters across their entire range or distribution. GSA methods aim to quantify how much each input, or combination of inputs, contributes to the overall output uncertainty. Popular GSA techniques include:
Variance-Based Methods (e.g., Sobol' Indices): These decompose the total variance of the output into contributions from individual inputs and their interactions. They provide a comprehensive understanding of input importance.
Regression-Based Methods: Fitting a regression model to the input-output relationship to determine the coefficients that indicate the sensitivity of the output to each input.
Derivative-Based Methods: Calculating partial derivatives of the output with respect to each input to understand the local rate of change.
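The one-at-a-time approach described above can be sketched directly. Assuming a hypothetical three-input model, each input is perturbed by 10% around its nominal value in turn and the resulting output change recorded, ranking the inputs by influence:

```python
def model(params):
    """Hypothetical model with three uncertain inputs."""
    a, b, c = params["a"], params["b"], params["c"]
    return a ** 2 + 0.5 * b + 0.01 * c

nominal = {"a": 2.0, "b": 3.0, "c": 4.0}
base = model(nominal)

# One-at-a-time: perturb each input by 10% and record the output change
sensitivities = {}
for name, value in nominal.items():
    perturbed = dict(nominal)
    perturbed[name] = value * 1.1
    sensitivities[name] = model(perturbed) - base

# Rank inputs from most to least influential
for name, delta in sorted(sensitivities.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: output change = {delta:+.4f}")
```

Note the limitation mentioned above: because only one input moves at a time, this sketch cannot detect interactions between inputs; variance-based global methods such as Sobol' indices are needed for that.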
The insights gained from sensitivity analysis are invaluable:
Prioritising Data Collection: It helps identify which input parameters, if measured more precisely, would most significantly reduce the uncertainty in the model output. This guides experimental design and data acquisition efforts.
Model Simplification: If an input parameter is found to have negligible impact on the output, it might be simplified or even fixed, reducing model complexity and computational cost.
Understanding Model Behaviour: It provides a deeper understanding of the underlying relationships within the model, revealing which mechanisms are driving the predictions.
Robustness Assessment: It can help identify if the model's conclusions are robust to variations in uncertain inputs.
Quantifying Predictive Uncertainty: Expressing the confidence in model predictions, often through prediction intervals or probabilistic statements about outcomes.
Model Calibration and Parameter Estimation with Uncertainty: Adjusting model parameters to match experimental data while accounting for uncertainties in both the measurements and the model itself, often using Bayesian inference techniques.
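As a minimal illustration of Bayesian calibration, the sketch below estimates a single parameter k of a hypothetical linear model y = k·x from noisy observations, using a flat prior evaluated on a grid and a Gaussian measurement-noise likelihood. All data values and the noise level are assumptions for the example; real calibrations typically use sampling methods (e.g. MCMC) rather than a grid.

```python
import math

# Hypothetical noisy observations of y = k * x with k unknown
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
noise_std = 0.2  # assumed measurement noise (standard deviation)

def log_likelihood(k):
    """Gaussian measurement-noise likelihood for candidate parameter k."""
    return sum(-0.5 * ((y - k * x) / noise_std) ** 2 for x, y in zip(xs, ys))

# Flat prior on a grid of candidate k values in [1.5, 2.5]
grid = [1.5 + 0.001 * i for i in range(1001)]
weights = [math.exp(log_likelihood(k)) for k in grid]
total = sum(weights)
posterior_mean = sum(k * w for k, w in zip(grid, weights)) / total
print(f"posterior mean for k ≈ {posterior_mean:.3f}")
```

The output of such a calibration is not a single "best" parameter but a posterior distribution, whose spread feeds directly back into the predictive uncertainty of the calibrated model.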
By integrating UQ, modellers can provide decision-makers with a comprehensive understanding of the risks associated with model predictions, moving beyond single, deterministic answers to a more realistic spectrum of possibilities.
The Role of SEAVEA: A Comprehensive Toolkit for VVUQ
The SEAVEA toolkit is designed as a comprehensive platform to facilitate and streamline the complex processes of Verification, Validation, and Uncertainty Quantification. Rather than being a single, monolithic piece of software, SEAVEA integrates various specialised components, each tailored to address specific aspects of VVUQ, providing a structured workflow for assessing and enhancing model trustworthiness.
Here's a look at the key components of the SEAVEA toolkit and how they support your modelling efforts:
EasyVVUQ: This is a Python library specifically designed for Uncertainty Quantification (UQ). It provides functionalities for tasks such as parameter sampling (generating varied sets of inputs for UQ studies) and statistical post-processing of simulation results to analyse the propagation of uncertainty.
FabSim3: A command-line tool that plays a significant role in managing High-Performance Computing (HPC) workflows. It automates remote job submission, which is particularly useful for running the large ensembles of simulations often required for UQ and sensitivity analyses.
QCG-PilotJob: This component acts as a lightweight pilot job manager. It efficiently runs ensembles of tasks, reducing overhead for individual simulation runs on HPC systems, which is beneficial for the numerous computations needed for UQ.
MUSCLE3: This framework is employed for coupling multiple models to create multiscale simulations. For complex systems, validation often requires integrating different models, and MUSCLE3 facilitates this, allowing for more comprehensive system-level validation studies.
EasySurrogate: This component provides tools for building surrogate models and emulators. These are computationally inexpensive approximations of complex simulation models, including support for multi-output Gaussian processes. Surrogates are highly valuable for UQ and sensitivity analysis as they allow for many more "runs" than a full simulation model.
MOGP Emulator: A Python package specifically for fitting Gaussian Process Emulators to results obtained from computer simulations. This directly supports the creation of accurate surrogate models, which are then used in UQ to efficiently explore the parameter space and quantify uncertainties.
mUQSA: This is a web-based Graphical User Interface (GUI) for Multipurpose Uncertainty Quantification and Sensitivity Analysis. It automates and streamlines UQ computations on HPC systems, making the toolkit more accessible and user-friendly for complex analyses.
RADICAL-Pilot: As a pilot-job system, RADICAL-Pilot enables the scalable and flexible execution of a large number of diverse computational tasks on HPC platforms. It further reduces overhead by bypassing the batch system for individual tasks, enhancing the efficiency of large-scale UQ campaigns.
By integrating these specialised components, the SEAVEA toolkit helps users systematically apply VVUQ principles, from setting up rigorous verification tests to performing complex uncertainty propagation and sensitivity analyses efficiently on HPC resources. Engaging with SEAVEA means you're not just running simulations; you're building a foundation of credibility and confidence in your models, allowing you to make more reliable and informed decisions for your applications. Understanding VVUQ is the first important step towards maximising the utility and trustworthiness of your computational work with SEAVEA.