How to include a new emulator

In this tutorial, we describe how to add a new emulator to the surmise framework. We illustrate this with PCGP, an emulator method located in the directory \emulationmethods.

In surmise, all emulator methods inherit from the base class surmise.emulation.emulator. An emulator object calls the method chosen by the user and fits the corresponding emulator. surmise.emulation.emulator.fit() and surmise.emulation.emulator.predict() are the main surmise.emulation.emulator class methods. The base class also provides the functionality of updating and manipulating the fitted emulator through the surmise.emulation.emulator.supplement(), surmise.emulation.emulator.update(), and surmise.emulation.emulator.remove() class methods.

To use the functionality of the base class surmise.emulation.emulator, a new emulation method (for example, PCGP) must supply functions that fall into two categories.

Mandatory functions

fit() and predict() are the two obligatory functions for an emulation method. fit() takes the inputs \(\mathbf{X}\), the parameters \(\theta\), and the function evaluations \(\mathbf{f}\), where \(\mathbf{X}\in\mathbb{R}^{N\times p}\), \(\theta\in\mathbb{R}^{M\times d}\), and \(\mathbf{f}\in\mathbb{R}^{N\times M}\). In other words, each column in \(\mathbf{f}\) should correspond to a row in \(\theta\), and each row in \(\mathbf{f}\) should correspond to a row in \(\mathbf{X}\). In addition, the dictionary fitinfo is passed to the fit() function, which places the fitting information in it once fitting is complete. This dictionary is used to keep the information that predict() needs later.
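The shape convention above can be checked with a small NumPy sketch. The simulator `toy_simulator` and the sizes chosen here are hypothetical and serve only to show how rows and columns of f line up with x and theta.

```python
import numpy as np

N, p = 5, 2   # number of input locations and input dimension
M, d = 4, 3   # number of parameter settings and parameter dimension

rng = np.random.default_rng(0)
x = rng.uniform(size=(N, p))       # one row per input location
theta = rng.uniform(size=(M, d))   # one row per parameter setting


def toy_simulator(x_row, theta_row):
    """Hypothetical simulator: any scalar function of one input and one parameter."""
    return x_row.sum() * theta_row.sum()


# Row i of f belongs to x[i]; column j of f belongs to theta[j].
f = np.array([[toy_simulator(xi, tj) for tj in theta] for xi in x])

assert f.shape == (x.shape[0], theta.shape[0])   # (N, M)
```

Arrays built this way satisfy the row and column correspondence that fit() expects.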

PCGP.fit() is given below as an illustration:

PCGP.fit(fitinfo, x, theta, f, epsilon=0.1, **kwargs)

The purpose of fit is to take the training information and store all fitting results in fitinfo, which is a Python dictionary.

Note

This is an application of the method proposed by Higdon et al., 2008. The idea is to use PCA to project the original simulator outputs onto a lower-dimensional space spanned by an orthogonal basis. The main steps are

1. Standardize f.
2. Compute the SVD of f, and get the PCs.
3. Project the original centred data into the orthonormal space to obtain the matrix of coefficients (say we use r PCs).
4. Build r separate and independent GPs from the input space.
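Steps 1 to 3 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the PCGP source: the helper name `pca_project`, the choice to standardize across parameter settings, and the fixed number of components `r` are all hypothetical.

```python
import numpy as np


def pca_project(f, r):
    """Standardize f (shape N x M), compute its SVD, and project onto r PCs.

    Hypothetical helper; returns the offset, scale, basis, and coefficients.
    """
    runs = f.T                                   # (M, N): one row per parameter setting
    offset = runs.mean(axis=0)
    scale = runs.std(axis=0)
    scale[scale == 0] = 1.0                      # avoid division by zero for constant outputs
    fs = (runs - offset) / scale                 # step 1: standardized outputs
    _, _, vt = np.linalg.svd(fs, full_matrices=False)   # step 2: SVD
    basis = vt[:r].T                             # (N, r) with orthonormal columns
    coeffs = fs @ basis                          # step 3: (M, r) matrix of coefficients
    return offset, scale, basis, coeffs


rng = np.random.default_rng(0)
f = rng.normal(size=(6, 8))                      # N = 6 input locations, M = 8 parameter settings
offset, scale, basis, coeffs = pca_project(f, r=2)
print(basis.shape, coeffs.shape)                 # (6, 2) (8, 2)
```

In this sketch, each column of `coeffs` would serve as the training target for one of the r independent GPs in step 4.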

Parameters:

fitinfo (dict) – A dictionary that holds the emulation fitting information once fitting is complete. The dictionary is passed by reference, so fit() returns None.
x (numpy.ndarray) – An array of inputs. Each row should correspond to a row in f.
theta (numpy.ndarray) – An array of parameters. Each row should correspond to a column in f.
f (numpy.ndarray) – An array of responses. Each column in f should correspond to a row in theta. Each row in f should correspond to a row in x.
args (dict, optional) – A dictionary containing options. The default is None.

Return type:

None.

Once the base class surmise.emulation.emulator is initialized, the surmise.emulation.emulator.fit() method calls the fit() function of the developer’s emulation method and places all fitting information into the dictionary fitinfo.

PCGP.predict(predinfo, fitinfo, x, theta, computecov=True, **kwargs)

Parameters:

predinfo (dict) – An arbitrary dictionary where you should place all of your prediction information once complete.
- predinfo['mean'] : mean prediction.
- predinfo['cov'] : variance of the prediction.
x (numpy.ndarray) – An array of inputs. Each row should correspond to a row in f.
theta (numpy.ndarray) – An array of parameters. Each row should correspond to a column in f.
f (numpy.ndarray) – An array of responses. Each column in f should correspond to a row in theta. Each row in f should correspond to a row in x.
args (dict, optional) – A dictionary containing options. The default is None.

Return type:

Prediction mean and variance at theta and x given the dictionary fitinfo.
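To make the two mandatory functions concrete, here is a minimal sketch of a toy emulation method that follows the fit(fitinfo, x, theta, f, **kwargs) and predict(predinfo, fitinfo, x, theta, **kwargs) signatures shown above. It is not PCGP: it uses a nearest-neighbour lookup in theta, assumes prediction is requested at the training inputs x, and stores its results under the keys 'mean' and 'var', which are choices made for this illustration.

```python
import numpy as np


def fit(fitinfo, x, theta, f, **kwargs):
    """Store everything predict() needs in fitinfo; returns None."""
    fitinfo['x'] = np.asarray(x)
    fitinfo['theta'] = np.asarray(theta)
    fitinfo['f'] = np.asarray(f)


def predict(predinfo, fitinfo, x, theta, **kwargs):
    """Toy nearest-neighbour prediction in theta (assumes x equals the training x)."""
    theta = np.atleast_2d(theta)
    # Squared distances: rows are new parameters, columns are training parameters.
    diff = theta[:, None, :] - fitinfo['theta'][None, :, :]
    dist = (diff ** 2).sum(axis=2)
    nearest = dist.argmin(axis=1)
    predinfo['mean'] = fitinfo['f'][:, nearest]          # shape (N, number of new theta)
    predinfo['var'] = np.zeros_like(predinfo['mean'])    # placeholder: no uncertainty model


fitinfo, predinfo = {}, {}
x = np.array([[0.0], [1.0]])
theta = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
f = np.array([[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]])

fit(fitinfo, x, theta, f)
predict(predinfo, fitinfo, x, np.array([[0.9, 1.2]]))
print(predinfo['mean'])
```

Both functions communicate only through the dictionaries they receive, which is why neither needs to return a value.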

surmise.emulation.emulator.predict() method returns a prediction class object, which has methods surmise.emulation.prediction.mean(), surmise.emulation.prediction.mean_gradtheta(), surmise.emulation.prediction.var(), surmise.emulation.prediction.covx(), surmise.emulation.prediction.covxhalf(), surmise.emulation.prediction.covxhalf_gradtheta(), surmise.emulation.prediction.rnd(), and surmise.emulation.prediction.lpdf().

These methods are defined within the base class to simplify the usage of the fitted models. To use them, the developer of an emulation method should either include the functions predictmean(), predictmean_gradtheta(), predictvar(), predictcovx(), predictcovxhalf(), predictcovxhalf_gradtheta(), predictrnd(), and predictlpdf() in the method, or store the corresponding quantities in the dictionary predinfo under the keys mean, mean_gradtheta, var, covx, covxhalf, covxhalf_gradtheta, rnd, and lpdf.

Optional functions

supplementtheta() is an optional function for an emulation method.