How to include a new calibrator

In this tutorial, we describe how to add a new calibrator to the surmise framework. We illustrate this with directbayeswoodbury, a calibrator method located in the directory \calibrationmethods.

In surmise, all calibrator methods inherit from the base class surmise.calibration.calibrator. A calibrator class calls the method chosen by the user and fits the corresponding calibrator. surmise.calibration.calibrator.fit() is the main method of the surmise.calibration.calibrator class. The class also provides functionality for using and manipulating the fitted calibrator through the surmise.calibration.calibrator.predict() method.

To use the functionality of the base class surmise.calibration.calibrator, we group the functions to be included in a new calibration method (for example, directbayeswoodbury) into two categories.

Mandatory functions

fit() is the only obligatory function for a calibration method. fit() takes the dictionary fitinfo, the fitted emulator class object surmise.emulation.emulator, inputs \(\mathbf{X}\), and observed values \(\mathbf{y}\), where \(\mathbf{X}\in\mathbb{R}^{N\times p}\) and \(\mathbf{y}\in\mathbb{R}^{N\times 1}\). The fitting information is placed in fitinfo once fitting is complete, so this dictionary keeps the information that will be used by predict() below.

The directbayeswoodbury.fit() is given below for illustration:

directbayeswoodbury.fit(fitinfo, emu, x, y, **bayeswoodbury_args)

The main required function to be called by calibration to fit a calibration model.

Note

This approach performs Bayesian posterior sampling through the following steps:

  1. Take the emulator to approximate the computer model simulations

  2. Obtain the emulator predictive mean values at a given theta and x

  3. Calculate the residuals between emulator predictions and observed data

  4. Provide the log-posterior as the sum of log-prior and log-likelihood

  5. Use Monte Carlo or nested sampling method to sample the posterior
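The five steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the directbayeswoodbury implementation: toy_emulator_mean, log_prior, log_posterior, and metropolis are hypothetical names, the emulator is replaced by a toy linear function, observation errors are assumed to be independent Gaussian, and step 5 uses a plain random-walk Metropolis sampler as a simple stand-in for the sampler.

```python
import numpy as np


def toy_emulator_mean(x, theta):
    """Stand-in for an emulator's predictive mean (hypothetical, not surmise API).

    Returns an array of shape (len(x), number of theta rows).
    """
    return np.outer(x[:, 0], theta[:, 0])


def log_prior(theta):
    """Standard normal log-density for each row of theta, up to a constant."""
    return -0.5 * np.sum(theta ** 2, axis=1)


def log_posterior(theta, x, y, obsvar):
    mean = toy_emulator_mean(x, theta)            # steps 1 and 2
    resid = y[:, None] - mean                     # step 3
    loglik = -0.5 * np.sum(resid ** 2 / obsvar[:, None], axis=0)
    return log_prior(theta) + loglik              # step 4


def metropolis(x, y, obsvar, n_draws=2000, step=0.2, seed=0):
    """Step 5: random-walk Metropolis draws from the posterior."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((1, 1))
    lp = log_posterior(theta, x, y, obsvar)[0]
    draws = np.empty((n_draws, 1))
    for i in range(n_draws):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_posterior(prop, x, y, obsvar)[0]
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws[i] = theta[0]
    return draws


# Toy data generated from theta = 2 with small observation variance.
x = np.linspace(0.1, 1.0, 10)[:, None]
y = 2.0 * x[:, 0]
obsvar = np.full(10, 0.01)
draws = metropolis(x, y, obsvar)
```

After discarding an initial burn-in portion, the rows of draws play the role of the posterior sample that a calibrator would store for later use.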

Parameters:
  • fitinfo (dict) –

    An arbitrary dictionary where the fitting information is placed once complete. This dictionary is passed by reference, so there is no need to return anything. Keep only what will be used by predict below.

    Note that the following are preloaded:

    • fitinfo[‘thetaprior’].rnd(s) : Get s random draws from the prior predictive distribution on theta.

    • fitinfo[‘thetaprior’].lpdf(theta) : Get the logpdf at theta(s).

    The following are optional preloads based on user input:

    • fitinfo[‘yvar’] : The vector of observation variances at y

    In addition, calibration can directly use and communicate back to the user if you include:

    • fitinfo[‘thetamean’] : the mean of the prediction of theta.

    • fitinfo[‘thetavar’] : the predictive variance of theta.

    • fitinfo[‘thetarnd’] : a number of draws from the predictive distribution of theta.

    • fitinfo[‘lpdf’] : log of the posterior of the given theta.

  • emu (surmise.emulation.emulator) – An emulator class instance as defined in emulation.

  • x (numpy.ndarray) – An array of x that represent the inputs.

  • y (numpy.ndarray) – A one-dimensional array of observed values at x.

  • args (dict, optional) – A dictionary containing options passed. The default is None.

Return type:

None.
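To make the contract concrete, here is a minimal sketch of a fit() for a hypothetical new calibrator module. It is not the directbayeswoodbury algorithm: it draws candidates from the preloaded prior and resamples them with likelihood weights (importance resampling), which is only sensible for toy problems. ToyPrior, ToyPrediction, and ToyEmulator are stand-ins invented for this example, and the emulator interface emu.predict(x, theta).mean() returning an array of shape (len(x), number of theta rows) is an assumption.

```python
import numpy as np


def fit(fitinfo, emu, x, y, args=None):
    """Hypothetical fit() for a new calibrator module (illustrative only)."""
    rng = np.random.default_rng(0)
    # Candidate parameter values drawn from the preloaded prior.
    theta = fitinfo['thetaprior'].rnd(1000)
    # Observation variances, if supplied; unit variances are an assumption.
    obsvar = fitinfo['yvar'] if 'yvar' in fitinfo else np.ones(len(y))
    # Assumed emulator interface, see the lead-in above.
    resid = y[:, None] - emu.predict(x, theta).mean()
    loglik = -0.5 * np.sum(resid ** 2 / obsvar[:, None], axis=0)
    # Self-normalised importance weights: prior draws weighted by likelihood.
    w = np.exp(loglik - loglik.max())
    w /= w.sum()
    post = theta[rng.choice(len(w), size=len(w), p=w)]
    # Keys that calibration can use directly; nothing is returned.
    fitinfo['thetarnd'] = post
    fitinfo['thetamean'] = post.mean(axis=0)
    fitinfo['thetavar'] = post.var(axis=0)
    return None


class ToyPrior:
    """Stand-in for fitinfo['thetaprior']: uniform on [0, 4]."""

    def rnd(self, s):
        return np.random.default_rng(1).uniform(0.0, 4.0, size=(s, 1))

    def lpdf(self, theta):
        inside = (theta[:, 0] >= 0.0) & (theta[:, 0] <= 4.0)
        return np.where(inside, -np.log(4.0), -np.inf)


class ToyPrediction:
    def __init__(self, mean):
        self._mean = mean

    def mean(self):
        return self._mean


class ToyEmulator:
    """Stand-in for a fitted emulator of the model f(x, theta) = theta * x."""

    def predict(self, x, theta):
        return ToyPrediction(np.outer(x[:, 0], theta[:, 0]))


# Toy data generated from theta = 2.
x = np.linspace(0.1, 1.0, 10)[:, None]
y = 2.0 * x[:, 0]
fitinfo = {'thetaprior': ToyPrior(), 'yvar': np.full(10, 0.01)}
fit(fitinfo, ToyEmulator(), x, y)
```

Because fitinfo is modified in place, the caller sees the new keys after fit() returns even though the function returns None.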

Once the calibration method is fitted, the base surmise.calibration.calibrator assigns surmise.calibration.calibrator.theta as an attribute of the class object to communicate with the fitted method through general expressions. The attribute surmise.calibration.calibrator.theta has methods surmise.calibration.calibrator.theta.mean(), surmise.calibration.calibrator.theta.var(), surmise.calibration.calibrator.theta.rnd(), and surmise.calibration.calibrator.theta.lpdf(), which can be called once the user obtains the fitted calibrator.

Those expressions are defined within the base class to simplify the usage of the fitted models. To use those methods, calibration method developers should either include the functions thetamean(), thetavar(), thetarnd(), and/or thetalpdf() in their methods, or define the corresponding entries in the dictionary fitinfo using the keys thetamean, thetavar, thetarnd, and/or thetalpdf.

An example is the thetalpdf() function provided by directbayeswoodbury:

directbayeswoodbury.thetalpdf(fitinfo, theta, args=None)

Returns log of the posterior of the given theta.

Not required.
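A function with this signature could look like the sketch below. It assumes that fit() saved the emulator, inputs, and observations under the hypothetical keys 'emu', 'x', and 'y', that 'yvar' is present, that the emulator exposes predict(x, theta).mean() with shape (len(x), number of theta rows), and that observation errors are independent Gaussian. None of these details are taken from the directbayeswoodbury source.

```python
import numpy as np


def thetalpdf(fitinfo, theta, args=None):
    """Log-posterior, up to an additive constant, at each row of theta."""
    # Hypothetical keys assumed to have been stored by fit().
    emu, x, y = fitinfo['emu'], fitinfo['x'], fitinfo['y']
    obsvar = fitinfo['yvar']
    # Residuals between observations and emulator means, one column per theta.
    resid = y[:, None] - emu.predict(x, theta).mean()
    loglik = -0.5 * np.sum(resid ** 2 / obsvar[:, None], axis=0)
    # Log-posterior = log-prior + log-likelihood.
    return fitinfo['thetaprior'].lpdf(theta) + loglik
```

The function only reads from fitinfo, which is why fit() needs to keep everything that later calls depend on.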

Optional functions

directbayeswoodbury.predict(predinfo, fitinfo, emu, x, args=None)

Finds prediction at x given the emulator and dictionary fitinfo.

Parameters:
  • predinfo (dict) –

    An arbitrary dictionary where the prediction information is placed once complete. Key elements:

    • predinfo[‘mean’] : the mean of the prediction

    • predinfo[‘var’] : the variance of the prediction

    • predinfo[‘rand’] : random draws from the predictive distribution of theta.

  • fitinfo (dict) – A dictionary including the calibration fitting information once complete.

  • emu (surmise.emulation.emulator) – An emulator class instance as defined in emulation.

  • x (numpy.ndarray) – An array of x values where the prediction occurs.

  • args (dict, optional) – A dictionary containing options. The default is None.

Return type:

None.
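A predict() following this signature can be sketched as below. The sketch assumes that fit() stored posterior draws under fitinfo['thetarnd'] and that the emulator exposes predict(x, theta).mean() with shape (len(x), number of theta rows). Storing one row of emulator means per posterior draw under 'rand' is one possible reading of the key description above, not a statement about what directbayeswoodbury does.

```python
import numpy as np


def predict(predinfo, fitinfo, emu, x, args=None):
    """Hypothetical predict() that fills predinfo in place."""
    theta = fitinfo['thetarnd']                 # posterior draws saved by fit()
    # Emulator mean at x for every draw, transposed to (n_draws, len(x)).
    draws = emu.predict(x, theta).mean().T
    predinfo['rand'] = draws
    predinfo['mean'] = draws.mean(axis=0)       # pointwise predictive mean
    predinfo['var'] = draws.var(axis=0)         # pointwise predictive variance
    return None
```

As with fit(), the results are communicated through the dictionary, so the function returns None.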