Sampler API

The pycbc.inference.sampler module is the interface between pycbc_inference and the sampling engines, such as emcee. Below, we provide an overview of the general structure of the sampler classes, how they interact with pycbc_inference, and how to add support for new samplers. We also provide inheritance diagrams for all of the currently supported samplers.

Overview & Guidelines

The following guidelines apply to the sampler classes:

  1. All sampler classes must inherit from the BaseSampler class. This is an abstract base class that defines methods that all samplers must implement, as they will be used by pycbc_inference. (See this tutorial for a primer on abstract base classes.)

  2. All sampler classes must have a name attribute that is unique across all sampler classes in pycbc.inference.sampler. This name is used to reference the sampler throughout the code, and is how the user specifies which sampler to use in their config file. For example, EmceeEnsembleSampler’s name is 'emcee'.

  3. Duplicate code should be avoided. If multiple samplers have common methods, those methods should be added to one or more support classes that the samplers can inherit from, in addition to BaseSampler. For example, all MCMC samplers need to be able to compute an autocorrelation length. That functionality is provided for single-temperature MCMCs in the MCMCAutocorrSupport class. These support classes may themselves be abstract base classes which add more required methods to samplers that inherit from them.

  4. Inheritance is kept to one level. For example, if we have sampler class Foo(Bar, BaseSampler), neither Bar nor BaseSampler inherits from any parent class other than object.

  5. To avoid confusion, only inherited abstract methods should be overridden.

  6. All sampler classes need a corresponding class in the pycbc.inference.io module for handling reading and writing. See Inference IO for more details on IO classes.
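The first two guidelines can be illustrated with a minimal sketch. The classes below (ToyBaseSampler, ToySamplerA, ToySamplerB) are hypothetical stand-ins written for this example; they are not the real pycbc classes, and the real BaseSampler requires more methods than the single run shown here.

```python
from abc import ABCMeta, abstractmethod


class ToyBaseSampler(metaclass=ABCMeta):
    """Hypothetical stand-in for an abstract base sampler."""
    name = None  # each concrete sampler sets a unique string here

    @abstractmethod
    def run(self):
        """Every concrete sampler must implement this."""


class ToySamplerA(ToyBaseSampler):
    name = 'toy_a'

    def run(self):
        return 'ran ' + self.name


class ToySamplerB(ToyBaseSampler):
    name = 'toy_b'

    def run(self):
        return 'ran ' + self.name


# The abstract base cannot be instantiated, because run is not implemented.
try:
    ToyBaseSampler()
    base_instantiated = True
except TypeError:
    base_instantiated = False

# Guideline 2: names must be unique across all sampler classes.
names = [cls.name for cls in (ToySamplerA, ToySamplerB)]
names_unique = len(set(names)) == len(names)
```

Using metaclass=ABCMeta rather than subclassing abc.ABC keeps the toy base class inheriting only from object, in the spirit of guideline 4.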

As mentioned above, the BaseSampler class is an abstract base class. It defines a collection of abstract methods and properties that all samplers must override in order to function properly. These are (click on the names to see their documentation):

Detailed example

Let’s examine the EmceeEnsembleSampler class to see how these guidelines apply in practice. Here is its inheritance structure (click on the names of the classes to see their documentation):

Inheritance diagram of pycbc.inference.sampler.emcee

In addition to BaseSampler, EmceeEnsembleSampler inherits from BaseMCMC and MCMCAutocorrSupport. Inspecting BaseMCMC, we see that it implements several of the methods that BaseSampler requires: namely, set_initial_conditions, run, and checkpoint. This is because the steps taken in these functions are common across MCMC samplers. For example, in run, the sampler is run for blocks of iterations (specified by checkpoint_interval) until the convergence criteria have been met (which are determined by set_target). This, generally, is what all MCMC samplers do.

How an MCMC sampler is run for some number of iterations is unique to each sampling engine. To accommodate this, BaseMCMC adds BaseMCMC.run_mcmc, which it calls from within its run. BaseMCMC.run_mcmc is an abstract method – BaseMCMC is itself an abstract base class. Since EmceeEnsembleSampler inherits from BaseSampler, followed by BaseMCMC (see note), BaseMCMC fulfils BaseSampler’s requirement that a run method be implemented, but replaces it with the requirement that the class define a run_mcmc method. As a result, EmceeEnsembleSampler has its own run_mcmc; this is where the call to the underlying sampling engine (external to pycbc) is made.
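The pattern described above, in which a shared run loop defers to an engine-specific run_mcmc, can be sketched with toy classes. Everything below is hypothetical and simplified: the Toy* classes are not the pycbc classes, the constructor arguments are invented for the example, and a fixed iteration target stands in for the real convergence criteria.

```python
from abc import ABCMeta, abstractmethod


class ToyBaseSampler(metaclass=ABCMeta):
    """Toy stand-in for the base API: requires a run method."""

    @abstractmethod
    def run(self):
        """Run the sampler."""


class ToyBaseMCMC(metaclass=ABCMeta):
    """Toy stand-in that implements run once for all MCMC-style samplers."""

    def __init__(self, target_niterations, checkpoint_interval):
        self.target_niterations = target_niterations
        self.checkpoint_interval = checkpoint_interval
        self.niterations = 0
        self.ncheckpoints = 0

    def run(self):
        # Advance in blocks of iterations, checkpointing after each block.
        while self.niterations < self.target_niterations:
            remaining = self.target_niterations - self.niterations
            block = min(self.checkpoint_interval, remaining)
            self.run_mcmc(block)
            self.niterations += block
            self.checkpoint()

    def checkpoint(self):
        # A real sampler would write its state to disk here.
        self.ncheckpoints += 1

    @abstractmethod
    def run_mcmc(self, niterations):
        """Engine-specific: advance the chains by niterations."""


class ToyEngineSampler(ToyBaseMCMC, ToyBaseSampler):
    """Concrete toy sampler: only run_mcmc has to be written."""

    def __init__(self, target_niterations, checkpoint_interval):
        super().__init__(target_niterations, checkpoint_interval)
        self.blocks = []

    def run_mcmc(self, niterations):
        # A real sampler would call the external sampling engine here.
        self.blocks.append(niterations)


sampler = ToyEngineSampler(target_niterations=25, checkpoint_interval=10)
sampler.run()
```

Because ToyBaseMCMC is listed before ToyBaseSampler, its concrete run satisfies the abstract run requirement, and the only method left for the concrete class to supply is run_mcmc.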

Note

In Python, method resolution follows the order in which the parents are given in the class definition: parents are searched from left to right, so a parent listed further to the left overrides those to its right. For example, EmceeEnsembleSampler is defined as:

class EmceeEnsembleSampler(MCMCAutocorrSupport, BaseMCMC, BaseSampler):

This means that methods introduced by BaseSampler will be overridden by BaseMCMC, which in turn will be overridden by MCMCAutocorrSupport. For this reason, all sampler class definitions must have BaseSampler listed last.
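The override order can be checked directly by inspecting a class's method resolution order. The sketch below uses three hypothetical toy parents arranged the same way as in the definition above (support class first, base class last); the class and method names are invented for the example.

```python
class ToyBase:
    def who(self):
        return 'ToyBase'

    def base_only(self):
        return 'ToyBase'


class ToyMiddle:
    def who(self):
        return 'ToyMiddle'

    def middle_and_base(self):
        return 'ToyMiddle'


class ToySupport:
    def who(self):
        return 'ToySupport'


class ToySampler(ToySupport, ToyMiddle, ToyBase):
    pass


# Python searches the parents left to right, so the leftmost definition wins.
mro_names = [cls.__name__ for cls in ToySampler.__mro__]
winner = ToySampler().who()
```

Here mro_names is ['ToySampler', 'ToySupport', 'ToyMiddle', 'ToyBase', 'object'] and winner is 'ToySupport', while methods defined only on ToyBase remain reachable through inheritance.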

All MCMC samplers need to be able to compute an autocorrelation function (ACF) and autocorrelation length (ACL). These are used to determine how to thin the chains to obtain independent samples. Consequently, BaseMCMC also adds the abstract methods compute_acf and compute_acl; these are called by its checkpoint method. The MCMCAutocorrSupport class provides these functions. They are provided in a class separate from BaseMCMC because not all MCMC samplers estimate ACFs and ACLs in the same way. For example, multi-tempered samplers need to compute them separately for each temperature chain. Consequently, there is an equivalent class, MultiTemperedAutocorrSupport, that offers the same functions for multi-tempered MCMCs. This class is used by, e.g., EmceePTSampler (see its inheritance diagram, below). Because compute_acf and compute_acl are abstract methods in BaseMCMC, both single and multi-tempered MCMC samplers can inherit from BaseMCMC.
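A minimal sketch of this split, assuming toy classes and a deliberately crude ACL estimate: the shared checkpoint logic only requires that some compute_acl exists, and the single- and multi-tempered variants come from interchangeable support classes. The Toy* classes, the chain and chains attributes, and the toy_acl helper are all hypothetical; they do not reproduce the signatures or the estimator used in pycbc.

```python
from abc import ABCMeta, abstractmethod


def toy_acl(chain):
    """Crude illustrative ACL: 1 + 2 * (sum of leading positive ACF lags)."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    if var == 0:
        return 1.0
    acl = 1.0
    for lag in range(1, n):
        acf = sum((chain[i] - mean) * (chain[i + lag] - mean)
                  for i in range(n - lag)) / (n * var)
        if acf <= 0:
            break
        acl += 2 * acf
    return acl


class ToyBaseMCMC(metaclass=ABCMeta):
    """Shared logic; how the ACL is computed is left abstract."""

    @abstractmethod
    def compute_acl(self):
        """Return the ACL(s) of the stored samples."""

    def checkpoint(self):
        self.acl = self.compute_acl()
        return self.acl


class ToyAutocorrSupport:
    """Single-temperature support: one chain, one ACL."""

    def compute_acl(self):
        return toy_acl(self.chain)


class ToyMultiTemperedAutocorrSupport:
    """Multi-tempered support: one ACL per temperature chain."""

    def compute_acl(self):
        return [toy_acl(chain) for chain in self.chains]


class ToySingleTempSampler(ToyAutocorrSupport, ToyBaseMCMC):
    def __init__(self, chain):
        self.chain = chain


class ToyMultiTempSampler(ToyMultiTemperedAutocorrSupport, ToyBaseMCMC):
    def __init__(self, chains):
        self.chains = chains


single_acl = ToySingleTempSampler([1.0, -1.0, 1.0, -1.0]).checkpoint()
multi_acl = ToyMultiTempSampler(
    [[1.0, 1.0, 1.0, -1.0, -1.0, -1.0], [2.0, 2.0, 2.0]]).checkpoint()
```

Both toy samplers reuse the same checkpoint method unchanged; only the support class listed first in the class definition differs.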

We see that by separating functionality out into support classes and using multiple inheritance, we are able to provide support for all of the unique features of different samplers, while keeping the base API that pycbc_inference interacts with simple.

Inheritance diagrams

Here are inheritance diagrams for all of the currently supported samplers:

  • cpnest:

Inheritance diagram of pycbc.inference.sampler.cpnest.CPNestSampler

  • dummy:

Inheritance diagram of pycbc.inference.sampler.dummy.DummySampler

  • dynesty:

Inheritance diagram of pycbc.inference.sampler.dynesty.DynestySampler

  • emcee:

Inheritance diagram of pycbc.inference.sampler.emcee.EmceeEnsembleSampler

  • emcee_pt:

Inheritance diagram of pycbc.inference.sampler.emcee_pt.EmceePTSampler

  • epsie:

Inheritance diagram of pycbc.inference.sampler.epsie.EpsieSampler

  • multinest:

Inheritance diagram of pycbc.inference.sampler.multinest.MultinestSampler

  • nessai:

Inheritance diagram of pycbc.inference.sampler.nessai.NessaiSampler

  • ptemcee:

Inheritance diagram of pycbc.inference.sampler.ptemcee.PTEmceeSampler

  • refine:

Inheritance diagram of pycbc.inference.sampler.refine.RefineSampler

  • snowline:

Inheritance diagram of pycbc.inference.sampler.snowline.SnowlineSampler

  • ultranest:

Inheritance diagram of pycbc.inference.sampler.ultranest.UltranestSampler

How to add a sampler

To add support for a new sampler, do the following:

  1. Create a file in pycbc/inference/sampler for the new sampler’s class.

  2. Add the new class definition. The class must inherit from at least BaseSampler.

  3. Give a name attribute to the class that is unique across the supported sampler classes.

  4. Add an IO class for the sampler to the pycbc.inference.io module. Set your new class’s io attribute to point to this new class.

  5. Add any other methods you need to satisfy BaseSampler’s required methods. When doing so, try to follow the guidelines above: do not duplicate code, and try to use support classes that offer functionality that you need. If you think some of the methods will be useful for more than just your sampler, create a new support class and add those methods to it. However, if you’re unsure what is available, or you would have to make changes to the support classes that may break other samplers, just add the methods you need to your new class definition. Fixing code duplication or rearranging support classes can be done through the review process when you wish to add your new sampler to the main gwastro repository.

  6. Add the sampler to the samplers dictionary in pycbc/inference/sampler/__init__.py so that pycbc_inference is aware of it.

  7. When you’re satisfied that your new sampler works, file a pull request to get it into the main gwastro repository. Thank you for your contributions!
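The steps above can be sketched end to end with toy classes. This is an illustration of the shape of the work, not a working pycbc sampler: ToyBaseSampler, ToyNewSamplerFile, ToyNewSampler, the write_samples method, and the local samplers dictionary are all hypothetical stand-ins, and the real BaseSampler and IO classes require considerably more than is shown.

```python
from abc import ABCMeta, abstractmethod


class ToyBaseSampler(metaclass=ABCMeta):
    """Hypothetical stand-in for the abstract base sampler."""
    name = None
    io = None

    @abstractmethod
    def run(self):
        """Run the sampler and return the samples."""


class ToyNewSamplerFile:
    """Hypothetical IO class for the new sampler (step 4)."""

    def __init__(self):
        self.samples = None

    def write_samples(self, samples):
        # A real IO class would write to a results file instead.
        self.samples = list(samples)


class ToyNewSampler(ToyBaseSampler):       # step 2: inherit from the base
    name = 'toy_new'                       # step 3: unique name
    io = ToyNewSamplerFile                 # step 4: point to the IO class

    def run(self):                         # step 5: satisfy required methods
        return [0.1, 0.2, 0.3]


# Step 6: a name -> class dictionary makes the sampler discoverable by name.
samplers = {cls.name: cls for cls in (ToyNewSampler,)}

# Selecting the sampler by the name a user might put in a config file.
sampler = samplers['toy_new']()
fp = sampler.io()
fp.write_samples(sampler.run())
```

Keeping the IO class as a class attribute lets code that only knows the sampler's name find the matching reader and writer without any extra lookup table.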