Sampler API
The pycbc.inference.sampler module is the interface between pycbc_inference
and the sampling engines, such as emcee. Below, we provide an overview of the
general structure of the sampler classes, how they interact with
pycbc_inference, and how to add support for new samplers. We also provide
inheritance diagrams for all of the currently supported samplers.
Overview & Guidelines
The following guidelines apply to the sampler classes:
All sampler classes must inherit from the BaseSampler class. This is an abstract base class that defines methods that all samplers must implement, as they will be used by pycbc_inference. (See this tutorial for a primer on abstract base classes.)
All sampler classes must have a name attribute that is unique across all sampler classes in pycbc.inference.sampler. This name is used to reference the sampler throughout the code, and is how the user specifies which sampler to use in their config file. For example, EmceeEnsembleSampler’s name is 'emcee'.
Duplicate code should be avoided. If multiple samplers have common methods, those methods should be added to one or more support classes that the samplers can inherit from, in addition to BaseSampler. For example, all MCMC samplers need to be able to compute an autocorrelation length. That functionality is provided for single-temperature MCMCs in the MCMCAutocorrSupport class. These support classes may themselves be abstract base classes which add more required methods to samplers that inherit from them.
Inheritance is kept to one level. For example, if we have sampler class Foo(Bar, BaseSampler), neither Bar nor BaseSampler inherits from any parent class other than object.
To avoid confusion, only inherited abstract methods should be overridden.
All sampler classes need a corresponding class in the pycbc.inference.io module for handling reading and writing. See Inference IO for more details on IO classes.
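The guidelines above can be sketched with stand-in classes. This is a minimal illustration using only the standard library, not the PyCBC source: BaseSampler, MCMCAutocorrSupport, and the name attribute come from the text, while ToySampler, the method bodies, and the fixed ACL value are hypothetical.

```python
from abc import ABCMeta, abstractmethod


class BaseSampler(metaclass=ABCMeta):
    """Stand-in for the abstract base class (inherits only from object)."""
    name = None

    @abstractmethod
    def run(self):
        """Run the sampler."""


class MCMCAutocorrSupport:
    """Stand-in support class holding code shared by several samplers."""

    def compute_acl(self):
        # Hypothetical placeholder value; a real implementation would
        # estimate this from the chains.
        return 1


class ToySampler(MCMCAutocorrSupport, BaseSampler):
    """Hypothetical sampler: one level of inheritance, BaseSampler last."""
    name = 'toy'  # must be unique across all sampler classes

    def run(self):
        return 'running ' + self.name


sampler = ToySampler()
```

Using metaclass=ABCMeta rather than subclassing abc.ABC keeps both parents inheriting directly from object, which matches the one-level rule.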
As mentioned above, the BaseSampler class is an abstract base class. It
defines a collection of abstract methods and properties that all samplers must
override in order to function properly. These include:
set_initial_conditions
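Because these methods are abstract, Python refuses to instantiate a sampler that leaves any of them unimplemented. A minimal sketch of that behaviour, using a stand-in BaseSampler with only the one method named above (IncompleteSampler, CompleteSampler, and can_instantiate are hypothetical names):

```python
from abc import ABCMeta, abstractmethod


class BaseSampler(metaclass=ABCMeta):
    """Stand-in with a single abstract method, for illustration only."""

    @abstractmethod
    def set_initial_conditions(self):
        """Set the starting point of the sampler."""


class IncompleteSampler(BaseSampler):
    """Hypothetical sampler that forgets to override the method."""


class CompleteSampler(BaseSampler):
    """Hypothetical sampler that overrides the method."""

    def set_initial_conditions(self):
        self.start = [0.0]  # hypothetical starting position


def can_instantiate(cls):
    """Return True if cls() succeeds, False if abstract methods remain."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

The failure happens when the class is instantiated, not when it is defined, so a missing override only shows up once the sampler is created.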
Detailed example
Let’s examine the EmceeEnsembleSampler class to see how these guidelines
apply in practice, starting with its inheritance structure.
In addition to BaseSampler, EmceeEnsembleSampler inherits from BaseMCMC and
MCMCAutocorrSupport. Inspecting BaseMCMC, we see that it implements several of
the methods that BaseSampler requires: namely, set_initial_conditions, run,
and checkpoint.
This is because the steps taken in these functions are common across MCMC
samplers. For example, in run, the sampler is run for blocks of iterations
(specified by checkpoint_interval) until the convergence criteria have been
met (which is determined by set_target). This, generally, is what all MCMC
samplers do.
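The loop described above can be sketched as a plain function. This is an illustration under stated assumptions, not PyCBC's run method: run_in_blocks, advance, and target_niterations are hypothetical names, and convergence is reduced to reaching a fixed iteration count.

```python
def run_in_blocks(advance, target_niterations, checkpoint_interval):
    """Advance a sampler in blocks until the target is reached.

    `advance(n)` is a hypothetical callable that runs n iterations.
    Returns the iteration counts at which a checkpoint would occur.
    """
    niterations = 0
    checkpoints = []
    while niterations < target_niterations:
        # Clip the final block so the run stops exactly at the target
        # (a choice made for this sketch).
        block = min(checkpoint_interval, target_niterations - niterations)
        advance(block)
        niterations += block
        checkpoints.append(niterations)  # a checkpoint would be written here
    return checkpoints


steps_taken = []
checkpoints = run_in_blocks(steps_taken.append, target_niterations=25,
                            checkpoint_interval=10)
```

Here the engine is replaced by a list that records the size of each block, which makes the block structure easy to inspect.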
How an MCMC sampler is run for some number of iterations is unique to each
sampling engine. To accommodate this, BaseMCMC adds BaseMCMC.run_mcmc, which
it calls from within its run. BaseMCMC.run_mcmc is an abstract method;
BaseMCMC is itself an abstract base class. Since EmceeEnsembleSampler lists
BaseMCMC ahead of BaseSampler in its class definition (see note), BaseMCMC
fulfils BaseSampler’s requirement that a run method be implemented, but
replaces it with the requirement that the class define a run_mcmc method. As a
result, EmceeEnsembleSampler has its own run_mcmc; this is where the call to
the underlying sampling engine (external to pycbc) is made.
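The way one requirement is traded for another can be sketched with stand-in classes. The class and method names BaseSampler, BaseMCMC, run, and run_mcmc follow the text; everything else (ToyEngineSampler, the attribute values, the calls list) is hypothetical and only loosely mirrors the real classes.

```python
from abc import ABCMeta, abstractmethod


class BaseSampler(metaclass=ABCMeta):
    @abstractmethod
    def run(self):
        """Required of every sampler."""


class BaseMCMC(metaclass=ABCMeta):
    """Stand-in: implements run, but requires run_mcmc in its place."""
    checkpoint_interval = 4   # hypothetical values for the sketch
    target_niterations = 8

    def run(self):
        self.niterations = 0
        while self.niterations < self.target_niterations:
            self.run_mcmc(self.checkpoint_interval)
            self.niterations += self.checkpoint_interval

    @abstractmethod
    def run_mcmc(self, niterations):
        """Advance the chains; specific to each sampling engine."""


class ToyEngineSampler(BaseMCMC, BaseSampler):
    """Hypothetical sampler; run_mcmc is where the engine would be called."""

    def __init__(self):
        self.calls = []

    def run_mcmc(self, niterations):
        self.calls.append(niterations)  # stand-in for the external engine


sampler = ToyEngineSampler()
sampler.run()
```

ToyEngineSampler never defines run itself; it can be instantiated because BaseMCMC.run satisfies BaseSampler's abstract run, and its own run_mcmc satisfies BaseMCMC's.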
Note
In Python, the order of inheritance is determined by the order in which the
parents are given in the class definition: methods are looked up from left to
right, so a parent listed earlier overrides those listed after it. For example,
EmceeEnsembleSampler is defined as:
class EmceeEnsembleSampler(MCMCAutocorrSupport, BaseMCMC, BaseSampler):
This means that methods introduced by BaseSampler will be overridden by
BaseMCMC, which in turn will be overridden by MCMCAutocorrSupport. For this
reason, all sampler class definitions must have BaseSampler listed last.
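The override order in the note can be checked directly with Python's method resolution order. In this sketch the three parent names follow the text, while ToySampler and the describe and label methods are hypothetical and exist only to show which parent wins.

```python
class BaseSampler:
    def describe(self):
        return 'BaseSampler'

    def label(self):
        return 'BaseSampler'


class BaseMCMC:
    def describe(self):
        return 'BaseMCMC'

    def label(self):
        return 'BaseMCMC'


class MCMCAutocorrSupport:
    def describe(self):
        return 'MCMCAutocorrSupport'


class ToySampler(MCMCAutocorrSupport, BaseMCMC, BaseSampler):
    """Hypothetical sampler with the same parent order as in the note."""


# The method resolution order lists the classes in lookup order.
mro_names = [cls.__name__ for cls in ToySampler.__mro__]
sampler = ToySampler()
```

describe is defined by all three parents, so the leftmost one is used; label is not defined by MCMCAutocorrSupport, so the lookup falls through to BaseMCMC.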
All MCMC samplers need to be able to compute an autocorrelation function (ACF)
and length (ACL). These are used to determine how to thin the chains to obtain
independent samples. Consequently, BaseMCMC also adds the abstract methods
compute_acf and compute_acl; these are called by its checkpoint method. The
MCMCAutocorrSupport class provides these functions. They are provided in a
class separate from BaseMCMC because not all MCMC samplers estimate ACF/Ls in
the same way. For example, multi-tempered samplers need to compute ACF/Ls
separately for each temperature chain. Consequently, there is an equivalent
class, MultiTemperedAutocorrSupport, that offers the same functions for
multi-tempered MCMCs. This class is used by, e.g., EmceePTSampler (see its
inheritance diagram, below). By making the ACF/L functions abstract methods in
BaseMCMC, both single and multi-tempered MCMC samplers can inherit from
BaseMCMC.
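The split between the two support classes can be sketched as follows. The class names and compute_acl follow the text; the estimator itself, the chain and chains attributes, and the two toy samplers are assumptions made for illustration and are not how PyCBC computes ACLs.

```python
import math
from abc import ABCMeta, abstractmethod


def autocorrelation_length(chain):
    """Crude ACL: 1 + 2 * (sum of the ACF up to its first non-positive lag).

    An illustrative estimator only, chosen for brevity.
    """
    n = len(chain)
    mean = sum(chain) / n
    dev = [x - mean for x in chain]
    var = sum(d * d for d in dev)
    if var == 0:
        return 1
    acl = 1.0
    for lag in range(1, n):
        acf = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
        if acf <= 0:
            break
        acl += 2 * acf
    return max(1, math.ceil(acl))


class BaseMCMC(metaclass=ABCMeta):
    @abstractmethod
    def compute_acl(self):
        """Required of all MCMC samplers; provided by a support class."""


class MCMCAutocorrSupport:
    def compute_acl(self):
        return autocorrelation_length(self.chain)  # one chain


class MultiTemperedAutocorrSupport:
    def compute_acl(self):
        # One ACL per temperature chain.
        return [autocorrelation_length(c) for c in self.chains]


class ToySampler(MCMCAutocorrSupport, BaseMCMC):
    def __init__(self, chain):
        self.chain = chain


class ToyPTSampler(MultiTemperedAutocorrSupport, BaseMCMC):
    def __init__(self, chains):
        self.chains = chains


correlated = [0, 0, 0, 0, 1, 1, 1, 1]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
single = ToySampler(correlated)
tempered = ToyPTSampler([correlated, alternating])
thinned = correlated[::single.compute_acl()]  # keep every ACL-th sample
```

Both toy samplers satisfy the same abstract compute_acl requirement, but each gets its implementation from a different support class.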
We see that by separating functionality out into support classes and using
multiple inheritance, we are able to provide support for all of the unique
features of different samplers, while keeping the base API that
pycbc_inference interacts with simple.
Inheritance diagrams
Here are inheritance diagrams for all of the currently supported samplers:
cpnest:
dummy:
dynesty:
emcee:
emcee_pt:
epsie:
games:
multinest:
nessai:
ptemcee:
refine:
snowline:
ultranest:
How to add a sampler
To add support for a new sampler, do the following:
Create a file in pycbc/inference/sampler for the new sampler’s class.
Add the new class definition. The class must inherit from at least BaseSampler.
Give a name attribute to the class that is unique across the supported sampler classes.
Add an IO class for the sampler to the inference.io module. Set your new class’s io attribute to point to this new class.
Add any other methods you need to satisfy BaseSampler’s required methods. When doing so, try to follow the guidelines above: do not duplicate code, and try to use support classes that offer functionality that you need. If you think some of the methods will be useful for more than just your sampler, create a new support class and add those methods to it. However, if you’re unsure what is available or you would have to make changes to the support classes that may break other samplers, just add the methods you need to your new class definition. Fixing code duplication or rearranging support classes can be done through the review process when you wish to add your new sampler to the main gwastro repository.
Add the sampler to the samplers dictionary in pycbc/inference/sampler/__init__.py so that pycbc_inference can find and use it.
When you’re satisfied that your new sampler works, file a pull request to get it into the main gwastro repository. Thank you for your contributions!