pycbc.inference.sampler package

Submodules

pycbc.inference.sampler.base module

Defines the base sampler class to be inherited by all samplers.

class pycbc.inference.sampler.base.BaseSampler(model)[source]

Bases: object

Abstract base class for all inference samplers.

All sampler classes must inherit from this class and implement its abstract methods.

Parameters: model (Model) – An instance of a model from pycbc.inference.models.

abstract checkpoint()[source]: The sampler must have a checkpoint method for dumping raw samples and stats to the file type defined by io.

abstract finalize()[source]: Do any finalization to the samples file before exiting.

abstract from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]: This should initialize the sampler given a config file.

abstract property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of a class, so that the sampler can initialize it when needed.

abstract property model_stats

A dict mapping model’s metadata fields to arrays of values for each sample in raw_samples.

The arrays may have any shape, and may or may not be thinned.

name = None

abstract resume_from_checkpoint()[source]: Resume the sampler from the output file.

abstract run()[source]

This function should run the sampler.

Any checkpointing should be done internally in this function.

abstract property samples

A dict mapping variable_params to arrays of samples currently in memory. The dictionary may also contain sampling_params.

The sample arrays may have any shape, and may or may not be thinned.

property sampling_params: Returns the sampling params used by the model.

property static_params: Returns the model’s fixed parameters.

property variable_params: Returns the parameters varied in the model.

pycbc.inference.sampler.base.create_new_output_file(sampler, filename, **kwargs)[source]

Creates a new output file.

Parameters:

sampler (sampler instance) – Sampler
filename (str) – Name of the file to create.
**kwargs – All other keyword arguments are passed through to the file’s write_metadata function.

pycbc.inference.sampler.base.initial_dist_from_config(cp, variable_params, static_params=None)[source]

Loads a distribution for the sampler start from the given config file.

A distribution will only be loaded if the config file has one or more [initial-*] sections.

Parameters:

cp (Config parser) – The config parser to try to load from.
variable_params (list of str) – The variable parameters for the distribution.
static_params (dict, optional) – The static parameters used to place constraints on the distribution.

Returns:

The initial distribution. If no [initial-*] section is found in the config file, None is returned.

Return type:

JointDistribution or None
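To make the section lookup concrete, here is a minimal sketch that uses the standard library ConfigParser as a stand-in for PyCBC's config parser. The helper find_initial_sections is hypothetical, and the option names inside the [initial-mass1] section are illustrative assumptions rather than a documented format.

```python
from configparser import ConfigParser

# Illustrative config text; the options under [initial-mass1] are assumptions.
config_text = """
[sampler]
name = emcee

[initial-mass1]
name = uniform
min-mass1 = 10
max-mass1 = 20
"""


def find_initial_sections(cp):
    """Return the names of all sections that start with 'initial-'."""
    return [s for s in cp.sections() if s.startswith("initial-")]


cp = ConfigParser()
cp.read_string(config_text)
initial_sections = find_initial_sections(cp)

# With no matching section the list is empty, which corresponds to the
# case in which the documented function returns None.
empty_cp = ConfigParser()
empty_cp.read_string("[sampler]\nname = emcee\n")
no_sections = find_initial_sections(empty_cp)
```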

pycbc.inference.sampler.base.setup_output(sampler, output_file, check_nsamples=True, validate=True)[source]

Sets up the sampler’s checkpoint and output files.

The checkpoint file has the same name as the output file, but with .checkpoint appended to the name. A backup file will also be created.

Parameters:

sampler (sampler instance) – Sampler
output_file (str) – Name of the output file.
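The naming rule for the checkpoint file can be written as a one-line helper. The function checkpoint_name is hypothetical and only mirrors the rule stated above; the backup file's name is not given here, so it is not reproduced.

```python
def checkpoint_name(output_file):
    """Return the checkpoint file name: the output name plus '.checkpoint'."""
    return output_file + ".checkpoint"


name = checkpoint_name("inference.hdf")
```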

pycbc.inference.sampler.base_cube module

Common utilities for samplers that rely on transforming between a unit cube and the prior space. This is typical of many nested sampling algorithms.

class pycbc.inference.sampler.base_cube.CubeModel(model, loglikelihood_function=None, copy_prior=False)[source]

Bases: object

Wraps a PyCBC Inference model so that it can be evaluated on points in the unit cube.

Parameters: model (inference.BaseModel instance) – A model instance from pycbc.

log_likelihood(cube)[source]: Returns the log likelihood.

prior_transform(cube)[source]: Prior transform function for the ultranest sampler. It takes a point in the unit cube as input and applies the prior transforms.
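As a rough illustration of what a prior transform does, the sketch below maps a point in the unit cube to independent uniform priors. The function uniform_prior_transform and the bounds are hypothetical; the actual method delegates to the model's prior, which may not be uniform.

```python
def uniform_prior_transform(cube, bounds):
    """Map each coordinate u in [0, 1) to lo + u * (hi - lo)."""
    return [lo + u * (hi - lo) for u, (lo, hi) in zip(cube, bounds)]


# Two parameters with assumed uniform priors on [10, 20] and [-1, 1].
bounds = [(10.0, 20.0), (-1.0, 1.0)]
point = uniform_prior_transform([0.5, 0.25], bounds)
```

The nested sampler only ever proposes points in the unit cube, so a transform like this is what connects its proposals to the physical parameter space.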

pycbc.inference.sampler.base_cube.call_global_loglikelihood(cube)[source]

pycbc.inference.sampler.base_cube.call_global_logprior(cube)[source]

pycbc.inference.sampler.base_cube.setup_calls(model, loglikelihood_function=None, copy_prior=False)[source]: Configure calls for MPI support

pycbc.inference.sampler.base_mcmc module

Provides constructor classes and convenience functions for MCMC samplers.

class pycbc.inference.sampler.base_mcmc.BaseMCMC[source]

Bases: object

Abstract base class that provides methods common to MCMCs.

This is not a sampler class itself. Sampler classes can inherit from this along with BaseSampler.

This class provides set_initial_conditions, run, and checkpoint methods, which are some of the abstract methods required by BaseSampler.

This class introduces the following abstract properties and methods:

base_shape
[property] Should give the shape of the samples arrays used by the sampler, excluding the iterations dimension. Needed for writing results.
run_mcmc(niterations)
Should run the sampler for the given number of iterations. Called by run.
clear_samples()
Should clear samples from memory. Called by run.
set_state_from_file(filename)
Should set the random state of the sampler using the given filename. Called by set_initial_conditions.
write_results(filename)
Writes results to the given filename. Called by checkpoint.
compute_acf(filename, **kwargs)
[classmethod] Should compute the autocorrelation function using the given filename. Also allows for other keyword arguments.
compute_acl(filename, **kwargs)
[classmethod] Should compute the autocorrelation length using the given filename. Also allows for other keyword arguments.

abstract acl()[source]

The autocorrelation length.

This method should convert the raw ACLs into an integer or array that can be used to extract independent samples from a chain.

property act

The autocorrelation time(s).

The autocorrelation time is defined as the autocorrelation length times the thin_interval. It gives the number of iterations between independent samples. Depending on the sampler, this may either be a single integer or an array of values.

Returns None if no ACLs have been calculated.
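The relation between ACL and ACT stated above amounts to a single multiplication. The helper below is a hypothetical sketch of that arithmetic, not the property's implementation; it accepts either a single ACL or a list of ACLs.

```python
def autocorrelation_time(acl, thin_interval):
    """Return ACL * thin_interval, or None if no ACL is available."""
    if acl is None:
        return None
    if isinstance(acl, (list, tuple)):
        # One ACL per chain: scale each one.
        return [a * thin_interval for a in acl]
    return acl * thin_interval


single = autocorrelation_time(5, 4)
per_chain = autocorrelation_time([5, 7], 4)
```

For example, an ACL of 5 stored samples with a thin interval of 4 means that independent samples are 20 iterations apart.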

abstract property base_shape

What shape the sampler’s samples arrays are in, excluding the iterations dimension.

For example, if a sampler uses 20 chains and 3 temperatures, this would be (3, 20). If a sampler only uses a single walker and no temperatures this would be ().

property burn_in: The class for doing burn-in tests (if specified).

checkpoint()[source]: Dumps current samples to the checkpoint file.

static checkpoint_from_config(cp, section)[source]

Gets the checkpoint interval from the given config file.

This looks for ‘checkpoint-interval’ in the section.

Parameters:

cp (ConfigParser) – Open config parser to retrieve the argument from.
section (str) – Name of the section to retrieve from.

Returns:

The checkpoint interval, if it is in the section. Otherwise, None.

Return type:

int or None

property checkpoint_interval: The number of iterations to do between checkpoints.

property checkpoint_signal: The signal to use when checkpointing.

static ckpt_signal_from_config(cp, section)[source]

Gets the checkpoint signal from the given config file.

This looks for ‘checkpoint-signal’ in the section.

Parameters:

cp (ConfigParser) – Open config parser to retrieve the argument from.
section (str) – Name of the section to retrieve from.

Returns:

The checkpoint signal, if it is in the section. Otherwise, None.

Return type:

int or None

abstract clear_samples()[source]: A method to clear samples from memory.

abstract compute_acf(filename, **kwargs)[source]: A method to compute the autocorrelation function of samples in the given file.

abstract compute_acl(filename, **kwargs)[source]: A method to compute the autocorrelation length of samples in the given file.

abstract effective_nsamples()[source]: The effective number of samples post burn-in that the sampler has acquired so far.

get_thin_interval()[source]

Gets the thin interval to use.

If max_samples_per_chain is set, this will figure out what thin interval is needed to satisfy that criterion. In that case, the thin interval used must be a multiple of the currently used thin interval.
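One way to satisfy both constraints is sketched below. This is an assumption about how such a calculation could work, not PyCBC's implementation; choose_thin_interval and its arguments are hypothetical.

```python
import math


def choose_thin_interval(niterations, max_samples_per_chain, current_thin=1):
    """Smallest multiple of current_thin that keeps the number of stored
    samples per chain at or below max_samples_per_chain."""
    # Thinning needed if we were free to pick any interval.
    needed = math.ceil(niterations / max_samples_per_chain)
    # Round up to a whole number of current thin intervals (at least one).
    multiples = max(1, math.ceil(needed / current_thin))
    return multiples * current_thin


thin_a = choose_thin_interval(1000, 100, current_thin=1)
thin_b = choose_thin_interval(1000, 100, current_thin=4)
thin_c = choose_thin_interval(50, 100, current_thin=2)
```

Requiring a multiple of the current interval means samples already written to disk can be thinned further without recomputing them.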

property max_samples_per_chain: The maximum number of samples per chain that is written to disk.

property nchains: The number of chains used.

property niterations: The current number of iterations.

property p0

A dictionary of the initial position of the chains.

This is set by using set_p0. If not set yet, a ValueError is raised when the attribute is accessed.

property pos

A dictionary of the current walker positions.

If the sampler hasn’t been run yet, returns p0.

property raw_acls

Dictionary of parameter names -> autocorrelation lengths.

Depending on the sampler, the ACLs may be an integer, or an array of values per chain and/or per temperature.

Returns None if no ACLs have been calculated.

property raw_acts

Dictionary of parameter names -> autocorrelation time(s).

Returns None if no ACLs have been calculated.

resume_from_checkpoint()[source]: Resume the sampler from the checkpoint file

run()[source]: Runs the sampler.

abstract run_mcmc(niterations)[source]: Run the MCMC for the given number of iterations.

set_burn_in(burn_in)[source]: Sets the object to use for doing burn-in tests.

set_burn_in_from_config(cp)[source]

Sets the burn in class from the given config file.

If no burn-in section exists in the file, then this just sets the burn-in class to None.

set_p0(samples_file=None, prior=None)[source]

Sets the initial position of the chains.

Parameters:

samples_file (InferenceFile, optional) – If provided, use the last iteration in the given file for the starting positions.
prior (JointDistribution, optional) – Use the given prior to set the initial positions rather than model’s prior.

Returns:

p0 – A dictionary mapping sampling params to the starting positions.

Return type:

dict

set_start_from_config(cp)[source]: Sets the initial state of the sampler from config file

abstract set_state_from_file(filename)[source]: Sets the state of the sampler to the instance saved in a file.

set_target(niterations=None, eff_nsamples=None)[source]

Sets the target niterations/nsamples for the sampler.

One or the other must be provided, not both.
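The "one or the other, not both" rule is an exclusive-or check. The sketch below shows one way to express it; validate_target is a hypothetical helper, and the choice of ValueError is an assumption.

```python
def validate_target(niterations=None, eff_nsamples=None):
    """Return (kind, value) for whichever target was given.

    Raises ValueError if neither or both targets are provided.
    """
    if (niterations is None) == (eff_nsamples is None):
        raise ValueError(
            "provide exactly one of niterations or eff_nsamples")
    if niterations is not None:
        return ("niterations", niterations)
    return ("eff_nsamples", eff_nsamples)


target = validate_target(niterations=5000)
```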

set_target_from_config(cp, section)[source]

Sets the target using the given config file.

This looks for niterations to set the target_niterations, and effective-nsamples to set the target_eff_nsamples.

Parameters:

cp (ConfigParser) – Open config parser to retrieve the argument from.
section (str) – Name of the section to retrieve from.

set_thin_interval_from_config(cp, section)[source]: Sets thinning options from the given config file.

property target_eff_nsamples: The target number of effective samples the sampler should get.

property target_niterations: The number of iterations the sampler should run for.

property thin_interval: Returns the thin interval being used.

property thin_safety_factor: The minimum value that max_samples_per_chain may be set to.

abstract write_results(filename)[source]: Should write all samples currently in memory to the given file.

class pycbc.inference.sampler.base_mcmc.EnsembleSupport[source]

Bases: object

Adds support for ensemble MCMC samplers.

property acl

The autocorrelation length of the ensemble.

This is calculated by taking the maximum over all of the raw_acls. This works for both single and parallel-tempered ensemble samplers.

Returns None if no ACLs have been set.
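The maximisation described above can be sketched in a few lines. The helper ensemble_acl is hypothetical and uses plain Python lists in place of arrays; each dictionary value may be a single ACL or a list with one ACL per temperature.

```python
def ensemble_acl(raw_acls):
    """Return the largest ACL over all parameters (and temperatures)."""
    if raw_acls is None:
        return None
    flat = []
    for value in raw_acls.values():
        # A value is either one ACL or a sequence of ACLs per temperature.
        flat.extend(value if isinstance(value, (list, tuple)) else [value])
    return max(flat)


single_temp = ensemble_acl({"mass1": 12, "mass2": 30})
multi_temp = ensemble_acl({"mass1": [12, 40], "mass2": [30, 8]})
```

Taking the maximum is the conservative choice: thinning by the slowest-mixing parameter keeps the samples of every parameter independent.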

property effective_nsamples: The effective number of samples post burn-in that the sampler has acquired so far.

property nwalkers

The number of walkers used.

Alias of nchains.

pycbc.inference.sampler.base_mcmc.blob_data_to_dict(stat_names, blobs)[source]

Converts list of “blobs” to a dictionary of model stats.

Samplers like emcee store the extra tuple returned by CallModel to a list called blobs. This is a list of lists of tuples with shape niterations x nwalkers x nstats, where nstats is the number of stats returned by the model’s default_stats. This converts that list to a dictionary of arrays keyed by the stat names.

Parameters:

stat_names (list of str) – The list of the stat names.
blobs (list of list of tuples) – The data to convert.

Returns:

A dictionary mapping the model’s default_stats to arrays of values. Each array will have shape nwalkers x niterations.

Return type:

dict
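The reshaping described above, from niterations x nwalkers x nstats to one nwalkers x niterations array per stat, can be sketched with nested lists. The function blobs_to_dict is a hypothetical illustration, not the library routine, and the stat names and values are made up.

```python
def blobs_to_dict(stat_names, blobs):
    """Convert blobs[iteration][walker][stat] into
    {stat_name: values[walker][iteration]}."""
    niterations = len(blobs)
    nwalkers = len(blobs[0])
    out = {}
    for k, name in enumerate(stat_names):
        out[name] = [
            [blobs[i][w][k] for i in range(niterations)]
            for w in range(nwalkers)
        ]
    return out


stat_names = ["loglikelihood", "logprior"]
blobs = [
    [(-1.0, -0.1), (-2.0, -0.2)],  # iteration 0: walker 0, walker 1
    [(-1.5, -0.1), (-2.5, -0.2)],  # iteration 1
    [(-1.2, -0.1), (-2.2, -0.2)],  # iteration 2
]
stats = blobs_to_dict(stat_names, blobs)
```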

pycbc.inference.sampler.base_mcmc.ensemble_compute_acf(filename, start_index=None, end_index=None, per_walker=False, walkers=None, parameters=None)[source]

Computes the autocorrelation function for an ensemble MCMC.

By default, parameter values are averaged over all walkers at each iteration. The ACF is then calculated over the averaged chain. An ACF per-walker will be returned instead if per_walker=True.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
start_index (int, optional) – The start index to compute the ACF from. If None (the default), will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index (int, optional) – The end index to compute the ACF to. If None (the default), will go to the end of the current iteration.
per_walker (bool, optional) – Return the ACF for each walker separately. Default is False.
walkers (int or array, optional) – Calculate the ACF using only the given walkers. If None (the default) all walkers will be used.
parameters (str or array, optional) – Calculate the ACF for only the given parameters. If None (the default) will calculate the ACF for all of the model params.

Returns:

Dictionary of arrays giving the ACFs for each parameter. If per_walker is True, the arrays will have shape nwalkers x niterations.

Return type:

dict
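The default behaviour, averaging over walkers and then computing an ACF of the averaged chain, can be sketched as follows for a single parameter. The functions and the normalisation (lag-0 value equal to 1) are assumptions made for illustration; the library's own ACF routine may differ in detail.

```python
def autocorrelation(chain):
    """Normalised autocorrelation of a 1D chain for every lag."""
    n = len(chain)
    mean = sum(chain) / n
    dev = [x - mean for x in chain]
    var = sum(d * d for d in dev)
    if var == 0:
        raise ValueError("chain is constant; the ACF is undefined")
    return [
        sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
        for lag in range(n)
    ]


def walker_averaged_acf(samples):
    """samples[walker][iteration] -> ACF of the walker-averaged chain."""
    nwalkers = len(samples)
    niterations = len(samples[0])
    averaged = [
        sum(walker[i] for walker in samples) / nwalkers
        for i in range(niterations)
    ]
    return autocorrelation(averaged)


# Two walkers, four iterations of one parameter (made-up values).
samples = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
acf = walker_averaged_acf(samples)
```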

pycbc.inference.sampler.base_mcmc.ensemble_compute_acl(filename, start_index=None, end_index=None, min_nsamples=10)[source]

Computes the autocorrelation length for an ensemble MCMC.

Parameter values are averaged over all walkers at each iteration. The ACL is then calculated over the averaged chain. If an ACL cannot be calculated because there are not enough samples, it will be set to inf.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
start_index (int, optional) – The start index to compute the acl from. If None, will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index (int, optional) – The end index to compute the acl to. If None, will go to the end of the current iteration.
min_nsamples (int, optional) – Require a minimum number of samples to compute an ACL. If the number of samples per walker is less than this, will just set to inf. Default is 10.

Returns:

A dictionary giving the ACL for each parameter.

Return type:

dict

pycbc.inference.sampler.base_mcmc.get_optional_arg_from_config(cp, section, arg, dtype=<class 'str'>)[source]

Convenience function to retrieve an optional argument from a config file.

Parameters:

cp (ConfigParser) – Open config parser to retrieve the argument from.
section (str) – Name of the section to retrieve from.
arg (str) – Name of the argument to retrieve.
dtype (datatype, optional) – Cast the retrieved value (if it exists) to the given datatype. Default is str.

Returns:

val – If the argument is present, the value. Otherwise, None.

Return type:

None or str
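A helper of this kind can be sketched with the standard library ConfigParser, used here as a stand-in for PyCBC's config parser. The function get_optional is hypothetical and only mirrors the documented behaviour: return the cast value if the option exists, otherwise None.

```python
from configparser import ConfigParser


def get_optional(cp, section, arg, dtype=str):
    """Return the option cast to dtype, or None if it is not present."""
    if cp.has_option(section, arg):
        return dtype(cp.get(section, arg))
    return None


cp = ConfigParser()
cp.read_string("[sampler]\nname = emcee\ncheckpoint-interval = 500\n")

interval = get_optional(cp, "sampler", "checkpoint-interval", dtype=int)
missing = get_optional(cp, "sampler", "checkpoint-signal")
```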

pycbc.inference.sampler.base_mcmc.raw_samples_to_dict(sampler, raw_samples)[source]

Convenience function for converting ND array to a dict of samples.

The samples are assumed to have dimension [sampler.base_shape x] niterations x len(sampler.sampling_params).

Parameters:

sampler (sampler instance) – An instance of an MCMC sampler.
raw_samples (array) – The array of samples to convert.

Returns:

A dictionary mapping the raw samples to the variable params. If the sampling params are not the same as the variable params, they will also be included. Each array will have shape [sampler.base_shape x] niterations.

Return type:

dict
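For the common case of a single leading dimension (walkers), the conversion can be sketched with nested lists. The function below is a hypothetical illustration, not the library routine: it takes the parameter names directly instead of a sampler instance, and the parameter names and values are made up.

```python
def samples_to_dict(params, raw_samples):
    """Convert raw_samples[walker][iteration][param] into
    {param_name: values[walker][iteration]}."""
    return {
        name: [[step[k] for step in walker] for walker in raw_samples]
        for k, name in enumerate(params)
    }


params = ["mass1", "spin"]
# Shape: 2 walkers x 2 iterations x 2 parameters.
raw = [
    [[1.0, 0.1], [2.0, 0.2]],
    [[3.0, 0.3], [4.0, 0.4]],
]
samples = samples_to_dict(params, raw)
```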

pycbc.inference.sampler.base_multitemper module

Provides constructor classes that support parallel-tempered MCMC samplers.

class pycbc.inference.sampler.base_multitemper.MultiTemperedSupport[source]

Bases: object

Provides methods for supporting multi-tempered samplers.

static betas_from_config(cp, section)[source]

Loads number of temperatures or betas from a config file.

This looks in the given section for:

ntemps :
The number of temperatures to use. Either this, or inverse-temperatures-file must be provided (but not both).
inverse-temperatures-file :
Path to an hdf file containing the inverse temperatures (“betas”) to use. The betas will be retrieved from the file’s .attrs['betas']. Either this or ntemps must be provided (but not both).

Parameters:

cp (WorkflowConfigParser instance) – Config file object to parse.
section (str) – The name of the section to look in.

Returns:

ntemps (int or None) – The number of temperatures to use, if it was provided.
betas (array) – The array of betas to use, if an inverse-temperatures-file was provided.
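The mutually exclusive options can be checked as sketched below, with the standard library ConfigParser standing in for WorkflowConfigParser. The helper temperature_options is hypothetical; it returns the file path rather than loading the betas, and raising ValueError is an assumption.

```python
from configparser import ConfigParser


def temperature_options(cp, section):
    """Return (ntemps, betas_file); exactly one of them is not None."""
    has_ntemps = cp.has_option(section, "ntemps")
    has_file = cp.has_option(section, "inverse-temperatures-file")
    if has_ntemps == has_file:
        raise ValueError(
            "provide exactly one of ntemps or inverse-temperatures-file")
    if has_ntemps:
        return cp.getint(section, "ntemps"), None
    return None, cp.get(section, "inverse-temperatures-file")


cp = ConfigParser()
cp.read_string("[sampler]\nname = emcee_pt\nntemps = 4\n")
ntemps, betas_file = temperature_options(cp, "sampler")
```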

property ntemps: The number of temeratures that are set.

pycbc.inference.sampler.base_multitemper.acl_from_raw_acls(acls)[source]

Calculates the ACL for one or more chains from a dictionary of ACLs.

This is for parallel tempered MCMCs in which the chains are independent of each other.

The ACL for each chain is maximized over the temperatures and parameters.

Parameters: acls (dict) – Dictionary of parameter names -> ntemps x nchains arrays of ACLs (the thing returned by compute_acl()).
Returns: The ACL of each chain.
Return type: array
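The maximisation over temperatures and parameters can be sketched with nested lists in place of arrays. The function acl_per_chain is a hypothetical illustration of the reduction, not the library routine, and the ACL values are made up.

```python
def acl_per_chain(acls):
    """acls maps parameter name -> values[temperature][chain].

    Returns a list with the largest ACL found for each chain.
    """
    first = next(iter(acls.values()))
    nchains = len(first[0])
    return [
        max(row[c] for arr in acls.values() for row in arr)
        for c in range(nchains)
    ]


# Two parameters, two temperatures, two chains.
acls = {
    "mass1": [[1, 5], [3, 2]],
    "spin": [[4, 1], [2, 6]],
}
chain_acls = acl_per_chain(acls)
```

Because the chains are independent, each chain keeps its own ACL instead of sharing a single ensemble-wide value.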

pycbc.inference.sampler.base_multitemper.compute_acf(filename, start_index=None, end_index=None, chains=None, parameters=None, temps=None)[source]

Computes the autocorrelation function for independent MCMC chains with parallel tempering.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
start_index (int, optional) – The start index to compute the ACF from. If None (the default), will try to use the burn in iteration for each chain; otherwise, will start at the first sample.
end_index ({None, int}) – The end index to compute the ACF to. If None, will go to the end of the current iteration.
chains (optional, int or array) – Calculate the ACF for only the given chains. If None (the default) ACFs for all chains will be estimated.
parameters (optional, str or array) – Calculate the ACF for only the given parameters. If None (the default) will calculate the ACF for all of the model params.
temps (optional, (list of) int or 'all') – The temperature index (or list of indices) to retrieve. If None (the default), the ACF will only be computed for the coldest (= 0) temperature chain. To compute an ACF for all temperatures pass ‘all’, or a list of all of the temperatures.

Returns:

Dictionary parameter name -> ACF arrays. The arrays have shape ntemps x nchains x niterations.

Return type:

dict

pycbc.inference.sampler.base_multitemper.compute_acl(filename, start_index=None, end_index=None, min_nsamples=10)[source]

Computes the autocorrelation length for independent MCMC chains with parallel tempering.

ACLs are calculated separately for each chain.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
start_index ({None, int}) – The start index to compute the acl from. If None, will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index ({None, int}) – The end index to compute the acl to. If None, will go to the end of the current iteration.
min_nsamples (int, optional) – Require a minimum number of samples to compute an ACL. If the number of samples per walker is less than this, will just set to inf. Default is 10.

Returns:

A dictionary of ntemps x nchains arrays of the ACLs of each parameter.

Return type:

dict

pycbc.inference.sampler.base_multitemper.ensemble_compute_acf(filename, start_index=None, end_index=None, per_walker=False, walkers=None, parameters=None, temps=None)[source]

Computes the autocorrelation function for a parallel tempered, ensemble MCMC.

By default, parameter values are averaged over all walkers at each iteration. The ACF is then calculated over the averaged chain for each temperature. An ACF per-walker will be returned instead if per_walker=True.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
start_index (int, optional) – The start index to compute the ACF from. If None (the default), will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index (int, optional) – The end index to compute the ACF to. If None (the default), will go to the end of the current iteration.
per_walker (bool, optional) – Return the ACF for each walker separately. Default is False.
walkers (int or array, optional) – Calculate the ACF using only the given walkers. If None (the default) all walkers will be used.
parameters (str or array, optional) – Calculate the ACF for only the given parameters. If None (the default) will calculate the ACF for all of the model params.
temps ((list of) int or 'all', optional) – The temperature index (or list of indices) to retrieve. If None (the default), the ACF will only be computed for the coldest (= 0) temperature chain. To compute an ACF for all temperatures pass ‘all’, or a list of all of the temperatures.

Returns:

Dictionary of arrays giving the ACFs for each parameter. If per_walker is True, the arrays will have shape ntemps x nwalkers x niterations. Otherwise, the returned array will have shape ntemps x niterations.

Return type:

dict

pycbc.inference.sampler.base_multitemper.ensemble_compute_acl(filename, start_index=None, end_index=None, min_nsamples=10)[source]

Computes the autocorrelation length for a parallel tempered, ensemble MCMC.

Parameter values are averaged over all walkers at each iteration and temperature. The ACL is then calculated over the averaged chain.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
start_index (int, optional) – The start index to compute the acl from. If None (the default), will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index (int, optional) – The end index to compute the acl to. If None, will go to the end of the current iteration.
min_nsamples (int, optional) – Require a minimum number of samples to compute an ACL. If the number of samples per walker is less than this, will just set to inf. Default is 10.

Returns:

A dictionary of ntemps-long arrays of the ACLs of each parameter.

Return type:

dict

pycbc.inference.sampler.base_multitemper.read_betas_from_hdf(filename)[source]: Loads inverse temperatures from the given file.

pycbc.inference.sampler.dummy module

Dummy class when no actual sampling is needed, but we may want to do some reconstruction supported by the likelihood model.

class pycbc.inference.sampler.dummy.DummySampler(model, *args, nprocesses=1, use_mpi=False, num_samples=1000, **kwargs)[source]

Bases: BaseSampler

Dummy sampler for when no sampling is needed.

Parameters: model (Model) – An instance of a model from pycbc.inference.models.

checkpoint(): The sampler must have a checkpoint method for dumping raw samples and stats to the file type defined by io.

finalize()[source]: Do any finalization to the samples file before exiting.

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]: This should initialize the sampler given a config file.

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of a class, so that the sampler can initialize it when needed.

property model_stats

A dict mapping model’s metadata fields to arrays of values for each sample in raw_samples.

The arrays may have any shape, and may or may not be thinned.

name = 'dummy'

resume_from_checkpoint(): Resume the sampler from the output file.

run()[source]

This function should run the sampler.

Any checkpointing should be done internally in this function.

property samples

A dict mapping variable_params to arrays of samples currently in memory. The dictionary may also contain sampling_params.

The sample arrays may have any shape, and may or may not be thinned.

pycbc.inference.sampler.dummy.call_reconstruct(iteration)[source]: Accessor to update the global model and call its reconstruction routine.

pycbc.inference.sampler.dynesty module

This module provides classes and functions for using the dynesty sampler package for parameter estimation.

class pycbc.inference.sampler.dynesty.DynestySampler(model, nlive, nprocesses=1, checkpoint_time_interval=None, maxcall=None, loglikelihood_function=None, use_mpi=False, no_save_state=False, run_kwds=None, extra_kwds=None, internal_kwds=None, **kwargs)[source]

Bases: BaseSampler

This class is used to construct a Dynesty sampler from the dynesty package.

Parameters:

model (model) – A model from pycbc.inference.models.
nlive (int) – Number of live points to use in sampler.
pool (function with map, Optional) – A provider of a map function that allows a function call to be run over multiple sets of arguments and possibly maps them to cores/nodes/etc.

checkpoint()[source]: Checkpoint function for dynesty sampler

finalize()[source]: Finalze and write it to the results file

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False, loglikelihood_function=None)[source]

Loads the sampler from the given config file. Many options are directly passed to the underlying dynesty sampler, see the official dynesty documentation for more details on these.

The following options are retrieved in the [sampler] section:

name = STR:
Required. This must match the sampler’s name.
maxiter = INT:
The maximum number of iterations to run.
dlogz = FLOAT:
The target dlogz stopping condition.
logl_max = FLOAT:
The maximum logl stopping condition.
n_effective = INT:
Target effective number of samples stopping condition
sample = STR:
The method to sample the space. Should be one of ‘uniform’, ‘rwalk’, ‘rwalk2’ (a modified version of rwalk), or ‘slice’.
walk = INT:
Used for some of the walk methods. Sets the minimum number of steps to take when evolving a point.
maxmcmc = INT:
Used for some of the walk methods. Sets the maximum number of steps to take when evolving a point.
nact = INT:
Used for some of the walk methods. Sets the number of autocorrelation lengths before terminating evolution of a point.
first_update_min_ncall = INT:
The minimum number of calls before updating the bounding region for the first time.
first_update_min_neff = FLOAT:
Don’t update the bounding region until the efficiency drops below this value.
bound = STR:
The method of bounding of the prior volume. Should be one of ‘single’, ‘balls’, ‘cubes’, ‘multi’ or ‘none’.
update_interval = INT:
Number of iterations between updating the bounding regions
enlarge = FLOAT:
Factor to enlarge the bounding region.
bootstrap = INT:
The number of bootstrap iterations to determine the enlargement factor.
maxcall = INT:
The maximum number of calls before checking if we should checkpoint
checkpoint_time_interval:
Sets the time in seconds between checkpointing.
loglikelihood-function:
The attribute of the model to use for the loglikelihood. If not provided, will default to loglikelihood.
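A [sampler] section using a few of the options listed above might look like the text in the sketch below, which is parsed here with the standard library ConfigParser as a stand-in for PyCBC's config parser. All values are illustrative assumptions, not recommended settings, and any options the sampler needs beyond the list above (such as the number of live points) are deliberately left out because their config names are not given here.

```python
from configparser import ConfigParser

# Illustrative values only; option names are taken from the list above.
config_text = """
[sampler]
name = dynesty
dlogz = 0.1
sample = rwalk
bound = multi
maxmcmc = 5000
checkpoint_time_interval = 1800
"""

cp = ConfigParser()
cp.read_string(config_text)

# Every value is read as a string, so cast each one to the documented type.
opts = dict(cp.items("sampler"))
dlogz = float(opts["dlogz"])
maxmcmc = int(opts["maxmcmc"])
```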

Parameters:

cp (WorkflowConfigParser instance) – Config file object to parse.
model (pycbc.inference.model.BaseModel instance) – The model to use.
output_file (str, optional) – The name of the output file to checkpoint and write results to.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1.
use_mpi (bool, optional) – Use MPI for parallelization. Default is False.

Returns:

The sampler instance.

Return type:

DynestySampler

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of a class, so that the sampler can initialize it when needed.

property logz: return bayesian evidence estimated by dynesty sampler

property logz_err: return error in bayesian evidence estimated by dynesty sampler

property model_stats

A dict mapping model’s metadata fields to arrays of values for each sample in raw_samples.

The arrays may have any shape, and may or may not be thinned.

name = 'dynesty'

property niterations

resume_from_checkpoint()[source]: Resume the sampler from the output file.

run()[source]

This function should run the sampler.

Any checkpointing should be done internally in this function.

property samples: Returns raw nested samples

set_initial_conditions(initial_distribution=None, samples_file=None)[source]

Sets up the starting point for the sampler.

Should also set the sampler’s random state.

set_state_from_file(filename)[source]: Sets the state of the sampler back to the instance saved in a file.

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.dynesty.estimate_nmcmc(accept_ratio, old_act, maxmcmc, safety=5, tau=None)[source]

Estimate autocorrelation length of chain using acceptance fraction

Using ACL = (2/acc) - 1 multiplied by a safety margin. Code adapted from CPNest.

Parameters:

accept_ratio (float [0, 1]) – Ratio of the number of accepted points to the total number of points
old_act (int) – The ACT of the last iteration
maxmcmc (int) – The maximum length of the MCMC chain to use
safety (int) – A safety factor applied in the calculation
tau (int (optional)) – The ACT, if given, otherwise estimated.
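The stated formula can be sketched as follows. This covers only the ACL = (2/acc) - 1 estimate times the safety factor; the function name is hypothetical, the old_act and tau arguments are omitted, and the cap at maxmcmc and the handling of a zero acceptance ratio are assumptions.

```python
def acl_from_acceptance(accept_ratio, maxmcmc, safety=5):
    """Estimate a chain length from the acceptance ratio.

    Uses safety * (2 / accept_ratio - 1), capped at maxmcmc (assumed).
    """
    if accept_ratio <= 0:
        # No accepted points: fall back to the longest allowed chain.
        return maxmcmc
    estimate = safety * (2.0 / accept_ratio - 1.0)
    return int(min(estimate, maxmcmc))


typical = acl_from_acceptance(0.5, maxmcmc=5000)
capped = acl_from_acceptance(0.001, maxmcmc=100)
```

With half of the proposals accepted the bare estimate is 3 steps, which the default safety factor of 5 raises to 15; a very low acceptance ratio drives the estimate up until it hits the cap.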

pycbc.inference.sampler.dynesty.sample_rwalk_mod(args)[source]

Modified version of dynesty.sampling.sample_rwalk

Adapted from version used in bilby/dynesty

pycbc.inference.sampler.emcee module

This module provides classes and functions for using the emcee sampler package for parameter estimation.

class pycbc.inference.sampler.emcee.EmceeEnsembleSampler(model, nwalkers, checkpoint_interval=None, checkpoint_signal=None, logpost_function=None, nprocesses=1, use_mpi=False)[source]

Bases: EnsembleSupport, BaseMCMC, BaseSampler

This class is used to construct an MCMC sampler from the emcee package’s EnsembleSampler.

Parameters:

model (model) – A model from pycbc.inference.models.
nwalkers (int) – Number of walkers to use in sampler.
pool (function with map, Optional) – A provider of a map function that allows a function call to be run over multiple sets of arguments and possibly maps them to cores/nodes/etc.

property base_shape

What shape the sampler’s samples arrays are in, excluding the iterations dimension.

For example, if a sampler uses 20 chains and 3 temperatures, this would be (3, 20). If a sampler only uses a single walker and no temperatures this would be ().

burn_in_class: alias of EnsembleMCMCBurnInTests

clear_samples()[source]: Clears the samples and stats from memory.

static compute_acf(filename, **kwargs)[source]

Computes the autocorrelation function.

Calls base_mcmc.ensemble_compute_acf(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
**kwargs – All other keyword arguments are passed to base_mcmc.ensemble_compute_acf().

Returns:

Dictionary of arrays giving the ACFs for each parameter. If per_walker is True, the arrays will have shape nwalkers x niterations.

Return type:

dict

static compute_acl(filename, **kwargs)[source]

Computes the autocorrelation length.

Calls base_mcmc.ensemble_compute_acl(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
**kwargs – All other keyword arguments are passed to base_mcmc.ensemble_compute_acl().

Returns:

A dictionary giving the ACL for each parameter.

Return type:

dict
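To make the notion of an autocorrelation length concrete, here is a small sketch that sums a given ACF up to its first non-positive value. The truncation rule and the function names are assumptions chosen for simplicity; base_mcmc.ensemble_compute_acl() may use a different windowing scheme.

```python
def autocorrelation_length(acf):
    """Estimate the ACL as 1 + 2 * sum of the ACF over positive lags.

    The sum stops at the first non-positive ACF value (an assumed,
    simple truncation rule).
    """
    total = 0.0
    for value in acf[1:]:
        if value <= 0:
            break
        total += value
    return 1.0 + 2.0 * total


def approx_independent_samples(niterations, acl):
    """Rough number of independent samples left after thinning by the ACL."""
    return int(niterations // acl)


acl = autocorrelation_length([1.0, 0.5, 0.25, -0.1, 0.3])
print(acl, approx_independent_samples(1000, acl))
```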

finalize()[source]: All data is written by the last checkpoint in the run method, so this just passes.

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]: Loads the sampler from the given config file.

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property model_stats

A dict mapping the model’s default_stats to arrays of values.

The returned array has shape nwalkers x niterations.

name = 'emcee'

run_mcmc(niterations)[source]

Advance the ensemble for a number of samples.

Parameters:: niterations (int) – Number of iterations to run the sampler for.

property samples

A dict mapping variable_params to arrays of samples currently in memory.

The arrays have shape nwalkers x niterations.

set_state_from_file(filename)[source]: Sets the state of the sampler back to the instance saved in a file.

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.emcee_pt module

This module provides classes and functions for using the emcee_pt sampler package for parameter estimation.

class pycbc.inference.sampler.emcee_pt.EmceePTSampler(model, ntemps, nwalkers, betas=None, checkpoint_interval=None, checkpoint_signal=None, loglikelihood_function=None, nprocesses=1, use_mpi=False)[source]

Bases: MultiTemperedSupport, EnsembleSupport, BaseMCMC, BaseSampler

This class is used to construct a parallel-tempered MCMC sampler from the emcee package’s PTSampler.

Parameters:

model (model) – A model from pycbc.inference.models.
ntemps (int) – Number of temperatures to use in the sampler.
nwalkers (int) – Number of walkers to use in sampler.
betas (array) – An array of inverse temperature values to be used in emcee_pt’s temperature ladder. If not provided, emcee_pt will use the number of temperatures and the number of dimensions of the parameter space to construct the ladder with geometrically spaced temperatures.
loglikelihood_function (str, optional) – Set the function to call from the model for the loglikelihood. Default is loglikelihood.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1 (no parallelization).
use_mpi (bool, optional) – Use MPI for parallelization. Default (False) will use python’s multiprocessing.
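The betas argument above expects inverse temperatures. As a sketch of what a geometrically spaced ladder looks like, the snippet below builds one from a fixed ratio. The fixed ratio is an assumption made for illustration: emcee_pt derives its spacing from the number of dimensions, which is not reproduced here, and geometric_betas is a hypothetical helper.

```python
def geometric_betas(ntemps, ratio):
    """Inverse temperatures 1, 1/ratio, 1/ratio**2, ... (hottest last)."""
    if ntemps < 1:
        raise ValueError("ntemps must be at least 1")
    if ratio <= 1:
        raise ValueError("ratio must be greater than 1")
    return [ratio ** (-i) for i in range(ntemps)]


# Temperatures 1, 2, 4 correspond to betas 1, 0.5, 0.25.
print(geometric_betas(3, 2.0))
```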

property base_shape

What shape the sampler’s samples arrays are in, excluding the iterations dimension.

For example, if a sampler uses 20 chains and 3 temperatures, this would be (3, 20). If a sampler only uses a single walker and no temperatures this would be ().

property betas

burn_in_class: alias of EnsembleMultiTemperedMCMCBurnInTests

classmethod calculate_logevidence(filename, thin_start=None, thin_end=None, thin_interval=None)[source]

Calculates the log evidence from the given file using emcee_pt’s thermodynamic integration.

Parameters:

filename (str) – Name of the file to read the samples from. Should be an EmceePTFile.
thin_start (int) – Index of the sample to begin returning stats. Default is to read stats after burn in. To start from the beginning set thin_start to 0.
thin_interval (int) – Interval to accept every i-th sample. Default is to use the fp.acl. If fp.acl is not set, then use all stats (set thin_interval to 1).
thin_end (int) – Index of the last sample to read. If not given then fp.niterations is used.

Returns:

lnZ (float) – The estimate of log of the evidence.
dlnZ (float) – The error on the estimate.
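Thermodynamic integration estimates lnZ as the integral of the mean log likelihood over the inverse temperature beta from 0 to 1. The sketch below applies the trapezoid rule to a ladder and takes the difference from a coarser ladder (every other temperature) as an error estimate. It is an illustration under those assumptions, with hypothetical function names, and is not the code that calculate_logevidence runs; it also assumes the supplied ladder spans the beta range of interest.

```python
def trapezoid_lnz(betas, mean_logls):
    """Integrate the mean log likelihood over beta with the trapezoid rule."""
    pairs = sorted(zip(betas, mean_logls))
    lnz = 0.0
    for (b0, l0), (b1, l1) in zip(pairs, pairs[1:]):
        lnz += 0.5 * (l0 + l1) * (b1 - b0)
    return lnz


def log_evidence(betas, mean_logls):
    """Return (lnZ, dlnZ), with dlnZ from a ladder using every other beta."""
    pairs = sorted(zip(betas, mean_logls))
    coarse = pairs[::2]
    if coarse[-1] != pairs[-1]:
        # Keep the endpoint so both integrals cover the same beta range.
        coarse.append(pairs[-1])
    coarse_betas, coarse_logls = zip(*coarse)
    lnz = trapezoid_lnz(betas, mean_logls)
    return lnz, abs(lnz - trapezoid_lnz(coarse_betas, coarse_logls))


# Hypothetical per-temperature averages of the log likelihood.
lnz, dlnz = log_evidence([0.0, 0.25, 0.5, 1.0], [-20.0, -15.0, -12.0, -10.0])
print(lnz, dlnz)
```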

clear_samples()[source]: Clears the chain and blobs from memory.

static compute_acf(filename, **kwargs)[source]

Computes the autocorrelation function.

Calls base_multitemper.ensemble_compute_acf(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
**kwargs – All other keyword arguments are passed to base_multitemper.ensemble_compute_acf().

Returns:

Dictionary of arrays giving the ACFs for each parameter. If per-walker=True is passed as a keyword argument, the arrays will have shape ntemps x nwalkers x niterations. Otherwise, the returned array will have shape ntemps x niterations.

Return type:

dict

static compute_acl(filename, **kwargs)[source]

Computes the autocorrelation length.

Calls base_multitemper.ensemble_compute_acl(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
**kwargs – All other keyword arguments are passed to base_multitemper.ensemble_compute_acl().

Returns:

A dictionary of ntemps-long arrays of the ACLs of each parameter.

Return type:

dict

finalize()[source]

Calculates the log evidence and writes to the checkpoint file.

If sampling transforms were used, this also corrects the jacobian stored on disk.

The thin start/interval/end for calculating the log evidence are retrieved from the checkpoint file’s thinning attributes.

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]

Loads the sampler from the given config file.

The following options are retrieved in the [sampler] section:

name :
Required. This must match the sampler’s name.
nwalkers :
Required. The number of walkers to use.
ntemps :
The number of temperatures to use. Either this, or inverse-temperatures-file must be provided (but not both).
inverse-temperatures-file :
Path to an hdf file containing the inverse temperatures (“betas”) to use. The betas will be retrieved from the file’s .attrs['betas']. Either this or ntemps must be provided (but not both).
niterations :
The number of iterations to run the sampler for. Either this or effective-nsamples must be provided (but not both).
effective-nsamples :
Run the sampler until the given number of effective samples are obtained. A checkpoint-interval must also be provided in this case. Either this or niterations must be provided (but not both).
thin-interval :
Thin the samples by the given value before saving to disk. May provide this, or max-samples-per-chain, but not both. If neither option is provided, will save all samples.
max-samples-per-chain :
Thin the samples such that the number of samples per chain per temperature that are saved to disk never exceeds the given value. May provide this, or thin-interval, but not both. If neither option is provided, will save all samples.
checkpoint-interval :
Sets the checkpoint interval to use. Must be provided if using effective-nsamples.
checkpoint-signal :
Set the checkpoint signal, e.g., “USR2”. Optional.
logl-function :
The attribute of the model to use for the loglikelihood. If not provided, will default to loglikelihood.

Settings for burn-in tests are read from [sampler-burn_in]. In particular, the burn-in-test option is used to set the burn in tests to perform. See MultiTemperedMCMCBurnInTests.from_config() for details. If no burn-in-test is provided, no burn in tests will be carried out.
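The option rules listed for the [sampler] section can be checked mechanically. The sketch below parses a hypothetical [sampler] section with Python's standard configparser (not PyCBC's WorkflowConfigParser) and enforces the constraints stated above. The option values and the helper name check_sampler_section are made up for illustration.

```python
import configparser

# Hypothetical configuration; the numbers are placeholders.
EXAMPLE = """
[sampler]
name = emcee_pt
nwalkers = 200
ntemps = 4
niterations = 1000
checkpoint-interval = 250
max-samples-per-chain = 1000
"""


def check_sampler_section(text):
    """Validate the documented option rules and return the options as a dict."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    opts = dict(cp["sampler"])
    for required in ("name", "nwalkers"):
        if required not in opts:
            raise ValueError(f"missing required option: {required}")
    # Exactly one option from each of these pairs must be given.
    for a, b in (("ntemps", "inverse-temperatures-file"),
                 ("niterations", "effective-nsamples")):
        if (a in opts) == (b in opts):
            raise ValueError(f"provide exactly one of {a} or {b}")
    if "thin-interval" in opts and "max-samples-per-chain" in opts:
        raise ValueError("provide thin-interval or max-samples-per-chain, not both")
    if "effective-nsamples" in opts and "checkpoint-interval" not in opts:
        raise ValueError("effective-nsamples requires checkpoint-interval")
    return opts


print(check_sampler_section(EXAMPLE)["ntemps"])
```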

Parameters:

cp (WorkflowConfigParser instance) – Config file object to parse.
model (pycbc.inference.model.BaseModel instance) – The model to use.
output_file (str, optional) – The name of the output file to checkpoint and write results to.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1.
use_mpi (bool, optional) – Use MPI for parallelization. Default is False.

Returns:

The sampler instance.

Return type:

EmceePTSampler

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property model_stats

Returns the log likelihood ratio and log prior as a dict of arrays.

The returned array has shape ntemps x nwalkers x niterations.

Unfortunately, because emcee_pt does not have blob support, this will only return the loglikelihood and logprior (with the logjacobian set to zero) regardless of what stats the model can return.

Warning

Since the logjacobian is not saved by emcee_pt, the logprior returned here is the log of the prior pdf in the sampling coordinate frame rather than the variable params frame. This differs from the variable params frame by the log of the Jacobian of the transform from one frame to the other. If no sampling transforms were used, then the logprior is the same.
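A small worked example may help with the warning above. Assume, purely for illustration, a variable parameter x with a uniform prior on [1, 10] and a sampling parameter y = log(x). The prior density in y then picks up the factor |dx/dy| = x, so the two log priors differ by log(x). The transform, bounds, and function names are assumptions and do not come from PyCBC.

```python
import math

LOWER, UPPER = 1.0, 10.0  # assumed prior bounds on x


def logprior_variable_frame(x):
    """Log of a uniform prior density on x in [LOWER, UPPER]."""
    if not LOWER <= x <= UPPER:
        return -math.inf
    return -math.log(UPPER - LOWER)


def logjacobian(x):
    """log|dx/dy| for the assumed sampling transform y = log(x)."""
    return math.log(x)


def logprior_sampling_frame(y):
    """Log prior density expressed in the sampling coordinate y."""
    x = math.exp(y)
    return logprior_variable_frame(x) + logjacobian(x)


y = math.log(5.0)
print(logprior_sampling_frame(y) - logprior_variable_frame(5.0))  # close to log(5)
```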

name = 'emcee_pt'

run_mcmc(niterations)[source]

Advance the ensemble for a number of samples.

Parameters:: niterations (int) – Number of samples to get from sampler.

property samples

A dict mapping variable_params to arrays of samples currently in memory.

The arrays have shape ntemps x nwalkers x niterations.

set_state_from_file(filename)[source]: Sets the state of the sampler back to the instance saved in a file.

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.epsie module

This module provides classes for interacting with epsie samplers.

class pycbc.inference.sampler.epsie.EpsieSampler(model, nchains, ntemps=None, betas=None, proposals=None, default_proposal=None, default_proposal_args=None, seed=None, swap_interval=1, checkpoint_interval=None, checkpoint_signal=None, loglikelihood_function=None, nprocesses=1, use_mpi=False)[source]

Bases: MultiTemperedSupport, BaseMCMC, BaseSampler

Constructs an MCMC sampler using epsie’s parallel-tempered sampler.

Parameters:

model (model) – A model from pycbc.inference.models.
nchains (int) – Number of chains to use in the sampler.
ntemps (int, optional) – Number of temperatures to use in the sampler. A geometrically-spaced temperature ladder with the given number of levels will be constructed based on the number of parameters. If not provided, must provide betas.
betas (array, optional) – An array of inverse temperature values to be used for the temperature ladder. If not provided, must provide ntemps.
proposals (list, optional) – List of proposals to use. Any parameters that do not have a proposal provided will use the default_proposal. Note: proposals should be specified for the sampling parameters, not the variable parameters.
default_proposal (an epsie.Proposal class, optional) – The default proposal to use for parameters not in proposals. Default is epsie.proposals.Normal.
default_proposal_args (dict, optional) – Dictionary of arguments to pass to the default proposal.
swap_interval (int, optional) – The number of iterations between temperature swaps. Default is 1.
seed (int, optional) – Seed for epsie’s random number generator. If None provided, will create one.
checkpoint_interval (int, optional) – Specify the number of iterations to do between checkpoints. If not provided, no checkpointing will be done.
checkpoint_signal (str, optional) – Set the signal to use when checkpointing. For example, ‘USR2’.
loglikelihood_function (str, optional) – Set the function to call from the model for the loglikelihood. Default is loglikelihood.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1 (no parallelization).
use_mpi (bool, optional) – Use MPI for parallelization. Default (False) will use python’s multiprocessing.
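For context on swap_interval, the sketch below shows the textbook Metropolis rule for swapping the positions of two chains at different inverse temperatures, together with a check for whether a given iteration is a swap iteration. This is the standard parallel-tempering rule written with hypothetical function names; it is not taken from epsie's source, which may differ in detail.

```python
def swap_log_acceptance(beta_i, beta_j, logl_i, logl_j):
    """Log probability of accepting a swap between chains i and j."""
    return min(0.0, (beta_i - beta_j) * (logl_j - logl_i))


def is_swap_iteration(iteration, swap_interval=1):
    """True if a temperature swap would be proposed at this iteration."""
    return iteration % swap_interval == 0


# The hotter chain (beta=0.5) found a higher likelihood: always accepted.
print(swap_log_acceptance(1.0, 0.5, -10.0, -8.0))
# The hotter chain has a lower likelihood: accepted with probability exp(-1).
print(swap_log_acceptance(1.0, 0.5, -8.0, -10.0))
```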

property acl: The autocorrelation lengths of the chains.

property base_shape

What shape the sampler’s samples arrays are in, excluding the iterations dimension.

For example, if a sampler uses 20 chains and 3 temperatures, this would be (3, 20). If a sampler only uses a single walker and no temperatures this would be ().

property betas: The inverse temperatures being used.

burn_in_class: alias of MultiTemperedMCMCBurnInTests

clear_samples()[source]: Clears the chain and blobs from memory.

static compute_acf(filename, **kwargs)[source]

Computes the autocorrelation function.

Calls base_multitemper.compute_acf(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
**kwargs – All other keyword arguments are passed to base_multitemper.compute_acf().

Returns:

Dictionary of arrays giving the ACFs for each parameter. The arrays will have shape ntemps x nchains x niterations.

Return type:

dict

static compute_acl(filename, **kwargs)[source]

Computes the autocorrelation length.

Calls base_multitemper.compute_acl(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
**kwargs – All other keyword arguments are passed to base_multitemper.compute_acl().

Returns:

A dictionary of ntemps-long arrays of the ACLs of each parameter.

Return type:

dict

property effective_nsamples: The effective number of samples post burn-in that the sampler has acquired so far.

finalize()[source]: Do any finalization to the samples file before exiting.

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]

Loads the sampler from the given config file.

The following options are retrieved in the [sampler] section:

name :
(required) must match the sampler’s name
nchains :
(required) the number of chains to use
ntemps :
The number of temperatures to use. Either this, or inverse-temperatures-file must be provided (but not both).
inverse-temperatures-file :
Path to an hdf file containing the inverse temperatures (“betas”) to use. The betas will be retrieved from the file’s .attrs['betas']. Either this or ntemps must be provided (but not both).
niterations :
The number of iterations to run the sampler for. Either this or effective-nsamples must be provided (but not both).
effective-nsamples :
Run the sampler until the given number of effective samples are obtained. A checkpoint-interval must also be provided in this case. Either this or niterations must be provided (but not both).
thin-interval :
Thin the samples by the given value before saving to disk. May provide this, or max-samples-per-chain, but not both. If neither option is provided, will save all samples.
max-samples-per-chain :
Thin the samples such that the number of samples per chain per temperature that are saved to disk never exceeds the given value. May provide this, or thin-interval, but not both. If neither option is provided, will save all samples.
checkpoint-interval :
Sets the checkpoint interval to use. Must be provided if using effective-nsamples.
checkpoint-signal :
Set the checkpoint signal, e.g., “USR2”. Optional.
seed :
The seed to use for epsie’s random number generator. If not provided, epsie will create one.
logl-function :
The attribute of the model to use for the loglikelihood. If not provided, will default to loglikelihood.
swap-interval :
The number of iterations between temperature swaps. Default is 1.

Jump proposals must be provided for every sampling parameter. These are retrieved from subsections [jump_proposal-{params}], where params is a pycbc.VARARGS_DELIM separated list of parameters the proposal should be used for. See inference.jump.epsie_proposals_from_config() for details.

Note

Jump proposals should be specified for sampling parameters, not variable parameters.

Settings for burn-in tests are read from [sampler-burn_in]. In particular, the burn-in-test option is used to set the burn in tests to perform. See MultiTemperedMCMCBurnInTests.from_config() for details. If no burn-in-test is provided, no burn in tests will be carried out.

Parameters:

cp (WorkflowConfigParser instance) – Config file object to parse.
model (pycbc.inference.model.BaseModel instance) – The model to use.
output_file (str, optional) – The name of the output file to checkpoint and write results to.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1.
use_mpi (bool, optional) – Use MPI for parallelization. Default is False.

Returns:

The sampler instance.

Return type:

EpsieSampler

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property model_stats

A dict mapping the model’s default_stats to arrays of values.

The arrays have shape ntemps x nchains x niterations.

name = 'epsie'

property pos: A dictionary of the current chain positions.

run_mcmc(niterations)[source]

Advance the chains for a number of iterations.

Parameters:: niterations (int) – Number of samples to get from sampler.

property samples

A dict mapping variable_params to arrays of samples currently in memory.

The arrays have shape ntemps x nchains x niterations.

The dictionary also contains sampling parameters.

property seed

The seed used for epsie’s random bit generator.

This is not the same as the seed used for the prior distributions.

set_p0(samples_file=None, prior=None)[source]

Sets the initial position of the chains.

Parameters:

samples_file (InferenceFile, optional) – If provided, use the last iteration in the given file for the starting positions.
prior (JointDistribution, optional) – Use the given prior to set the initial positions rather than model’s prior.

Returns:

p0 – A dictionary mapping sampling params to the starting positions.

Return type:

dict

set_state_from_file(filename)[source]: Sets the state of the sampler back to the instance saved in a file.

property swap_interval: Number of iterations between temperature swaps.

write_results(filename)[source]

Writes samples, model stats, acceptance ratios, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.games module

Direct Monte Carlo sampling using pregenerated mapping files that encode the intrinsic parameter space.

class pycbc.inference.sampler.games.GameSampler(model, *args, nprocesses=1, use_mpi=False, mapfile=None, loglr_region=25, target_likelihood_calls=100000.0, rounds=1, **kwargs)[source]

Bases: DummySampler

Direct importance sampling using a preconstructed parameter space mapping file.

Parameters:

model (Model) – An instance of a model from pycbc.inference.models.
mapfile (str) – Path to the pre-generated file containing the pre-mapped prior volume
loglr_region (int) – Only use regions from the prior volume tiling that are within this loglr difference of the maximum tile.
target_likelihood_calls (int) – Try to use this many likelihood calls in each round of the analysis.
rounds (int) – The number of iterations to use before terminating.
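The loglr_region cut described above can be pictured as a simple filter over tiles. The sketch below keeps the indices of tiles whose loglr lies within loglr_region of the best tile. The function name, the input layout, and the inclusive comparison are assumptions; GameSampler's actual selection logic is not shown in this documentation.

```python
def select_tiles(tile_loglr, loglr_region=25):
    """Indices of tiles within loglr_region of the maximum tile loglr."""
    peak = max(tile_loglr)
    return [i for i, value in enumerate(tile_loglr)
            if peak - value <= loglr_region]


# Hypothetical per-tile loglr values; the third tile is 30 below the peak.
print(select_tiles([100.0, 80.0, 70.0, 99.0], loglr_region=25))
```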

draw_samples_from_bin(i, size)[source]: Get samples from the binned prior space

name = 'games'

run()[source]: Produce posterior samples

sample_round(bin_weight, node_idx, lengths)[source]

Sample from the posterior using pre-binned sets of points and the weighting factor of each bin.

Parameters:

bin_weight (Array) – The weighting importance factor of each bin of the prior space.
node_idx (Array) – The set of ids into the prebinned prior volume to use. This should map to the given weights.
lengths (Array) – The size of each bin, used to self-normalize.

exception pycbc.inference.sampler.games.OutOfSamples[source]

Bases: Exception

Exception if we ran out of samples

pycbc.inference.sampler.games.call_likelihood(params)[source]: Accessor to update the global model

pycbc.inference.sampler.multinest module

This module provides classes and functions for using the Multinest sampler package for parameter estimation.

class pycbc.inference.sampler.multinest.MultinestSampler(model, nlivepoints, checkpoint_interval=1000, importance_nested_sampling=False, evidence_tolerance=0.1, sampling_efficiency=0.01, constraints=None)[source]

Bases: BaseSampler

This class is used to construct a nested sampler from the Multinest package.

Parameters:

model (model) – A model from pycbc.inference.models.
nlivepoints (int) – Number of live points to use in sampler.

check_if_finished()[source]: Estimate remaining evidence to see if desired evidence-tolerance stopping criterion has been reached.
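A common way to phrase an evidence-tolerance stopping rule in nested sampling is to bound the evidence still held by the live points and stop once adding it would change log Z by less than the tolerance. The sketch below implements that textbook estimate with hypothetical names; it assumes the prior volume shrinks as exp(-niterations / nlivepoints) and is not the code inside check_if_finished().

```python
import math


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def remaining_logz_gain(logz, max_live_logl, niterations, nlivepoints):
    """Upper-bound estimate of how much log Z could still grow."""
    log_volume = -niterations / nlivepoints  # assumed prior-volume shrinkage
    logz_remaining = max_live_logl + log_volume
    return logaddexp(logz, logz_remaining) - logz


def is_finished(logz, max_live_logl, niterations, nlivepoints,
                evidence_tolerance=0.1):
    gain = remaining_logz_gain(logz, max_live_logl, niterations, nlivepoints)
    return gain < evidence_tolerance


print(is_finished(0.0, 0.0, niterations=1000, nlivepoints=100))
print(is_finished(0.0, 0.0, niterations=0, nlivepoints=100))
```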

checkpoint()[source]: Dumps current samples to the checkpoint file.

property checkpoint_interval: Get the number of iterations between checkpoints.

property dlogz: Get the current error estimate of the log evidence.

finalize()[source]: All data is written by the last checkpoint in the run method, so this just passes.

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]: Loads the sampler from the given config file.

get_posterior_samples()[source]: Read posterior samples from ASCII output file created by multinest.

property importance_dlogz: Get the current error estimate of the importance weighted log evidence.

property importance_logz: Get the current importance weighted estimate of the log evidence.

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

loglikelihood(cube, *extra_args)[source]: Log likelihood evaluator that gets passed to multinest.

property logz: Get the current estimate of the log evidence.

property model_stats: A dict mapping the model’s default_stats to arrays of values.

name = 'multinest'

property niterations: Get the current number of iterations.

property nlivepoints: Get the number of live points used in sampling.

resume_from_checkpoint()[source]: Resume sampler from checkpoint

run()[source]: Runs the sampler until the specified evidence tolerance is reached.

property samples: A dict mapping variable_params to arrays of samples currently in memory.

set_initial_conditions(initial_distribution=None, samples_file=None)[source]

Sets the initial starting point for the sampler.

If a starting samples file is provided, will also load the random state from it.

set_state_from_file(filename)[source]: Sets the state of the sampler back to the instance saved in a file.

setup_output(output_file)[source]

Sets up the sampler’s checkpoint and output files.

The checkpoint file has the same name as the output file, but with .checkpoint appended to the name. A backup file will also be created.

Parameters:

sampler (sampler instance) – Sampler
output_file (str) – Name of the output file.

transform_prior(cube, *extra_args)[source]: Transforms the unit hypercube that multinest makes its draws from, into the prior space defined in the config file.
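As an illustration of mapping the unit hypercube to a prior space, the sketch below handles the simplest case of independent uniform priors, where each coordinate is rescaled to its bounds. The bounds and function names are hypothetical; transform_prior itself works with whatever prior is defined in the config file, which need not be uniform.

```python
def uniform_from_unit(u, lower, upper):
    """Map u in [0, 1] to a uniform prior on [lower, upper]."""
    return lower + u * (upper - lower)


def transform_cube(cube, bounds):
    """Rescale each unit-cube coordinate to its (lower, upper) bounds."""
    return [uniform_from_unit(u, lo, hi) for u, (lo, hi) in zip(cube, bounds)]


# Hypothetical bounds for three parameters.
print(transform_cube([0.0, 0.5, 1.0], [(10, 20), (-1, 1), (0, 3.0)]))
```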

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.nessai module

This module provides a class for using the nessai sampler package for parameter estimation.

Documentation for nessai: https://nessai.readthedocs.io/en/latest/

class pycbc.inference.sampler.nessai.NessaiModel(model, loglikelihood_function=None)[source]

Bases: Model

Wrapper for PyCBC Inference model class for use with nessai.

Parameters:

model (inference.BaseModel instance) – A model instance from PyCBC.
loglikelihood_function (str) – Name of the log-likelihood method to call.

from_unit_hypercube(x)[source]: Map from the unit-hypercube to the prior.

log_likelihood(x)[source]: Compute the log-likelihood

log_prior(x)[source]: Compute the log-prior

new_point(N=1)[source]: Draw a new point

new_point_log_prob(x)[source]: Log-probability for the new_point method

to_dict(x)[source]: Convert a nessai live point array to a dictionary

to_live_points(x)[source]: Convert to the structured arrays used by nessai

to_unit_hypercube(x)[source]: Map from the prior to the unit-hypercube.

class pycbc.inference.sampler.nessai.NessaiSampler(model, loglikelihood_function, nlive=1000, nprocesses=1, use_mpi=False, run_kwds=None, extra_kwds=None)[source]

Bases: BaseSampler

Class to construct a FlowSampler from the nessai package.

checkpoint()[source]: Checkpoint the sampler

checkpoint_callback(state)[source]

Callback for checkpointing.

This will be called periodically by nessai.

finalize()[source]: Finalize sampling

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]: Loads the sampler from the given config file.

static get_default_kwds(importance_nested_sampler=False)[source]

Return lists of all allowed keyword arguments for nessai.

Returns:

default_kwds (list) – List of keyword arguments that can be passed to FlowSampler
run_kwds (list) – List of keyword arguments that can be passed to FlowSampler.run

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property model_stats

A dict mapping model’s metadata fields to arrays of values for each sample in raw_samples.

The arrays may have any shape, and may or may not be thinned.

name = 'nessai'

resume_from_checkpoint()[source]: Reads the resume data from the checkpoint file.

run(**kwargs)[source]: Run the sampler

property samples: The raw nested samples including the corresponding weights

set_initial_conditions(initial_distribution=None, samples_file=None)[source]

Sets up the starting point for the sampler.

This is not used for nessai.

write_results(filename)[source]

Write the results to a given file.

Writes the nested samples, log-evidence and log-evidence error.

pycbc.inference.sampler.ptemcee module

This module provides classes and functions for using the ptemcee sampler package for parameter estimation.

class pycbc.inference.sampler.ptemcee.PTEmceeSampler(model, nwalkers, ntemps=None, Tmax=None, betas=None, adaptive=False, adaptation_lag=None, adaptation_time=None, scale_factor=None, loglikelihood_function=None, checkpoint_interval=None, checkpoint_signal=None, nprocesses=1, use_mpi=False)[source]

Bases: EnsembleSupport, BaseMCMC, BaseSampler

This class is used to construct the parallel-tempered ptemcee sampler.

Parameters:

model (model) – A model from pycbc.inference.models.
nwalkers (int) – Number of walkers to use in sampler.
ntemps (int, optional) – Specify the number of temps to use. Either this, Tmax, or betas must be specified.
Tmax (float, optional) – Specify the maximum temperature to use. This may be used with ntemps; see ptemcee.make_ladder() for details. Either this, ntemps, or betas must be specified.
betas (list of float, optional) – Specify the betas to use. Must be provided if ntemps and Tmax are not given. Will override ntemps and Tmax if provided.
adaptive (bool, optional) – Whether or not to use adaptive temperature levels. Default is False.
adaptation_lag (int, optional) – Only used if adaptive is True; see ptemcee.Sampler for details. If not provided, will use ptemcee’s default.
adaptation_time (int, optional) – Only used if adaptive is True; see ptemcee.Sampler for details. If not provided, will use ptemcee’s default.
scale_factor (float, optional) – Scale factor used for the stretch proposal; see ptemcee.Sampler for details. If not provided, will use ptemcee’s default.
loglikelihood_function (str, optional) – Set the function to call from the model for the loglikelihood. Default is loglikelihood.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1 (no parallelization).
use_mpi (bool, optional) – Use MPI for parallelization. Default (False) will use python’s multiprocessing.

property adaptation_lag: The adaptation lag for the beta evolution.

property adaptation_time: The adaptation time for the beta evolution.

property adaptive: Whether or not the betas are adapted.

property base_shape

What shape the sampler’s samples arrays are in, excluding the iterations dimension.

For example, if a sampler uses 20 chains and 3 temperatures, this would be (3, 20). If a sampler only uses a single walker and no temperatures this would be ().

property betas: Returns the beta history currently in memory.

burn_in_class: alias of EnsembleMultiTemperedMCMCBurnInTests

classmethod calculate_logevidence(filename, thin_start=None, thin_end=None, thin_interval=None)[source]

Calculates the log evidence from the given file. This uses ptemcee’s thermodynamic integration.

Parameters:

filename (str) – Name of the file to read the samples from. Should be a PTEmceeFile.
thin_start (int) – Index of the sample to begin returning stats. Default is to read stats after burn in. To start from the beginning set thin_start to 0.
thin_interval (int) – Interval to accept every i-th sample. Default is to use the fp.acl. If fp.acl is not set, then use all stats (set thin_interval to 1).
thin_end (int) – Index of the last sample to read. If not given then fp.niterations is used.

Returns:

lnZ (float) – The estimate of log of the evidence.
dlnZ (float) – The error on the estimate.

property chain: The current chain of samples in memory. The chain is returned as a ptemcee.chain.Chain instance. If no chain has been created yet (_chain is None), then will create a new chain using the current ensemble.

clear_samples()[source]: Clears the chain and blobs from memory.

static compute_acf(filename, **kwargs)[source]

Computes the autocorrelation function.

Calls base_multitemper.ensemble_compute_acf(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACFs for.
**kwargs – All other keyword arguments are passed to base_multitemper.ensemble_compute_acf().

Returns:

Dictionary of arrays giving the ACFs for each parameter. If per-walker=True is passed as a keyword argument, the arrays will have shape ntemps x nwalkers x niterations. Otherwise, the returned array will have shape ntemps x niterations.

Return type:

dict

static compute_acl(filename, **kwargs)[source]

Computes the autocorrelation length.

Calls base_multitemper.ensemble_compute_acl(); see that function for details.

Parameters:

filename (str) – Name of a samples file to compute ACLs for.
**kwargs – All other keyword arguments are passed to base_multitemper.ensemble_compute_acl().

Returns:

A dictionary of ntemps-long arrays of the ACLs of each parameter.

Return type:

dict

property ensemble

Returns the current ptemcee ensemble.

The ensemble stores the current locations and temperatures of the walkers. If the ensemble hasn’t been set up yet, will set one up using p0 for the positions. If set_p0 hasn’t been run yet, this will result in a ValueError.

finalize()[source]

Calculates the log evidence and writes to the checkpoint file.

If sampling transforms were used, this also corrects the jacobian stored on disk.

The thin start/interval/end for calculating the log evidence are retrieved from the checkpoint file’s thinning attributes.

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]

Loads the sampler from the given config file.

The following options are retrieved in the [sampler] section:

name = STR :
Required. This must match the sampler’s name.
nwalkers = INT :
Required. The number of walkers to use.
ntemps = INT :
The number of temperatures to use. This may be used in combination with Tmax. Either this, Tmax, betas or betas-file must be provided.
tmax = FLOAT :
The maximum temperature to use. This may be used in combination with ntemps, or alone.
betas = FLOAT1 FLOAT2 [...] :
Space-separated list of (initial) inverse temperatures (“betas”) to use. This sets both the number of temperatures and the tmax. A ValueError will be raised if both this and ntemps or Tmax are provided.
betas-file = STR :
Path to an hdf file containing the inverse temperatures (“betas”) to use. The betas will be retrieved from the file’s .attrs['betas']. A ValueError will be raised if both this and betas are provided.
adaptive = :
If provided, temperature adaptation will be turned on.
adaptation-lag = INT :
The adaptation lag to use (see ptemcee for details).
adaptation-time = INT :
The adaptation time to use (see ptemcee for details).
scale-factor = FLOAT :
The scale factor to use for the emcee stretch.
niterations = INT :
The number of iterations to run the sampler for. Either this or effective-nsamples must be provided (but not both).
effective-nsamples = INT :
Run the sampler until the given number of effective samples is obtained. A checkpoint-interval must also be provided in this case. Either this or niterations must be provided (but not both).
thin-interval = INT :
Thin the samples by the given value before saving to disk. Either this or max-samples-per-chain may be provided, but not both. If neither option is provided, all samples are saved.
max-samples-per-chain = INT :
Thin the samples such that the number of samples per chain per temperature that are saved to disk never exceeds the given value. Either this or thin-interval may be provided, but not both. If neither option is provided, all samples are saved.
checkpoint-interval = INT :
Sets the checkpoint interval to use. Must be provided if using effective-nsamples.
checkpoint-signal = STR :
Set the checkpoint signal, e.g., “USR2”. Optional.
logl-function = STR :
The attribute of the model to use for the loglikelihood. If not provided, will default to loglikelihood.
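
As a rough illustration of how these options fit together, the sketch below builds a [sampler] section with Python's standard configparser and checks the mutual-exclusion rules stated in the list above. The helpers check_sampler_options and betas_summary and all option values are hypothetical; this is not PyCBC's parsing code, which reads options through a WorkflowConfigParser.

```python
import configparser

# Illustrative values only; they are not recommended settings.
CONFIG_TEXT = """
[sampler]
name = ptemcee
nwalkers = 200
betas = 1.0 0.5 0.25
adaptive =
effective-nsamples = 1000
checkpoint-interval = 2000
max-samples-per-chain = 1000
"""


def check_sampler_options(section):
    """Hypothetical helper: apply the rules listed above to a section."""
    if "name" not in section or "nwalkers" not in section:
        raise ValueError("name and nwalkers are required")
    ladder_opts = ("ntemps", "tmax", "betas", "betas-file")
    if not any(opt in section for opt in ladder_opts):
        raise ValueError("provide ntemps, tmax, betas or betas-file")
    if "betas" in section and ("ntemps" in section or "tmax" in section):
        raise ValueError("betas cannot be combined with ntemps or tmax")
    if "betas" in section and "betas-file" in section:
        raise ValueError("provide betas or betas-file, not both")
    if ("niterations" in section) == ("effective-nsamples" in section):
        raise ValueError(
            "provide exactly one of niterations or effective-nsamples")
    if "effective-nsamples" in section and \
            "checkpoint-interval" not in section:
        raise ValueError("effective-nsamples requires checkpoint-interval")
    if "thin-interval" in section and "max-samples-per-chain" in section:
        raise ValueError(
            "provide thin-interval or max-samples-per-chain, not both")


def betas_summary(section):
    """Return (ntemps, tmax) implied by a space-separated betas option."""
    betas = [float(b) for b in section["betas"].split()]
    # beta = 1 / T, so the smallest beta gives the highest temperature
    return len(betas), 1.0 / min(betas)


cp = configparser.ConfigParser()
cp.read_string(CONFIG_TEXT)
sampler_section = cp["sampler"]
check_sampler_options(sampler_section)
ntemps, tmax = betas_summary(sampler_section)
print(ntemps, tmax)
```

Here the three betas imply three temperatures, with the smallest beta (0.25) corresponding to the maximum temperature.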

Settings for burn-in tests are read from [sampler-burn_in]. In particular, the burn-in-test option is used to set the burn-in tests to perform. See EnsembleMultiTemperedMCMCBurnInTests.from_config() for details. If no burn-in-test is provided, no burn-in tests will be carried out.

Parameters:

cp (WorkflowConfigParser instance) – Config file object to parse.
model (pycbc.inference.model.BaseModel instance) – The model to use.
output_file (str, optional) – The name of the output file to checkpoint and write results to.
nprocesses (int, optional) – The number of parallel processes to use. Default is 1.
use_mpi (bool, optional) – Use MPI for parallelization. Default is False.

Returns:

The sampler instance.

Return type:

EmceePTSampler

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property model_stats

Returns the log likelihood ratio and log prior as a dict of arrays.

The returned arrays have shape ntemps x nwalkers x niterations.

Unfortunately, because ptemcee does not have blob support, this will only return the loglikelihood and logprior (with the logjacobian set to zero) regardless of what stats the model can return.

Warning

Since the logjacobian is not saved by ptemcee, the logprior returned here is the log of the prior pdf in the sampling coordinate frame rather than the variable params frame. This differs from the variable params frame by the log of the Jacobian of the transform from one frame to the other. If no sampling transforms were used, then the logprior is the same.
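
The relation described in this warning can be illustrated with a one-dimensional toy case. The sketch below assumes a hypothetical variable parameter x with a uniform prior on [1, 10] and a hypothetical sampling parameter u = log(x); none of these names come from PyCBC, and the sign convention PyCBC uses for its stored logjacobian is not asserted here. The point is only that the two log priors differ by the log of the Jacobian, here log|dx/du|.

```python
import math

X_MIN, X_MAX = 1.0, 10.0  # hypothetical prior bounds on x


def logprior_x(x):
    """Log of a uniform prior pdf on [X_MIN, X_MAX] (variable params frame)."""
    if X_MIN <= x <= X_MAX:
        return -math.log(X_MAX - X_MIN)
    return -math.inf


def logprior_u(u):
    """Log prior pdf in the sampling frame, where u = log(x)."""
    x = math.exp(u)
    log_abs_dx_du = u  # dx/du = exp(u), so log|dx/du| = u
    return logprior_x(x) + log_abs_dx_du


def integrate(logpdf, lo, hi, n=20000):
    """Midpoint-rule integral of exp(logpdf) over [lo, hi]."""
    step = (hi - lo) / n
    total = sum(math.exp(logpdf(lo + (i + 0.5) * step)) for i in range(n))
    return total * step


# Each density should integrate to one over its own coordinate
norm_x = integrate(logprior_x, X_MIN, X_MAX)
norm_u = integrate(logprior_u, math.log(X_MIN), math.log(X_MAX))
print(norm_x, norm_u)
```

Dropping the log|dx/du| term from logprior_u would leave a function that no longer integrates to one over u, which is the sense in which the two frames give different log prior values at the same point.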

name = 'ptemcee'

property ntemps: The number of temperatures that are set.

run_mcmc(niterations)[source]

Advance the ensemble for a number of samples.

Parameters:: niterations (int) – Number of samples to get from sampler.

property samples: A dict mapping variable_params to arrays of samples currently in memory. The arrays have shape ntemps x nwalkers x niterations.

property scale_factor: The scale factor used by ptemcee.

set_state_from_file(filename)[source]: Sets the state of the sampler back to the instance saved in a file.

property starting_betas: Returns the betas that were used at startup.

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.refine module

Sampler that uses kde refinement of an existing posterior estimate.

class pycbc.inference.sampler.refine.RefineSampler(model, *args, nprocesses=1, use_mpi=False, num_samples=100000, iterative_kde_samples=1000, min_refinement_steps=5, max_refinement_steps=40, offbase_fraction=0.7, entropy=0.01, dlogz=0.01, kde=None, update_groups=None, max_kde_samples=50000, **kwargs)[source]

Bases: DummySampler

Sampler for kde-drawn refinement of an existing posterior estimate.

Parameters:

model (Model) – An instance of a model from pycbc.inference.models.
num_samples (int) – The number of samples to draw from the kde at the conclusion.
iterative_kde_samples (int) – The number of samples to add to the kde during each iteration.
min_refinement_steps (int) – The minimum number of iterations to take.
max_refinement_steps (int) – The maximum number of refinement steps to take.
entropy (float) – The target entropy between iterative kdes
dlogz (float) – The target evidence difference between iterative kde updates
kde (kde) – The initial kde to use.

static compare_kde(kde1, kde2, size=10000)[source]: Calculate information difference between two kde distributions

converged(step, kde_new, factor, logp)[source]: Check that kde is converged by comparing to previous iteration

draw_samples(size, update_params=None)[source]: Draw new samples within the model priors

classmethod from_config(cp, model, output_file=None, nprocesses=1, use_mpi=False)[source]: This should initialize the sampler given a config file.

name = 'refine'

run()[source]: Iterative sample from kde and update based on likelihood values

run_samples(ksamples, update_params=None, iteration=False)[source]: Calculate the likelihoods and weights for a set of samples

set_start_from_config(cp)[source]: Sets the initial state of the sampler from config file

pycbc.inference.sampler.refine.call_model(params)[source]

pycbc.inference.sampler.refine.resample_equal(samples, logwt, seed=0)[source]

pycbc.inference.sampler.snowline module

This module provides classes and functions for using the snowline sampler package for parameter estimation.

class pycbc.inference.sampler.snowline.SnowlineSampler(model, **kwargs)[source]

Bases: BaseSampler

This class is used to construct a Snowline sampler from the snowline package.

Parameters:: model (model) – A model from pycbc.inference.models

checkpoint()[source]: There is currently no checkpointing implemented

finalize()[source]: Do any finalization to the samples file before exiting.

classmethod from_config(cp, model, output_file=None, **kwds)[source]: Loads the sampler from the given config file.

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property logz: Return bayesian evidence estimated by snowline sampler.

property logz_err: Return error in bayesian evidence estimated by snowline sampler.

property model_stats

A dict mapping model’s metadata fields to arrays of values for each sample in raw_samples.

The arrays may have any shape, and may or may not be thinned.

name = 'snowline'

property niterations

resume_from_checkpoint()[source]: There is currently no checkpointing implemented

run()[source]

This function should run the sampler.

Any checkpointing should be done internally in this function.

property samples

A dict mapping variable_params to arrays of samples currently in memory. The dictionary may also contain sampling_params.

The sample arrays may have any shape, and may or may not be thinned.

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

pycbc.inference.sampler.ultranest module

This module provides classes and functions for using the ultranest sampler package for parameter estimation.

class pycbc.inference.sampler.ultranest.UltranestSampler(model, log_dir=None, stepsampling=False, enable_plots=False, **kwargs)[source]

Bases: BaseSampler

This class is used to construct an Ultranest sampler from the ultranest package.

Parameters:

model (model) – A model from pycbc.inference.models.
log_dir (str) – Folder where files should be stored for resuming (optional).
stepsampling (bool) – If false, uses rejection sampling. If true, uses a hit-and-run sampler, which scales better with dimensionality.

checkpoint()[source]: The sampler must have a checkpoint method for dumping raw samples and stats to the file type defined by io.

finalize()[source]: Do any finalization to the samples file before exiting.

classmethod from_config(cp, model, output_file=None, **kwds)[source]: Loads the sampler from the given config file.

property io

A class that inherits from BaseInferenceFile to handle IO with an hdf file.

This should be a class, not an instance of class, so that the sampler can initialize it when needed.

property logz: Return bayesian evidence estimated by ultranest sampler.

property logz_err: Return error in bayesian evidence estimated by ultranest sampler.

property model_stats

A dict mapping model’s metadata fields to arrays of values for each sample in raw_samples.

The arrays may have any shape, and may or may not be thinned.

name = 'ultranest'

property niterations

resume_from_checkpoint()[source]: Resume the sampler from the output file.

run()[source]

This function should run the sampler.

Any checkpointing should be done internally in this function.

property samples

A dict mapping variable_params to arrays of samples currently in memory. The dictionary may also contain sampling_params.

The sample arrays may have any shape, and may or may not be thinned.

write_results(filename)[source]

Writes samples, model stats, acceptance fraction, and random state to the given file.

Parameters:: filename (str) – The file to write to. The file is opened using the io class in an append state.

Module contents

This module provides a list of implemented samplers for parameter estimation.

pycbc.inference.sampler.load_from_config(cp, model, **kwargs)[source]

Loads a sampler from the given config file.

This looks for a name in the section [sampler] to determine which sampler class to load. That sampler’s from_config is then called.

Parameters:

cp (WorkflowConfigParser) – Config parser to read from.
model (pycbc.inference.model) – Which model to pass to the sampler.
**kwargs – All other keyword arguments are passed directly to the sampler’s from_config method.

Returns:

The initialized sampler.

Return type:

sampler
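
The name-based lookup described for load_from_config can be sketched as a registry keyed on each class's name attribute. Everything below is a stand-in written for illustration: HypotheticalSampler, the SAMPLERS dict, and load_from_config_sketch are assumed names rather than PyCBC objects, and the standard configparser takes the place of a WorkflowConfigParser.

```python
import configparser


class HypotheticalSampler:
    """Stand-in for a sampler class; not a PyCBC class."""
    name = "hypothetical"

    def __init__(self, model, nwalkers):
        self.model = model
        self.nwalkers = nwalkers

    @classmethod
    def from_config(cls, cp, model, **kwargs):
        # Each sampler reads its own options from the [sampler] section
        nwalkers = cp.getint("sampler", "nwalkers")
        return cls(model, nwalkers)


# Registry mapping each sampler's name attribute to its class
SAMPLERS = {cls.name: cls for cls in (HypotheticalSampler,)}


def load_from_config_sketch(cp, model, **kwargs):
    """Look up the [sampler] name and delegate to that class's from_config."""
    name = cp.get("sampler", "name")
    try:
        sampler_cls = SAMPLERS[name]
    except KeyError:
        raise ValueError(f"unknown sampler name: {name}") from None
    return sampler_cls.from_config(cp, model, **kwargs)


cp = configparser.ConfigParser()
cp.read_string("[sampler]\nname = hypothetical\nnwalkers = 100\n")
sampler = load_from_config_sketch(cp, model=None)
print(type(sampler).__name__, sampler.nwalkers)
```

Keeping the lookup separate from each class's from_config means a new sampler only needs a unique name and its own from_config to become loadable.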