pycbc.inference.io package

Submodules

pycbc.inference.io.base_hdf module

This module defines functions for reading and writing samples that the inference samplers generate.

class pycbc.inference.io.base_hdf.BaseInferenceFile(path, mode=None, **kwargs)[source]

Bases: File

Base class for all inference hdf files.

This is a subclass of the h5py.File object. It adds functions for handling reading and writing the samples from the samplers.

Parameters:
  • path (str) – The path to the HDF file.

  • mode ({None, str}) – The mode to open the file, e.g. “w” for write and “r” for read.

property cmd

Returns the (last) saved command line.

If the file was created from a run that resumed from a checkpoint, only the last command line used is returned.

Returns:

cmd – The command line that created this InferenceFile.

Return type:

string

config_group = 'config_file'
copy(other, ignore=None, parameters=None, parameter_names=None, read_args=None, write_args=None)[source]

Copies metadata, info, and samples in this file to another file.

Parameters:
  • other (str or InferenceFile) – The file to write to. May be either a string giving a filename, or an open hdf file. If the former, the file will be opened with the write attribute (note that if a file already exists with that name, it will be deleted).

  • ignore ((list of) strings) – Don’t copy the given groups. If the samples group is included, no samples will be copied.

  • parameters (list of str, optional) – List of parameters in the samples group to copy. If None, will copy all parameters.

  • parameter_names (dict, optional) – Rename one or more parameters to the given name. The dictionary should map parameter -> parameter name. If None, will just use the original parameter names.

  • read_args (dict, optional) – Arguments to pass to read_samples.

  • write_args (dict, optional) – Arguments to pass to write_samples.

Returns:

The open file handler to other.

Return type:

InferenceFile

copy_info(other, ignore=None)[source]

Copies “info” from this file to the other.

“Info” is defined as all groups that are not the samples group.

Parameters:
  • other (output file) – The output file. Must be an hdf file.

  • ignore ((list of) str) – Don’t copy the given groups.

copy_metadata(other)[source]

Copies all metadata from this file to the other file.

Metadata is defined as everything in the top-level .attrs.

Parameters:

other (InferenceFile) – An open inference file to write the data to.

copy_samples(other, parameters=None, parameter_names=None, read_args=None, write_args=None)[source]

Copies samples in this file to the other file.

Parameters:
  • other (InferenceFile) – An open inference file to write to.

  • parameters (list of str, optional) – List of parameters to copy. If None, will copy all parameters.

  • parameter_names (dict, optional) – Rename one or more parameters to the given name. The dictionary should map parameter -> parameter name. If None, will just use the original parameter names.

  • read_args (dict, optional) – Arguments to pass to read_samples.

  • write_args (dict, optional) – Arguments to pass to write_samples.

data_group = 'data'
property effective_nsamples

Returns the effective number of samples stored in the file.

static extra_args_parser(parser=None, skip_args=None, **kwargs)[source]

Provides a parser that can be used to parse sampler-specific command line options for loading samples.

This is optional. Inheriting classes may override this if they want to implement their own options.

Parameters:
  • parser (argparse.ArgumentParser, optional) – Instead of creating a parser, add arguments to the given one. If none provided, will create one.

  • skip_args (list, optional) – Don’t include the given options. Options should be given as the option string, minus the leading ‘--’. For example, skip_args=['iteration'] would cause the --iteration argument not to be included.

  • **kwargs – All other keyword arguments are passed to the parser that is created.

Returns:

  • parser (argparse.ArgumentParser or None) – If this class adds extra arguments, an argument parser with the extra arguments. Otherwise, will just return whatever was passed for the parser argument (default is None).

  • actions (list of argparse.Action) – List of the actions that were added.

static get_slice(thin_start=None, thin_interval=None, thin_end=None)[source]

Formats a slice to retrieve a thinned array from an HDF file.

Parameters:
  • thin_start (float or int, optional) – The starting index to use. If provided, it will be cast to an int.

  • thin_interval (float or int, optional) – The interval to use. If provided, the ceiling of it will be taken and cast to an int.

  • thin_end (float or int, optional) – The end index to use. If provided, it will be cast to an int.

Returns:

The slice needed.

Return type:

slice
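The slice construction described above can be sketched in plain Python. This is an assumption based on the parameter descriptions, not the actual implementation:

```python
import math

def get_slice(thin_start=None, thin_interval=None, thin_end=None):
    """Sketch of the documented behavior: start/end are cast to int,
    the interval is rounded up (ceiling) before being cast."""
    if thin_start is not None:
        thin_start = int(thin_start)
    if thin_interval is not None:
        thin_interval = int(math.ceil(thin_interval))
    if thin_end is not None:
        thin_end = int(thin_end)
    return slice(thin_start, thin_end, thin_interval)

# Applying the slice to a dataset of 10 iterations:
data = list(range(10))
print(data[get_slice(thin_start=2, thin_interval=2.5)])  # [2, 5, 8]
```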

getattrs(group=None, create_missing=True)[source]

Convenience function for getting the attrs from the file or group.

Parameters:
  • group (str, optional) – Get the attrs of the specified group. If None or /, will retrieve the file’s attrs.

  • create_missing (bool, optional) – If group is provided, but doesn’t yet exist in the file, create the group. Otherwise, a KeyError will be raised. Default is True.

Returns:

An attrs instance of the file or requested group.

Return type:

h5py.File.attrs

injections_group = 'injections'
property log_evidence

Returns the log of the evidence and its error, if they exist in the file. Raises a KeyError otherwise.

name = None
parse_parameters(parameters, array_class=None)[source]

Parses a parameters arg to figure out what fields need to be loaded.

Parameters:
  • parameters ((list of) strings) – The parameter(s) to retrieve. A parameter can be the name of any field in samples_group, a virtual field or method of FieldArray (as long as the file contains the necessary fields to derive the virtual field or method), and/or a function of these.

  • array_class (array class, optional) – The type of array to use to parse the parameters. The class must have a parse_parameters method. Default is to use a FieldArray.

Returns:

A list of strings giving the fields to load from the file.

Return type:

list
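The parsing step can be illustrated with Python's ast module. extract_fields below is a hypothetical, simplified stand-in for what parse_parameters does when deciding which datasets need to be loaded:

```python
import ast

def extract_fields(parameter, known_fields):
    """Hypothetical simplification of the parsing step: walk the
    expression's syntax tree and keep any bare name that matches a
    known sample field."""
    tree = ast.parse(parameter, mode='eval')
    names = {node.id for node in ast.walk(tree)
             if isinstance(node, ast.Name)}
    return names & set(known_fields)

fields = ['mass1', 'mass2', 'distance']
print(sorted(extract_fields('mass1 + mass2', fields)))  # ['mass1', 'mass2']
```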

read_config_file(return_cp=True, index=-1)[source]

Reads the config file that was used.

A ValueError is raised if no config files have been saved, or if the requested index is larger than the number of stored config files.

Parameters:
  • return_cp (bool, optional) – If true, returns the loaded config file as pycbc.workflow.configuration.WorkflowConfigParser type. Otherwise will return as string buffer. Default is True.

  • index (int, optional) – The config file to load. If write_config_file has been called multiple times (as would happen if restarting from a checkpoint), there will be config files stored. Default (-1) is to load the last saved file.

Returns:

The parsed config file.

Return type:

WorkflowConfigParser or StringIO

read_data(group=None)[source]

Loads the data stored in the file as a FrequencySeries.

Only works for models that store data as a frequency series in data/DET/stilde. A KeyError will be raised if the model used did not store data in that path.

Parameters:

group (str, optional) – Group that the data group is in. Default (None) is to look in the top-level.

Returns:

Dictionary of detector name -> FrequencySeries.

Return type:

dict

read_injections(group=None)[source]

Gets injection parameters.

Injections are retrieved from [{group}/]injections.

Parameters:

group (str, optional) – Group that the injections group is in. Default (None) is to look in the top-level.

Returns:

Array of the injection parameters.

Return type:

FieldArray

read_psds(group=None)[source]

Loads the PSDs stored in the file as a FrequencySeries.

Only works for models that store PSDs in data/DET/psds/0. A KeyError will be raised if the model used did not store PSDs in that path.

Parameters:

group (str, optional) – Group that the data group is in. Default (None) is to look in the top-level.

Returns:

Dictionary of detector name -> FrequencySeries.

Return type:

dict

read_random_state(group=None)[source]

Reads the state of the random number generator from the file.

Parameters:

group (str) – Name of group to read random state from.

Returns:

A tuple with 5 elements that can be passed to numpy.random.set_state.

Return type:

tuple
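The round-trip that read_random_state and write_random_state support can be demonstrated directly with numpy (the calls below are standard numpy.random API, not pycbc code):

```python
import numpy

# Save the generator state (a 5-element tuple), draw some numbers,
# restore the state, and draw again: the draws are identical.
state = numpy.random.get_state()
first = numpy.random.random(3)
numpy.random.set_state(state)       # restore the saved state
second = numpy.random.random(3)
print(numpy.array_equal(first, second))  # True
print(len(state))                        # 5
```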

abstract read_raw_samples(fields, **kwargs)[source]

Low level function for reading datasets in the samples group.

This should return a dictionary of numpy arrays.

read_samples(parameters, array_class=None, **kwargs)[source]

Reads samples for the given parameter(s).

The parameters can be the name of any dataset in samples_group, a virtual field or method of FieldArray (as long as the file contains the necessary fields to derive the virtual field or method), and/or any numpy function of these.

The parameters are parsed to figure out what datasets are needed. Only those datasets will be loaded, and will be the base-level fields of the returned FieldArray.

The static_params are also added as attributes of the returned FieldArray.

Parameters:
  • parameters ((list of) strings) – The parameter(s) to retrieve.

  • array_class (FieldArray-like class, optional) – The type of array to return. The class must have from_kwargs and parse_parameters methods. If None, will return a FieldArray.

  • **kwargs – All other keyword arguments are passed to read_raw_samples.

Returns:

The samples as a FieldArray.

Return type:

FieldArray

sampler_group = 'sampler_info'
samples_from_cli(opts, parameters=None, **kwargs)[source]

Reads samples from the given command-line options.

Parameters:
  • opts (argparse Namespace) – The options with the settings to use for loading samples (the sort of thing returned by ArgumentParser().parse_args).

  • parameters ((list of) str, optional) – A list of the parameters to load. If none provided, will try to get the parameters to load from opts.parameters.

  • **kwargs – All other keyword arguments are passed to read_samples. These will override any options with the same name.

Returns:

Array of the loaded samples.

Return type:

FieldArray

samples_group = 'samples'
property static_params

Returns a dictionary of the static_params. The keys are the argument names, values are the value they were set to.

property thin_end

The default end index to use when reading samples.

Unless overridden by sub-class attribute, just returns None.

property thin_interval

The default interval to use when reading samples.

Unless overridden by sub-class attribute, just returns 1.

property thin_start

The default start index to use when reading samples.

Unless overridden by sub-class attribute, just returns 0.

write_command_line()[source]

Writes command line to attributes.

The command line is written to the file’s attrs['cmd']. If this attribute already exists in the file (this can happen when resuming from a checkpoint), attrs['cmd'] will be a list storing the current command line and all previous command lines.
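A minimal sketch of the checkpoint-resume behavior, using a plain dict in place of the file's attrs (the append ordering is an assumption for illustration):

```python
def write_command_line(attrs, cmd):
    """Hypothetical sketch: each call appends the current command
    line, so the stored value accumulates across resumed runs and
    the last entry is the most recent command line."""
    attrs['cmd'] = list(attrs.get('cmd', [])) + [cmd]

attrs = {}
write_command_line(attrs, 'pycbc_inference --config-file run.ini')
write_command_line(attrs, 'pycbc_inference --config-file run.ini')  # resumed run
print(len(attrs['cmd']))  # 2
```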

write_config_file(cp)[source]

Writes the given config file parser.

File is stored as a pickled buffer array to config_parser/{index}, where {index} is an integer corresponding to the number of config files that have been saved. The first time a save is called, it is stored to 0, and incremented from there.

Parameters:

cp (ConfigParser) – Config parser to save.

write_data(name, data, path=None, append=False)[source]

Convenience function to write data.

Given data is written as a dataset with name in path. If the dataset or path do not exist yet, the dataset and path will be created.

Parameters:
  • name (str) – The name to associate with the data. This will be the dataset name (if data is array-like) or the key in the attrs.

  • data (array, dict, or atomic) – The data to write. If a dictionary, a subgroup will be created for each key, and the values written there. This will be done recursively until an array or atomic (e.g., float, int, str), is found. Otherwise, the data is written to the given name.

  • path (str, optional) – Write to the given path. Default (None) will write to the top level. If the path does not exist in the file, it will be created.

  • append (bool, optional) – Append the data to what is currently in the file if path/name already exists in the file, and if it does not, create the dataset so that its last dimension can be resized. The data can only be appended along the last dimension, and if it already exists in the data, it must be resizable along this dimension. If False (the default) what is in the file will be overwritten, and the given data must have the same shape.
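The recursive handling of dict-valued data can be sketched with a flat dict standing in for the HDF file. write_data here is a hypothetical simplification, mapping 'path/name' keys to values:

```python
def write_data(store, name, data, path=None):
    """Hypothetical sketch of the recursive write: dicts recurse
    into subgroups (one per key); anything else (arrays, atomics)
    is written directly at path/name."""
    key = name if path is None else path + '/' + name
    if isinstance(data, dict):
        for subkey, val in data.items():
            write_data(store, subkey, val, path=key)
    else:
        store[key] = data

store = {}
write_data(store, 'psds', {'H1': [1.0, 2.0], 'L1': [3.0]})
print(store)  # {'psds/H1': [1.0, 2.0], 'psds/L1': [3.0]}
```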

write_effective_nsamples(effective_nsamples)[source]

Writes the effective number of samples stored in the file.

write_injections(injection_file, group=None)[source]

Writes injection parameters from the given injection file.

Everything in the injection file is copied to [{group}/]injections_group, where {group} is the optional keyword argument.

Parameters:
  • injection_file (str) – Path to HDF injection file.

  • group (str, optional) – Specify a top-level group to write the injections group to. If None (the default), injections group will be written to the file’s top level.

classmethod write_kwargs_to_attrs(attrs, **kwargs)[source]

Writes the given keywords to the given attrs.

If any keyword argument points to a dict, the keyword will point to a list of the dict’s keys. Each key is then written to the attrs with its corresponding value.

Parameters:
  • attrs (an HDF attrs) – The attrs of an hdf file or a group in an hdf file.

  • **kwargs – The keywords to write.
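A sketch of the dict-flattening behavior described above, using a plain dict in place of the HDF attrs:

```python
def write_kwargs_to_attrs(attrs, **kwargs):
    """Sketch of the documented behavior: a dict-valued keyword is
    stored as the list of its keys, and each key/value pair is then
    written to the attrs in turn."""
    for key, val in kwargs.items():
        if isinstance(val, dict):
            attrs[key] = list(val.keys())
            write_kwargs_to_attrs(attrs, **val)
        else:
            attrs[key] = val

attrs = {}
write_kwargs_to_attrs(attrs, static_params={'f_lower': 20.0})
print(attrs)  # {'static_params': ['f_lower'], 'f_lower': 20.0}
```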

write_logevidence(lnz, dlnz)[source]

Writes the given log evidence and its error.

Results are saved to file’s ‘log_evidence’ and ‘dlog_evidence’ attributes.

Parameters:
  • lnz (float) – The log of the evidence.

  • dlnz (float) – The error in the estimate of the log evidence.

write_psd(psds, group=None)[source]

Writes PSD for each IFO to file.

PSDs are written to [{group}/]data/{detector}/psds/0, where {group} is the optional keyword argument.

Parameters:
  • psds (dict) – A dict of detector name -> FrequencySeries.

  • group (str, optional) – Specify a top-level group to write the data to. If None (the default), data will be written to the file’s top level.

write_random_state(group=None, state=None)[source]

Writes the state of the random number generator to the file.

The random state is written to sampler_group/random_state.

Parameters:
  • group (str) – Name of group to write random state to.

  • state (tuple, optional) – Specify the random state to write. If None, will use numpy.random.get_state().

abstract write_samples(samples, **kwargs)[source]

This should write all of the provided samples.

This function should be used to write both samples and model stats.

Parameters:
  • samples (dict) – Samples should be provided as a dictionary of numpy arrays.

  • **kwargs – Any other keyword args the sampler needs to write data.

write_stilde(stilde_dict, group=None)[source]

Writes stilde for each IFO to file.

Parameters:
  • stilde_dict (dict) – A dict of FrequencySeries where the key is the IFO.

  • group ({None, str}) – The group to write the stilde to. If None, will write to the top level.

write_strain(strain_dict, group=None)[source]

Writes strain for each IFO to file.

Parameters:
  • strain_dict (dict) – A dict of FrequencySeries where the key is the IFO.

  • group ({None, str}) – The group to write the strain to. If None, will write to the top level.

pycbc.inference.io.base_hdf.format_attr(val)[source]

Formats an attr so that it can be read in either python 2 or 3.

In python 2, strings that are saved as an attribute in an hdf file default to unicode. Since unicode was removed in python 3, if you load that file in a python 3 environment, the strings will be read as bytes instead, which causes a number of issues. This attempts to fix that. If the value is a bytes string, then it will be decoded into a string. If the value is a numpy array of byte strings, it will convert the array to a list of strings.

Parameters:

val (obj) – The value to format. Decoding will be applied to the value if needed.

Returns:

If val was a byte string, the value as a str. If the value was a numpy array of bytes_, the value as a list of str. Otherwise, just returns the value.

Return type:

obj

pycbc.inference.io.base_mcmc module

Provides I/O that is specific to MCMC samplers.

class pycbc.inference.io.base_mcmc.CommonMCMCMetadataIO[source]

Bases: object

Provides functions for reading/writing MCMC metadata to file.

The functions here are common to both standard MCMC (in which chains are independent) and ensemble MCMC (in which chains/walkers share information).

property acl

The autocorrelation length (ACL) of the samples.

This is the autocorrelation time (ACT) divided by the file’s thinned_by attribute. Raises a ValueError if the ACT has not been calculated.

property act

The autocorrelation time (ACT).

This is the ACL multiplied by the file’s thinned_by attribute. Raises a ValueError if the ACT has not been calculated.

property burn_in_index

Returns the burn in index.

This is the burn in iteration divided by the file’s thinned_by. Requires the class that this is used with has a burn_in_iteration attribute.

property burn_in_iteration

Returns the burn in iteration of all the chains.

Raises a ValueError if no burn in tests were done.

static extra_args_parser(parser=None, skip_args=None, **kwargs)[source]

Create a parser to parse sampler-specific arguments for loading samples.

Parameters:
  • parser (argparse.ArgumentParser, optional) – Instead of creating a parser, add arguments to the given one. If none provided, will create one.

  • skip_args (list, optional) – Don’t parse the given options. Options should be given as the option string, minus the leading ‘--’. For example, skip_args=['iteration'] would cause the --iteration argument not to be included.

  • **kwargs – All other keyword arguments are passed to the parser that is created.

Returns:

  • parser (argparse.ArgumentParser) – An argument parser with the extra arguments added.

  • actions (list of argparse.Action) – A list of the actions that were added.

property is_burned_in

Returns whether or not chains are burned in.

Raises a ValueError if no burn in tests were done.

iterations(parameter)[source]

Returns the iteration each sample occurred at.

last_iteration(parameter=None, group=None)[source]

Returns the iteration of the last sample of the given parameter.

Parameters:
  • parameter (str, optional) – The name of the parameter to get the last iteration for. If None provided, will just use the first parameter in group.

  • group (str, optional) – The name of the group to get the last iteration from. Default is the samples_group.

property nchains

Returns the number of chains used by the sampler.

Alias of nwalkers.

property niterations

Returns the number of iterations the sampler was run for.

property nwalkers

Returns the number of walkers used by the sampler.

Alias of nchains.

property raw_acls

Dictionary of parameter names -> raw autocorrelation length(s).

Depending on the sampler, the autocorrelation lengths may be floats, or [ntemps x] [nchains x] arrays.

The ACLs are the autocorrelation times (ACTs) divided by the file’s thinned_by attribute. Raises a ValueError if no raw ACTs have been set.

property raw_acts

Dictionary of parameter names -> raw autocorrelation time(s).

Depending on the sampler, the autocorrelation times may be floats, or [ntemps x] [nchains x] arrays.

Raises a ValueError if no raw ACTs have been set.

thin(thin_interval)[source]

Thins the samples on disk to the given thinning interval.

The interval must be a multiple of the file’s current thinned_by.

Parameters:

thin_interval (int) – The interval the samples on disk should be thinned by.

property thinned_by

Returns the interval that samples have been thinned by on disk.

This looks for thinned_by in the samples group attrs. If none is found, will just return 1.

write_niterations(niterations)[source]

Writes the given number of iterations to the sampler group.

write_resume_point()[source]

Keeps a list of the number of iterations that were in a file when a run was resumed from a checkpoint.

write_sampler_metadata(sampler)[source]

Writes the sampler’s metadata.

class pycbc.inference.io.base_mcmc.EnsembleMCMCMetadataIO[source]

Bases: object

Provides functions for reading/writing metadata to file for ensemble MCMCs.

property thin_interval

Returns the default thin interval to use for reading samples.

If a finite ACL exists in the file, will return that. Otherwise, returns 1.

property thin_start

Returns the default thin start to use for reading samples.

If burn-in tests were done, returns the burn in index. Otherwise, returns 0.

class pycbc.inference.io.base_mcmc.MCMCMetadataIO[source]

Bases: object

Provides functions for reading/writing metadata to file for MCMCs in which all chains are independent of each other.

Overrides the BaseInference file’s thin_start and thin_interval attributes. Instead of integers, these return arrays.

property thin_interval

Returns the default thin interval to use for reading samples.

If a finite ACL exists in the file, will return that. Otherwise, returns 1.

property thin_start

Returns the default thin start to use for reading samples.

If burn-in tests were done, this will return the burn-in index of every chain that has burned in. The start index for chains that have not burned in will be greater than the number of samples, so that those chains return no samples. If no burn-in tests were done, returns 0 for all chains.
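The per-chain start index described above can be sketched with numpy (per_chain_thin_start is a hypothetical helper, not part of the API):

```python
import numpy

def per_chain_thin_start(burn_in_iteration, is_burned_in, niterations,
                         thinned_by=1):
    """Hypothetical sketch: burned-in chains start at their burn-in
    index (burn-in iteration divided by thinned_by); chains that
    never burned in get a start index past the end, so they yield
    no samples."""
    start = numpy.asarray(burn_in_iteration) // thinned_by
    return numpy.where(is_burned_in, start, niterations + 1)

print(per_chain_thin_start([100, 250, 0], [True, True, False],
                           niterations=500))  # [100 250 501]
```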

pycbc.inference.io.base_mcmc.ensemble_read_raw_samples(fp, fields, thin_start=None, thin_interval=None, thin_end=None, iteration=None, walkers=None, flatten=True, group=None)[source]

Base function for reading samples from ensemble MCMC files without parallel tempering.

Parameters:
  • fp (BaseInferenceFile) – Open file handler to write files to. Must be an instance of BaseInferenceFile with EnsembleMCMCMetadataIO methods added.

  • fields (list) – The list of field names to retrieve.

  • thin_start (int, optional) – Start reading from the given iteration. Default is to start from the first iteration.

  • thin_interval (int, optional) – Only read every thin_interval -th sample. Default is 1.

  • thin_end (int, optional) – Stop reading at the given iteration. Default is to end at the last iteration.

  • iteration (int, optional) – Only read the given iteration. If this is provided, it overrides the thin_(start|interval|end) options.

  • walkers ((list of) int, optional) – Only read from the given walkers. Default (None) is to read all.

  • flatten (bool, optional) – Flatten the samples to 1D arrays before returning. Otherwise, the returned arrays will have shape (requested walkers x requested iteration(s)). Default is True.

  • group (str, optional) – The name of the group to read sample datasets from. Default is the file’s samples_group.

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict

pycbc.inference.io.base_mcmc.nsamples_in_chain(start_iter, interval, niterations)[source]

Calculates the number of samples in an MCMC chain given a thinning start, end, and interval.

This function will work with either python scalars, or numpy arrays.

Parameters:
  • start_iter ((array of) int) – Start iteration. If negative, counts as the number of iterations before the end at which to start; otherwise, counts as the iteration to start at, from the beginning. If this is larger than niterations, will just return 0.

  • interval ((array of) int) – Thinning interval.

  • niterations ((array of) int) – The number of iterations.

Returns:

num_samples – The number of samples in a chain, >= 0.

Return type:

(array of) numpy.int
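The count can be sketched as follows. This is a plain-numpy re-creation of the formula described above, not the actual implementation:

```python
import numpy

def nsamples_in_chain(start_iter, interval, niterations):
    """Sketch of the count: negative starts are measured from the
    end; the result is the ceiling of the remaining iterations over
    the interval, floored at zero."""
    start_iter = numpy.asarray(start_iter)
    start = numpy.where(start_iter < 0, niterations + start_iter, start_iter)
    remaining = niterations - start
    return numpy.maximum(numpy.ceil(remaining / interval), 0).astype(int)

print(nsamples_in_chain(0, 10, 1000))     # 100
print(nsamples_in_chain(-100, 10, 1000))  # 10
print(nsamples_in_chain(2000, 10, 1000))  # 0
```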

pycbc.inference.io.base_mcmc.thin_samples_for_writing(fp, samples, parameters, last_iteration, group, thin_by=None)[source]

Thins samples for writing to disk.

The thinning interval to use is determined by the given file handler’s thinned_by attribute. If that attribute is 1, just returns the samples.

Parameters:
  • fp (CommonMCMCMetadataIO instance) – The file the samples will be written to. Needed to determine the thin interval used on disk.

  • samples (dict) – Dictionary mapping parameter names to arrays of (unthinned) samples. The arrays are thinned along their last dimension.

  • parameters (list of str) – The parameters to thin in samples before writing. All listed parameters must be in samples.

  • last_iteration (int) – The iteration that the last sample in samples occurred at. This is needed to figure out where to start the thinning in samples, such that the interval between the last sample on disk and the first new sample is the same as all of the other samples.

  • group (str) – The name of the group that the samples will be written to. This is needed to determine what the last iteration saved on disk was.

  • thin_by (int, optional) – Override the thinned_by attribute in the file with the given value. Only do this if you are thinning something other than inference samples!

Returns:

Dictionary of the thinned samples to write.

Return type:

dict
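The offset logic can be sketched as follows. thin_for_writing is a hypothetical simplification; the iteration bookkeeping is assumed as described in the parameter list:

```python
import numpy

def thin_for_writing(samples, last_iteration, last_on_disk, thinned_by):
    """Hypothetical sketch: keep the new samples whose iterations
    continue the on-disk spacing. samples has shape (..., n_new),
    covering iterations last_iteration - n_new + 1 .. last_iteration."""
    nsamples = samples.shape[-1]
    first_iter = last_iteration - nsamples + 1
    # First new iteration to keep, preserving the on-disk interval:
    next_iter = last_on_disk + thinned_by
    offset = next_iter - first_iter
    if offset >= nsamples:
        return samples[..., :0]  # nothing new falls on the grid
    return samples[..., max(offset, 0)::thinned_by]

x = numpy.arange(1, 11)  # samples from iterations 1..10
print(thin_for_writing(x, last_iteration=10, last_on_disk=4, thinned_by=2))
```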

pycbc.inference.io.base_mcmc.write_samples(fp, samples, parameters=None, last_iteration=None, samples_group=None, thin_by=None)[source]

Writes samples to the given file.

This works for both standard MCMC and ensemble MCMC samplers without parallel tempering.

Results are written to samples_group/{vararg}, where {vararg} is the name of a model parameter. The samples are written as an nwalkers x niterations array. If samples already exist, the new samples are appended to the current ones.

If the current samples on disk have been thinned (determined by the thinned_by attribute in the samples group), then the samples will be thinned by the same amount before being written. The thinning is started at the sample in samples that occurred at the iteration equal to the last iteration on disk plus the thinned_by interval. If this iteration is larger than the iteration of the last given sample, then none of the samples will be written.

Parameters:
  • fp (BaseInferenceFile) – Open file handler to write files to. Must be an instance of BaseInferenceFile with CommonMCMCMetadataIO methods added.

  • samples (dict) – The samples to write. Each array in the dictionary should have shape nwalkers x niterations.

  • parameters (list, optional) – Only write the specified parameters to the file. If None, will write all of the keys in the samples dict.

  • last_iteration (int, optional) – The iteration of the last sample. If the file’s thinned_by attribute is > 1, this is needed to determine where to start thinning the samples such that the interval between the last sample currently on disk and the first new sample is the same as all of the other samples.

  • samples_group (str, optional) – Which group to write the samples to. Default (None) will result in writing to “samples”.

  • thin_by (int, optional) – Override the thinned_by attribute in the file with the given value. Only set this if you are using this function to write something other than inference samples!

pycbc.inference.io.base_multitemper module

Provides I/O support for multi-tempered sampler.

class pycbc.inference.io.base_multitemper.CommonMultiTemperedMetadataIO[source]

Bases: CommonMCMCMetadataIO

Adds support for reading/writing multi-tempered metadata to CommonMCMCMetadataIO.

static extra_args_parser(parser=None, skip_args=None, **kwargs)[source]

Adds --temps to the MCMCIO parser.

property ntemps

Returns the number of temperatures used by the sampler.

write_sampler_metadata(sampler)[source]

Adds writing ntemps to file.

class pycbc.inference.io.base_multitemper.ParseTempsArg(type=<class 'str'>, **kwargs)[source]

Bases: Action

Argparse action that will parse temps argument.

If the provided argument is ‘all’, sets ‘all’ in the namespace dest. If a sequence of numbers is provided, converts those numbers to ints before saving to the namespace.
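The action's behavior can be re-created with a short argparse sketch (ParseTemps below is a hypothetical stand-in, not the actual class):

```python
import argparse

class ParseTemps(argparse.Action):
    """Hypothetical re-creation: 'all' passes through as the string
    'all'; anything else is converted to a list of ints."""
    def __call__(self, parser, namespace, values, option_string=None):
        if values == ['all']:
            setattr(namespace, self.dest, 'all')
        else:
            setattr(namespace, self.dest, [int(v) for v in values])

parser = argparse.ArgumentParser()
parser.add_argument('--temps', nargs='+', action=ParseTemps)
print(parser.parse_args(['--temps', '0', '1']).temps)  # [0, 1]
print(parser.parse_args(['--temps', 'all']).temps)     # all
```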

pycbc.inference.io.base_multitemper.ensemble_read_raw_samples(fp, fields, thin_start=None, thin_interval=None, thin_end=None, iteration=None, temps='all', walkers=None, flatten=True, group=None)[source]

Base function for reading samples from ensemble MCMC file with parallel tempering.

Parameters:
  • fp (BaseInferenceFile) – Open file handler to write files to. Must be an instance of BaseInferenceFile with CommonMultiTemperedMetadataIO methods added.

  • fields (list) – The list of field names to retrieve.

  • thin_start (int, optional) – Start reading from the given iteration. Default is to start from the first iteration.

  • thin_interval (int, optional) – Only read every thin_interval -th sample. Default is 1.

  • thin_end (int, optional) – Stop reading at the given iteration. Default is to end at the last iteration.

  • iteration (int, optional) – Only read the given iteration. If this provided, it overrides the thin_(start|interval|end) options.

  • temps ('all' or (list of) int, optional) – The temperature index (or list of indices) to retrieve. To retrieve all temperatures pass ‘all’, or a list of all of the temperatures. Default is ‘all’.

  • walkers ((list of) int, optional) – Only read from the given walkers. Default (None) is to read all.

  • flatten (bool, optional) – Flatten the samples to 1D arrays before returning. Otherwise, the returned arrays will have shape (requested temps x requested walkers x requested iteration(s)). Default is True.

  • group (str, optional) – The name of the group to read sample datasets from. Default is the file’s samples_group.

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict

pycbc.inference.io.base_multitemper.read_raw_samples(fp, fields, thin_start=None, thin_interval=None, thin_end=None, iteration=None, temps='all', chains=None, flatten=True, group=None)[source]

Base function for reading samples from a collection of independent MCMC chains file with parallel tempering.

This may collect differing numbers of samples from each chain, depending on the thinning settings for each chain. If not flattened, the returned array will have dimensions requested temps x requested chains x max samples, where max samples is the largest number of samples retrieved from a single chain. Chains that retrieve fewer samples will be padded with numpy.nan. If flattened, the NaNs are removed prior to returning.

Parameters:
  • fp (BaseInferenceFile) – Open file handler to read samples from. Must be an instance of BaseInferenceFile with CommonMultiTemperedMetadataIO methods added.

  • fields (list) – The list of field names to retrieve.

  • thin_start (array or int, optional) – Start reading from the given sample. May either provide an array indicating the start index for each chain, or an integer. If the former, the array must have the same length as the number of chains that will be retrieved. If the latter, the given value will be used for all chains. Default (None) is to use the file’s thin_start attribute.

  • thin_interval (array or int, optional) – Only read every thin_interval-th sample. May either provide an array indicating the interval to use for each chain, or an integer. If the former, the array must have the same length as the number of chains that will be retrieved. If the latter, the given value will be used for all chains. Default (None) is to use the file’s thin_interval attribute.

  • thin_end (array or int, optional) – Stop reading at the given sample index. May either provide an array indicating the end index to use for each chain, or an integer. If the former, the array must have the same length as the number of chains that will be retrieved. If the latter, the given value will be used for all chains. Default (None) is to use the the file’s thin_end attribute.

  • iteration (int, optional) – Only read the given iteration from all chains. If provided, it overrides the thin_(start|interval|end) options.

  • temps ('all' or (list of) int, optional) – The temperature index (or list of indices) to retrieve. To retrieve all temperatures pass ‘all’, or a list of all of the temperatures. Default is ‘all’.

  • chains ((list of) int, optional) – Only read from the given chains. Default is to read all.

  • flatten (bool, optional) – Remove NaNs and flatten the samples to 1D arrays before returning. Otherwise, the returned arrays will have shape (requested temps x requested chains x max requested iteration(s)), with chains that return fewer samples padded with NaNs. Default is True.

  • group (str, optional) – The name of the group to read sample datasets from. Default is the file’s samples_group.

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict
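The padding and flattening behavior described above can be sketched in pure numpy. This is a simplified illustration of the idea for a single temperature, not the pycbc implementation; the function name and chain lengths are hypothetical:

```python
import numpy as np

def pad_and_flatten(chains, flatten=True):
    """Mimic the NaN-padding described above for a single temperature.

    ``chains`` is a list of 1D arrays, one per chain, possibly of
    differing lengths after per-chain thinning.
    """
    maxlen = max(len(c) for c in chains)
    # pad each chain with NaN up to the length of the longest chain
    padded = np.full((len(chains), maxlen), np.nan)
    for i, c in enumerate(chains):
        padded[i, :len(c)] = c
    if flatten:
        # drop the NaNs and return a 1D array
        return padded[~np.isnan(padded)]
    return padded

# two chains that retrieved different numbers of samples
chains = [np.array([1., 2., 3.]), np.array([4., 5.])]
flat = pad_and_flatten(chains)                 # 1D, NaNs removed
arr = pad_and_flatten(chains, flatten=False)   # shape (2, 3), arr[1, 2] is NaN
```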

pycbc.inference.io.base_multitemper.write_samples(fp, samples, parameters=None, last_iteration=None, samples_group=None, thin_by=None)[source]

Writes samples to the given file.

This works both for standard MCMC and ensemble MCMC samplers with parallel tempering.

Results are written to samples_group/{vararg}, where {vararg} is the name of a model parameter. The samples are written as an ntemps x nwalkers x niterations array.

Parameters:
  • fp (BaseInferenceFile) – Open file handler to write files to. Must be an instance of BaseInferenceFile with CommonMultiTemperedMetadataIO methods added.

  • samples (dict) – The samples to write. Each array in the dictionary should have shape ntemps x nwalkers x niterations.

  • parameters (list, optional) – Only write the specified parameters to the file. If None, will write all of the keys in the samples dict.

  • last_iteration (int, optional) – The iteration of the last sample. If the file’s thinned_by attribute is > 1, this is needed to determine where to start thinning the samples to match what has already been stored on disk.

  • samples_group (str, optional) – Which group to write the samples to. Default (None) will result in writing to “samples”.

  • thin_by (int, optional) – Override the thinned_by attribute in the file with the given value. Only set this if you are using this function to write something other than inference samples!
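The interplay between last_iteration and the file’s thinned_by attribute can be sketched as follows. This is a simplified model of the thinning-on-write idea described above, not the actual implementation; the function name is illustrative:

```python
import numpy as np

def thin_new_samples(new_samples, last_iteration, thinned_by):
    """Select which new samples to keep so that the on-disk spacing of
    ``thinned_by`` iterations is preserved.

    ``new_samples`` has shape (..., niterations); ``last_iteration`` is
    the iteration number of the final sample in the batch.
    """
    niterations = new_samples.shape[-1]
    # iteration number corresponding to each new sample
    iterations = np.arange(last_iteration - niterations + 1,
                           last_iteration + 1)
    # keep only iterations that fall on the thinning grid
    keep = iterations % thinned_by == 0
    return new_samples[..., keep]

# 6 new samples ending at iteration 12, file thinned by 3:
# the samples span iterations 7..12, so only 9 and 12 are kept
samples = np.arange(6.).reshape(1, 1, 6)
kept = thin_new_samples(samples, last_iteration=12, thinned_by=3)
```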

pycbc.inference.io.base_nested_sampler module

Provides base IO classes for nested samplers.

class pycbc.inference.io.base_nested_sampler.BaseNestedSamplerFile(path, mode=None, **kwargs)[source]

Bases: BaseSamplerFile

Class to handle file IO for the nested samplers cpnest and dynesty.

name = 'base_nest_file'
read_raw_samples(fields, **kwargs)[source]

Low level function for reading datasets in the samples group.

This should return a dictionary of numpy arrays.

write_niterations(niterations)[source]

Writes the given number of iterations to the sampler group.

write_resume_point()[source]

Should write the point that a sampler starts up.

How the resume point is indexed is up to the sampler. For example, MCMC samplers use the number of iterations that are stored in the checkpoint file.

write_sampler_metadata(sampler)[source]

Writes the given sampler’s metadata to the file.

write_samples(samples, parameters=None)[source]

Writes samples to the given file.

Results are written to samples_group/{vararg}, where {vararg} is the name of a model params. The samples are written as an array of length niterations.

Parameters:
  • samples (dict) – The samples to write. Each array in the dictionary should have length niterations.

  • parameters (list, optional) – Only write the specified parameters to the file. If None, will write all of the keys in the samples dict.

pycbc.inference.io.base_sampler module

Provides abstract base class for all samplers.

class pycbc.inference.io.base_sampler.BaseSamplerFile(path, mode=None, **kwargs)[source]

Bases: BaseInferenceFile

Base HDF class for all samplers.

This adds abstract methods write_resume_point and write_sampler_metadata to BaseInferenceFile.

property run_end_time

The (UNIX) time pycbc inference finished.

property run_start_time

The (UNIX) time pycbc inference began running.

If the run resumed from a checkpoint, the time the last checkpoint started is reported.

update_checkpoint_history()[source]

Writes a copy of relevant metadata to the file’s checkpoint history.

All data are written to sampler_info/checkpoint_history. If the group does not exist yet, it will be created.

This function writes the current time and the time since the last checkpoint to the file. It will also call _update_sampler_history() to write sampler-specific history.

validate()[source]

Runs a validation test.

This checks that a samples group exists and that more than one sample is stored in it.

Returns:

Whether or not the file is valid as a checkpoint file.

Return type:

bool

abstract write_resume_point()[source]

Should write the point that a sampler starts up.

How the resume point is indexed is up to the sampler. For example, MCMC samplers use the number of iterations that are stored in the checkpoint file.

write_run_end_time()[source]

Writes the current (UNIX) time as the run_end_time attribute.

write_run_start_time()[source]

Writes the current (UNIX) time to the file.

Times are stored as a list in the file’s attrs, with name run_start_time. If the attribute already exists, the current time is appended. Otherwise, the attribute will be created and the time added.

abstract write_sampler_metadata(sampler)[source]

This should write the given sampler’s metadata to the file.

This should also include the model’s metadata.

pycbc.inference.io.dynesty module

Provides IO for the dynesty sampler.

class pycbc.inference.io.dynesty.CommonNestedMetadataIO[source]

Bases: object

Provides functions for reading/writing dynesty metadata to file.

static extra_args_parser(parser=None, skip_args=None, **kwargs)[source]

Create a parser to parse sampler-specific arguments for loading samples.

Parameters:
  • parser (argparse.ArgumentParser, optional) – Instead of creating a parser, add arguments to the given one. If none provided, will create one.

  • skip_args (list, optional) – Don’t parse the given options. Options should be given as the option string, minus the ‘–’. For example, skip_args=['iteration'] would cause the --iteration argument not to be included.

  • **kwargs – All other keyword arguments are passed to the parser that is created.

Returns:

  • parser (argparse.ArgumentParser) – An argument parser with the extra arguments added.

  • actions (list of argparse.Action) – A list of the actions that were added.

read_pickled_data_from_checkpoint_file()[source]

Loads the pickled sampler state from the checkpoint file.

validate()[source]

Runs a validation test. This checks that a samples group exists, and that pickled data can be loaded.

Returns:

Whether or not the file is valid as a checkpoint file.

Return type:

bool

write_pickled_data_into_checkpoint_file(state)[source]

Dumps the pickled sampler state into the checkpoint file.

write_raw_samples(data, parameters=None)[source]

Writes the nested samples to the file.

class pycbc.inference.io.dynesty.DynestyFile(path, mode=None, **kwargs)[source]

Bases: CommonNestedMetadataIO, BaseNestedSamplerFile

Class to handle file IO for the dynesty sampler.

name = 'dynesty_file'
read_raw_samples(fields, raw_samples=False, seed=0)[source]

Reads samples from a dynesty file and constructs a posterior.

Parameters:
  • fields (list of str) – The names of the parameters to load. Names must correspond to dataset names in the file’s samples group.

  • raw_samples (bool, optional) – Return the raw (unweighted) samples instead of the estimated posterior samples. Default is False.

  • seed (int, optional) – When extracting the posterior, samples are randomly shuffled. To make this reproducible, numpy’s random generator seed is set with the given value prior to the extraction. Default is 0.

Returns:

Dictionary of parameter names -> samples.

Return type:

dict
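The idea behind constructing a posterior from nested samples is to resample the raw samples according to their importance weights. A minimal sketch under the assumption that a logwt array of log importance weights is available; this is not the exact pycbc procedure, and the function name is illustrative:

```python
import numpy as np

def posterior_from_nested(samples, logwt, seed=0):
    """Draw posterior samples by weighted resampling of raw nested
    samples. ``samples`` is a dict of parameter -> 1D array; ``logwt``
    are the (unnormalized) log importance weights."""
    rng = np.random.default_rng(seed)
    # normalize the weights in a numerically stable way
    wt = np.exp(logwt - logwt.max())
    wt /= wt.sum()
    n = len(logwt)
    # resample indices with probability proportional to the weights
    idx = rng.choice(n, size=n, replace=True, p=wt)
    return {p: arr[idx] for p, arr in samples.items()}

# toy example: one parameter, weights heavily favoring the last sample
raw = {'x': np.array([0., 1., 2., 3.])}
logwt = np.array([-100., -100., -100., 0.])
post = posterior_from_nested(raw, logwt, seed=0)
```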

pycbc.inference.io.emcee module

Provides IO for the emcee sampler.

class pycbc.inference.io.emcee.EmceeFile(path, mode=None, **kwargs)[source]

Bases: EnsembleMCMCMetadataIO, CommonMCMCMetadataIO, BaseSamplerFile

Class to handle file IO for the emcee sampler.

name = 'emcee_file'
read_acceptance_fraction(walkers=None)[source]

Reads the acceptance fraction.

Parameters:

walkers ((list of) int, optional) – The walker index (or a list of indices) to retrieve. If None, samples from all walkers will be obtained.

Returns:

Array of acceptance fractions with shape (requested walkers,).

Return type:

array

read_raw_samples(fields, **kwargs)[source]

Base function for reading samples.

Calls base_mcmc.ensemble_read_raw_samples(). See that function for details.

Parameters:
  • fields (list) – The list of field names to retrieve.

  • **kwargs – All other keyword arguments are passed to base_mcmc.ensemble_read_raw_samples().

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict

write_acceptance_fraction(acceptance_fraction)[source]

Write acceptance_fraction data to file. Results are written to [sampler_group]/acceptance_fraction.

Parameters:

acceptance_fraction (numpy.ndarray) – Array of acceptance fractions to write.

write_samples(samples, **kwargs)[source]

Writes samples to the given file.

Calls base_mcmc.write_samples(). See that function for details.

Parameters:
  • samples (dict) – The samples to write. Each array in the dictionary should have shape nwalkers x niterations.

  • **kwargs – All other keyword arguments are passed to base_mcmc.write_samples().

pycbc.inference.io.emcee_pt module

Provides I/O support for emcee_pt.

class pycbc.inference.io.emcee_pt.EmceePTFile(path, mode=None, **kwargs)[source]

Bases: EnsembleMCMCMetadataIO, CommonMultiTemperedMetadataIO, BaseSamplerFile

Class to handle file IO for the emcee_pt sampler.

property betas

The betas that were used.

name = 'emcee_pt_file'
read_acceptance_fraction(temps=None, walkers=None)[source]

Reads the acceptance fraction.

Parameters:
  • temps ((list of) int, optional) – The temperature index (or a list of indices) to retrieve. If None, acfs from all temperatures and all walkers will be retrieved.

  • walkers ((list of) int, optional) – The walker index (or a list of indices) to retrieve. If None, samples from all walkers will be obtained.

Returns:

Array of acceptance fractions with shape (requested temps, requested walkers).

Return type:

array

read_raw_samples(fields, **kwargs)[source]

Base function for reading samples.

Calls base_multitemper.ensemble_read_raw_samples(). See that function for details.

Parameters:
  • fields (list) – The list of field names to retrieve.

  • **kwargs – All other keyword arguments are passed to base_multitemper.ensemble_read_raw_samples().

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict

write_acceptance_fraction(acceptance_fraction)[source]

Write acceptance_fraction data to file.

Results are written to [sampler_group]/acceptance_fraction; the resulting dataset has shape (ntemps, nwalkers).

Parameters:

acceptance_fraction (numpy.ndarray) – Array of acceptance fractions to write. Must have shape ntemps x nwalkers.

write_sampler_metadata(sampler)[source]

Adds writing betas to MultiTemperedMCMCIO.

write_samples(samples, **kwargs)[source]

Writes samples to the given file.

Calls base_multitemper.write_samples(). See that function for details.

Parameters:
  • samples (dict) – The samples to write. Each array in the dictionary should have shape ntemps x nwalkers x niterations.

  • **kwargs – All other keyword arguments are passed to base_multitemper.write_samples().

pycbc.inference.io.epsie module

This module provides IO classes for epsie samplers.

class pycbc.inference.io.epsie.EpsieFile(path, mode=None, **kwargs)[source]

Bases: MCMCMetadataIO, CommonMultiTemperedMetadataIO, BaseSamplerFile

Class to handle IO for Epsie’s parallel-tempered sampler.

property betas

The betas that were used.

name = 'epsie_file'
property nchains

Alias for nwalkers.

read_acceptance_fraction(temps=None, walkers=None)[source]

Alias for read_acceptance_rate().

read_acceptance_rate(temps=None, chains=None)[source]

Reads the acceptance rate.

This calls read_acceptance_ratio(), then averages the ratios over all iterations to get the average rate.

Parameters:
  • temps ((list of) int, optional) – The temperature index (or a list of indices) to retrieve. If None, acceptance rates from all temperatures and all chains will be retrieved.

  • chains ((list of) int, optional) – The chain index (or a list of indices) to retrieve. If None, rates from all chains will be obtained.

Returns:

Array of acceptance ratios with shape (requested temps, requested chains).

Return type:

array

read_acceptance_ratio(temps=None, chains=None)[source]

Reads the acceptance ratios.

Ratios larger than 1 are set back to 1 before returning.

Parameters:
  • temps ((list of) int, optional) – The temperature index (or a list of indices) to retrieve. If None, acceptance ratios from all temperatures and all chains will be retrieved.

  • chains ((list of) int, optional) – The chain index (or a list of indices) to retrieve. If None, ratios from all chains will be obtained.

Returns:

Array of acceptance ratios with shape (requested temps, requested chains, niterations).

Return type:

array
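The relationship between the acceptance ratios and the rate returned by read_acceptance_rate() can be sketched as follows. This illustrates the clipping and averaging described above; it is not the pycbc code itself:

```python
import numpy as np

def acceptance_rate(ratios):
    """Average acceptance ratios over iterations to get per-chain rates.

    ``ratios`` has shape (ntemps, nchains, niterations); ratios larger
    than 1 are set back to 1 first, as described above.
    """
    clipped = np.minimum(ratios, 1.0)
    # average over the iteration axis
    return clipped.mean(axis=-1)

ratios = np.array([[[0.5, 2.0, 1.0, 0.5]]])  # 1 temp, 1 chain, 4 iterations
rate = acceptance_rate(ratios)  # clipped to [0.5, 1, 1, 0.5], mean 0.75
```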

read_raw_samples(fields, **kwargs)[source]

Base function for reading samples.

Calls base_multitemper.read_raw_samples(). See that function for details.

Parameters:
  • fields (list) – The list of field names to retrieve.

  • **kwargs – All other keyword arguments are passed to base_multitemper.read_raw_samples().

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict

property seed

The sampler’s seed.

property swap_interval

The interval that temperature swaps occurred at.

thin(thin_interval)[source]

Thins the samples on disk to the given thinning interval.

Also thins the acceptance ratio and the temperature data, both of which are stored in the sampler_info group.

validate()[source]

Adds an attempt to load the checkpoint to the validation test.

write_acceptance_ratio(acceptance_ratio, last_iteration=None)[source]

Writes the acceptance ratios to the sampler info group.

Parameters:

acceptance_ratio (array) – The acceptance ratios to write. Should have shape ntemps x nchains x niterations.

write_sampler_metadata(sampler)[source]

Adds writing seed and betas to MultiTemperedMCMCIO.

write_samples(samples, **kwargs)[source]

Writes samples to the given file.

Calls base_multitemper.write_samples(). See that function for details.

Parameters:
  • samples (dict) – The samples to write. Each array in the dictionary should have shape ntemps x nwalkers x niterations.

  • **kwargs – All other keyword arguments are passed to base_multitemper.write_samples().

write_temperature_data(swap_index, acceptance_ratio, swap_interval, last_iteration)[source]

Writes temperature swaps and acceptance ratios.

Parameters:
  • swap_index (array) – The indices indicating which temperatures were swapped. Should have shape ntemps x nchains x (niterations/swap_interval).

  • acceptance_ratio (array) – The array of acceptance ratios between temperatures. Should have shape (ntemps-1) x nchains x (niterations/swap_interval).

  • swap_interval (int) – The number of iterations between temperature swaps.

  • last_iteration (int) – The iteration of the last sample.

pycbc.inference.io.multinest module

Provides I/O support for multinest.

class pycbc.inference.io.multinest.MultinestFile(path, mode=None, **kwargs)[source]

Bases: BaseSamplerFile

Class to handle file IO for the multinest sampler.

name = 'multinest_file'
property niterations

Returns the number of iterations the sampler was run for.

read_raw_samples(fields, iteration=None)[source]

Low level function for reading datasets in the samples group.

This should return a dictionary of numpy arrays.

write_logevidence(lnz, dlnz, importance_lnz, importance_dlnz)[source]

Writes the given log evidence and its error.

Results are saved to file’s ‘log_evidence’ and ‘dlog_evidence’ attributes, as well as the importance-weighted versions of these stats if they exist.

Parameters:
  • lnz (float) – The log of the evidence.

  • dlnz (float) – The error in the estimate of the log evidence.

  • importance_lnz (float, optional) – The importance-weighted log of the evidence.

  • importance_dlnz (float, optional) – The error in the importance-weighted estimate of the log evidence.

write_niterations(niterations)[source]

Writes the given number of iterations to the sampler group.

write_resume_point()[source]

Keeps a list of the number of iterations that were in a file when a run was resumed from a checkpoint.

write_sampler_metadata(sampler)[source]

Writes the sampler’s metadata.

write_samples(samples, parameters=None)[source]

Writes samples to the given file.

Results are written to samples_group/{vararg}, where {vararg} is the name of a model parameter. The samples are written as an array of length niterations.

Parameters:
  • samples (dict) – The samples to write. Each array in the dictionary should have length niterations.

  • parameters (list, optional) – Only write the specified parameters to the file. If None, will write all of the keys in the samples dict.

pycbc.inference.io.nessai module

Provides IO for the nessai sampler

class pycbc.inference.io.nessai.NessaiFile(path, mode=None, **kwargs)[source]

Bases: CommonNestedMetadataIO, BaseNestedSamplerFile

Class to handle file IO for the nessai sampler.

name = 'nessai_file'
read_raw_samples(fields, raw_samples=False, seed=0)[source]

Reads samples from a nessai file and constructs a posterior.

Uses rejection sampling to resample the nested samples.

Parameters:
  • fields (list of str) – The names of the parameters to load. Names must correspond to dataset names in the file’s samples group.

  • raw_samples (bool, optional) – Return the raw (unweighted) samples instead of the estimated posterior samples. Default is False.

Returns:

Dictionary of parameter fields -> samples.

Return type:

dict

pycbc.inference.io.posterior module

Provides simplified standard format just for posterior data

class pycbc.inference.io.posterior.PosteriorFile(path, mode=None, **kwargs)[source]

Bases: BaseInferenceFile

Class to handle file IO for the simplified Posterior file.

name = 'posterior_file'
read_raw_samples(fields, **kwargs)[source]

Low level function for reading datasets in the samples group.

This should return a dictionary of numpy arrays.

write_resume_point()[source]
write_run_end_time()
write_run_start_time()
write_sampler_metadata(sampler)[source]
write_samples(samples, parameters=None)[source]

This should write all of the provided samples.

This function should be used to write both samples and model stats.

Parameters:
  • samples (dict) – Samples should be provided as a dictionary of numpy arrays.

  • **kwargs – Any other keyword args the sampler needs to write data.

pycbc.inference.io.posterior.read_raw_samples_from_file(fp, fields, **kwargs)[source]
pycbc.inference.io.posterior.write_samples_to_file(fp, samples, parameters=None, group=None)[source]

Writes samples to the given file.

Results are written to samples_group/{vararg}, where {vararg} is the name of a model parameter. The samples are written as an array of length niterations.

Parameters:
  • fp (BaseInferenceFile) – Open file handler to write to; typically the self of a BaseInferenceFile instance.

  • samples (dict) – The samples to write. Each array in the dictionary should have length niterations.

  • parameters (list, optional) – Only write the specified parameters to the file. If None, will write all of the keys in the samples dict.

pycbc.inference.io.ptemcee module

Provides I/O support for ptemcee.

class pycbc.inference.io.ptemcee.PTEmceeFile(path, mode=None, **kwargs)[source]

Bases: EnsembleMCMCMetadataIO, CommonMultiTemperedMetadataIO, BaseSamplerFile

Class to handle file IO for the ptemcee sampler.

name = 'ptemcee_file'
read_betas(thin_start=None, thin_interval=None, thin_end=None, iteration=None)[source]

Reads betas from the file.

Parameters:
  • thin_start (int, optional) – Start reading from the given iteration. Default is to start from the first iteration.

  • thin_interval (int, optional) – Only read every thin_interval-th sample. Default is 1.

  • thin_end (int, optional) – Stop reading at the given iteration. Default is to end at the last iteration.

  • iteration (int, optional) – Only read the given iteration. If provided, it overrides the thin_(start|interval|end) options.

Returns:

An ntemps x niterations array of the betas.

Return type:

array
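The thin_(start|interval|end) options map onto standard slice semantics along the iteration axis. A sketch with a toy betas array; the function name is illustrative and this is not the pycbc implementation:

```python
import numpy as np

def thin_betas(betas, thin_start=None, thin_interval=None, thin_end=None,
               iteration=None):
    """Apply the thinning options along the iteration axis.

    ``betas`` has shape (ntemps, niterations). ``iteration``, if given,
    overrides the other options, as described above.
    """
    if iteration is not None:
        # read a single iteration from all temperatures
        return betas[:, iteration]
    return betas[:, slice(thin_start, thin_end, thin_interval)]

betas = np.arange(10.).reshape(2, 5)  # 2 temps, 5 iterations
thinned = thin_betas(betas, thin_start=1, thin_interval=2)  # columns 1 and 3
single = thin_betas(betas, iteration=3)                     # column 3 only
```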

read_ensemble_attrs()[source]

Reads ensemble attributes from the file.

Returns:

Dictionary of the ensemble attributes.

Return type:

dict

read_raw_samples(fields, **kwargs)[source]

Base function for reading samples.

Calls base_multitemper.ensemble_read_raw_samples(). See that function for details.

Parameters:
  • fields (list) – The list of field names to retrieve.

  • **kwargs – All other keyword arguments are passed to base_multitemper.ensemble_read_raw_samples().

Returns:

A dictionary of field name -> numpy array pairs.

Return type:

dict

property starting_betas

The starting betas that were used.

write_betas(betas, last_iteration=None)[source]

Writes the betas to sampler group.

As the betas may change with iterations, this writes the betas as a ntemps x niterations array to the file.

write_ensemble_attrs(ensemble)[source]

Writes ensemble attributes necessary to restart from checkpoint.

Parameters:

ensemble (ptemcee.Ensemble) – The ensemble to write attributes for.

write_sampler_metadata(sampler)[source]

Adds writing ptemcee-specific metadata to MultiTemperedMCMCIO.

write_samples(samples, **kwargs)[source]

Writes samples to the given file.

Calls base_multitemper.write_samples(). See that function for details.

Parameters:
  • samples (dict) – The samples to write. Each array in the dictionary should have shape ntemps x nwalkers x niterations.

  • **kwargs – All other keyword arguments are passed to base_multitemper.write_samples().

pycbc.inference.io.snowline module

Provides IO for the snowline sampler.

class pycbc.inference.io.snowline.SnowlineFile(path, mode=None, **kwargs)[source]

Bases: PosteriorFile

Class to handle file IO for the snowline sampler.

name = 'snowline_file'

pycbc.inference.io.txt module

This module defines functions for reading and writing samples that the inference samplers generate and that are stored in an ASCII TXT file.

class pycbc.inference.io.txt.InferenceTXTFile(path, mode=None, delimiter=None)[source]

Bases: object

A class that has extra functions for handling reading the samples from posterior-only TXT files.

Parameters:
  • path (str) – The path to the TXT file.

  • mode ({None, str}) – The mode to open the file. Only accepts “r” or “rb” for reading.

  • delimiter (str) – Delimiter to use for TXT file. Default is space-delimited.

comments = ''
delimiter = ' '
name = 'txt'
classmethod write(output_file, samples, labels, delimiter=None)[source]

Writes a text file with samples.

Parameters:
  • output_file (str) – The path of the file to write.

  • samples (FieldArray) – Samples to write to file.

  • labels (list) – A list of strings to include as header in TXT file.

  • delimiter (str) – Delimiter to use in TXT file.

pycbc.inference.io.ultranest module

Provides IO for the ultranest sampler.

class pycbc.inference.io.ultranest.UltranestFile(path, mode=None, **kwargs)[source]

Bases: BaseNestedSamplerFile

Class to handle file IO for the ultranest sampler.

name = 'ultranest_file'

Module contents

I/O utilities for pycbc inference

exception pycbc.inference.io.NoInputFileError[source]

Bases: Exception

Raised in custom argparse Actions by arguments needing input-files when no file(s) were provided.

class pycbc.inference.io.PrintFileParams(skip_args=None, nargs=0, **kwargs)[source]

Bases: Action

Argparse action that will load input files and print possible parameters to screen. Once this is done, the program is forced to exit immediately.

The behavior is similar to --help, except that the input file is read.

Note

The input_file attribute must be set in the parser namespace before this action is called. Otherwise, a NoInputFileError is raised.

class pycbc.inference.io.ResultsArgumentParser(skip_args=None, defaultparams=None, autoparamlabels=True, **kwargs)[source]

Bases: ArgumentParser

Wraps argument parser, and preloads arguments needed for loading samples from a file.

This parser class should be used by any program that wishes to use the standard arguments for loading samples. It provides functionality to parse file specific options. These file-specific arguments are not included in the standard --help (since they depend on what input files are given), but can be seen by running --file-help/-H. The --file-help will also print off information about what parameters may be used given the input files.

As with the standard ArgumentParser, running this class’s parse_args will result in an error if arguments are provided that are not recognized by the parser, nor by any of the file-specific arguments. For example, parse_args would work on the command --input-file results.hdf --walker 0 if results.hdf was created by a sampler that recognizes a --walker argument, but would raise an error if results.hdf was created by a sampler that does not recognize a --walker argument. The extra arguments that are recognized are determined by the sampler IO class’s extra_args_parser.

Some arguments may be excluded from the parser using the skip_args optional parameter.

Parameters:
  • skip_args (list of str, optional) – Do not add the given arguments to the parser. Arguments should be specified as the option string minus the leading ‘–’; e.g., skip_args=['thin-start'] would cause the thin-start argument to not be included. May also specify sampler-specific arguments. Note that input-file, file-help, and parameters are always added.

  • defaultparams ({'variable_params', 'all'}, optional) – If no --parameters provided, which collection of parameters to load. If ‘all’ will load all parameters in the file’s samples_group. If ‘variable_params’ or None (the default) will load the variable parameters.

  • autoparamlabels (bool, optional) – Passed to add_results_option_group; see that function for details.

  • **kwargs – All other keyword arguments are passed to argparse.ArgumentParser.

property actions

Exposes the actions this parser can do as a dictionary.

The dictionary maps the dest to actions.

add_results_option_group(autoparamlabels=True)[source]

Adds the options used to call pycbc.inference.io.results_from_cli function to the parser.

These are options related to loading the results from a run of pycbc_inference, for purposes of plotting and/or creating tables.

Any argument strings included in the skip_args attribute will not be added.

Parameters:

autoparamlabels (bool, optional) – If True, the --parameters option will use labels from waveform.parameters if a parameter name is the same as a parameter there. Otherwise, will just use whatever label is provided. Default is True.

parse_known_args(args=None, namespace=None)[source]

Parse args method to handle input-file dependent arguments.

pycbc.inference.io.check_integrity(filename)[source]

Checks the integrity of an InferenceFile.

Checks done are:

  • can the file open?

  • do all of the datasets in the samples group have the same shape?

  • can the first and last sample in all of the datasets in the samples group be read?

If any of these checks fail, an IOError is raised.

Parameters:

filename (str) – Name of an InferenceFile to check.

Raises:
  • ValueError – If the given file does not exist.

  • KeyError – If the samples group does not exist.

  • IOError – If any of the checks fail.

pycbc.inference.io.get_common_parameters(input_files, collection=None)[source]

Gets a list of variable params that are common across all input files.

If no common parameters are found, a ValueError is raised.

Parameters:
  • input_files (list of str) – List of input files to load.

  • collection (str, optional) – What group of parameters to load. Can be the name of a list of parameters stored in the files’ attrs (e.g., “variable_params”), or “all”. If “all”, will load all of the parameters in the files’ samples group. Default is to load all.

Returns:

List of the parameter names.

Return type:

list
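The common-parameter logic amounts to intersecting the parameter sets of each input file. A minimal sketch with toy parameter lists standing in for real files; the function name is illustrative:

```python
def common_parameters(param_lists):
    """Return the parameters common to all input files, raising
    ValueError if there are none, as described above."""
    common = set(param_lists[0])
    for params in param_lists[1:]:
        common &= set(params)
    if not common:
        raise ValueError("no common parameters found")
    return sorted(common)

# parameters stored in two hypothetical result files
file1 = ['mass1', 'mass2', 'distance']
file2 = ['mass1', 'mass2', 'inclination']
common = common_parameters([file1, file2])  # ['mass1', 'mass2']
```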

pycbc.inference.io.get_file_type(filename)[source]

Returns I/O object to use for file.

Parameters:

filename (str) – Name of file.

Returns:

file_type – The type of inference file object to use.

Return type:

{InferenceFile, InferenceTXTFile}

pycbc.inference.io.injections_from_cli(opts)[source]

Gets injection parameters from the inference file(s).

If the opts have an injection_samples_map option, the injection parameters will be remapped accordingly. See pycbc.inference.option_utils.add_injsamples_map_opt() for details.

Parameters:

opts (argparse.Namespace) – The parsed command-line options.

Returns:

Array of the injection parameters from all of the input files given by opts.input_file.

Return type:

FieldArray

pycbc.inference.io.loadfile(path, mode=None, filetype=None, **kwargs)[source]

Loads the given file using the appropriate InferenceFile class.

If filetype is not provided, this will try to retrieve the filetype from the file’s attrs. If the file does not exist yet, an IOError will be raised if filetype is not provided.

Parameters:
  • path (str) – The filename to load.

  • mode (str, optional) – What mode to load the file with, e.g., ‘w’ for write, ‘r’ for read, ‘a’ for append. If not provided, defaults to h5py.File’s default mode, which is ‘a’.

  • filetype (str, optional) – Force the file to be loaded with the given class name. This must be provided if creating a new file.

Returns:

An open file handler to the file. The class used for IO with the file is determined by the filetype keyword (if provided) or the filetype stored in the file (if not provided).

Return type:

filetype instance

pycbc.inference.io.results_from_cli(opts, load_samples=True, **kwargs)[source]

Loads an inference result file along with any labels associated with it from the command line options.

Parameters:
  • opts (ArgumentParser options) – The options from the command line.

  • load_samples (bool, optional) – Load the samples from the file.

Returns:

  • fp_all ((list of) BaseInferenceFile type) – The result file as an hdf file. If more than one input file, then it returns a list.

  • parameters (list of str) – List of the parameters to use, parsed from the parameters option.

  • labels (dict) – Dictionary of labels to associate with the parameters.

  • samples_all ((list of) FieldArray(s) or None) – If load_samples, the samples as a FieldArray; otherwise, None. If more than one input file, then it returns a list.

  • **kwargs – Any other keyword arguments that are passed to read samples using samples_from_cli

pycbc.inference.io.validate_checkpoint_files(checkpoint_file, backup_file, check_nsamples=True)[source]

Checks if the given checkpoint and/or backup files are valid.

The checkpoint file is considered valid if:

  • it passes all tests run by check_integrity;

  • it has at least one sample written to it (indicating at least one checkpoint has happened).

The same applies to the backup file. The backup file must also have the same number of samples as the checkpoint file, otherwise, the backup is considered invalid.

If the checkpoint (backup) file is found to be valid, but the backup (checkpoint) file is not valid, then the checkpoint (backup) is copied to the backup (checkpoint). Thus, this function ensures that checkpoint and backup files are either both valid or both invalid.

Parameters:
  • checkpoint_file (string) – Name of the checkpoint file.

  • backup_file (string) – Name of the backup file.

Returns:

checkpoint_valid – Whether or not the checkpoint (and backup) file may be used for loading samples.

Return type:

bool
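The repair logic described above can be summarized schematically. Here the file copying is replaced by a returned action string, so this is only a model of the decision flow, not the actual implementation:

```python
def reconcile(checkpoint_valid, backup_valid):
    """Decide what to do given the validity of each file, following the
    rules described above: a valid file overwrites an invalid partner,
    so the two files end up either both valid or both invalid."""
    if checkpoint_valid and not backup_valid:
        return 'copy checkpoint -> backup', True
    if backup_valid and not checkpoint_valid:
        return 'copy backup -> checkpoint', True
    if checkpoint_valid and backup_valid:
        return 'no action', True
    return 'no action', False

action, ok = reconcile(True, False)  # valid checkpoint repairs the backup
```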