pycbc.workflow package

Submodules

pycbc.workflow.coincidence module

This module is responsible for setting up the coincidence stage of pycbc workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/coincidence.html

class pycbc.workflow.coincidence.CensorForeground(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

current_retention_level = 3
class pycbc.workflow.coincidence.MergeExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCAddStatmap(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: PyCBCCombineStatmap

Combine statmap files and add FARs over different coinc types

create_node(statmap_files, background_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCBank2HDFExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Converts xml tmpltbank to hdf format

create_node(bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCCombineStatmap(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Combine coincs over different bins and apply trials factor

create_node(statmap_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCDistributeBackgroundBins(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Distribute coinc files among different background bins

create_node(coinc_files, bank_file, background_bins, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.coincidence.PyCBCExcludeZerolag(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Remove times of zerolag coincidences of all types from exclusive background

create_node(statmap_file, other_statmap_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCFindCoincExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Find coinc triggers using a folded interval method

create_node(trig_files, bank_file, stat_files, veto_file, veto_name, template_str, pivot_ifo, fixed_ifo, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.coincidence.PyCBCFindSnglsExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Calculate single-detector ranking statistic for triggers

create_node(trig_files, bank_file, stat_files, veto_file, veto_name, template_str, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
file_input_options = ['--statistic-files']
class pycbc.workflow.coincidence.PyCBCFitByTemplateExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Calculates values that describe the background distribution template by template

create_node(trig_file, bank_file, veto_file, veto_name)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCFitOverParamExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Smooths the background distribution parameters over a continuous parameter

create_node(raw_fit_file, bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCHDFInjFindExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Find injections in the hdf files output

create_node(inj_coinc_file, inj_xml_file, veto_file, veto_name, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCSnglsStatMapExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Calculate FAP, IFAR, etc for singles

create_node(sngls_files, ifo, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCSnglsStatMapInjExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Calculate FAP, IFAR, etc for singles for injections

create_node(sngls_files, background_file, ifos, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCStatMapExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Calculate FAP, IFAR, etc for coincs

create_node(coinc_files, ifos, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCStatMapInjExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Calculate FAP, IFAR, etc for coincs for injections

create_node(coinc_files, full_data, ifos, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.coincidence.PyCBCTrig2HDFExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Converts xml triggers to hdf format, grouped by template hash

create_node(trig_files, bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
pycbc.workflow.coincidence.convert_bank_to_hdf(workflow, xmlbank, out_dir, tags=None)[source]

Return the template bank in hdf format

pycbc.workflow.coincidence.convert_trig_to_hdf(workflow, hdfbank, xml_trigger_files, out_dir, tags=None)[source]

Return the list of hdf5 trigger files outputs

pycbc.workflow.coincidence.find_injections_in_hdf_coinc(workflow, inj_coinc_file, inj_xml_file, veto_file, veto_name, out_dir, tags=None)[source]
pycbc.workflow.coincidence.get_ordered_ifo_list(ifocomb, ifo_ids)[source]

This function sorts the combination of ifos (ifocomb) based on the given precedence list (ifo_ids dictionary) and returns the first ifo as the pivot, the second ifo as fixed, and the ordered list joined as a string.
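
A minimal sketch of the ordering described above, using only the standard library. The helper name order_ifos is hypothetical, and the sketch assumes that a smaller value in ifo_ids means higher precedence; it is an illustration, not the PyCBC implementation.

```python
def order_ifos(ifocomb, ifo_ids):
    """Sort detectors by precedence; return (pivot, fixed, joined names)."""
    # Assumption of this sketch: a smaller precedence value sorts first.
    ordered = sorted(ifocomb, key=lambda ifo: ifo_ids[ifo])
    pivot_ifo, fixed_ifo = ordered[0], ordered[1]
    return pivot_ifo, fixed_ifo, "".join(ordered)


precedence = {"H1": 0, "L1": 1, "V1": 2}  # illustrative values only
result = order_ifos(("V1", "H1", "L1"), precedence)
print(result)  # ('H1', 'L1', 'H1L1V1')
```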

pycbc.workflow.coincidence.make_foreground_censored_veto(workflow, bg_file, veto_file, veto_name, censored_name, out_dir, tags=None)[source]
pycbc.workflow.coincidence.merge_single_detector_hdf_files(workflow, bank_file, trigger_files, out_dir, tags=None)[source]
pycbc.workflow.coincidence.rerank_coinc_followup(workflow, statmap_file, bank_file, out_dir, tags=None, injection_file=None, ranking_file=None)[source]
pycbc.workflow.coincidence.select_files_by_ifo_combination(ifocomb, insps)[source]

This function selects single-detector files (‘insps’) for a given ifo combination

pycbc.workflow.coincidence.setup_combine_statmap(workflow, final_bg_file_list, bg_file_list, out_dir, tags=None)[source]

Combine the statmap files into one background file

pycbc.workflow.coincidence.setup_exclude_zerolag(workflow, statmap_file, other_statmap_files, out_dir, ifos, tags=None)[source]

Exclude single triggers close to zerolag triggers from forming any background events

pycbc.workflow.coincidence.setup_interval_coinc(workflow, hdfbank, trig_files, stat_files, veto_file, veto_name, out_dir, pivot_ifo, fixed_ifo, tags=None)[source]

This function sets up exact match coincidence

pycbc.workflow.coincidence.setup_interval_coinc_inj(workflow, hdfbank, inj_trig_files, stat_files, background_file, veto_file, veto_name, out_dir, pivot_ifo, fixed_ifo, tags=None)[source]

This function sets up exact match coincidence for injections

pycbc.workflow.coincidence.setup_sngls(workflow, hdfbank, trig_files, stat_files, veto_file, veto_name, out_dir, tags=None)[source]

This function sets up getting statistic values for single-detector triggers

pycbc.workflow.coincidence.setup_sngls_inj(workflow, hdfbank, inj_trig_files, stat_files, background_file, veto_file, veto_name, out_dir, tags=None)[source]

This function sets up getting statistic values for single-detector triggers from injections

pycbc.workflow.coincidence.setup_sngls_statmap(workflow, ifo, sngls_files, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_sngls_statmap_inj(workflow, ifo, sngls_inj_files, background_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_statmap(workflow, ifos, coinc_files, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_statmap_inj(workflow, ifos, coinc_files, background_file, out_dir, tags=None)[source]
pycbc.workflow.coincidence.setup_trigger_fitting(workflow, insps, hdfbank, veto_file, veto_name, output_dir=None, tags=None)[source]

pycbc.workflow.configparser_test module

pycbc.workflow.configparser_test.add_options_to_section(cp, section, items, preserve_orig_file=False, overwrite_options=False)[source]

Add a set of options and values to a section of a ConfigParser object. Will throw an error if any of the options being added already exist; this behaviour can be overridden if desired.

Parameters:
  • cp (The ConfigParser class)

  • section (string) – The name of the section to add options+values to

  • items (list of tuples) – Each tuple contains (at [0]) the option and (at [1]) the value to add to the section of the ini file

  • preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set deepcopy will be used and the input will be preserved. Default = False

  • overwrite_options (Boolean, optional) – By default this function will throw a ValueError if an option exists in both the original section in the ConfigParser and in the provided items. If set to True, the options+values given in items will instead replace the original values. Default = False

Returns:

cp

Return type:

The ConfigParser class
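
The duplicate check and the two flags can be illustrated with the standard-library configparser. This is a sketch under stated assumptions: the function name add_options_sketch and the section and option names are hypothetical, and the real function may differ in its details.

```python
import copy
from configparser import ConfigParser


def add_options_sketch(cp, section, items, preserve_orig_file=False,
                       overwrite_options=False):
    """Add (option, value) pairs to a section, refusing duplicates by default."""
    if preserve_orig_file:
        cp = copy.deepcopy(cp)  # leave the caller's object untouched
    for option, value in items:
        if not overwrite_options and cp.has_option(section, option):
            raise ValueError(f"Option {option} already exists in [{section}]")
        cp.set(section, option, value)
    return cp


cp = ConfigParser()
cp.read_string("[inspiral]\nsnr-threshold = 5.5\n")

# Work on a copy so that the original parser is preserved.
new_cp = add_options_sketch(cp, "inspiral", [("chisq-bins", "16")],
                            preserve_orig_file=True)
print(new_cp.get("inspiral", "chisq-bins"))
```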

pycbc.workflow.configparser_test.check_duplicate_options(cp, section1, section2, raise_error=False)[source]

Check for duplicate options in two sections, section1 and section2. Returns the list of options that appear in both sections; the list is empty if there are no duplicates.

Parameters:
  • cp (The ConfigParser class)

  • section1 (string) – The name of the first section to compare

  • section2 (string) – The name of the second section to compare

  • raise_error (Boolean, optional) – If True, raise an error if duplicates are present. Default = False

Returns:

duplicate – List of duplicate options

Return type:

List
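
A short sketch of such a check with the standard-library configparser. The name duplicate_options_sketch and the example sections are hypothetical; the sketch only illustrates the comparison and the optional error.

```python
from configparser import ConfigParser


def duplicate_options_sketch(cp, section1, section2, raise_error=False):
    """Return the options that appear in both sections."""
    options2 = set(cp.options(section2))
    duplicates = [opt for opt in cp.options(section1) if opt in options2]
    if duplicates and raise_error:
        raise ValueError(
            f"Options {duplicates} are in both [{section1}] and [{section2}]")
    return duplicates


cp = ConfigParser()
cp.read_string(
    "[inspiral]\nsnr-threshold = 5.5\nlow-frequency-cutoff = 30\n"
    "[inspiral-h1]\nlow-frequency-cutoff = 20\n"
)
duplicates = duplicate_options_sketch(cp, "inspiral", "inspiral-h1")
print(duplicates)  # ['low-frequency-cutoff']
```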

pycbc.workflow.configparser_test.interpolate_string(testString, cp, section)[source]

Take a string and replace all examples of ExtendedInterpolation formatting within the string with the exact value.

For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*

For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that in the python3 config parser this is ${common:example} but python2.7 interprets the : the same as a = and this breaks things

Nested interpolation is not supported here.

Parameters:
  • testString (String) – The string to parse and interpolate

  • cp (ConfigParser) – The ConfigParser object to look for the interpolation strings within

  • section (String) – The current section of the ConfigParser object

Returns:

testString – Interpolated string

Return type:

String
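
The two substitution forms can be sketched with a regular expression and the standard-library configparser. The function name interpolate_sketch and the option names are hypothetical; like the function above, the sketch does not handle nested interpolation.

```python
import re
from configparser import ConfigParser

# Matches ${...} with no nested braces inside.
_PATTERN = re.compile(r"\$\{([^{}]*)\}")


def interpolate_sketch(test_string, cp, section):
    """Replace ${option} and ${section|option} with values from cp."""
    def _lookup(match):
        key = match.group(1)
        if "|" in key:
            other_section, option = key.split("|", 1)
            return cp.get(other_section, option)
        return cp.get(section, key)  # option in the current section

    return _PATTERN.sub(_lookup, test_string)


# interpolation=None so that configparser itself leaves values untouched.
cp = ConfigParser(interpolation=None)
cp.read_string("[common]\nifo = H1\n[inspiral]\nlow-freq = 30\n")
result = interpolate_sketch("--ifo ${common|ifo} --flow ${low-freq}",
                            cp, "inspiral")
print(result)  # --ifo H1 --flow 30
```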

pycbc.workflow.configparser_test.parse_workflow_ini_file(cpFile, parsed_filepath=None)[source]

Read a .ini file in, parse it as described in the documentation linked to above, and return the parsed ini file.

Parameters:
  • cpFile (The path to a .ini file to be read in)

  • parsed_filepath (string, optional) – If provided, the .ini file, after parsing, will be written to this location

Returns:

cp

Return type:

The parsed ConfigParser class containing the read in .ini file

pycbc.workflow.configparser_test.perform_extended_interpolation(cp, preserve_orig_file=False)[source]

Filter through an ini file and replace all examples of ExtendedInterpolation formatting with the exact value. For values like ${example} this is replaced with the value that corresponds to the option called example *in the same section*

For values like ${common|example} this is replaced with the value that corresponds to the option example in the section [common]. Note that in the python3 config parser this is ${common:example} but python2.7 interprets the : the same as a = and this breaks things

Nested interpolation is not supported here.

Parameters:
  • cp (ConfigParser object)

  • preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set deepcopy will be used and the input will be preserved. Default = False

Returns:

cp

Return type:

parsed ConfigParser object

pycbc.workflow.configparser_test.read_ini_file(cpFile)[source]

Read a .ini file and return it as a ConfigParser class. This function does none of the parsing/combining of sections. It simply reads the file and returns it unedited

Parameters:

cpFile (The path to a .ini file to be read in)

Returns:

cp

Return type:

The ConfigParser class containing the read in .ini file

pycbc.workflow.configparser_test.sanity_check_subsections(cp)[source]

This function goes through the ConfigParser and checks that any options given in the [SECTION_NAME] section are not also given in any [SECTION_NAME-SUBSECTION] sections.

Parameters:

cp (The ConfigParser class)

Return type:

None

pycbc.workflow.configparser_test.split_multi_sections(cp, preserve_orig_file=False)[source]

Parses a supplied ConfigParser object and splits any sections labelled with an “&” sign (e.g. [inspiral&tmpltbank]) into [inspiral] and [tmpltbank] sections. If these individual sections already exist they will be appended to. If an option exists in both the [inspiral] and [inspiral&tmpltbank] sections an error will be thrown.

Parameters:
  • cp (The ConfigParser class)

  • preserve_orig_file (Boolean, optional) – By default the input ConfigParser object will be modified in place. If this is set deepcopy will be used and the input will be preserved. Default = False

Returns:

cp

Return type:

The ConfigParser class
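
A sketch of the splitting step with the standard-library configparser. The name split_sections_sketch is hypothetical, and removing the combined section after splitting is an assumption of this sketch.

```python
from configparser import ConfigParser


def split_sections_sketch(cp):
    """Copy options from every [a&b] section into [a] and [b]."""
    for section in list(cp.sections()):
        if "&" not in section:
            continue
        for target in section.split("&"):
            if not cp.has_section(target):
                cp.add_section(target)
            for option, value in cp.items(section, raw=True):
                if cp.has_option(target, option):
                    raise ValueError(
                        f"Option {option} is in both [{target}] and [{section}]")
                cp.set(target, option, value)
        # Assumption: the combined section is dropped once it has been split.
        cp.remove_section(section)
    return cp


cp = ConfigParser()
cp.read_string(
    "[inspiral]\nsnr-threshold = 5.5\n"
    "[inspiral&tmpltbank]\nlow-frequency-cutoff = 30\n"
)
split_sections_sketch(cp)
print(cp.sections())  # ['inspiral', 'tmpltbank']
```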

pycbc.workflow.configuration module

This module provides a wrapper to the ConfigParser utilities for pycbc workflow construction. This module is described in the page here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/initialization_inifile.html

class pycbc.workflow.configuration.WorkflowConfigParser(configFiles=None, overrideTuples=None, parsedFilePath=None, deleteTuples=None, copy_to_cwd=False)[source]

Bases: InterpolatingConfigParser

This is a sub-class of InterpolatingConfigParser, which lets us add a few additional helper features that are useful in workflows.

get_cli_option(section, option_name, **kwds)[source]

Return option using CLI action parsing

Parameters:
  • section (str) – Section to find option to parse

  • option_name (str) – Name of the option to parse from the config file

  • kwds (keywords) – Additional keywords are passed directly to the argument parser.

Returns:

The parsed value for this option

Return type:

value

interpolate_exe(testString)[source]

Replace testString with a path to an executable based on the format.

If this looks like

${which:lalapps_tmpltbank}

it will return the equivalent of which(lalapps_tmpltbank)

Otherwise it will return an unchanged string.

Parameters:

testString (string) – The input string

Returns:

newString – The output string.

Return type:

string
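
The macro handling can be sketched with shutil.which from the standard library. The name interpolate_exe_sketch is hypothetical, and raising ValueError when the executable is not found is an assumption of this sketch rather than documented behaviour.

```python
import shutil


def interpolate_exe_sketch(test_string):
    """Expand ${which:NAME} to the full path of NAME; else return unchanged."""
    stripped = test_string.strip()
    if not (stripped.startswith("${") and stripped.endswith("}")):
        return test_string
    macro, sep, target = stripped[2:-1].partition(":")
    if sep and macro == "which":
        path = shutil.which(target)
        if path is None:
            # Assumption: a missing executable is treated as an error.
            raise ValueError(f"Cannot find executable {target}")
        return path
    return test_string  # some other ${...} form, left untouched


print(interpolate_exe_sketch("/usr/bin/some_tool"))  # no macro, unchanged
```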

perform_exe_expansion()[source]

This function will look through the executables section of the ConfigParser object and replace any values using macros with full paths.

For any values that look like

${which:lalapps_tmpltbank}

will be replaced with the equivalent of which(lalapps_tmpltbank)

Otherwise values will be unchanged.

resolve_file_url(test_string)[source]

Replace test_string with a path to a local file if it uses the resolve format.

If this looks like

${resolve:https://git.ligo.org/detchar/SOME_GATING_FILE.txt}

it will return the equivalent of resolve_url(URL)

Otherwise it will return an unchanged string.

Parameters:

test_string (string) – The input string

Returns:

new_string – The output string.

Return type:

string

resolve_urls()[source]

This function will look through all sections of the ConfigParser object and replace any URLs that are given the resolve magic flag with a path on the local drive.

Specifically for any values that look like

${resolve:https://git.ligo.org/detchar/SOME_GATING_FILE.txt}

the value will be replaced with the output of resolve_url(URL)

Otherwise values will be unchanged.

section_to_cli(section, skip_opts=None)[source]

Converts a section into a command-line string.

For example:

[section_name]
foo =
bar = 10

yields: ‘--foo --bar 10’.

Parameters:
  • section (str) – The name of the section to convert.

  • skip_opts (list, optional) – List of options to skip. Default (None) results in all options in the section being converted.

Returns:

The options as a command-line string.

Return type:

str
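
The conversion in the example above can be sketched with the standard-library configparser. The free function section_to_cli_sketch is hypothetical (the documented version is a method of WorkflowConfigParser), so treat it as an illustration only.

```python
from configparser import ConfigParser


def section_to_cli_sketch(cp, section, skip_opts=None):
    """Turn every option in a section into '--option value' text."""
    skip_opts = skip_opts or []
    parts = []
    for option in cp.options(section):
        if option in skip_opts:
            continue
        value = cp.get(section, option)
        # An empty value becomes a bare flag such as '--foo'.
        parts.append(f"--{option} {value}".strip())
    return " ".join(parts)


cp = ConfigParser()
cp.read_string("[section_name]\nfoo =\nbar = 10\n")
cli = section_to_cli_sketch(cp, "section_name")
print(cli)  # --foo --bar 10
```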

pycbc.workflow.configuration.add_workflow_command_line_group(parser)[source]

The standard way of initializing a ConfigParser object in workflow will be to do it from the command line. This is done by giving a

--local-config-files filea.ini fileb.ini filec.ini

command. You can also set config file override commands on the command line. This will be most useful when setting (for example) start and end times, or active ifos. This is done by

--config-overrides section1:option1:value1 section2:option2:value2 ...

This can also be given as

--config-overrides section1:option1

where the value will be left as ‘’.

To remove a configuration option, use the command line argument

--config-delete section1:option1

which will delete option1 from [section1] or

--config-delete section1

to delete all of the options in [section1]

Deletes are implemented before overrides.

This function returns an argparse argument group to ensure these options are parsed correctly and can then be sent directly to initialize a WorkflowConfigParser.

Parameters:

parser (argparse.ArgumentParser instance) – The initialized argparse instance to add the workflow option group to.
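
How the override and delete strings might be applied can be sketched with the standard-library configparser. The helper apply_cli_edits is hypothetical. Splitting on at most two colons, so that a value may itself contain colons, is a design choice of this sketch and not confirmed behaviour.

```python
from configparser import ConfigParser


def apply_cli_edits(cp, overrides=(), deletes=()):
    """Apply 'section[:option]' deletes, then 'section:option[:value]' overrides."""
    # Deletes are applied before overrides, as stated above.
    for item in deletes:
        section, _, option = item.partition(":")
        if option:
            cp.remove_option(section, option)
        else:
            cp.remove_section(section)
    for item in overrides:
        parts = item.split(":", 2)
        if len(parts) < 2:
            raise ValueError(f"Override {item!r} needs section:option[:value]")
        section, option = parts[0], parts[1]
        value = parts[2] if len(parts) == 3 else ""  # missing value is ''
        if not cp.has_section(section):
            cp.add_section(section)
        cp.set(section, option, value)
    return cp


cp = ConfigParser()
cp.read_string("[workflow]\nstart-time = 1\nend-time = 2\n[ifos]\nh1 =\n")
apply_cli_edits(cp,
                overrides=["workflow:start-time:100", "workflow:end-time"],
                deletes=["workflow:end-time", "ifos"])
print(dict(cp["workflow"]))
```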

pycbc.workflow.configuration.hash_compare(filename_1, filename_2, chunk_size=None, max_chunks=None)[source]

Calculate the sha1 hash of a file, or of part of a file

Parameters:
  • filename_1 (string or path) – the first file to be hashed / compared

  • filename_2 (string or path) – the second file to be hashed / compared

  • chunk_size (integer) – The size of chunks to be read in and hashed. If not given, will read the whole file (may be slow for large files).

  • max_chunks (integer) – The maximum number of chunks to compare. If all chunks compared so far are the same, the files are assumed to be identical. Default 10

Returns:

hash – The hexdigest() after a sha1 hash of (part of) the file

Return type:

string
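
A chunked comparison of this kind can be sketched with hashlib. The name files_match_sketch is hypothetical, and the sketch returns a boolean rather than the hex digest listed above; that return value is an assumption made to keep the example short.

```python
import hashlib
import os
import tempfile


def files_match_sketch(filename_1, filename_2, chunk_size=None, max_chunks=None):
    """Compare two files chunk by chunk using sha1 digests."""
    size = -1 if chunk_size is None else chunk_size  # -1 reads the whole file
    with open(filename_1, "rb") as f1, open(filename_2, "rb") as f2:
        chunks_read = 0
        while True:
            if max_chunks is not None and chunks_read >= max_chunks:
                return True  # every compared chunk matched; assume same file
            data_1, data_2 = f1.read(size), f2.read(size)
            if hashlib.sha1(data_1).hexdigest() != hashlib.sha1(data_2).hexdigest():
                return False
            if not data_1:  # both files ended at the same point
                return True
            chunks_read += 1


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, payload in [("a", b"x" * 100), ("b", b"x" * 100),
                          ("c", b"x" * 99 + b"y")]:
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write(payload)
        paths.append(path)
    same = files_match_sketch(paths[0], paths[1], chunk_size=16)
    different = files_match_sketch(paths[0], paths[2], chunk_size=16)
    # Only the first two chunks are compared, so the late difference is missed.
    truncated = files_match_sketch(paths[0], paths[2], chunk_size=16, max_chunks=2)

print(same, different, truncated)  # True False True
```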

pycbc.workflow.configuration.resolve_url(url, directory=None, permissions=None, copy_to_cwd=True, hash_max_chunks=None, hash_chunk_size=None)[source]

Resolves a URL to a local file, and returns the path to that file.

If a URL is given, the file will be copied to the current working directory. If a local file path is given, the file will only be copied to the current working directory if copy_to_cwd is True (the default).

pycbc.workflow.core module

This module provides the worker functions and classes that are used when creating a workflow. For details about the workflow module see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope.html

exception pycbc.workflow.core.CalledProcessErrorMod(returncode, cmd, errFile=None, outFile=None, cmdFile=None)[source]

Bases: Exception

This exception is raised when subprocess.call returns a non-zero exit code and checking has been requested. It should not be accessed by the user; it is used only within make_external_call.

class pycbc.workflow.core.Executable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

ALL_TRIGGERS = 2
DO_NOT_KEEP = 0
FINAL_RESULT = 4
INTERMEDIATE_PRODUCT = 1
KEEP_BUT_RAISE_WARNING = 5
MERGED_TRIGGERS = 3
add_ini_profile(cp, sec)[source]

Add profile from configuration file.

Parameters:
  • cp (ConfigParser object) – The ConfigParser object holding the workflow configuration settings

  • sec (string) – The section containing options for this job.

add_opt(opt, value=None)[source]

Add option to job.

Parameters:
  • opt (string) – Name of option (e.g. --output-file-format)

  • value (string, (default=None)) – The value for the option (no value if set to None).

create_node(**kwargs)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 5
file_input_options = ['--gating-file', '--frame-files', '--injection-file', '--statistic-files', '--bank-file', '--config-files', '--psd-file', '--asd-file', '--fake-strain-from-file', '--sgburst-injection-file']
get_opt(opt)[source]

Get value of option from configuration file

Parameters:

opt (string) – Name of option (e.g. output-file-format)

Returns:

value – The value for the option. Returns None if option not present.

Return type:

string

get_transformation()[source]
has_opt(opt)[source]

Check if option is present in configuration file

Parameters:

opt (string) – Name of option (e.g. output-file-format)

property ifo

Return the ifo.

If only one ifo in the ifo list this will be that ifo. Otherwise an error is raised.

time_dependent_options = []
update_current_retention_level(value)[source]

Set a new value for the current retention level.

This updates the value of self.retain_files for an updated value of the retention level.

Parameters:

value (int) – The new value to use for the retention level.

update_current_tags(tags)[source]

Set a new set of tags for this executable.

Update the set of tags that this job will use. This updates the default file naming and shared options. It will not update the pegasus profile, which belongs to the executable and cannot be different for different nodes.

Parameters:

tags (list) – The new list of tags to consider.

update_output_directory(out_dir=None)[source]

Update the default output directory for output files.

Parameters:

out_dir (string (optional, default=None)) – If provided use this as the output directory. Else choose this automatically from the tags.

class pycbc.workflow.core.File(ifos, exe_name, segs, file_url=None, extension=None, directory=None, tags=None, store_file=True, use_tmp_subdirs=False)[source]

Bases: File

This class holds the details of an individual output file. The file(s) may be pre-supplied, generated from within the workflow command line script, or generated within the workflow. The key details are:

  • The ifo that the File is valid for

  • The time span that the File is valid for

  • A short description of what the file is

  • The extension that the file should have

  • The url where the file should be located

An example of initiating this class:

>>> c = File("H1", "INSPIRAL_S6LOWMASS", segments.segment(815901601, 815902001), file_url="file://localhost/home/spxiwh/H1-INSPIRAL_S6LOWMASS-815901601-400.xml.gz")

another where the file url is generated from the inputs:

>>> c = File("H1", "INSPIRAL_S6LOWMASS", segments.segment(815901601, 815902001), directory="/home/spxiwh", extension="xml.gz")
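
The file name in the first example suggests an IFO-DESCRIPTION-START-DURATION.EXTENSION pattern. The sketch below rebuilds that name from the inputs of the second example. The helper build_file_name is hypothetical and the pattern is inferred from the example URL only, so the class may apply further rules (for instance for tags or multiple ifos).

```python
def build_file_name(ifo, description, start, end, extension):
    """Build IFO-DESCRIPTION-START-DURATION.EXTENSION from the inputs."""
    duration = end - start
    return f"{ifo}-{description}-{start}-{duration}.{extension.lstrip('.')}"


name = build_file_name("H1", "INSPIRAL_S6LOWMASS",
                       815901601, 815902001, "xml.gz")
print(name)  # H1-INSPIRAL_S6LOWMASS-815901601-400.xml.gz
```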

add_metadata(key, value)[source]

Add arbitrary metadata to this file

property cache_entry

Returns a CacheEntry instance for File.

classmethod from_path(path, attrs=None, **kwargs)[source]

Create an output File object from path, with optional attributes.

property ifo

If only one ifo in the ifo_list this will be that ifo. Otherwise an error is raised.

property segment

If only one segment in the segmentlist this will be that segment. Otherwise an error is raised.

class pycbc.workflow.core.FileList(iterable=(), /)[source]

Bases: list

This class holds a list of File objects. It inherits from the built-in list class, but also allows a number of features. ONLY pycbc.workflow.File instances should be within a FileList instance.

categorize_by_attr(attribute)[source]

Function to categorize a FileList by a File object attribute (eg. ‘segment’, ‘ifo’, ‘description’).

Parameters:

attribute (string) – File object attribute to categorize FileList

Returns:

  • keys (list) – A list of values for an attribute

  • groups (list) – A list of FileLists
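
The grouping can be sketched in plain Python. The function categorize_sketch and the stand-in objects built with SimpleNamespace are hypothetical; real File objects carry more attributes than shown here.

```python
from types import SimpleNamespace


def categorize_sketch(files, attribute):
    """Group objects by an attribute; return (keys, groups) in first-seen order."""
    keys, groups = [], []
    for item in files:
        value = getattr(item, attribute)
        if value not in keys:
            keys.append(value)
            groups.append([])
        groups[keys.index(value)].append(item)
    return keys, groups


files = [
    SimpleNamespace(ifo="H1", description="INSPIRAL"),
    SimpleNamespace(ifo="L1", description="INSPIRAL"),
    SimpleNamespace(ifo="H1", description="TMPLTBANK"),
]
keys, groups = categorize_sketch(files, "ifo")
print(keys, [len(group) for group in groups])  # ['H1', 'L1'] [2, 1]
```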

convert_to_lal_cache()[source]

Return all files in this object as a glue.lal.Cache object

dump(filename)[source]

Output this FileList to a pickle file

entry_class

alias of File

find_all_output_in_range(ifo, currSeg, useSplitLists=False)[source]

Return all files that overlap the specified segment.

find_output(ifo, time)[source]

Returns one File most appropriate at the given time/time range.

Return one File that covers the given time, or is most appropriate for the supplied time range.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the file should be valid for.

  • time (int/float/LIGOGPStime or tuple containing two values) – If int/float/LIGOGPStime (or a similar way of specifying one time) is given, return the File corresponding to the time. This calls self.find_output_at_time(ifo,time). If a tuple of two values is given, return the File that is most appropriate for the time range given. This calls self.find_output_in_range.

Returns:

pycbc_file – The File that corresponds to the time or time range

Return type:

pycbc.workflow.File instance

find_output_at_time(ifo, time)[source]

Return File that covers the given time.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the File should correspond to

  • time (int/float/LIGOGPStime) – Return the Files that cover the supplied time. If no File covers the time this will return None.

Returns:

The Files that correspond to the time.

Return type:

list of File classes

find_output_in_range(ifo, start, end)[source]

Return the File that is most appropriate for the supplied time range. That is, the File whose coverage time has the largest overlap with the supplied time range. If no Files overlap the supplied time window, will return None.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the File should correspond to

  • start (int/float/LIGOGPStime) – The start of the time range of interest.

  • end (int/float/LIGOGPStime) – The end of the time range of interest

Returns:

The File that is most appropriate for the time range

Return type:

File class
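
The largest-overlap selection can be sketched in plain Python. The function best_overlap_sketch and the (name, (start, end)) tuples are hypothetical stand-ins for File objects and their segments, and the sketch ignores the ifo filter.

```python
def best_overlap_sketch(files, start, end):
    """Return the entry whose segment overlaps [start, end) the most, or None."""
    best, best_overlap = None, 0
    for entry in files:
        seg_start, seg_end = entry[1]
        overlap = min(end, seg_end) - max(start, seg_start)
        if overlap > best_overlap:  # entries with no overlap are never chosen
            best, best_overlap = entry, overlap
    return best


files = [("A", (0, 100)), ("B", (100, 200)), ("C", (200, 300))]
best = best_overlap_sketch(files, 80, 190)
print(best)  # ('B', (100, 200))
```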

find_output_with_ifo(ifo)[source]

Find all files whose ifo matches the given ifo

find_output_with_tag(tag, fail_if_not_single_file=False)[source]

Find all files that have tag in self.tags

Parameters:
  • tag (string) – Tag used to sieve the file names

  • fail_if_not_single_file (boolean) – kwarg (default is False) that triggers a sanity check if the user expects to find a single file with the desired tag in its name

Returns:

If fail_if_not_single_file is False, the FileList containing File instances with tag in self.tags is returned. Otherwise the single File with tag in self.tags is returned (if the sanity check requested with fail_if_not_single_file=True is passed)

Return type:

FileList/File class

find_output_without_tag(tag)[source]

Find all files that do not have tag in self.tags

find_outputs_in_range(ifo, current_segment, useSplitLists=False)[source]

Return the list of Files that are most appropriate for the supplied time range. That is, the Files whose coverage time has the largest overlap with the supplied time range.

Parameters:
  • ifo (string) – Name of the ifo (or ifos) that the File should correspond to

  • current_segment (igwn_segments.segment) – The segment of time that files must intersect.

Returns:

The list of Files that are most appropriate for the time range

Return type:

FileList class

get_times_covered_by_files()[source]

Find the coalesced intersection of the segments of all files in the list.

classmethod load(filename)[source]

Load a FileList from a pickle file

to_file_object(name, out_dir)[source]

Dump to a pickle file and return a File object reference

Parameters:
  • name (str) – An identifier of this file. Needs to be unique.

  • out_dir (path) – path to place this file

Returns:

file

Return type:

AhopeFile

class pycbc.workflow.core.Node(executable, valid_seg=None)[source]

Bases: Node

add_multiifo_input_list_opt(opt, inputs)[source]

Add an option that determines a list of inputs from multiple detectors. Files will be supplied as --opt ifo1:input1 ifo2:input2 ...

add_multiifo_output_list_opt(opt, outputs)[source]

Add an option that determines a list of outputs from multiple detectors. Files will be supplied as --opt ifo1:input1 ifo2:input2 ...
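
The ifo:file argument format used by these options can be sketched in plain Python. The helper multiifo_opt_sketch, the option name and the file names are hypothetical; the real methods also register the files with the workflow node.

```python
def multiifo_opt_sketch(opt, files_by_ifo):
    """Format one option followed by 'ifo:path' entries for each detector."""
    args = [f"{ifo}:{path}" for ifo, path in files_by_ifo.items()]
    return " ".join([opt] + args)


command = multiifo_opt_sketch("--trigger-files",
                              {"H1": "h1_triggers.hdf", "L1": "l1_triggers.hdf"})
print(command)  # --trigger-files H1:h1_triggers.hdf L1:l1_triggers.hdf
```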

get_command_line()[source]
new_multiifo_output_list_opt(opt, ifos, analysis_time, extension, tags=None, store_file=None, use_tmp_subdirs=False)[source]

Add an option that determines a list of outputs from multiple detectors. Files will be supplied as --opt ifo1:input1 ifo2:input2 ... File names are created internally from the provided extension and analysis time.

new_output_file_opt(valid_seg, extension, option_name, tags=None, store_file=None, use_tmp_subdirs=False)[source]

This function will create a workflow.File object corresponding to the given information and then add that file as output of this node.

Parameters:
  • valid_seg (igwn_segments.segment) – The time span over which the job is valid for.

  • extension (string) – The extension to be used at the end of the filename. E.g. ‘.xml’ or ‘.sqlite’.

  • option_name (string) – The option that is used when setting this job as output. E.g. ‘output-name’ or ‘output-file’, whatever is appropriate for the current executable.

  • tags (list of strings, (optional, default=[])) – These tags will be added to the list of tags already associated with the job. They can be used to uniquely identify this output file.

  • store_file (Boolean, (optional, default=True)) – If True, this file is added to the output mapper and will be stored in the specified output location. If False, the file will be removed when no longer needed in the workflow.

property output_file

If only one output file return it. Otherwise raise an exception.

property output_files
resolve_td_options(td_options)[source]
class pycbc.workflow.core.SegFile(ifo_list, description, valid_segment, segment_dict=None, seg_summ_dict=None, **kwargs)[source]

Bases: File

This class inherits from the File class, and is designed to store workflow output files containing a segment dict. This is identical in usage to File except for an additional kwarg for holding the segment dictionary, if it is known at workflow run time.

classmethod from_multi_segment_list(description, segmentlists, names, ifos, seg_summ_lists=None, **kwargs)[source]

Initialize a SegFile object from a list of segmentlists.

Parameters:
  • description (string (required)) – See File.__init__

  • segmentlists (List of igwn_segments.segmentlist) – List of segment lists that will be stored in this file.

  • names (List of str) – List of names of the segment lists to be stored in the file.

  • ifos (str) – List of ifos of the segment lists to be stored in this file.

  • seg_summ_lists (igwn_segments.segmentlist (OPTIONAL)) – Specify the segment_summary segmentlists that go along with the segmentlists. Default=None; in this case segment_summary is taken from the valid_segment of the SegFile class.

classmethod from_segment_list(description, segmentlist, name, ifo, seg_summ_list=None, **kwargs)[source]

Initialize a SegFile object from a segmentlist.

Parameters:
  • description (string (required)) – See File.__init__

  • segmentlist (igwn_segments.segmentlist) – The segment list that will be stored in this file.

  • name (str) – The name of the segment list to be stored in the file.

  • ifo (str) – The ifo of the segment list to be stored in this file.

  • seg_summ_list (igwn_segments.segmentlist (OPTIONAL)) – Specify the segment_summary segmentlist that goes along with the segmentlist. Default=None; in this case segment_summary is taken from the valid_segment of the SegFile class.

classmethod from_segment_list_dict(description, segmentlistdict, ifo_list=None, valid_segment=None, file_exists=False, seg_summ_dict=None, **kwargs)[source]

Initialize a SegFile object from a segmentlistdict.

Parameters:
  • description (string (required)) – See File.__init__

  • segmentlistdict (igwn_segments.segmentlistdict) – See SegFile.__init__

  • ifo_list (string or list (optional)) – See File.__init__, if not given a list of all ifos in the segmentlistdict object will be used

  • valid_segment (igwn_segments.segment or igwn_segments.segmentlist) – See File.__init__, if not given the extent of all segments in the segmentlistdict is used.

  • file_exists (boolean (default = False)) – If provided and set to True it is assumed that this file already exists on disk and so there is no need to write again.

  • seg_summ_dict (igwn_segments.segmentlistdict) – Optional. See SegFile.__init__.

classmethod from_segment_xml(xml_file, **kwargs)[source]

Read an igwn_segments.segmentlist from a file object containing an XML segment table.

Parameters:

xml_file (file object) – file object for segment xml file

parse_segdict_key(key)[source]

Return ifo and name from the segdict key.

remove_short_sci_segs(minSegLength)[source]

Function to remove all science segments shorter than a specific length. Also updates the file on disk to remove these segments.

Parameters:

minSegLength (int) – Minimum length of science segments. Segments shorter than this will be removed.
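
The length cut described above can be sketched with plain (start, end) tuples standing in for igwn_segments objects. This is only an illustration under that assumption: the helper name drop_short_segments is hypothetical, and unlike the real method the sketch does not touch any file on disk.

```python
def drop_short_segments(segments, min_length):
    """Keep only the (start, end) pairs that are at least min_length long."""
    return [(start, end) for start, end in segments
            if end - start >= min_length]


# Plain tuples stand in for igwn_segments.segment objects here.
science = [(0, 50), (100, 4000), (5000, 5100)]
kept = drop_short_segments(science, 2000)
print(kept)
```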

return_union_seglist()[source]
to_segment_xml(override_file_if_exists=False)[source]

Write the segment list in self.segmentList to self.storage_path.

class pycbc.workflow.core.Workflow(args, name=None)[source]

Bases: Workflow

This class manages a pycbc workflow. It provides convenience functions for finding input files using time and keywords. It can also generate cache files from the inputs.

property exec_sites_str
execute_node(node, verbatim_exe=False)[source]

Execute this node immediately on the local machine

get_ifo_combinations()[source]

Get a list of strings for all possible combinations of IFOs in the workflow
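
One way to enumerate detector combinations as strings is sketched below. The minimum combination size of two and the convention of joining ifo names without a separator are assumptions made for this example; the source text does not state either, and the function name ifo_combinations is hypothetical.

```python
from itertools import combinations


def ifo_combinations(ifos, min_size=2):
    """List every combination of ifos with at least min_size members.

    Assumption: names are joined without a separator, e.g. 'H1L1'.
    """
    ifos = sorted(ifos)
    combos = []
    for size in range(min_size, len(ifos) + 1):
        for combo in combinations(ifos, size):
            combos.append(''.join(combo))
    return combos


combos = ifo_combinations(['H1', 'L1', 'V1'])
print(combos)
```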

property output_map
save(filename=None, output_map_path=None, root=True)[source]

Write this workflow to DAX file

save_config(fname, output_dir, cp=None)[source]

Writes configuration file to disk and returns a pycbc.workflow.File instance for the configuration file.

Parameters:
  • fname (string) – The filename of the configuration file written to disk.

  • output_dir (string) – The directory where the file is written to disk.

  • cp (ConfigParser object) – The ConfigParser object to write. If None then uses self.cp.

Returns:

The FileList object with the configuration file.

Return type:

FileList

property sites

List of all possible execution sites for jobs in this workflow

property staging_site

Site to use for staging to/from each site

property staging_site_str
pycbc.workflow.core.add_workflow_settings_cli(parser, include_subdax_opts=False)[source]

Adds workflow options to an argument parser.

Parameters:
  • parser (argparse.ArgumentParser) – Argument parser to add the options to.

  • include_subdax_opts (bool, optional) – If True, will add output-map and dax-file-directory options to the parser. These can be used for workflows that are generated as a subdax of another workflow. Default is False.
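
The conditional registration of sub-dax options can be pictured with plain argparse, as in the sketch below. Only the option names output-map and dax-file-directory come from the description above; the function name, the --workflow-name option, the defaults and the group title are placeholders, and the real function adds its own set of options.

```python
import argparse


def add_settings_sketch(parser, include_subdax_opts=False):
    """Register a few workflow-style options on parser (illustrative only)."""
    group = parser.add_argument_group('workflow (sketch)')
    # Placeholder for the options that are always added.
    group.add_argument('--workflow-name', default='example')
    if include_subdax_opts:
        # These are only needed when the workflow runs as a sub-dax.
        group.add_argument('--output-map', default=None)
        group.add_argument('--dax-file-directory', default=None)


parser = argparse.ArgumentParser()
add_settings_sketch(parser, include_subdax_opts=True)
opts = parser.parse_args(['--output-map', 'map.txt'])
print(opts.output_map, opts.dax_file_directory)
```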

pycbc.workflow.core.configparser_value_to_file(cp, sec, opt, attrs=None)[source]

Fetch a file given its url location via the section and option in the workflow configuration parser.

Parameters:
  • cp (ConfigParser object) – The ConfigParser object holding the workflow configuration settings

  • sec (string) – The section containing options for this job.

  • opt (string) – Name of option (e.g. --output-file)

  • attrs (list to specify the 4 attributes of the file.)

Returns:

fileobj_from_path – The workflow.File object obtained from the path specified by opt, within sec, in cp.

Return type:

workflow.File

pycbc.workflow.core.get_full_analysis_chunk(science_segs)[source]

Function to find the first and last time point contained in the science segments and return a single segment spanning that full time.

Parameters:

science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.

Returns:

fullSegment – The segment spanning the first and last time point contained in science_segs.

Return type:

igwn_segments.segment
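
The spanning-segment calculation amounts to taking the earliest start and the latest end over all detectors. A minimal sketch, assuming plain (start, end) tuples in place of igwn_segments objects and a hypothetical helper name:

```python
def full_analysis_chunk(science_segs):
    """Return one (start, end) pair spanning every segment of every ifo."""
    starts = [start for segs in science_segs.values() for start, _ in segs]
    ends = [end for segs in science_segs.values() for _, end in segs]
    return (min(starts), max(ends))


# ifo-keyed dictionary of segment lists, using tuples as stand-ins.
science_segs = {
    'H1': [(100, 200), (300, 450)],
    'L1': [(50, 120), (400, 420)],
}
full = full_analysis_chunk(science_segs)
print(full)
```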

pycbc.workflow.core.get_random_label()[source]

Get a random label string to use when clustering jobs.

pycbc.workflow.core.make_analysis_dir(path)[source]

Make the analysis directory path, any parent directories that don’t already exist, and the ‘logs’ subdirectory of path.
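
The directory layout described above can be produced with a single os.makedirs call, because creating path/logs also creates path and any missing parents. The sketch below is an illustration under that reading, not the library's implementation; the function name and the temporary paths are made up for the example.

```python
import os
import tempfile


def make_analysis_dir_sketch(path):
    """Create path, its missing parents, and a 'logs' subdirectory."""
    # exist_ok=True makes the call safe to repeat.
    os.makedirs(os.path.join(path, 'logs'), exist_ok=True)


base = tempfile.mkdtemp()
target = os.path.join(base, 'run', 'output')
make_analysis_dir_sketch(target)
print(os.path.isdir(os.path.join(target, 'logs')))
```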

pycbc.workflow.core.make_external_call(cmdList, out_dir=None, out_basename='external_call', shell=False, fail_on_error=True)[source]

Use this to make an external call using the python subprocess module. See the subprocess documentation for more details of how this works. http://docs.python.org/2/library/subprocess.html

Parameters:
  • cmdList (list of strings) – This list of strings contains the command to be run. See the subprocess documentation for more details.

  • out_dir (string) – If given, the stdout and stderr will be redirected to os.path.join(out_dir, out_basename + [".err", ".out"]). If not given, the stdout and stderr will not be recorded.

  • out_basename (string) – The value of out_basename used to construct the file names used to store stderr and stdout. See out_dir for more information.

  • shell (boolean, default=False) – This value will be given as the shell kwarg to the subprocess call. WARNING See the subprocess documentation for details on this Kwarg including a warning about a serious security exploit. Do not use this unless you are sure it is necessary and safe.

  • fail_on_error (boolean, default=True) – If set to true an exception will be raised if the external command does not return a code of 0. If set to false such failures will be ignored. Stderr and Stdout can be stored in either case using the out_dir and out_basename options.

Returns:

exitCode – The code returned by the process.

Return type:

int
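
The behaviour described by these parameters can be approximated with the standard subprocess module. The sketch below is not the pycbc implementation: the function name, the RuntimeError and the choice to leave the streams attached to the parent process when out_dir is omitted are all assumptions.

```python
import os
import subprocess
import sys
import tempfile


def external_call_sketch(cmd_list, out_dir=None,
                         out_basename='external_call', fail_on_error=True):
    """Run cmd_list, optionally saving stdout and stderr under out_dir."""
    if out_dir is not None:
        out_path = os.path.join(out_dir, out_basename + '.out')
        err_path = os.path.join(out_dir, out_basename + '.err')
        with open(out_path, 'w') as out, open(err_path, 'w') as err:
            exit_code = subprocess.call(cmd_list, stdout=out, stderr=err)
    else:
        # Assumption: streams are simply not redirected in this case.
        exit_code = subprocess.call(cmd_list)
    if fail_on_error and exit_code != 0:
        # Placeholder exception type; the real one may differ.
        raise RuntimeError(
            'Command failed with exit code {}'.format(exit_code))
    return exit_code


log_dir = tempfile.mkdtemp()
code = external_call_sketch(
    [sys.executable, '-c', 'print("hello")'], out_dir=log_dir)
print(code)
```

Passing the command as a list and leaving shell at its default avoids the shell-injection risk that the warning above refers to.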

pycbc.workflow.core.resolve_td_option(val_str, valid_seg)[source]

Take an option which might be time-dependent and resolve it

Some options might take different values depending on the GPS time. For example if you want opt_1 to take value_a if the time is between 10 and 100, value_b if between 100 and 250, and value_c if between 250 and 500 you can supply:

value_a[10:100],value_b[100:250],value_c[250:500].

This function will parse that string (passed as val_str) and return the value whose time range fully contains valid_seg. If valid_seg is not fully contained in one, and only one, of these ranges, the code will fail. If given a simple option like:

value_a

The function will just return value_a.
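
A minimal sketch of this parsing logic follows, assuming valid_seg is a (start, end) pair and that failure means raising an exception. The regular expression, the ValueError and the function name are choices made for this example rather than details taken from pycbc.

```python
import re

# Matches items such as 'value_a[10:100]'.
_TD_ITEM = re.compile(
    r'^(?P<value>[^\[\]]+)\[(?P<start>[^:\]]+):(?P<end>[^\]]+)\]$')


def resolve_td_option_sketch(val_str, valid_seg):
    """Return the value whose [start:end] range fully contains valid_seg."""
    matches = []
    for item in val_str.split(','):
        parsed = _TD_ITEM.match(item.strip())
        if parsed is None:
            # No time-dependent syntax: treat as a simple option.
            return val_str
        start = float(parsed.group('start'))
        end = float(parsed.group('end'))
        if start <= valid_seg[0] and valid_seg[1] <= end:
            matches.append(parsed.group('value'))
    if len(matches) != 1:
        raise ValueError(
            'Expected exactly one range containing {}, found {}'.format(
                valid_seg, len(matches)))
    return matches[0]


td_opt = 'value_a[10:100],value_b[100:250],value_c[250:500]'
resolved = resolve_td_option_sketch(td_opt, (120, 200))
print(resolved)
```

With this sketch, a segment such as (90, 120) straddles two ranges, is contained in neither, and therefore raises the error.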

pycbc.workflow.core.resolve_url_to_file(curr_pfn, attrs=None, hash_max_chunks=10, hash_chunk_size=1000000)[source]

Resolves a PFN into a workflow.File object.

This function will resolve a PFN to a workflow.File object. If a File object already exists for that PFN, that will be returned; otherwise a new object is returned. We will implement default site schemes here as needed; for example, cvmfs paths will be added to the osg and nonfsio sites in addition to local. If the LFN is a duplicate of an existing one, but with a different PFN, an AssertionError is raised.

The attrs keyword argument can be used to specify attributes of a file. All files have 4 possible attributes: a list of ifos, an identifying string (usually used to give the name of the executable that created the file), a segmentlist over which the file is valid, and tags specifying particular details about those files. If attrs[‘ifos’] is set it will be used as the ifos, otherwise this will default to [‘H1’, ‘K1’, ‘L1’, ‘V1’]. If attrs[‘exe_name’] is given this will replace the “exe_name” sent to File.__init__, otherwise ‘INPUT’ will be given. segs will default to [[1,2000000000]] unless overridden with attrs[‘segs’]. tags will default to an empty list unless overridden with attrs[‘tag’]. If attrs is None it will be ignored and all defaults will be used.

It is emphasized that these attributes are for the most part not important with input files. Exceptions include things like input template banks, where ifos and valid times will be checked in the workflow and used in the naming of child job output files.

hash_max_chunks and hash_chunk_size are used to decide how much of the files to check before they are considered the same, and not copied.
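
To illustrate how two such limits can bound the cost of comparing large files, the sketch below hashes at most max_chunks blocks of chunk_size bytes from the start of a file. The hash algorithm (SHA-1), the choice to read from the beginning and the function name are assumptions; the source does not say how pycbc selects or hashes the chunks.

```python
import hashlib
import os
import tempfile


def partial_file_hash(path, max_chunks=10, chunk_size=1000000):
    """Hash at most max_chunks blocks of chunk_size bytes from path."""
    digest = hashlib.sha1()
    with open(path, 'rb') as handle:
        for _ in range(max_chunks):
            chunk = handle.read(chunk_size)
            if not chunk:
                break  # Reached end of file early.
            digest.update(chunk)
    return digest.hexdigest()


# Two files that share their first 64 bytes and differ afterwards.
tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, 'a.bin')
path_b = os.path.join(tmp_dir, 'b.bin')
with open(path_a, 'wb') as handle:
    handle.write(b'x' * 64 + b'tail-a')
with open(path_b, 'wb') as handle:
    handle.write(b'x' * 64 + b'tail-b')

# Only 2 * 32 = 64 bytes are examined, so the files look identical.
same_prefix = (partial_file_hash(path_a, 2, 32)
               == partial_file_hash(path_b, 2, 32))
# With enough chunks the differing tails are seen.
full_differs = (partial_file_hash(path_a, 10, 32)
                != partial_file_hash(path_b, 10, 32))
print(same_prefix, full_differs)
```

The example shows the trade-off: smaller limits make the check cheaper but allow files that differ only beyond the examined region to be treated as the same.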

pycbc.workflow.datafind module

This module is responsible for querying a datafind server to determine the availability of the data that the code is attempting to run on. It also performs a number of tests and can act on these as described below. Full documentation for this function can be found here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/datafind.html

pycbc.workflow.datafind.convert_cachelist_to_filelist(datafindcache_list)[source]

Take as input a list of glue.lal.Cache objects and return a pycbc FileList containing all frames within those caches.

Parameters:

datafindcache_list (list of glue.lal.Cache objects) – The list of cache files to convert.

Returns:

datafind_filelist – The list of frame files.

Return type:

FileList of frame File objects

pycbc.workflow.datafind.datafind_keep_unique_backups(backup_outs, orig_outs)[source]

This function will take a list of backup datafind files, presumably obtained by querying a remote datafind server, e.g. CIT, and compare these against a list of original datafind files, presumably obtained by querying the local datafind server. Only the datafind files in the backup list that do not appear in the original list are returned. This allows us to use only files that are missing from the local cluster.

Parameters:
  • backup_outs (FileList) – List of datafind files from the remote datafind server.

  • orig_outs (FileList) – List of datafind files from the local datafind server.

Returns:

List of datafind files in backup_outs and not in orig_outs.

Return type:

FileList
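
The filtering step can be sketched with plain file names in place of File objects. Comparing entries by name is an assumption made for this example; the source does not state which property the real function compares, and the frame file names below are invented.

```python
def keep_unique_backups(backup_names, orig_names):
    """Return the backup entries that are absent from the original list."""
    known = set(orig_names)
    return [name for name in backup_names if name not in known]


# Strings stand in for the File objects held in a FileList.
orig = ['H-H1_FRAME-100-64.gwf', 'H-H1_FRAME-164-64.gwf']
backup = ['H-H1_FRAME-100-64.gwf', 'H-H1_FRAME-228-64.gwf']
unique = keep_unique_backups(backup, orig)
print(unique)
```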

pycbc.workflow.datafind.get_missing_segs_from_frame_file_cache(datafindcaches)[source]

This function will use os.path.isfile to determine if all the frame files returned by the local datafind server actually exist on the disk. This can then be used to update the science times if needed.

Parameters:

datafindcaches (OutGroupList) – List of all the datafind output files.

Returns:

  • missingFrameSegs (Dict. of ifo keyed igwn_segments.segmentlist instances) – The times corresponding to missing frames found in datafindOuts.

  • missingFrames (Dict. of ifo keyed lal.Cache instances) – The list of missing frames
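
The existence check described above can be pictured as follows, using an ifo-keyed dictionary of file paths as a stand-in for the datafind caches. The function name and the dictionary layout are assumptions, and the sketch returns only the missing paths rather than the corresponding segments.

```python
import os
import tempfile


def find_missing_frames(frame_paths_by_ifo):
    """Return, per ifo, the frame paths that do not exist on disk."""
    missing = {}
    for ifo, paths in frame_paths_by_ifo.items():
        missing[ifo] = [path for path in paths
                        if not os.path.isfile(path)]
    return missing


# Create one real file and reference one that does not exist.
frame_dir = tempfile.mkdtemp()
present = os.path.join(frame_dir, 'present.gwf')
absent = os.path.join(frame_dir, 'absent.gwf')
with open(present, 'w') as handle:
    handle.write('')

missing = find_missing_frames({'H1': [present, absent], 'L1': [present]})
print(missing)
```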

pycbc.workflow.datafind.get_science_segs_from_datafind_outs(datafindcaches)[source]

This function will calculate the science segments that are covered in the OutGroupList containing the frame files returned by various calls to the datafind server. This can then be used to check whether this list covers what it is expected to cover.

Parameters:

datafindcaches (OutGroupList) – List of all the datafind output files.

Returns:

newScienceSegs – The times covered by the frames found in datafindOuts.

Return type:

Dictionary of ifo keyed igwn_segments.segmentlist instances
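
Working out the time covered by a set of frame files is essentially an interval merge per detector. The sketch below coalesces (start, end) tuples; igwn_segments.segmentlist provides this operation itself, so the hand-written helper exists only to keep the example self-contained.

```python
def coalesce(segments):
    """Merge overlapping or touching (start, end) pairs."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Extend the previous span instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Time spans of individual frame files, keyed by ifo.
frame_spans = {
    'H1': [(0, 64), (64, 128), (256, 320)],
    'L1': [(10, 20), (15, 40)],
}
covered = {ifo: coalesce(spans) for ifo, spans in frame_spans.items()}
print(covered)
```

Comparing the merged spans with the requested science segments then shows whether any time is left without frame data.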

pycbc.workflow.datafind.get_segment_summary_times(scienceFile, segmentName)[source]

This function will find the times for which the segment_summary is set for the flag given by segmentName.

Parameters:
  • scienceFile (SegFile) – The segment file that we want to use to determine this.

  • segmentName (string) – The DQ flag to search for times in the segment_summary table.

Returns:

summSegList – The times that are covered in the segment summary table.

Return type:

igwn_segments.segmentlist

pycbc.workflow.datafind.log_datafind_command(observatory, frameType, startTime, endTime, outputDir, **dfKwargs)[source]

This command will print an equivalent gw_data_find command to disk that can be used to debug why the internal datafind module is not working.

pycbc.workflow.datafind.run_datafind_instance(cp, outputDir, observatory, frameType, startTime, endTime, ifo, tags=None)[source]

This function will query the datafind server once to find frames between the specified times for the specified frame type and observatory.

Parameters:
  • cp (ConfigParser instance) – Source for any kwargs that should be sent to the datafind module

  • outputDir (path) – Output cache files will be written here. We also write the commands for reproducing what is done in this function to this directory.

  • observatory (string) – The observatory to query frames for. Ex. ‘H’, ‘L’ or ‘V’. NB: not ‘H1’, ‘L1’, ‘V1’ which denote interferometers.

  • frameType (string) – The frame type to query for.

  • startTime (int) – Integer start time to query the datafind server for frames.

  • endTime (int) – Integer end time to query the datafind server for frames.

  • ifo (string) – The interferometer to use for naming output. Ex. ‘H1’, ‘L1’, ‘V1’. Maybe this could be merged with the observatory string, but this could cause issues if running on old ‘H2’ and ‘H1’ data.

  • tags (list of string, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!

Returns:

  • dfCache (glue.lal.Cache instance) – The glue.lal.Cache representation of the call to the datafind server and the returned frame files.

  • cacheFile (pycbc.workflow.core.File) – Cache file listing all of the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_from_pregenerated_lcf_files(cp, ifos, outputDir, tags=None)[source]

This function is used if you want to run with pregenerated lcf frame cache files.

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files

  • ifos (list of ifo strings) – List of ifos to get pregenerated files for.

  • outputDir (path) – All output files written by datafind processes will be written to this directory. Currently this sub-module writes no output.

  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename.

Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.

  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_cache_multi_calls_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_single_call_perifo, this function will make one call to the datafind server for every science segment. This function will return a list of output files that correspond to the cache .lcf files that are produced, which list the locations of all frame files. This will cause problems with pegasus, which expects to know about all input files (i.e. the frame files themselves).

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files

  • scienceSegs (Dictionary of ifo keyed igwn_segments.segmentlist instances) – This contains the times that the workflow is expected to analyse.

  • outputDir (path) – All output files written by datafind processes will be written to this directory.

  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!

Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.

  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_cache_single_call_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_generated, this function will make only one call to datafind per ifo, spanning the whole time. This function will return a list of output files that correspond to the cache .lcf files that are produced, which list the locations of all frame files. This will cause problems with pegasus, which expects to know about all input files (i.e. the frame files themselves).

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files

  • scienceSegs (Dictionary of ifo keyed igwn_segments.segmentlist instances) – This contains the times that the workflow is expected to analyse.

  • outputDir (path) – All output files written by datafind processes will be written to this directory.

  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!

Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.

  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_frames_multi_calls_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_single_call_perifo, this function will make one call to the datafind server for every science segment. This function will return a list of files corresponding to the individual frames returned by the datafind query. This will allow pegasus to more easily identify all the files used as input, but may cause problems for codes that need to take frame cache files as input.

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files

  • scienceSegs (Dictionary of ifo keyed igwn_segments.segmentlist instances) – This contains the times that the workflow is expected to analyse.

  • outputDir (path) – All output files written by datafind processes will be written to this directory.

  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!

Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.

  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_runtime_frames_single_call_perifo(cp, scienceSegs, outputDir, tags=None)[source]

This function uses the gwdatafind library to obtain the location of all the frame files that will be needed to cover the analysis of the data given in scienceSegs. This function will not check if the returned frames cover the whole time requested; such sanity checks are done in the pycbc.workflow.setup_datafind_workflow entry function. As opposed to setup_datafind_runtime_generated, this function will make only one call to datafind per ifo, spanning the whole time. This function will return a list of files corresponding to the individual frames returned by the datafind query. This will allow pegasus to more easily identify all the files used as input, but may cause problems for codes that need to take frame cache files as input.

Parameters:
  • cp (ConfigParser.ConfigParser instance) – This contains a representation of the information stored within the workflow configuration files

  • scienceSegs (Dictionary of ifo keyed igwn_segments.segmentlist instances) – This contains the times that the workflow is expected to analyse.

  • outputDir (path) – All output files written by datafind processes will be written to this directory.

  • tags (list of strings, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!

Returns:

  • datafindcaches (list of glue.lal.Cache instances) – The glue.lal.Cache representations of the various calls to the datafind server and the returned frame files.

  • datafindOuts (pycbc.workflow.core.FileList) – List of all the datafind output files for use later in the pipeline.

pycbc.workflow.datafind.setup_datafind_workflow(workflow, scienceSegs, outputDir, seg_file=None, tags=None)[source]

Set up the datafind section of the workflow. This section is responsible for generating, or setting up the workflow to generate, a list of files that record the location of the frame files needed to perform the analysis. There could be multiple options here: the datafind jobs could be done at run time or could be put into a dag. The subsequent jobs will know what was done here from the OutFileList containing the datafind jobs (and the Dagman nodes if appropriate). For now the only implemented option is to generate the datafind files at runtime. This module can also check if the frameFiles actually exist, check whether the obtained segments line up with the original ones and update the science segments to reflect missing data files.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.

  • scienceSegs (Dictionary of ifo keyed igwn_segments.segmentlist instances) – This contains the times that the workflow is expected to analyse.

  • outputDir (path) – All output files written by datafind processes will be written to this directory.

  • seg_file (SegFile, optional (default=None)) – The file returned by get_science_segments containing the science segments and the associated segment_summary. This will be used for the segment_summary test and is required if, and only if, performing that test.

  • tags (list of string, optional (default=None)) – Use this to specify tags. This can be used if this module is being called more than once to give call specific configuration (by setting options in [workflow-datafind-${TAG}] rather than [workflow-datafind]). This is also used to tag the Files returned by the class to uniqueify the Files and uniqueify the actual filename. FIXME: Filenames may not be unique with current codes!

Returns:

  • datafindOuts (OutGroupList) – List of all the datafind output files for use later in the pipeline.

  • sci_avlble_file (SegFile) – SegFile containing the analysable time after checks in the datafind module are applied to the input segment list. For production runs this is expected to be equal to the input segment list.

  • scienceSegs (Dictionary of ifo keyed igwn_segments.segmentlist instances) – This contains the times that the workflow is expected to analyse. If the updateSegmentTimes kwarg is given this will be updated to reflect any instances of missing data.

  • sci_avlble_name (string) – The name with which the analysable time is stored in the sci_avlble_file.

pycbc.workflow.dq module

class pycbc.workflow.dq.PyCBCBinTemplatesDQExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

create_node(workflow, ifo, template_bank_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.dq.PyCBCBinTriggerRatesDQExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

create_node(workflow, flag_file, flag_name, analysis_segment_file, analysis_segment_name, trig_file, template_bins_file)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
pycbc.workflow.dq.setup_dq_reranking(workflow, insps, bank, analyzable_seg_file, analyzable_name, dq_seg_file, output_dir=None, tags=None)[source]

pycbc.workflow.grb_utils module

This library code contains functions and classes that are used in the generation of pygrb workflows. For details about pycbc.workflow see here: http://pycbc.org/pycbc/latest/html/workflow.html

class pycbc.workflow.grb_utils.PycbcGrbInjFinderExecutable(cp, exe_name)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_grb_inj_finder

create_node(inj_files, inj_insp_files, bank_file, out_dir, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.grb_utils.PycbcGrbTrigClusterExecutable(cp, name)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_grb_trig_cluster.

create_node(in_file, out_dir)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.grb_utils.PycbcGrbTrigCombinerExecutable(cp, name)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_grb_trig_combiner.

create_node(ifo_tag, seg_dir, segment, insp_files, out_dir, bank_file, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
pycbc.workflow.grb_utils.build_segment_filelist(seg_dir)[source]

Construct a FileList instance containing all segment txt files

pycbc.workflow.grb_utils.fermi_core_tail_model(sky_err, rad, core_frac=0.98, core_sigma=3.6, tail_sigma=29.6)[source]

Fermi systematic error model following https://arxiv.org/abs/1909.03006, with default values valid before 11 September 2019.

Parameters:
  • core_frac (float) – Fraction of the systematic uncertainty contained within the core component.

  • core_sigma (float) – Size of the GBM systematic core component.

  • tail_sigma (float) – Size of the GBM systematic tail component.

Returns:

Tuple containing the core and tail probability distributions as a function of radius.

Return type:

tuple

pycbc.workflow.grb_utils.generate_tc_prior(wflow, tc_path, buffer_seg)[source]

Generate the configuration file for the prior on the coalescence time of injections, ensuring that these times fall in the analysis time and avoid the onsource and its buffer.

Parameters:
  • tc_path (str) – Path where the configuration file for the prior needs to be written.

  • buffer_seg (segmentlist) – Start and end times of the buffer segment encapsulating the onsource.

pycbc.workflow.grb_utils.get_sky_grid_scale(sky_error=0.0, containment=0.9, upscale=False, fermi_sys=False, precision=0.001, **kwargs)[source]

Calculate the angular radius corresponding to a desired localization uncertainty level. This is used to generate the search grid and involves scaling up the standard 1-sigma value provided to the workflow, assuming a normal probability profile. Fermi systematic errors can be included, following https://arxiv.org/abs/1909.03006, with default values valid before 11 September 2019. The default probability coverage is 90%.

Parameters:
  • sky_error (float) – The reported statistical 1-sigma sky error of the trigger.

  • containment (float) – The desired localization probability to be covered by the sky grid.

  • upscale (bool, optional) – Whether to rescale from 1 sigma to the requested containment for non-Fermi triggers. Default = False, as Swift reports the 90% radius directly.

  • fermi_sys (bool, optional) – Whether to apply Fermi-GBM systematics via fermi_core_tail_model. Default = False.

  • precision (float, optional) – Precision (in degrees) for calculating the error radius via Fermi-GBM model.

  • **kwargs – Additional keyword arguments passed to fermi_core_tail_model.

Returns:

Sky error radius in degrees.

Return type:

float
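To illustrate the rescaling from a 1-sigma value to a containment radius, the sketch below uses the closed-form result for a circular two-dimensional Gaussian, whose radial distribution is a Rayleigh distribution. The helper name is hypothetical, and treating the 1-sigma value as the Rayleigh scale parameter is an assumption made for this example. It is not the PyCBC implementation and it ignores the Fermi-GBM systematics.

```python
import math

def scale_to_containment(sky_error, containment=0.9):
    """Radius enclosing `containment` probability for a circular 2D Gaussian.

    Hypothetical helper: `sky_error` is taken to be the Rayleigh scale
    parameter (per-axis standard deviation), in degrees.
    """
    if not 0.0 < containment < 1.0:
        raise ValueError("containment must lie strictly between 0 and 1")
    # Rayleigh CDF: P(r < R) = 1 - exp(-R**2 / (2 * sigma**2)); solve for R.
    return sky_error * math.sqrt(-2.0 * math.log(1.0 - containment))

radius_90 = scale_to_containment(1.0, containment=0.9)
print(f"{radius_90:.3f}")  # about 2.146 for a 1 degree sky error
```

If the quoted 1-sigma value is instead a 68% containment radius, the scale factor differs, so the convention should be checked before reusing this formula.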

pycbc.workflow.grb_utils.make_gating_node(workflow, datafind_files, outdir=None, tags=None)[source]

Generate jobs for autogating the data for PyGRB runs.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • datafind_files (pycbc.workflow.core.FileList) – A FileList containing the frame files to be gated.

  • outdir (string) – Path of the output directory

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

Returns:

  • condition_strain_nodes (list) – List containing the pycbc.workflow.core.Node objects representing the autogating jobs.

  • condition_strain_outs (pycbc.workflow.core.FileList) – FileList containing the pycbc.workflow.core.File objects representing the gated frame files.

pycbc.workflow.grb_utils.make_pygrb_info_table(workflow, exec_name, out_dir, in_files=None, tags=None)[source]

Set up a job to create an html snippet with the GRB trigger information or exclusion distances information.

pycbc.workflow.grb_utils.make_pygrb_injs_tables(workflow, out_dir, bank_file, off_file, seg_files, inj_file=None, on_file=None, veto_file=None, tags=None)[source]

Adds a job to make quiet-found and missed-found injection tables, or loudest trigger(s) table.

pycbc.workflow.grb_utils.make_pygrb_plot(workflow, exec_name, out_dir, ifo=None, inj_file=None, trig_file=None, onsource_file=None, bank_file=None, seg_files=None, veto_file=None, tags=None, **kwargs)[source]

Adds a node for a plot of PyGRB results to the workflow

pycbc.workflow.grb_utils.make_skygrid_node(workflow, out_dir, tags=None)[source]

Adds a job to the workflow to produce the PyGRB search skygrid.

pycbc.workflow.grb_utils.set_grb_start_end(cp, start, end)[source]

Function to update analysis boundaries as workflow is generated

Parameters:
  • cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The parsed configuration options of a pycbc.workflow.core.Workflow.

  • start (int) – The start of the workflow analysis time.

  • end (int) – The end of the workflow analysis time.

Returns:

  • cp (pycbc.workflow.configuration.WorkflowConfigParser object) – The modified WorkflowConfigParser object.
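A minimal sketch of this kind of update, using the standard-library ConfigParser as a stand-in for WorkflowConfigParser. The [workflow] section and the start-time and end-time option names are assumptions made for the example, as is the helper name.

```python
from configparser import ConfigParser

def set_start_end(cp, start, end):
    """Store the analysis boundaries in the parser and return it.

    The [workflow] section and the start-time/end-time option names are
    assumptions made for this sketch.
    """
    if not cp.has_section("workflow"):
        cp.add_section("workflow")
    # Configuration values are stored as strings.
    cp.set("workflow", "start-time", str(start))
    cp.set("workflow", "end-time", str(end))
    return cp

cp = set_start_end(ConfigParser(), 1126259000, 1126263000)
print(cp.get("workflow", "start-time"), cp.get("workflow", "end-time"))
```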

pycbc.workflow.grb_utils.setup_pygrb_minifollowups(workflow, followups_file, trigger_file, dax_output, out_dir, seg_files=None, veto_file=None, tags=None)[source]

Create plots that follow up the loudest PyGRB triggers or missed injections from an HDF file.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • followups_file (pycbc.workflow.File) – The File class holding the triggers/injections to follow up

  • trigger_file (pycbc.workflow.File) – The File class holding the triggers

  • dax_output – The directory that will contain the dax file

  • out_dir (path) – The directory to store minifollowups result plots and files

  • seg_files ({pycbc.workflow.FileList, optional}) – The list of segments Files

  • veto_file ({pycbc.workflow.File, optional}) – The veto definer file

  • tags ({None, optional}) – Tags to add to the minifollowups executables

pycbc.workflow.grb_utils.setup_pygrb_pp_workflow(wf, pp_dir, seg_dir, segment, bank_file, insp_files, inj_files, inj_insp_files, inj_tags)[source]

Generate post-processing section of PyGRB offline workflow

Parameters:
  • wf – The workflow object

  • pp_dir – The directory where the post-processing files will be stored

  • seg_dir – The directory where the segment files are stored

  • segment – The segment to be analyzed

  • bank_file – The full template bank file

  • insp_files – The list of inspiral files

  • inj_files – The list of injection files

  • inj_insp_files – The list of inspiral files for injections

  • inj_tags – The list of injection tags

Returns:

  • trig_files (FileList) – The combined trigger files, ordered as [ALL_TIMES, ONSOURCE, OFFSOURCE, OFFTRIAL_1, …, OFFTRIAL_N]. N can be set by the user and is 6 by default.

  • clustered_files (FileList) – The CLUSTERED files, in the same order as trig_files. These contain the triggers after clustering.

  • inj_find_files (FileList) – FOUNDMISSED FileList covering all injection sets
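The ordering of the returned trigger files can be spelled out with a short sketch. The helper name is hypothetical; only the label strings and the default of six off-source trials come from the description above.

```python
def trigger_file_labels(num_trials=6):
    """List the trigger-file labels in the order described above."""
    fixed = ["ALL_TIMES", "ONSOURCE", "OFFSOURCE"]
    # Off-source trials are numbered from 1 to N.
    trials = [f"OFFTRIAL_{i}" for i in range(1, num_trials + 1)]
    return fixed + trials

labels = trigger_file_labels()
print(labels[3], labels[-1])  # OFFTRIAL_1 OFFTRIAL_6
```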

pycbc.workflow.grb_utils.setup_pygrb_results_workflow(workflow, res_dir, trig_files, inj_files, bank_file, seg_dir, veto_file=None, tags=None, explicit_dependencies=None)[source]

Create subworkflow to produce plots, tables, and results webpage for a PyGRB analysis.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • res_dir – The post-processing directory where results (plots, etc.) will be stored

  • trig_files – FileList of trigger files

  • inj_files – FileList of injection results

  • bank_file – The template bank File object

  • seg_dir – The directory path with the segments files

  • veto_file ({None, optional}) – The veto File object

  • tags ({None, optional}) – Tags to add to the executables

  • explicit_dependencies ({None, optional}) – Nodes that must precede this subworkflow

pycbc.workflow.inference_followups module

Module that contains functions for setting up the inference workflow.

pycbc.workflow.inference_followups.create_fits_file(workflow, inference_file, output_dir, name='create_fits_file', analysis_seg=None, tags=None)[source]

Sets up job to create fits files from some given samples files.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is create_fits_file.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.create_posterior_files(workflow, samples_files, output_dir, parameters=None, name='extract_posterior', analysis_seg=None, tags=None)[source]

Sets up job to create posterior files from some given samples files.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The workflow instance we are populating

  • samples_files (str or list of str) – One or more files to extract the posterior samples from.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is extract_posterior.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.get_diagnostic_plots(workflow)[source]

Determines what diagnostic plots to create based on workflow.

The plots to create are based on which executables are specified in the workflow’s config file. A list of strings is returned giving the diagnostic plots to create. This list may contain:

  • samples: For MCMC samplers, a plot of the sample chains as a function of iteration. This will be created if plot_samples is in the executables section.

  • acceptance_rate: For MCMC samplers, a plot of the acceptance rate. This will be created if plot_acceptance_rate is in the executables section.

Returns:

List of names of diagnostic plots.

Return type:

list
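The selection rule above can be sketched with the standard-library ConfigParser standing in for the workflow's configuration. The helper name and the executable paths are placeholders; only the diagnostic names and the executable option names come from the description.

```python
from configparser import ConfigParser

# Map from diagnostic name to the executable that must be configured.
DIAGNOSTIC_EXECUTABLES = {
    "samples": "plot_samples",
    "acceptance_rate": "plot_acceptance_rate",
}

def diagnostics_from_config(cp):
    """Return the diagnostics whose executable is listed in [executables]."""
    if not cp.has_section("executables"):
        return []
    configured = cp.options("executables")
    return [name for name, exe in DIAGNOSTIC_EXECUTABLES.items()
            if exe in configured]

cp = ConfigParser()
cp.read_string("""
[executables]
inference = /path/to/inference_executable
plot_samples = /path/to/plot_samples_executable
""")
print(diagnostics_from_config(cp))  # ['samples']
```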

pycbc.workflow.inference_followups.get_plot_group(cp, section_tag)[source]

Gets plotting groups from [workflow-section_tag].
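As an illustration, the sketch below collects plot-group-NAME options from a [workflow-section_tag] section and splits each value into PARAM[:LABEL] entries, the format used for plot groups elsewhere in this module. It uses the standard-library ConfigParser; the helper name, the return structure, and the parameter names in the sample configuration are assumptions rather than the behaviour of get_plot_group.

```python
from configparser import ConfigParser

def read_plot_groups(cp, section_tag, prefix="plot-group-"):
    """Return {group name: [(param, label or None), ...]} for a section."""
    section = f"workflow-{section_tag}"
    groups = {}
    for opt in cp.options(section):
        if not opt.startswith(prefix):
            continue
        entries = []
        for entry in cp.get(section, opt).split():
            # Each entry is PARAM or PARAM:LABEL.
            param, _, label = entry.partition(":")
            entries.append((param, label or None))
        groups[opt[len(prefix):]] = entries
    return groups

cp = ConfigParser()
cp.read_string("""
[workflow-summary_plots]
plot-group-masses = mass1:m1 mass2:m2
plot-group-distance = distance
""")
print(read_plot_groups(cp, "summary_plots"))
```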

pycbc.workflow.inference_followups.make_diagnostic_plots(workflow, diagnostics, samples_file, label, rdir, tags=None)[source]

Makes diagnostic plots.

Diagnostic plots are sampler-specific plots that provide information on how the sampler performed. All diagnostic plots use the output file produced by pycbc_inference as their input. Diagnostic plots are added to the results directory rdir/NAME where NAME is the name of the diagnostic given in diagnostics.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow to add the plotting jobs to.

  • diagnostics (list of str) – The names of the diagnostic plots to create. See get_diagnostic_plots() for recognized names.

  • samples_file ((list of) pycbc.workflow.File) – One or more samples files with which to create the diagnostic plots. If a list of files is provided, a diagnostic plot for each file will be created.

  • label (str) – Event label for the diagnostic plots.

  • rdir (pycbc.results.layout.SectionNumber) – Results directory layout.

  • tags (list of str, optional) – Additional tags to add to the file names.

Returns:

Dictionary of diagnostic name -> list of files giving the plots that will be created.

Return type:

dict

pycbc.workflow.inference_followups.make_inference_acceptance_rate_plot(workflow, inference_file, output_dir, name='plot_acceptance_rate', analysis_seg=None, tags=None)[source]

Sets up a plot of the acceptance rate (for MCMC samplers).

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_acceptance_rate.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_dynesty_run_plot(workflow, inference_file, output_dir, name='plot_dynesty_run', analysis_seg=None, tags=None)[source]

Sets up a debugging plot for the dynesty run (for Dynesty sampler).

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_dynesty_run.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_dynesty_trace_plot(workflow, inference_file, output_dir, name='plot_dynesty_traceplot', analysis_seg=None, tags=None)[source]

Sets up a trace plot for the dynesty run (for Dynesty sampler).

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_dynesty_traceplot.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_inj_recovery_plot(workflow, posterior_files, output_dir, parameter, injection_samples_map=None, name='inj_recovery', analysis_seg=None, tags=None)[source]

Sets up the recovered versus injected parameter plot in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • posterior_files (pycbc.workflow.core.FileList) – List of files with posteriors of injections.

  • output_dir (str) – The directory to store result plots and files.

  • parameter (str) – The parameter to plot.

  • injection_samples_map ((list of) str, optional) – Map between injection parameters and parameters in the posterior file. Format is INJECTION_PARAM:SAMPLES_PARAM.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is inj_recovery.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList
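The INJECTION_PARAM:SAMPLES_PARAM format accepted by injection_samples_map can be illustrated with a small parser. The helper and the parameter names in the example are hypothetical; the workflow function itself may simply pass these strings on to the executable.

```python
def parse_injection_samples_map(entries):
    """Turn 'INJECTION_PARAM:SAMPLES_PARAM' strings into a dictionary."""
    if isinstance(entries, str):
        entries = [entries]
    mapping = {}
    for entry in entries:
        inj_param, sep, samples_param = entry.partition(":")
        if not sep or not inj_param or not samples_param:
            raise ValueError(
                f"expected INJECTION_PARAM:SAMPLES_PARAM, got {entry!r}")
        mapping[inj_param] = samples_param
    return mapping

# Hypothetical parameter names, for illustration only.
print(parse_injection_samples_map(["mass1:recovered_mass1", "tc:trigger_time"]))
# {'mass1': 'recovered_mass1', 'tc': 'trigger_time'}
```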

pycbc.workflow.inference_followups.make_inference_plot(workflow, input_file, output_dir, name, analysis_seg=None, tags=None, input_file_opt='input-file', output_file_extension='.png', add_to_workflow=False)[source]

Boiler-plate function for creating a standard plotting job.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • input_file ((list of) pycbc.workflow.File) – The file used for the input. May provide either a single file or a list of files.

  • output_dir (str) – The directory to store result plots.

  • name (str) – The name in the [executables] section of the configuration file to use.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

  • input_file_opt (str, optional) – The name of the input-file option used by the executable. Default is input-file.

  • output_file_extension (str, optional) – What file type to create. Default is .png.

  • add_to_workflow (bool, optional) – If True, the node will be added to the workflow before being returned. This means that no options may be added to the node afterward. Default is False.

Returns:

The job node for creating the plot.

Return type:

pycbc.workflow.plotting.PlotExecutable

pycbc.workflow.inference_followups.make_inference_plot_mcmc_history(workflow, inference_file, output_dir, name='plot_mcmc_history', analysis_seg=None, tags=None)[source]

Sets up a plot showing the checkpoint history of an MCMC sampler.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_mcmc_history.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_posterior_plot(workflow, inference_file, output_dir, parameters=None, plot_prior_from_file=None, name='plot_posterior', analysis_seg=None, tags=None)[source]

Sets up the corner plot of the posteriors in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • parameters (list or str) – The parameters to plot.

  • plot_prior_from_file (str, optional) – Plot the prior from the given config file on the 1D marginal plots.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_posterior.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_pp_plot(workflow, posterior_files, output_dir, parameters=None, injection_samples_map=None, name='plot_pp', analysis_seg=None, tags=None)[source]

Sets up a pp plot in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • posterior_files (pycbc.workflow.core.FileList) – List of files with posteriors of injections.

  • output_dir (str) – The directory to store result plots and files.

  • parameters (list or str, optional) – The parameters to plot.

  • injection_samples_map ((list of) str, optional) – Map between injection parameters and parameters in the posterior file. Format is INJECTION_PARAM:SAMPLES_PARAM.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_pp.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_pp_table(workflow, posterior_files, output_dir, parameters=None, injection_samples_map=None, name='pp_table_summary', analysis_seg=None, tags=None)[source]

Performs a PP test, writing the results to an html table.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • posterior_files (pycbc.workflow.core.FileList) – List of files with posteriors of injections.

  • output_dir (str) – The directory to store result plots and files.

  • parameters (list or str, optional) – A list or string of parameters to generate the table for. If a string is provided, separate parameters should be space or new-line separated.

  • injection_samples_map ((list of) str, optional) – Map between injection parameters and parameters in the posterior file. Format is INJECTION_PARAM:SAMPLES_PARAM.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is pp_table_summary.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_prior_plot(workflow, config_file, output_dir, name='plot_prior', analysis_seg=None, tags=None)[source]

Sets up the corner plot of the priors in the workflow.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • config_file (pycbc.workflow.File) – The inference configuration file, parsable by WorkflowConfigParser.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_prior.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of the output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_samples_plot(workflow, inference_file, output_dir, name='plot_samples', analysis_seg=None, tags=None)[source]

Sets up a plot of the samples versus iteration (for MCMC samplers).

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_samples.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_skymap(workflow, fits_file, output_dir, name='plot_skymap', analysis_seg=None, tags=None)[source]

Sets up the skymap plot.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • fits_file (pycbc.workflow.File) – The fits file with the sky location.

  • output_dir (str) – The directory to store result plots and files.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is plot_skymap.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of result and output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_inference_summary_table(workflow, inference_file, output_dir, parameters=None, print_metadata=None, name='table_summary', analysis_seg=None, tags=None)[source]

Sets up the html table summarizing parameter estimates.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • inference_file (pycbc.workflow.File) – The file with posterior samples.

  • output_dir (str) – The directory to store result plots and files.

  • parameters (list or str) – A list or string of parameters to generate the table for. If a string is provided, separate parameters should be space or new-line separated.

  • print_metadata (list or str) – A list or string of metadata parameters to print. Syntax is the same as for parameters.

  • name (str, optional) – The name in the [executables] section of the configuration file to use, and the section to read for additional arguments to pass to the executable. Default is table_summary.

  • analysis_seg (igwn_segments.segment, optional) – The segment this job encompasses. If None then use the total analysis time from the workflow.

  • tags (list, optional) – Tags to add to the inference executables.

Returns:

A list of output files.

Return type:

pycbc.workflow.FileList

pycbc.workflow.inference_followups.make_posterior_workflow(workflow, samples_files, config_file, label, rdir, posterior_file_dir='posterior_files', tags=None)[source]

Adds jobs to a workflow that make a posterior file and subsequent plots.

A posterior file is first created from the given samples file(s). The settings for extracting the posterior are set by the [extract_posterior] section. If that section has a parameters argument, then the parameters in the posterior file (and for use in all subsequent plotting) will be whatever that option is set to. Otherwise, the parameters in the posterior file will be whatever is common to all of the given samples file(s).

Except for prior plots (which use the given inference config file), all subsequent jobs use the posterior file. The following are created:

  • Summary table: an html table created using the table_summary executable. The parameters to print in the table are retrieved from the table-params option in the [workflow-summary_table] section. Metadata may also be printed by adding a print-metadata option to that section.

  • Summary posterior plots: a collection of posterior plots to include in the summary page, after the summary table. The parameters to plot are read from [workflow-summary_plots]. Parameters should be grouped together by providing plot-group-NAME = PARAM1[:LABEL1] PARAM2[:LABEL2] in that section, where NAME is a unique name for each group. One posterior plot will be created for each plot group. For clarity, only one or two parameters should be plotted in each summary group, but this is not enforced. Settings for the plotting executable are read from the plot_posterior_summary section; likewise, the executable used is read from plot_posterior_summary in the [executables] section.

  • Sky maps: if both create_fits_file and plot_skymap are listed in the [executables] section, then a .fits file and sky map plot will be produced. The sky map plot will be included in the summary plots. You must be running in a python 3 environment to create these.

  • Prior plots: plots of the prior will be created using the plot_prior executable. By default, all of the variable parameters will be plotted. The prior plots are added to priors/LABEL/ in the results directory, where LABEL is the given label.

  • Posterior plots: additional posterior plots are created using the plot_posterior executable. The parameters to plot are read from [workflow-plot_params] section. As with the summary posterior plots, parameters are grouped together by providing plot-group-NAME options in that section. A posterior plot will be created for each group, and added to the posteriors/LABEL/ directory. Plot settings are read from the [plot_posterior] section; this is kept separate from the posterior summary so that different settings can be used. For example, you may want to make a density plot for the summary plots, but a scatter plot colored by SNR for the posterior plots.

Parameters:
  • samples_files (pycbc.workflow.core.FileList) – List of samples files to combine into a single posterior file.

  • config_file (pycbc.workflow.File) – The inference configuration file used to generate the samples file(s). This is needed to make plots of the prior.

  • label (str) – Unique label for the plots. Used in file names.

  • rdir (pycbc.results.layout.SectionNumber) – The results directory to save the plots to.

  • posterior_file_dir (str, optional) – The name of the directory to save the posterior file to. Default is posterior_files.

  • tags (list of str, optional) – Additional tags to add to the file names.

Returns:

  • posterior_file (pycbc.workflow.File) – The posterior file that was created.

  • summary_files (list) – List of files to go on the summary results page.

  • prior_plots (list) – List of prior plots that will be created. These will be saved to priors/LABEL/ in the results directory, where LABEL is the provided label.

  • posterior_plots (list) – List of posterior plots that will be created. These will be saved to posteriors/LABEL/ in the results directory.

pycbc.workflow.injection module

This module is responsible for setting up the part of a pycbc workflow that will generate the injection files to be used for assessing the workflow’s ability to detect predicted signals. Full documentation for this module can be found here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

class pycbc.workflow.injection.PyCBCMergeHDFExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Merge HDF injection files executable class

create_node(workflow, input_files)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
class pycbc.workflow.injection.PyCBCOptimalSNRExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Compute optimal SNR for injections

create_node(workflow, inj_file, precalc_psd_files, group_str)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
pycbc.workflow.injection.compute_inj_optimal_snr(workflow, inj_file, precalc_psd_files, out_dir, tags=None)[source]

Set up a job for computing optimal SNRs of a sim_inspiral file.

pycbc.workflow.injection.cut_distant_injections(workflow, inj_file, out_dir, tags=None)[source]

Set up a job for removing injections that are too distant to be seen

pycbc.workflow.injection.inj_to_hdf(workflow, inj_file, out_dir, tags=None)[source]

Convert injection file to hdf format.

If the file is already PyCBC HDF format, this will just make a copy.

pycbc.workflow.injection.setup_injection_workflow(workflow, output_dir=None, inj_section_name='injections', tags=None)[source]

This function is the gateway for setting up injection-generation jobs in a workflow. It should be possible for this function to support a number of different ways/codes that could be used for doing this. However, as this will presumably stay as a single call to a single code (which need not be inspinj), there are currently no subfunctions in this module.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the injection jobs will be added to.

  • output_dir (path) – The directory in which injection files will be stored.

  • inj_section_name (string (optional, default='injections')) – The string that corresponds to the option describing the exe location in the [executables] section of the .ini file and that corresponds to the section (and sub-sections) giving the options that will be given to the code at run time.

  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. This will be used in output names.

Returns:

  • inj_files (pycbc.workflow.core.FileList) – The list of injection files created by this call.

  • inj_tags (list of strings) – The tag corresponding to each injection file and used to uniquely identify them. The FileList class contains functions to search based on tags.

pycbc.workflow.injection.veto_injections(workflow, inj_file, veto_file, veto_name, out_dir, tags=None)[source]

pycbc.workflow.jobsetup module

This library code contains functions and classes that are used to set up and add jobs/nodes to a pycbc workflow. For details about pycbc.workflow see: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope.html

class pycbc.workflow.jobsetup.JobSegmenter(data_lengths, valid_chunks, valid_lengths, curr_seg, curr_exe_class)[source]

Bases: object

This class is used when running sngl_ifo_job_setup to determine what times should be analysed by each job and what data is needed.

get_data_times_for_job(num_job)[source]

Get the data that this job will read in.

get_valid_times_for_job(num_job, allow_overlap=True)[source]

Get the times for which this job is valid.

pick_tile_size(seg_size, data_lengths, valid_chunks, valid_lengths)[source]

Choose the job tile size based on the science segment length.
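
The selection rule is not spelled out here. As an illustration only, the sketch below assumes one plausible heuristic: when several tile sizes are available, prefer the one whose valid length is closest to a third of the science segment. The function name pick_tile_size_sketch and the one-third target are assumptions made for this example, not a statement of what JobSegmenter does.

```python
def pick_tile_size_sketch(seg_size, data_lengths, valid_chunks, valid_lengths):
    """Pick one (data_length, valid_chunk, valid_length) candidate.

    Hypothetical heuristic: choose the candidate whose valid length is
    closest to one third of the science segment length.
    """
    if len(valid_lengths) == 1:
        # Only one option, nothing to choose.
        return data_lengths[0], valid_chunks[0], valid_lengths[0]
    target = seg_size / 3.0  # assumed target, not taken from the docs
    pick = min(range(len(valid_lengths)),
               key=lambda i: abs(valid_lengths[i] - target))
    return data_lengths[pick], valid_chunks[pick], valid_lengths[pick]


choice = pick_tile_size_sketch(
    3000,
    data_lengths=[512, 1024, 2048],
    valid_chunks=[(72, 440), (72, 952), (72, 1976)],
    valid_lengths=[368, 880, 1904],
)
print(choice)
```

The three input lists are parallel: entry i of each describes the same candidate tiling, which is why a single index is used to build the return value.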

class pycbc.workflow.jobsetup.LalappsInspinjExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

The class used to create jobs for the lalapps_inspinj Executable.

create_node(segment, exttrig_file=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
extension = '.xml'
class pycbc.workflow.jobsetup.LigolwAddExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

The class used to create nodes for the ligolw_add Executable.

create_node(jobSegment, input_files, output=None, use_tmp_subdirs=True, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
class pycbc.workflow.jobsetup.PyCBCInspiralExecutable(cp, exe_name, ifo=None, out_dir=None, injection_file=None, tags=None, reuse_executable=False)[source]

Bases: Executable

The class used to create jobs for the pycbc_inspiral Executable.

create_node(data_seg, valid_seg, parent=None, df_parents=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
get_valid_times()[source]

Determine possible dimensions of needed input and valid output

time_dependent_options = ['--channel-name']
zero_pad_data_extend(job_data_seg, curr_seg)[source]

When using zero padding, all data is analysable, but the setup functions must include the padding data where it is available so that we are not zero-padding in the middle of science segments. This function takes a job_data_seg, that is chosen for a particular node and extends it with segment-start-pad and segment-end-pad if that data is available.
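
A minimal sketch of the extension described above, using plain (start, end) tuples in place of segment objects. The function name extend_with_padding and the explicit start_pad/end_pad arguments are assumptions for this example; in the real class the pad lengths would come from the segment-start-pad and segment-end-pad options.

```python
def extend_with_padding(job_data_seg, curr_seg, start_pad, end_pad):
    """Extend job_data_seg by the pad lengths, clipped to curr_seg.

    job_data_seg and curr_seg are (start, end) tuples; curr_seg is the
    science segment that the job's data is taken from.
    """
    new_start = max(curr_seg[0], job_data_seg[0] - start_pad)
    new_end = min(curr_seg[1], job_data_seg[1] + end_pad)
    return (new_start, new_end)


# Only 10 s of data exist before the job's start, so the start is clipped
# to the science segment, while the full end pad is available.
print(extend_with_padding((100, 200), (90, 500), 30, 30))  # (90, 230)
```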

class pycbc.workflow.jobsetup.PyCBCMultiInspiralExecutable(cp, name, ifo=None, injection_file=None, gate_files=None, out_dir=None, tags=None)[source]

Bases: Executable

The class responsible for setting up jobs for the pycbc_multi_inspiral executable.

create_node(data_seg, valid_seg, parent=None, inj_file=None, dfParents=None, bankVetoBank=None, skygrid_file=None, ipn_file=None, slide=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
file_input_options = ['--gating-file', '--frame-files', '--injection-file', '--statistic-files', '--bank-file', '--config-files', '--psd-file', '--asd-file', '--fake-strain-from-file', '--sgburst-injection-file', '--bank-veto-bank-file']
get_valid_times()[source]
class pycbc.workflow.jobsetup.PyCBCTmpltbankExecutable(cp, exe_name, ifo=None, out_dir=None, tags=None, write_psd=False, psd_files=None)[source]

Bases: Executable

The class used to create jobs for pycbc_geom_nonspin_bank Executable and any other Executables using the same command line option groups.

create_nodata_node(valid_seg, tags=None)[source]

A simplified version of create_node that creates a node that does not need to read in data.

Parameters:

valid_seg (igwn_segments.segment) – The segment over which to declare the node valid. Usually this would be the duration of the analysis.

Returns:

node – The instance corresponding to the created node.

Return type:

pycbc.workflow.core.Node

create_node(data_seg, valid_seg, parent=None, df_parents=None, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 3
get_valid_times()[source]
class pycbc.workflow.jobsetup.PycbcConditionStrainExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_condition_strain.

create_node(input_files, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.jobsetup.PycbcCreateInjectionsExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_create_injections.

create_node(config_files=None, seed=None, tags=None)[source]

Set up a CondorDagmanNode class to run pycbc_create_injections.

Parameters:
  • config_files (pycbc.workflow.core.FileList) – A pycbc.workflow.core.FileList for injection configuration files to be used with --config-files option.

  • seed (int) – Seed to use for generating injections.

  • tags (list) – A list of tags to include in filenames.

Returns:

node – The node to run the job.

Return type:

pycbc.workflow.core.Node

current_retention_level = 2
extension = '.hdf'
class pycbc.workflow.jobsetup.PycbcHDFSplitInjExecutable(cp, exe_name, num_splits, ifo=None, out_dir=None)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_hdf_splitinj.

create_node(parent, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 2
class pycbc.workflow.jobsetup.PycbcInferenceExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_inference.

create_node(config_file, seed=None, tags=None, analysis_time=None)[source]

Set up a pegasus.Node instance to run pycbc_inference.

Parameters:
  • config_file (pycbc.workflow.core.File) – A pycbc.workflow.core.File for inference configuration file to be used with --config-files option.

  • seed (int) – An int to be used with --seed option.

  • tags (list) – A list of tags to include in filenames.

Returns:

node – The node to run the job.

Return type:

pycbc.workflow.core.Node

current_retention_level = 2
class pycbc.workflow.jobsetup.PycbcSplitBankExecutable(cp, exe_name, num_banks, ifo=None, out_dir=None)[source]

Bases: Executable

The class responsible for creating jobs for pycbc_hdf5_splitbank.

create_node(bank, tags=None)[source]

Set up a CondorDagmanNode class to run splitbank code

Parameters:

bank (pycbc.workflow.core.File) – The File containing the template bank to be split

Returns:

node – The node to run the job

Return type:

pycbc.workflow.core.Node

current_retention_level = 2
extension = '.hdf'
class pycbc.workflow.jobsetup.PycbcSplitBankXmlExecutable(cp, exe_name, num_banks, ifo=None, out_dir=None)[source]

Bases: PycbcSplitBankExecutable

Subclass responsible for creating jobs for pycbc_splitbank.

extension = '.xml.gz'
class pycbc.workflow.jobsetup.PycbcSplitInspinjExecutable(cp, exe_name, num_splits, ifo=None, out_dir=None)[source]

Bases: Executable

The class responsible for running the pycbc_split_inspinj executable

create_node(parent, tags=None)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 1
pycbc.workflow.jobsetup.identify_needed_data(curr_exe_job)[source]

This function will identify the length of data that a specific executable needs to analyse and what part of that data is valid (ie. inspiral doesn’t analyse the first or last 8s of data it reads in).

Parameters:

curr_exe_job (Job) – An instance of the Job class that has a get_valid_times method.

Returns:

  • dataLength (float) – The amount of data (in seconds) that each instance of the job must read in.

  • valid_chunk (igwn_segments.segment) – The times within dataLength for which that job's output can be valid (i.e. for inspiral this is (72, dataLength-72) as, for a standard setup, the inspiral job cannot look for triggers in the first 72 or last 72 seconds of data read in).

  • valid_length (float) – The maximum length of data each job can be valid for. This is abs(valid_segment).
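
The relationship between the three return values can be worked through with the 72-second example quoted above. This is only an arithmetic sketch: the helper name valid_chunk_for and its explicit pad arguments are hypothetical, and plain tuples stand in for igwn_segments.segment.

```python
def valid_chunk_for(data_length, start_pad, end_pad):
    """Return (valid_chunk, valid_length) for a job reading data_length seconds."""
    valid_chunk = (start_pad, data_length - end_pad)
    valid_length = valid_chunk[1] - valid_chunk[0]
    if valid_length <= 0:
        raise ValueError("padding leaves no valid data")
    return valid_chunk, valid_length


chunk, length = valid_chunk_for(2048, 72, 72)
print(chunk, length)  # (72, 1976) 1904
```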

pycbc.workflow.jobsetup.int_gps_time_to_str(t)[source]

Takes an integer GPS time, either given as int or lal.LIGOTimeGPS, and converts it to a string. If a LIGOTimeGPS with nonzero decimal part is given, raises a ValueError.
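
The behaviour described can be sketched without importing lal. The function name gps_to_str_sketch is hypothetical, and a SimpleNamespace with gpsSeconds and gpsNanoSeconds attributes stands in for lal.LIGOTimeGPS; treat the attribute-based check as an assumption of this example rather than the library's implementation.

```python
from types import SimpleNamespace


def gps_to_str_sketch(t):
    """Convert an integer GPS time to a string, rejecting fractional times."""
    if isinstance(t, int):
        return str(t)
    secs = getattr(t, "gpsSeconds", None)
    nanos = getattr(t, "gpsNanoSeconds", None)
    if secs is None or nanos is None:
        raise ValueError("unsupported GPS time type: %r" % type(t))
    if nanos != 0:
        raise ValueError("GPS time has a nonzero decimal part")
    return str(secs)


whole = SimpleNamespace(gpsSeconds=1126259462, gpsNanoSeconds=0)
print(gps_to_str_sketch(1126259462))  # 1126259462
print(gps_to_str_sketch(whole))       # 1126259462
```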

pycbc.workflow.jobsetup.multi_ifo_coherent_job_setup(workflow, out_files, curr_exe_job, science_segs, datafind_outs, output_dir, parents=None, slide_dict=None, tags=None)[source]

Method for setting up coherent inspiral jobs.

pycbc.workflow.jobsetup.select_generic_executable(workflow, exe_tag)[source]

Returns a class that is appropriate for setting up jobs to run executables having specific tags in the workflow config. Executables should not be “specialized” jobs fitting into one of the select_XXX_class functions above, i.e. not a matched filter or template bank job, which require extra setup.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance.

  • exe_tag (string) – The name of the config section storing options for this executable and the option giving the executable path in the [executables] section.

Returns:

exe_class – A sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable. Instances of the class (‘jobs’) must have a method job.create_node()

Return type:

Sub-class of pycbc.workflow.core.Executable

pycbc.workflow.jobsetup.select_matchedfilter_class(curr_exe)[source]

This function returns a class that is appropriate for setting up matched-filtering jobs within workflow.

Parameters:

curr_exe (string) – The name of the matched filter executable to be used.

Returns:

exe_class – A sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable. Instances of the class (‘jobs’) must have the methods job.create_node() and job.get_valid_times(ifo, )

Return type:

Sub-class of pycbc.workflow.core.Executable

pycbc.workflow.jobsetup.select_tmpltbank_class(curr_exe)[source]

This function returns a class that is appropriate for setting up template bank jobs within workflow.

Parameters:

curr_exe (string) – The name of the executable to be used for generating template banks.

Returns:

exe_class – A sub-class of pycbc.workflow.core.Executable that holds utility functions appropriate for the given executable. Instances of the class (‘jobs’) must have the methods job.create_node() and job.get_valid_times(ifo, )

Return type:

Sub-class of pycbc.workflow.core.Executable

pycbc.workflow.jobsetup.sngl_ifo_job_setup(workflow, ifo, out_files, curr_exe_job, science_segs, datafind_outs, parents=None, allow_overlap=True)[source]

This function sets up a set of single ifo jobs. A basic overview of how this works is as follows:

  • (1) Identify the length of data that each job needs to read in, and what part of that data the job is valid for.

  • START LOOPING OVER SCIENCE SEGMENTS

  • (2) Identify how many jobs are needed (if any) to cover the given science segment and the time shift between jobs. If no jobs are needed, continue to the next science segment.

  • START LOOPING OVER JOBS

  • (3) Identify the time that the given job should produce valid output (ie. inspiral triggers) over.

  • (4) Identify the data range that the job will need to read in to produce the aforementioned valid output.

  • (5) Identify all parents/inputs of the job.

  • (6) Add the job to the workflow.

  • END LOOPING OVER JOBS

  • END LOOPING OVER SCIENCE SEGMENTS
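
The tiling part of this loop (counting jobs, spacing them, and deriving each job's data and valid times) can be sketched for a single science segment as follows. This is a simplified illustration under stated assumptions, not the function's actual code: tile_segment is a hypothetical name, segments are plain (start, end) tuples, jobs are spaced evenly so that the first and last jobs sit flush with the segment edges, and valid times are allowed to overlap.

```python
import math


def tile_segment(seg, data_length, valid_chunk):
    """Return a list of (data_seg, valid_seg) tuples covering seg."""
    seg_start, seg_end = seg
    seg_len = seg_end - seg_start
    valid_length = valid_chunk[1] - valid_chunk[0]
    if seg_len < data_length:
        return []  # segment too short for even one job
    # Data read in but not analysed by a job (padding at both ends).
    data_loss = data_length - valid_length
    num_jobs = math.ceil((seg_len - data_loss) / valid_length)
    # Shift between consecutive jobs so the last job ends at seg_end.
    shift = 0.0 if num_jobs == 1 else (seg_len - data_length) / (num_jobs - 1)
    jobs = []
    for n in range(num_jobs):
        data_start = seg_start + n * shift
        data_seg = (data_start, data_start + data_length)
        valid_seg = (data_start + valid_chunk[0], data_start + valid_chunk[1])
        jobs.append((data_seg, valid_seg))
    return jobs


jobs = tile_segment((0, 4096), 2048, (72, 1976))
for data_seg, valid_seg in jobs:
    print(data_seg, valid_seg)
```

With these numbers three jobs are produced and their valid times overlap slightly, which corresponds to the allow_overlap=True case; trimming the overlaps is left out of the sketch.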

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instance of the Workflow class that manages the constructed workflow.

  • ifo (string) – The name of the ifo to set up the jobs for

  • out_files (pycbc.workflow.core.FileList) – The FileList containing the list of jobs. Jobs will be appended to this list, and it does not need to be empty when supplied.

  • curr_exe_job (Job) – An instance of the Job class that has a get_valid_times method.

  • science_segs (igwn_segments.segmentlist) – The list of times that the jobs should cover

  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.

  • parents (pycbc.workflow.core.FileList (optional, kwarg, default=None)) – The FileList containing the list of jobs that are parents to the one being set up.

  • allow_overlap (boolean (optional, kwarg, default = True)) – If this is set, the times that jobs are valid for will be allowed to overlap. This may be desired for template banks, which may have some overlap in the times they cover. This may not be desired for inspiral jobs, where you probably want triggers recorded by jobs to not overlap at all.

Returns:

out_files – A list of the files that will be generated by this step in the workflow.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter module

This module is responsible for setting up the matched-filtering stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None)[source]

Set up matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.

  • science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.

  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.

  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.

  • output_dir (path) – The directory in which output will be stored.

  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file to be sent to these jobs on the command line. If not given, no file will be sent.

  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.

Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated_multi(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None)[source]

Set up matched-filter jobs that are generated as part of the workflow in which a single job reads in and generates triggers over multiple ifos. This module can support any matched-filter code that is similar in principle to pycbc_multi_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.

  • science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.

  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis, and (if requested by the user) the vetoes File and (if requested by the user) the search sky-grid File.

  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.

  • output_dir (path) – The directory in which output will be stored.

  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file to be sent to these jobs on the command line. If not given, no file will be sent.

  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.

Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter.setup_matchedfltr_workflow(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)[source]

This function aims to be the gateway for setting up a set of matched-filter jobs in a workflow. This function is intended to support multiple different ways/codes that could be used for doing this. For now the only supported sub-module is one that runs the matched-filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.

  • science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.

  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.

  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.

  • output_dir (path) – The directory in which output will be stored.

  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file to be sent to these jobs on the command line. If not given, no file will be sent.

  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.

Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.minifollowups module

class pycbc.workflow.minifollowups.PlotQScanExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: PlotExecutable

Class used to create workflow.Executable instances for the pycbc_plot_qscan executable. It inherits directly from PlotExecutable.

time_dependent_options = ['--channel-name', '--frame-type']
class pycbc.workflow.minifollowups.SingleTemplateExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: PlotExecutable

Class used to create workflow.Executable instances for the pycbc_single_template executable. It inherits directly from PlotExecutable.

time_dependent_options = ['--channel-name', '--frame-type']
class pycbc.workflow.minifollowups.SingleTimeFreqExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: PlotExecutable

Class used to create workflow.Executable instances for the pycbc_plot_singles_timefreq executable. It inherits directly from PlotExecutable.

time_dependent_options = ['--channel-name', '--frame-type']
pycbc.workflow.minifollowups.get_single_template_params(curr_idx, times, bank_data, bank_id, fsdt, tids)[source]

A function to get the parameters needed for the make_single_template_files function.

Parameters:
  • curr_idx (int) – The index of the event in the file

  • times (dictionary keyed on IFO of numpy arrays, dtype float) – The array of trigger times for each detector

  • bank_data (dictionary or h5py file) – Structure containing the bank information

  • bank_id (int) – The template index within the bank

  • fsdt (dictionary of h5py files, keyed on IFO) – The single-detector TRIGGER_MERGE files, keyed by IFO

  • tids (dictionary keyed on IFO of numpy arrays, dtype int) – The trigger indexes in fsdt for each IFO

Returns:

params – A dictionary containing the parameters needed for the event used

Return type:

dictionary

pycbc.workflow.minifollowups.grouper(iterable, n, fillvalue=None)[source]

Create a list of n-length tuples.
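
A sketch of this behaviour, following the well-known itertools "grouper" recipe. The name grouper_sketch is used to make clear that this is an illustration with the same signature, not necessarily the module's exact code.

```python
from itertools import zip_longest


def grouper_sketch(iterable, n, fillvalue=None):
    """Collect items into a list of n-length tuples, padding the last one."""
    # n references to the same iterator: each tuple pulls n consecutive items.
    args = [iter(iterable)] * n
    return list(zip_longest(*args, fillvalue=fillvalue))


groups = grouper_sketch("ABCDEFG", 3, fillvalue="x")
print(groups)  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```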

pycbc.workflow.minifollowups.make_coinc_info(workflow, singles, bank, coinc_file, out_dir, n_loudest=None, trig_id=None, file_substring=None, sort_order=None, sort_var=None, title=None, tags=None)[source]
pycbc.workflow.minifollowups.make_inj_info(workflow, injection_file, injection_index, num, out_dir, tags=None)[source]
pycbc.workflow.minifollowups.make_plot_waveform_plot(workflow, params, out_dir, ifos, exclude=None, require=None, tags=None)[source]

Add plot_waveform jobs to the workflow.

pycbc.workflow.minifollowups.make_qscan_plot(workflow, ifo, trig_time, out_dir, injection_file=None, data_segments=None, time_window=100, tags=None)[source]

Generate a make_qscan node and add it to workflow.

This function generates a single node of the singles_timefreq executable and adds it to the current workflow. Parent/child relationships are set by the input/output files automatically.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.

  • ifo (str) – Which interferometer are we using?

  • trig_time (int) – The time of the trigger being followed up.

  • out_dir (str) – Location of directory to output to

  • injection_file (pycbc.workflow.File (optional, default=None)) – If given, add the injections in the file to strain before making the plot.

  • data_segments (igwn_segments.segmentlist (optional, default=None)) – The list of segments for which data exists and can be read in. If given the start/end times given to singles_timefreq will be adjusted if [trig_time - time_window, trig_time + time_window] does not completely lie within a valid data segment. A ValueError will be raised if the trig_time is not within a valid segment, or if it is not possible to find 2*time_window (plus the padding) of continuous data around the trigger. This must be coalesced.

  • time_window (int (optional, default=100)) – The amount of data (not including padding) that will be read in by the singles_timefreq job. The default value of 100s should be fine for most cases.

  • tags (list (optional, default=None)) – List of tags to add to the created nodes, which determine file naming.
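
The data_segments adjustment described above can be sketched as follows. This is an illustration under stated assumptions: fit_window is a hypothetical helper, segments are plain (start, end) tuples in a coalesced list, and pad stands for whatever padding the plotting job needs on each side.

```python
def fit_window(trig_time, time_window, data_segments, pad=0):
    """Return (start, end) of 2*time_window of data around trig_time.

    The window is slid sideways if it would cross a segment boundary.
    Raises ValueError if the trigger is outside all segments or if the
    containing segment is too short.
    """
    for seg_start, seg_end in data_segments:
        if seg_start <= trig_time < seg_end:
            break
    else:
        raise ValueError("trig_time is not within a valid data segment")
    if seg_end - seg_start < 2 * (time_window + pad):
        raise ValueError("not enough continuous data around the trigger")
    start = trig_time - time_window - pad
    end = trig_time + time_window + pad
    if start < seg_start:
        # Too close to the segment start: shift the window later.
        end += seg_start - start
        start = seg_start
    elif end > seg_end:
        # Too close to the segment end: shift the window earlier.
        start -= end - seg_end
        end = seg_end
    # Report the analysed span, excluding the padding.
    return (start + pad, end - pad)


print(fit_window(50, 100, [(0, 1000)]))   # (0, 200)
print(fit_window(950, 100, [(0, 1000)]))  # (800, 1000)
```

The returned span always has length 2*time_window; only its position relative to the trigger changes near segment edges.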

pycbc.workflow.minifollowups.make_single_template_files(workflow, segs, ifo, data_read_name, analyzed_name, params, out_dir, inj_file=None, exclude=None, require=None, tags=None, store_file=False, use_mean_time=False, use_exact_inj_params=False)[source]

Function for creating jobs to run the pycbc_single_template code and add these jobs to the workflow.

Parameters:
  • workflow (workflow.Workflow instance) – The pycbc.workflow.Workflow instance to add these jobs to.

  • segs (workflow.File instance) – The pycbc.workflow.File instance that points to the XML file containing the segment lists of data read in and data analyzed.

  • ifo (str) – The name of the interferometer

  • data_read_name (str) – The name of the segmentlist containing the data read in by each inspiral job in the segs file.

  • analyzed_name (str) – The name of the segmentlist containing the data analyzed by each inspiral job in the segs file.

  • params (dictionary) – A dictionary containing the parameters of the template to be used. params[ifo+’end_time’] is required for all ifos in workflow.ifos. If use_exact_inj_params is False then values must also be supplied for [mass1, mass2, spin1z, spin2z]. For precessing templates one also needs to supply [spin1y, spin1x, spin2x, spin2y, inclination]. Additionally, for precession one must supply u_vals or u_vals_+ifo for all ifos. u_vals is the ratio between h_+ and h_x to use when constructing h(t): h(t) = (h_+ * u_vals) + h_x.

  • out_dir (str) – Directory in which to store the output files.

  • inj_file (workflow.File (optional, default=None)) – If given send this injection file to the job so that injections are made into the data.

  • exclude (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections that do not match strings in this list.

  • require (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections matching strings in this list.

  • tags (list (optional, default=None)) – The tags to use for this job.

  • store_file (boolean (optional, default=False)) – Keep the output files of this job.

  • use_mean_time (boolean (optional, default=False)) – Use the mean time as the center time for all ifos

  • use_exact_inj_params (boolean (optional, default=False)) – If True do not use masses and spins listed in the params dictionary but instead use the injection closest to the filter time as a template.

Returns:

output_files – The list of workflow.Files created in this function.

Return type:

workflow.FileList
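
The relation quoted for u_vals in the params description, h(t) = (h_+ * u_vals) + h_x, can be illustrated sample by sample. The helper combine_polarizations and the use of plain Python lists are assumptions made for this sketch; the real code operates on waveform objects rather than lists.

```python
def combine_polarizations(h_plus, h_cross, u_val):
    """Return h(t) = h_plus * u_val + h_cross for equal-length sample lists."""
    if len(h_plus) != len(h_cross):
        raise ValueError("h_plus and h_cross must have the same length")
    return [hp * u_val + hc for hp, hc in zip(h_plus, h_cross)]


h_t = combine_polarizations([1.0, 2.0], [0.5, -1.0], u_val=2.0)
print(h_t)  # [2.5, 3.0]
```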

pycbc.workflow.minifollowups.make_single_template_plots(workflow, segs, data_read_name, analyzed_name, params, out_dir, inj_file=None, exclude=None, data_segments=None, require=None, tags=None, params_str=None, use_exact_inj_params=False)[source]

Function for creating jobs to run the pycbc_single_template code and to run the associated plotting code pycbc_single_template_plots and add these jobs to the workflow.

Parameters:
  • workflow (workflow.Workflow instance) – The pycbc.workflow.Workflow instance to add these jobs to.

  • segs (workflow.File instance) – The pycbc.workflow.File instance that points to the XML file containing the segment lists of data read in and data analyzed.

  • data_read_name (str) – The name of the segmentlist containing the data read in by each inspiral job in the segs file.

  • analyzed_name (str) – The name of the segmentlist containing the data analyzed by each inspiral job in the segs file.

  • params (dictionary) – A dictionary containing the parameters of the template to be used. params[ifo+’end_time’] is required for all ifos in workflow.ifos. If use_exact_inj_params is False then values must also be supplied for [mass1, mass2, spin1z, spin2z]. For precessing templates one also needs to supply [spin1y, spin1x, spin2x, spin2y, inclination]. Additionally, for precession one must supply u_vals or u_vals_+ifo for all ifos. u_vals is the ratio between h_+ and h_x to use when constructing h(t): h(t) = (h_+ * u_vals) + h_x.

  • out_dir (str) – Directory in which to store the output files.

  • inj_file (workflow.File (optional, default=None)) – If given send this injection file to the job so that injections are made into the data.

  • exclude (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections that do not match strings in this list.

  • require (list (optional, default=None)) – If given, then when considering which subsections in the ini file to parse for options to add to single_template_plot, only use subsections matching strings in this list.

  • data_segments (dictionary of segment lists) – Dictionary of segment lists keyed on the IFO. Used to decide if an IFO is plotted if there is valid data. If not given, will plot if the IFO produced a trigger which contributed to the event

  • tags (list (optional, default=None)) – Add this list of tags to all jobs.

  • params_str (str (optional, default=None)) – If given add this string to plot title and caption to describe the template that was used.

  • use_exact_inj_params (boolean (optional, default=False)) – If True do not use masses and spins listed in the params dictionary but instead use the injection closest to the filter time as a template.

Returns:

  • hdf_files (workflow.FileList) – The list of workflow.Files created by single_template jobs in this function.

  • plot_files (workflow.FileList) – The list of workflow.Files created by single_template_plot jobs in this function.

pycbc.workflow.minifollowups.make_singles_timefreq(workflow, single, bank_file, trig_time, out_dir, veto_file=None, time_window=10, data_segments=None, tags=None)[source]

Generate a singles_timefreq node and add it to workflow.

This function generates a single node of the singles_timefreq executable and adds it to the current workflow. Parent/child relationships are set by the input/output files automatically.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow class that stores the jobs that will be run.

  • single (pycbc.workflow.core.File instance) – The File object storing the single-detector triggers to followup.

  • bank_file (pycbc.workflow.core.File instance) – The File object storing the template bank.

  • trig_time (int) – The time of the trigger being followed up.

  • out_dir (str) – Location of directory to output to

  • veto_file (pycbc.workflow.core.File (optional, default=None)) – If given use this file to veto triggers to determine the loudest event. FIXME: Veto files should be provided a definer argument and not just assume that all segments should be read.

  • time_window (int (optional, default=10)) – The amount of data (not including padding) that will be read in by the singles_timefreq job. The default value of 10s should be fine for most cases.

  • data_segments (igwn_segments.segmentlist (optional, default=None)) – The list of segments for which data exists and can be read in. If given the start/end times given to singles_timefreq will be adjusted if [trig_time - time_window, trig_time + time_window] does not completely lie within a valid data segment. A ValueError will be raised if the trig_time is not within a valid segment, or if it is not possible to find 2*time_window (plus the padding) of continuous data around the trigger. This must be coalesced.

  • tags (list (optional, default=None)) – List of tags to add to the created nodes, which determine file naming.

pycbc.workflow.minifollowups.make_skipped_html(workflow, skipped_data, out_dir, tags)[source]

Make an HTML snippet from the list of skipped background coincidences.

pycbc.workflow.minifollowups.make_sngl_ifo(workflow, sngl_file, bank_file, trigger_id, out_dir, ifo, statfiles=None, title=None, tags=None)[source]

Set up a job to create a single-detector (sngl ifo) HTML summary snippet.

pycbc.workflow.minifollowups.make_trigger_timeseries(workflow, singles, ifo_times, out_dir, special_tids=None, exclude=None, require=None, tags=None)[source]
pycbc.workflow.minifollowups.make_upload_files(workflow, psd_files, snr_timeseries, xml_all, event_id, approximant, out_dir, channel_name, tags=None)[source]

Make files including xml, skymap fits and plots for uploading to gracedb for a given event

Parameters:
  • psd_files (FileList([])) – PSD Files from MERGE_PSDs for the search as appropriate for the event

  • snr_timeseries (FileList([])) – SNR timeseries files, one from each IFO, to add to the XML and plot output from pycbc_single_template

  • xml_all (pycbc.workflow.core.File instance) – XML file containing all events from the search

  • event_id (string) – an integer to describe the event’s position in the xml_all file

  • approximant (byte string) – The approximant used for the template of the event, to be passed to bayestar for sky location

  • out_dir – The directory where all the output files should go

  • channel_name (string) – Channel name to be added to the XML file to be uploaded

  • tags ({None, optional}) – Tags to add to the minifollowups executables

Returns:

all_output_files – List of all output files from this process

Return type:

FileList

pycbc.workflow.minifollowups.setup_foreground_minifollowups(workflow, coinc_file, single_triggers, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]

Create plots that follow up the Nth loudest coincident injection from a statmap-produced HDF file.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • coinc_file

  • single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.

  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank

  • insp_segs (SegFile) – The segment file containing the data read and analyzed by each inspiral job.

  • insp_data_name (str) – The name of the segmentlist storing data read.

  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.

  • dax_output (directory) – Location of the dax outputs

  • out_dir (path) – The directory to store minifollowups result plots and files

  • tags ({None, optional}) – Tags to add to the minifollowups executables

Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list

pycbc.workflow.minifollowups.setup_injection_minifollowups(workflow, injection_file, inj_xml_file, single_triggers, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]

Create plots that follow up the closest missed injections.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • coinc_file

  • single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.

  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank

  • insp_segs (SegFile) – The segment file containing the data read by each inspiral job.

  • insp_data_name (str) – The name of the segmentlist storing data read.

  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.

  • out_dir (path) – The directory to store minifollowups result plots and files

  • tags ({None, optional}) – Tags to add to the minifollowups executables

Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list

pycbc.workflow.minifollowups.setup_single_det_minifollowups(workflow, single_trig_file, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, veto_file=None, veto_segment_name=None, fg_file=None, fg_name=None, statfiles=None, tags=None)[source]

Create plots that follow up the Nth loudest clustered single detector triggers from a merged single detector trigger HDF file.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • single_trig_file (pycbc.workflow.File) – The File class holding the single detector triggers.

  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank

  • insp_segs (SegFile) – The segment file containing the data read by each inspiral job.

  • insp_data_name (str) – The name of the segmentlist storing data read.

  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.

  • out_dir (path) – The directory to store minifollowups result plots and files

  • statfiles (FileList (optional, default=None)) – Supplementary files necessary for computing the single-detector statistic.

  • tags ({None, optional}) – Tags to add to the minifollowups executables

Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list

pycbc.workflow.minifollowups.setup_upload_prep_minifollowups(workflow, coinc_file, xml_all_file, single_triggers, psd_files, tmpltbank_file, insp_segs, insp_data_name, insp_anal_name, dax_output, out_dir, tags=None)[source]

Create plots that follow up the Nth loudest coincident injection from a statmap-produced HDF file.

Parameters:
  • workflow (pycbc.workflow.Workflow) – The core workflow instance we are populating

  • coinc_file

  • single_triggers (list of pycbc.workflow.File) – A list containing the file objects associated with the merged single detector trigger files for each ifo.

  • psd_files (list of pycbc.workflow.File) – A list containing the file objects associated with the merged psd files for each ifo.

  • xml_all_file (workflow file object) – XML File containing all foreground events

  • tmpltbank_file (pycbc.workflow.File) – The file object pointing to the HDF format template bank

  • insp_segs (SegFile) – The segment file containing the data read and analyzed by each inspiral job.

  • insp_data_name (str) – The name of the segmentlist storing data read.

  • insp_anal_name (str) – The name of the segmentlist storing data analyzed.

  • dax_output (directory) – Location of the dax outputs

  • out_dir (path) – The directory to store minifollowups result plots and files

  • tags ({None, optional}) – Tags to add to the minifollowups executables

Returns:

layout – A list of tuples which specify the displayed file layout for the minifollowups plots.

Return type:

list

pycbc.workflow.pegasus_sites module

This module provides default site catalogs, which should be suitable for most use cases. You can override individual details here. It should also be possible to implement a new site, although it is unclear how that would work in practice.

pycbc.workflow.pegasus_sites.add_condorpool_copy_site(sitecat, cp)[source]

Add condorpool_copy site to site catalog

pycbc.workflow.pegasus_sites.add_condorpool_shared_site(sitecat, cp, local_path, local_url)[source]

Add condorpool_shared site to site catalog

Add condorpool_symlink site to site catalog

pycbc.workflow.pegasus_sites.add_ini_site_profile(site, cp, sec)[source]

Add options from sec in configparser to site

pycbc.workflow.pegasus_sites.add_local_site(sitecat, cp, local_path, local_url)[source]

Add the local site to site catalog

pycbc.workflow.pegasus_sites.add_osg_site(sitecat, cp)[source]

Add osg site to site catalog

pycbc.workflow.pegasus_sites.add_site(sitecat, sitename, cp, out_dir=None)[source]

Add site sitename to site catalog

pycbc.workflow.pegasus_sites.add_site_pegasus_profile(site, cp)[source]

Add options from [pegasus_profile] in configparser to site

pycbc.workflow.pegasus_sites.make_catalog(cp, out_dir)[source]

Make combined catalog of built-in known sites

pycbc.workflow.pegasus_workflow module

This module provides thin wrappers around Pegasus.DAX3 functionality that provide additional abstraction and argument handling.

class pycbc.workflow.pegasus_workflow.Executable(name, os='linux', arch='x86_64', installed=False, container=None)[source]

Bases: ProfileShortcuts

The workflow representation of an Executable

add_profile(namespace, key, value)[source]

Add profile information to this executable

create_transformation(site, url)[source]
id = 0
class pycbc.workflow.pegasus_workflow.File(name)[source]

Bases: File

The workflow representation of a physical file

An object that represents a file from the perspective of setting up a workflow. The file may or may not exist at the time of workflow generation. If it does, this is represented by containing a physical file name (PFN). A storage path is also available to indicate the desired final destination of this file.

add_pfn(url, site)[source]

Associate a PFN with this file. Takes a URL and associated site.

property dax_repr

Return the dax representation of a File.

classmethod from_path(path)[source]

Takes a path and returns a File object with the path as the PFN.

has_pfn(url, site='local')[source]

Check whether the (url, site) pair is already associated with this File. If site is not provided, it is assumed to be ‘local’.

insert_into_dax(rep_cat, sites)[source]
output_map_str()[source]
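The PFN bookkeeping described for add_pfn, has_pfn and from_path above can be illustrated with a small stand-in class. This is a sketch only and does not reproduce the PyCBC File class: the name SketchFile, the choice to use the base name of the path as the file name, the file:// URL form, and the de-duplication inside add_pfn are all assumptions made for the example.

```python
import os


class SketchFile:
    """Toy stand-in for a workflow file; not the PyCBC class."""

    def __init__(self, name):
        self.name = name
        self.pfns = []  # list of (url, site) pairs

    def has_pfn(self, url, site="local"):
        # True if this exact (url, site) pair has been recorded.
        return (url, site) in self.pfns

    def add_pfn(self, url, site):
        # Assumption: adding the same pair twice is a no-op.
        if not self.has_pfn(url, site):
            self.pfns.append((url, site))

    @classmethod
    def from_path(cls, path):
        # Use the absolute path as a local PFN for the new file.
        path = os.path.abspath(path)
        new_file = cls(os.path.basename(path))
        new_file.add_pfn("file://" + path, "local")
        return new_file


bank = SketchFile.from_path("/data/H1-BANK.hdf")
url, site = bank.pfns[0]
print(bank.name, site, bank.has_pfn(url))
```

Keying the check on the (url, site) pair, rather than on the URL alone, allows the same URL to be registered for more than one site.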
class pycbc.workflow.pegasus_workflow.Node(transformation)[source]

Bases: ProfileShortcuts

add_arg(arg)[source]

Add an argument

add_input(inp)[source]

Declares an input file without adding it as a command-line option.

add_input_arg(inp)[source]

Add an input as an argument

add_input_list_opt(opt, inputs, **kwargs)[source]

Add an option that determines a list of inputs

add_input_opt(opt, inp, **kwargs)[source]

Add an option that determines an input

add_list_opt(opt, values, **kwargs)[source]

Add an option with a list of non-file parameters.

add_opt(opt, value=None, check_existing_options=True, **kwargs)[source]

Add an option

add_output(inp)[source]

Declares an output file without adding it as a command-line option.

add_output_arg(out)[source]

Add an output as an argument

add_output_list_opt(opt, outputs, **kwargs)[source]

Add an option that determines a list of outputs

add_output_opt(opt, out, **kwargs)[source]

Add an option that determines an output

add_profile(namespace, key, value)[source]

Add profile information to this node at the DAX level

add_raw_arg(arg)[source]

Add an argument to the command line of this job, but do NOT add white space between arguments. White space can be added manually by adding ‘ ‘ if needed.

new_output_file_opt(opt, name)[source]

Add an option and return a new file handle
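The difference between add_arg and add_raw_arg described above can be shown with a toy command-line builder. This is a sketch under assumptions, not the PyCBC Node class: ArgNode, command_line and the option names in the usage lines are hypothetical, and the real class may assemble its arguments differently.

```python
class ArgNode:
    """Toy command-line builder; not the PyCBC Node class."""

    def __init__(self, executable):
        self.executable = executable
        self._parts = []

    def add_arg(self, arg):
        # Regular arguments are separated from what came before by a space.
        self._parts.append(" " + str(arg))

    def add_raw_arg(self, arg):
        # Raw arguments are appended verbatim, with no separator.
        self._parts.append(str(arg))

    def add_opt(self, opt, value=None):
        # An option is the option name, optionally followed by a value.
        self.add_arg(opt)
        if value is not None:
            self.add_arg(value)

    def command_line(self):
        return self.executable + "".join(self._parts)


node = ArgNode("my_exe")
node.add_opt("--low-frequency-cutoff", 20)
node.add_arg("--tags")
node.add_raw_arg(" ")   # white space has to be added by hand
node.add_raw_arg("A")
node.add_raw_arg(":")
node.add_raw_arg("B")
print(node.command_line())  # my_exe --low-frequency-cutoff 20 --tags A:B
```

Raw arguments are useful when several pieces must be joined into a single token, as with the A:B value here.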

class pycbc.workflow.pegasus_workflow.ProfileShortcuts[source]

Bases: object

Container of common methods for setting pegasus profile information on Executables and nodes. This class expects to be inherited from and for an add_profile method to be implemented.

set_category(category)[source]
set_execution_site(site)[source]
set_memory(size)[source]

Set the amount of memory that is required in megabytes

set_num_cpus(number)[source]
set_num_retries(number)[source]
set_priority(priority)[source]
set_storage(size)[source]

Set the amount of storage required in megabytes

set_universe(universe)[source]
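The inheritance contract described for ProfileShortcuts, where the shortcut methods rely on an add_profile method supplied by the subclass, follows the usual mixin pattern. The sketch below illustrates that pattern only. The class names are hypothetical, and the namespace and key strings (the HTCondor submit commands request_memory and request_cpus) are assumptions about what such shortcuts might set, not a statement of what PyCBC does.

```python
class SketchProfileShortcuts:
    """Mixin: subclasses must implement add_profile(namespace, key, value)."""

    def set_memory(self, size):
        # size is given in megabytes.
        self.add_profile("condor", "request_memory", "%dM" % size)

    def set_num_cpus(self, number):
        self.add_profile("condor", "request_cpus", number)


class SketchExecutable(SketchProfileShortcuts):
    """Toy executable that records its profiles in a dictionary."""

    def __init__(self, name):
        self.name = name
        self.profiles = {}

    def add_profile(self, namespace, key, value):
        self.profiles[(namespace, key)] = value


exe = SketchExecutable("inspiral")
exe.set_memory(4000)
exe.set_num_cpus(2)
print(exe.profiles)
```

Because the mixin never defines add_profile itself, calling a shortcut on a class that forgot to implement it fails with an AttributeError.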
class pycbc.workflow.pegasus_workflow.SubWorkflow(*args, **kwargs)[source]

Bases: SubWorkflow

Workflow job representation of a SubWorkflow.

This follows the Pegasus nomenclature where there are Workflows, Jobs and SubWorkflows. Be careful though! A SubWorkflow is actually a Job, not a Workflow. If creating a sub-workflow you would create a Workflow as normal and write out the necessary dax files. Then you would create a SubWorkflow object, which acts as the Job in the top-level workflow. Most of the special linkages that are needed for sub-workflows are then handled at that stage. We do add a little bit of functionality here.

add_into_workflow(container_wflow)[source]

Add this Job into a container Workflow

add_planner_arg(value, option)[source]
set_subworkflow_properties(output_map_file, staging_site, cache_file)[source]
class pycbc.workflow.pegasus_workflow.Transformation(name: str, namespace: str | None = None, version: str | None = None, site: str | None = None, pfn: str | Path | None = None, is_stageable: bool = False, bypass_staging: bool = False, arch: Arch | None = None, os_type: OS | None = None, os_release: str | None = None, os_version: str | None = None, container: Container | str | None = None, checksum: Dict[str, str] | None = None)[source]

Bases: Transformation

is_same_as(other)[source]
class pycbc.workflow.pegasus_workflow.Workflow(name='my_workflow', directory=None, cache_file=None, dax_file_name=None)[source]

Bases: object

add_container(container)[source]

Add a container to this workflow

Adds the input container to this workflow.

Parameters:

container (Pegasus.api.Container) – The container to be added.

add_explicit_dependancy(parent, child)[source]

Add an explicit dependency between two Nodes in this workflow.

Most dependencies (in PyCBC and Pegasus thinking) are added by declaring file linkages. However, there are some cases where you might want to override that and add an explicit dependency.

Parameters:
  • parent (Node instance) – The parent Node.

  • child (Node instance) – The child Node

add_node(node)[source]

Add a node to this workflow

This function adds nodes to the workflow. It also determines parent/child relations from the inputs to this job.

Parameters:

node (pycbc.workflow.pegasus_workflow.Node) – A node that should be executed as part of this workflow.
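The two ways of creating dependencies described above, implicit ones derived from file linkages in add_node and explicit ones added by hand, can be sketched with a toy model. This is not the PyCBC or Pegasus implementation: DepNode, DepWorkflow and the file names are hypothetical, files are plain strings, and the sketch assumes that a node producing a file is added before any node that consumes it.

```python
class DepNode:
    """Toy job with named input and output files."""

    def __init__(self, name, inputs=(), outputs=()):
        self.name = name
        self.inputs = list(inputs)
        self.outputs = list(outputs)


class DepWorkflow:
    """Toy workflow that tracks parent/child edges between nodes."""

    def __init__(self):
        self.nodes = []
        self.producer = {}   # file name -> node that writes it
        self.edges = set()   # (parent name, child name)

    def add_node(self, node):
        # Any input written by an earlier node makes that node a parent.
        for fname in node.inputs:
            parent = self.producer.get(fname)
            if parent is not None:
                self.edges.add((parent.name, node.name))
        for fname in node.outputs:
            self.producer[fname] = node
        self.nodes.append(node)

    def add_explicit_dependency(self, parent, child):
        # No shared file is needed; the edge is simply recorded.
        self.edges.add((parent.name, child.name))


wf = DepWorkflow()
inspiral = DepNode("inspiral", inputs=["bank.hdf"], outputs=["triggers.hdf"])
coinc = DepNode("coinc", inputs=["triggers.hdf"], outputs=["coinc.hdf"])
cleanup = DepNode("cleanup")
for node in (inspiral, coinc, cleanup):
    wf.add_node(node)
wf.add_explicit_dependency(coinc, cleanup)
print(sorted(wf.edges))  # [('coinc', 'cleanup'), ('inspiral', 'coinc')]
```

Here bank.hdf has no producer inside the workflow, so it creates no edge; it plays the role of a pre-existing input file.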

add_subworkflow_dependancy(parent_workflow, child_workflow)[source]

Add a dependency between two sub-workflows in this workflow

This is done if those sub-workflows are themselves declared as Workflows which are sub-workflows and not explicit SubWorkflows. (These Workflows contain SubWorkflows inside them. Yes, the relationship between PyCBC and Pegasus becomes confusing here.) If you are working with explicit SubWorkflows, these can be added normally using File relations.

Parameters:
  • parent_workflow (Workflow instance) – The sub-workflow to use as the parent dependence. Must be a sub-workflow of this workflow.

  • child_workflow (Workflow instance) – The sub-workflow to add as the child dependence. Must be a sub-workflow of this workflow.

add_transformation(tranformation)[source]

Add a transformation to this workflow

Adds the input transformation to this workflow.

Parameters:

transformation (Pegasus.api.Transformation) – The transformation to be added.

add_workflow(workflow)[source]

Add a sub-workflow to this workflow

This function adds a sub-workflow of Workflow class to this workflow. Parent-child relationships are determined by data dependencies.

Parameters:

workflow (Workflow instance) – The sub-workflow to add to this one

plan_and_submit(submit_now=True)[source]

Plan and submit the workflow now.

save(filename=None, submit_now=False, plan_now=False, output_map_path=None, root=True)[source]

Write this workflow to a DAX file

traverse_workflow_io()[source]

If input is needed from another workflow within a larger hierarchical workflow, determine the path for the file to reach the destination and add the file to workflows input / output as needed.

pycbc.workflow.plotting module

This module is responsible for setting up plotting jobs. https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

class pycbc.workflow.plotting.PlotExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

plot executable

create_node(**kwargs)[source]

Default node constructor.

This is usually overridden by subclasses of Executable.

current_retention_level = 4
pycbc.workflow.plotting.excludestr(tags, substr)[source]
pycbc.workflow.plotting.make_bank_compression_plots(workflow, bank_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_binned_hist(workflow, trig_file, veto_file, veto_name, out_dir, bank_file, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_coinc_snrchi_plot(workflow, inj_file, inj_trig, stat_file, trig_file, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_dq_flag_trigger_rate_plot(workflow, dq_file, dq_label, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_dq_segment_table(workflow, dq_file, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_foreground_table(workflow, trig_file, bank_file, out_dir, singles=None, extension='.html', tags=None, hierarchical_level=None)[source]
pycbc.workflow.plotting.make_foundmissed_plot(workflow, inj_file, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_gating_plot(workflow, insp_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_ifar_plot(workflow, trigger_file, out_dir, tags=None, hierarchical_level=None, executable='page_ifar')[source]

Creates a node in the workflow for plotting cumulative histogram of IFAR values.

pycbc.workflow.plotting.make_inj_table(workflow, inj_file, out_dir, missed=False, singles=None, tags=None)[source]
pycbc.workflow.plotting.make_range_plot(workflow, psd_files, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_results_web_page(workflow, results_dir, template='orange', explicit_dependencies=None)[source]
pycbc.workflow.plotting.make_seg_plot(workflow, seg_files, out_dir, seg_names=None, tags=None)[source]

Creates a node in the workflow for plotting science and veto segments.

pycbc.workflow.plotting.make_seg_table(workflow, seg_files, seg_names, out_dir, tags=None, title_text=None, description=None)[source]

Creates a node in the workflow for writing the segment summary table. Returns a File instance for the output file.

pycbc.workflow.plotting.make_segments_plot(workflow, seg_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_sensitivity_plot(workflow, inj_file, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_single_hist(workflow, trig_file, veto_file, veto_name, out_dir, bank_file=None, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_singles_plot(workflow, trig_files, bank_file, veto_file, veto_name, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_snrchi_plot(workflow, trig_files, veto_file, veto_name, out_dir, exclude=None, require=None, tags=None)[source]
pycbc.workflow.plotting.make_snrifar_plot(workflow, bg_file, out_dir, closed_box=False, cumulative=True, tags=None, hierarchical_level=None)[source]
pycbc.workflow.plotting.make_snrratehist_plot(workflow, bg_file, out_dir, closed_box=False, tags=None, hierarchical_level=None)[source]
pycbc.workflow.plotting.make_spectrum_plot(workflow, psd_files, out_dir, tags=None, hdf_group=None, precalc_psd_files=None)[source]
pycbc.workflow.plotting.make_template_bin_table(workflow, dq_file, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_template_plot(workflow, bank_file, out_dir, bins=None, tags=None)[source]
pycbc.workflow.plotting.make_throughput_plot(workflow, insp_files, out_dir, tags=None)[source]
pycbc.workflow.plotting.make_veto_table(workflow, out_dir, vetodef_file=None, tags=None)[source]

Creates a node in the workflow for writing the veto_definer table. Returns a File instance for the output file.

pycbc.workflow.plotting.requirestr(tags, substr)[source]

pycbc.workflow.psd module

This module is responsible for setting up PSD-related jobs in workflows.

pycbc.workflow.psd.make_average_psd(workflow, psd_files, out_dir, tags=None, output_fmt='.txt')[source]
pycbc.workflow.psd.make_psd_file(workflow, frame_files, segment_file, segment_name, out_dir, tags=None)[source]
pycbc.workflow.psd.merge_psds(workflow, files, ifo, out_dir, tags=None)[source]
pycbc.workflow.psd.setup_psd_calculate(workflow, frame_files, ifo, segments, segment_name, out_dir, tags=None)[source]

pycbc.workflow.psdfiles module

This module is responsible for setting up the psd files used by CBC workflows.

pycbc.workflow.psdfiles.setup_psd_pregenerated(workflow, tags=None)[source]

Set up the CBC workflow to use pregenerated PSD files. The file given in cp.get('workflow', 'pregenerated-psd-file-(ifo)') will be used as the --psd-file argument to geom_nonspinbank, geom_aligned_bank and pycbc_plot_psd_file.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

Returns:

psd_files – The FileList holding the PSD files

Return type:

pycbc.workflow.core.FileList
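The per-ifo option lookup described above can be sketched with the standard-library configparser. This is an illustration under assumptions: the helper name pregenerated_psd_paths is hypothetical, the stdlib ConfigParser stands in for the workflow's own config parser, the file paths are placeholders, and replacing (ifo) with the lower-case ifo name is an assumed convention.

```python
import configparser


def pregenerated_psd_paths(cp, ifos):
    """Return {ifo: path} for every ifo with a pregenerated PSD option.

    Hypothetical helper for illustration; not part of PyCBC.
    """
    paths = {}
    for ifo in ifos:
        # Assumption: (ifo) in the option name is the lower-case ifo name.
        option = "pregenerated-psd-file-%s" % ifo.lower()
        if cp.has_option("workflow", option):
            paths[ifo] = cp.get("workflow", option)
    return paths


cp = configparser.ConfigParser()
cp.read_string("""
[workflow]
pregenerated-psd-file-h1 = /path/to/H1-PSD.txt
pregenerated-psd-file-l1 = /path/to/L1-PSD.txt
""")

print(pregenerated_psd_paths(cp, ["H1", "L1", "V1"]))
```

An ifo without a matching option is simply left out of the result, so each ifo contributes either zero or one file.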

pycbc.workflow.psdfiles.setup_psd_workflow(workflow, science_segs, datafind_outs, output_dir=None, tags=None)[source]

Set up the static PSD section of the CBC workflow. At present this only supports pregenerated PSD files; in the future these could be created within the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • science_segs (Keyed dictionary of igwn_segments.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.

  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.

  • output_dir (path string) – The directory where data products will be placed.

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

Returns:

psd_files – The FileList holding the psd files, 0 or 1 per ifo

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.segment module

This module is responsible for setting up the segment generation stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/segments.html

pycbc.workflow.segment.generate_triggered_segment(workflow, out_dir, sciencesegs)[source]
pycbc.workflow.segment.get_flag_segments_file(workflow, name, option_name, out_dir, tags=None)[source]

Get segments from option name syntax for each ifo for individual flags.

Use the syntax of the configparser string to define the resulting segment_file, e.g. option_name = +up_flag1,+up_flag2,+up_flag3,-down_flag1,-down_flag2. Each ifo may have a different string and is stored separately in the file. Each flag is also stored separately in the file. Flags which add time must precede flags which subtract time.

Parameters:
  • workflow (pycbc.workflow.Workflow)

  • name (string) – Name of the segment list being created

  • option_name (str) – Name of option in the associated config parser to get the flag list

  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.

Returns:

seg_file – SegFile instance that points to the segment xml file on disk.

Return type:

pycbc.workflow.SegFile

pycbc.workflow.segment.get_segments_file(workflow, name, option_name, out_dir, tags=None)[source]

Get cumulative segments from option name syntax for each ifo.

Use the syntax of the configparser string to define the resulting segment_file, e.g. option_name = +up_flag1,+up_flag2,+up_flag3,-down_flag1,-down_flag2. Each ifo may have a different string and is stored separately in the file. Flags which add time must precede flags which subtract time.

Parameters:
  • workflow (pycbc.workflow.Workflow)

  • name (string) – Name of the segment list being created

  • option_name (str) – Name of option in the associated config parser to get the flag list

  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.

Returns:

seg_file – SegFile instance that points to the segment xml file on disk.

Return type:

pycbc.workflow.SegFile
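The option syntax used by get_flag_segments_file and get_segments_file above, with + for flags that add time and - for flags that subtract time, can be parsed as in the following sketch. The function name parse_flag_option is hypothetical, and raising a ValueError when a + flag follows a - flag is an assumption about how the ordering rule might be enforced, not documented PyCBC behaviour.

```python
def parse_flag_option(value):
    """Split '+a,+b,-c' into (flags adding time, flags subtracting time).

    Hypothetical helper for illustration; not part of PyCBC.
    """
    add, subtract = [], []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        sign, flag = item[0], item[1:]
        if sign == "+":
            # Flags which add time must precede flags which subtract time.
            if subtract:
                raise ValueError("'+' flag after a '-' flag: %r" % item)
            add.append(flag)
        elif sign == "-":
            subtract.append(flag)
        else:
            raise ValueError("Flag must start with '+' or '-': %r" % item)
    return add, subtract


up, down = parse_flag_option(
    "+up_flag1,+up_flag2,+up_flag3,-down_flag1,-down_flag2")
print(up)    # ['up_flag1', 'up_flag2', 'up_flag3']
print(down)  # ['down_flag1', 'down_flag2']
```

For the cumulative case, the segments of the first list would be combined and the segments of the second list then removed from the result.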

pycbc.workflow.segment.get_triggered_coherent_segment(workflow, sciencesegs)[source]

Construct the coherent network on and off source segments. Can switch to construction of segments for a single IFO search when coherent segments are insufficient for a search.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow instance that the calculated segments belong to.

  • sciencesegs (dict) – Dictionary of all science segments within analysis time.

Returns:

  • onsource (igwn_segments.segmentlistdict) – A dictionary containing the on source segments for network IFOs

  • offsource (igwn_segments.segmentlistdict) – A dictionary containing the off source segments for network IFOs

pycbc.workflow.segment.save_veto_definer(cp, out_dir, tags=None)[source]

Retrieve the veto definer file and save it locally

Parameters:
  • cp (ConfigParser instance)

  • out_dir (path)

  • tags (list of strings) – Used to retrieve subsections of the ini file for configuration options.

pycbc.workflow.splittable module

This module is responsible for setting up the splitting output files stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

pycbc.workflow.splittable.select_splitfilejob_instance(curr_exe)[source]

This function returns an instance of the class that is appropriate for splitting an output file up within the workflow (e.g. splitbank).

Parameters:
  • curr_exe (string) – The name of the Executable that is being used.

  • curr_section (string) – The name of the section storing options for this executable

Returns:

exe class – The class that holds the utility functions appropriate for the given Executable. This class must contain exe_class.create_job(), and the job returned by that call must contain job.create_node().

Return type:

sub-class of pycbc.workflow.core.Executable

pycbc.workflow.splittable.setup_splittable_dax_generated(workflow, input_tables, out_dir, tags)[source]

Function for setting up the splitting jobs as part of the workflow.

Parameters:
Returns:

split_table_outs – The list of split up files as output from this job.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.splittable.setup_splittable_workflow(workflow, input_tables, out_dir=None, tags=None)[source]

This function aims to be the gateway for code that is responsible for taking some input file containing some table, and splitting into multiple files containing different parts of that table. For now the only supported operation is using lalapps_splitbank to split a template bank xml file into multiple template bank xml files.

Parameters:
Returns:

split_table_outs – The list of split up files as output from this job.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.tmpltbank module

This module is responsible for setting up the template bank stage of CBC workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/ahope/template_bank.html

pycbc.workflow.tmpltbank.make_combine_split_banks(workflow, bank_files, out_dir, tags=None)[source]
pycbc.workflow.tmpltbank.make_compress_split_banks(workflow, bank_files, out_dir, tags=None)[source]
pycbc.workflow.tmpltbank.setup_tmpltbank_dax_generated(workflow, science_segs, datafind_outs, output_dir, tags=None, psd_files=None)[source]

Setup template bank jobs that are generated as part of the CBC workflow. This function will add numerous jobs to the CBC workflow using configuration options from the .ini file. The following executables are currently supported:

  • lalapps_tmpltbank

  • pycbc_geom_nonspin_bank

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • science_segs (Keyed dictionary of igwn_segments.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.

  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.

  • output_dir (path string) – The directory where data products will be placed.

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

  • psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.

Returns:

tmplt_banks – The FileList holding the details of all the template bank jobs.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.tmpltbank.setup_tmpltbank_pregenerated(workflow, tags=None)[source]

Set up the CBC workflow to use a pregenerated template bank. The bank given in cp.get('workflow', 'pregenerated-template-bank') will be used as the input file for all matched-filtering jobs. If this option is present, the workflow will assume that it should be used and will not generate template banks within the workflow.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

Returns:

tmplt_banks – The FileList holding the details of the template bank.

Return type:

pycbc.workflow.core.FileList
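The rule stated above, that the mere presence of the pregenerated-template-bank option switches off bank generation, amounts to a simple presence check on the configuration. The sketch below uses the standard-library configparser as a stand-in for the workflow's config parser; the function name choose_bank_mode, its return values and the bank path are hypothetical.

```python
import configparser


def choose_bank_mode(cp):
    """Return ('pregenerated', path) if the option is set, else ('generate', None).

    Hypothetical helper for illustration; not part of PyCBC.
    """
    option = "pregenerated-template-bank"
    if cp.has_option("workflow", option):
        return "pregenerated", cp.get("workflow", option)
    return "generate", None


with_bank = configparser.ConfigParser()
with_bank.read_string("""
[workflow]
pregenerated-template-bank = /path/to/bank.hdf
""")
without_bank = configparser.ConfigParser()

print(choose_bank_mode(with_bank))     # ('pregenerated', '/path/to/bank.hdf')
print(choose_bank_mode(without_bank))  # ('generate', None)
```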

pycbc.workflow.tmpltbank.setup_tmpltbank_without_frames(workflow, output_dir, tags=None, independent_ifos=False, psd_files=None)[source]

Setup CBC workflow to use a template bank (or banks) that are generated in the workflow, but do not use the data to estimate a PSD, and therefore do not vary over the duration of the workflow. This can either generate one bank that is valid for all ifos at all times, or multiple banks that are valid only for a single ifo at all times (one bank per ifo).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • output_dir (path string) – The directory where the template bank outputs will be placed.

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

  • independent_ifos (Boolean, optional (default=False)) – If given this will produce one template bank per ifo. If not given there will be one template bank to cover all ifos.

  • psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.

Returns:

tmplt_banks – The FileList holding the details of the template bank(s).

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.tmpltbank.setup_tmpltbank_workflow(workflow, science_segs, datafind_outs, output_dir=None, psd_files=None, tags=None, return_format=None)[source]

Setup template bank section of CBC workflow. This function is responsible for deciding which of the various template bank workflow generation utilities should be used.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – An instanced class that manages the constructed workflow.

  • science_segs (Keyed dictionary of igwn_segments.segmentlist objects) – scienceSegs[ifo] holds the science segments to be analysed for each ifo.

  • datafind_outs (pycbc.workflow.core.FileList) – The file list containing the datafind files.

  • output_dir (path string) – The directory where data products will be placed.

  • psd_files (pycbc.workflow.core.FileList) – The file list containing predefined PSDs, if provided.

  • tags (list of strings) – If given these tags are used to uniquely name and identify output files that would be produced in multiple calls to this function.

Returns:

tmplt_banks – The FileList holding the details of all the template bank jobs.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.versioning module

Module to generate/manage the executable used for version information in workflows

class pycbc.workflow.versioning.VersioningExecutable(cp, name, ifos=None, out_dir=None, tags=None, reuse_executable=True, set_submit_subdir=True)[source]

Bases: Executable

Executable for getting version information

current_retention_level = 4
pycbc.workflow.versioning.make_versioning_page(workflow, config_parser, out_dir, tags=None)[source]

Make executable for versioning information

Module contents

This package provides the utilities to construct an inspiral workflow for performing a coincident CBC matched-filter analysis on gravitational-wave interferometer data.