The workflow matched-filter module

Introduction

The matched-filter section of pycbc’s workflow module is responsible for matched-filtering the data against the template bank(s) produced by the template bank section and generating a list of “triggers” for each interferometer. These triggers are the events whose signal-to-noise ratio and signal-consistency test values indicate that they should be sent forward to be checked for coincidence in the other ifos.

Any single-ifo signal consistency tests (i.e. chi-squared tests etc.) should be computed in this section and stored within the lists of triggers. The workflow does not impose any requirements on the output format, but code in the later stages of the workflow must obviously know how to process that input.

The matched-filtering section should be as independent of the other stages of the workflow as possible. This means we do not require the data read in by matched-filter jobs to match that read in by template bank jobs (although this may be desirable in some cases, so it should be possible where sensible). Options should also not be hardcoded (there should be no case where an option sent to a template bank job is also sent to a matched-filter job with no way to prevent it). However, it is possible to duplicate options where this is desirable (see Pycbc’s workflow module configuration file(s) and command line interface).

The return from the matched-filter section of the workflow module is a list of File objects corresponding to each actual file (one for each job) that will be generated within the workflow. This is the only thing passed from the matched-filter section to the later sections.

Usage

Using this module requires a number of things:

  • A configuration file (or files) containing the information needed to tell this module how to generate GW triggers.
  • An initialized instance of the workflow class, containing the ConfigParser.
  • A list of segments to be analysed by this module.
  • A FileList returned by the templatebank module containing the template banks available for use.
  • A FileList returned by the datafind module containing the frames that contain the data to be analysed.
  • If desired, an injection file for assessing sensitivity to simulated signals.

The module is then called according to

pycbc.workflow.setup_matchedfltr_workflow(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)[source]

This function aims to be the gateway for setting up a set of matched-filter jobs in a workflow. It is intended to support multiple different ways/codes that could be used to do this. For now the only supported sub-module is one that runs the matched filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The workflow instance that the matched-filter jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the File pointing to the simulation (injection) file to be sent to these jobs on the command line. If not given, no file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

Configuration file setup

Here we describe the configuration-file options needed by this section of the workflow.

[workflow-matchedfilter] section

The configuration file must have a [workflow-matchedfilter] section, which is used to tell the workflow how to construct (or gather) the matched-filter trigger files. The first option to choose and provide is

  • matchedfilter-method = VALUE

The choices here and their descriptions are given below

  • WORKFLOW_INDEPENDENT_IFOS - Matched-filter trigger files will be generated within the workflow. These jobs will each cover only a short stretch (normally ~2000 s) of data, to reflect PSD changes over time, and will be independent and distinct for each analysed interferometer. This uses the setup_matchedfltr_dax_generated sub-module.
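As a minimal sketch, selecting this method in the configuration file might look like the following (section and option names as described on this page):

```ini
[workflow-matchedfilter]
matchedfilter-method = WORKFLOW_INDEPENDENT_IFOS
```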

Currently this is the only option, but others can be added. The sub-functions used are described here

pycbc.workflow.setup_matchedfltr_dax_generated(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]

Setup matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the File pointing to the simulation (injection) file to be sent to these jobs on the command line. If not given, no file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
  • link_to_tmpltbank (boolean, optional (default=False)) – If this option is given, the job valid_times will be altered so that there will be one inspiral file for every template bank and they will cover the same time span. Note that this option must also be given during template bank generation to be meaningful.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

When using the setup_matchedfltr_dax_generated sub-module the following additional options apply in the [workflow-matchedfilter] section:

  • matchedfilter-link-to-tmpltbank - OPTIONAL. If this is given the workflow module will attempt to ensure a one-to-one correspondence between template banks and matched-filter outputs. This may not work in all cases and should be considered an option to be used for comparing with ihope output.
  • matchedfilter-compatibility-mode - OPTIONAL. If this is given the workflow module will tile the matched-filter jobs in the same way as inspiral_hipe used to. This requires the link option above and that the template bank and matched-filtering jobs are reading the same amount of data in each job.
  • max-analysis-segments - REQUIRED (not used for lalapps_inspiral). The maximum number of analysis segments to analyze within a single inspiral job. Note that triggers may not be produced for the entire span of time.
  • min-analysis-segments - REQUIRED (not used for lalapps_inspiral). The minimum number of analysis segments to analyze within a single inspiral job. This may be the same as the maximum.
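These options can be parsed and sanity-checked with Python’s standard configparser. The sketch below is purely illustrative: the check_matchedfilter_section helper and the numeric values are invented for this example and are not part of pycbc.

```python
import configparser

# Illustrative config text using the options described above;
# the numeric values are made up for the example.
CONFIG_TEXT = """
[workflow-matchedfilter]
matchedfilter-method = WORKFLOW_INDEPENDENT_IFOS
matchedfilter-link-to-tmpltbank =
min-analysis-segments = 5
max-analysis-segments = 15
"""

def check_matchedfilter_section(text):
    """Parse the config and apply the basic sanity checks implied above."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    sec = cp["workflow-matchedfilter"]
    # matchedfilter-method is the first required option.
    method = sec["matchedfilter-method"]
    assert method == "WORKFLOW_INDEPENDENT_IFOS", "only supported method"
    # min/max-analysis-segments are required; min must not exceed max.
    nmin = int(sec["min-analysis-segments"])
    nmax = int(sec["max-analysis-segments"])
    assert nmin <= nmax, "min-analysis-segments exceeds max-analysis-segments"
    return method, nmin, nmax

print(check_matchedfilter_section(CONFIG_TEXT))
```

Flag-style options such as matchedfilter-link-to-tmpltbank are given with an empty value, which configparser reads as an empty string.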

[executables]

inspiral = /path/to/inspiral_exec

A section, in this case [inspiral], will be used to specify the constant command line options that are sent to all inspiral jobs. How to set up the [{exe_name}] section, and which executables are currently supported is discussed below.
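Putting these together, assuming pycbc_inspiral as the trigger generator, the relevant pieces might look like the following sketch (the path and the [inspiral] option values are illustrative, not a recommended configuration):

```ini
[executables]
inspiral = /path/to/pycbc_inspiral

[inspiral]
; constant command line options sent to all inspiral jobs
snr-threshold = 5.5
low-frequency-cutoff = 30
```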

Supported inspiral trigger generators and instructions for using them

The following inspiral trigger generators are currently supported in pycbc’s workflow module

  • lalapps_inspiral
  • pycbc_inspiral

Adding a new executable is not too hard; please ask a developer for some pointers on how to do this if you want to add a new code.

lalapps_inspiral_ahope

lalapps_inspiral is the legacy C code that was used for many years to find gravitational-wave triggers in LIGO data. It is a little inflexible in terms of output file names.

lalapps_inspiral is supported in the workflow module via a wrapper script, lalapps_inspiral_ahope, which allows us to specify all the frame files and the output file name directly. Documentation for the lalapps_inspiral command line arguments can be found at http://software.ligo.org/docs/lalsuite/lalapps/inspiral_8c.html

Of these options the workflow module or the wrapper script will automatically add the following, which are unique for each job. DO NOT ADD THESE OPTIONS IN THE CONFIGURATION FILE.

  • --gps-start-time
  • --gps-end-time
  • --trig-start-time
  • --trig-end-time
  • --frame-cache
  • --user-tag
  • --ifo-tag

All other options must be provided in the configuration file. Here is an example of a lalapps_inspiral call.

lalapps_inspiral --do-rsq-veto  --trig-end-time 971614817 --enable-rsq-veto  --dynamic-range-exponent 69.0 --autochisq-stride 2 --bank-file datafind/L1-TMPLTBANK_19-971612833-2048.xml.gz --high-pass-order 8 --strain-high-pass-order 8 --ifo-tag FULL_DATA --user-tag 19 --gps-end-time 971614881 --calibrated-data real_8 --channel-name L1:LDAS-STRAIN --snr-threshold 5.5 --cluster-method template --number-of-segments 15 --trig-start-time 971613852 --enable-high-pass 30.0 --gps-start-time 971612833 --enable-filter-inj-only  --maximization-interval 30 --high-pass-attenuation 0.1 --chisq-bins 2 --inverse-spec-length 16 --rsq-veto-threshold 15.0 --segment-length 1048576 --low-frequency-cutoff 40.0 --pad-data 8 --autochisq-two-sided  --sample-rate 4096 --chisq-threshold 10.0 --rsq-veto-max-snr 12.0 --resample-filter ldas --strain-high-pass-atten 0.1 --strain-high-pass-freq 30 --bank-veto-time-freq  --segment-overlap 524288 --frame-cache datafind/L1-DATAFIND-968556757-3058132.lcf --chisq-delta 0.2 --bank-veto-subbank-size 20 --approximant FindChirpSP --rsq-veto-time-thresh 0.0002 --write-compress  --autochisq-length 100 --enable-output  --rsq-veto-window 6.0 --order threePointFivePN --spectrum-type median

pycbc_inspiral

pycbc_inspiral is pycbc’s inspiral matched-filtering program, designed as a replacement for and an improvement on lalapps_inspiral. The help message of pycbc_inspiral follows:

$ pycbc_inspiral --help
usage: 

Find single detector gravitational-wave triggers.

optional arguments:
  -h, --help            show this help message and exit
  --version
  -V, --verbose         print extra debugging information
  --update-progress UPDATE_PROGRESS
                        updates a file 'progress.txt' with a value 0 .. 1.0
                        when this amount of (filtering) progress was made
  --update-progress-file UPDATE_PROGRESS_FILE
                        name of the file to write the amount of (filtering)
                        progress to
  --output OUTPUT       FIXME: ADD
  --bank-file BANK_FILE
                        FIXME: ADD
  --snr-threshold SNR_THRESHOLD
                        SNR threshold for trigger generation
  --newsnr-threshold THRESHOLD
                        Cut triggers with NewSNR less than THRESHOLD
  --low-frequency-cutoff LOW_FREQUENCY_CUTOFF
                        The low frequency cutoff to use for filtering (Hz)
  --enable-bank-start-frequency
                        Read the starting frequency of template waveforms from
                        the template bank.
  --max-template-length MAX_TEMPLATE_LENGTH
                        The maximum length of a template in seconds. The
                        starting frequency of the template is modified to
                        ensure the proper length
  --enable-q-transform  compute the q-transform for each segment of a given
                        analysis run. (default = False)
  --approximant APPRX[:COND] [APPRX[:COND] ...]
                        The approximant(s) to use. Multiple approximants to
                        use in different regions may be provided. If multiple
                        approximants are provided, every one but the last must
                        be followed by a conditional statement defining
                        where that approximant should be used. Conditionals
                        can be any boolean test understood by numpy. For
                        example, 'Apprx:(mtotal > 4) & (mchirp <= 5)' would
                        use approximant 'Apprx' where total mass is > 4 and
                        chirp mass is <= 5. Conditionals are applied in order,
                        with each successive one only applied to regions not
                        covered by previous arguments. For example,
                        `'TaylorF2:mtotal < 4' 'IMRPhenomD:mchirp < 3'` would
                        result in IMRPhenomD being used where chirp mass is <
                        3 and total mass is >= 4. The last approximant given
                        may use 'else' as the conditional or include no
                        conditional. In either case, this will cause the last
                        approximant to be used in any remaining regions after
                        all the previous conditionals have been applied. For
                        the full list of possible parameters to apply
                        conditionals to, see WaveformArray.default_fields().
                        Math operations may also be used on parameters; syntax
                        is python, with any operation recognized by numpy.
  --order {-1,0,1,2,3,4,5,6,7,8}
                        The integer half-PN order at which to generate the
                        approximant. Default is -1 which indicates to use
                        approximant defined default.
  --taper-template {start,end,startend}
                        For time-domain approximants, taper the start and/or
                        end of the waveform before FFTing.
  --cluster-method {template,window}
                        FIXME: ADD
  --cluster-function {findchirp,symmetric}
                        How to cluster together triggers within a window.
                        'findchirp' uses a forward sliding window; 'symmetric'
                        will compare each window to the one before and after,
                        keeping only a local maximum.
  --cluster-window CLUSTER_WINDOW
                        Length of clustering window in seconds. Set to 0 to
                        disable clustering.
  --bank-veto-bank-file BANK_VETO_BANK_FILE
                        FIXME: ADD
  --chisq-snr-threshold CHISQ_SNR_THRESHOLD
                        Minimum SNR to calculate the power chisq
  --chisq-bins CHISQ_BINS
                        Number of frequency bins to use for power chisq.
                        Specify an integer for a constant number of bins, or a
                        function of template attributes. Math functions are
                        allowed, ex.
                        '10./math.sqrt((params.mass1+params.mass2)/100.)'.
                        Non-integer values will be rounded down.
  --chisq-threshold CHISQ_THRESHOLD
                        FIXME: ADD
  --chisq-delta CHISQ_DELTA
                        FIXME: ADD
  --autochi-number-points AUTOCHI_NUMBER_POINTS
                        The number of points to use, in both directions if
                        doing a two-sided auto-chisq, to calculate the
                        auto-chisq statistic.
  --autochi-stride AUTOCHI_STRIDE
                        The gap, in sample points, between the points at which
                        to calculate auto-chisq.
  --autochi-two-phase   If given auto-chisq will be calculated by testing
                        against both phases of the SNR time-series. If not
                        given, only the phase matching the trigger will be
                        used.
  --autochi-onesided {left,right}
                        Decide whether to calculate auto-chisq using points on
                        both sides of the trigger or only on one side. If not
                        given, points on both sides will be used. If given,
                        with either 'left' or 'right', only points on that
                        side (right = forward in time, left = back in time)
                        will be used.
  --autochi-reverse-template
                        If given, time-reverse the template before calculating
                        the auto-chisq statistic. This will come at additional
                        computational cost as the SNR time-series will need
                        recomputing for the time-reversed template.
  --autochi-max-valued  If given, store only the maximum value of the auto-
                        chisq over all points tested. A disadvantage of this
                        is that the mean value will not be known analytically.
  --autochi-max-valued-dof INT
                        If using --autochi-max-valued this value denotes the
                        pre-calculated mean value that will be stored as the
                        auto-chisq degrees-of-freedom value.
  --downsample-factor DOWNSAMPLE_FACTOR
                        Factor that determines the interval between the
                        initial SNR sampling. If not set (or 1) no sparse
                        sample is created, and the standard full SNR is
                        calculated.
  --upsample-threshold UPSAMPLE_THRESHOLD
                        The fraction of the SNR threshold to check the sparse
                        SNR sample.
  --upsample-method {pruned_fft}
                        The method to find the SNR points between the sparse
                        SNR sample.
  --user-tag TAG        This is used to identify FULL_DATA jobs for
                        compatibility with pipedown post-processing. Option
                        will be removed when no longer needed.
  --keep-loudest-log-chirp-window KEEP_LOUDEST_LOG_CHIRP_WINDOW
                        Keep loudest triggers within ln chirp mass window
  --keep-loudest-interval KEEP_LOUDEST_INTERVAL
                        Window in seconds to maximize triggers over bank
  --keep-loudest-num KEEP_LOUDEST_NUM
                        Number of triggers to keep from each maximization
                        interval
  --keep-loudest-stat {newsnr,new_snr,snr,newsnr_cut,exp_fit_csnr,exp_fit_sg_csnr,max_cont_trad_newsnr,newsnr_sgveto,newsnr_sgveto_psdvar,newsnr_sgveto_psdvar_scaled,newsnr_sgveto_psdvar_scaled_threshold,exp_fit_sg_csnr_psdvar}
                        Statistic used to determine loudest to keep
  --finalize-events-template-rate NUM TEMPLATES
                        After NUM TEMPLATES perform the various clustering and
                        rejection tests that would be performed at the end of
                        this job. Default is to only do those things at the
                        end of the job. This can help control memory usage if
                        a lot of triggers that would be rejected are being
                        retained. A suggested value for this is 500, but a
                        good number may depend on other settings and your
                        specific use-case.
  --gpu-callback-method GPU_CALLBACK_METHOD
  --use-compressed-waveforms
                        Use compressed waveforms from the bank file.
  --waveform-decompression-method WAVEFORM_DECOMPRESSION_METHOD
                        Method to be used decompress waveforms from the bank
                        file.
  --checkpoint-interval CHECKPOINT_INTERVAL
                        Save results to checkpoint file every X seconds.
                        Default is no checkpointing.
  --require-valid-checkpoint
                        If the checkpoint file is invalid, raise an error.
                        Default is to ignore invalid checkpoint files and to
                        delete the broken file.
  --checkpoint-exit-maxtime CHECKPOINT_EXIT_MAXTIME
                        Checkpoint and exit if X seconds of execution time is
                        exceeded. Default is no checkpointing.
  --checkpoint-exit-code CHECKPOINT_EXIT_CODE
                        Exit code returned if exiting after a checkpoint

Options to select the method of PSD generation:
  The options --psd-model, --psd-file, --asd-file, and --psd-estimation are
  mutually exclusive.

  --psd-model {AdVBNSOptimizedSensitivityP1200087,AdVDesignSensitivityP1200087,AdVEarlyHighSensitivityP1200087,AdVEarlyLowSensitivityP1200087,AdVLateHighSensitivityP1200087,AdVLateLowSensitivityP1200087,AdVMidHighSensitivityP1200087,AdVMidLowSensitivityP1200087,AdvVirgo,CosmicExplorerP1600143,CosmicExplorerPessimisticP1600143,CosmicExplorerWidebandP1600143,EinsteinTelescopeP1600143,GEO,GEOHF,KAGRA,KAGRADesignSensitivityT1600593,KAGRAEarlySensitivityT1600593,KAGRALateSensitivityT1600593,KAGRAMidSensitivityT1600593,KAGRAOpeningSensitivityT1600593,TAMA,Virgo,aLIGOAPlusDesignSensitivityT1800042,aLIGOAdVO3LowT1800545,aLIGOAdVO4IntermediateT1800545,aLIGOAdVO4T1800545,aLIGOBHBH20Deg,aLIGOBHBH20DegGWINC,aLIGOBNSOptimizedSensitivityP1200087,aLIGODesignSensitivityP1200087,aLIGOEarlyHighSensitivityP1200087,aLIGOEarlyLowSensitivityP1200087,aLIGOHighFrequency,aLIGOHighFrequencyGWINC,aLIGOKAGRA128MpcT1800545,aLIGOKAGRA25MpcT1800545,aLIGOKAGRA80MpcT1800545,aLIGOLateHighSensitivityP1200087,aLIGOLateLowSensitivityP1200087,aLIGOMidHighSensitivityP1200087,aLIGOMidLowSensitivityP1200087,aLIGONSNSOpt,aLIGONSNSOptGWINC,aLIGONoSRMHighPower,aLIGONoSRMLowPower,aLIGONoSRMLowPowerGWINC,aLIGOQuantumBHBH20Deg,aLIGOQuantumHighFrequency,aLIGOQuantumNSNSOpt,aLIGOQuantumNoSRMHighPower,aLIGOQuantumNoSRMLowPower,aLIGOQuantumZeroDetHighPower,aLIGOQuantumZeroDetLowPower,aLIGOThermal,aLIGOZeroDetHighPower,aLIGOZeroDetHighPowerGWINC,aLIGOZeroDetLowPower,aLIGOZeroDetLowPowerGWINC,aLIGOaLIGO140MpcT1800545,aLIGOaLIGO175MpcT1800545,aLIGOaLIGODesignSensitivityT1800044,aLIGOaLIGOO3LowT1800545,eLIGOModel,eLIGOShot,iLIGOModel,iLIGOSRD,iLIGOSeismic,iLIGOShot,iLIGOThermal,flat_unity}
                        Get PSD from given analytical model.
  --psd-file PSD_FILE   Get PSD using given PSD ASCII file
  --asd-file ASD_FILE   Get PSD using given ASD ASCII file
  --psd-inverse-length PSD_INVERSE_LENGTH
                        (Optional) The maximum length of the impulse response
                        of the overwhitening filter (s)
  --psd-file-xml-ifo-string PSD_FILE_XML_IFO_STRING
                        If using an XML PSD file, use the PSD in the file's
                        PSD dictionary with this ifo string. If not given and
                        only one PSD present in the file return that, if not
                        given and multiple (or zero) PSDs present an exception
                        will be raised.
  --psd-file-xml-root-name PSD_FILE_XML_ROOT_NAME
                        If given use this as the root name for the PSD XML
                        file. If this means nothing to you, then it is
                        probably safe to ignore this option.
  --psdvar-segment SECONDS
                        Length of segment for mean square calculation of PSD
                        variation.
  --psdvar-short-segment SECONDS
                        Length of short segment for outliers removal in PSD
                        variability calculation.
  --psdvar-long-segment SECONDS
                        Length of long segment when calculating the PSD
                        variability.
  --psdvar-psd-duration SECONDS
                        Duration of short segments for PSD estimation.
  --psdvar-psd-stride SECONDS
                        Separation between PSD estimation segments.
  --psdvar-low-freq HERTZ
                        Minimum frequency to consider in strain bandpass.
  --psdvar-high-freq HERTZ
                        Maximum frequency to consider in strain bandpass.
  --psd-estimation {mean,median,median-mean}
                        Measure PSD from the data, using given average method.
  --psd-segment-length PSD_SEGMENT_LENGTH
                        (Required for --psd-estimation) The segment length for
                        PSD estimation (s)
  --psd-segment-stride PSD_SEGMENT_STRIDE
                        (Required for --psd-estimation) The separation between
                        consecutive segments (s)
  --psd-num-segments PSD_NUM_SEGMENTS
                        (Optional, used only with --psd-estimation). If given,
                        PSDs will be estimated using only this number of
                        segments. If more data is given than needed to make
                        this number of segments then excess data will not be
                        used in the PSD estimate. If not enough data is given,
                        the code will fail.
  --psd-output PSD_OUTPUT
                        (Optional) Write PSD to specified file

Options for obtaining h(t):
  These options are used for generating h(t) either by reading from a file
  or by generating it. This is only needed if the PSD is to be estimated
  from the data, ie. if the --psd-estimation option is given.

  --gps-start-time GPS_START_TIME
                        The gps start time of the data (integer seconds)
  --gps-end-time GPS_END_TIME
                        The gps end time of the data (integer seconds)
  --strain-high-pass STRAIN_HIGH_PASS
                        High pass frequency
  --pad-data PAD_DATA   Extra padding to remove highpass corruption (integer
                        seconds)
  --taper-data TAPER_DATA
                        Taper ends of data to zero using the supplied length
                        as a window (integer seconds)
  --sample-rate SAMPLE_RATE
                        The sample rate to use for h(t) generation (integer
                        Hz).
  --channel-name CHANNEL_NAME
                        The channel containing the gravitational strain data
  --frame-cache FRAME_CACHE [FRAME_CACHE ...]
                        Cache file containing the frame locations.
  --frame-files FRAME_FILES [FRAME_FILES ...]
                        list of frame files
  --hdf-store HDF_STORE
                        Store of time series data in hdf format
  --frame-type FRAME_TYPE
                        (optional), replaces frame-files. Use datafind to get
                        the needed frame file(s) of this type.
  --frame-sieve FRAME_SIEVE
                        (optional), Only use frame files where the URL matches
                        the regular expression given.
  --fake-strain {AdVBNSOptimizedSensitivityP1200087,AdVDesignSensitivityP1200087,AdVEarlyHighSensitivityP1200087,AdVEarlyLowSensitivityP1200087,AdVLateHighSensitivityP1200087,AdVLateLowSensitivityP1200087,AdVMidHighSensitivityP1200087,AdVMidLowSensitivityP1200087,AdvVirgo,CosmicExplorerP1600143,CosmicExplorerPessimisticP1600143,CosmicExplorerWidebandP1600143,EinsteinTelescopeP1600143,GEO,GEOHF,KAGRA,KAGRADesignSensitivityT1600593,KAGRAEarlySensitivityT1600593,KAGRALateSensitivityT1600593,KAGRAMidSensitivityT1600593,KAGRAOpeningSensitivityT1600593,TAMA,Virgo,aLIGOAPlusDesignSensitivityT1800042,aLIGOAdVO3LowT1800545,aLIGOAdVO4IntermediateT1800545,aLIGOAdVO4T1800545,aLIGOBHBH20Deg,aLIGOBHBH20DegGWINC,aLIGOBNSOptimizedSensitivityP1200087,aLIGODesignSensitivityP1200087,aLIGOEarlyHighSensitivityP1200087,aLIGOEarlyLowSensitivityP1200087,aLIGOHighFrequency,aLIGOHighFrequencyGWINC,aLIGOKAGRA128MpcT1800545,aLIGOKAGRA25MpcT1800545,aLIGOKAGRA80MpcT1800545,aLIGOLateHighSensitivityP1200087,aLIGOLateLowSensitivityP1200087,aLIGOMidHighSensitivityP1200087,aLIGOMidLowSensitivityP1200087,aLIGONSNSOpt,aLIGONSNSOptGWINC,aLIGONoSRMHighPower,aLIGONoSRMLowPower,aLIGONoSRMLowPowerGWINC,aLIGOQuantumBHBH20Deg,aLIGOQuantumHighFrequency,aLIGOQuantumNSNSOpt,aLIGOQuantumNoSRMHighPower,aLIGOQuantumNoSRMLowPower,aLIGOQuantumZeroDetHighPower,aLIGOQuantumZeroDetLowPower,aLIGOThermal,aLIGOZeroDetHighPower,aLIGOZeroDetHighPowerGWINC,aLIGOZeroDetLowPower,aLIGOZeroDetLowPowerGWINC,aLIGOaLIGO140MpcT1800545,aLIGOaLIGO175MpcT1800545,aLIGOaLIGODesignSensitivityT1800044,aLIGOaLIGOO3LowT1800545,eLIGOModel,eLIGOShot,iLIGOModel,iLIGOSRD,iLIGOSeismic,iLIGOShot,iLIGOThermal,zeroNoise}
                        Name of model PSD for generating fake gaussian noise.
  --fake-strain-seed FAKE_STRAIN_SEED
                        Seed value for the generation of fake colored gaussian
                        noise
  --fake-strain-from-file FAKE_STRAIN_FROM_FILE
                        File containing ASD for generating fake noise from it.
  --injection-file INJECTION_FILE
                        (optional) Injection file used to add waveforms into
                        the strain
  --sgburst-injection-file SGBURST_INJECTION_FILE
                        (optional) Injection file used to add sine-Gaussian
                        burst waveforms into the strain
  --injection-scale-factor INJECTION_SCALE_FACTOR
                        Divide injections by this factor before injecting into
                        the data.
  --injection-f-ref INJECTION_F_REF
                        Reference frequency in Hz for creating CBC injections
                        from an XML file.
  --injection-f-final INJECTION_F_FINAL
                        Override the f_final field of a CBC XML injection
                        file.
  --gating-file GATING_FILE
                        (optional) Text file of gating segments to apply.
                        Format of each line is (all times in secs): gps_time
                        zeros_half_width pad_half_width
  --autogating-threshold SIGMA
                        If given, find and gate glitches producing a deviation
                        larger than SIGMA in the whitened strain time series.
  --autogating-max-iterations SIGMA
                        If given, iteratively apply autogating
  --autogating-cluster SECONDS
                        Length of clustering window for detecting glitches for
                        autogating.
  --autogating-width SECONDS
                        Half-width of the gating window.
  --autogating-taper SECONDS
                        Taper the strain before and after each gating window
                        over a duration of SECONDS.
  --autogating-pad SECONDS
                        Ignore the given length of whitened strain at the ends
                        of a segment, to avoid filters ringing.
  --normalize-strain NORMALIZE_STRAIN
                        (optional) Divide frame data by constant.
  --zpk-z ZPK_Z [ZPK_Z ...]
                        (optional) Zero-pole-gain (zpk) filter strain. A list
                        of zeros for transfer function
  --zpk-p ZPK_P [ZPK_P ...]
                        (optional) Zero-pole-gain (zpk) filter strain. A list
                        of poles for transfer function
  --zpk-k ZPK_K         (optional) Zero-pole-gain (zpk) filter strain.
                        Transfer function gain
  --witness-frame-type WITNESS_FRAME_TYPE
                        (optional) Frame type which will be used to query the
                        witness channel data.
  --witness-tf-file WITNESS_TF_FILE
                        An HDF file containing the transfer functions and the
                        associated channel names
  --witness-filter-length WITNESS_FILTER_LENGTH
                        Filter length in seconds for the transfer function
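The gating-file line format quoted above ("gps_time zeros_half_width pad_half_width") describes two nested windows: an inner span that is zeroed out and an outer span over which the data is smoothly tapered to zero. A minimal sketch of that geometry (an illustration of the format only, not pycbc's implementation; `gate_spans` is a hypothetical helper):

```python
# Illustrative sketch of how one --gating-file line,
# "gps_time zeros_half_width pad_half_width", maps onto the data:
# the inner span is zeroed, the outer span is affected by the taper.

def gate_spans(gps_time, zeros_half_width, pad_half_width):
    """Return ((zero_start, zero_end), (taper_start, taper_end))."""
    zeroed = (gps_time - zeros_half_width, gps_time + zeros_half_width)
    tapered = (zeroed[0] - pad_half_width, zeroed[1] + pad_half_width)
    return zeroed, tapered

# Example: gate a glitch at GPS 961591000 with a 0.5 s zeroed half-width
# and a 0.25 s taper pad on each side (all values hypothetical).
zeroed, tapered = gate_spans(961591000.0, 0.5, 0.25)
print(zeroed)   # (961590999.5, 961591000.5)
print(tapered)  # (961590999.25, 961591000.75)
```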

Options for segmenting the strain:
  These options are used to determine how to segment the strain into smaller
  chunks, and for determining the portion of each to analyze for triggers.

  --trig-start-time TRIG_START_TIME
                        (optional) The gps time to start recording triggers
  --trig-end-time TRIG_END_TIME
                        (optional) The gps time to stop recording triggers
  --segment-length SEGMENT_LENGTH
                        The length of each strain segment in seconds.
  --segment-start-pad SEGMENT_START_PAD
                        The time in seconds to ignore at the beginning of each
                        segment.
  --segment-end-pad SEGMENT_END_PAD
                        The time in seconds to ignore at the end of each
                        segment.
  --allow-zero-padding  Allow for zero padding of data to analyze requested
                        times, if needed.
  --filter-inj-only     Analyze only segments that contain an injection.
  --injection-window INJECTION_WINDOW
                        If using --filter-inj-only then only search for
                        injections within +/- injection window of the
                        injection's end time. This is useful to speed up a
                        coherent search or a search where we initially filter
                        at lower sample rate, and then filter at full rate
                        where needed. NOTE: Reverts to full analysis if two
                        injections are in the same segment.
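The padding options combine as simple arithmetic: only the unpadded middle of each strain segment produces triggers, and consecutive segments must overlap by the total padding so that the analyzed spans tile the data. A sketch using the values from the example pycbc_inspiral call later on this page:

```python
# Segment bookkeeping implied by the options above, using the values
# from the example pycbc_inspiral call on this page.
segment_length = 2048   # --segment-length
start_pad = 650         # --segment-start-pad
end_pad = 16            # --segment-end-pad

# The padding at each end is read in (e.g. for PSD conditioning and
# filter wrap-around) but not searched for triggers.
analyzed_per_segment = segment_length - start_pad - end_pad
print(analyzed_per_segment)  # 1382

# Consecutive segments overlap by the total padding so the analyzed
# spans cover the science time without gaps.
overlap = start_pad + end_pad
print(overlap)  # 666
```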

Options for selecting the processing scheme in this program.:
  --processing-scheme PROCESSING_SCHEME
                        The choice of processing scheme. Choices are ['mkl',
                        'cuda', 'cpu', 'numpy']. (optional for CPU scheme) The
                        number of execution threads can be indicated by
                        cpu:NUM_THREADS, where NUM_THREADS is an integer. The
                        default is a single thread. If the scheme is provided
                        as cpu:env, the number of threads can be provided by
                        the PYCBC_NUM_THREADS environment variable. If the
                        environment variable is not set, the number of threads
                        matches the number of logical cores.
  --processing-device-id PROCESSING_DEVICE_ID
                        (optional) ID of GPU to use for accelerated processing
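The "cpu:NUM_THREADS" / "cpu:env" convention described under --processing-scheme can be illustrated with a small parser. This is an assumption for illustration only (`cpu_thread_count` is a hypothetical helper, not pycbc's own code):

```python
import os

# Hypothetical parser for the --processing-scheme value: "cpu",
# "cpu:NUM_THREADS", or "cpu:env" with the PYCBC_NUM_THREADS
# environment variable, falling back to the logical core count.

def cpu_thread_count(scheme, default=1):
    name, _, opt = scheme.partition(':')
    if name != 'cpu' or not opt:
        return default
    if opt == 'env':
        return int(os.environ.get('PYCBC_NUM_THREADS', os.cpu_count()))
    return int(opt)

print(cpu_thread_count('cpu'))      # 1 (single thread by default)
print(cpu_thread_count('cpu:4'))    # 4
os.environ['PYCBC_NUM_THREADS'] = '8'
print(cpu_thread_count('cpu:env'))  # 8
```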

Options for selecting the FFT backend and controlling its performance in this program.:
  --fft-backends [FFT_BACKENDS [FFT_BACKENDS ...]]
                        Preference list of the FFT backends. Choices are:
                        ['fftw', 'mkl', 'numpy']
  --fftw-measure-level FFTW_MEASURE_LEVEL
                        Determines the measure level used in planning FFTW
                        FFTs; allowed values are: [0, 1, 2, 3]
  --fftw-threads-backend FFTW_THREADS_BACKEND
                        Give 'openmp', 'pthreads' or 'unthreaded' to specify
                        which threaded FFTW to use
  --fftw-input-float-wisdom-file FFTW_INPUT_FLOAT_WISDOM_FILE
                        Filename from which to read single-precision wisdom
  --fftw-input-double-wisdom-file FFTW_INPUT_DOUBLE_WISDOM_FILE
                        Filename from which to read double-precision wisdom
  --fftw-output-float-wisdom-file FFTW_OUTPUT_FLOAT_WISDOM_FILE
                        Filename to which to write single-precision wisdom
  --fftw-output-double-wisdom-file FFTW_OUTPUT_DOUBLE_WISDOM_FILE
                        Filename to which to write double-precision wisdom
  --fftw-import-system-wisdom
                        If given, call fftw[f]_import_system_wisdom()

Options for selecting optimization-specific settings:
  --cpu-affinity CPU_AFFINITY
                        A set of CPUs on which to run, specified in a format
                        suitable to pass to taskset.
  --cpu-affinity-from-env CPU_AFFINITY_FROM_ENV
                        The name of an environment variable containing a set
                        of CPUs on which to run, specified in a format
                        suitable to pass to taskset.

Options that, if injections are present in this run, are responsible for performing pre-checks between injections in the data being filtered and the current search template to determine if the template has any chance of actually detecting the injection. The parameters of this test are given by the various options below. The --injection-filter-rejector-chirp-time-window and --injection-filter-rejector-match-threshold options need to be provided if those tests are desired. Other options will take default values unless overridden. More details on these options follow.:
  --injection-filter-rejector-chirp-time-window INJECTION_FILTER_REJECTOR_CHIRP_TIME_WINDOW
                        If this value is not None and an injection file is
                        given then we will calculate the difference in chirp
                        time (tau_0) between the template and each injection
                        in the analysis segment. If the difference is greater
                        than this threshold for all injections then filtering
                        is not performed. By default this will be None.
  --injection-filter-rejector-match-threshold INJECTION_FILTER_REJECTOR_MATCH_THRESHOLD
                        If this value is not None and an injection file is
                        provided then we will calculate a 'coarse match'
                        between the template and each injection in the
                        analysis segment. If the match is less than this
                        threshold for all injections then filtering is not
                        performed. Parameters for the 'coarse match' follow.
                        By default this value will be None.
  --injection-filter-rejector-coarsematch-deltaf INJECTION_FILTER_REJECTOR_COARSEMATCH_DELTAF
                        If injections are present and a match threshold is
                        provided, this option specifies the frequency spacing
                        that will be used for injections, templates and PSD
                        when computing the 'coarse match'. Templates will be
                        generated directly with this spacing. The PSD and
                        injections will be resampled.
  --injection-filter-rejector-coarsematch-fmax INJECTION_FILTER_REJECTOR_COARSEMATCH_FMAX
                        If injections are present and a match threshold is
                        provided, this option specifies the maximum frequency
                        that will be used for injections, templates and PSD
                        when computing the 'coarse match'. Templates will be
                        generated directly with this max frequency. The PSD
                        and injections' frequency series will be truncated.
  --injection-filter-rejector-seg-buffer INJECTION_FILTER_REJECTOR_SEG_BUFFER
                        If injections are present and either a match threshold
                        or a chirp-time window is given, we will determine if
                        injections are 'in' the specified analysis chunk by
                        using the end times. If this value is non-zero the
                        analysis chunk is extended on both sides by this
                        amount before determining if injections are within the
                        given window.
  --injection-filter-rejector-f-lower INJECTION_FILTER_REJECTOR_F_LOWER
                        If injections are present and either a match threshold
                        or a chirp-time window is given, this value is used to
                        set the lower frequency for determining chirp times or
                        for calculating matches. If this value is None the
                        lower frequency used for the full matched-filter is
                        used. Otherwise this value is used.
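The chirp-time test above compares the Newtonian chirp time tau_0 of the template against each injection. A sketch using the standard leading-order expression tau_0 = (5/256) (pi f_low)^(-8/3) (G Mchirp / c^3)^(-5/3); the window value and masses below are hypothetical, and pycbc's internal implementation may differ:

```python
import math

# Standard Newtonian chirp time from a lower frequency cutoff f_low
# for a binary with chirp mass mchirp (in solar masses).
G = 6.674e-11          # m^3 kg^-1 s^-2
C = 299792458.0        # m/s
MSUN = 1.989e30        # kg

def tau0(mchirp_msun, f_low):
    tm = G * mchirp_msun * MSUN / C**3   # chirp mass in seconds
    return (5.0 / 256.0) * (math.pi * f_low) ** (-8.0 / 3.0) * tm ** (-5.0 / 3.0)

# A 1.4+1.4 Msun binary has chirp mass ~1.22 Msun; from 30 Hz its
# Newtonian chirp time is roughly 54 s.
t_template = tau0(1.22, 30.0)
t_injection = tau0(1.4, 30.0)

# Hypothetical --injection-filter-rejector-chirp-time-window value:
# if the difference exceeds the window for all injections in the
# segment, filtering of this template is skipped.
window = 2.0
skip_template = abs(t_template - t_injection) > window
print(skip_template)  # True
```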

Sine-Gaussian Chisq:
  --sgchisq-snr-threshold SGCHISQ_SNR_THRESHOLD
                        Minimum SNR threshold to use SG chisq
  --sgchisq-locations SGCHISQ_LOCATIONS [SGCHISQ_LOCATIONS ...]
                        Frequency offsets and quality factors of the sine-
                        Gaussians to use, format 'region-
                        boolean:q1-offset1,q2-offset2'. Offset is relative to
                        the end frequency of the approximant. Region is a
                        boolean expression selecting templates to apply the
                        sine-Gaussians to, ex. 'mtotal>40'
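The 'region-boolean:q1-offset1,q2-offset2' format quoted above can be unpacked with a small parser. This is an assumption about the format string for illustration only (`parse_sgchisq_location` is hypothetical, not pycbc's own code, and does not handle negative offsets):

```python
# Hypothetical parser for one --sgchisq-locations entry of the form
# 'region-boolean:q1-offset1,q2-offset2': the part before ':' is a
# boolean expression selecting templates, the rest is a comma-separated
# list of quality-factor/frequency-offset pairs.

def parse_sgchisq_location(entry):
    region, _, pairs = entry.partition(':')
    qs_offsets = []
    for pair in pairs.split(','):
        q, _, offset = pair.partition('-')
        qs_offsets.append((float(q), float(offset)))
    return region, qs_offsets

region, locs = parse_sgchisq_location('mtotal>40:20-30,20-45')
print(region)  # mtotal>40
print(locs)    # [(20.0, 30.0), (20.0, 45.0)]
```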

Of these options the workflow module will automatically add the following, which are unique for each job. DO NOT ADD THESE OPTIONS IN THE CONFIGURATION FILE.

  • --gps-start-time
  • --gps-end-time
  • --frame-cache
  • --output

All other options must be provided in the configuration file. Here is an example of a pycbc_inspiral call.

pycbc_inspiral \
    --trig-end-time 961592867 --verbose --cluster-method window \
    --bank-file tmpltbank/L1-TMPLTBANK_01-961591486-1382.xml.gz \
    --gps-end-time 961592884 --channel-name L1:LDAS-STRAIN \
    --processing-scheme cuda --snr-threshold 5.5 --psd-estimation median \
    --trig-start-time 961591534 --gps-start-time 961590836 --chisq-bins 16 \
    --segment-end-pad 16 --segment-length 2048 --low-frequency-cutoff 15 \
    --pad-data 8 --cluster-window 1 --sample-rate 4096 --segment-start-pad 650 \
    --psd-segment-stride 32 --psd-inverse-length 16 --psd-segment-length 64 \
    --frame-cache datafind/L1-DATAFIND-961585543-7349.lcf \
    --approximant SPAtmplt --output inspiral/L1-INSPIRAL_1-961591534-1333.xml.gz \
    --strain-high-pass 30 --order 7
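In a workflow run, the non-automatic options above come from the configuration file rather than the command line. A hypothetical fragment (the section name and values are for illustration only, mirroring the example call; the automatically added options are deliberately absent):

```ini
; Hypothetical configuration fragment for the matched-filter executable.
; --gps-start-time, --gps-end-time, --frame-cache and --output are NOT
; listed here: the workflow adds them per job.
[inspiral]
snr-threshold = 5.5
segment-length = 2048
segment-start-pad = 650
segment-end-pad = 16
low-frequency-cutoff = 15
sample-rate = 4096
cluster-method = window
cluster-window = 1
chisq-bins = 16
approximant = SPAtmplt
order = 7
```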

pycbc.workflow.matched_filter Module

This is the complete documentation of this module’s code

This module is responsible for setting up the matched-filtering stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html

pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]

Set up matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the file containing the simulations to be sent to these jobs on the command line. If not given, no file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
  • link_to_tmpltbank (boolean, optional (default=False)) – If this option is given, the job valid_times will be altered so that there will be one inspiral file for every template bank and they will cover the same time span. Note that this option must also be given during template bank generation to be meaningful.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated_multi(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None, link_to_tmpltbank=False, compatibility_mode=False)[source]

Set up matched-filter jobs that are generated as part of the workflow, in which a single job reads in data and generates triggers over multiple ifos. This module can support any matched-filter code that is similar in principle to pycbc_multi_inspiral or lalapps_coh_PTF_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the file containing the simulations to be sent to these jobs on the command line. If not given, no file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList

pycbc.workflow.matched_filter.setup_matchedfltr_workflow(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)[source]

This function aims to be the gateway for setting up a set of matched-filter jobs in a workflow. It is intended to support multiple methods/codes for doing this. For now the only supported sub-module runs the matched filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.

Parameters:
  • workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
  • science_segs (ifo-keyed dictionary of ligo.segments.segmentlist instances) – The list of times that are being analysed in this workflow.
  • datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
  • tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
  • output_dir (path) – The directory in which output will be stored.
  • injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the file containing the simulations to be sent to these jobs on the command line. If not given, no file will be sent.
  • tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
Returns:

inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.

Return type:

pycbc.workflow.core.FileList