The workflow matched-filter module
Introduction
The matched-filter section of PyCBC’s workflow module is responsible for matched-filtering the data against the template bank(s) from the template bank section and generating a list of “triggers” for each interferometer. These triggers should be a list of every event whose signal-to-noise ratio and signal consistency test values are such that the event should be sent forward to check for coincidence in other ifos.
Any single-ifo signal consistency tests (e.g. chi-squared tests) should be computed in this section and stored within the lists of triggers. The workflow does not place any requirements on the output format, but code in the later stages of the workflow must know how to process that output.
The matched-filtering section should be as independent of the other stages of the workflow as possible. This means that we don’t require the data read in by matched-filter jobs to match that read in by template bank jobs (although this may be desirable in some cases, so it should be possible where sensible). Options should also not be hardcoded (so there are no cases where an option that gets sent to a template bank job also gets sent to a matched-filter job without any way to stop that). However, it is possible to duplicate options where this is desirable (see PyCBC’s workflow module configuration file(s) and command line interface).
The return from the matched-filter section of the workflow module is a list of File objects corresponding to each actual file (one for each job) that will be generated within the workflow. This will be the only thing that will be passed from the matched-filter section to the future sections.
Usage
Using this module requires the following:
A configuration file (or files) containing the information needed to tell this module how to generate GW triggers.
An initialized instance of the workflow class, containing the ConfigParser.
A list of segments to be analysed by this module.
A FileList returned by the templatebank module containing the template banks available for use.
A FileList returned by the datafind module containing the frames that hold the data that will be matched-filtered.
If desired, an injection file for assessing sensitivity to simulated signals.
The module is then called as follows:
- pycbc.workflow.setup_matchedfltr_workflow(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)
This function aims to be the gateway for setting up a set of matched-filter jobs in a workflow. It is intended to support multiple different ways/codes that could be used for doing this. For now the only supported sub-module is one that runs the matched-filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.
- Parameters:
workflow (pycbc.workflow.core.Workflow) – The workflow instance that the matched-filter jobs will be added to.
science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.
datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
output_dir (path) – The directory in which output will be stored.
injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the injection file containing the simulated signals to be sent to these jobs on the command line. If not given, no file will be sent.
tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be ['BNSINJECTIONS'] or ['NOINJECTIONANALYSIS']. This will be used in output names.
- Returns:
inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
- Return type: pycbc.workflow.core.FileList
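As a rough illustration of the call described above, the sketch below sets up one matched-filter stage without injections and one further stage per injection file, giving each its own tag. The helper name run_matched_filter_stages, the injection_files mapping and the setup_func argument are hypothetical and exist only for this example. setup_func defaults to pycbc.workflow.setup_matchedfltr_workflow, which is imported lazily so that a stand-in can be substituted where PyCBC is not installed.

```python
def run_matched_filter_stages(workflow, science_segs, datafind_outs,
                              tmplt_banks, injection_files=None,
                              output_dir="matchedfilter", setup_func=None):
    """Set up one stage without injections, then one per injection file.

    injection_files (hypothetical) maps a tag such as 'BNSINJECTIONS' to the
    injection File for that run. Returns a dict of tag -> stage outputs.
    """
    if setup_func is None:
        # Lazy import: only needed when no stand-in function is supplied.
        from pycbc.workflow import setup_matchedfltr_workflow as setup_func

    # The run without injections always comes first; injection runs follow.
    runs = {"NOINJECTIONANALYSIS": None}
    runs.update(injection_files or {})

    outputs = {}
    for tag, inj_file in runs.items():
        outputs[tag] = setup_func(
            workflow, science_segs, datafind_outs, tmplt_banks,
            output_dir=output_dir, injection_file=inj_file, tags=[tag])
    return outputs
```

Because the tags are used in output names, giving each injection run its own tag keeps the files from the different stages distinguishable.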
Configuration file setup
Here we describe the configuration file options that are needed in this section of the workflow.
[workflow-matchedfilter] section
The configuration file must have a [workflow-matchedfilter] section, which is used to tell the workflow how to construct the matched-filter jobs. The first option to choose and provide is
matchedfilter-method = VALUE
The available choices are described below:
WORKFLOW_INDEPENDENT_IFOS - Matched-filter trigger files will be generated within the workflow. Each job will cover only a short stretch of data (normally ~2000 s) to reflect PSD changes over time, and the jobs will be independent and distinct for each analysed interferometer. This uses the setup_matchedfltr_dax_generated sub-module.
Currently there is only one option, but others can be added. The sub-functions used are described here:
- pycbc.workflow.setup_matchedfltr_dax_generated(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None)
Setup matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but for new codes some additions are needed to define Executable and Job sub-classes (see jobutils.py).
- Parameters:
workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.
datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
output_dir (path) – The directory in which output will be stored.
injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the injection file containing the simulated signals to be sent to these jobs on the command line. If not given, no file will be sent.
tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be ['BNSINJECTIONS'] or ['NOINJECTIONANALYSIS']. This will be used in output names.
- Returns:
inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
- Return type: pycbc.workflow.core.FileList
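The relationship between the matchedfilter-method value and the sub-function that handles it can be pictured as a lookup table. The following is only a sketch of that idea and not the PyCBC implementation: select_matchedfilter_setup and the handlers mapping are hypothetical names, the standard-library configparser stands in for the workflow's own ConfigParser, and the sub-function is represented by a stand-in so the example runs without PyCBC.

```python
import configparser


def select_matchedfilter_setup(cp, handlers):
    """Return the setup function registered for the configured method.

    cp is a ConfigParser; handlers (hypothetical) maps each supported
    matchedfilter-method value to the sub-function that implements it.
    """
    method = cp.get("workflow-matchedfilter", "matchedfilter-method")
    try:
        return handlers[method]
    except KeyError:
        supported = ", ".join(sorted(handlers))
        raise ValueError(
            f"Unsupported matchedfilter-method {method!r}; "
            f"supported values: {supported}") from None


cp = configparser.ConfigParser()
cp.read_string("""
[workflow-matchedfilter]
matchedfilter-method = WORKFLOW_INDEPENDENT_IFOS
""")


def dax_generated_stand_in(*args, **kwargs):
    # Stand-in for pycbc.workflow.setup_matchedfltr_dax_generated.
    return "dax_generated"


handlers = {"WORKFLOW_INDEPENDENT_IFOS": dax_generated_stand_in}
setup = select_matchedfilter_setup(cp, handlers)
```

Supporting a further method in such a scheme would amount to registering one more entry in the table, while an unrecognised value fails early with a message listing the supported choices.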
When using the setup_matchedfltr_dax_generated sub-module the following additional options apply in the [workflow-matchedfilter] section:
matchedfilter-link-to-tmpltbank - OPTIONAL. If this is given, the workflow module will attempt to ensure a one-to-one correspondence between template banks and matched-filter outputs. This may not work in all cases and should be considered an option to be used for comparing with ihope output.
matchedfilter-compatibility-mode - OPTIONAL. If this is given, the workflow module will tile the matched-filter jobs in the same way as inspiral_hipe used to. This requires the link option above, and requires that the template bank and matched-filtering jobs read the same amount of data in each job.
max-analysis-segments = (NOT used for lalapps_inspiral) - REQUIRED. The maximum number of analysis segments to analyze within a single inspiral job. Note that triggers may not be produced for the entire span of time.
min-analysis-segments = (NOT used for lalapps_inspiral) - REQUIRED. The minimum number of analysis segments to analyze within a single inspiral job. This may be the same as the maximum.
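The constraints described in the options above can be checked before any jobs are built. The sketch below is an assumption-laden illustration: read_matchedfilter_options is a hypothetical helper, the standard-library configparser stands in for the workflow's ConfigParser, the segment counts of 5 are placeholder values, and the example assumes an executable that uses the min/max analysis segment options (that is, not lalapps_inspiral). The check that the minimum does not exceed the maximum is this sketch's own reading of "may be the same as the maximum".

```python
import configparser

SECTION = "workflow-matchedfilter"


def read_matchedfilter_options(cp):
    """Collect and cross-check the [workflow-matchedfilter] options."""
    opts = {
        # The two OPTIONAL settings act as flags: presence means "on".
        "link_to_tmpltbank": cp.has_option(
            SECTION, "matchedfilter-link-to-tmpltbank"),
        "compatibility_mode": cp.has_option(
            SECTION, "matchedfilter-compatibility-mode"),
        "min_segments": cp.getint(SECTION, "min-analysis-segments"),
        "max_segments": cp.getint(SECTION, "max-analysis-segments"),
    }
    if opts["compatibility_mode"] and not opts["link_to_tmpltbank"]:
        raise ValueError("matchedfilter-compatibility-mode requires "
                         "matchedfilter-link-to-tmpltbank")
    if opts["min_segments"] > opts["max_segments"]:
        raise ValueError("min-analysis-segments exceeds max-analysis-segments")
    return opts


# allow_no_value lets the OPTIONAL flags appear without a value.
cp = configparser.ConfigParser(allow_no_value=True)
cp.read_string("""
[workflow-matchedfilter]
matchedfilter-method = WORKFLOW_INDEPENDENT_IFOS
matchedfilter-link-to-tmpltbank =
min-analysis-segments = 5
max-analysis-segments = 5
""")
options = read_matchedfilter_options(cp)
```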
[executables]
inspiral = /path/to/inspiral_exec
A section, in this case [inspiral], will be used to specify the constant command line options that are sent to all inspiral jobs. How to set up the [{exe_name}] section, and which executables are currently supported, are discussed below.
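To make the link between the [executables] entry and the [{exe_name}] section concrete, the sketch below turns both into the constant part of a command line. It is an illustration only: constant_command_line is a hypothetical helper, the standard-library configparser stands in for the workflow's ConfigParser, and the two [inspiral] options are sample values rather than a recommended configuration.

```python
import configparser


def constant_command_line(cp, exe_name):
    """Return [executable, --opt, value, ...] built from the config."""
    args = [cp.get("executables", exe_name)]
    for opt, value in cp.items(exe_name):
        args.append(f"--{opt}")
        if value:  # options given without a value are treated as flags
            args.append(value)
    return args


cp = configparser.ConfigParser(allow_no_value=True)
cp.read_string("""
[executables]
inspiral = /path/to/inspiral_exec

[inspiral]
snr-threshold = 5.5
low-frequency-cutoff = 40.0
""")
print(constant_command_line(cp, "inspiral"))
```

The per-job options (times, input files, output names) are not part of this list; they are added separately for each job.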
Supported inspiral trigger generators and instructions for using them
The following inspiral trigger generators are currently supported in PyCBC’s workflow module:
lalapps_inspiral
pycbc_inspiral
Adding a new executable is not too hard. Please ask a developer for some pointers if you want to add a new code.
lalapps_inspiral_ahope
lalapps_inspiral is the legacy C code that has been used for years to find gravitational-wave triggers. It is a little inflexible in terms of output file names.
lalapps_inspiral is supported in the workflow module via a wrapper script, lalapps_inspiral_ahope, which allows us to specify all the frame files and the output file name directly. Documentation for the lalapps_inspiral command line arguments can be found at http://software.ligo.org/docs/lalsuite/lalapps/inspiral_8c.html
Of these options the workflow module or the wrapper script will automatically add the following, which are unique for each job. DO NOT ADD THESE OPTIONS IN THE CONFIGURATION FILE.
--gps-start-time
--gps-end-time
--trig-start-time
--trig-end-time
--frame-cache
--user-tag
--ifo-tag
All other options must be provided in the configuration file. Here is an example of a lalapps_inspiral call.
lalapps_inspiral --do-rsq-veto --trig-end-time 971614817 --enable-rsq-veto --dynamic-range-exponent 69.0 --autochisq-stride 2 --bank-file datafind/L1-TMPLTBANK_19-971612833-2048.xml.gz --high-pass-order 8 --strain-high-pass-order 8 --ifo-tag FULL_DATA --user-tag 19 --gps-end-time 971614881 --calibrated-data real_8 --channel-name L1:LDAS-STRAIN --snr-threshold 5.5 --cluster-method template --number-of-segments 15 --trig-start-time 971613852 --enable-high-pass 30.0 --gps-start-time 971612833 --enable-filter-inj-only --maximization-interval 30 --high-pass-attenuation 0.1 --chisq-bins 2 --inverse-spec-length 16 --rsq-veto-threshold 15.0 --segment-length 1048576 --low-frequency-cutoff 40.0 --pad-data 8 --autochisq-two-sided --sample-rate 4096 --chisq-threshold 10.0 --rsq-veto-max-snr 12.0 --resample-filter ldas --strain-high-pass-atten 0.1 --strain-high-pass-freq 30 --bank-veto-time-freq --segment-overlap 524288 --frame-cache datafind/L1-DATAFIND-968556757-3058132.lcf --chisq-delta 0.2 --bank-veto-subbank-size 20 --approximant FindChirpSP --rsq-veto-time-thresh 0.0002 --write-compress --autochisq-length 100 --enable-output --rsq-veto-window 6.0 --order threePointFivePN --spectrum-type median
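Since the per-job options listed above must not appear in the configuration file, a configuration can be screened for them before the workflow is generated. The following is a minimal sketch under stated assumptions: find_reserved_options and RESERVED_OPTIONS are hypothetical names, the standard-library configparser stands in for the workflow's ConfigParser, and the sample [inspiral] section deliberately contains two offending entries.

```python
import configparser

# Options the workflow or wrapper script adds itself for lalapps_inspiral.
RESERVED_OPTIONS = {
    "gps-start-time", "gps-end-time", "trig-start-time", "trig-end-time",
    "frame-cache", "user-tag", "ifo-tag",
}


def find_reserved_options(cp, section="inspiral"):
    """Return the sorted reserved options wrongly set in `section`."""
    return sorted(RESERVED_OPTIONS.intersection(cp.options(section)))


cp = configparser.ConfigParser()
cp.read_string("""
[inspiral]
snr-threshold = 5.5
gps-start-time = 971612833
frame-cache = datafind/L1-DATAFIND-968556757-3058132.lcf
""")
clashes = find_reserved_options(cp)
```

Here clashes holds the two entries that would need to be removed from the configuration file.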
pycbc_inspiral
pycbc_inspiral is PyCBC’s inspiral matched-filtering program, designed as a replacement for and improvement on lalapps_inspiral. The help message of pycbc_inspiral follows:
$ pycbc_inspiral --help
usage:
Find single detector gravitational-wave triggers.
options:
-h, --help show this help message and exit
--update-progress UPDATE_PROGRESS
updates a file 'progress.txt' with a value 0 .. 1.0
when this amount of (filtering) progress was made
--update-progress-file UPDATE_PROGRESS_FILE
name of the file to write the amount of (filtering)
progress to
--output OUTPUT FIXME: ADD
--bank-file BANK_FILE
FIXME: ADD
--snr-threshold SNR_THRESHOLD
SNR threshold for trigger generation
--newsnr-threshold THRESHOLD
Cut triggers with NewSNR less than THRESHOLD
--low-frequency-cutoff LOW_FREQUENCY_CUTOFF
The low frequency cutoff to use for filtering (Hz)
--enable-bank-start-frequency
Read the starting frequency of template waveforms from
the template bank.
--max-template-length MAX_TEMPLATE_LENGTH
The maximum length of a template is seconds. The
starting frequency of the template is modified to
ensure the proper length
--enable-q-transform compute the q-transform for each segment of a given
analysis run. (default = False)
--approximant APPRX[:COND] [APPRX[:COND] ...]
The approximant(s) to use. Multiple approximants to
use in different regions may be provided. If multiple
approximants are provided, every one but the last must
be be followed by a conditional statement defining
where that approximant should be used. Conditionals
can be any boolean test understood by numpy. For
example, 'Apprx:(mtotal > 4) & (mchirp <= 5)' would
use approximant 'Apprx' where total mass is > 4 and
chirp mass is <= 5. Conditionals are applied in order,
with each successive one only applied to regions not
covered by previous arguments. For example,
`'TaylorF2:mtotal < 4' 'IMRPhenomD:mchirp < 3'` would
result in IMRPhenomD being used where chirp mass is <
3 and total mass is >= 4. The last approximant given
may use 'else' as the conditional or include no
conditional. In either case, this will cause the last
approximant to be used in any remaning regions after
all the previous conditionals have been applied. For
the full list of possible parameters to apply
conditionals to, see WaveformArray.default_fields().
Math operations may also be used on parameters; syntax
is python, with any operation recognized by numpy.
--order {-1,0,1,2,3,4,5,6,7,8}
The integer half-PN order at which to generate the
approximant. Default is -1 which indicates to use
approximant defined default.
--taper-template {start,end,startend}
For time-domain approximants, taper the start and/or
end of the waveform before FFTing.
--cluster-function {findchirp,symmetric}
How to cluster together triggers within a window.
'findchirp' uses a forward sliding window; 'symmetric'
will compare each window to the one before and after,
keeping only a local maximum.
--cluster-window CLUSTER_WINDOW
Length of clustering window in seconds. Set to 0 to
disable clustering.
--bank-veto-bank-file BANK_VETO_BANK_FILE
FIXME: ADD
--chisq-snr-threshold CHISQ_SNR_THRESHOLD
Minimum SNR to calculate the power chisq
--chisq-bins CHISQ_BINS
Number of frequency bins to use for power chisq.
Specify an integer for a constant number of bins, or a
function of template attributes. Math functions are
allowed, ex.
'10./math.sqrt((params.mass1+params.mass2)/100.)'.
Non-integer values will be rounded down.
--chisq-threshold CHISQ_THRESHOLD
FIXME: ADD
--chisq-delta CHISQ_DELTA
FIXME: ADD
--autochi-number-points AUTOCHI_NUMBER_POINTS
The number of points to use, in both directions
ifdoing a two-sided auto-chisq, to calculate theauto-
chisq statistic.
--autochi-stride AUTOCHI_STRIDE
The gap, in sample points, between the points atwhich
to calculate auto-chisq.
--autochi-two-phase If given auto-chisq will be calculated by testing
against both phases of the SNR time-series. If not
given, only the phase matching the trigger will be
used.
--autochi-onesided {left,right}
Decide whether to calculate auto-chisq usingpoints on
both sides of the trigger or only on oneside. If not
given points on both sides will beused. If given, with
either 'left' or 'right',only points on that side
(right = forward in time,left = back in time) will be
used.
--autochi-reverse-template
If given, time-reverse the template beforecalculating
the auto-chisq statistic. This willcome at additional
computational cost as the SNRtime-series will need
recomputing for the time-reversed template.
--autochi-max-valued If given, store only the maximum value of the auto-
chisq over all points tested. A disadvantage of this
is that the mean value will not be known analytically.
--autochi-max-valued-dof INT
If using --autochi-max-valued this value denotes the
pre-calculated mean value that will be stored as the
auto-chisq degrees-of-freedom value.
--downsample-factor DOWNSAMPLE_FACTOR
Factor that determines the interval between the
initial SNR sampling. If not set (or 1) no sparse
sample is created, and the standard full SNR is
calculated.
--upsample-threshold UPSAMPLE_THRESHOLD
The fraction of the SNR threshold to check the sparse
SNR sample.
--upsample-method {pruned_fft}
The method to find the SNR points between the sparse
SNR sample.
--user-tag TAG This is used to identify FULL_DATA jobs for
compatibility with pipedown post-processing. Option
will be removed when no longer needed.
--keep-loudest-log-chirp-window KEEP_LOUDEST_LOG_CHIRP_WINDOW
Keep loudest triggers within ln chirp mass window
--keep-loudest-interval KEEP_LOUDEST_INTERVAL
Window in seconds to maximize triggers over bank
--keep-loudest-num KEEP_LOUDEST_NUM
Number of triggers to keep from each maximization
interval
--keep-loudest-stat {snr,newsnr,new_snr,newsnr_sgveto,newsnr_sgveto_psdvar,newsnr_sgveto_psdvar_threshold,newsnr_sgveto_psdvar_scaled,newsnr_sgveto_psdvar_scaled_threshold}
Statistic used to determine loudest to keep
--finalize-events-template-rate NUM TEMPLATES
After NUM TEMPLATES perform the various clustering and
rejection tests that would be performed at the end of
this job. Default is to only do those things at the
end of the job. This can help control memory usage if
a lot of triggers that would be rejected are being
retained. A suggested value for this is 512, but a
good number may depend on other settings and your
specific use-case.
--gpu-callback-method GPU_CALLBACK_METHOD
--use-compressed-waveforms
Use compressed waveforms from the bank file (if
available).
--waveform-decompression-method WAVEFORM_DECOMPRESSION_METHOD
Method to be used decompress waveforms from the bank
file.
--checkpoint-interval CHECKPOINT_INTERVAL
Save results to checkpoint file every X seconds.
Default is no checkpointing.
--require-valid-checkpoint
If the checkpoint file is invalid, raise an error.
Default is to ignore invalid checkpoint files and to
delete the broken file.
--checkpoint-exit-maxtime CHECKPOINT_EXIT_MAXTIME
Checkpoint and exit if X seconds of execution time is
exceeded. Default is no checkpointing.
--checkpoint-exit-code CHECKPOINT_EXIT_CODE
Exit code returned if exiting after a checkpoint
--multiprocessing-nprocesses MULTIPROCESSING_NPROCESSES
Parallelize over multiple processes, note this is
separate from threading using the proc. scheme. Used
in conjunction with the option--finalize-events-
template-rate which should be setto a multiple of the
number of processes.
PyCBC common options:
Common options for PyCBC executables.
-v, --verbose Add verbosity to logging. Adding the option multiple
times makes logging progressively more verbose, e.g.
--verbose or -v provides logging at the info level,
but -vv or --verbose --verbose provides debug logging.
--version [VERSION] Display PyCBC version information and exit. Can
optionally supply a modifier integer to control the
verbosity of the version information. 0 and 1 are the
same as --version; 2 provides more detailed PyCBC
library information; 3 provides information about
PyCBC, LAL and LALSimulation packages (if installed)
Options to select the method of PSD generation:
The options --psd-model, --psd-file, --asd-file, and --psd-estimation are
mutually exclusive.
--psd-model {AdVBNSOptimizedSensitivityP1200087,AdVDesignSensitivityP1200087,AdVEarlyHighSensitivityP1200087,AdVEarlyLowSensitivityP1200087,AdVLateHighSensitivityP1200087,AdVLateLowSensitivityP1200087,AdVMidHighSensitivityP1200087,AdVMidLowSensitivityP1200087,AdVO3LowT1800545,AdVO4IntermediateT1800545,AdVO4T1800545,AdvVirgo,CosmicExplorerP1600143,CosmicExplorerPessimisticP1600143,CosmicExplorerWidebandP1600143,EinsteinTelescopeP1600143,GEO,GEOHF,KAGRA,KAGRA128MpcT1800545,KAGRA25MpcT1800545,KAGRA80MpcT1800545,KAGRADesignSensitivityT1600593,KAGRAEarlySensitivityT1600593,KAGRALateSensitivityT1600593,KAGRAMidSensitivityT1600593,KAGRAOpeningSensitivityT1600593,TAMA,Virgo,aLIGO140MpcT1800545,aLIGO175MpcT1800545,aLIGOAPlusDesignSensitivityT1800042,aLIGOAdVO3LowT1800545,aLIGOAdVO4IntermediateT1800545,aLIGOAdVO4T1800545,aLIGOBHBH20Deg,aLIGOBHBH20DegGWINC,aLIGOBNSOptimizedSensitivityP1200087,aLIGODesignSensitivityP1200087,aLIGODesignSensitivityT1800044,aLIGOEarlyHighSensitivityP1200087,aLIGOEarlyLowSensitivityP1200087,aLIGOHighFrequency,aLIGOHighFrequencyGWINC,aLIGOKAGRA128MpcT1800545,aLIGOKAGRA25MpcT1800545,aLIGOKAGRA80MpcT1800545,aLIGOLateHighSensitivityP1200087,aLIGOLateLowSensitivityP1200087,aLIGOMidHighSensitivityP1200087,aLIGOMidLowSensitivityP1200087,aLIGONSNSOpt,aLIGONSNSOptGWINC,aLIGONoSRMHighPower,aLIGONoSRMLowPower,aLIGONoSRMLowPowerGWINC,aLIGOO3LowT1800545,aLIGOQuantumBHBH20Deg,aLIGOQuantumHighFrequency,aLIGOQuantumNSNSOpt,aLIGOQuantumNoSRMHighPower,aLIGOQuantumNoSRMLowPower,aLIGOQuantumZeroDetHighPower,aLIGOQuantumZeroDetLowPower,aLIGOThermal,aLIGOZeroDetHighPower,aLIGOZeroDetHighPowerGWINC,aLIGOZeroDetLowPower,aLIGOZeroDetLowPowerGWINC,aLIGOaLIGO140MpcT1800545,aLIGOaLIGO175MpcT1800545,aLIGOaLIGODesignSensitivityT1800044,aLIGOaLIGOO3LowT1800545,eLIGOModel,eLIGOShot,iLIGOModel,iLIGOSRD,iLIGOSeismic,iLIGOShot,iLIGOThermal,analytical_psd_lisa_tdi_AE,analytical_psd_lisa_tdi_AE_confusion,analytical_psd_lisa_tdi_T,analytical_psd_lisa_tdi_XYZ,analytical_psd_taiji_tdi_AE
,analytical_psd_taiji_tdi_AE_confusion,analytical_psd_taiji_tdi_T,analytical_psd_taiji_tdi_XYZ,analytical_psd_tianqin_tdi_AE,analytical_psd_tianqin_tdi_AE_confusion,analytical_psd_tianqin_tdi_T,analytical_psd_tianqin_tdi_XYZ,flat_unity,sh_transformed_psd_lisa_tdi_XYZ}
Get PSD from given analytical model.
--psd-extra-args PARAM:VALUE [PARAM:VALUE ...]
(optional) Extra arguments passed to the PSD models.
--psd-file PSD_FILE Get PSD using given PSD ASCII file
--asd-file ASD_FILE Get PSD using given ASD ASCII file
--psd-inverse-length PSD_INVERSE_LENGTH
(Optional) The maximum length of the impulse response
of the overwhitening filter (s)
--invpsd-trunc-method {hann}
(Optional) What truncation method to use when applying
psd-inverse-length. If not provided, a hard truncation
will be used.
--psd-file-xml-ifo-string PSD_FILE_XML_IFO_STRING
If using an XML PSD file, use the PSD in the file's
PSD dictionary with this ifo string. If not given and
only one PSD present in the file return that, if not
given and multiple (or zero) PSDs present an exception
will be raised.
--psd-file-xml-root-name PSD_FILE_XML_ROOT_NAME
If given use this as the root name for the PSD XML
file. If this means nothing to you, then it is
probably safe to ignore this option.
--psdvar-segment SECONDS
Length of segment for mean square calculation of PSD
variation.
--psdvar-short-segment SECONDS
Length of short segment for outliers removal in PSD
variability calculation.
--psdvar-long-segment SECONDS
Length of long segment when calculating the PSD
variability.
--psdvar-psd-duration SECONDS
Duration of short segments for PSD estimation.
--psdvar-psd-stride SECONDS
Separation between PSD estimation segments.
--psdvar-low-freq HERTZ
Minimum frequency to consider in strain bandpass.
--psdvar-high-freq HERTZ
Maximum frequency to consider in strain bandpass.
--psd-estimation {mean,median,median-mean}
Measure PSD from the data, using given average method.
--psd-segment-length PSD_SEGMENT_LENGTH
(Required for --psd-estimation) The segment length for
PSD estimation (s)
--psd-segment-stride PSD_SEGMENT_STRIDE
(Required for --psd-estimation) The separation between
consecutive segments (s)
--psd-num-segments PSD_NUM_SEGMENTS
(Optional, used only with --psd-estimation). If given,
PSDs will be estimated using only this number of
segments. If more data is given than needed to make
this number of segments then excess data will not be
used in the PSD estimate. If not enough data is given,
the code will fail.
--psd-output PSD_OUTPUT
(Optional) Write PSD to specified file
Options for obtaining h(t):
These options are used for generating h(t) either by reading from a file
or by generating it. This is only needed if the PSD is to be estimated
from the data, ie. if the --psd-estimation option is given.
--gps-start-time GPS_START_TIME
The gps start time of the data (integer seconds)
--gps-end-time GPS_END_TIME
The gps end time of the data (integer seconds)
--strain-high-pass STRAIN_HIGH_PASS
High pass frequency
--strain-low-pass STRAIN_LOW_PASS
Low pass frequency
--pad-data PAD_DATA Extra padding to remove highpass corruption (integer
seconds, default 8)
--taper-data TAPER_DATA
Taper ends of data to zero using the supplied length
as a window (integer seconds)
--sample-rate SAMPLE_RATE
The sample rate to use for h(t) generation (integer
Hz)
--channel-name CHANNEL_NAME
The channel containing the gravitational strain data
--frame-cache FRAME_CACHE [FRAME_CACHE ...]
Cache file containing the frame locations.
--frame-files FRAME_FILES [FRAME_FILES ...]
list of frame files
--hdf-store HDF_STORE
Store of time series data in hdf format
--frame-type S:TYPE (optional), replaces frame-files. Use datafind to get
the needed frame file(s) of this type from site S.
--frame-sieve FRAME_SIEVE
(optional), Only use frame files where the URL matches
the regular expression given.
--fake-strain {AdVBNSOptimizedSensitivityP1200087,AdVDesignSensitivityP1200087,AdVEarlyHighSensitivityP1200087,AdVEarlyLowSensitivityP1200087,AdVLateHighSensitivityP1200087,AdVLateLowSensitivityP1200087,AdVMidHighSensitivityP1200087,AdVMidLowSensitivityP1200087,AdVO3LowT1800545,AdVO4IntermediateT1800545,AdVO4T1800545,AdvVirgo,CosmicExplorerP1600143,CosmicExplorerPessimisticP1600143,CosmicExplorerWidebandP1600143,EinsteinTelescopeP1600143,GEO,GEOHF,KAGRA,KAGRA128MpcT1800545,KAGRA25MpcT1800545,KAGRA80MpcT1800545,KAGRADesignSensitivityT1600593,KAGRAEarlySensitivityT1600593,KAGRALateSensitivityT1600593,KAGRAMidSensitivityT1600593,KAGRAOpeningSensitivityT1600593,TAMA,Virgo,aLIGO140MpcT1800545,aLIGO175MpcT1800545,aLIGOAPlusDesignSensitivityT1800042,aLIGOAdVO3LowT1800545,aLIGOAdVO4IntermediateT1800545,aLIGOAdVO4T1800545,aLIGOBHBH20Deg,aLIGOBHBH20DegGWINC,aLIGOBNSOptimizedSensitivityP1200087,aLIGODesignSensitivityP1200087,aLIGODesignSensitivityT1800044,aLIGOEarlyHighSensitivityP1200087,aLIGOEarlyLowSensitivityP1200087,aLIGOHighFrequency,aLIGOHighFrequencyGWINC,aLIGOKAGRA128MpcT1800545,aLIGOKAGRA25MpcT1800545,aLIGOKAGRA80MpcT1800545,aLIGOLateHighSensitivityP1200087,aLIGOLateLowSensitivityP1200087,aLIGOMidHighSensitivityP1200087,aLIGOMidLowSensitivityP1200087,aLIGONSNSOpt,aLIGONSNSOptGWINC,aLIGONoSRMHighPower,aLIGONoSRMLowPower,aLIGONoSRMLowPowerGWINC,aLIGOO3LowT1800545,aLIGOQuantumBHBH20Deg,aLIGOQuantumHighFrequency,aLIGOQuantumNSNSOpt,aLIGOQuantumNoSRMHighPower,aLIGOQuantumNoSRMLowPower,aLIGOQuantumZeroDetHighPower,aLIGOQuantumZeroDetLowPower,aLIGOThermal,aLIGOZeroDetHighPower,aLIGOZeroDetHighPowerGWINC,aLIGOZeroDetLowPower,aLIGOZeroDetLowPowerGWINC,aLIGOaLIGO140MpcT1800545,aLIGOaLIGO175MpcT1800545,aLIGOaLIGODesignSensitivityT1800044,aLIGOaLIGOO3LowT1800545,eLIGOModel,eLIGOShot,iLIGOModel,iLIGOSRD,iLIGOSeismic,iLIGOShot,iLIGOThermal,analytical_psd_lisa_tdi_AE,analytical_psd_lisa_tdi_AE_confusion,analytical_psd_lisa_tdi_T,analytical_psd_lisa_tdi_XYZ,analytical_psd_taiji_tdi_
AE,analytical_psd_taiji_tdi_AE_confusion,analytical_psd_taiji_tdi_T,analytical_psd_taiji_tdi_XYZ,analytical_psd_tianqin_tdi_AE,analytical_psd_tianqin_tdi_AE_confusion,analytical_psd_tianqin_tdi_T,analytical_psd_tianqin_tdi_XYZ,flat_unity,sh_transformed_psd_lisa_tdi_XYZ,zeroNoise}
Name of model PSD for generating fake gaussian noise.
--fake-strain-extra-args PARAM:VALUE [PARAM:VALUE ...]
(optional) Extra arguments passed to the PSD models.
--fake-strain-seed FAKE_STRAIN_SEED
Seed value for the generation of fake colored gaussian
noise
--fake-strain-from-file FAKE_STRAIN_FROM_FILE
File containing ASD for generating fake noise from it.
--fake-strain-flow FAKE_STRAIN_FLOW
Low frequency cutoff of the fake strain
--fake-strain-filter-duration FAKE_STRAIN_FILTER_DURATION
Duration in seconds of the fake data coloring filter
--fake-strain-sample-rate FAKE_STRAIN_SAMPLE_RATE
Sample rate of the fake data generation
--injection-file INJECTION_FILE
(optional) Injection file containing parameters of CBC
signals to be added to the strain
--sgburst-injection-file SGBURST_INJECTION_FILE
(optional) Injection file containing parametersof
sine-Gaussian burst signals to add to the strain
--injection-scale-factor INJECTION_SCALE_FACTOR
Divide injections by this factor before adding to the
strain data
--injection-sample-rate INJECTION_SAMPLE_RATE
Sample rate to use for injections (integer Hz).
Typically similar to the strain data sample rate.If
not provided, the strain sample rate will be used
--injection-f-ref INJECTION_F_REF
Reference frequency in Hz for creating CBC injections
from an XML file
--injection-f-final INJECTION_F_FINAL
Override the f_final field of a CBC XML injection file
(frequency in Hz)
--gating-file GATING_FILE
(optional) Text file of gating segments to apply.
Format of each line is (all values in seconds):
gps_time zeros_half_width pad_half_width
--autogating-threshold SIGMA
If given, find and gate glitches producing a deviation
larger than SIGMA in the whitened strain time series.
--autogating-max-iterations SIGMA
If given, iteratively apply autogating
--autogating-cluster SECONDS
Length of clustering window for detecting glitches for
autogating.
--autogating-width SECONDS
Half-width of the gating window.
--autogating-taper SECONDS
Taper the strain before and after each gating window
over a duration of SECONDS.
--autogating-pad SECONDS
Ignore the given length of whitened strain at the ends
of a segment, to avoid filters ringing.
--gating-method {hard,taper,paint}
Choose the method for gating. Default: `taper`
--normalize-strain NORMALIZE_STRAIN
(optional) Divide frame data by constant.
--zpk-z ZPK_Z [ZPK_Z ...]
(optional) Zero-pole-gain (zpk) filter strain. A list
of zeros for transfer function
--zpk-p ZPK_P [ZPK_P ...]
(optional) Zero-pole-gain (zpk) filter strain. A list
of poles for transfer function
--zpk-k ZPK_K (optional) Zero-pole-gain (zpk) filter strain.
Transfer function gain
--witness-frame-type WITNESS_FRAME_TYPE
(optional), frame type which will be use to query the
witness channel data.
--witness-tf-file WITNESS_TF_FILE
an hdf file containing the transfer functions and the
associated channel names
--witness-filter-length WITNESS_FILTER_LENGTH
filter length in seconds for the transfer function
Options for segmenting the strain:
These options are used to determine how to segment the strain into smaller
chunks, and for determining the portion of each to analyze for triggers.
--trig-start-time TRIG_START_TIME
(optional) The gps time to start recording triggers
--trig-end-time TRIG_END_TIME
(optional) The gps time to stop recording triggers
--segment-length SEGMENT_LENGTH
The length of each strain segment in seconds.
--segment-start-pad SEGMENT_START_PAD
The time in seconds to ignore of the beginning of each
segment in seconds.
--segment-end-pad SEGMENT_END_PAD
The time in seconds to ignore at the end of each
segment in seconds.
--allow-zero-padding Allow for zero padding of data to analyze requested
times, if needed.
--filter-inj-only Analyze only segments that contain an injection.
--injection-window INJECTION_WINDOW
If using --filter-inj-only then only search for
injections within +/- injection window of the
injections's end time. This is useful to speed up a
coherent search or a search where we initially filter
at lower sample rate, and then filter at full rate
where needed. NOTE: Reverts to full analysis if two
injections are in the same segment.
Options for selecting the processing scheme in this program.:
--processing-scheme PROCESSING_SCHEME
The choice of processing scheme. Choices are ['mkl',
'cupy', 'cpu', 'numpy', 'cuda']. (optional for CPU
scheme) The number of execution threads can be
indicated by cpu:NUM_THREADS, where NUM_THREADS is an
integer. The default is a single thread. If the scheme
is provided as cpu:env, the number of threads can be
provided by the PYCBC_NUM_THREADS environment
variable. If the environment variable is not set, the
number of threads matches the number of logical cores.
--processing-device-id PROCESSING_DEVICE_ID
(optional) ID of GPU to use for accelerated processing
Options for selecting the FFT backend and controlling its performance in this program.:
--fft-backends [FFT_BACKENDS ...]
Preference list of the FFT backends. Choices are:
['mkl', 'fftw', 'numpy', 'cupy']
--fftw-measure-level FFTW_MEASURE_LEVEL
Determines the measure level used in planning FFTW
FFTs; allowed values are: [0, 1, 2, 3]
--fftw-threads-backend FFTW_THREADS_BACKEND
Give 'openmp', 'pthreads' or 'unthreaded' to specify
which threaded FFTW to use
--fftw-input-float-wisdom-file FFTW_INPUT_FLOAT_WISDOM_FILE
Filename from which to read single-precision wisdom
--fftw-input-double-wisdom-file FFTW_INPUT_DOUBLE_WISDOM_FILE
Filename from which to read double-precision wisdom
--fftw-output-float-wisdom-file FFTW_OUTPUT_FLOAT_WISDOM_FILE
Filename to which to write single-precision wisdom
--fftw-output-double-wisdom-file FFTW_OUTPUT_DOUBLE_WISDOM_FILE
Filename to which to write double-precision wisdom
--fftw-import-system-wisdom
If given, call fftw[f]_import_system_wisdom()
Options for selecting optimization-specific settings:
--cpu-affinity CPU_AFFINITY
A set of CPUs on which to run, specified in a format
suitable to pass to taskset.
--cpu-affinity-from-env CPU_AFFINITY_FROM_ENV
The name of an environment variable containing a set
of CPUs on which to run, specified in a format
suitable to pass to taskset.
Options that, if injections are present in this run, are responsible for performing pre-checks between injections in the data being filtered and the current search template to determine if the template has any chance of actually detecting the injection. The parameters of this test are given by the various options below. The --injection-filter-rejector-chirp-time-window and --injection-filter-rejector-match-threshold options need to be provided if those tests are desired. Other options will take default values unless overridden. More details on these options follow.:
--injection-filter-rejector-chirp-time-window INJECTION_FILTER_REJECTOR_CHIRP_TIME_WINDOW
If this value is not None and an injection file is
given then we will calculate the difference in chirp
time (tau_0) between the template and each injection
in the analysis segment. If the difference is greate
than this threshold for all injections then filtering
is not performed. By default this will be None.
--injection-filter-rejector-match-threshold INJECTION_FILTER_REJECTOR_MATCH_THRESHOLD
If this value is not None and an injection file is
provided then we will calculate a 'coarse match'
between the template and each injection in the
analysis segment. If the match is less than this
threshold for all injections then filtering is not
performed. Parameters for the 'coarse match' follow.
By default this value will be None.
--injection-filter-rejector-coarsematch-deltaf INJECTION_FILTER_REJECTOR_COARSEMATCH_DELTAF
If injections are present and a match threshold is
provided, this option specifies the frequency spacing
that will be used for injections, templates and PSD
when computing the 'coarse match'. Templates will be
generated directly with this spacing. The PSD and
injections will be resampled.
--injection-filter-rejector-coarsematch-fmax INJECTION_FILTER_REJECTOR_COARSEMATCH_FMAX
If injections are present and a match threshold is
provided, this option specifies the maximum frequency
that will be used for injections, templates and PSD
when computing the 'coarse match'. Templates will be
generated directly with this max frequency. The PSD
and injections' frequency series will be truncated.
--injection-filter-rejector-seg-buffer INJECTION_FILTER_REJECTOR_SEG_BUFFER
If injections are present and either a match threshold
or a chirp-time window is given, we will determine if
injections are 'in' the specified analysis chunk by
using the end times. If this value is non-zero the
analysis chunk is extended on both sides by this
amount before determining if injections are within the
given window.
--injection-filter-rejector-f-lower INJECTION_FILTER_REJECTOR_F_LOWER
If injections are present and either a match threshold
or a chirp-time window is given, this value is used to
set the lower frequency for determining chirp times or
for calculating matches. If this value is None the
lower frequency used for the full matched-filter is
used. Otherwise this value is used.
Sine-Gaussian Chisq:
--sgchisq-snr-threshold SGCHISQ_SNR_THRESHOLD
Minimum SNR threshold to use SG chisq
--sgchisq-locations SGCHISQ_LOCATIONS [SGCHISQ_LOCATIONS ...]
Frequency offsets and quality factors of the sine-
Gaussians to use, format 'region-
boolean:q1-offset1,q2-offset2'. Offset is relative to
the end frequency of the approximant. Region is a
boolean expression selecting templates to apply the
sine-Gaussians to, ex. 'mtotal>40'
Of these options, the workflow module will automatically add the following, which are unique for each job. DO NOT ADD THESE OPTIONS IN THE CONFIGURATION FILE.
--gps-start-time
--gps-end-time
--frame-cache
--output
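A configuration can be screened for these reserved options before the workflow is built. The sketch below is illustrative only: it uses Python's standard configparser rather than PyCBC's own configuration class, and the section name inspiral, the helper find_reserved_options and the example values are assumptions made for this example.

```python
import configparser

# Options the workflow adds itself for every job (taken from the list above).
RESERVED_OPTIONS = ("gps-start-time", "gps-end-time", "frame-cache", "output")


def find_reserved_options(config, section):
    """Return the reserved options that appear in the given section."""
    if not config.has_section(section):
        return []
    return [opt for opt in RESERVED_OPTIONS if config.has_option(section, opt)]


# Hypothetical configuration text; "inspiral" is an assumed section name.
EXAMPLE_CONFIG = """
[inspiral]
snr-threshold = 5.5
segment-length = 2048
gps-start-time = 961590836
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE_CONFIG)
clashes = find_reserved_options(config, "inspiral")
print(clashes)  # ['gps-start-time']
```

An empty list means the section leaves the per-job options to the workflow, as required.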
All other options must be provided in the configuration file. Here is an example of a pycbc_inspiral call.
pycbc_inspiral --trig-end-time 961592867 --verbose --cluster-method window --bank-file tmpltbank/L1-TMPLTBANK_01-961591486-1382.xml.gz --gps-end-time 961592884 --channel-name L1:LDAS-STRAIN --processing-scheme cuda --snr-threshold 5.5 --psd-estimation median --trig-start-time 961591534 --gps-start-time 961590836 --chisq-bins 16 --segment-end-pad 16 --segment-length 2048 --low-frequency-cutoff 15 --pad-data 8 --cluster-window 1 --sample-rate 4096 --segment-start-pad 650 --psd-segment-stride 32 --psd-inverse-length 16 --psd-segment-length 64 --frame-cache datafind/L1-DATAFIND-961585543-7349.lcf --approximant SPAtmplt --output inspiral/L1-INSPIRAL_1-961591534-1333.xml.gz --strain-high-pass 30 --order 7
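The segment options in this call can be checked by hand. The sketch below assumes, following the help text, that the analysable part of each segment is the segment length minus the two pads, and that triggers are recorded between --trig-start-time and --trig-end-time. The function names are hypothetical and this is not PyCBC's own segmentation code.

```python
def analysable_seconds(segment_length, start_pad, end_pad):
    """Seconds of one strain segment left after ignoring both pads."""
    remaining = segment_length - start_pad - end_pad
    if remaining <= 0:
        raise ValueError("pads leave no data to analyse in the segment")
    return remaining


def trigger_window_seconds(trig_start_time, trig_end_time):
    """Length of the interval in which triggers are recorded."""
    return trig_end_time - trig_start_time


# Values copied from the example pycbc_inspiral call above.
per_segment = analysable_seconds(segment_length=2048, start_pad=650, end_pad=16)
window = trigger_window_seconds(961591534, 961592867)
print(per_segment, window)  # 1382 1333
```

Here 1333 equals the last number in the output file name (L1-INSPIRAL_1-961591534-1333.xml.gz) and 1382 equals the last number in the template bank file name. This suggests the file names carry a start time followed by a duration, but that is an observation about this example rather than a documented guarantee.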
pycbc.workflow.matched_filter
Module
This is complete documentation of this module’s code
This module is responsible for setting up the matched-filtering stage of workflows. For details about this module and its capabilities see here: https://ldas-jobs.ligo.caltech.edu/~cbc/docs/pycbc/NOTYETCREATED.html
- pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None)
Set up matched-filter jobs that are generated as part of the workflow. This module can support any matched-filter code that is similar in principle to lalapps_inspiral, but new codes need some additions to define Executable and Job sub-classes (see jobutils.py).
- Parameters:
workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.
datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
output_dir (path) – The directory in which output will be stored.
injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file to be sent to these jobs on the command line. If not given, no file will be sent.
tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
- Returns:
inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
- Return type:
- pycbc.workflow.matched_filter.setup_matchedfltr_dax_generated_multi(workflow, science_segs, datafind_outs, tmplt_banks, output_dir, injection_file=None, tags=None)
Set up matched-filter jobs that are generated as part of the workflow, in which a single job reads in data and generates triggers over multiple ifos. This module can support any matched-filter code that is similar in principle to pycbc_multi_inspiral, but new codes need some additions to define Executable and Job sub-classes (see jobutils.py).
- Parameters:
workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.
datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis and, if requested by the user, the vetoes File and the search sky-grid File.
tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
output_dir (path) – The directory in which output will be stored.
injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file to be sent to these jobs on the command line. If not given, no file will be sent.
tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
- Returns:
inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
- Return type:
- pycbc.workflow.matched_filter.setup_matchedfltr_workflow(workflow, science_segs, datafind_outs, tmplt_banks, output_dir=None, injection_file=None, tags=None)
This function is the gateway for setting up a set of matched-filter jobs in a workflow. It is intended to support multiple different methods or codes for doing this. For now the only supported sub-module is one that runs the matched-filtering by setting up a series of matched-filtering jobs, from one executable, to create matched-filter triggers covering the full range of science times for which there is data and a template bank file.
- Parameters:
workflow (pycbc.workflow.core.Workflow) – The Workflow instance that the matched-filter jobs will be added to.
science_segs (ifo-keyed dictionary of igwn_segments.segmentlist instances) – The list of times that are being analysed in this workflow.
datafind_outs (pycbc.workflow.core.FileList) – A FileList of the datafind files that are needed to obtain the data used in the analysis.
tmplt_banks (pycbc.workflow.core.FileList) – A FileList of the template bank files that will serve as input in this stage.
output_dir (path) – The directory in which output will be stored.
injection_file (pycbc.workflow.core.File, optional (default=None)) – If given, the simulation file to be sent to these jobs on the command line. If not given, no file will be sent.
tags (list of strings (optional, default = [])) – A list of the tagging strings that will be used for all jobs created by this call to the workflow. An example might be [‘BNSINJECTIONS’] or [‘NOINJECTIONANALYSIS’]. This will be used in output names.
- Returns:
inspiral_outs – A list of output files written by this stage. This will not contain any intermediate products produced within this stage of the workflow. If you require access to any intermediate products produced at this stage you can call the various sub-functions directly.
- Return type:
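A caller may want to run this stage once per injection set, using tags to keep the outputs apart. The sketch below shows one way to structure that and is not part of PyCBC: run_matched_filter_stage, its injection_files mapping, the stand-in fake_setup and the file name bns.xml are all hypothetical. The setup function is passed in as an argument so that the example runs without a real workflow. In practice it would be pycbc.workflow.matched_filter.setup_matchedfltr_workflow, and the other arguments would be the objects described in the parameter list above.

```python
def run_matched_filter_stage(setup_func, workflow, science_segs, datafind_outs,
                             tmplt_banks, output_dir, injection_files):
    """Call setup_func once per tag and collect the returned outputs.

    injection_files maps a tag string to an injection file, or to None
    for an analysis without injections.
    """
    outputs = {}
    for tag, injection_file in injection_files.items():
        outputs[tag] = setup_func(
            workflow, science_segs, datafind_outs, tmplt_banks,
            output_dir=output_dir, injection_file=injection_file, tags=[tag])
    return outputs


def fake_setup(workflow, science_segs, datafind_outs, tmplt_banks,
               output_dir=None, injection_file=None, tags=None):
    """Stand-in with the documented signature; returns a label, not files."""
    return "{}/{}".format(output_dir, "-".join(tags))


results = run_matched_filter_stage(
    fake_setup, workflow=None, science_segs={}, datafind_outs=[],
    tmplt_banks=[], output_dir="inspiral",
    injection_files={"NOINJECTIONANALYSIS": None, "BNSINJECTIONS": "bns.xml"})
print(results)
```

Passing a distinct tag on each call follows the documented behaviour that tags are used in output names, so the results of the two analyses should not collide.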