pycbc_make_inference_inj_workflow: A parameter estimation workflow generator for injections

Introduction

The executable pycbc_make_inference_inj_workflow is a workflow generator that sets up a parameter estimation analysis on one or more simulated signals (injections). Optionally, it can also run a percentile-percentile (PP) test on the injections it analyzed.

The workflow is very similar to the standard inference workflow created by pycbc_make_inference_workflow. The main differences are:

  • Rather than providing one or more [event-{label}] sections in the workflow config file, you provide a single [workflow-inference] section. The syntax for this section is very similar to the [event] section(s) in the standard workflow, as it sets the configuration files that are used by pycbc_inference. The difference is that the same settings are used for all injections.

  • When you create the workflow, you pass it either --num-injections or --injection-file. If the former, the workflow will draw the specified number of injections from the prior given to pycbc_inference and analyze them. If the latter, the workflow will analyze the injections specified in the given injection file. The file must be an HDF file; see pycbc_create_injections for details. In either case, each injection is treated as an independent event, with its own summary section in the results page.

  • You may optionally have the workflow do a percentile-percentile test on the injections. You do this by adding the necessary executables and corresponding sections to the workflow_config.ini file. See the example below for details. If a percentile-percentile test is done, the results page will have an additional tab that gives a summary of the PP test on all of the parameters, as well as PP plots and plots of injected versus recovered values.

  • It is recommended (though not required) that you add plot-injection-parameters to the [plot_posterior] and [plot_posterior_summary] sections. Doing so will cause red lines to be plotted at the injected parameter values on the posterior plots, so that you may visually inspect how well the injected values are recovered. This may also require providing an injection-samples-map argument. See the example file below for details.

In the standard workflow we used two workflow configuration files, a workflow_config.ini and an events.ini. For the injection workflow, we can use the same workflow_config.ini; we just need to set up an injections_config.ini to add the needed sections and arguments for the injections workflow.

In the example below, we demonstrate how to use the injections workflow using the same prior and sampler settings as given in the standard workflow example.

Example: BBH injections with dynesty

In this example we use the same prior and sampler settings as the example of analyzing GW150914 and GW170814 in the pycbc_make_inference_workflow documentation. We will analyze 10 injections and do a percentile-percentile test on them. (This is only an example. To do a full PP test, we recommend using at least 100 injections.)
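
For readers new to the percentile-percentile test, the idea can be sketched with a toy problem. The code below is not what the workflow runs; it assumes a one-dimensional Gaussian prior and Gaussian noise (so the posterior is known in closed form), and the function names are made up for illustration. For each simulated injection it records the fraction of posterior samples that fall below the injected value. If the analysis is unbiased, those fractions should be uniformly distributed between 0 and 1.

```python
import random


def pp_percentiles(num_injections, num_samples=500, sigma=1.0, seed=0):
    """Toy PP test: percentile of each injected value within its posterior.

    Assumes a N(0, 1) prior on the parameter and N(0, sigma^2) noise, so the
    posterior is Gaussian with a known mean and variance.
    """
    rng = random.Random(seed)
    post_std = (sigma**2 / (1.0 + sigma**2)) ** 0.5
    percentiles = []
    for _ in range(num_injections):
        injected = rng.gauss(0.0, 1.0)            # draw the injection from the prior
        data = injected + rng.gauss(0.0, sigma)   # add a noise realization
        post_mean = data / (1.0 + sigma**2)       # analytic posterior mean
        samples = [rng.gauss(post_mean, post_std) for _ in range(num_samples)]
        below = sum(s < injected for s in samples)
        percentiles.append(below / num_samples)
    return percentiles


def ks_distance_from_uniform(percentiles):
    """Largest gap between the empirical CDF and the uniform CDF on [0, 1]."""
    xs = sorted(percentiles)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


if __name__ == "__main__":
    pcts = pp_percentiles(200)
    print("KS distance from uniform:", ks_distance_from_uniform(pcts))
```

With only 10 injections the empirical distribution of percentiles is too coarse to say much, which is why a larger number is suggested for a full test.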

Get the inference configuration files

We can use the same prior, model, and sampler configuration files as used in the pycbc_make_inference_workflow example. However, instead of analyzing O1 or O2 data, we will create fake Gaussian noise. To do that, we will use the data.ini file used for the BBH simulation example.

Set up the workflow configuration file

As discussed above, we can use the same workflow configuration file as used in the dynesty example in the standard workflow. We need to create an injections_config.ini file to go along with the workflow_config.ini:

[workflow-inference]
; The inference configuration file(s) and any overrides to use for all of the
; injections.
; If no injection file is provided on the command line, injections will be
; drawn from the prior specified in the inference config file
config-files = bbh-uniform_comoving_volume.ini
               marginalized_phase.ini
               dynesty.ini
               data.ini
; As with events sections in the standard workflow, you can specify
; config-overrides for the above file(s). Here, we will change the prior from
; uniform in comoving volume to uniform in the log10 of the comoving volume.
; We'll do this so as to get a distribution of injections with appreciable
; SNR. (A uniform in comoving volume prior leads to most of the injections
; having SNR < 7, and so we mostly end up seeing prior-in, prior-out.)
config-overrides = prior-comoving_volume:name:uniform_log10
; Optionally, you may also specify the number of times to run inference on
; each injection by setting nruns. Each run will use different random seeds,
; and samples from the runs will be combined into a single posterior file for
; each injection. Not setting this is equivalent to nruns = 1
;nruns = 1

; For the injection workflow, we need to add an executable to create the
; injections. Optionally, we may also add executables to perform
; percentile-percentile (pp) tests.
[executables]
create_injections = ${which:pycbc_create_injections}
; Executables for percentile-percentile test. These are optional. If you do
; not include them in this section, no PP test will be done. If you do
; include them, all 3 must be included.
pp_table_summary = ${which:pycbc_inference_pp_table_summary}
plot_pp = ${which:pycbc_inference_plot_pp}
inj_recovery = ${which:pycbc_inference_plot_inj_recovery}
; We do not need to provide any of the other executables since they are
; specified in workflow_config.ini. When this file is combined with
; workflow_config.ini, these options are automatically added to the
; [executables] section in that file.


[workflow-pp_test]
; Since we have included the executables to make the PP plots, we need to
; provide this section.
; The pp-params option specifies what parameters to perform the percentile-
; percentile test on. If you do not provide anything, all parameters
; in the posterior file will be used (that is set by the parameters
; argument in the [extract_posterior] section in workflow_config.ini). A
; p-value of p-values will be calculated for all parameters and reported
; in the summary table on the Percentile-Percentile tab of the results
; page. We therefore do not want to include all parameters in the posterior
; file, since we have added parameters that are derived from the others in
; [extract_posterior] section. For this reason, we manually list all the
; parameters we want to do the pp test on here:
pp-params = delta_tc srcmass1 srcmass2 spin1_a spin1_azimuthal spin1_polar
            spin2_a spin2_azimuthal spin2_polar distance inclination
            polarization ra dec
; In order to do the PP test, the code needs to know what parameters in the
; posterior correspond to which injection parameters. Since we have applied
; some functions to the samples parameters when creating the posterior file
; (again, refer to the [extract_posterior] section in workflow_config.ini),
; the mapping between posterior parameters and injection parameters is no
; longer 1:1. To tell the code how to map from the injection parameters to
; the posterior parameters, we provide the following injection-samples-map.
; We can just copy most of the parameters argument from the [extract_posterior]
; section for this. (We can't just use a reference because the wildcard (*)
; that is there is not understood by the injection-samples-map option.)
injection-samples-map = 'primary_mass(srcmass1, srcmass2):srcmass1'
             'secondary_mass(srcmass1, srcmass2):srcmass2'
             'primary_spin(srcmass1, srcmass2, spin1_a, spin2_a):spin1_a'
             'primary_spin(srcmass1, srcmass2, spin1_azimuthal, spin2_azimuthal):spin1_azimuthal'
             'primary_spin(srcmass1, srcmass2, spin1_polar, spin2_polar):spin1_polar'
             'secondary_spin(srcmass1, srcmass2, spin1_a, spin2_a):spin2_a'
             'secondary_spin(srcmass1, srcmass2, spin1_azimuthal, spin2_azimuthal):spin2_azimuthal'
             'secondary_spin(srcmass1, srcmass2, spin1_polar, spin2_polar):spin2_polar'
             'mchirp_from_mass1_mass2(srcmass1, srcmass2):srcmchirp'
             'chi_eff_from_spherical(srcmass1, srcmass2, spin1_a, spin1_polar, spin2_a, spin2_polar):chi_eff'
             'chi_p_from_spherical(srcmass1, srcmass2, spin1_a, spin1_azimuthal, spin1_polar, spin2_a, spin2_azimuthal, spin2_polar):chi_p'
             'redshift_from_comoving_volume(comoving_volume):redshift'
             'distance_from_comoving_volume(comoving_volume):distance'
; Notice that we can provide more parameters to the injection-samples-map than
; we will be using in the PP test. This is fine; extra parameters are
; just ignored. By providing all of the parameters here, we can re-use this
; argument for the posterior plots (see below).

[create_injections]
; Options for the create_injections executable. Do not provide a config file
; or the number of injections to create here. The inference
; config file is used for generating injections, and the number is determined
; by the command-line options given to pycbc_make_inference_inj_workflow.

[plot_posterior_summary]
; Adding plot-injection-parameters will cause a red line to be plotted on
; the posterior plots showing the injection parameters.
plot-injection-parameters =
; In order for the red line to be plotted in the right place, we have to
; provide an injection samples map. We can just use what was used in the
; workflow-pp_test section.
injection-samples-map = ${workflow-pp_test|injection-samples-map}
; We do not need to provide any other arguments, as the rest are set in
; workflow_config.ini.

[plot_posterior]
; Do the same for the full corner plots.
plot-injection-parameters =
injection-samples-map = ${workflow-pp_test|injection-samples-map}
; We do not need to provide any other arguments, as the rest are set in
; workflow_config.ini.

[pp_table_summary]
; command line options for percentile-percentile table summary
; do not provide parameters or injection-samples map here, as that is read
; from the [workflow-pp_test] section

[plot_pp]
; command line options for percentile-percentile plot
; do not provide parameters or injection-samples map here, as that is read
; from the [workflow-pp_test] section

[inj_recovery]
; command line options for injection recovery plots
; do not provide parameters or injection-samples map here, as that is read
; from the [workflow-pp_test] section
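
Two rules stated in the comments above are easy to get wrong: the three PP executables must be given all together or not at all, and a [workflow-pp_test] section is needed when they are present. The sketch below checks both before the workflow is generated. It is only an illustration: it uses Python's standard configparser as a stand-in for PyCBC's own configuration parser (interpolation is turned off, so values such as ${which:...} are left untouched), and the function name check_pp_config is hypothetical.

```python
import configparser

# The three executables that the PP test needs, as listed in [executables].
PP_EXECUTABLES = ("pp_table_summary", "plot_pp", "inj_recovery")


def check_pp_config(config_text):
    """Return a list of problems found in the combined workflow configuration.

    config_text should hold the contents of workflow_config.ini followed by
    injections_config.ini. strict=False lets repeated sections (for example
    [executables] in both files) be merged instead of raising an error.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    problems = []
    if parser.has_section("executables"):
        executables = parser["executables"]
    else:
        executables = {}
    if "create_injections" not in executables:
        problems.append("missing create_injections in [executables]")
    present = [name for name in PP_EXECUTABLES if name in executables]
    if present and len(present) != len(PP_EXECUTABLES):
        missing = [name for name in PP_EXECUTABLES if name not in present]
        problems.append("PP test needs all three executables; missing: "
                        + ", ".join(missing))
    if len(present) == len(PP_EXECUTABLES) and not parser.has_section(
            "workflow-pp_test"):
        problems.append("PP executables given but no [workflow-pp_test] section")
    return problems


if __name__ == "__main__":
    text = ""
    for name in ("workflow_config.ini", "injections_config.ini"):
        with open(name) as handle:
            text += handle.read() + "\n"
    for problem in check_pp_config(text):
        print("WARNING:", problem)
```

An empty list means the configuration satisfies these two rules; it does not validate anything else in the files.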

Generate the workflow

Assuming that you have downloaded all of the configuration files to the same directory, you can generate the workflow by running the following script:

set -e

WORKFLOW_NAME=bbh_injections-dynesty
# Set the HTML_DIR to point to your public html page. This is where the results
# page will be written.
HTML_DIR=''
if [ "${HTML_DIR}" == '' ]; then
    echo "Please set an HTML_DIR"
    exit 1
fi
SEED=983124
# Set the number of injections to create. For a full PP test, we suggest using
# 100.
NINJ=10
pycbc_make_inference_inj_workflow \
    --seed ${SEED} \
    --num-injections ${NINJ} \
    --config-files workflow_config.ini injections_config.ini \
    --workflow-name ${WORKFLOW_NAME} \
    --config-overrides results_page:output-path:${HTML_DIR}/${WORKFLOW_NAME}

Note that you need to set the HTML_DIR before running. This tells the workflow where to save the results page when done. You can also change WORKFLOW_NAME if you like.
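
If you prefer to drive the generator from Python, the same command line can be assembled programmatically. The sketch below is one possible wrapper, not part of PyCBC: the function name build_workflow_command is hypothetical, and it uses only the options that appear in the script above plus --injection-file from the introduction. It enforces that exactly one of the two injection options is given. Whether any other option should change when an injection file is used is not covered here.

```python
def build_workflow_command(workflow_name, html_dir, seed,
                           num_injections=None, injection_file=None,
                           config_files=("workflow_config.ini",
                                         "injections_config.ini")):
    """Assemble the argument list for pycbc_make_inference_inj_workflow."""
    if not html_dir:
        raise ValueError("html_dir must be set")
    # Exactly one of the two ways of specifying injections must be chosen.
    if (num_injections is None) == (injection_file is None):
        raise ValueError("give exactly one of num_injections or injection_file")
    cmd = ["pycbc_make_inference_inj_workflow", "--seed", str(seed)]
    if num_injections is not None:
        cmd += ["--num-injections", str(num_injections)]
    else:
        cmd += ["--injection-file", injection_file]
    cmd += ["--config-files", *config_files]
    cmd += ["--workflow-name", workflow_name]
    cmd += ["--config-overrides",
            "results_page:output-path:{}/{}".format(html_dir, workflow_name)]
    return cmd


if __name__ == "__main__":
    # The HTML directory below is a placeholder; replace it with your own.
    command = build_workflow_command("bbh_injections-dynesty",
                                     "/path/to/public_html", 983124,
                                     num_injections=10)
    print(" ".join(command))
    # To actually run it: subprocess.run(command, check=True)
```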

You should also change the SEED every time you create a different workflow. This sets the seed that is passed to pycbc_inference (you set it here because it will be incremented for every pycbc_inference job that will be run in the workflow).
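
To see why nearby seeds can be a problem, consider the following sketch. It assumes, purely for illustration, that each pycbc_inference job receives the base seed plus its job index; the actual increment scheme is not specified here, and both function names are hypothetical. Under that assumption, two workflows whose base seeds differ by less than the number of jobs would reuse some seeds.

```python
def job_seeds(base_seed, num_jobs):
    """Hypothetical scheme: job i is given the seed base_seed + i."""
    return [base_seed + i for i in range(num_jobs)]


def seeds_overlap(seed_a, seed_b, num_jobs):
    """True if two workflows with num_jobs jobs each would share any seed."""
    return bool(set(job_seeds(seed_a, num_jobs)) & set(job_seeds(seed_b, num_jobs)))


if __name__ == "__main__":
    # Base seeds 6 apart with 10 jobs each: some seeds are shared.
    print(seeds_overlap(983124, 983130, 10))
    # Base seeds 10 apart with 10 jobs each: no seed is shared.
    print(seeds_overlap(983124, 983134, 10))
```

Choosing base seeds that are far apart sidesteps the question of how the increment is done.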

After the workflow generator has finished, it will have created a directory named ${WORKFLOW_NAME}-output. This contains the dax and all necessary files to run the workflow.

Plan and execute the workflow

Change directory into the ${WORKFLOW_NAME}-output directory:

cd ${WORKFLOW_NAME}-output

If you are on the ATLAS cluster (at AEI Hannover) or on an LDG cluster, you need to define an accounting group tag (talk to your cluster admins if you do not know what this is). Once you know what accounting-group tag to use, plan and submit the workflow with:

# submit workflow
pycbc_submit_dax --dax ${WORKFLOW_NAME}.dax \
    --no-grid \
    --no-create-proxy \
    --enable-shared-filesystem \
    --accounting-group ${ACCOUNTING_GROUP}

Here, ${ACCOUNTING_GROUP} is the appropriate tag for your workflow.

Once it is running, you can monitor the status of the workflow by running ./status from within the ${WORKFLOW_NAME}-output directory. If your workflow fails for any reason, you can see what caused the failure by running ./debug. If you need to stop the workflow at any point, run ./stop. To resume a workflow, run ./start. If the pycbc_inference jobs were still running, and they had checkpointed, they will resume from their last checkpoint upon restart.

Results page

When the workflow has completed successfully, it will write out the results page to the directory you specified in the create_inj_workflow.sh script. You can see what the results page will look like here.