.. _workflowsplittablemod:

########################################
The workflow table splitting module
########################################

==============
Introduction
==============

This module is used when you want to split a file into multiple parts, normally
to enable analysis to proceed in parallel. The most common example is splitting
the list of templates output by a template bank generation code, so that a set
of matched-filter jobs can analyse that bank in parallel. If you want to do
something similar, this module is the place to do it.

The table splitting module returns a pycbc FileList of the split files that it
generates.

=======
Usage
=======

Using this module requires a number of things:

* A configuration file (or files) containing the information needed to tell this module how to split the input files (described below).
* An initialized instance of the pycbc Workflow class, containing the ConfigParser.
* A FileList of the files that are to be split.

This module is then called according to:

.. autofunction:: pycbc.workflow.setup_splittable_workflow
    :noindex:

-------------------------
Configuration file setup
-------------------------

Here we describe the configuration file options that the workflow needs for
this module.

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
[workflow-splittable] section
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

The configuration file must have a [workflow-splittable] section, which is used
to tell the workflow how to construct the split output files. The first option
to choose and provide is

* splittable-method = VALUE

The available choices are:

* IN_WORKFLOW - The file splitting jobs are added as jobs in the workflow, so the split files are generated after the workflow is submitted.
* NOOP - Do nothing and return the input file list.
  It is better not to call the module at all if you do not want to split files, but NOOP can be useful if you want to reuse an existing script and do not need the splittable functionality.

When using IN_WORKFLOW the following additional option is needed:

* splittable-num-banks = VALUE - Specifies how many parts to split each input file into.

$$$$$$$$$$$$$$$
[executables]
$$$$$$$$$$$$$$$

If you are not using NOOP, you need to supply the executable that will be used
to split the input files. This is done in the [executables] section by adding
something like:

splittable = /path/to/pycbc_splitbank

The option name, in this case 'splittable', identifies the section that holds
the constant command line options sent to all pycbc_splitbank jobs. These
options must be placed in a section called [splittable]; the options themselves
are discussed below. The tag 'splittable' cannot currently be changed.

**FIXME: Tag support is not yet present in splittable, the following is currently untrue, but should be fixed.**
As with other modules, tagged sub-sections [splittable-TAG] and
[workflow-splittable-TAG] are supported if this module needs to be run in
different configurations.

-------------------------------------------------------------------
Supported splittable executables and instructions for using them
-------------------------------------------------------------------

The following splittable executables are currently supported:

* pycbc_splitbank
* lalapps_splitbank - **NOTE**: The output of this code can be unpredictable, or broken. We strongly recommend using pycbc_splitbank. For this reason we do not give any further details about running this code.

Adding a new executable is not too hard; please ask a developer for some
pointers if you want to add a new code.

$$$$$$$$$$$$$$$$$
pycbc_splitbank
$$$$$$$$$$$$$$$$$

pycbc_splitbank is a pycbc Python code that can be used for splitting any table in an input XML file.
Normally this splits the sngl_inspiral table that holds the template bank.
The help message for pycbc_splitbank is as follows:

.. command-output:: pycbc_splitbank --help

An example of a pycbc_splitbank call is given below:

.. code-block:: bash

    /home/spxiwh/lscsoft_git/executables_master/bin/pycbc_splitbank \
        --random-sort \
        --bank-file /home/spxiwh/lscsoft_git/src/pycbc/examples/ahope/weekly_ahope/961585543-961671944/datafind/H1-TMPLTBANK-961585551-2048.xml.gz \
        --output-filenames \
        /home/spxiwh/lscsoft_git/src/pycbc/examples/ahope/weekly_ahope/961585543-961671944/datafind/H1-TMPLTBANK_SPLITTABLE_BANK0-961585551-2048.xml.gz \
        /home/spxiwh/lscsoft_git/src/pycbc/examples/ahope/weekly_ahope/961585543-961671944/datafind/H1-TMPLTBANK_SPLITTABLE_BANK1-961585551-2048.xml.gz \
        /home/spxiwh/lscsoft_git/src/pycbc/examples/ahope/weekly_ahope/961585543-961671944/datafind/H1-TMPLTBANK_SPLITTABLE_BANK2-961585551-2048.xml.gz \
        /home/spxiwh/lscsoft_git/src/pycbc/examples/ahope/weekly_ahope/961585543-961671944/datafind/H1-TMPLTBANK_SPLITTABLE_BANK3-961585551-2048.xml.gz \
        /home/spxiwh/lscsoft_git/src/pycbc/examples/ahope/weekly_ahope/961585543-961671944/datafind/H1-TMPLTBANK_SPLITTABLE_BANK4-961585551-2048.xml.gz

The following options are added by the workflow module and **must not** be
provided in the configuration file:

* --bank-file
* --output-filenames

============================================
:mod:`pycbc.workflow.splittable` Module
============================================

This is the complete documentation of this module's code.

.. automodule:: pycbc.workflow.splittable
    :noindex:
    :members:
    :undoc-members:
    :show-inheritance:
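
========================
Example configuration
========================

As a recap of the configuration options described on this page, the fragment
below sketches what the relevant sections of a configuration file might look
like. It is an illustration only, not a tested configuration. The executable
path is the placeholder used earlier on this page, and the value of 5 for
splittable-num-banks is chosen simply to match the five output files in the
example pycbc_splitbank call. We also assume that a flag taking no argument,
such as --random-sort, is written as an option with an empty value in the
[splittable] section.

.. code-block:: ini

    [workflow-splittable]
    ; Add the file splitting jobs to the workflow
    splittable-method = IN_WORKFLOW
    ; Split each input file into 5 parts (illustrative value)
    splittable-num-banks = 5

    [executables]
    ; Placeholder path, point this at your own installation
    splittable = /path/to/pycbc_splitbank

    [splittable]
    ; Constant options sent to every pycbc_splitbank job.
    ; Assumed syntax: a flag with no argument is given an empty value.
    random-sort =
    ; bank-file and output-filenames are deliberately absent:
    ; the workflow module adds them itself.

With splittable-method set to NOOP instead, only the [workflow-splittable]
section would be needed, since no executable is run.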