slurmpie package¶
Submodules¶
slurmpie.slurmpie module¶
class slurmpie.slurmpie.Job(script, script_is_file=True, array=[], cpus_per_task=-1, error_file='', gpus={}, gres={}, logfile_directory='', mail_address='', mail_type='', memory_size='', name='', nodes=-1, nodelist='', output_file='', partition='', tasks=-1, time='', workdir='')[source]¶

Bases: object
__init__(script, script_is_file=True, array=[], cpus_per_task=-1, error_file='', gpus={}, gres={}, logfile_directory='', mail_address='', mail_type='', memory_size='', name='', nodes=-1, nodelist='', output_file='', partition='', tasks=-1, time='', workdir='')[source]¶

A SLURM job which is submitted using sbatch.
- Parameters
script (str) – The script file or command which the job should execute.
script_is_file (bool) – If the script string is a command to execute directly instead of a bash script, set this to False. Defaults to True.
array (list or str, optional) – Optional array parameters to launch multiple jobs. When a list is provided, a job is executed with each parameter in the list. A string can be provided instead to specify the array directly in the SLURM format.
cpus_per_task (int, optional) – Number of cpus for each task.
error_file (str, optional) – File path for the slurm error file.
gpus (dict, optional) – Specify the gpu requirements for the job. See also gres.
gres (dict, optional) – Specify the gres requirements for the job. See slurmpie.slurmpie.Job.gres() for the full specification.
logfile_directory (str, optional) – Set a base directory for the output and error files. If this is set, the full paths do not have to be specified for error_file and output_file.
mail_address (str, optional) – Mail address to send notifications to.
mail_type (str, optional) – Specify for which events a notification should be sent. One of: NONE, BEGIN, END, FAIL, REQUEUE, ALL.
memory_size (str or int) – Specify the memory requirement for the job. See slurmpie.slurmpie.Job.memory_size() for the specification.
name (str, optional) – The name of the job.
nodes (int, optional) – Number of nodes to use for the job.
nodelist (str, optional) – Request specific host nodes for job.
output_file (str, optional) – File path for the slurm output file.
partition (str, optional) – Name of the partition to which to submit the job.
tasks (int, optional) – Number of tasks.
time (str, optional) – The expected/maximum wall time for the job. Needs to be specified in the SLURM format, one of: “minutes”, “minutes:seconds”, “hours:minutes:seconds”, “days-hours”, “days-hours:minutes”, or “days-hours:minutes:seconds”.
workdir (str, optional) – The directory to change to at the start of job execution.
- Raises
RuntimeError – If the job could not be successfully executed
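As an illustration of how these keyword arguments correspond to standard sbatch options, the sketch below builds the #SBATCH header lines for a few of them. The helper name, the option subset, and the sentinel handling are assumptions for illustration only, not slurmpie's actual implementation.

```python
# Hypothetical sketch: map a few Job keyword arguments onto #SBATCH header
# lines, skipping options left at their empty/sentinel defaults.
def sbatch_header(name="", partition="", tasks=-1, cpus_per_task=-1,
                  memory_size="", time="", output_file="", error_file=""):
    """Build the #SBATCH header lines for the non-empty options."""
    flags = [
        ("--job-name", name),
        ("--partition", partition),
        ("--ntasks", tasks if tasks != -1 else ""),
        ("--cpus-per-task", cpus_per_task if cpus_per_task != -1 else ""),
        ("--mem", memory_size),
        ("--time", time),
        ("--output", output_file),
        ("--error", error_file),
    ]
    return [f"#SBATCH {flag}={value}" for flag, value in flags if value != ""]

header = sbatch_header(name="train", partition="gpu", tasks=1, time="1-12:00:00")
# Only the four options that were set appear in the header.
```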
property array¶

The values of the job array.

SLURM supports submitting the same script with different parameters. Set the array either as a list of values or as a string. When specified as a string, it is passed directly to SLURM, so the string should already be in the SLURM format. When specified as a list, all the values in the list are passed to the array setting.
- Return type
str
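The two accepted forms can be sketched as follows. The helper is hypothetical, and the comma-joined output is an assumed mapping of a value list onto SLURM's array syntax, not slurmpie's own code.

```python
# Illustrative sketch: a list of array values joined into the comma-separated
# form SLURM accepts, while a string such as "0-15%4" is assumed to already
# be in SLURM format and is passed through unchanged.
def format_array(array):
    if isinstance(array, str):
        return array  # already in SLURM format
    return ",".join(str(value) for value in array)

assert format_array([0, 1, 2, 5]) == "0,1,2,5"
assert format_array("0-15%4") == "0-15%4"
```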
static attribute_is_empty(attribute_value)[source]¶

Checks whether an attribute is empty.
- Parameters
attribute_value (str, numbers.Number, dict, or list) – The value to check.
- Returns
True if the attribute is empty, False otherwise.
- Return type
bool
depends_on(job_id, dependency_type='afterany')[source]¶

Sets the dependencies of this job based on the SLURM job number.
When submitting a job that depends on another job this can be set using the job id of the job.
Example
>>> from slurmpie import slurmpie
>>> job = slurmpie.Job("slurm_script.sh")
>>> dependent_job = slurmpie.Job("slurm_script_2.sh")
>>> job_id = job.submit()
>>> dependent_job.depends_on(job_id)
>>> dependent_job.submit()
The dependent_job will now only start running when job has finished.
- Parameters
job_id (list or str) – The job id (or multiple ids as a list) on which the job depends.
dependency_type (str, optional) – The dependency type of the job (see sbatch documentation). Defaults to “afterany”.
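Underneath, a dependency like this maps onto sbatch's --dependency option. The sketch below shows that mapping for one or several job ids; the helper is hypothetical and not slurmpie's own implementation.

```python
# Illustrative sketch: translate a job id (or list of ids) plus a dependency
# type into the sbatch --dependency flag, e.g. "--dependency=afterany:1234".
def dependency_flag(job_id, dependency_type="afterany"):
    ids = [job_id] if isinstance(job_id, str) else list(job_id)
    return f"--dependency={dependency_type}:" + ":".join(ids)

assert dependency_flag("1234") == "--dependency=afterany:1234"
assert dependency_flag(["1234", "5678"], "afterok") == "--dependency=afterok:1234:5678"
```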
property gpus¶

The gpus to request for the SLURM job.
Just like the gres resources, one can request the gpu resources for the job. This is configuration dependent, so make sure your cluster supports this. The configuration has to be applied in the same way as for gres.
- Return type
str
property gres¶

The gres resources to request for the job.
The gres resources should be formatted as a (possibly nested) dict. For example, job.gres = {"gpu": 1} requests one gpu from gres, and job.gres = {"gpu": {"k40": 1, "k80": 1}} requests one k40 gpu and one k80 gpu.
- Return type
str
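The dict-to-string mapping described above can be sketched as follows. The exact string slurmpie generates is an assumption here; the helper only illustrates how a flat or nested dict could flatten into SLURM's gres syntax.

```python
# Illustrative sketch: flatten a (possibly nested) gres dict into SLURM's
# "name:count" / "name:type:count" comma-separated form.
def format_gres(gres):
    parts = []
    for resource, value in gres.items():
        if isinstance(value, dict):
            # Nested dict: one entry per resource subtype, e.g. gpu:k40:1.
            for subtype, count in value.items():
                parts.append(f"{resource}:{subtype}:{count}")
        else:
            parts.append(f"{resource}:{value}")
    return ",".join(parts)

assert format_gres({"gpu": 1}) == "gpu:1"
assert format_gres({"gpu": {"k40": 1, "k80": 1}}) == "gpu:k40:1,gpu:k80:1"
```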
property memory_size¶

The memory size to request for the job.
Memory size can be set as a float, in which case the default memory units of the SLURM configuration are used. Otherwise, the memory size can be specified as a string including the units; for example, "15GB" sets the requested memory to 15 GB. Supported memory units are K/KB for kilobyte, M/MB for megabyte, G/GB for gigabyte, and T/TB for terabyte.
- Return type
str
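The two documented forms can be separated into an amount and a unit as in the sketch below. The helper is hypothetical, shown only to make the accepted inputs concrete; slurmpie's internal handling may differ.

```python
import numbers

# Illustrative sketch: split a memory_size value into (amount, unit).
# A bare number keeps the cluster's default units (empty unit string),
# while a string such as "15GB" carries its own unit suffix.
def split_memory_size(memory_size):
    if isinstance(memory_size, numbers.Number):
        return memory_size, ""  # default SLURM units
    unit = memory_size.lstrip("0123456789.")
    amount = memory_size[: len(memory_size) - len(unit)]
    return float(amount), unit

assert split_memory_size("15GB") == (15.0, "GB")
assert split_memory_size(512) == (512, "")
```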
property memory_units¶

The current memory units.
- Return type
str
property nodelist¶

List of nodes to run the job on.
When you want to run a job on a specific node (that is not specified by a queue), you can use this argument to specify the exact nodes you want to run the job on.
- Return type
str
class slurmpie.slurmpie.Pipeline(common_job_header=None, **kwargs)[source]¶

Bases: object
Examples
Simple pipeline in which jobs are added consecutively.
from slurmpie import slurmpie

pipeline = slurmpie.Pipeline()

start_job = slurmpie.Job("slurm_script.sh")
dependent_job = slurmpie.Job("slurm_script_2.sh")

pipeline.add(start_job)
# This job will wait for start_job to finish
pipeline.add(dependent_job)

pipeline.submit()
__init__(common_job_header=None, **kwargs)[source]¶

Pipeline to be constructed with multiple jobs depending on each other.
This pipeline makes it easier to create multiple jobs that depend on each other and submit them all at once. Jobs can be added to the pipeline with different dependencies, and the pipeline can then be submitted as a whole, which takes care of the dependencies between the different jobs.
- Parameters
common_job_header (str) – In case the command to execute is a direct command (not a file), this will be prepended to every job.
kwargs – Arguments that will be specified for each job, if that argument has not been set for the job already.
add(jobs, parent_job=None)[source]¶

Add dependency jobs to the pipeline.
Jobs can keep being added, in which case they are executed consecutively. Otherwise, a dict can be specified mapping a dependency type to the list of jobs with that dependency type. A parent job can be set if the jobs depend on a certain parent job; otherwise the jobs are added to the end of the list and executed consecutively.
- Parameters
jobs (Job or dict) – The jobs to add to the pipeline. Either a single job, which will be added to the end of the pipeline, or a dict specifying the dependency type and a list with the dependent jobs.
parent_job (Job or list, optional) – If not None, will use this as the job on which the jobs are dependent, a list in case of multiple parent jobs. Defaults to None.
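The dict form described above can be sketched as follows. Plain strings stand in for Job objects, and the resolution logic is a hypothetical illustration of how a pipeline could turn dependency types into per-job --dependency flags at submit time; slurmpie's real implementation may differ.

```python
# Hypothetical sketch: given a parent job id and a dict mapping dependency
# types to lists of jobs, compute the --dependency flag each job would need.
def resolve_dependencies(parent_id, jobs_by_type):
    flags = {}
    for dependency_type, jobs in jobs_by_type.items():
        for job in jobs:
            flags[job] = f"--dependency={dependency_type}:{parent_id}"
    return flags

flags = resolve_dependencies(
    "1234",
    {"afterok": ["analysis.sh"], "afternotok": ["cleanup.sh"]},
)
# analysis.sh runs only if job 1234 succeeds; cleanup.sh only if it fails.
```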