xqute.schedulers.slurm_scheduler
module
xqute.schedulers.slurm_scheduler
The scheduler to run jobs on Slurm
Classes
SlurmJob — Slurm job
SlurmScheduler — The Slurm scheduler
class
xqute.schedulers.slurm_scheduler.SlurmJob(index, cmd, metadir=PosixPath('.xqute'), error_retry=None, num_retries=None)
Bases
Slurm job
Parameters
index (int) — The index of the job
cmd (str | List[str]) — The command of the job
metadir (PathLike, optional) — The meta directory of the job
error_retry (Optional[bool], optional) — Whether we should retry if an error happened
num_retries (Optional[int], optional) — Total number of retries
Attributes
CMD_WRAPPER_SHELL — The shell to run the wrapped script
CMD_WRAPPER_TEMPLATE — The template for job wrapping
_error_retry — Whether we should retry if an error happened
_num_retries — Total number of retries
_rc — The return code of the job
_status — The status of the job
_wrapped_cmd — The wrapped cmd, used for job submission
cmd — The command
hook_done — Marks whether the hooks have already been called. Since there is no trigger for a job finishing or failing, the status is polled; this flag prevents the hooks from being called repeatedly
index — The index of the job
jid (int | str | None) — The jid of the job in the scheduler system
jid_file (Path) — The jid file of the job
metadir — The metadir of the job
rc (int) — The return code of the job
rc_file (Path) — The rc file of the job
retry_dir (Path) — The retry directory of the job
status (int) — Query the status of the job. If the job is submitted, try to query it from the status file. Make sure the status is updated by the trap in the wrapped script
status_file (Path) — The status file of the job
stderr_file (Path) — The stderr file of the job
stdout_file (Path) — The stdout file of the job
strcmd (str) — Get the string representation of the command
trial_count — The count of retries
method
wrapped_script(scheduler)
Get the wrapped script
Parameters
scheduler (Scheduler) — The scheduler
Returns (PathLike)
The path of the wrapped script
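The attributes above mention a wrapper template and a trap that keeps the job's state up to date. As a rough illustration only, the sketch below renders a wrapper script whose `EXIT` trap writes the return code to an rc file, then returns the script path. The template text, the file names (`job.wrapped.sh`, `job.rc`) and the function `write_wrapped_script` are all hypothetical; they are not taken from xqute.

```python
import shlex
import tempfile
from pathlib import Path
from typing import Sequence

# Hypothetical template for illustration; not the one shipped with xqute.
WRAPPER_TEMPLATE = """#!/usr/bin/env bash
rc_file={rc_file}
# Record the exit code whenever the script exits, however it exits.
trap 'echo $? > "$rc_file"' EXIT
{cmd}
"""


def write_wrapped_script(metadir: Path, index: int, cmd: Sequence[str]) -> Path:
    """Render the wrapper for one job and return the path of the script."""
    jobdir = metadir / str(index)  # assumed layout: one directory per job index
    jobdir.mkdir(parents=True, exist_ok=True)
    script = jobdir / "job.wrapped.sh"
    script.write_text(
        WRAPPER_TEMPLATE.format(
            rc_file=shlex.quote(str(jobdir / "job.rc")),
            cmd=shlex.join(cmd),  # quote each argument for the shell
        )
    )
    return script


demo_metadir = Path(tempfile.mkdtemp())
demo_script = write_wrapped_script(demo_metadir, 0, ["echo", "hello world"])
```

Writing the return code from a trap rather than after the command means the rc file is still produced when the command fails under `set -e` or the script exits early.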
class
xqute.schedulers.slurm_scheduler.SlurmScheduler(*args, **kwargs)
The Slurm scheduler
Attributes
job_class — The job class
name — The name of the scheduler
Parameters
**kwargs — Other arguments for the scheduler
Methods
job_is_running(job) (bool) — Tell whether a job is really running, rather than only checking job.jid_file
job_is_submitted_or_running(job) (bool) — Check if a job is already submitted or running
kill_job(job) — Kill a job on Slurm
kill_job_and_update_status(job) — Kill a job and update its status
kill_running_jobs(jobs) — Try to kill all running jobs
polling_jobs(jobs, on, halt_on_error) (bool) — Check if all jobs are done or new jobs can be submitted
retry_job(job) — Retry a job
submit_job(job) (str) — Submit a job to Slurm
submit_job_and_update_status(job) — Submit and update the status
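For orientation, here is one plausible way the submit, kill and running-check operations could map onto the Slurm command line. This is an assumption for illustration and not a description of what `SlurmScheduler` actually runs; the helper names are hypothetical, while `sbatch --parsable`, `scancel` and `squeue -h -j` are standard Slurm commands and options.

```python
from typing import List


def sbatch_command(script: str) -> List[str]:
    # --parsable makes sbatch print only the job id,
    # optionally followed by ";<cluster name>".
    return ["sbatch", "--parsable", script]


def parse_sbatch_output(stdout: str) -> str:
    # Keep the part before ";" so the cluster name, if any, is dropped.
    return stdout.strip().split(";", 1)[0]


def scancel_command(jid: str) -> List[str]:
    return ["scancel", str(jid)]


def squeue_command(jid: str) -> List[str]:
    # -h suppresses the header line, so an empty result suggests
    # the job is no longer in the queue.
    return ["squeue", "-h", "-j", str(jid)]
```

Returning argument lists instead of shell strings lets a caller pass them directly to `subprocess.run` or `asyncio.create_subprocess_exec` without extra quoting.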
method
submit_job_and_update_status(job)
Submit and update the status
- Check if the job is already submitted or running
- If not, run the hook
- If the hook is not cancelled, clean the job
- Submit the job, raising an exception if it fails
- If the job is submitted successfully, update the status
- If the job fails to submit, update the status and write stderr to the job file
Parameters
job (Job) — The job
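The steps listed for `submit_job_and_update_status` can be sketched as follows. This is a simplified, synchronous stand-in: `SketchJob`, the status strings, the `on_submitting` hook and the convention that a hook returning `False` cancels the submission are all assumptions made for the example, not the xqute API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchJob:
    """Hypothetical stand-in for a job; not the xqute Job class."""
    index: int
    status: str = "INIT"
    jid: Optional[str] = None
    stderr: str = ""


def submit_and_update(
    job: SketchJob,
    submit: Callable[[SketchJob], str],
    is_active: Callable[[SketchJob], bool],
    on_submitting: Callable[[SketchJob], Optional[bool]] = lambda job: True,
) -> None:
    # 1. Skip jobs that are already submitted or running.
    if is_active(job):
        return
    # 2. Run the hook; here a return value of False cancels the submission.
    if on_submitting(job) is False:
        return
    # 3. Clean leftovers from any previous attempt.
    job.jid = None
    job.stderr = ""
    try:
        # 4. Submit; the submitter raises if submission fails.
        job.jid = submit(job)
    except Exception as exc:
        # 6. On failure, update the status and record the error.
        job.status = "FAILED"
        job.stderr = f"Failed to submit job: {exc}"
    else:
        # 5. On success, update the status.
        job.status = "SUBMITTED"


def failing_submit(job: SketchJob) -> str:
    raise RuntimeError("sbatch not found")


ok_job, bad_job = SketchJob(0), SketchJob(1)
submit_and_update(ok_job, submit=lambda j: "101", is_active=lambda j: False)
submit_and_update(bad_job, submit=failing_submit, is_active=lambda j: False)
```

Catching the submission error inside the function is what lets the failure end up in the job's status and stderr instead of aborting the caller.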
method
kill_job_and_update_status(job)
Kill a job and update its status
Parameters
job (Job) — The job
method
polling_jobs(jobs, on, halt_on_error)
Check if all jobs are done or new jobs can be submitted
Parameters
jobs (List) — The list of jobs
on (str) — Which status to query: can_submit or all_done
halt_on_error (bool) — Whether we should halt the whole pipeline on error
Returns (bool)
True if the queried condition holds, otherwise False.
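To make the two query modes concrete, the sketch below shows one possible reading of them. The status names, the `forks` concurrency limit, and raising an exception when `halt_on_error` is set are assumptions for illustration; the documented method may define these conditions differently.

```python
from typing import Sequence

# Hypothetical status names, used only in this sketch.
ACTIVE = {"SUBMITTED", "RUNNING"}
DONE = {"FINISHED", "FAILED"}


def poll(statuses: Sequence[str], on: str, halt_on_error: bool, forks: int = 1) -> bool:
    """Answer one polling query over the current job statuses."""
    if halt_on_error and "FAILED" in statuses:
        # Assumed behaviour: stop the whole run as soon as a job has failed.
        raise RuntimeError("A job failed and halt_on_error is set")
    if on == "all_done":
        return all(status in DONE for status in statuses)
    if on == "can_submit":
        # Assumed meaning: fewer active jobs than the concurrency limit.
        return sum(status in ACTIVE for status in statuses) < forks
    raise ValueError(f"Unknown query: {on!r}")
```

A driver loop could call `poll(..., on="can_submit", ...)` before each submission and `poll(..., on="all_done", ...)` to decide when to stop.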
method
kill_running_jobs(jobs)
Try to kill all running jobs
Parameters
jobs (List) — The list of jobs
method
job_is_submitted_or_running(job)
Check if a job is already submitted or running
Parameters
job (Job) — The job
Returns (bool)
True if the job is already submitted or running, otherwise False.