If your pending job requests more run time than remains before a scheduled maintenance window begins, it will not start, and it may be deleted during the maintenance period. If your job can finish in less time, you can use the qalter command to reduce its time limit to less than the time remaining before the shutdown. Here’s how to do it.
- Find all the -l arguments that your job uses. At the shell prompt, enter:
qstat -j jobnumber | grep 'hard resource_list'
where jobnumber is your job number. It returns something like:
hard resource_list: h_data=4000M,h_rt=1209600,highp=true
- Use the qalter command to fix the job. At the shell prompt, enter:
qalter -l h_data=4000M,highp=true,h_rt=288:00:00 jobnumber
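where 288:00:00 is the new time limit (hours:minutes:seconds), which must be no larger than the maximum highp h_rt value. The other -l values are repeated unchanged from the qstat output.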
You can find the maximum highp h_rt value with:
qconf -sq highp-queue-name | grep h_rt
where highp-queue-name is one of the highp queues from
qconf -sql
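The qalter command above restates every -l value, so only the h_rt entry should change. The following is a minimal sketch of that edit, using the example resource list shown earlier. The function names secs_to_hms and replace_h_rt are made up for illustration and are not cluster commands.

```shell
# Hypothetical helpers (not part of Grid Engine or this cluster's tools).

# Convert a run time in seconds, as printed by qstat -j, to H:MM:SS.
secs_to_hms() {
    secs=$1
    printf '%d:%02d:%02d\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# Replace only the h_rt entry in a comma-separated resource list,
# leaving every other entry unchanged.
replace_h_rt() {
    printf '%s\n' "$1" | sed "s/h_rt=[^,]*/h_rt=$2/"
}

# The current limit from the example output, in hours:minutes:seconds.
secs_to_hms 1209600
# prints 336:00:00

# The -l argument to pass to qalter for a 288-hour limit.
replace_h_rt 'h_data=4000M,h_rt=1209600,highp=true' '288:00:00'
# prints h_data=4000M,h_rt=288:00:00,highp=true
```

The second line of output is the string you would pass after qalter -l, followed by the job number.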
You can find all of your pending (qw) job numbers with:
myjobs -s p
or,
qstat -s p -u $USER
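If you want only the job numbers, one per line, the qstat output can be filtered. The sketch below assumes the usual qstat layout, in which the job number is the first column and the header lines do not begin with a number. The function name job_ids is made up for illustration.

```shell
# Hypothetical helper: read qstat output on standard input and print each
# job number once. Only lines whose first field is all digits are kept,
# which skips the column header and the dashed separator line.
job_ids() {
    awk '$1 ~ /^[0-9]+$/ && !seen[$1]++ { print $1 }'
}

# Possible use on the cluster:
#   qstat -s p -u $USER | job_ids
```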
If you have a lot of jobs that need qalter
If you have a lot of jobs that need qalter, you can save a list of the job numbers, one per line, in a file named $USER.joblist and use the create-qalter-commands script to generate the qalter commands. The script does not run the qalter commands; it only creates them. You might use it like this:
- Make the job number list and run the script:
qstat -s p -u $USER > $USER.joblist
create-qalter-commands 288 > my.qalter.cmds
where 288 is your new time request in hours.
- Check the script output:
cat my.qalter.cmds
- Run the qalter commands:
sh -x my.qalter.cmds
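As an extra check before the run step above, you could confirm that the generated file contains nothing but qalter commands. This is only a sketch: it assumes every line of my.qalter.cmds begins with the word qalter, which may not match the script's actual output, and the function name check_qalter_cmds is made up for illustration.

```shell
# Hypothetical helper: print any line of the given file that does not start
# with "qalter ", and return a non-zero status if such a line exists.
check_qalter_cmds() {
    if grep -v '^qalter ' "$1"; then
        echo "unexpected lines found in $1" >&2
        return 1
    fi
}

# Possible use:
#   check_qalter_cmds my.qalter.cmds && sh -x my.qalter.cmds
```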
If you need assistance doing this, please contact user support.