Job Submission System - S P E C I F I C A T I O N

James Werner

Toolbox for Job Submission System.

Several utility scripts help you handle grid events and remove files and jobs that are no longer needed. There are five tools:

  1. easylist: lists all jobs in execution, files in SE/RLS, and files in NFS. If you have changed any NFS configuration, you need to change this script as well (lines 29, 31, and 33).

     1 #!/bin/bash
     2 # GRID submission script
     3 # Author: Dr James Cunha Werner
     4 # www.geocities.com/jamwer2002
     5 # University of Manchester
     6 #
     7 if ( ls *.tok>/dev/null 2>/dev/null ) then
     8   echo
     9   echo "Jobs in production:"
    10   echo
    11   ls *.tok
    12   echo
    13 else
    14   echo
    15   echo No jobs in production.
    16   echo
    17 fi
    18 if ( ls *.setok>/dev/null 2>/dev/null ) then
    19   echo
    20   echo Binaries stored in RLS/SE:
    21   echo
    22   ls *.setok
    23   echo
    24 else
    25   echo
    26   echo No binaries stored in RLS/SE.
    27   echo
    28 fi
    29 if ( ls /exp_software/babar01/$USER*>/dev/null 2>/dev/null ) then
    30   echo
    31   echo Binaries stored in /exp_software/babar01:
    32   echo
    33   ls /exp_software/babar01/$USER*
    34   echo
    35 else
    36   echo
    37   echo No binaries stored in /exp_software/babar01.
    38   echo
    39 fi

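
The NFS path appears in easylist on lines 29, 31, and 33, so a configuration change means three edits. One possible variation keeps the path in a single variable. This is a minimal sketch, not the original script: the names `NFS_DIR` and `report_files` are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: NFS_DIR and report_files are illustrative names, not part of easylist.
NFS_DIR="${NFS_DIR:-/exp_software/babar01}"   # default taken from the original script

# report_files HEADING EMPTY_MESSAGE PATTERN
# Lists the files matching the glob PATTERN under HEADING,
# or prints EMPTY_MESSAGE when nothing matches.
report_files() {
    echo
    # $3 is left unquoted on purpose so that the shell expands the glob
    if ls $3 > /dev/null 2>&1; then
        echo "$1:"
        echo
        ls $3
    else
        echo "$2"
    fi
    echo
}

# Example calls mirroring the three sections of easylist:
#   report_files "Jobs in production" "No jobs in production." "*.tok"
#   report_files "Binaries stored in RLS/SE" "No binaries stored in RLS/SE." "*.setok"
#   report_files "Binaries stored in $NFS_DIR" "No binaries stored in $NFS_DIR." "$NFS_DIR/$USER*"
```

With this layout, a change of NFS configuration would only touch the `NFS_DIR` line.
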
  2. easyresub: to resubmit a previously generated job, type ./easyresub dataset_name number. This command is important for recovering from crashes and aborts. Do not use easygrid in this case; otherwise you will process all files again.

    #!/bin/bash
    # GRID resubmission script
    # Author: Dr James Cunha Werner
    # www.geocities.com/jamwer2002
    # University of Manchester
    #
    # Usage: ./easyresub dataset_name number
    if [ $# -ne 2 ]
    then
      echo "Dear $USER,"
      echo "You should provide the dataset name and number you want to re-submit. For example,"
      echo
      echo " ./easyresub Tau11-Run3-OnPeak-R14 5 "
      echo
      echo "will submit the Tau11-Run3-OnPeak-R14-5 job again. The recovery process will remove previous results for this processing."
      echo
      echo "You can find more information here: http://www.hep.man.ac.uk/u/jamwer/#sec8"
      echo " User manual at http://www.hep.man.ac.uk/u/jamwer/userman.html"
      exit 1
    fi

    # Make sure a valid grid proxy exists before submitting.
    if ! grid-proxy-info -e; then
      echo "No valid proxy. Creating one valid for 10 days."
      grid-proxy-init -valid 240:00
    fi
    echo "Job Resubmission."
    # Submit the existing JDL file and keep the raw output in gridtokens.
    edg-job-submit --vo babar "$1-$2.jdl" > gridtokens
    # Append the job identifiers (second field of lines containing https) to the token file.
    awk '/https/ {print $2}' gridtokens >> "$1-$2.tok"
    # Record the resubmission in the dataset history file.
    echo >> "$1.histo"
    echo "--------------------------------------------------------------------------------" >> "$1.histo"
    echo " R E S U B M I S S I O N " >> "$1.histo"
    date >> "$1.histo"
    echo >> "$1.histo"
    cat gridtokens >> "$1.histo"
    # An empty token file means that no job identifier was returned.
    subok=$(wc -l < "$1-$2.tok")
    if [ "$subok" -eq 0 ]
    then
      echo "Resubmission ended with errors: see $1.histo file for details."
    fi

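
Because easyresub appends new identifiers to the token file, the final check also counts any identifiers left there by an earlier submission, so a failed resubmission may go unnoticed. A possible variation counts only the identifiers found in the fresh `gridtokens` file. The sketch below assumes, as the script's own awk filter implies, that identifiers are the second field of lines containing `https`; the function name `count_new_tokens` is hypothetical.

```shell
#!/bin/bash
# Sketch only: count_new_tokens is an illustrative helper, not part of easyresub.
# It prints the number of lines in the given submission log that contain
# "https" and have a non-empty second field (the job identifier).
count_new_tokens() {
    awk '/https/ && $2 != "" {n++} END {print n+0}' "$1"
}

# Possible use after the edg-job-submit call:
#   if [ "$(count_new_tokens gridtokens)" -eq 0 ]; then
#     echo "Resubmission ended with errors: see $1.histo file for details."
#   fi
```
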
  3. easykill: cancels a specific job with ./easykill dataset_name number

    #!/bin/bash
    # Script for unconditional removal of results
    # Author: Dr James Cunha Werner
    # www.geocities.com/jamwer2002
    # University of Manchester
    #
    # Usage: ./easykill dataset_name number
    if [ $# -ne 2 ]
    then
      echo "Dear $USER,"
      echo "You should provide the dataset name and number of the job you want to kill. For example,"
      echo
      echo " ./easykill Tau11-Run3-OnPeak-R14 5 "
      echo
      echo "will cancel the processing of Tau11-Run3-OnPeak-R14-5. To obtain a complete list of pending jobs, type:"
      echo
      echo " ./easylist "
      echo
      echo "You can find more information here: http://www.hep.man.ac.uk/u/jamwer/#sec8"
      echo " User manual at http://www.hep.man.ac.uk/u/jamwer/userman.html"
      exit 1
    fi

    # Make sure a valid grid proxy exists before cancelling.
    if ! grid-proxy-info -e; then
      echo "No valid proxy. Creating one valid for 10 days."
      grid-proxy-init -valid 240:00
    fi
    # Remove leftovers from a previous run.
    rm -f gridrec gridgetout
    # Generate one cancel command per token and run them as a script.
    awk '{print "edg-job-cancel --noint ",$1," >> gridgetout"}' "$1-$2.tok" >> gridrec
    chmod 700 gridrec
    ./gridrec
    # Record the cancellation in the dataset history file.
    echo "----------------------------------------------------------------------" >> "$1.histo"
    echo " J o b C a n c e l l i n g " >> "$1.histo"
    date >> "$1.histo"
    echo >> "$1.histo"
    cat gridgetout >> "$1.histo"
    echo
    cat gridgetout

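
easykill writes a temporary script (`gridrec`) with awk and then executes it. The same effect could be obtained by reading the token file in a loop. The following is a sketch under assumptions: `cancel_tokens` and `CANCEL_CMD` are hypothetical names, and the default is a dry run that only prints each command, so nothing is cancelled unless `CANCEL_CMD` is changed.

```shell
#!/bin/bash
# Sketch only: cancel_tokens and CANCEL_CMD are illustrative names.
# CANCEL_CMD defaults to a dry run that only prints the command line;
# set CANCEL_CMD="edg-job-cancel --noint" to perform the real cancellation.
CANCEL_CMD="${CANCEL_CMD:-echo edg-job-cancel --noint}"

# cancel_tokens TOKEN_FILE
# Runs CANCEL_CMD once per non-empty line, passing the first field
# of the line as the job identifier.
cancel_tokens() {
    while read -r token rest; do
        [ -n "$token" ] || continue
        # stdin is redirected so the command cannot consume the token file
        $CANCEL_CMD "$token" < /dev/null
    done < "$1"
}

# Possible use inside easykill:
#   cancel_tokens "$1-$2.tok" >> gridgetout
```

Unlike the awk version, this loop skips empty lines in the token file instead of emitting a cancel command without an identifier.
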
  4. easykall: kills all jobs from one dataset at once with ./easykall dataset_name

    #!/bin/bash
    # Script for unconditional removal of all jobs being processed for a dataset
    # Author: Dr James Cunha Werner
    # www.geocities.com/jamwer2002
    # University of Manchester
    #
    # Usage: ./easykall dataset_name
    if [ $# -ne 1 ]
    then
      echo "Dear $USER,"
      echo "You should provide the dataset name of the jobs you want to kill. For example,"
      echo
      echo " ./easykall Tau11-Run3-OnPeak-R14 "
      echo
      echo "will cancel ALL Tau11-Run3-OnPeak-R14 file processing. To obtain a complete list of pending jobs, type:"
      echo
      echo " ./easylist "
      echo
      echo "You can find more information here: http://www.hep.man.ac.uk/u/jamwer/#sec8"
      echo " User manual at http://www.hep.man.ac.uk/u/jamwer/userman.html"
      exit 1
    fi

    # Make sure a valid grid proxy exists before cancelling.
    if ! grid-proxy-info -e; then
      echo "No valid proxy. Creating one valid for 10 days."
      grid-proxy-init -valid 240:00
    fi

    # Remove leftovers from a previous run.
    rm -f gridkall gridkallout
    # Generate one cancel command per token of the dataset and run them as a script.
    cat "$1"*.tok | awk '{print "edg-job-cancel --noint ",$1," >> gridkallout"}' >> gridkall
    chmod 700 gridkall
    ./gridkall
    # The token files are no longer needed once the jobs are cancelled.
    rm "$1"*.tok
    # Record the cancellation in the dataset history file.
    echo "----------------------------------------------------------------------" >> "$1.histo"
    echo " T o t a l J o b C a n c e l l i n g " >> "$1.histo"
    cat gridkallout >> "$1.histo"
    echo

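
easykall selects token files with the pattern `$1*.tok`, which matches every file whose name starts with the given dataset name. If one dataset name is a prefix of another, both would be cancelled. A narrower selection is sketched below, assuming that token files are named `dataset_name-number.tok` as in easyresub; the helper name `dataset_token_files` is hypothetical.

```shell
#!/bin/bash
# Sketch only: dataset_token_files is an illustrative helper, not part of easykall.
# Lists the token files of one dataset, assuming they are named
# <dataset>-<number>.tok: the dataset name must be followed by "-" and a digit.
dataset_token_files() {
    for f in "$1"-[0-9]*.tok; do
        # when nothing matches, the unexpanded pattern is left in $f; skip it
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

For example, with the files Tau11-Run3-1.tok and Tau11-Run3-OnPeak-R14-5.tok in the current directory, `dataset_token_files Tau11-Run3` lists only the first one, whereas `Tau11-Run3*.tok` matches both.
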
  5. easybinrm: this is the last command to use. Once all jobs have finished and all of them are fine, you no longer need the binary code and can remove it with ./easybinrm binary_name. If you change your mind later, you will have to change the code (if you have changed it since your last submission). The binary name is obtained using easylist. Do not forget to change the SE name to that of the SE where you stored the file (line 27).

     1 #!/bin/bash
     2 # Script for deleting binaries from SE/RLS
     3 # Author: Dr James Cunha Werner
     4 # www.geocities.com/jamwer2002
     5 # University of Manchester
     6 #
     7 if [ $# != 1 ]
     8 then
     9   echo Dear $USER,
    10   echo "You should provide the binary file name you want to remove from SE/RLS. For example,"
    11   echo
    12   echo " ./easybinrm jamwer_bfb.tier2.hep.man.ac.uk_Tau11-Run3-OnPeak-R14_16003622Feb05_BetaMiniApp "
    13   echo
    14   echo "To obtain a complete list of binary files, type:"
    15   echo
    16   echo " ./easylist "
    17   echo
    18   echo "You can find more information here: http://www.hep.man.ac.uk/u/jamwer/#sec8"
    19   echo " User manual at http://www.hep.man.ac.uk/u/jamwer/userman.html"
    20   exit
    21 fi
    22
    23 if (! grid-proxy-info -e); then
    24   echo "No valid proxy. Creating one valid for 10 days."
    25   grid-proxy-init -valid 240:00
    26 fi
    27 edg-rm --vo=babar del lfn:$1 -s grid2.fe.infn.it
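
The SE name is hard-coded on line 27 of easybinrm. As a sketch only, it could instead be read from an environment variable, so that the script itself does not need editing. `SE_HOST` and `build_rm_command` are hypothetical names, and the function prints the removal command rather than running it, so it can be checked first.

```shell
#!/bin/bash
# Sketch only: SE_HOST and build_rm_command are illustrative names.
# The default host is the one hard-coded in the original script.
SE_HOST="${SE_HOST:-grid2.fe.infn.it}"

# build_rm_command BINARY_NAME
# Prints the removal command for the given logical file name
# instead of running it.
build_rm_command() {
    echo "edg-rm --vo=babar del lfn:$1 -s $SE_HOST"
}
```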

Copyright 2004 Manchester University
Feedback to: jamwer@hep.man.ac.uk