Resubmit aborted jobs
It happens! What to do? Look up the number of the aborted job, and type:
    easyresub dataset_name number
At any time, you can check how it is going by typing:
    easygrid dataset

Submission in batch and scripts.
Sometimes you want to submit several datasets and programs from a script to avoid waiting. There are batch scripts for this that do not check whether you have a proxy, whether the grid user interface is installed on your computer, etc. These scripts follow the same syntax as their interactive partners; the only difference is the name:
    easybapp is the batch version of easyapp
    easybsub is the batch version of easysub
For example:
    cat tesbatch
    easybapp Root-04.01-02 data1 run1OnPeak main 21 40 20
    easybapp Root-04.01-02 data2 run2OnPeak main 1 20 20
    easybapp Root-04.01-02 data3 run3OnPeak main 1 20 20
    easybapp Root-04.01-02 data4 run4OnPeak main 1 20 20
    chmod 755 tesbatch
    ./tesbatch
The script tesbatch will submit 4 sets of jobs, one for each dataset, without asking any questions. The history files data1.histo, data2.histo, data3.histo, and data4.histo will contain all the messages.

Killing some jobs.
If you want to kill some of the jobs, type:
    easykill dataset_name number
If you have really changed your mind and want to KILL ALL OF THEM, type:
    easykall dataset_name
You can get a list of all binary files and pending jobs by typing:
    easylist

Removing your binaries from grid.
Every time you submit a job to the grid, a copy of your binary code is made at the storage element of each site where your code will run. Once all your jobs have finished, all results are fine, and you will not resubmit any job with the same binary, you need to remove the binary files from the SE/LFC. Type the command:
    easybinrm bin_name
You can get a list of all binary files and pending jobs by typing:
    easylist
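If several binaries need to be removed, the easybinrm call can be wrapped in a small loop. The following is only a sketch: remove_binaries and the DRY_RUN variable are hypothetical names, not part of the easy* tools, and it assumes that easybinrm takes one binary name per call (as shown above), does not ask questions, and returns a non-zero exit status on failure, none of which this guide confirms.

```shell
# Hypothetical helper (not part of the easy* tools): call easybinrm once
# for each binary name given on the command line.
# Set DRY_RUN=1 to print the commands instead of running them.
remove_binaries() {
    for bin in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "easybinrm $bin"
        else
            # Assumption: easybinrm exits with non-zero status on failure.
            easybinrm "$bin" || echo "easybinrm failed for $bin" >&2
        fi
    done
}
```

Running it first with DRY_RUN=1 and comparing the printed names against the output of easylist is a cheap way to avoid removing a binary that is still needed.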
Loading data into the storage element.
- Obtain a list of files from BookKeeping:
    BbkUser --dataset dataset_name --file_status=0 --output=- --quiet file > file_list
  For example:
    BbkUser --dataset Tau1N-Run3-OffPeak-R14 --file_status=0 --output=- --quiet file > Tau1N_lista
    cat Tau1N_lista
    /store/PRskims/R12/14.3.2h/Tau1N/02/Tau1N_0263.01.root
    /store/PRskims/R12/14.3.2h/Tau1N/02/Tau1N_0263.02HBCA.root
    /store/PRskims/R12/14.3.2h/Tau1N/02/Tau1N_0264.01.root
    /store/PRskims/R12/14.3.2h/Tau1N/02/Tau1N_0231.01.root
    /store/PRskims/R12/14.3.2h/Tau1N/02/Tau1N_0231.02HBCA.root
- The script to upload the dataset to the storage element is:
    easydtload SE_name dataset_name store_path file_list &> file_list.sai
  where each argument is obtained as follows:
    SE_name:
        lcg-infosites --vo babar se
    dataset_name:
        BbkDatasetTcl --dbsite=local > saida.txt
        vi saida.txt
    store_path:
        vi $BFROOT/kanga/config/KanAccess.cfg
    file_list:
        BbkUser --dataset dataset_name --file_status=0 --output=- --quiet file > file_list
  For example:
    easydtload dcache01.tier2.hep.manchester.ac.uk Tau1N-Run3-OffPeak-R14 /nfs/work Tau1N_lista &> Tau1N_lista.sai
  The file Tau1N_lista.sai will contain all messages and errors.
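Before starting a long upload, it can be worth checking that the file list looks sane. The sketch below uses only standard shell tools; check_file_list is a hypothetical helper, and the rule it applies (every entry is a path under /store/ ending in .root) is an assumption inferred from the example listing above, not a documented property of BbkUser output.

```shell
# Hypothetical helper: sanity-check a BbkUser file list before easydtload.
# Assumption (taken from the example listing only): every entry is an
# absolute path under /store/ that ends in .root.
check_file_list() {
    list=$1
    if [ ! -s "$list" ]; then
        echo "file list '$list' is missing or empty" >&2
        return 1
    fi
    # Total number of lines in the list.
    total=$(awk 'END { print NR }' "$list")
    # Number of lines that do NOT look like /store/....root
    # (grep exits with status 1 when the count is 0, hence "|| true").
    bad=$(grep -v -c '^/store/.*\.root$' "$list" || true)
    echo "$total entries, $bad unexpected"
    # Succeed only if every line matched the expected pattern.
    [ "$bad" -eq 0 ]
}

# Usage (file name taken from the example above):
#   check_file_list Tau1N_lista && echo "list looks fine"
```

The function only inspects the text of the list; it does not check that the files exist at store_path.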
Listing datasets from Storage element.
    easydtls dataset_name &> file_list.ls
For example:
    easydtls Tau1N-Run3-OffPeak-R14 &> Tau1N-Run3-OffPeak-R14.ls

Creating replicas from datasets already at some storage element.
    easydtrep SE_name dataset_name &> file_list.rep
where each argument is obtained as follows:
    SE_name:
        lcg-infosites --vo babar se
    dataset_name:
        BbkDatasetTcl --dbsite=local > saida.txt
        vi saida.txt
For example:
    easydtrep dcache.gridpp.rl.ac.uk Tau1N-Run3-OffPeak-R14 &> Tau1N-Run3-OffPeak-R14.rep

Removing a dataset from the storage element.
    easydtrm SE_name dataset_name &> file_list.rm
where each argument is obtained as follows:
    SE_name:
        lcg-infosites --vo babar se
    dataset_name:
        BbkDatasetTcl --dbsite=local > saida.txt
        vi saida.txt
For example:
    easydtrm dcache.gridpp.rl.ac.uk Tau1N-Run3-OffPeak-R14 &> Tau1N-Run3-OffPeak-R14.rm
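The dataset commands above all send their messages to a log file with &>, which is a bash shortcut; in other shells the equivalent is "> logfile 2>&1". A small wrapper can make this uniform and report failures. This is only a sketch: run_logged is a hypothetical helper, and it assumes that the easydt* commands return a non-zero exit status when something goes wrong, which this guide does not state.

```shell
# Hypothetical helper: run a command with stdout and stderr sent to a log
# file, using the portable form of the redirection.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    status=0
    # "> file 2>&1" is the portable equivalent of bash's "&> file".
    "$@" > "$log" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        # Assumption: the wrapped command signals failure via its exit status.
        echo "command failed with status $status, see $log" >&2
    fi
    return "$status"
}

# Usage (names taken from the example above):
#   run_logged Tau1N-Run3-OffPeak-R14.rm \
#       easydtrm dcache.gridpp.rl.ac.uk Tau1N-Run3-OffPeak-R14
```

Even when the command reports success, the log file is still the place to look for the messages and errors mentioned above.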
Feedback to: jamwer@hep.man.ac.uk