Job Submission System Prototype - Version 2

James Werner

Initial startup and BaBar Analysis


Installing babar analysis.

This process should be done only once per analysis project, to install the user's BaBar release. To avoid typing all these commands, use copy (mark the text in the browser window, Ctrl-C) and paste (right-button Paste in the command shell window).
  1. Obtain the X509 certificate from the eScience Certificate Authority home page.
  2. Convert it to pem format:
       openssl pkcs12 -nokeys -clcerts -in user.p12 -out usercert.pem
       openssl pkcs12 -nocerts -in user.p12 -out userkey.pem
  3. Copy usercert.pem and userkey.pem into the .globus directory in your AFS home directory:
       cd
       scp -r jamwer@bfb.tier2.hep.man.ac.uk:/home/jamwer/.globus .
  4. Next, install the release in your AFS area, as described at: http://www.hep.man.ac.uk/u/jamwer/bbnewrel.html
  5. Run the initialisation script. At Manchester:
       . /afs/hep.man.ac.uk/g/bfactory/etc/hepix/bashrc
  6. Create a work area for your release installation and an area to store results (if they are bigger than your available space in AFS):
       mkdir /afs/hep.man.ac.uk/u/jamwer/work
       mkdir /nfs/work/users/jamwer
  7. Install the new release:
       newrel -s /afs/hep.man.ac.uk/u/jamwer/work -t analysis-21 BbSoft
       cd BbSoft/
       export CVSROOT=/afs/slac.stanford.edu/g/babar/repo
       klog jamwer@slac.stanford.edu
       srtpath
       addpkg workdir
       gmake workdir.setup
       addpkg BetaMiniUser
       cd BetaMiniUser/
       rm *.cc
       rm *.hh
       rm *.tcl
  8. Copy (or develop) your analysis program and your tcl files:
       scp jamwer@bfb.tier2.hep.man.ac.uk:/home/jamwer/PgmCM2/BetaMiniUser/*.cc .
       scp jamwer@bfb.tier2.hep.man.ac.uk:/home/jamwer/PgmCM2/BetaMiniUser/*.hh .
       scp jamwer@bfb.tier2.hep.man.ac.uk:/home/jamwer/PgmCM2/BetaMiniUser/*.tcl .
       cd ../workdir/
       scp jamwer@bfb.tier2.hep.man.ac.uk:/home/jamwer/PgmCM2/workdir/* .
  9. Compile and link your analysis program to generate the binary file:
       cd ..
       gmake lib
       gmake BetaMiniUser.bin
  10. Run locally to test that everything is OK:
       cd workdir/
       BetaMiniApp pi0rojectJob.tcl
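
Before moving on to the grid part, it can be worth confirming that steps 2 and 3 left the certificate files where the grid tools expect them. The following is only a minimal sketch: it assumes the usual Globus convention that userkey.pem must not be readable by group or others, and the function name check_globus_dir is hypothetical (it is not part of the BaBar release or of easygrid).

```shell
#!/bin/sh
# check_globus_dir [DIR]: sanity-check the certificate files from steps 2-3.
# DIR defaults to $HOME/.globus. Hypothetical helper, not part of easygrid.
check_globus_dir() {
    dir=${1:-$HOME/.globus}
    # Both pem files must exist and be non-empty.
    for f in usercert.pem userkey.pem; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            return 1
        fi
    done
    # Characters 5-10 of the ls mode string are the group and other bits;
    # assumption: the private key should grant nothing to group or others.
    perms=$(ls -ld "$dir/userkey.pem" | cut -c5-10)
    if [ "$perms" != "------" ]; then
        echo "userkey.pem is open to group/others; try: chmod 400 $dir/userkey.pem" >&2
        return 2
    fi
    echo "ok: $dir"
}
```

Calling check_globus_dir with no argument checks $HOME/.globus; a return code of 1 means a file is missing, and 2 means the key permissions look too open.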

Installing easygrid for babar analysis.

This process should be done only once per analysis project, as long as you do not create new tcl files. In the future, once BaBar's lack of management is solved, this process will be automatic (if necessary!). Meanwhile, until the information is public and the software is properly installed within the BaBar collaboration, if you want to use the grid you will have to obtain the information yourself.
  1. Download the easygrid, easyresub, easylist, easybinrm, easykill, easykall, and gera.c files: just click on each name and save the file in your workdir directory.
  2. Decide where you want to save the binary file (a shared NFS area or SE/RLS). See Stacking process for more information. If you are using RLS, comment out lines 91 to 96 of easygrid and uncomment lines 82 to 90.
  3. At line 83, replace grid2.fe.infn.it with your SE. Beg your grid manager for the name of your closest SE. Tell her you need to store around 300 MBytes, which allows you to have 3 pending submissions. Then please let me know! She does not talk to me.
  4. Generation of the grid-dependent tcl file. In your own tcl file, comment out (#) the following lines:
       #set ConfigPatch Run2
       #set levelOfDetail cache
       #set BetaMiniTuple hbook
       #set histFileName your_histogram_file_name.hbk
       #set NEvent 100
       #source datafile_file_from_bookkeeper.tcl
     These commands will be generated in the grid-dependent tcl file. Then edit gera.c: at line 32, write the name of your analysis tcl file. Do not use a path, because all files are stored in the same directory on the WN.
  5. Generation of the runtime script file. Line 55 should have a standard initialisation script. Ask someone about the script for the site where you want to run. Lines 62 to 65 are the initialisation of the BaBar release; set them for your release. If you are storing the binary file in RLS/SE, use lines 68 to 72 to describe your file's lfn (line 69), and comment out lines 74 and 75. If you are storing your binary in shared NFS, comment out lines 68 to 72 and write the correct path to your binary file in line 75.
  6. Generation of the JDL file. Replace pi0roject.tcl with your analysis tcl file. Replace my condition db fullboot.sh with your condition and configuration db. You can use my db only if you are running at Manchester. If you are running in my testbed for less than 2 hours, use line 91 and comment out lines 92 and 93. If you are running in my testbed without time limits, use line 92 and comment out lines 91 and 93. If you want to run in the Production farm, use line 93 and comment out lines 91 and 92.
  7. Compile gera.c:
       gcc gera.c -o gera
     This software does not have any panic messages, warnings, incompatible operating systems, etc. It should give you nothing:
       [jamwer@bfb gridbfb]$ gcc gera.c -o gera
       [jamwer@bfb gridbfb]$
  8. Verify that the User Interface (UI) is installed on your computer. Type:
       [jamwer@bfb gridbfb]$ edg-job-submit --version
       Job Submission User Interface version lcg2.1.54
       [jamwer@bfb gridbfb]$
     If you get the message:
       bash-2.05b$ edg-job-submit --version
       bash: edg-job-submit: command not found
     contact your grid manager: the UI is not installed on your computer. Another option is to use lxplus, an AFS version of the LCG software:
       . /afs/cern.ch/project/gd/LCG-share/2.3.1/sl3/etc/profile.d/grid_env.sh
     Documentation is available at the LXplus site.
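
The commenting-out asked for in step 4 can be done by hand, but a small filter makes it repeatable when you have several tcl files. This is only a sketch under stated assumptions: the function name comment_grid_settings is hypothetical, the settings are assumed to appear as plain "set NAME value" lines, and the bookkeeper data file name is passed in as the second argument rather than guessed.

```shell
#!/bin/sh
# comment_grid_settings TCL_FILE DATAFILE_TCL
# Prints TCL_FILE with the grid-generated settings from step 4 commented out.
# Hypothetical helper; it writes to stdout and leaves TCL_FILE untouched.
comment_grid_settings() {
    awk -v datafile="$2" '
        # "set" lines for the variables that the grid-dependent tcl file will define.
        $1 == "set" && $2 ~ /^(ConfigPatch|levelOfDetail|BetaMiniTuple|histFileName|NEvent)$/ {
            print "#" $0; next
        }
        # Only the "source" line naming the bookkeeper data file is commented out.
        $1 == "source" && $2 == datafile { print "#" $0; next }
        # Everything else, including lines already commented, passes through.
        { print }
    ' "$1"
}
```

A possible use is to write the result to a new file, review it, and only then replace the original, for example: comment_grid_settings myAnalysis.tcl datafile_file_from_bookkeeper.tcl > myAnalysis.new.tcl (myAnalysis.tcl being a placeholder name).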

Running your analysis software in grid.

After you have installed the grid software on your computer, you only need to perform the following tasks every time you want to submit new analysis software to the grid.

  1. Create a proxy. The -valid option lets you create long-term proxies. For example, to create one valid for 10 days (240 hours):
       grid-proxy-init -valid 240:00
  2. Submit your job: type
       ./easygrid dataset_name
     For example:
       bash-2.05a$ ./easygrid Tau11-Run3-OnPeak-R14
     If you get something such as:
       ...
       Sub Tau11-Run3-OnPeak-R14-9
       **** Error: UI_NO_VO_CONF_INFO ****
       Unable to find configuration information for VO "babar"
       **** Error: UI_NO_VOMS ****
       Unable to determine a valid user's VO
     try downloading the configuration files Babar_UI.cfg and Babar_VO.cfg and changing line 98 from
       98 printf("edg-job-submit --vo babar %s >> trab\n",nomearq);
     to
       98 printf("edg-job-submit --config-vo Babar_VO.cfg --config Babar_UI.cfg %s >> trab\n",nomearq);
     In any case, contact your grid manager: the UI is not properly configured for the babar project.
  3. If you succeeded in submitting your job, typing the command again will report the status of your jobs, save the results from finished jobs, or recover nasty listings with abort messages:
       bash-2.05a$ ./easygrid Tau11-Run3-OnPeak-R14
     The abort listing is huge, and the only line that is important starts with "reason". For example:
       - reason = Cannot download fullboot.sh from gsiftp://lcgrb01.gridpp.rl.ac.uk/var/edgwl/SandboxDir/DB/https_3a_2f_2flcgrb01.gridpp.rl.ac.uk_3a9000_2fDB55-uCA3ApJlAqTHwInYg/input/
     To see what is going wrong:
       cat dataset.histo | grep reason
     and call me for support!
  4. You will find your results in the directories datasetname-1, 2, 3, ... Check that all of them are there.
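
The check in step 4 is easy to get wrong by eye when there are many jobs. A minimal sketch follows, assuming the result directories are named dataset_name-1, dataset_name-2, and so on (as the "Sub Tau11-Run3-OnPeak-R14-9" line above suggests) and that you know how many jobs were submitted; the function name missing_results is hypothetical.

```shell
#!/bin/sh
# missing_results DATASET NJOBS
# Prints every expected result directory DATASET-1 .. DATASET-NJOBS that does
# not exist in the current directory; returns 0 only if none are missing.
# Hypothetical helper; the naming scheme is an assumption, see above.
missing_results() {
    dataset=$1
    njobs=$2
    i=1
    missing=0
    while [ "$i" -le "$njobs" ]; do
        if [ ! -d "$dataset-$i" ]; then
            echo "$dataset-$i"
            missing=$((missing + 1))
        fi
        i=$((i + 1))
    done
    [ "$missing" -eq 0 ]
}
```

Run it from the directory that holds the results; the numbers it prints are the ones to look at more closely or to resubmit.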

Resubmit aborted jobs again!!!

It happens! What to do? Look up the number of the file (see Toolbox), and type:
  ./easyresub dataset_name number
At any time, you can check how it is going by typing:
  ./easygrid dataset
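
When several jobs of the same dataset have aborted, the resubmission can be looped. This is only a sketch, assuming easyresub takes exactly a dataset name and one number as shown above; the function name resub_many and the EASYRESUB variable (used here to allow a dry run) are hypothetical.

```shell
#!/bin/sh
# resub_many DATASET NUMBER...
# Calls easyresub once per job number. Set EASYRESUB=echo for a dry run that
# only prints the arguments. Hypothetical wrapper, not part of easygrid.
resub_many() {
    dataset=$1
    shift
    for n in "$@"; do
        "${EASYRESUB:-./easyresub}" "$dataset" "$n"
    done
}
```

For example, resub_many Tau11-Run3-OnPeak-R14 2 5 9 would call ./easyresub three times, once for each of the jobs 2, 5 and 9.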

Killing some jobs.

If you want to kill some of the files (see Toolbox), type:
  ./easykill dataset_name number
If you really changed your mind and want to KILL ALL OF THEM, type:
  ./easykall dataset_name
You can get a list of all binary files and pending jobs by typing:
  ./easylist

Removing your tail.

After you have finished all jobs, and all results are fine, you do not need to keep the binary files in SE/RLS. See Toolbox. Type the command:
  ./easybinrm bin_name
You can get a list of all binary files and pending jobs by typing:
  ./easylist


Copyright 2004 Manchester University
Feedback to: jamwer@hep.man.ac.uk