Authorization to use Hoffman2 GPU nodes
To use a Hoffman2 node that has GPUs (graphics processing units), you must add your account to the gpu group. To do this, point your browser at Getting started and click Update Your Profile.
It will ask you some questions and then give you a form. Check the gpu box at the bottom of the form, and click Update.
After your request is approved and after the sysadmin has added your username to the gpu group, the Grid Identity Manager will send you a confirmation email.
To use a node that has a GPU, you must request it from the job scheduler. A node may have two GPUs (Tesla T10 nodes) or three GPUs (Tesla M2070 nodes). To begin an interactive session, enter at the shell prompt:
qrsh -l gpu
The above qrsh command will reserve an entire gpu node with its 2 or 3 gpu processors. The maximum amount of memory (h_data or mem) that you can request is 24G on the Tesla T10 nodes, or 48G on the Tesla M2070 nodes. An interactive session made with the above qrsh command will expire in 2 hours. The maximum amount of time for a session is 9 hours.
- To specify a different time limit for your session, use the h_rt or time parameter. Example for requesting 9 hours:
qrsh -l gpu,h_rt=9:00:00
- To reserve two nodes, at a login node shell prompt enter:
qrsh -l gpu -pe dc_gpu 2
- To see which node(s) were reserved, at a g-node shell prompt enter:
- To see if the gpu nodes are up and/or in use, at any shell prompt enter:
- To see the specifics for a particular gpu node, at a g-node shell prompt enter:
- To get a quick session for compiling or testing your code (this does not give you exclusive use of the GPU node), enter:
qrsh -l i,gpu
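The options in the list above can be assembled mechanically. The following is a minimal sketch, not a site-provided tool: gpu_qrsh_cmd is a hypothetical helper name, and it only prints the qrsh command line instead of running it. It assumes that h_rt and -pe dc_gpu can be combined in a single request, and it accepts only whole hours up to the 9-hour maximum described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hoffman2): print, but do not run,
# a qrsh command for an interactive GPU session.
# Usage: gpu_qrsh_cmd [HOURS] [NODES]
gpu_qrsh_cmd() {
    hours=${1:-2}   # 2 hours is the default session length
    nodes=${2:-1}   # one GPU node unless more are requested

    case $hours in
        [1-9]) ;;   # whole hours only, up to the 9-hour maximum
        *) echo "hours must be a whole number from 1 to 9" >&2; return 1 ;;
    esac
    case $nodes in
        *[!0-9]*|0) echo "nodes must be a positive whole number" >&2; return 1 ;;
    esac

    cmd="qrsh -l gpu,h_rt=${hours}:00:00"
    # Assumption: the dc_gpu parallel environment is only needed
    # when more than one node is requested.
    if [ "$nodes" -gt 1 ]; then
        cmd="$cmd -pe dc_gpu $nodes"
    fi
    echo "$cmd"
}
```

For example, gpu_qrsh_cmd 9 prints the 9-hour request shown above, and gpu_qrsh_cmd 2 2 prints a two-node request with an explicit 2-hour limit. Printing the command first makes it easy to check before pasting it at a login node prompt.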
There are multiple GPU types available in the cluster. Each type of GPU has a different compute capability, memory size and clock speed, among other things. If your GPU program requires a specific GPU type to run, you need to specify it explicitly. If you do not specify a GPU type, UGE may pick any available GPU for your job. You may need to compile your code on a machine that has the required type of GPU. Currently, the following GPU types are available:
|GPU type||Compute capability||Number of cores||Global memory||UGE option|
|Tesla T10||1.3||240||4.3 GB||
|Tesla M2070||2.0||448||5.6 GB||
|Tesla M2090||2.0||512||6 GB||
The UGE options in the table above can be combined with other UGE options, for example:
qrsh -l gpu,fermi,h_rt=3:00:00
If you specify -l fermi, the job will go to either M2070 or M2090 GPU nodes. If you specify -l M2070, the job will go only to M2070 nodes and will not go to M2090 nodes even when the latter are available. If you specify -l M2090, the job will go only to M2090 nodes and will not go to M2070 nodes even when the latter are available. Requesting a specific model can therefore mean a longer wait.
For most users, we recommend -l fermi rather than -l M2070 or -l M2090, unless you specifically need one of them (e.g., to benchmark the differences between the M2070 and M2090).
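The choice between these options can be sketched as a small shell function. This is an illustration only: gpu_resource is a hypothetical helper name, and the sketch assumes that M2070 and M2090 combine with gpu in the -l list the same way fermi does in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hoffman2): map a GPU preference
# to the resource list passed to "qrsh -l".
# Usage: gpu_resource [any|fermi|M2070|M2090]
gpu_resource() {
    case ${1:-any} in
        any)
            # No type given: UGE may pick any available GPU.
            echo "gpu" ;;
        fermi|M2070|M2090)
            # fermi matches M2070 or M2090 nodes; a specific model
            # narrows the choice and may lengthen the wait.
            echo "gpu,$1" ;;
        *)
            echo "unknown GPU type: $1" >&2
            return 1 ;;
    esac
}
```

With this in place, echo "qrsh -l $(gpu_resource fermi),h_rt=3:00:00" reproduces the combined request shown earlier, and swapping fermi for M2090 restricts the job to M2090 nodes.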
CUDA is installed in /u/local/cuda/ on the Hoffman2 Cluster. There are several versions available. The most recent as of December 2011 is 4.0.17. You can refer to the current production version with /u/local/cuda/current/. To install CUDA in your home directory, please see the instructions in the /u/local/cuda/README_ATS file. To install the NVIDIA GPU Computing Software Development Kit in your home directory, please see the instructions in the