Once you have received an allocation and logged in to Kraken, the next step is to familiarize yourself with the system and how it is used. More specific information about particular tasks may be found in the Kraken User Guide or Quick Start Guide. This page guides you through downloading, compiling, and running a simple program. Take this opportunity to experiment with the various options. Remember that you can find documentation on most commands by typing man <command>.
The example uses OpenMP and MPI, and prints each node, core, MPI rank, and thread ID, which can be useful for understanding how nodes are laid out. If you are not interested in using OpenMP, it is simple to compile without OpenMP and check the behavior of MPI-only applications.
Getting the files
This tutorial requires three files:
- Source code for a hybrid "Hello World" program in C, courtesy of Cray's User Guide (page 119).
- A simple makefile to build the source
- A PBS batch file which submits the job
There is also a compressed archive of those files. You could upload the files to Kraken using SFTP/SCP, and software from other sources may be more convenient to fetch with a program like Subversion or Git. Since these files are online, the easiest way to get them is wget. From a node on Kraken, type:
% wget http://www.nics.utk.edu/sites/default/files/tutorials/HybridHello/HybridHello.tar.gz
Extract the contents with:
% tar zxvf HybridHello.tar.gz
The "z" flag tells tar to uncompress the archive with gzip.
Compiling
Compiling by hand
To compile the source directly using the compiler wrappers, you might use the following command:
cc -mp -o HybridHello.x HybridHello.c
Here, the -mp option tells the compiler to use OpenMP pragmas. If this option is omitted, the program will compile as a simple MPI program. Note that if you are using GNU/Pathscale compilers, you would use -fopenmp instead of -mp.
Using the make script
This makefile was written for a Cray XT architecture, so it should work by default with PGI compilers, and with a minor change with GNU/Pathscale. In general, makefiles (or a configure script to generate a makefile) will not work without some extra direction, pointing to the Cray compiler wrappers at least (cc, CC and ftn).
- Verify that make will use the Cray compiler wrappers
- Verify that CFLAGS is telling make to use the correct OpenMP flags, "-mp" for PGI, "-fopenmp" for GNU/Pathscale.
- From the directory containing the makefile, type "make", and it should compile your code.
Real programs generally have more advanced installers, but almost always need to be told to use the Cray compiler wrappers. The other common issue with compiling is finding the right libraries. Shared libraries are not supported on Kraken's compute nodes. Many math libraries are available as modules (see available software); they may or may not require adding something to the link command.
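The compiler-specific OpenMP flags described in this section can be captured in a small shell helper. This is only a sketch: the function name openmp_flag and the COMPILER variable are made up for illustration, and the flag mapping simply repeats what this page states for PGI and GNU/Pathscale. It prints the compile command instead of running it, so you can check it first.

```shell
#!/bin/bash
# Hypothetical helper: print the OpenMP flag for a compiler family,
# using the mapping given in this tutorial.
openmp_flag() {
    case "$1" in
        pgi)           echo "-mp" ;;
        gnu|pathscale) echo "-fopenmp" ;;
        *) echo "unknown compiler family: $1" >&2; return 1 ;;
    esac
}

# Assumption: set this by hand to match the compiler you are using.
COMPILER=pgi

# Print the command rather than executing it.
echo "cc $(openmp_flag "$COMPILER") -o HybridHello.x HybridHello.c"
```

Leaving the flag out of the printed command gives the MPI-only build mentioned earlier.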
Running on Kraken
You should edit HH.pbs to use your account (the #PBS -A line). Once that is done, you can submit the job as follows:
% qsub HH.pbs
There will be a confirmation that your job was submitted, and you should be able to see it in the queue for a brief period:
% showq
It is rare for every node to be in use, so a (short) single node job generally starts right away. It should only take a few seconds before you get an output file, "HybridHelloTest.o${PBS_JOBID}".
There are a few common options that would be good to play with until you have them figured out. You may want to refer to our documentation on running jobs.
PBS Options
There are quite a few PBS options you may want to use. For example, you may want to receive an email when a job fails, or set up jobs so they only run after another job has finished (in the meantime, they are in a "held" state, which does not count towards a job's wait time).
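As a rough sketch of those two cases, the fragment below uses the standard qsub options -m and -M for mail and -W depend= for job dependencies. The mail address, the helper name submit_chain, and the script names are placeholders, not part of the tutorial files. Check "man qsub" on Kraken before relying on any of it.

```shell
#!/bin/bash
# Mail options: "a" asks for mail when the batch system aborts the job.
#PBS -m a
#PBS -M user@example.com

# Hypothetical helper: submit script $1, then submit script $2 so that it
# stays held until the first job has ended. qsub prints the new job's ID on
# standard output, which is what the dependency refers to.
submit_chain() {
    local first
    first=$(qsub "$1") || return 1
    qsub -W depend=afterany:"$first" "$2"
}

# Example call (placeholder script names):
#   submit_chain first.pbs second.pbs
```

Using afterok instead of afterany would release the second job only if the first one exits without error.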
See our table of common PBS options, batch script documentation, or "man qsub" for a list of possible options.
Lustre
Compute nodes only have access to Lustre (/lustre/scratch/*), not your home directory. This means:
- Files which are read or written by compute jobs must be on Lustre
- Jobs attempt to start in the directory they were launched from (via aprun), so that directory must also be on Lustre
The executable as well as standard input and output from a job are handled by the job launcher, so these files can be anywhere.
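One way to catch a misplaced working directory before calling aprun is a simple path check, sketched below. The helper name on_lustre and the RUN_DIR value are illustrative assumptions. The check only compares the path string against /lustre/scratch/ and does not resolve symbolic links.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if path $1 lies under /lustre/scratch.
on_lustre() {
    case "$1" in
        /lustre/scratch/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Placeholder path; substitute your own Lustre directory.
RUN_DIR="/lustre/scratch/$USER"

if on_lustre "$RUN_DIR"; then
    echo "ok: $RUN_DIR is on Lustre"
else
    echo "warning: $RUN_DIR is not on Lustre" >&2
fi
```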
As is, HH.pbs should take care of all of these issues. It changes to the user's directory in Lustre, and calls the executable assuming that it is in the original directory from which the job was submitted. The output goes into that same directory as well, which need not be on Lustre. Try changing the script and moving files around to see what breaks, and what error messages you get.
Aprun options
HH.pbs requests 12 cores and sets the number of OpenMP threads (OMP_NUM_THREADS) to 2. It automatically figures out how many MPI tasks to spawn to fill the reservation (calling this "N_MPI"), and launches the job with aprun. It does no error checking, so if you ask it to do something that doesn't make sense, such as using more threads per MPI task than there are cores per node, it will blindly try.
You are welcome to try some of the other options with aprun; see the web documentation or "man aprun" for more information.
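The task-count arithmetic can be sketched as follows. This is not the contents of HH.pbs: the helper name n_mpi is made up, and unlike the tutorial script it includes a basic sanity check. It assumes aprun's -n option (number of MPI tasks) and -d option (cores reserved per task for its threads), and prints the launch line instead of executing it.

```shell
#!/bin/bash
# Hypothetical helper: number of MPI tasks needed to fill a reservation of
# $1 cores when each task runs $2 OpenMP threads.
n_mpi() {
    local cores=$1 threads=$2
    if [ "$threads" -lt 1 ] || [ $((cores % threads)) -ne 0 ]; then
        echo "cores ($cores) must be a positive multiple of threads ($threads)" >&2
        return 1
    fi
    echo $((cores / threads))
}

export OMP_NUM_THREADS=2
N_MPI=$(n_mpi 12 "$OMP_NUM_THREADS") || exit 1

# Print the launch line rather than running it.
echo "aprun -n $N_MPI -d $OMP_NUM_THREADS ./HybridHello.x"
```

With 12 cores and 2 threads per task this gives 6 MPI tasks. Setting OMP_NUM_THREADS to 1 gives 12 tasks, which matches the MPI-only case.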

