The National Institute for Computational Sciences

Running a Hybrid Hello World

Once you have received an allocation and logged in to Darter, the next step is to familiarize yourself with the system and how it is used. More specific information about particular tasks may be found in the Darter Quick Start Guide. This page aims to guide you through the process of downloading, compiling, and running a simple program. Take this opportunity to experiment with the various options. Remember that you can find documentation on most commands by typing man <command>.

The example uses OpenMP and MPI, and prints the node, core, MPI rank, and thread ID for every thread, which can be useful for understanding how work is laid out across nodes. If you are not interested in using OpenMP, it is simple to compile without OpenMP and check the behavior of MPI-only applications.
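
The heart of such a program is short. The following is only a sketch of what HybridHello.c might look like; the actual file in the tarball may differ in details such as variable names and output format.

#define _GNU_SOURCE     /* for sched_getcpu() */
#include <stdio.h>
#include <unistd.h>     /* gethostname() */
#include <sched.h>      /* sched_getcpu() */
#include <mpi.h>
#include <omp.h>

int main(int argc, char *argv[])
{
    int rank, size;
    char host[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    gethostname(host, sizeof(host));

    /* Every OpenMP thread of every MPI rank reports where it is running. */
    #pragma omp parallel
    {
        printf("node %s  core %d  rank %d of %d  thread %d of %d\n",
               host, sched_getcpu(), rank, size,
               omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}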

Getting the files

This tutorial requires three files:

  • HybridHello.c (the example source code)
  • a makefile
  • HH.pbs (a batch script for submitting the job)

There is also a compressed version of those files, HybridHello.tar.gz. You could upload these files to Darter using SFTP/SCP, and if your software lives somewhere else it may be convenient to get it with a program like Git. Since these files are online, the easiest way to get them is wget. From a node on Darter, type:

% wget http://www.nics.utk.edu\
/sites/default/files/tutorials/HybridHello/HybridHello.tar.gz

Extract the contents with:

% tar zxvf HybridHello.tar.gz

The "z" flag tells tar to uncompress the archive with gzip.

Compiling

  • Compiling by hand

    To compile the source directly using the compiler wrappers, you might use the following command:

    cc -o HybridHello.x HybridHello.c

    Note that if you are using the GNU compiler, you would add "-fopenmp", and for the Intel compiler, "-openmp"; example compile commands are shown after this list.

  • Using the make script

    This makefile was written for a Cray XC30 architecture, so it should work by default with the Cray compilers, and with a minor change for GNU. In general, makefiles (or a configure script that generates a makefile) will not work without some extra direction, at a minimum pointing them to the Cray compiler wrappers (cc, CC, and ftn).

    • Verify that make will use the Cray compiler wrappers
    • Verify that CFLAGS tells make to use the correct OpenMP flag ("-fopenmp" for GNU)

    From the directory containing the makefile, type "make", and it should compile your code.
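
For reference, the hand-compiled step for each compiler might look something like the following. The module swap shown is the usual way to change programming environments on Cray systems; treat the exact module names as assumptions and check "module avail" on Darter.

% cc -o HybridHello.x HybridHello.c            # Cray compiler; OpenMP is typically on by default
% module swap PrgEnv-cray PrgEnv-gnu           # switch to the GNU programming environment
% cc -fopenmp -o HybridHello.x HybridHello.c   # GNU needs the -fopenmp flag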

Real programs generally have more advanced installers, but they almost always need to be told to use the Cray compiler wrappers. The other common issue with compiling is finding the right libraries. Many libraries are available as modules (see available software) and are linked automatically when the necessary modules are loaded (unless otherwise noted).
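
As a rough example of that workflow, using a library provided as a module might look like this; the module and file names below are placeholders rather than part of this tutorial.

% module avail                  # list the installed modules
% module load fftw              # load a library module (name is just an example)
% cc -o myprog.x myprog.c       # the wrappers add the include and link paths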

Running on Darter

You should edit HH.pbs to use your account (the #PBS -A line). Once that is set, you can submit the job as follows:

% qsub HH.pbs

There will be a confirmation that your job was submitted, and you should be able to see it on the queue for a brief period:

% showq 

It is rare for every node to be in use, so a (short) single node job generally starts right away. It should only take a few seconds before you get an output file, "HybridHelloTest.o${PBS_JOBID}".
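
For orientation, a job script along the lines of HH.pbs might look like the sketch below. This is an assumption about its contents rather than a copy of the real file; in particular, the resource request syntax and paths may differ.

#!/bin/bash
#PBS -A YOUR_ACCOUNT                  # charge the job to your allocation
#PBS -N HybridHelloTest               # job name, used for the output file
#PBS -j oe                            # merge standard output and standard error
#PBS -l size=16,walltime=00:05:00     # 16 cores for five minutes

cd /lustre/medusa/$USER               # compute nodes can only see Lustre
export OMP_NUM_THREADS=2              # OpenMP threads per MPI task
N_MPI=$((16 / OMP_NUM_THREADS))       # MPI tasks needed to fill the 16 cores
aprun -n $N_MPI -d $OMP_NUM_THREADS $PBS_O_WORKDIR/HybridHello.x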

There are a few common options that would be good to play with until you have them figured out. You may want to refer to our documentation on running jobs.

  • PBS Options

    There are quite a few PBS options you may want to use. For example, you may want to receive an email when a job fails, or set up jobs so they only run after another job has finished (in the meantime, they are in a "held" state, which does not count towards a job's wait time).

    See our table of common PBS options, the batch script documentation, or "man qsub" for a list of possible options; a couple of examples are shown after this list.

  • Lustre

    Compute nodes only have access to Lustre (/lustre/medusa/*), not your home directory. This means:

    • Files which are read or written by compute jobs must be on Lustre
    • The job attempts to start in the directory it was launched from (via aprun), so that directory must also be on Lustre

    The executable as well as standard input and output from a job are handled by the job launcher, so these files can be anywhere.

    As is, HH.pbs should take care of all of these issues. It changes to the user's directory in Lustre, and calls the executable assuming that it is in the original directory from which the job was submitted. The output goes into that same directory as well, which need not be on Lustre. Try changing the script and moving files around to see what breaks, and what error messages you get.

  • Aprun options

    HH.pbs requests 16 cores and sets the number of OpenMP threads (OMP_NUM_THREADS) to 2. It automatically figures out how many MPI tasks to spawn to fill the reservation (calling this "N_MPI") and launches the job with aprun. It does no error checking, so if you ask it to do something that doesn't make sense, such as using more threads per MPI task than there are cores per node, it will blindly try.

    You are welcome to try some of the other aprun options; see the web documentation or "man aprun" for more information.
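
As examples of the PBS options mentioned above, the lines below show how mail notification and a job dependency might be requested; the email address and job ID are placeholders.

#PBS -M you@example.com                   # where to send job email (placeholder address)
#PBS -m ae                                # mail when the job aborts (a) or ends (e)
% qsub -W depend=afterok:123456 HH.pbs    # run only after job 123456 completes successfully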
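
And a few variations on the aprun line, assuming the same 16-core reservation; remember to set OMP_NUM_THREADS to match the -d value.

% aprun -n 16 ./HybridHello.x             # MPI only: 16 ranks, one per core
% aprun -n 8 -d 2 ./HybridHello.x         # 8 ranks with 2 OpenMP threads each
% aprun -n 4 -d 4 ./HybridHello.x         # 4 ranks with 4 threads each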