Kraken will be officially retired and no longer accessible on August 27, 2014. For more information see Kraken Decommission FAQs.

Kraken Quick Start Guide

This guide is geared toward experienced users and gives an overview of how to use the Kraken system. If you need more information, see the full user guide.


Kraken System

Kraken is a Cray XT5 system with 9,408 compute nodes. Each node has:

  • two 2.6 GHz six-core AMD Opteron processors (Istanbul)
  • 12 cores (system total is 112,896 compute cores)
  • 16 GB of memory (system total is 129 TB of compute memory)

Compute nodes run Cray Linux Environment (CLE) 3.1. Each node is connected to a Cray SeaStar router through HyperTransport, and the SeaStars are all interconnected in a 3-D torus topology.

Getting an Allocation

Details on the eligibility requirements for principal investigators, the application process and the proposal deadlines are located on the XSEDE Allocations page.

Connecting

If you have mailed in the notarized form and received email confirmation that your RSA One Time Password (OTP) token has been activated, use ssh to connect to login.kraken.nics.tennessee.edu. Otherwise, you can use GSI authentication to log in to gsissh.kraken.nics.tennessee.edu.
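For example (the username below is a placeholder for your own NICS account name, and the GSI login assumes you already have a valid grid proxy certificate):

    ssh username@login.kraken.nics.tennessee.edu      # RSA OTP (SecurID) login
    gsissh gsissh.kraken.nics.tennessee.edu           # GSI authentication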

File Systems and Storage

Home directories have a 2 GB quota. These directories are on a Network File System (NFS) and are generally available by logging in to login.nics.utk.edu even when Kraken is down. However, home directories are not accessible from the compute nodes: in a batch job, the current directory and all input and output files must be on Lustre, or the job will fail with a "no such file or directory" error.

Each user has a scratch directory in /lustre/scratch/$USER which is subject to purging based on the time of last access. Lustre is a highly scalable cluster file system that allows you to increase striping width to improve I/O performance. Programs launched with aprun can only access Lustre, not home directories or other directories.
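As an illustration only (the directory name and stripe count are placeholders, and lfs option syntax varies between Lustre releases; see "lfs help setstripe"):

    cd /lustre/scratch/$USER
    lfs setstripe -c 4 bigrun      # stripe new files in bigrun/ across 4 OSTs
    lfs getstripe bigrun           # confirm the striping settings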

Files may be stored on the mass storage system (HPSS) using the hsi command. You may only access HPSS if you have activated your RSA OTP token. You may access HPSS within a batch job, but only from jobs in the "hpss" queue that are submitted from the RSA SecurID login nodes (login.kraken.nics.tennessee.edu).
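A typical hsi interaction might look like the following sketch (directory and file names are placeholders):

    hsi "mkdir my_project; cd my_project; put results.tar"   # archive a file to HPSS
    hsi "cd my_project; get results.tar"                      # retrieve it later
    hsi ls my_project                                          # list what is stored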

Data Transfer

If you are moving large amounts of data (>1 Gbyte) from another XSEDE system, use GridFTP. For smaller amounts of data, use sftp.
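A rough sketch of a GridFTP transfer with globus-url-copy is shown below; the endpoint host names and paths are placeholders, so check the XSEDE data-transfer documentation for the actual GridFTP servers:

    globus-url-copy -vb -tcp-bs 8M -p 4 \
        gsiftp://gridftp.othersite.xsede.org/path/to/input.dat \
        gsiftp://gridftp.kraken.nics.tennessee.edu/lustre/scratch/$USER/input.dat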

User Environment

Kraken uses the module command (softenv is not available) to select libraries, compilers and other software that are not already in the default user environment.
Users can change their login shell via the NICS User Portal.
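Typical module commands look like the following (the fftw module name is only an example; run "module avail" to see the exact names on Kraken):

    module list            # show what is currently loaded
    module avail           # list all available modules
    module load fftw       # add a library to your environment
    module unload fftw     # remove it again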

Compiling

By default, Kraken uses the Portland Group (PGI) compilers. GNU and Intel compilers are available via modules. To compile your MPI program, use the cc (C), CC (C++), or ftn (Fortran) commands. These commands create executables that run on the compute nodes. Other MPI compiler wrappers such as mpicc, mpicxx, mpif77, and mpif90 are not available on Kraken. The Cray scientific libraries (BLAS, LAPACK, ScaLAPACK, BLACS, and others) are in the default library path. Use the module command for access to libraries such as HDF, netCDF, FFTW, and PETSc.
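As a sketch (source and executable names are placeholders), compiling an MPI code and optionally switching compiler suites might look like:

    cc  -o mycode mycode.c              # C, using the compiler wrappers (PGI by default)
    ftn -o mycode mycode.f90            # Fortran equivalent
    module swap PrgEnv-pgi PrgEnv-gnu   # switch the wrappers to the GNU compilers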

The login nodes are not intended for computing, but if you need to run short pre- or post-processing programs on them, you can call the compilers directly: for PGI compilers, use pgcc, pgCC, or pgf90.

Running Jobs

Kraken uses PBS/Torque (qsub, qdel, qstat, etc.) for batch jobs. To specify the number of cores allocated to a batch job, use "#PBS -l size=cores" in the batch script. Because the batch system allocates in units of 12-core nodes, the number of cores must be a multiple of 12.

To specify an account to charge, use "#PBS -A account". The showusage command lists your valid accounts/projects.

Parallel programs may only be run within a batch job, using "aprun -n nprocs executable"; mpirun is not available on Kraken. By default, aprun runs 12 MPI processes on each dual-socket node (one per core). To run with fewer MPI processes per socket, use the -S option. For example, "aprun -n nprocs -S 4 executable" runs with 4 MPI processes per socket (8 per node).

A sample batch script is available. Use "qsub script" to submit a batch job. Use qsub -I for interactive tasks such as debugging.
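For illustration, a minimal script along these lines might look like the one below; the job name, account, core count, walltime, run directory, and executable are all placeholders:

    #!/bin/bash
    #PBS -N myjob
    #PBS -A UT-EXAMPLE             # project to charge (see showusage)
    #PBS -l size=24                # cores, in multiples of 12
    #PBS -l walltime=01:00:00

    cd /lustre/scratch/$USER/rundir   # working directory and files must be on Lustre
    aprun -n 24 ./mycode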

Debugging and Optimization

Kraken has a license for running TotalView on up to 64 processors. CrayPAT (the Cray Performance Analysis Tools) is useful for profiling and for collecting hardware performance data to identify performance bottlenecks.

Other documentation