• National Institute for Computational Sciences is a UT/ORNL Partnership

Lustre

What should I do in the event of a lustre slowdown?

In the event of a lustre slowdown, there are many things to consider as lustre has many working parts and is shared by all users on the system. NICS continually monitors lustre's performance and seeks to improve researcher's data communications. If you notice your code's I/O performance or the lustre filesystem is slower than usual, please answer the following questions to the best of your knowledge and email XSEDE Help Desk your answers.

  • When did you first notice the slowdown? How long did it last?
  • Which login node were you on?
  • Can you estimate the magnitude of the slowdown? (ex - "It took 2 min instead of 3 secs", "batch job exceeded walltime limit of 10 hours, but normally finishes in 8 hours")
  • What were you doing? Interactive command (like "ls")? Batch job?
  • For interactive commands:
    • Which host were you using?
    • Did you see the same behavior on other hosts?
    • Can you provide the exact command that was run and the directory in which it was run?
  • For batch jobs:
    • Can you supply the job IDs for jobs that were affected?
    • Can you provide any details about the IO pattern for your job?

Tags:

Is there any other faster way to list my files in my Lustre scratch area?

Yes! A basic ls only has to contact the meta-data server (MDS), not the object-storage servers (OSS), where the bottleneck often occurs. In general, ls is aliased to give additional information, which requires the OSS's. You can bypass this by using /bin/ls. When there are many files in the same directory, and you don't need the output to be sorted, /bin/ls -U is even faster.

You can also use the Lustre utility lfs to look for files. For example, the syntax to emulate a regular ls in any directory is

lfs find  -D 0  *

For convenience, you may want to add an alias definition to your login config files. For example Bash users can add to their ~/.bashrc the following line to create an alias called lls.

alias lls="/bin/ls -U"

Tags:

How do I change the striping in Lustre?

A user can change the striping settings for a file or directory in Lustre by using the lfs command. The usage for the lfs command is

lfs setstripe <filename|dirname> -s <size> -i <index> -c <count>

where

size - the number of bytes on each OST (0 indicating default of 1 MB) specified with k, m, or g to indicate units of KB, MB, or GB, respectively.
index - the OST index of first stripe (-1 indicating default)
count - the number of OSTs to stripe over (0 indicating default of 4 and -1 indicating all OSTs [limit of 160]).

NOTE: If you change the settings for existing files, the file will get the new settings only if it is recreated.

To change the settings for an existing directory, you will need to rename the directory, create a new directory with the proper settings, and then copy (not move) the files to the new directory to inherit the new settings.

If your application is the type in which each separate process writes to its own file, then we believe that the best option is to not use striping. This can be set by using this command:

> lfs setstripe <directory> -c 1

Then we see that

> lfs find -v testdirectory
OBDS:
0: ost1_UUID ACTIVE
--snip--
testdirectory/
default stripe_count: 1 stripe_size: 0 stripe_offset: -1

This shows we have a stripe count of 1 (no striping), the stripe size is set to 0 (which means use the default), and the stripe offset is set to -1 (which means to round-robin the files across the OSTs).

NOTE: You should always use -1 for stripe_offset.

The stripe count and stripe size are something you can tweak for performance. If your application writes very large files, then we believe that the best option is to stripe across all or a subset of the OSTs on the file system. Striping across all OSTs can be set by using this command:

> lfs setstripe <directory> -c -1

Caution: Not striping large files may cause a write error if the file's size is larger than the space on a single OST. Each OST has a finite size which is smaller than the total Lustre area of all OSTs.

Tags:

What is the default striping for my files?

A file's striping is inherited from its parent directory. The lfs getstripe command can be used to determine the striping for a file, or the default striping for a directory. Note that each file and directory can have its own striping pattern. This means that a user can set striping patterns for his own files and/or directories. The default stripe width for a user may be 1 or 4, you can determine by running lfs getstripe /lustre/scratch/$USER.

This command will also give you information on the striping information for a directory/file.

lfs find -v <directory/file>

Tags:

What is file striping?

The Lustre file system is made up of an underlying set of file systems called Object Storage Targets (OST's), which are essentially a set of parallel IO servers. A file is said to be striped when read and write operations access multiple OST's concurrently. File striping is a way to increase IO performance since writing or reading from multiple OST's simultaneously increases the available IO bandwidth.

Striping will likely have little impact for the following codes:
  • Serial IO where a single processor or node performs all of the IO for an application.
  • Multiple nodes perform IO, access files at different times.
  • Multiple nodes perform IO simultaneously to different files that are small (each < 100 MB).

Lustre allows users to set file striping at the file or directory level. As mentioned above, striping will not improve IO performance for all files. For example, in a parallel application, if each processor writes its own file then file striping will not provide any benefit. Each file will already be placed in its own OST and the application will be using OST's concurrently. File striping, in this case, could lead to a performance decrease due to contention between the processors as they try to write (or read) pieces of their files spread across multiple OST's.

For MPI applications with parallel IO, multiple processors accessing multiple OST's can provide large IO bandwidths. Using all the available OST's on Kraken will provide maximum performance.

There are a few disadvantages to striping. Interactive commands such as ls -l will be slower for striped files. Additionally, striped files are more likely to suffer from data loss from a hardware failure since the the file is spread across multiple OST's.

Please see also: Scratch Space (Lustre) and I/O and Lustre Tips.

Tags:
Syndicate content