• National Institute for Computational Sciences is a UT/ORNL Partnership

HPSS FAQ

General Frequently Asked Questions

There are two main ways to interact with HPSS, both through the hsi command. Run with no arguments, hsi opens an interactive session in which many familiar Linux file system commands can be used on the HPSS file system. Alternatively, hsi commands can be given directly on the command line, together with the files you would like to transfer or manipulate. Some of these commands can be found here.

How do I obtain an HPSS account?

NICS users are automatically given an HPSS account. The HPSS login name is the same as on all other NICS systems. Authentication to HPSS uses the RSA OTP token issued to the user by NICS.

How do I access HPSS?

Users may access HPSS from any NICS high-performance computing (HPC) system with the Hierarchical Storage Interface (HSI) utility. Start it by typing the command hsi in your Linux environment. To exit, type quit.

What is HSI?

The HSI utility allows automatic authentication and provides a user-friendly command-line and interactive interface to HPSS.

What is the best way to transfer a large number of small files?
What is the optimal transfer size?

HPSS performance is greatly improved when the transfer size is between 1 and 10 GB. For that reason, users with large numbers of relatively small files should combine those files into one or a few 1 GB to 10 GB files and then transfer the larger files. The files can be combined with tar on the HPC system, or the archive can be created on the fly with a command similar to tar cvf - some_dir | hsi put - : somedir.tar. This command writes a tar archive of the some_dir subdirectory to standard output and stores it as a file named somedir.tar on HPSS. HPSS also supports the htar command.
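When the files do not fit in a single 1 GB to 10 GB archive, they can be grouped into several batches first. The following is a minimal sketch of one way to do that, not an NICS-provided tool: the function name plan_batches and the greedy grouping strategy are assumptions made for illustration.

```shell
# Greedy batching sketch (hypothetical helper, not part of HSI or HTAR).
# Assigns files to numbered batches so that each batch stays at or under
# LIMIT_BYTES; a single file larger than the limit gets a batch of its own.
# Usage: plan_batches LIMIT_BYTES FILE...
# Output: one "batch_number<TAB>file" line per input file.
plan_batches() {
    limit=$1; shift
    batch=0
    used=0
    for f in "$@"; do
        size=$(wc -c < "$f")
        size=$((size + 0))   # normalize padded wc output on some systems
        # Start a new batch if adding this file would exceed the limit.
        if [ "$used" -gt 0 ] && [ $((used + size)) -gt "$limit" ]; then
            batch=$((batch + 1))
            used=0
        fi
        used=$((used + size))
        printf '%s\t%s\n' "$batch" "$f"
    done
}
```

For example, plan_batches 10000000000 some_dir/* prints a batch number next to each file, using a limit of roughly 10 GB. The files of each batch could then be archived together, for instance with tar cf - FILES | hsi put - : batch0.tar or with htar.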

Can I use HSI without entering my passcode each time?
Can I use HSI in batch scripts?

If you log in to Kraken using the passcode from your OTP token, you can run HSI without entering your passcode each time. You can also run batch scripts that use HSI in the "hpss" queue. If you logged in to kraken-gsi using GSI authentication, you will be prompted for your passcode each time you use HSI.
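As a rough illustration of a batch job for the "hpss" queue, the sketch below generates a small PBS job script. Only the queue name comes from this FAQ. The function name make_hpss_job, the job name, and the walltime value are assumptions, and your site may require additional resource requests.

```shell
# Hypothetical helper: print a PBS job script that archives DIR to ARCHIVE
# on HPSS with htar. The job name and walltime are illustrative assumptions.
# Usage: make_hpss_job DIR ARCHIVE
make_hpss_job() {
    dir=$1
    archive=$2
    cat <<EOF
#!/bin/bash
#PBS -N hpss_archive
#PBS -q hpss
#PBS -l walltime=01:00:00
cd "\$PBS_O_WORKDIR"
htar -cvf "$archive" "$dir"
EOF
}
```

A possible use is make_hpss_job run42 run42.tar > archive.pbs followed by qsub archive.pbs, where run42 stands for a directory in the submission directory.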

Can I run HSI from my workstation?

Because HSI is a third-party package, clients may be available for your system; however, NICS currently supports access to HPSS only through HSI clients on the HPC systems.

Is the HPSS system able to be accessed by more than one process at a time?

Nothing prevents you from running a script that opens multiple simultaneous connections to HPSS. However, the HPSS system administrator recommends no more than 2 or 3 connections at a time, because every additional connection degrades the performance of the overall system.
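One way to stay within that recommendation in a script is to cap the number of parallel workers. The sketch below is an illustration only: the function name transfer_limited is hypothetical, and it relies on the -P option of xargs, which is widely available but not required by POSIX.

```shell
# Hypothetical helper: run a command once per input line, with at most MAX
# copies running at a time. MAX is capped at 3, the upper end of the
# recommendation above.
# Note: xargs splits on whitespace, so file names must not contain spaces.
# Usage: list-of-files | transfer_limited MAX COMMAND [ARGS...]
transfer_limited() {
    max=$1; shift
    if [ "$max" -gt 3 ]; then
        echo "capping parallel connections at 3 (requested $max)" >&2
        max=3
    fi
    xargs -n 1 -P "$max" "$@"
}
```

For instance, ls *.tar | transfer_limited 2 hsi put would start one hsi put per archive, two at a time, assuming the archives are in the current directory.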

Why has my HPSS access from Kraken been disabled?

HPSS access from Kraken has most likely been disabled because you were archiving too many small files at a time, which puts a lot of overhead on the system. This archiving system is not designed to handle thousands of small files at a time: as HPSS is configured, only a limited number of tapes are designated to hold archived small files, and these tapes are already full.

To have your access restored, contact us and confirm that you understand this and that you will use htar from now on to bundle your files. htar works, for the most part, the same as regular tar, and we would prefer archives of about 10 GB each. Documentation can be found here. Once we have your confirmation, we will restore your access to HPSS. Our system staff would also like you to remove all of your archived small files from HPSS and archive them again using htar.

I have a huge amount of data on SDSC HPSS. Is there any way to connect directly to SDSC HPSS from Bigred to download my data?

Please refer to the following page, and the several method-specific pages here: (DIRECT)

How do I retrieve a single directory from HPSS?

To retrieve a single directory from HPSS use the -R option. For example,

>hsi
>get -R dir1

How do I retrieve a single file from a tar archive on HPSS?

Use "hsi ls -l" to show the tar file in HPSS:

>hsi ls -l file.tar
...
-rw-------   1 davem     davem          12800 Oct  2  2008 file.tar
Use "htar" to list the contents of the tar file:
> htar -tvf file.tar
HTAR: drwxr-xr-x  davem/nicsstaff          0 2008-10-02 10:47  dir2/
HTAR: -rw-r--r--  davem/nicsstaff       1492 2008-10-02 10:47  dir2/data.pbs
HTAR: -rw-r--r--  davem/nicsstaff       1924 2008-10-02 10:47  dir2/mpi.pbs
Use "htar" to extract a single file (name must match what is listed by the above command):
> htar -xvf file.tar dir2/data.pbs
HTAR: -rw-r--r--  davem/nicsstaff       1492 2008-10-02 10:47  dir2/data.pbs
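
Because the member name must match the listing exactly, a small wrapper can check the listing before extracting. This is only a sketch: the function name htar_get_member is hypothetical, and it assumes that the member path is the last whitespace-separated field of each listing line, as in the output shown above.

```shell
# Hypothetical helper: extract one member from an HTAR archive, but only if
# the archive listing contains exactly that member path.
# Assumes member paths contain no whitespace.
# Usage: htar_get_member ARCHIVE MEMBER
htar_get_member() {
    archive=$1
    member=$2
    # Compare the last field of every listing line with the requested path.
    if htar -tvf "$archive" |
        awk -v m="$member" '$NF == m { found = 1 } END { exit !found }'
    then
        htar -xvf "$archive" "$member"
    else
        echo "not found in $archive: $member" >&2
        return 1
    fi
}
```

For example, htar_get_member file.tar dir2/data.pbs extracts the file, while a misspelled path produces an error message and a nonzero return status instead of an extraction attempt.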

How do I verify the contents of an archive during creation?

HTAR provides the “-Hverify=option[,option...]” command line option, which causes HTAR to first create the archive file normally and then go back and check its work by performing a series of checks on the archive file. You choose the types of checks to be performed by specifying one or more comma-separated options. Each option can be an individual item, the keyword “all”, or a numeric level of 0, 1, or 2. Each numeric level includes all of the checks for lower-valued levels and adds additional checks. The verification options are:

all Enables all possible verification options except “paranoid”
info Reads and verifies the tar-format headers that precede each member file in the archive
crc Reads each member file and recalculates the Cyclic Redundancy Checksum (CRC), and verifies that it matches the value that is stored in the index file.
compare This option directs HTAR to compare each member file in the archive with the original local file.
paranoid This option is only meaningful if “-Hrmlocal” is specified, which causes HTAR to remove any local files or symbolic links that have been successfully copied to the archive file.

If “paranoid” is specified, then HTAR makes one last check before removing local files or symlinks to verify that:
a. For files, the modification time has not changed since the member file was copied into the archive
b. The object type has not changed, for example, if the original object was a file, it has not been deleted and recreated as a symlink or directory, etc.
It is also possible to specify a verification option such as “all”, or a numeric level, such as 0, 1 or 2, and then selectively disable one or more options. In practice, this is rarely, if ever, useful, but the following options are provided:
0 Same as “info”
1 Same as “info,crc”
2 Same as “info,crc,compare”
nocompare Disables comparison of member files with their original local files
nocrc Disables CRC checking
noparanoid Disables checking of modification time and object type changes
htar -cvf case3/myproject.tar -Hcrc -Hverify=2 tim* --- (1)
htar -Hcrc -tvf case3/myproject.tar --- (2)
In the example above,
(1) the archive file is created (-c) with verification level 2, including CRC generation and checking. The verbose output option (-v) is used to cause HTAR to display information about each file that is added during the create phase, and then verified during the verification phase.
(2) the archive file is then listed (-t) using the "-Hcrc" option to cause HTAR to display the CRC value for each member file.
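
The comma-separated form of the verification options, including selective disabling such as “all,nocrc”, can be assembled and checked with a small helper. The sketch below is illustrative only: the function name hverify_arg is hypothetical, and it accepts exactly the option names listed in this section.

```shell
# Hypothetical helper: build an -Hverify= argument from the option names
# listed in this FAQ, rejecting anything else.
# Usage: hverify_arg OPTION...
hverify_arg() {
    spec=
    for opt in "$@"; do
        case $opt in
            all|info|crc|compare|paranoid|0|1|2|nocompare|nocrc|noparanoid) ;;
            *) echo "unknown verify option: $opt" >&2; return 1 ;;
        esac
        # Join the options with commas.
        spec=${spec:+$spec,}$opt
    done
    [ -n "$spec" ] || return 1
    printf '%s\n' "-Hverify=$spec"
}
```

For example, hverify_arg all nocrc prints -Hverify=all,nocrc, which could be passed to htar as in htar -cvf case3/myproject.tar "$(hverify_arg all nocrc)" tim*.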