The National Institute for Computational Sciences

Data Transfer

Introduction


The ACF provides several ways to transfer files to and from the NFS home directories, NFS project directories, Lustre project directories, and Lustre scratch directories. Data Transfer Nodes (DTNs) furnish this capability. At the time of this writing, four DTNs are available to ACF users, as listed in the table below.

Data Transfer Node       IP Address     Authentication Supported        File Transfer Protocols Supported  File System Access
datamover1.nics.utk.edu  192.249.6.163  NetID+Duo, InCommon Credential  SCP, SFTP, Globus                  Home, /lustre/haven
datamover2.nics.utk.edu  192.249.6.164  NetID+Duo, InCommon Credential  SCP, SFTP, Globus                  Home, /lustre/haven
datamover3.nics.utk.edu  192.249.6.165  NetID+Duo, InCommon Credential  SCP, SFTP, Globus                  Home, /lustre/haven
datamover4.nics.utk.edu  192.249.6.166  NetID+Duo, InCommon Credential  SCP, SFTP, Globus                  Home, /lustre/haven

The listed DTNs are set up for NetID authentication, Duo two-factor authentication (TFA), and authentication through an InCommon credential, so users can log in to these nodes and perform data transfers. To connect to a DTN, use ssh in a terminal. More information on ssh usage can be found in the Access and Login document. Replace the hostname of the login node with the hostname of the DTN to which you wish to connect, then authenticate with your UT NetID, password, and Duo TFA.
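As an illustration of the connection step, the following shell sketch maps a DTN number to its hostname from the table above and shows how the result would be passed to ssh. The function name dtn_host is hypothetical and is not part of any ACF tooling; <NetID> remains a placeholder for your own UT NetID.

```shell
# Hypothetical helper: print the hostname of one of the four DTNs listed above.
dtn_host() {
    case "$1" in
        1|2|3|4) printf 'datamover%s.nics.utk.edu\n' "$1" ;;
        *) echo "usage: dtn_host <1-4>" >&2; return 1 ;;
    esac
}

# Example (not run here): connect to datamover2, then authenticate with
# your NetID password and Duo when prompted.
#   ssh "<NetID>@$(dtn_host 2)"
```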

SCP and SFTP


SCP and SFTP are both ssh utilities available for transferring files on the ACF. They are slower than Globus, which at the time of this writing offers the fastest data transfers on the ACF, but they remain useful for quick, small transfers. For larger file transfers, please use Globus.

SCP and SFTP are available on Linux and macOS systems by default. Windows 10 users with the most recent updates can use these utilities within Command Prompt or PowerShell. Windows 7 and 8 users must use a third-party utility; FileZilla and WinSCP are reviewed later in this document. For more information on ssh in Windows, see the Access and Login document.

The general syntax of SCP is given below. SCP is useful for transferring a file on your system to the ACF. The <source> argument is the pathname of the file on your system that you wish to copy. The destination consists of your NetID, the hostname of the datamover you wish to use (in this case, datamover1), and the <directory> argument, which specifies the absolute pathname on the ACF in which to place the file.

scp <source> <NetID>@datamover1.nics.utk.edu:<directory>

For example, to copy a file from your home directory on your system to the Documents directory in your ACF home directory, you could use scp ~/<filename> <NetID>@datamover1.nics.utk.edu:~/Documents.
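The scp syntax above extends to downloads and to whole directories. The sketch below wraps both directions in small shell functions. The names acf_upload, acf_download, and run are hypothetical, and the DTN and NETID variables are assumptions you would set yourself (your_netid is only a placeholder). Setting DRY_RUN=1 prints the command instead of running it.

```shell
# Assumed settings; replace with your own values.
DTN="${DTN:-datamover1.nics.utk.edu}"
NETID="${NETID:-your_netid}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy a local file or directory to a directory on the ACF.
# -r makes scp recurse into directories; it is harmless for single files.
acf_upload() {
    run scp -r "$1" "${NETID}@${DTN}:$2"
}

# Copy a file or directory from the ACF to a local path.
acf_download() {
    run scp -r "${NETID}@${DTN}:$1" "$2"
}
```

With DRY_RUN=1 set, calling acf_upload ./input.dat followed by a destination directory prints the scp command that would be executed, which is a convenient way to check the paths before authenticating.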

For SFTP, you specify the hostname of the system to which you intend to connect. For example, to securely transfer files between your local system and the ACF, use the syntax below in a terminal on your local system. Ensure that you enter SFTP from the directory that contains the file(s) you wish to copy to the ACF. You can use the pwd command to determine your current directory before entering SFTP.

sftp <NetID>@datamover1.nics.utk.edu

Once you authenticate with your UT NetID, password, and Duo TFA, you will enter SFTP’s interactive mode. Use the put command to upload a file to the ACF. For example, to upload a file named JobScript.sh to the ACF from your local machine, use put JobScript.sh. This syntax assumes that the JobScript.sh file is in the directory from which you entered SFTP.

To retrieve files from the ACF, use the get command. To download a file named ResearchResults.txt from the ACF to your local machine, use get ResearchResults.txt. SFTP will place the file in the directory from which you entered the utility. To change directories on the ACF, use the cd command. Use the lcd command to change the directory on your local system. Once you are done with SFTP, use the bye or exit commands to exit it. Other commands are available with the SFTP utility. Type help within SFTP to read more about them.
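The interactive commands described above can also be collected in a list and fed to sftp on standard input, which is convenient when the same small transfer is repeated. This is a sketch, not an ACF-provided tool: the function name sftp_commands and its two directory arguments are hypothetical, paths containing spaces are not handled, and it assumes your sftp client still prompts for the password and Duo on the terminal when commands arrive on standard input (OpenSSH normally does).

```shell
# Print the sftp commands for one upload and one download.
# $1 = local directory, $2 = directory on the ACF (both are placeholders).
sftp_commands() {
    cat <<EOF
lcd $1
cd $2
put JobScript.sh
get ResearchResults.txt
bye
EOF
}

# Example (not run here): pipe the command list into an sftp session.
#   sftp_commands "$PWD" <directory> | sftp <NetID>@datamover1.nics.utk.edu
```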

Globus Web-based Data Transfer


ACF users can use the web-based Globus file transfer interface to perform data transfers to and from ACF-supported resources. The visual interface makes it easy to move, back up, or restore data. To get started, visit the Globus website and consult the Getting Started guide. Further documentation on this capability is available in the Globus How-To documentation.

Please note: The Globus web-based interface and the Globus API work only with the University of Tennessee CILogon InCommon credential. The ACF username, password, and Duo or RSA two-factor authentication credentials will not work with ACF datamovers when using Globus. That method would be impractical in any case, since you would have to authenticate for every set of data transfers. The X.509 CILogon InCommon credential issued by the University of Tennessee allows unattended data transfers, retries of data transfers, and use of the API without a username/password-based authentication credential.


The Globus GUI for file transfer between ACF DTNs and Newton DTNs

Globus Endpoints

The Globus endpoints to access ACF and Newton DTN resources are the following:

  • nics#datamover1
  • nics#datamover2
  • nics#datamover3
  • nics#datamover4

One of the latest features of Globus is Globus Connect Personal. Globus Connect Personal turns your personal computer into a Globus endpoint so you can share and transfer files to/from a local machine - campus server, desktop computer or laptop.

Setting up X.509 authentication

To use the GSISCP and Globus GridFTP transfer services, each user needs to do three things:

  1. In the NICS portal, associate their NetID with their NICS account.
  2. In the NICS portal, set up their X.509 user certificate by associating their CILogon InCommon credential with their NICS account.
  3. Authenticate to the Globus web-based interface for file transfers using the University of Tennessee X.509-based CILogon InCommon credential, not the ACF username/password and two-factor credentials.

The first two steps are shown in the image below. To start, log in to the NICS portal at https://portal.nics.utk.edu, click on "To associate your UTK or UTHSC NetID with your NICS account", and follow the prompts. Then click on the button to associate your InCommon credential with the NICS infrastructure. The buttons are shown in the example portal view below:

To set up this credential, select "University of Tennessee" as the identity provider and log in with your University of Tennessee NetID username and password when prompted by the InCommon CILogon interface. You will set a password for your X.509 credential. Please note and remember this password, as you will use it when setting up Globus or GSISCP with X.509 credentials. Once you complete the CILogon process, the Distinguished Name (DN) of your X.509 credential will be associated with the NICS ACF infrastructure and will be available for use. Screenshots of the step-by-step process are shown below.

Step 0: Log in to the Newton login node in order to save the credential you are about to create in Step 4

Step 1: Select University of Tennessee as the Identity Provider

Step 2: Authenticate with your UT NetID and password

Step 3: Enter a password for your new InCommon credential (and remember this!)

Step 4: You will get a screen with a link to download your certificate. Click to download and save it locally. You could also use wget with this URL from Newton to save it to your Newton home directory. Access to this certificate is time-limited, so you may have to move quickly to download it.

This X.509 distinguished name (DN) information is added to /etc/grid-security/grid-mapfile on the ACF DTNs. The update runs every hour, so you may have to wait up to an hour before you can use this authentication method. Once your credential is in /etc/grid-security/grid-mapfile on the ACF DTNs, you are ready to start using Globus for data transfers. If you want to use GSISCP, follow the instructions in the next paragraph to set it up. The ACF DTNs are configured to use CILogon OAuth credentials. For example, the nics#datamover1 Globus endpoint is set up to use your CILogon credential, so just log in to Globus, select the nics#datamover1 endpoint, and authenticate with your CILogon password. No other authentication method works for the ACF DTNs with Globus and the GSISCP protocols (one cannot use NetID and password, for example).

To use your new X.509 credential with GSISCP, you need to obtain a credential pem file and put it in your home directory on Newton. The file must be saved as ~/.globus/usercred.pem with permissions 600. If you did not save the credential following the instructions above, you can get a new credential pem file by returning to https://cilogon.org/ and going through the process again to generate a new certificate. You will be prompted for a credential password; choose one and be sure to remember it for future reference. The CILogon page will give you a link to download the certificate, as shown below.
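Placing the downloaded file at the required location with the required permissions could look like the following sketch. The function name install_usercred is hypothetical, and the name of the downloaded file depends on how you saved it, so it is passed as an argument rather than assumed.

```shell
# Install a downloaded credential as ~/.globus/usercred.pem with mode 600.
# $1 = path to the downloaded pem file.
install_usercred() {
    src="$1"
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    mkdir -p "$HOME/.globus" &&
    cp "$src" "$HOME/.globus/usercred.pem" &&
    chmod 600 "$HOME/.globus/usercred.pem"   # readable and writable by the owner only
}
```

Run it on Newton, where the credential needs to live, and confirm the result with ls -l ~/.globus/usercred.pem.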

Once the credential is in ~/.globus/usercred.pem, log in to one of the Newton DTNs (dtn1.newton.utk.edu or dtn2.newton.utk.edu) and run grid-proxy-init, which will prompt you for your CILogon credential password. This creates a proxy credential that can be used with GSISCP. After running grid-proxy-init, you can use gsiscp without typing a username or password. The default credential lifetime is 12 hours. See the following transcript for an example.
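The sequence described above can be sketched as a small shell function. This is an illustration under stated assumptions rather than the site's own procedure: the function name gsiscp_upload is hypothetical, the destination directory is a placeholder you supply, and it assumes grid-proxy-init, grid-proxy-info, and gsiscp are available on the Newton DTN where you run it.

```shell
# Sketch only: assumes the Globus Toolkit proxy tools and gsiscp are installed.
# $1 = local file to send, $2 = absolute directory on the ACF (placeholder).
gsiscp_upload() {
    src="$1"
    dest_dir="$2"
    cred="$HOME/.globus/usercred.pem"

    # Stop early if the credential file has not been installed yet.
    if [ ! -r "$cred" ]; then
        echo "missing credential: $cred" >&2
        return 1
    fi

    # Create a proxy only if no valid one exists; grid-proxy-init prompts
    # for the CILogon credential password.
    if ! grid-proxy-info -exists 2>/dev/null; then
        grid-proxy-init || return 1
    fi

    # With a valid proxy, no username or password is requested here.
    gsiscp "$src" "datamover1.nics.utk.edu:$dest_dir"
}
```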

Globus Web-based File Transfer Performance Example

As an example of the power and utility of the Globus web-based file transfer capability, one thousand one-gigabyte files were created in /gamma/victor (Newton GPFS file system) and transferred to /lustre/medusa/victor on datamover1.nics.utk.edu using the Globus web-based file transfer tool. After logging in to https://www.globus.org and setting up the Globus transfer tool with the nics#datamover1 endpoint on the left side and UTK OIT Newton DTN1 on the right side, all of the one-gigabyte files were selected on the right and the transfer button at the top was clicked.

Globus managed the file transfer and, in many cases, performed file transfers in parallel. Using this method, the 1000 files (1 terabyte of data in total) were transferred in 32 minutes at an average rate of 550 MB/s.
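For planning purposes, the relationship between data size, transfer rate, and elapsed time can be checked with a line of arithmetic. The helper below is a hypothetical sketch (the name transfer_minutes is not an ACF tool) and assumes decimal units, that is, 1 GB = 1000 MB. At a sustained 550 MB/s, 1 terabyte works out to roughly 30 minutes, close to the 32 minutes reported above; actual rates will vary with file sizes and system load.

```shell
# Estimate transfer time in minutes.
# $1 = data size in GB (decimal, 1 GB = 1000 MB), $2 = sustained rate in MB/s.
transfer_minutes() {
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", gb * 1000 / rate / 60 }'
}

transfer_minutes 1000 550   # prints 30.3
```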


Using FileZilla to Transfer Files to/from the ACF

FileZilla will work with file transfers to the ACF. Please only use the DTNs described in the Data Transfer Node Servers section for data transfer and refrain from doing data transfers to ACF login nodes. ACF login nodes are not optimized for data transfer.


To use the FileZilla client with your NetID, password, and Duo multi-factor authentication, follow these steps. (Note: FileZilla also works with the ACF data transfer nodes that use RSA, but that is not covered in this section.)

  • Open your FileZilla client
  • Select "File" -> "Site Manager"

  • Click on "New Site" and complete the following sub-steps

    • For Host, enter one of the following: datamover1.nics.utk.edu, datamover2.nics.utk.edu, datamover3.nics.utk.edu, or datamover4.nics.utk.edu (these are the DTNs that support Duo). (Note: FileZilla will also connect to datamover5, datamover6, datamover7, and datamover8; however, these nodes are set up for ACF authentication with username and RSA two-factor authentication. Most ACF users are being directed to use Duo for multi-factor authentication, though some may need to use RSA for various reasons.)
    • For Protocol, select "SFTP - SSH File Transfer Protocol"
    • For Logon Type, select "Interactive" (this is the only type that works with Duo)
    • Enter your NetID for User
    • Rename the entry under "My Sites" from "New Site" to datamover1, datamover2, datamover3, or datamover4, whichever you entered for Host
    • Note: By default, FileZilla prompts for authentication for each data transfer that you initiate. To prevent this, click on the "Transfer Settings" tab and check the box next to "limit the number of simultaneous connections." The value below this box should be set to 1 and should not be changed. This keeps the SFTP session active, so transfers reuse the original authentication instead of prompting for each file transfer.

  • Click Connect
  • When the FileZilla client prompts, enter your password and click OK

  • When the FileZilla client prompts again for the Duo challenge, enter "1" in the "Password" field and click OK

  • You should receive a Duo push request on your smartphone; select "accept" to authorize the authentication
  • You should now be connected successfully with FileZilla


Using WinSCP to Transfer Files to/from the ACF

WinSCP works for file transfers to/from the ACF. Please only use the DTNs described in the Data Transfer Node Servers section for data transfer and refrain from doing data transfers to ACF login nodes. ACF login nodes are not optimized for data transfer.


To use the WinSCP client with your NetID, password, and Duo multi-factor authentication (on datamover1-4), follow these steps. (Note: WinSCP also works with the ACF data transfer nodes that use RSA, but that is not covered in this section.)

  • Open the WinSCP client and click on "New Site"

  • Fill in "Host name", "User name", and "Password". Port should be 22 (the default). Then press Return.

  • If you have not logged in to this server before, you will get a Warning dialog asking to add the server host key. Click "Yes".

  • The authentication banner will be displayed. Click "Continue".

  • The login will continue and prompt for the Duo push. Enter "1" and click OK or press Return.

  • Once you authenticate, the WinSCP application screen appears, showing the local machine on the left side and the system you logged in to on the right side.

To connect to datamover5 through datamover8, follow the same steps; you will not receive the second password challenge described above, since these DTNs use RSA authentication.

Last Updated: 11 / 14 / 2019