This is a list of basic terms that might be used in HPC. For more
information about HPC and what it does, see What Is HPC?
BlueGene
IBM's current line of supercomputers is known as BlueGene, and is
built from large numbers of low-power PowerPC processors.
Cabinet
The nodes of a supercomputer are physically mounted in racks inside
cabinets, which also contain the networking and cooling systems, much
like traditional servers.
CLI
A Command Line Interface is a user interface for computers which
uses typed commands rather than buttons or graphical features as would be seen
in a Graphical User Interface. For example, Microsoft's DOS and GNU's shell
(bash) are CLIs. Command Line Interfaces can also be found in third-party
programs such as MATLAB.
CPU
CPU stands for Central Processing Unit, and is the part of a
computer which executes software programs. The term is not specific to a
particular method of execution: units based on transistors, relays, or vacuum
tubes might be considered CPUs. However, for clarity, we will use the term to
refer to individual silicon chips, such as Intel's Pentium or AMD's Athlon.
Thus, a CPU contains one or more cores, and an HPC system may
contain many CPUs. For example, Kraken contains several thousand AMD Opteron
CPUs.
Core
A core is an individual processor: the part of a computer which
actually executes programs. CPUs used to have a single core, and the terms
were interchangeable. In recent years, several cores, or processors, have been
manufactured on a single CPU chip, which may be referred to as a multi-core
processor. It is important to note, however, that the relationship between the
cores may vary radically: AMD's Opteron, Intel's Itanium, and IBM's Cell have
very distinct setups.
Cyberinfrastructure
Cyberinfrastructure consists of computing systems, data storage systems,
data repositories and advanced instruments, visualization environments, and
people, all linked together by software and advanced networks to improve
scholarly productivity and enable breakthroughs not otherwise possible.
DoE
The Department of Energy runs the US National Laboratories, such as Oak
Ridge National Laboratory.
ftp
A protocol, and the utility of the same name, used to transfer files over a
network connection. ftp sends data unencrypted, so for security use sftp,
which transfers files over an ssh connection.
GPU
Graphics cards contain GPUs, or Graphics Processing Units, for
processing visual data. These processors have a different architecture from
standard CPUs, and are considered a promising technology for small-scale
parallel processing. Many of the projects which once required
supercomputers may in the future be done on a single GPU card.
GUI
A Graphical User Interface is a visual medium for users to input
commands, usually using the mouse and keyboard, as opposed to a Command
Line Interface, which uses typed commands. You are probably using a GUI
right now to read this page, for example.
HPC
High Performance Computing is the term often used for large-scale
computers and the simulations and models which run on them.
HPSS
A single research group may create many terabytes of data, so it is important
to have some place to store this data. HPSS, the High Performance Storage
System, is shared between NICS and NCCS, and consists of several petabytes of
disk and tape storage.
Instruction-level Parallelism
Within an individual core,
there are individual sections of a processor which perform different tasks. You
might think of it like an assembly line: with no instruction
pipelining, or parallelism, the first person in the assembly line would sit
idly until the last person in line finished. Thus, each person would spend most
of their time waiting for their turn to work. Instruction pipelining, in the
ideal case, means that each person starts on the next part as soon as they
finish with the previous part, so there is no waiting. A more in-depth
description can be found at Wikipedia.org.
Since this type of parallelism is implemented within a CPU, a programmer
doesn't generally have to worry about this, as opposed to thread
parallelism.
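The assembly-line analogy can be put into numbers. The sketch below is only an idealized model, not a description of any real processor: it assumes every stage takes exactly one clock cycle and that the pipeline never stalls. The function names are made up for illustration.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """The first instruction fills the pipeline; after that, one finishes per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    n, k = 100, 5  # 100 instructions on a hypothetical 5-stage pipeline
    print(unpipelined_cycles(n, k))  # 500
    print(pipelined_cycles(n, k))    # 104
```

Under these assumptions, the ideal speedup approaches the number of stages as the number of instructions grows.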
Linux
Linux is an operating system, similar to UNIX,
which is becoming quite popular for supercomputers due to abundant support, user
familiarity, and comparable performance with optimized UNIX systems. Kraken,
for example, runs on a modified version of Linux.
NCCS
The National Center for Computational Sciences is the Department of Energy's
supercomputing center at Oak Ridge National Laboratory.
NICS
The National Institute for Computational Sciences is the University
of Tennessee's supercomputing center. The center is located at Oak Ridge
National Laboratory, which allows it to share many resources with the
Department of Energy's supercomputing center, NCCS. Simulations from
researchers all over the country are run on the NICS computer,
Kraken.
Node
In traditional computing, a node is an object on a network. For example, on
a home network, your computer, router, and printer might all be nodes.
Supercomputers like Kraken are essentially networks, with nodes that communicate
with each other to solve a larger problem than any single computer could in a
reasonable amount of time. Kraken contains several types of nodes. Compute
nodes are the work-horses of the system, and are much like stripped-down
computers. An I/O node is the interface between the compute nodes and
other computers; that is, it deals with input and output for the system.
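The idea of nodes cooperating on one large problem can be sketched in a few lines. This is a simulation inside a single program, assuming a problem (a sum of squares) that divides cleanly into independent pieces; on a real system each chunk would be sent to a separate compute node, typically with a message-passing library. All names here are hypothetical.

```python
def split_work(items, num_nodes):
    """Divide items into num_nodes nearly equal contiguous chunks."""
    base, extra = divmod(len(items), num_nodes)
    chunks, start = [], 0
    for node in range(num_nodes):
        # The first `extra` nodes each take one additional item.
        size = base + (1 if node < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def compute_node(chunk):
    """Work done independently on one (simulated) compute node."""
    return sum(x * x for x in chunk)


def run_job(items, num_nodes):
    """Hand one chunk to each simulated node, then combine the partial results."""
    partials = [compute_node(chunk) for chunk in split_work(items, num_nodes)]
    return sum(partials)


if __name__ == "__main__":
    print(run_job(list(range(10)), num_nodes=3))  # 285
```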
Open Research
Open research is research that is not protected by proprietary
claims or classified by the government.
ORNL
Oak Ridge National Laboratory: the Department
of Energy's site in east Tennessee.
Scratch Space
Supercomputers generally have what is called scratch space: disk
space available for temporary use. It is analogous to scratch paper. This may
be thought of as a desk: it is where papers are stored while they are waiting to
be worked on or filed away.
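As a rough illustration of the scratch-paper idea, the sketch below writes intermediate results to temporary disk space, reads them back, and lets the space be cleaned up afterwards. It uses Python's standard temporary directory as a stand-in; a real supercomputer's scratch space is a dedicated file system with its own paths and cleanup policies, which are not shown here. The file name is made up.

```python
import tempfile
from pathlib import Path


def sum_via_scratch(values):
    """Write intermediate results to temporary disk space, then read them back."""
    # TemporaryDirectory is a stand-in for a scratch file system (assumption);
    # the directory and its contents are deleted when the block exits.
    with tempfile.TemporaryDirectory() as scratch_dir:
        scratch_file = Path(scratch_dir) / "partial_results.txt"  # hypothetical name
        scratch_file.write_text("\n".join(str(v * v) for v in values))
        total = sum(int(line) for line in scratch_file.read_text().splitlines())
    return total


if __name__ == "__main__":
    print(sum_via_scratch([1, 2, 3]))  # 14
```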
ssh
A protocol for securely connecting to a remote computer, and also the program
which uses this protocol. This connection is generally for a command line
interface, but it is possible to use GUI programs through SSH. For more
information about how to use SSH, see Access.
Thread Parallelism
Thread parallelism refers to the method of splitting a program up into
semi-independent threads. For example, if I needed to clean the
kitchen and the bathrooms, I could do the kitchen, then the bathrooms; this might
be considered thread-serial. If I had a helper, however, they could clean the
kitchen while I cleaned the bathrooms. Two related tasks are being carried out
independently (and simultaneously), so it is an analogy for thread-parallel
execution. This contrasts with other forms of parallelism, such as
Instruction-level Parallelism.
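The cleaning analogy can be sketched with Python's standard thread pool. The clean function is a made-up stand-in for a real task. Note that in the standard Python interpreter, threads doing pure computation largely take turns rather than running truly simultaneously, so this shows the structure of a thread-parallel program rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(room: str) -> str:
    """Stand-in for a semi-independent task (hypothetical)."""
    return f"{room} cleaned"


def clean_serial(rooms):
    """One worker does every room, one after another."""
    return [clean(room) for room in rooms]


def clean_parallel(rooms, helpers=2):
    """Each room is handed to a thread from a pool of helpers."""
    with ThreadPoolExecutor(max_workers=helpers) as pool:
        # map() returns results in input order even though tasks may overlap.
        return list(pool.map(clean, rooms))


if __name__ == "__main__":
    rooms = ["kitchen", "bathrooms"]
    print(clean_serial(rooms))
    print(clean_parallel(rooms))
```

Both versions produce the same results; only the way the work is scheduled differs.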
top500.org
Top500.org publishes a list of the 500 fastest
computers in the world, compiled twice a year.
UNIX
UNIX is an operating system first developed in the 1970s. It has gone
through a number of incarnations, and still has many popular versions. UNIX
dominated supercomputing for many years; however, the high performance computing
community has been increasingly turning to Linux for an operating system.
XSEDE
XSEDE, the Extreme Science and Engineering Discovery Environment, is a group
of open research oriented supercomputing sites funded by the National Science
Foundation. Researchers may go to XSEDE and obtain allocations to run their
programs at one or more of the centers. Also, the centers work together to
share resources and expertise and further scientific understanding.
XT
Cray's current line of supercomputers. Kraken is an XT5 system.