CrayPat
Category: Tools-Performance
Description
CrayPat is the Cray performance analysis tool, a package for instrumenting and tracing code. It can trace selected functions or an entire application. Tracing an entire application is strongly discouraged for even moderately large applications: trace files are inherently very large, and the run-time overhead can also be large. CrayPat attempts to account for profiling overhead, but this correction is necessarily imperfect and may produce false hot spots. Inlining small functions can mitigate this.
Profiling with the Cray tools takes multiple steps and requires you to recompile your code. Load the perftools module, then rebuild your code with ftn or cc (the Cray wrappers) so that the appropriate Cray performance tools and libraries are linked in.
CrayPat consists of three major components:
- pat_build - used to instrument the program to be analyzed
- pat_report - a standalone text report generator that can be used to further explore the data generated by instrumented program execution
- Apprentice2 - a graphical analysis tool that can be used, in addition to pat_report, to further explore and visualize the data generated by instrumented program execution
These components are described in greater detail in the pat_build, pat_report, and app2 man pages, respectively. Note that you must have the perftools module loaded first to get these man pages. In addition, more detail about CrayPat usage and environment variables is provided in the pat man page.
% module load perftools
% man pat_build
% man pat_report
% man app2
% man pat
Additional information may be available with the interactive help, invoked by pat_help.
- The Using Cray Performance Analysis Tools (S-2376-52.pdf) document from Cray is the primary source of information about CrayPat.
- You may also refer to the presentations on using CrayPat available at our Training Archives page.
Use
Follow these 10 steps to perform a basic analysis of your program using the CrayPat and Apprentice2 tools. Because CrayPat is a performance analysis tool, not a debugging tool, start with a fully debugged, executable program. The program must be able to run to a planned completion or an intentional termination before CrayPat can be used. Load the programming environment modules first; this ensures that the correct links and libraries are in place for your choice of compiler and target execution environment (for example, a Cray XT series system using CNL on the compute nodes).
Step 1: Access performance tools software
First, the perftools modulefile needs to be loaded.
% module load perftools
Note that the former xt-craypat and apprentice2 modules have been merged into one module called perftools. The following error will occur if a user attempts to load xt-craypat and/or apprentice2.
ERROR: xt-craypat and apprentice2 have been merged into one module
called perftools.
Please run the following to load perftools:
module unload xt-craypat apprentice2
module load perftools
Step 2: Build application keeping .o files
Then with the perftools module loaded, rebuild your application using the compiler option to preserve all .o files (and .a files, if any) created during compilation. CrayPat requires access to the object files (and archive files, if any). For example, if you are working with a Fortran program, enter commands similar to the following:
% ftn -c my_program.f
% ftn -o my_program my_program.o
Or simply use your makefile
% make clean
% make
Step 3: Instrument application for automatic profiling analysis
To use Automatic Program Analysis (APA), use the pat_build command to insert APA code into your program. The instrumented copy is saved under a new name with the suffix +pat; the original program remains unchanged.
Note: When building in your /tmp/work or /tmp/proj area, a copy of the build's .o files is placed, by default, in the $HOME/.craypat directory. This may push your home directory usage over quota. Use the PAT_LD_OBJECT_TMPDIR environment variable to control the location of the .craypat directory. For example: setenv PAT_LD_OBJECT_TMPDIR /tmp/work/$USER
% pat_build -O apa my_program
This produces the instrumented executable my_program+pat.
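The setenv example in the note above is csh syntax. For bash users, a minimal sketch of the same setting might look like the following. The helper name set_pat_objdir is hypothetical, and the default path simply mirrors the /tmp/work/$USER example; adjust it to your site's scratch area.

```shell
#!/bin/bash
# Sketch: keep the .craypat object copies out of $HOME (bash syntax).
# set_pat_objdir is a hypothetical helper, not part of CrayPat.
# The default directory mirrors the /tmp/work/$USER example in the note.
set_pat_objdir() {
    local dir="${1:-/tmp/work/${USER:-$(id -un)}}"
    export PAT_LD_OBJECT_TMPDIR="$dir"
    echo "PAT_LD_OBJECT_TMPDIR=$PAT_LD_OBJECT_TMPDIR"
}

# Typical use, before rebuilding and instrumenting:
#   set_pat_objdir
#   make clean
#   make
#   pat_build -O apa my_program
```

The variable is set before the rebuild here because the note describes the copy as happening during the build.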
Execute the program. During execution, the specified performance analysis data is collected and written to one or more data files, depending on the experiment being conducted. On a Cray XT series CNL system, programs are executed using the aprun command.
Step 4: Run application to get top time consuming routines
% aprun -n <numproc> my_program+pat
Or simply submit a PBS script
% qsub test.pbs
This produces the data file my_program+pat+PID-nodesdt.xf, or multiple files in a directory <sdatadir>, which contain basic asynchronously derived program profiling data.
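The contents of test.pbs are not shown on this page. As a rough sketch only, the helper below writes a minimal batch script that runs the instrumented executable with aprun. The helper name write_pat_job, the job name, and the walltime are all placeholders, and the account and node or core request lines are deliberately left out because their syntax is site-specific.

```shell
#!/bin/bash
# Sketch: write a minimal PBS batch script for the instrumented binary.
# write_pat_job is a hypothetical helper. Arguments:
#   $1 = executable name, $2 = number of processes, $3 = script name.
# Job name and walltime are placeholders; add the account and
# node/core request lines that your site requires.
write_pat_job() {
    local exe="$1" nproc="$2" script="${3:-test.pbs}"
    cat > "$script" <<EOF
#!/bin/bash
#PBS -N craypat_run
#PBS -j oe
#PBS -l walltime=00:30:00

# Run from the directory the job was submitted from.
cd \$PBS_O_WORKDIR
aprun -n $nproc ./$exe
EOF
    echo "$script"
}

# Example:
#   write_pat_job my_program+pat 16 test.pbs
#   qsub test.pbs
```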
Step 5: Use pat_report to process the instrumentation file
After program execution completes or terminates, use the pat_report command to create an .apa report.
% pat_report -T -o report1.txt my_program+pat+PID-nodesdt.xf
or
% pat_report -T -o report1.txt <sdatadir>
This produces three files:
- a sampling-based text report to report1.txt
- an .ap2 file (my_program+pat+PID-nodesdt.ap2), which contains both the report data and the associated mapping from addresses to functions and source line numbers
- an .apa file (my_program+pat+PID-nodesdt.apa), which contains the pat_build arguments recommended for further performance analysis
Once an .apa file is created, you can open it in your preferred text editor and verify if additional instrumentation is needed. Lines that are preceded with # will be ignored. Any option to pat_build may be added to this file. For most users, the file created by pat_report will be sufficient.
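Because lines starting with # are ignored, it can help to view only the lines that are active before reinstrumenting. One possible way to do that is sketched below; apa_active_lines is a hypothetical helper name, and it assumes comment lines begin with # in the first column.

```shell
#!/bin/bash
# Sketch: print only the active lines of an .apa file,
# skipping blank lines and lines that start with '#'.
# apa_active_lines is a hypothetical helper, not part of CrayPat.
apa_active_lines() {
    grep -v -e '^#' -e '^[[:space:]]*$' "$1"
}

# Example:
#   apa_active_lines my_program+pat+PID-nodesdt.apa
```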
Check the sampling report for possible regions to instrument with the CrayPat API.
Step 6: Reinstrument the program
Reinstrument the program, this time using the .apa file.
The most common values for -g tracegroup are:
-g adios     Adaptable I/O System API
-g armci     Aggregate Remote Memory Copy
-g blas      Basic Linear Algebra subprograms
-g caf       Co-Array Fortran (Cray CCE compiler only)
-g chapel    Chapel language compile and runtime library API
-g dmapp     Distributed Memory Application API for Gemini network
-g hdf5      Manages extremely large and complex data collections
-g heap      Dynamic heap
-g io        Includes stdio and sysio groups
-g lapack    Linear Algebra Package
-g mpi       MPI
-g omp       OpenMP API and runtime library API (CCE and PGI only)
-g shmem     SHMEM
-g upc       Unified Parallel C (Cray CCE compiler only)
For a full list, please see man pat_build
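When several trace groups are wanted, a small check against the names listed above can catch typos before pat_build runs. The sketch below is illustrative only: trace_group_flag is a hypothetical helper, and joining groups with commas in a single -g option is an assumption that should be confirmed in man pat_build.

```shell
#!/bin/bash
# Sketch: build a "-g" option from trace group names, rejecting any name
# that is not in the list on this page.
# trace_group_flag is a hypothetical helper, not part of CrayPat.
# Assumption: pat_build accepts a comma-separated list after -g.
KNOWN_GROUPS="adios armci blas caf chapel dmapp hdf5 heap io lapack mpi omp shmem upc"

trace_group_flag() {
    local g joined=""
    for g in "$@"; do
        case " $KNOWN_GROUPS " in
            *" $g "*) joined="${joined:+$joined,}$g" ;;
            *) echo "unknown trace group: $g" >&2; return 1 ;;
        esac
    done
    [ -n "$joined" ] && echo "-g $joined"
}

# Example: prints "-g mpi,io", which can be added to the .apa file
# or passed to pat_build with your other options.
#   trace_group_flag mpi io
```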
Step 7: Instrument application for further analysis
After you have verified the .apa file, rebuild your executable as follows.
% pat_build -O my_program+pat+PID-nodesdt.apa
It is not necessary to specify the program name, as this is specified in the .apa file. Invoking this command produces the new executable, my_program+apa, this time instrumented for enhanced tracing analysis.
Step 8: Run the new instrumented executable
% aprun -n <numproc> my_program+apa
Or simply submit a PBS script
% qsub test.pbs
This produces the new data file my_program+apa+PID2-nodesdt.xf, which contains expanded information tracing the most significant functions in the program.
You can use this file as input to pat_report, for text reports, or apprentice2, for graphical analysis. By default, your code will gather hardware counters from hwpc group 0. This can be overridden at run time by setting the PAT_RT_HWPC environment variable (see man hwpc). To ignore hwpc data in your text reports, use the -H option to pat_report.
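As a small illustration of the run-time override, the bash lines below select a different hardware counter group before the run. The group number 1 is only an example value; the groups actually available are described in man hwpc.

```shell
#!/bin/bash
# Sketch (bash): choose a hardware counter group other than the default
# group 0 for the next run. The value 1 is an illustrative choice only;
# check `man hwpc` for the groups defined on your system.
export PAT_RT_HWPC=1
echo "PAT_RT_HWPC=$PAT_RT_HWPC"

# Then run the instrumented executable as in Step 8, for example:
#   aprun -n <numproc> my_program+apa
```

In a batch job, the export would go in the PBS script ahead of the aprun line.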
Step 9: Generate text report and visualization file (.ap2)
Once you have completed execution, generate an .ap2 file for pat_report or apprentice2.
% pat_report -T -o report2.txt my_program+apa+PID2-nodesdt.xf
This produces two files:
- a tracing report to report2.txt
- an .ap2 file (my_program+apa+PID2-nodesdt.ap2), which contains both the report data and the associated mapping from addresses to functions and source line numbers
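Since the PID part of the data file name changes with every run, it may be convenient to pick up the most recent .xf file automatically. The sketch below assumes the data files are in the current directory and that each run wrote a single .xf file rather than a directory; newest_xf is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the most recently modified data file whose name starts
# with the given instrumented-executable name, e.g. my_program+apa.
# newest_xf is a hypothetical helper, not part of CrayPat.
# Assumes the .xf files are in the current directory.
newest_xf() {
    ls -t "$1"+*.xf 2>/dev/null | head -n 1
}

# Example:
#   pat_report -T -o report2.txt "$(newest_xf my_program+apa)"
```

If no matching file exists, the helper prints nothing, so it is worth checking for an empty result before calling pat_report.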
Step 10: View report in text and/or with Apprentice2
% app2 my_program+apa+PID2-nodesdt.ap2
CrayPat API
To obtain very specific profile data, you can explicitly insert CrayPat calls in your code.
For Fine Grain Instrumentation.
Fortran
include "pat_apif.h"
...
call PAT_region_begin(id, "label", ierr)
do i = 1,n
   ...
enddo
call PAT_region_end(id, ierr)
C/C++
#include <pat_api.h>
...
ierr = PAT_region_begin(id, "label");
for (i = 0; i < n; i++) {
    ...
}
ierr = PAT_region_end(id);
Disable/Enable Recording.
Fortran
include "pat_apif.h"
...
call PAT_record(0)   ! Disable
do i = 1,n
   ...
enddo
call PAT_record(1)   ! Enable
C/C++
#include <pat_api.h>
...
ierr = PAT_record(0);   /* Disable */
for (i = 0; i < n; i++) {
    ...
}
ierr = PAT_record(1);   /* Enable */
Support
This package has the following support level: Supported
Available Versions
All versions of this software are provided by the system vendor.