The National Institute for Computational Sciences

Darter

Darter: I get the error message "OOM killer terminated this process". What is OOM?

OOM stands for Out Of Memory: this error message indicates that the node has run out of memory. This could be the result of a bug in the code, or of the memory requirements for the given input. Note that because of optimistic memory allocation, an allocation call will probably not return a null pointer even when the node is out of memory. Instead, the program is typically killed at the point where the memory is actually used.

One quick solution might be to run with only four MPI processes per socket, so that each process gets a larger share of the memory on the node.
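The effect of reducing the process count can be estimated with simple arithmetic. The sketch below is only illustrative: the function name `memory_per_process_gb` is hypothetical, and the node figures (32 GB of memory, 2 sockets, 8 cores per socket) are assumed placeholder values that do not come from this FAQ, so substitute the actual numbers for your system.

```python
def memory_per_process_gb(node_mem_gb, sockets, procs_per_socket):
    """Evenly divide a node's memory among the MPI processes placed on it."""
    return node_mem_gb / (sockets * procs_per_socket)

# Assumed placeholder values, not taken from this FAQ.
NODE_MEM_GB = 32
SOCKETS = 2

full = memory_per_process_gb(NODE_MEM_GB, SOCKETS, 8)  # one process per core (assumed 8 cores)
half = memory_per_process_gb(NODE_MEM_GB, SOCKETS, 4)  # four processes per socket
print(f"8 per socket: {full} GB each; 4 per socket: {half} GB each")
```

Under these assumptions, halving the number of processes per socket doubles the memory available to each process, at the cost of leaving cores idle.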

Darter: How do I find out what nodes my batch job is using?

There are a couple of easy ways to find out what nodes are assigned to your batch job. The easiest is to use the checkjob command. Part of the output will return a list of nodes like the following:

Allocated Nodes:      

[84:1][85:1][86:1][87:1][88:1][89:1][90:1][91:1]

This method returns a logical numbering of the nodes. A physical numbering of the nodes, as well as the PID layout, can be obtained by setting the PMI_DEBUG environment variable to 1.
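If the node list is needed inside a script, the bracketed entries in the checkjob output can be parsed with a short helper. This is a minimal sketch that assumes the list always has the `[node:count]` form shown in the sample above. The function name `parse_allocated_nodes` is hypothetical, and reading the second field as a per-node count is also an assumption.

```python
import re

def parse_allocated_nodes(text):
    """Return (node_id, count) pairs from a string of [node:count] entries."""
    return [(int(node), int(count))
            for node, count in re.findall(r"\[(\d+):(\d+)\]", text)]

# Sample line copied from the checkjob output shown above.
sample = "[84:1][85:1][86:1][87:1][88:1][89:1][90:1][91:1]"
nodes = parse_allocated_nodes(sample)
node_ids = [node for node, _ in nodes]
print(node_ids)
```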

Darter: Why do I see the message: SEEK_SET is #defined but must not be for the C++ binding of MPI?

The following error message:

#error "SEEK_SET is #defined but must not be for the C++ binding of MPI" 

is the result of a name conflict between stdio.h and the MPI C++ binding. To avoid it, place the mpi.h include before the stdio.h and iostream includes.
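For a larger code base, the include order can be checked mechanically. The following is a rough sketch, not an official tool: the function name `mpi_included_first` is hypothetical, the check only looks at `#include` lines within a single source string and does not follow nested headers, and treating `cstdio` like `stdio.h` is an added assumption.

```python
import re

# Matches lines such as: #include <mpi.h>  or  #include "mpi.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

# Headers named in the FAQ, plus cstdio (an assumption added here).
CONFLICTING = {"stdio.h", "cstdio", "iostream"}

def mpi_included_first(source):
    """Return True if mpi.h precedes every conflicting header.

    Also returns True when the source does not include mpi.h at all.
    """
    headers = INCLUDE_RE.findall(source)
    if "mpi.h" not in headers:
        return True
    mpi_pos = headers.index("mpi.h")
    return not any(h in CONFLICTING for h in headers[:mpi_pos])

good = "#include <mpi.h>\n#include <stdio.h>\n#include <iostream>\n"
bad = "#include <iostream>\n#include <mpi.h>\n"
print(mpi_included_first(good), mpi_included_first(bad))
```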

Users may also see the following error messages as a result of including stdio or iostream before mpi:
