
Daphne Lea F. Ochoco, BSCS 4

1. How do I debug Open MPI processes in parallel?

This is a difficult question. Debugging in serial can be tricky: errors, uninitialized variables, stack smashing, etc. Debugging in parallel adds multiple dimensions to this problem: a greater propensity for race conditions, asynchronous events, and the general difficulty of trying to understand N processes executing simultaneously -- the problem becomes quite formidable. This FAQ section does not provide any definitive solutions for debugging in parallel. At best, it shows some general techniques and a few specific examples that may be helpful to your situation. But there are various controls within Open MPI that can help with debugging; these are probably the most valuable entries in this FAQ section.

2. What tools are available for debugging in parallel?


There are two main categories of tools that can aid in parallel debugging:

Debuggers: Both serial and parallel debuggers are useful. Serial debuggers are what most programmers are used to (e.g., gdb), while parallel debuggers can attach to all the individual processes in an MPI job simultaneously, treating the MPI application as a single entity. This can be an extremely powerful abstraction, allowing the user to control every aspect of the MPI job, manually replicate race conditions, etc.

Profilers: Tools that analyze your usage of MPI and display statistics and meta information about your application's run. Some tools present the information "live" (as it occurs), while others collect the information and display it in a post mortem analysis.

Both freeware and commercial solutions are available for each kind of tool.

3. How do I run with parallel debuggers?


See these FAQ entries:

Running under TotalView
Running under DDT
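For reference, a parallel debugger is typically started by wrapping mpirun. A minimal sketch for TotalView is below, assuming a hypothetical application name and process count; the exact invocation for your debugger and Open MPI version is covered in the entries above:

shell$ totalview mpirun -a -np 4 ./my_mpi_application

Here the -a flag tells TotalView to pass the remaining arguments through to mpirun.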

4. What controls does Open MPI have that aid in debugging?


Open MPI has a series of MCA parameters for the MPI layer itself that are designed to help with debugging. These parameters can be set in the usual ways -- for example, on the mpirun command line or via environment variables, as illustrated below. MPI-level MCA parameters can be displayed by invoking the following command:
shell$ ompi_info --param mpi all
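For example, a parameter can be set on the mpirun command line with --mca, or exported as an OMPI_MCA_* environment variable before launching; the application name and process count below are placeholders:

shell$ mpirun --mca mpi_show_handle_leaks 1 -np 4 ./my_mpi_application
shell$ export OMPI_MCA_mpi_param_check=0
shell$ mpirun -np 4 ./my_mpi_application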

Here is a summary of the debugging parameters for the MPI layer:

mpi_param_check: If set to true (any positive value), and when Open MPI is compiled with parameter checking enabled (the default), the parameters to each MPI function are passed through a series of correctness checks. Problems such as passing illegal values (e.g., NULL or MPI_DATATYPE_NULL or other "bad" values) will be discovered at run time and an MPI exception will be invoked (the default of which is to print a short message and abort the entire MPI job). If set to 0, these checks are disabled, slightly increasing performance.

mpi_show_handle_leaks: If set to true (any positive value), OMPI will display lists of any MPI handles that were not freed before MPI_FINALIZE (e.g., communicators, datatypes, requests, etc.).

mpi_no_free_handles: If set to true (any positive value), do not actually free MPI objects when their corresponding MPI "free" function is invoked (e.g., do not free communicators when MPI_COMM_FREE is invoked). This can be helpful in tracking down applications that accidentally continue to use MPI handles after they have been freed.

mpi_show_mca_params: If set to true (any positive value), show a list of all MCA parameters and their values during MPI_INIT. This can be quite helpful for reproducibility of MPI applications.

mpi_show_mca_params_file: If set to a non-empty value, and if the value of mpi_show_mca_params is true, then output the list of MCA parameters to the filename specified. If this parameter is an empty value, the list is sent to stderr.

mpi_keep_peer_hostnames: If set to a true value (any positive value), send the list of all hostnames involved in the MPI job to every process in the job. This can improve the specificity of error messages that Open MPI emits if a problem occurs (i.e., Open MPI can display the name of the peer host that it was trying to communicate with), but it can somewhat slow down the startup of large-scale MPI jobs.

mpi_abort_delay: If nonzero, print an identifying message when MPI_ABORT is invoked showing the hostname and PID of the process that invoked MPI_ABORT, and then delay that many seconds before exiting. A negative value means to delay indefinitely. This allows a user to manually attach a debugger when an error occurs. Remember that the default MPI error handler -- MPI_ERRORS_ARE_FATAL -- aborts the MPI job in the same way as MPI_ABORT, so this parameter can be useful for discovering problems identified by mpi_param_check.

mpi_abort_print_stack: If nonzero, print a stack trace (on supported systems) when MPI_ABORT is invoked.

mpi_ddt_<foo>_debug, where <foo> can be one of pack, unpack, position, or copy: These are internal debugging features that are not intended for end users (but ompi_info will report that they exist).
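As an illustration of combining these parameters, the following sketch (the application name and process count are hypothetical) enables parameter checking and delays each aborting process for 60 seconds, giving you time to attach a debugger before the job exits:

shell$ mpirun --mca mpi_param_check 1 --mca mpi_abort_delay 60 -np 4 ./my_mpi_application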

5. Do I need to build Open MPI with compiler/linker debugging flags (such as -g) to be able to debug MPI applications?
No. If you build Open MPI without compiler/linker debugging flags (such as -g), you will not be able to step inside MPI functions when you debug your MPI applications. However, this is likely what you want -- the internals of Open MPI are quite complex and you probably don't want to start poking around in there. You'll need to compile your own applications with -g (or whatever your compiler's equivalent is), but unless you have a need/desire to be able to step into MPI functions to see the internals of Open MPI, you do not need to build Open MPI with -g.
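For example, compiling your own application with debugging symbols via Open MPI's mpicc wrapper compiler (the source file name is hypothetical):

shell$ mpicc -g my_mpi_application.c -o my_mpi_application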

6. Can I use serial debuggers (such as gdb) to debug MPI applications?


Yes; the Open MPI developers do this all the time. There are two common ways to use serial debuggers:

1. Attach to individual MPI processes after they are running. For example, launch your MPI application as normal with mpirun. Then log in to the node(s) where your application is running and use the --pid option to gdb to attach to your application. An inelegant-but-functional technique commonly used with this method is to insert the following code in your application where you want to attach:

{
    int i = 0;
    char hostname[256];
    gethostname(hostname, sizeof(hostname));
    printf("PID %d on %s ready for attach\n", getpid(), hostname);
    fflush(stdout);
    while (0 == i)
        sleep(5);
}

This code will output a line to stdout giving the name of the host where the process is running and the PID to attach to. It will then spin on the sleep() function forever, waiting for you to attach with a debugger. Using sleep() as the body of the loop means that the processor won't be pegged at 100% while waiting for you to attach. Once you attach with a debugger, go up the function stack until you are in this block of code (you'll likely attach during the sleep()), then set the variable i to a nonzero value. With GDB, the syntax is:
(gdb) set var i = 7
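Putting the pieces together, an attach session might look roughly like the following sketch; the hostname, PID, and breakpoint line number are hypothetical and depend on your application:

shell$ ssh node03
shell$ gdb --pid 12345
(gdb) up
(gdb) set var i = 7
(gdb) break 123
(gdb) continue

Repeat up until the current frame is the block of code shown above, and substitute a source line just after that block for the breakpoint.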

Then set a breakpoint after your block of code and continue execution until the breakpoint is hit. Now you have control of your live MPI application and can use the full functionality of the debugger. You can even add conditionals to allow this "pause" only for specific MPI processes (e.g., MPI_COMM_WORLD rank 0, or whatever process is misbehaving).

2. Use mpirun to launch xterms (or equivalent) with serial debuggers. This technique launches a separate window for each MPI process in MPI_COMM_WORLD, each one running a serial debugger (such as gdb) that will launch and run your MPI application. Having a separate window for each MPI process can be quite handy for low process-count MPI jobs, but it requires a bit of setup and configuration outside of Open MPI to work properly. A naive approach would be to assume that the following would immediately work:
shell$ mpirun -np 4 xterm -e gdb my_mpi_application

Unfortunately, it likely won't work. Several factors must be considered:

1. What launcher is Open MPI using? In an rsh/ssh environment, Open MPI will default to using ssh when it is available, falling back to rsh when ssh cannot be found in the $PATH. But note that Open MPI closes the ssh (or rsh) sessions when the MPI job starts, for scalability reasons. This means that the built-in SSH X forwarding tunnels will be shut down before the xterms can be launched. Although it is possible to force Open MPI to keep its SSH connections active (to keep the X tunneling available), we recommend using non-SSH-tunneled X connections, if possible (see below).

2. In non-rsh/ssh environments (such as when using resource managers), the environment of the process invoking mpirun may be copied to all nodes. In this case, the DISPLAY environment variable may not be suitable.

3. Some operating systems default to disabling the X11 server from listening for remote/network traffic. For example, see this post on the user's mailing list, describing how to enable network access to the X11 server on Fedora Linux.

4. There may be intermediate firewalls or other network blocks that prevent X traffic from flowing between the hosts where the MPI processes (and xterms) are running and the host connected to the output display.

The easiest way to get remote X applications (such as xterm) to display on your local screen is to forego the security of SSH-tunneled X forwarding. In a closed environment such as an HPC cluster, this may be an acceptable practice (indeed, you may not even have the option of using SSH X forwarding if SSH logins to cluster nodes are disabled), but check with your security administrator to be sure. If using non-encrypted X11 forwarding is permissible, we recommend the following:

5. For each non-local host where you will be running an MPI process, add it to your X server's permission list with the xhost command. For example:
shell$ cat my_hostfile
inky
blinky
stinky
clyde
shell$ for host in `cat my_hostfile` ; do xhost +$host ; done

6. Use the -x option to mpirun to export an appropriate DISPLAY variable so that the launched X applications know where to send their output. An appropriate value is usually (but not always) the hostname of the machine containing the display where you want the output, followed by the :0 (or :0.0) suffix. For example:
shell$ hostname

arcade.example.com
shell$ mpirun -np 4 --hostfile my_hostfile \
    -x DISPLAY=arcade.example.com:0 xterm -e gdb my_mpi_application

Note that X traffic is fairly "heavy" -- if you are operating over a slow network connection, it may take some time before the xterm windows appear on your screen.

7. If your xterm supports it, the -hold option may be useful. -hold tells xterm to stay open even when the application has completed. This means that if something goes wrong (e.g., gdb fails to execute, or unexpectedly dies, or ...), the xterm window will stay open, allowing you to see what happened, instead of closing immediately and losing whatever error message may have been output.

8. When you have finished, you may wish to disable X11 network permissions from the hosts that you were using. Use xhost again to disable these permissions:
shell$ for host in `cat my_hostfile` ; do xhost -$host ; done
