The posted code is basically just an infinite loop designed to "pause" execution whilst you attach the debugger. You can then use the debugger controls to exit this loop and the program will continue. You can write an equivalent loop in Fortran, so provided you're happy to get the hostname and PID by another method (see MPI_Get_processor_name, as mentioned by VladimirF in his answer; if you are happy to use compiler extensions, both the GNU and Intel compilers provide a getpid extension), you could use something like the following (thanks to this answer for the sleep example).

Whether before or after MPI_Init? If you want to print the rank, it must be after. MPI_Get_processor_name must also be called after MPI_Init. It is normally recommended to call MPI_Init as early as possible in your program.

    character(MPI_MAX_PROCESSOR_NAME) :: hostname
    call MPI_Get_processor_name(hostname, hostname_len, ie)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ie)
    write(*,*) "PID ", pid, " on ", trim(hostname), " ready for attach is world rank ", rank
    ! this serves to block the execution at a specific place until you unblock it in GDB by setting i=0

Important note: if you compile with optimizations, then the compiler can see that i=0 is never true and will remove the check completely. You must lower your optimizations or declare i as volatile. Volatile means that the value can change at any time and the compiler must reload its value from memory for the check.

The above code will print, for example,

    > mpif90 -ggdb mpi_gdb.f90
    PID 2355 on linux.site ready for attach is world rank 0
    PID 2358 on linux.site ready for attach is world rank 3
    PID 2357 on linux.site ready for attach is world rank 2
    PID 2356 on linux.site ready for attach is world rank 1

Then attach the debugger (in a different terminal window, of course) and you get something like

    MAIN_ () at mpi_gdb.f90:26
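For reference, the fragments above could be assembled into a complete program roughly as follows. This is only a sketch under a few assumptions: the program name, the declarations, and the placement of the wait loop are filled in for illustration and are not taken from the original listing, and getpid and sleep are used as the non-standard extensions gfortran provides (with the Intel compiler they may need an extra module).

```fortran
program mpi_gdb
  use mpi
  implicit none

  character(MPI_MAX_PROCESSOR_NAME) :: hostname
  integer :: rank, ie, pid, hostname_len
  ! volatile: the compiler must reload i from memory on every check,
  ! so the exit test below is not optimized away
  integer, volatile :: i

  ! MPI_Init first: the rank and processor name are only available afterwards
  call MPI_Init(ie)
  call MPI_Get_processor_name(hostname, hostname_len, ie)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ie)

  ! non-standard extension (assumed here to be the gfortran intrinsic)
  pid = getpid()

  write(*,*) "PID ", pid, " on ", trim(hostname), " ready for attach is world rank ", rank

  ! block here until i is set to 0 from the debugger
  i = 1
  do
    ! non-standard extension; keeps the waiting process from spinning a core
    call sleep(1)
    if (i == 0) exit
  end do

  call MPI_Finalize(ie)
end program mpi_gdb
```

Once gdb is attached to one of the printed PIDs, something like `set var i = 0` followed by `continue` should release that process from the loop (you may first need to move up to the frame of the main program for i to be visible).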
What I actually often do is just run the MPI job locally and see what it does. Then if it hangs I use top to find out the PIDs of the processes, and usually one can easily guess which rank is which from the PIDs (they tend to be consecutive and the lowest one is rank 0). In one such run rank 0 was process 1641, with the other ranks following in order. Then I just do gdb -pid and examine the stack and local variables in the processes. The most important thing is to get a backtrace, so just type bt in the console.

This works well when examining deadlocks, less well when you have to stop at some specific place. Then you have to attach the debugger early.

I don't think the flush is necessary in Fortran. I think Fortran write and print flush as necessary, at least in the compilers I use. But you definitely can use the flush statement:

    use iso_fortran_env

Just put that flush after your write where you print hostname and pid. But as I said, I would just start with printing alone.

What you then do is log in to that node and attach gdb to the right process with something like

    gdb -pid 12345

For sleep you can use the non-standard sleep intrinsic subroutine available in many compilers, or write your own.
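As a minimal sketch of the flush suggestion, assuming the standard output_unit constant from iso_fortran_env is the unit being written to: the PID and host name below are placeholder values for illustration only, and in the real program they would come from getpid and MPI_Get_processor_name.

```fortran
program flush_demo
  use iso_fortran_env, only: output_unit
  implicit none

  integer :: pid
  character(len=64) :: hostname

  ! placeholder values for illustration only (hypothetical host name)
  pid = 12345
  hostname = "node01"

  write(output_unit, *) "PID ", pid, " on ", trim(hostname)
  ! make sure the line is visible before the program blocks or hangs
  flush(output_unit)
end program flush_demo
```

The flush statement is standard since Fortran 2003, so unlike getpid and sleep it should not depend on compiler extensions.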