HiPERiSM - High Performance Algorism Consulting
HCTR-2001-3: Compiler Performance 4
1.0 The Stommel Ocean Model: 1-D MPI decomposition
1.1 Serial and MPI code
This report compares parallel MPI performance with SUN Fortran compilers on the SUN E10000 platform for a floating-point intensive application. The application is the Stommel Ocean Model (SOM77). The Fortran 77 source code was developed by Jay Jayakumar (serial version) and Luke Lonnergan (MPI version) at the NAVO site, Stennis Space Center, MS, and is available at http://www.navo.hpc.mil/pet/Video/Courses/MPI_Finite. The algorithm is identical to the Fortran 90 version discussed in report HCTR-2001-2, but the Fortran 77 version allows more flexibility in the domain decomposition for MPI. The calling trees of the serial version (STML77S0) and the MPI parallel version (STMLPI1D) of SOM77, produced by Flint from Cleanscape Software, are shown in Table 1.1.
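As a rough illustration of the 1-D decomposition named in the section title, the grid rows can be split into contiguous strips, one per MPI process. The sketch below is written in Python rather than Fortran 77 and is not taken from STMLPI1D. The function name strip_bounds, the choice to split by rows, and the rule that leftover rows go to the lowest ranks are all assumptions made for illustration.

```python
def strip_bounds(n_rows, n_procs, rank):
    """Return (first, last) inclusive 1-based row indices owned by `rank`.

    Hypothetical helper: rows are dealt out in contiguous strips, and any
    remainder rows are given one each to the lowest-numbered ranks.
    """
    base, extra = divmod(n_rows, n_procs)
    count = base + (1 if rank < extra else 0)
    first = rank * base + min(rank, extra) + 1
    return first, first + count - 1


# Row ranges for N=1000 interior rows on 1, 2 and 4 processes.
for n_procs in (1, 2, 4):
    print(n_procs, [strip_bounds(1000, n_procs, r) for r in range(n_procs)])
```

In an MPI code of this kind each strip would typically also hold a copy of the neighbouring boundary rows, refreshed by message passing before every sweep; that exchange is not shown here.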
The compute kernel of SOM77 is a doubly nested loop that performs a Jacobi iteration sweep over a two-dimensional finite-difference grid; the number of iterations is set to 100. More details of the model are discussed in HiPERiSM courses HC6 and HC8 (see the services page). For this study the number of interior grid points is set to N=1000 in each direction, giving a Cartesian grid of 1000 x 1000.
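The shape of such a kernel can be sketched as follows. This is a generic five-point Jacobi sweep in Python, not the SOM77 source: the weights w_x and w_y, the source term, and the function names are placeholders, and the actual Stommel model coefficients are not reproduced here.

```python
def jacobi_sweep(psi, source, w_x, w_y):
    """Return a new grid after one Jacobi sweep over the interior points.

    psi and source are lists of rows; the outermost rows and columns of psi
    are boundary values and are copied through unchanged. The weights are
    illustrative placeholders, not the SOM77 coefficients.
    """
    n_rows, n_cols = len(psi), len(psi[0])
    new = [row[:] for row in psi]
    for i in range(1, n_rows - 1):          # outer loop over interior rows
        for j in range(1, n_cols - 1):      # inner loop over interior columns
            new[i][j] = (w_x * (psi[i][j - 1] + psi[i][j + 1])
                         + w_y * (psi[i - 1][j] + psi[i + 1][j])
                         + source[i][j])
    return new


def run_jacobi(psi, source, w_x, w_y, iterations):
    """Apply a fixed number of sweeps, each reading only the previous iterate."""
    for _ in range(iterations):
        psi = jacobi_sweep(psi, source, w_x, w_y)
    return psi
```

Because every update reads only values from the previous iterate, the rows of a sweep can be computed independently, which is what makes the loop a natural candidate for domain decomposition.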
2.1 MPI parallel performance
This section shows MPI parallel performance for the Stommel Ocean Model (SOM77) in a 1-D MPI domain decomposition with the SUN Fortran 77 compiler. Figure 2.1 shows the time in seconds (as reported by the MPI_WTIME procedure) and Figure 2.2 shows the corresponding speed up. Table 2.1 summarizes the efficiency values. The speed up is superlinear (efficiency above 1), which shows a clear cache effect.
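For reference, speed up and efficiency are conventionally derived from the wall-clock times as S(p) = T(1)/T(p) and E(p) = S(p)/p, with superlinear speed up meaning S(p) > p. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and the timings in the usage lines are made-up placeholders, not the measurements behind Figures 2.1 and 2.2.

```python
def speedup_and_efficiency(times):
    """Map each process count p to (speed up, efficiency).

    times: dict of process count -> wall-clock seconds; must contain key 1,
    which serves as the baseline.
    """
    t1 = times[1]
    result = {}
    for p in sorted(times):
        s = t1 / times[p]
        result[p] = (s, s / p)
    return result


def is_superlinear(p, speedup):
    """Speed up is superlinear when it exceeds the process count."""
    return speedup > p


# Placeholder timings in seconds, for illustration only.
example_times = {1: 100.0, 2: 40.0, 4: 25.0}
for p, (s, e) in speedup_and_efficiency(example_times).items():
    print(p, s, e, is_superlinear(p, s))
```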
Fig. 2.1. Time to solution in seconds for SOM 1-D when N=1000 on the SUN E10000 for 1, 2 and 4 MPI processes.
Fig. 2.2. Speed up for SOM 1-D when N=1000 on the SUN E10000 for 1, 2 and 4 MPI processes.
HiPERiSM Consulting, LLC, (919) 484-9803 (Voice)
(919) 806-2813 (Facsimile)