HiPERiSM's
Technical Reports HiPERiSM - High Performance Algorism Consulting HCTR-2001-6: Compiler Performance 7
1.0 The Stommel Ocean Model: 2-D MPI decomposition + OpenMP Hybrid
1.1 Serial and MPI+OpenMP hybrid code
This is a comparison of parallel MPI+OpenMP hybrid performance with SUN Fortran compilers on the SUN E10000 platform for a floating-point intensive application. The application is the Stommel Ocean Model (SOM77). The Fortran 77 source code was developed by Jay Jayakumar (serial version) and Luke Lonnergan (MPI version) at the NAVO site, Stennis Space Center, MS, and is available at http://www.navo.hpc.mil/pet/Video/Courses/MPI_Finite. The algorithm is identical to the Fortran 90 version discussed in report HCTR-2001-2, but the Fortran 77 version allows more flexibility in the domain decomposition for MPI. The OpenMP hybrid version was developed by HiPERiSM Consulting, LLC, as part of case studies for the training courses.
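To make the idea of a 2-D MPI domain decomposition concrete, the following is a minimal sketch of how an N×N grid could be split into nearly equal blocks over a process grid such as 2×2 or 4×4. It is not taken from the SOM77 source: the function names `block_range` and `decompose_2d`, the row-major rank ordering, and the omission of ghost (halo) cells are all assumptions made for illustration.

```python
def block_range(n, parts, index):
    """Return the half-open range [start, stop) of block `index` when
    n grid points are split into `parts` nearly equal blocks."""
    base, extra = divmod(n, parts)
    # The first `extra` blocks each receive one additional point.
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def decompose_2d(n, prows, pcols):
    """Map each rank (assumed row-major on a prows x pcols process grid)
    to the (row_range, col_range) it would own in an n x n grid."""
    layout = {}
    for rank in range(prows * pcols):
        pr, pc = divmod(rank, pcols)
        layout[rank] = (block_range(n, prows, pr), block_range(n, pcols, pc))
    return layout


if __name__ == "__main__":
    # Hypothetical usage: N=1000 on a 2x2 process grid.
    for rank, (rows, cols) in decompose_2d(1000, 2, 2).items():
        print(rank, rows, cols)
```

In a hybrid model, each MPI process would then share the loops over its own block among its OpenMP threads.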
2.1 MPI+OpenMP parallel performance for N=1000
For problem size N=1000 this section shows parallel performance for the Stommel Ocean Model (SOM77) in a 2-D MPI domain decomposition with the SUN Fortran 77 compiler using a hybrid MPI+OpenMP parallel model. Table 2.1a shows results for the MPI+OpenMP version executed on the SUN E10000 for 1, 2×2, and 4×4 MPI processes and 1, 2, and 4 threads. The speed up shown there is relative to the case of one MPI process and one OpenMP thread. The E10000 is a 64-processor node, and the workload on it impacted the last example.
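The speed up definition used here (time for one MPI process with one thread, divided by time for the hybrid run) can be sketched as follows. The parallel efficiency, which divides the speed up by the number of processors in use (MPI processes times threads), is an added illustration and not a quantity reported in the tables. The timings are placeholder numbers chosen for easy arithmetic, not measurements from this report.

```python
def speedup(t_baseline, t_parallel):
    """Speed up relative to the one-process, one-thread run."""
    return t_baseline / t_parallel


def processors_used(mpi_processes, threads):
    """Processors occupied by a hybrid run: one per thread per process."""
    return mpi_processes * threads


def efficiency(t_baseline, t_parallel, mpi_processes, threads):
    """Speed up divided by the number of processors in use."""
    return speedup(t_baseline, t_parallel) / processors_used(mpi_processes, threads)


if __name__ == "__main__":
    # Placeholder timings in seconds (NOT data from the report),
    # keyed by (MPI processes, OpenMP threads).
    timings = {(1, 1): 100.0, (4, 2): 20.0, (16, 4): 4.0}
    t1 = timings[(1, 1)]
    for (procs, threads), t in sorted(timings.items()):
        print(procs, threads, processors_used(procs, threads),
              speedup(t1, t), efficiency(t1, t, procs, threads))
```

Note that 4×4 MPI processes with 4 threads each occupy 16 × 4 = 64 processors, which is the whole node.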
Table 2.1b shows, for each number of MPI processes, how the value of the OpenMP scheduling parameter isched changes and the corresponding OpenMP speed up.
Fig. 2.1. Time to solution in seconds for SOM 2-D when N=1000 in a hybrid MPI+OpenMP model on the SUN E10000 for 1, 2×2, and 4×4 MPI processes and 1, 2, and 4 threads.
Fig. 2.2. Speed up for SOM 2-D when N=1000 in a hybrid MPI+OpenMP model on the SUN E10000 for 1, 2×2, and 4×4 MPI processes and 1, 2, and 4 threads.
3.1 MPI+OpenMP parallel performance for N=2000
For problem size N=2000 this section shows parallel performance for the Stommel Ocean Model (SOM77) in a 2-D MPI domain decomposition with the SUN Fortran 77 compiler using a hybrid MPI+OpenMP parallel model. Table 3.1a shows results for the MPI+OpenMP version executed on the SUN E10000 for 1, 2×2, and 4×4 MPI processes and 1, 2, and 4 threads. The speed up shown there is relative to the case of one MPI process and one OpenMP thread. The E10000 is a 64-processor node, and the workload on it impacted the last example.
Table 3.1b shows, for each number of MPI processes, how the value of the OpenMP scheduling parameter isched changes and the corresponding OpenMP speed up.
Fig. 3.1. Time to solution in seconds for SOM 2-D when N=2000 in a hybrid MPI+OpenMP model on the SUN E10000 for 1, 2×2, and 4×4 MPI processes and 1, 2, and 4 threads.
Fig. 3.2. Speed up for SOM 2-D when N=2000 in a hybrid MPI+OpenMP model on the SUN E10000 for 1, 2×2, and 4×4 MPI processes and 1, 2, and 4 threads.
Fig. 3.3. Speed up relative to one thread for SOM 2-D when N=2000 in a hybrid MPI+OpenMP model on the SUN E10000 for 4×4, 2×2, and 1 MPI processes, respectively, and 1, 2, and 4 threads.
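The normalization in Fig. 3.3 differs from that in Fig. 3.2: each run is compared with the one-thread run at the same number of MPI processes, which isolates the OpenMP contribution. A minimal sketch of that calculation follows; the function name `thread_speedup` and the timings are hypothetical placeholders, not values from the tables.

```python
def thread_speedup(timings):
    """Given {(mpi_processes, threads): seconds}, return the speed up of
    each run relative to the one-thread run with the same process count."""
    result = {}
    for (procs, threads), t in timings.items():
        t_one_thread = timings[(procs, 1)]
        result[(procs, threads)] = t_one_thread / t
    return result


if __name__ == "__main__":
    # Placeholder timings in seconds (NOT data from the report).
    example = {
        (1, 1): 100.0, (1, 2): 50.0,
        (16, 1): 10.0, (16, 4): 5.0,
    }
    for key, value in sorted(thread_speedup(example).items()):
        print(key, value)
```

With these placeholder values, doubling the threads at one MPI process and quadrupling them at 16 MPI processes both give a thread speed up of 2.0, although their speed ups relative to the single-process, single-thread run differ.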
HiPERiSM Consulting, LLC, (919) 484-9803 (Voice) (919) 806-2813 (Facsimile)