Prerequisites:
This course is intended for experienced Fortran and ANSI C
programmers who have used serial platforms ranging from workstations to mainframes. Knowledge
of programming vector supercomputers is an advantage but is not required. No prior
knowledge of programming parallel computers is assumed, but those experienced in
message-passing paradigms will have a head start in applications development under OpenMP.
Objectives:
This training course is primarily intended to introduce OpenMP
to programmers who have no prior experience in parallel computing. As a secondary
objective, the target audience also includes those with a background in vector processing
systems or RISC workstations, as well as MPI programmers who wish to understand the OpenMP
paradigm and learn how to use it. It is anticipated that this approach complements other
introductions (1-4) to the OpenMP Language Standard (5) for applications
developers. The course teaches participants how to write parallel Fortran codes using the
OpenMP programming paradigm. Target platforms include Shared Memory Parallel (SMP)
systems ranging from workstations to large SMP high-performance computers. Special attention is
devoted to issues that arise in porting legacy code to SMP OpenMP implementations.
Duration:
2 days organized as follows:
Day | Period | Chapter | Topic
----|--------|---------|---------------------------------------------
 1  |   AM   |    1    | Porting Legacy Code to Parallel Computers
    |        |    2    | Parallel Programming Concepts
    |        |    3    | Models, Paradigms, and Styles
    |        |    4    | The OpenMP Paradigm of Parallel Programming
 1  |   PM   |    5    | OpenMP Language Specification
 2  |   AM   |    6    | Examples and Exercises
    |        |    7    | Case Studies 1 to 7
 2  |   PM   |    8    | Case 8: The Princeton Ocean Model
    |        |    9    | Using OpenMP
Format:
The course is presented in a course workbook format
that is intended for use in one of three ways:
- (a) classroom presentation,
- (b) self-paced study,
- (c) as a reference.
For options (a) and (b) the course workbook is accompanied
by a syllabus.
Some fundamental design principles in developing the course
material and workbook are:
- Orderly build-up of knowledge of parallel language paradigms
and hardware before entering into the details of OpenMP.
- Separation of the description of how to use the OpenMP
language from the explanation of parallel work scheduling, data dependencies, recurrences,
and memory models.
- Separation of the discussion of OpenMP directives and
clauses from their itemization in simple, comprehensible formats.
- Providing examples and case studies that can be immediately
compiled and executed on an OpenMP host system, and also compared to MPI equivalents.
The workbook includes all source code, sample input,
output, and makefiles needed to compile and execute all programs discussed in the text.
Review of Sections:
The training workbook is arranged into ten sections,
described as follows.
- Porting Legacy Code to Parallel Computers. This chapter
reviews developer perceptions of parallel programming, considerations for legacy codes, and
how to look for parallelism in them. Also covered are guidelines for porting to SMP
computers, typical parallel performance problems, and some lessons learned in SMP
parallelization.
- Parallel Programming Concepts. This chapter starts from
the basics by describing SMP architectures and parallel performance (with two case
studies). Issues such as memory management, synchronization, and parallel work
scheduling are then described, the latter with generic pseudo-code examples (6). A
detailed discussion of data dependencies and recurrence relations follows, with
emphasis on which code constructs inhibit safe parallel code (see the sketch below).
In conclusion, some parallelization strategies and common parallel programming errors
are itemized.
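As a flavor of the dependency discussion, here is a minimal sketch (not taken
from the workbook) contrasting an independent loop, which is safe to parallelize,
with a first-order recurrence, which must remain serial:

    ! Minimal sketch, not from the workbook: independent loop vs. recurrence.
    program dependencies
      implicit none
      integer, parameter :: n = 1000
      integer :: i
      real :: a(n), b(n)

      b = 1.0
      a = 0.0

      ! Independent iterations: safe to execute in parallel.
    !$omp parallel do private(i) shared(a, b)
      do i = 1, n
         a(i) = 2.0 * b(i)
      end do
    !$omp end parallel do

      ! Recurrence: a(i) depends on a(i-1), so this loop must stay serial.
      do i = 2, n
         a(i) = a(i) + a(i-1)
      end do

      print *, 'a(n) = ', a(n)
    end program dependencies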
- Models, Paradigms, and Styles. This chapter briefly
previews hardware models, parallel program paradigms, styles, and language models. Some
discussion is presented of the relative merits of HPF, MPI, and OpenMP. The two language
models (MPI and OpenMP) are compared for the classical pi program (7); an OpenMP version
of that program is sketched below.
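For reference, the classical pi program (7) computes pi by numerically
integrating 4/(1+x^2) over [0,1]. A minimal OpenMP Fortran version (a sketch,
not the workbook's code) might read:

    ! Minimal sketch: the classical pi integration with an OpenMP reduction.
    program pi_omp
      implicit none
      integer, parameter :: n = 1000000
      integer :: i
      double precision :: h, x, sum, pi

      h = 1.0d0 / dble(n)
      sum = 0.0d0

      ! Each thread accumulates a private partial sum; the REDUCTION
      ! clause combines the partial sums safely when the loop ends.
    !$omp parallel do private(i, x) reduction(+:sum)
      do i = 1, n
         x = h * (dble(i) - 0.5d0)
         sum = sum + 4.0d0 / (1.0d0 + x * x)
      end do
    !$omp end parallel do

      pi = h * sum
      print *, 'pi = ', pi
    end program pi_omp

Where the MPI version of this program needs explicit message passing to combine
partial sums, the OpenMP version expresses the same idea in a single directive.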
- The OpenMP Paradigm of Parallel Programming. This
chapter introduces OpenMP with an overview and a description of the execution model. From
this it moves on to discuss the features and design categories of the OpenMP standard, and
concludes with a summary of benefits and important characteristics, and of how OpenMP
compilers work. A minimal illustration of the execution model appears below.
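The execution model is fork-join: a single master thread forks a team of
threads at a parallel region, and the team joins back into one thread at its
end. A minimal sketch (assuming the omp_lib module supplied by OpenMP-aware
compilers):

    ! Minimal sketch of the fork-join execution model.
    program fork_join
      use omp_lib        ! run-time library interface (omp_get_thread_num, etc.)
      implicit none
      integer :: tid

      ! Serial region: the master thread executes alone.
      print *, 'before the parallel region: one thread'

      ! Fork: a team of threads executes the enclosed block.
    !$omp parallel private(tid)
      tid = omp_get_thread_num()
      print *, 'hello from thread ', tid
    !$omp end parallel

      ! Join: the team disbands and the master thread continues alone.
      print *, 'after the parallel region: one thread'
    end program fork_join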
- OpenMP Language Specification. This chapter covers the
complete OpenMP language specification, but in a reorganized format. First come the
basics of the language structure, then a lengthy and detailed description of OpenMP
constructs. The description of directives comes before that of clauses for simplicity, and
each construct has sections for Description, Restrictions, and Example of Use. Directives
are summarized in a table, followed by explanations of the parallel
construct, work-sharing constructs, combined parallel work-sharing constructs,
synchronization constructs, and the data environment. The detailed definition of clauses
is deferred to a later section that presents a table cross-referencing OpenMP clauses
with directives. The details follow with clauses for parallel execution control, data
scope attributes, and special operations. The discussion then turns to rules for data
scope, pointer syntax, and directive binding. The chapter concludes with the definitions
of the run-time library routines, OpenMP lock routines, and environment variables. A
short sketch drawing these elements together appears below.
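As a taste of how directives, clauses, the run-time library, and environment
variables fit together, here is a minimal sketch (not from the workbook):

    ! Minimal sketch: a combined parallel work-sharing construct with
    ! explicit data scope and scheduling clauses.
    program scoping
      use omp_lib
      implicit none
      integer, parameter :: n = 100
      integer :: i, nthreads
      real :: total, y(n)

      y = 1.0
      total = 0.0

      ! PARALLEL DO combines the parallel and work-sharing constructs;
      ! SCHEDULE controls work scheduling, and the remaining clauses
      ! set the data scope of each variable.
    !$omp parallel do schedule(static) private(i) shared(y) reduction(+:total)
      do i = 1, n
         total = total + y(i)
      end do
    !$omp end parallel do

      ! A run-time library routine; the thread count can also be set
      ! with the OMP_NUM_THREADS environment variable before execution.
      nthreads = omp_get_max_threads()
      print *, 'total = ', total, ' (max threads: ', nthreads, ')'
    end program scoping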
- Examples and Exercises. This chapter presents simple
(but not trivial) examples of MPI code, which are modifications of examples from Chapters 3
to 6 of Pacheco's book (8), and compares them with their analogous OpenMP versions. Exercises
revolve around variants of the OpenMP implementations.
- Case Studies 1 to 7. This chapter discusses seven case
studies. These include the Monte Carlo method for multi-dimensional integrals and a source
code modification of the standard Linpack and Lapack codes for the linear matrix equation
system Ax = y. Also included are case studies of banded matrix solvers and of finite
difference methods for the two-dimensional diffusion equation and the Stommel Ocean
Model. Work scheduling and scalability are studied in depth. A simplified Monte Carlo
sketch follows below.
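To fix ideas for the Monte Carlo case study, here is a deliberately simplified
sketch (a two-dimensional estimate of pi rather than the workbook's
multi-dimensional integrals). Points are drawn serially because the intrinsic
random number generator is not guaranteed to be thread-safe; only the
evaluation loop is parallelized:

    ! Simplified sketch: Monte Carlo estimate of pi as the area of a
    ! quarter circle, with the counting loop parallelized by reduction.
    program monte_carlo
      implicit none
      integer, parameter :: n = 1000000
      integer :: i, hits
      real, allocatable :: x(:), y(:)

      allocate(x(n), y(n))
      call random_number(x)   ! serial fill: intrinsic RNG thread safety
      call random_number(y)   ! is implementation dependent

      hits = 0
    !$omp parallel do private(i) reduction(+:hits)
      do i = 1, n
         if (x(i)*x(i) + y(i)*y(i) <= 1.0) hits = hits + 1
      end do
    !$omp end parallel do

      print *, 'pi estimate = ', 4.0 * real(hits) / real(n)
    end program monte_carlo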
- Case 8: The Princeton Ocean Model. This chapter gives a
step-by-step procedure for the conversion of the Princeton Ocean Model (POM) to
OpenMP, based on a detailed analysis of serial profile results and loop-level
parallelism.
- Using OpenMP. This chapter includes a brief summary
of good practices and performance issues in porting code to OpenMP and how
to avoid common problems. Also discussed is the role of application
development software tools and how they help to improve code reliability
and performance. Finally, a summary is presented of what is new in OpenMP
Version 2.0 and of ongoing activity worldwide on OpenMP applications and
tools.
- Bibliography. This includes a list of citations on High
Performance Computing and parallel language programming.
References:
1) L. Dagum and R. Menon, OpenMP: An Industry-Standard
API for Shared-Memory Programming, IEEE Computational Science and Engineering,
January-March, 1998, pp. 46-55.
2) G. Delic, R. Kuhn, W. Magro, H. Scott, and R.
Eigenmann, Minisymposium on OpenMP - A New Portable Paradigm of Parallel Computing:
Features, Performance, and Applications, Fifth SIAM Conference on Mathematical and
Computational Issues in the Geosciences, San Antonio, TX, March 24-27, 1999
(http://www.hiperism.com).
3) C. Koelbel, Short Course on OpenMP in Practice,
Fifth SIAM Conference on Mathematical and Computational Issues in the Geosciences, San
Antonio, TX, March 24-27, 1999.
4) T. Mattson and R. Eigenmann, Tutorial on Programming
with OpenMP, Supercomputing SC99, Portland, OR, 15 November, 1999.
5) OpenMP Fortran Application Program Interface, Version
1.1 (November, 1999), http://www.openmp.org.
6) S. Brawer, Introduction to Parallel Programming,
Academic Press, Inc., Boston, MA, 1989.
7) W. Gropp, E. Lusk, and A. Skjellum, Using MPI:
Portable Parallel Programming with the Message-Passing Interface, The MIT Press,
Cambridge, MA, 1996.
8) P. S. Pacheco, Parallel Programming with MPI,
Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1997.