Q1: |
What is OpenMP? |
A1: |
OpenMP is a specification for a set
of compiler directives, library routines, and environment
variables that can be used to specify shared memory
parallelism in Fortran and C/C++ programs. |
|
Q2: |
What does the MP in OpenMP stand
for? |
A2: |
The MP in OpenMP stands for Multi
Processing. We provide Open specifications for Multi
Processing via collaborative work with interested parties from
the hardware and software industry, government and
academia. |
|
Q3: |
Why a new standard? |
A3: |
Shared-memory parallel programming
directives have never been standardized in the industry. An
earlier standardization effort, ANSI X3H5, was never formally
adopted. So vendors have each provided a different set of
directives, very similar in syntax and semantics, and each
used a unique comment or pragma notation for "portability".
OpenMP consolidates these directive sets into a single syntax
and semantics, and finally delivers the long-awaited promise
of single source portability for shared-memory parallelism.
OpenMP also addresses the inability of previous
shared-memory directive sets to deal with coarse grain
parallelism. In the past, limited support for coarse grain
work has led to developers thinking that shared-memory
parallel programming was inherently limited to fine-grain
parallelism -- this isn't the case with OpenMP. Orphaned
directives in OpenMP offer the features necessary to represent
coarse-grained algorithms. |
|
Q4: |
How is the OpenMP specification
different from the X3H5 draft standard? |
A4: |
The ANSI/X3 authorized subcommittee
X3H5 was chartered for the purpose of developing an ANSI
standard based on the work done by the Parallel Computing
Forum (PCF). PCF was an informal industry group that attempted
to standardize directives for DO-loop-oriented
parallelism. They produced a draft standard, but
never completed the work. The OpenMP specification addresses
the same problem. The difference is that the OpenMP
Architecture Review Board completed the task and the
specification is gaining industry-wide support. The OpenMP
specification is an agreement reached between industry vendors
and users - it is not a formal standard. |
|
Q5: |
How does OpenMP compare with ...
? |
A5: |
MPI?
Message-passing has become accepted as a portable style of
parallel programming, but has several significant weaknesses
that limit its effectiveness and scalability. Message-passing
in general is difficult to program and doesn't support
incremental parallelization of an existing sequential program.
Message-passing was initially defined for client/server
applications running across a network, and so includes costly
semantics (including message queuing and selection and the
assumption of wholly separate memories) that are often not
required by tightly-coded scientific applications running on
modern scalable systems with globally addressable and cache
coherent distributed memories.
HPF? HPF has never really gained wide
acceptance among parallel application developers or hardware
vendors. Some applications written in HPF perform well, but
others find that limitations resulting from the HPF language
itself or the compiler implementations lead to disappointing
performance. HPF's focus on data parallelism has also limited
its appeal.
Pthreads? Pthreads have never been
targeted toward the technical/HPC market. This is reflected in
the minimal Fortran support, and its lack of support for data
parallelism. Even for C applications, pthreads requires
programming at a level lower than most technical developers
would prefer.
FORALL loops? FORALL loops are not rich or
general enough to use as a complete parallel programming
model. Their focus on loops and the rule that subroutines
called by those loops can't have side effects effectively
limit their scalability. FORALL loops are useful for providing
information to automatic parallelizing compilers and
preprocessors.
BSP or LINDA or
SISAL or...? There are lots of parallel
programming languages being researched or prototyped in the
industry. These may be targeted towards a specific
architecture, or focused on exploring one key requirement. If
you have a question about how OpenMP compares with a specific
language or model, we can help you figure this out. |
|
Q6: |
What about nested parallelism? |
A6: |
Nested parallelism is permitted by
the OpenMP specification. Supporting nested parallelism
effectively can be difficult, and we expect most vendors will
start out by executing nested parallel constructs on a single
thread. OpenMP encourages vendors to experiment with nested
parallelism to help us and the users of OpenMP understand the
best model and API to include in our specification. We will
include the necessary functionality when we understand the
issues better. |
|
Q7: |
What about task parallelism? |
A7: |
Support for general task parallelism
is not included in the OpenMP specification. OpenMP encourages
vendors to experiment with task parallelism to help us and the
users of OpenMP understand the best model and API to include
in our specification. We will include the necessary
functionality when we understand the issues better. |
|
Q8: |
What if I just want loop-level
parallelism? |
A8: |
OpenMP fully supports loop-level
parallelism. Loop-level parallelism is useful for applications
which have lots of coarse loop-level parallelism, especially
those that will never be run on large numbers of processors or
for which restructuring the source code is either impractical
or disallowed. Typically, though, the amount of loop-level
parallelism in an application is limited, and this in turn
limits the scalability of the application.
OpenMP allows you to use loop-level parallelism as a way to
start scaling your application for multiple processors, but
then move into coarser grain parallelism, while maintaining
the value of your earlier investment. This incremental
development strategy avoids the all-or-none risks involved in
moving to message-passing or other parallel programming
models. |
|
Q9: |
What does orphaning mean? |
A9: |
In early shared-memory models,
parallel directives were only permitted within the lexical
extent of parallel regions. To the application programmer,
this meant that the directives had to be defined in such a way
that all information needed to parallelize a loop or
subroutine had to be specified within the source for that loop
or subroutine. If another subroutine was called, parallel
information specific to that subroutine had to be specified at
the call site, or the called subroutine had to be (manually)
inlined. This simplified model was sufficient for the
moderate, loop-level parallelism that dominated the use of
these models, but never allowed good scalability for very
large applications. For such large applications, programmers
had to program outside the directive set to achieve good
performance, resulting in programs that were non-standard and
difficult to maintain.
Orphaning allows parallel directives to be specified
outside the lexical extent of parallel regions. A subroutine
can be written for use from a number of parallel regions, with
the parallel directives it needs embedded in its own source
instead of replicated at every call site. This is a natural
place to specify the parallelism,
and avoids programming errors that result when the earlier
style is used for complex applications. Orphaning is crucial
to implementing coarse grain parallel algorithms, and to the
development of portable, parallel libraries. |
|
Q10: |
What languages does OpenMP work
with? |
A10: |
OpenMP is designed for Fortran, C,
and C++; an implementation supports whichever of these
languages the underlying compiler supports. The OpenMP
specification does not introduce any constructs that require
specific Fortran 90 or C++ features. OpenMP cannot be supported
by compilers that do not support at least one of Fortran 77,
Fortran 90, ANSI C (C89), or ANSI C++. |
|
Q11: |
Is support for other languages
planned? |
A11: |
The OpenMP ARB does not plan to
introduce support for additional languages at this
point. |
|
Q12: |
Is OpenMP scalable? |
A12: |
OpenMP can deliver scalability for
applications using shared-memory parallel programming.
Significant effort was spent to ensure that OpenMP can be used
for scalable applications. Ultimately, scalability is a
property of the application and the algorithms used. The
parallel programming language can only support the scalability
by providing constructs that simplify the specification of the
parallelism and can be implemented with low overhead by
compiler vendors. OpenMP certainly delivers these kinds of
constructs. |
|
Q13: |
What about non-shared memory
machines or networks of workstations? |
A13: |
As much as it would be nice to think
that a single programming model (OpenMP or MPI or HPF or
whatever) might run well on all architectures, this is not the
case today. OpenMP was designed to exploit certain
characteristics of shared-memory architectures. The ability to
directly access memory throughout the system (with minimum
latency and no explicit address mapping) combined with very
fast shared memory locks, makes shared-memory architectures
best suited for supporting OpenMP.
Systems that don't fit the classic shared-memory
architecture may provide hardware or software layers that
present the appearance of a shared-memory system, but often at
the cost of higher latencies or special limitations. For
example, OpenMP could be implemented for a distributed memory
system on top of MPI, so OpenMP's latencies would be greater
than that of MPI (whereas typically the reverse is the case on
a shared-memory system). The extent to which these latencies
or limitations reduce application portability or performance
will help dictate whether vendors choose to develop OpenMP
implementations for distributed memory systems. |
|
Q14: |
Why should OpenMP succeed when PCF
and X3H5 failed? |
A14: |
There are a variety of reasons
OpenMP will succeed at being accepted as a standard, where
earlier efforts failed.
Goals - The OpenMP definition was driven
by a small number of experts working to an aggressive
schedule, building primarily on current practice.
Technical - OpenMP includes better support
for more scalable, coarse grain parallelism and more
directives for managing private/shared data. OpenMP also
includes query functions, environment variables, and
conditional compilation support.
Timing - The need for a standard in this
area is better accepted throughout the industry now, as
vendors have begun to converge on system architectures that
combine aspects of both shared-memory and distributed
architectures of the past. The importance of a standard
encouraging scalable, parallel application development that
can exploit shared-memory hardware is recognized as being more
important than ever. (Interest in PCF and X3H5 was partly
derailed by the appearance of pure distributed memory MPP
systems, whose proponents were arguing that shared-memory
parallel programming was no longer interesting.)
Vendor support - The system vendors behind
OpenMP collectively have delivered a very large share of the
shared-memory parallel systems in use today. |
|
Q15: |
How will OpenMP be managed as a
specification, over the long-term? Who owns it? |
A15: |
The OpenMP specifications are owned
and managed by the OpenMP Architecture Review Board
(ARB). |
|
Q16: |
How do I get other questions
answered? |
A16: |
You can send your questions to the
OpenMP organization in the feedback section of the web site.
We endeavor to answer all queries within one week, but
sometimes it can take longer. |