In this chapter we discuss the use of FFTW in a parallel environment. The use of FFTW in a shared-memory threads environment is explained, and we also point out the parallel implementations of FFTW that are available.
The FFTW package contains several parallel transform implementations that leverage the uniprocessor FFTW code. Currently, FFTW includes three parallel transform implementations:
One implementation is written in Cilk, a C-like parallel language
developed at MIT and currently available for several SMP platforms. See
also the Cilk home page. The FFTW Cilk code can be found in the cilk
directory, with parallelized one- and multi-dimensional transforms of
complex data. The Cilk FFTW routines are documented in cilk/README.
A second set of routines utilizes shared-memory threads for parallel
one- and multi-dimensional transforms of both real and complex data, and
is callable from an ordinary C program. Currently, this code has been
tested under POSIX, Solaris, and BeOS threads. (POSIX threads are
available on most Unix SMP platforms, including Linux.) (We also
include untested code for Win32 and MacOS MP threads. Users have
reported that the Win32 code works.) These routines are located in the
threads directory, and are documented in Section Multi-threaded FFTW.
Finally, the mpi directory contains multi-dimensional transforms
of real and complex data for parallel machines supporting MPI. It also
includes parallel one-dimensional transforms for complex data. The main
feature of this code is that it supports distributed-memory transforms,
so it runs on everything from workstation clusters to massively-parallel
supercomputers. More information on MPI can be found at the
MPI home page. The FFTW MPI routines
are documented in Section MPI FFTW.
Users writing multi-threaded programs must concern themselves with the
thread-safety of the libraries they use--that is, whether it is
safe to call routines in parallel from multiple threads. FFTW can be
used in such an environment, but some care must be taken because certain
parts of FFTW use private global variables to share data between calls.
In particular, the plan-creation functions share trigonometric tables
and accumulated wisdom. (Users should note that these comments
only apply to programs using shared-memory threads. Parallelism using
MPI or forked processes involves a separate address-space and global
variables for each process, and is not susceptible to problems of this
sort.)
The central restriction of FFTW is that it is not safe to create
multiple plans in parallel. You must either create all of your plans
from a single thread, or instead use a semaphore, mutex, or other
mechanism to ensure that different threads don't attempt to create plans
at the same time. The same restriction also holds for destruction of
plans and importing/forgetting wisdom. Once created, a plan may
safely be used in any thread.
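The mutex approach described above can be sketched with POSIX threads as
follows. This is a minimal illustration, not FFTW code: toy_plan,
toy_create_plan, toy_destroy_plan, plan_worker, and run_plan_workers are
hypothetical stand-ins, so the example compiles without FFTW installed. In
a real program the calls inside the locked regions would be the FFTW
plan-creation and plan-destruction routines.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a plan; NOT an FFTW type. */
typedef struct { int n; } toy_plan;

/* Planner state shared between calls, unprotected on its own
   (this mimics the shared tables and wisdom described in the text). */
static int plans_created = 0;
static int plans_live = 0;
static int transform_size = 64;

/* One lock guards every planner entry point. */
static pthread_mutex_t planner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical planner: not thread-safe, because it updates globals. */
static toy_plan *toy_create_plan(int n)
{
    toy_plan *p = malloc(sizeof *p);
    assert(p != NULL);
    p->n = n;
    plans_created++;
    plans_live++;
    return p;
}

static void toy_destroy_plan(toy_plan *p)
{
    plans_live--;
    free(p);
}

static void *plan_worker(void *arg)
{
    const int n = *(int *)arg;
    toy_plan *p;

    pthread_mutex_lock(&planner_lock);     /* serialize plan creation */
    p = toy_create_plan(n);
    pthread_mutex_unlock(&planner_lock);

    /* Transforms that use p would run here, with no lock held. */

    pthread_mutex_lock(&planner_lock);     /* destruction is guarded too */
    toy_destroy_plan(p);
    pthread_mutex_unlock(&planner_lock);
    return NULL;
}

/* Runs nthreads workers. Returns the number of plans they created,
   or -1 if a plan was leaked. */
int run_plan_workers(int nthreads)
{
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    int before, i;

    assert(tid != NULL);
    before = plans_created;
    for (i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, plan_worker, &transform_size);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    free(tid);
    return plans_live == 0 ? plans_created - before : -1;
}
```

Calling run_plan_workers(8) from main starts eight threads, each of which
creates and destroys one plan under the lock. The same lock would also have
to cover any importing or forgetting of wisdom.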
The actual transform routines in FFTW (fftw_one, etcetera) are
re-entrant and thread-safe, so it is fine to call them simultaneously
from multiple threads. Another question arises, however--is it safe to
use the same plan for multiple transforms in parallel? (It would
be unsafe if, for example, the plan were modified in some way by the
transform.) We address this question by defining an additional planner
flag, FFTW_THREADSAFE. When included in the flags for any of the
plan-creation routines, FFTW_THREADSAFE guarantees that the
resulting plan will be read-only and safe to use in parallel by multiple
threads.
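To illustrate why a read-only plan can be shared, here is a toy sketch. It
is not FFTW code and says nothing about how FFTW implements its plans:
ro_plan, ro_plan_init, ro_transform, and run_shared_plan are hypothetical
names, and the "plan" is just a four-entry twiddle table driving a naive
DFT. The point is only that the transform reaches the plan through a const
pointer, so several threads can use it at once. With FFTW itself, the
analogous steps are to include FFTW_THREADSAFE in the flags given to a
plan-creation routine and then call fftw_one from each thread.

```c
#include <assert.h>
#include <pthread.h>

#define RO_N 4        /* transform size; the twiddles are exact for N = 4 */
#define RO_THREADS 4

typedef struct { double re, im; } cpx;

/* Toy "plan": w[k] = exp(-2*pi*i*k/RO_N). Written once, then only read. */
typedef struct { cpx w[RO_N]; } ro_plan;

static void ro_plan_init(ro_plan *p)
{
    static const cpx table[RO_N] = { {1, 0}, {0, -1}, {-1, 0}, {0, 1} };
    int k;
    for (k = 0; k < RO_N; k++)
        p->w[k] = table[k];
}

/* Naive O(N^2) DFT. The const pointer means the plan is never modified. */
static void ro_transform(const ro_plan *p, const cpx *in, cpx *out)
{
    int j, k;
    for (k = 0; k < RO_N; k++) {
        double re = 0.0, im = 0.0;
        for (j = 0; j < RO_N; j++) {
            const cpx w = p->w[(j * k) % RO_N];
            re += in[j].re * w.re - in[j].im * w.im;
            im += in[j].re * w.im + in[j].im * w.re;
        }
        out[k].re = re;
        out[k].im = im;
    }
}

/* Each thread gets its own input and output but shares the plan. */
typedef struct {
    const ro_plan *plan;
    cpx in[RO_N];
    cpx out[RO_N];
} ro_job;

static void *transform_worker(void *arg)
{
    ro_job *job = arg;
    ro_transform(job->plan, job->in, job->out);
    return NULL;
}

/* Returns 1 if every thread produced the expected spectrum, else 0. */
int run_shared_plan(void)
{
    ro_plan plan;
    ro_job jobs[RO_THREADS];
    pthread_t tid[RO_THREADS];
    int t, k, ok = 1;

    ro_plan_init(&plan);                 /* created once, in one thread */
    for (t = 0; t < RO_THREADS; t++) {
        jobs[t].plan = &plan;
        for (k = 0; k < RO_N; k++) {     /* constant input of value t+1 */
            jobs[t].in[k].re = t + 1;
            jobs[t].in[k].im = 0.0;
        }
    }
    for (t = 0; t < RO_THREADS; t++) {
        int rc = pthread_create(&tid[t], NULL, transform_worker, &jobs[t]);
        assert(rc == 0);
    }
    for (t = 0; t < RO_THREADS; t++)
        pthread_join(tid[t], NULL);

    /* The DFT of a constant c is RO_N*c in bin 0 and zero elsewhere.
       All values here are small integers, so exact comparison is safe. */
    for (t = 0; t < RO_THREADS; t++) {
        if (jobs[t].out[0].re != RO_N * (t + 1) || jobs[t].out[0].im != 0.0)
            ok = 0;
        for (k = 1; k < RO_N; k++)
            if (jobs[t].out[k].re != 0.0 || jobs[t].out[k].im != 0.0)
                ok = 0;
    }
    return ok;
}
```

Giving each thread its own input and output arrays matters as much as the
read-only plan: only the plan is shared, never the data being transformed.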