Find the word definition


The parallel-TEBD is a version of the TEBD algorithm adapted to run on multiple hosts. The task of parallelizing TEBD could be achieved in various ways.

  • As a first option, one could use the OpenMP API (this would probably be the simplest way to do it), using preprocessor directives to decide which portion of the code should be parallelized. The drawback of this is that one is confined to Symmetric multiprocessing (SMP) architectures and the user has no control on how the code is parallelized. An Intel extension of OpenMP, called Cluster OpenMP 1, is a socket-based implementation of OpenMP which can make use of a whole cluster of SMP machines; this spares the user of explicitly writing messaging code while giving access to multiple hosts via a distributed shared-memory system. The OpenMP paradigm (hence its extension Cluster OpenMP as well) allows the user a straightforward parallelization of serial code by embedding a set of directives in it.
  • The second option is using the Message Passing Interface (MPI) API. MPI can treat each core of the multi-core machines as separate execution host, so a cluster of, let's say, 10 compute nodes with dual-core processors will appear as 20 compute nodes, on which the MPI application can be distributed. MPI offers the user more control over the way the program is parallelized. The drawback of MPI is that is not very easy to implement and the programmer has to have a certain understanding of parallel simulation systems.
  • For the determined programmer the third option would probably be the most appropriate: to write one's own routines, using a combination of threads and TCP/IP sockets to complete the task. The threads are necessary in order to make the socket-based communication between the programs non-blocking (the communication between programs has to take place in threads, so that the main thread doesn't have to wait for the communication to end and can execute other parts of the code). This option offers the programmer complete control over the code and eliminates any overhead which might come from the use of the Cluster OpenMP or MPI libraries.

This article introduces the conceptual basis of the implementation, using MPI-based pseudo-code for exemplification, while not restricting itself to MPI - the same basic schema could be implemented with the use of home-grown messaging routines.