MPI (Message Passing Interface) is a standard that describes how to pass messages between processes of a parallel program running on the same or different machines.
MPI is a formal standard and is actively supported by all major vendors. Some vendors have highly optimized MPI libraries available on their systems. There are open-source implementations of the MPI standard, such as MPICH and OpenMPI, as well as numerous commercial implementations that support a wide range of systems and interconnects, for example HP-MPI and Intel MPI.
Support for a particular MPI implementation in ADF can be considered at three levels: the source code, the configure script, and pre-compiled binaries. At each level, different MPI implementations may be supported.
The ADF source code is not implementation-specific and thus, in principle, supports any MPI library. Many popular MPI implementations are supported at the level of the configure script. For example, on 32-bit Linux these are: MPICH1, Intel MPI, OpenMPI, HP-MPI, LAM-MPI, and SCore. This means that the proper compiler flags will be used and an appropriate $ADFBIN/start script will be generated at configure time.
When choosing an MPI implementation for pre-compiled binaries, however, SCM considers many factors, including (but not limited to) the redistribution policy, performance, and built-in support for modern interconnects. HP-MPI is currently the standard MPI implementation supported by SCM because it offers the most favorable combination of these factors. On platforms where HP-MPI is supported, it is distributed with ADF. On platforms where HP-MPI is not available, a different MPI implementation is standard, and it may or may not be distributed with ADF. For example, SGI MPT is standard on SGI machines and OpenMPI is standard on MacOS, but only the latter is distributed together with ADF.
When the pre-compiled binaries do not work on your computer(s) because the standard MPI library is incompatible with your software and/or hardware, the SCM support staff will be glad to assist you in compiling ADF with the MPI implementation supported on your machine(s).
If you are going to use an MPI version of the ADF package, and it is not HP-MPI or OpenMPI, you will need to determine whether the corresponding MPI run-time environment is already installed on your machine. If not, you will need to install it separately from ADF. As mentioned above, HP-MPI and OpenMPI are distributed together with the corresponding version of ADF, so you do not need to worry about installing them separately. A quick check is sketched below.
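One way to check for an installed run-time environment is to look for the MPI launcher on your PATH and ask it to identify itself. The exact version flag differs between implementations, so the flags below are typical examples rather than a universal recipe:

    # Is an MPI launcher installed and on the PATH?
    which mpirun
    # Most launchers report their identity and version;
    # the flag spelling varies per implementation:
    mpirun -version     # MPICH-style launchers
    mpirun --version    # OpenMPI-style launchers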
Running with MPI on more than one node
When running on more than one machine (for example, on a cluster without a batch system) you need to specify a list of hosts on which mpirun should spawn processes. In principle, this is implementation-specific and may not be required if the MPI implementation is tightly integrated with your operating and/or batch system. For MPICH1 and HP-MPI, you can do this by preparing a file containing the hostnames of the nodes (one per line) you will use in your parallel job, and then setting the SCM_MACHINEFILE environment variable to point to that file, as in the example below.
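A minimal sketch, in which the hostnames and the job script name are placeholders for your own:

    # Create a machine file listing one hostname per line:
    cat > mynodes <<EOF
    node1
    node2
    EOF
    # Point ADF at it before starting the job:
    export SCM_MACHINEFILE=$PWD/mynodes
    ./myjob.run    # a regular ADF job script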
When you submit a parallel job to a batch system, the job scheduler usually provides a list of the nodes allocated to the job. The $ADFBIN/start shell script contains some logic to extract this information from the batch system and pass it to the MPI launcher command (typically mpirun). In some cases, depending on your cluster configuration, this logic may fail. If this happens, you should examine the $ADFBIN/start file and edit the relevant portion of it. For example, you may need to change the commands that process the batch system-provided nodelist, change mpirun's command-line options, or even replace the mpirun command altogether.
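As an illustration of the kind of logic involved, the following sketch shows how a nodelist provided by a PBS-style batch system might be handed to an MPICH1-style mpirun. Option names vary between MPI implementations, and the binary name is illustrative, so treat this as an example rather than the exact contents of $ADFBIN/start:

    # Under PBS, the scheduler exports PBS_NODEFILE, a file with
    # one hostname per allocated processor slot:
    NP=$(wc -l < "$PBS_NODEFILE")
    # MPICH1-style launcher invocation; other MPI implementations
    # spell these options differently:
    mpirun -machinefile "$PBS_NODEFILE" -np "$NP" "$ADFBIN/adf.exe"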