aoe:mpi

http://www.linux-mag.com/id/5759/

http://na37.nada.kth.se/mediawiki/index.php/OpenMPI

Cluster software packages such as ROCKS or xCAT can be used to automate clustering.

yum install

needed:

yum install openmpi
yum install openmpi-devel
yum install gcc
yum install gcc-gfortran
yum install gmp
yum install mpfr    # possibly not needed
yum install compat-gcc-34
yum install compat-gcc-34-g77

maybe these:

yum install libmthca-devel
yum install librdmacm-devel
yum install librdmacm

For x86_64 machines on SL 5.3:

mpi-selector --system --verbose --set openmpi-1.2.7-gcc-x86_64

Users must log in again after this change.

for x86_64 machines on SL6

Yellow Dog 6.0

also needs:

yum install libsysfs-devel

apt-get install

sudo apt-get install 'openmpi-*'
sudo apt-get install build-essential
sudo apt-get install openssh-server

yast

install these

yast -i gcc
yast -i openmpi
yast -i openmpi-devel
yast -i make

Append to ~/.bashrc

export PATH=$PATH:/usr/lib/mpi/gcc/openmpi/bin/
export LD_LIBRARY_PATH=/usr/lib/mpi/gcc/openmpi/lib/
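
Appending the same lines twice leaves duplicate entries in ~/.bashrc. A minimal sketch of an append that can be re-run safely; the helper name append_once is made up for this example, and the demo writes to a scratch file rather than the real ~/.bashrc:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact
# line is not already present.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -F fixed string, -x match the whole line, -q print nothing
    grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file; set rc="$HOME/.bashrc" for real use.
rc=$(mktemp)
append_once 'export PATH=$PATH:/usr/lib/mpi/gcc/openmpi/bin/' "$rc"
append_once 'export LD_LIBRARY_PATH=/usr/lib/mpi/gcc/openmpi/lib/' "$rc"
# Same line again: nothing is added.
append_once 'export PATH=$PATH:/usr/lib/mpi/gcc/openmpi/bin/' "$rc"
cat "$rc"
```

The single quotes keep $PATH unexpanded, so the line is written literally and only evaluated when the shell reads the file.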

test code

first check the compiler

hello.c

#include <stdio.h>

int main(int argc, char *argv[])
{
  printf("hello MPI user!\n");
  return 0;
}

gcc hello.c -o hello.exe
./hello.exe

hello-mpi-2.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* gethostname */

#include <mpi.h>

int main(int argc, char *argv[])
{
  int tid, nthreads;
  char *cpu_name;
  double time_initial, time_current, time;

  /* 1st: initialize MPI; mpirun has already started one copy
          of this program per process */
  MPI_Init(&argc, &argv);
  time_initial = MPI_Wtime();

  /* 2nd: get this process's id, usually called its "rank";
          ranks run from 0 to nthreads-1
   */
  MPI_Comm_rank(MPI_COMM_WORLD, &tid);

  /* 3rd: this is often useful, get the number of processes
          launched by MPI, normally the -np value given to mpirun
   */
  MPI_Comm_size(MPI_COMM_WORLD, &nthreads);

  /* on EVERY process, allocate zeroed space for the machine name */
  cpu_name = (char *)calloc(80, sizeof(char));

  /* get the machine name of this particular host ... well
     at least the first 79 characters of it, so the buffer
     always stays NUL-terminated ... */
  gethostname(cpu_name, 79);
  time_current = MPI_Wtime();
  time = time_current - time_initial;
  printf("%.3f tid=%i : hello MPI user: machine=%s [NCPU=%i]\n",
         time, tid, cpu_name, nthreads);
  free(cpu_name);
  MPI_Finalize();
  return 0;
}

then use the Makefile

### Basic Makefile for MPI

CC      = mpicc
CFLAGS  = -g -O0
LD      = mpicc
LDFLAGS = -g

PROGRAM = hello-mpi-2

all:    ${PROGRAM}.exe

# recipe lines must begin with a tab character, not spaces
${PROGRAM}.exe:         ${PROGRAM}.o
	${LD} ${LDFLAGS} $< -o ${PROGRAM}.exe

${PROGRAM}.o:           ${PROGRAM}.c
	${CC} ${CFLAGS} -c $< -o ${PROGRAM}.o

clean:
	rm -f ${PROGRAM}.o ${PROGRAM}.exe

and use the command

make

or, if the file is saved as Makefile.hello,

make -f Makefile.hello

  • compile for platform

If not using InfiniBand, name the byte transfer layer (BTL) components explicitly, for example shared memory (sm). self is the loopback component and mx is Myrinet.

mpirun -np 1 -hostfile hostfile --mca btl self ./a.out
mpirun -np 1 -hostfile hostfile --mca btl mx,sm,self ./a.out
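
The mpirun commands above read a file named hostfile, which these notes never show. A minimal sketch of one, using the Open MPI "hostname slots=N" format. The host names are the two machines that appear later on this page; the slot counts and the helper name count_slots are assumptions for illustration. The script only prints the mpirun command, it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: add up the explicit slots= entries in a hostfile.
# Lines without a slots= field are not counted here.
count_slots() {
    awk '!/^#/ && NF {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^slots=/) { split($i, kv, "="); n += kv[2] }
         }
         END { print n + 0 }' "$1"
}

# Scratch hostfile; slot counts are assumed, adjust to the real CPU counts.
hostfile=$(mktemp)
cat > "$hostfile" <<'EOF'
# one line per node; slots = how many processes to start on it
europa slots=2
titan  slots=1
EOF

total=$(count_slots "$hostfile")
echo "mpirun -np $total -hostfile $hostfile ./a.out"
```

Keeping -np at or below the slot total avoids oversubscribing the nodes.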

open firewall

add to /etc/sysconfig/iptables

-A RH-Firewall-1-INPUT -p tcp -s 128.173.105.56 -j ACCEPT

and restart iptables

service iptables restart

Don't use iptables reload, or NFS hangs.
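
With more than one node, each node's address needs its own ACCEPT line. A small sketch that prints the lines for pasting into /etc/sysconfig/iptables; the function name allow_rule is made up, and the two 192.0.2.x addresses are placeholders, not real cluster nodes:

```shell
#!/bin/sh
# Hypothetical helper: print one ACCEPT rule in the same form as the
# line shown above, for a given source address.
allow_rule() {
    printf '%s\n' "-A RH-Firewall-1-INPUT -p tcp -s $1 -j ACCEPT"
}

# First address is from these notes; the others are placeholders.
for ip in 128.173.105.56 192.0.2.10 192.0.2.11; do
    allow_rule "$ip"
done
```

Rules are matched in order, so these lines presumably need to sit above any final REJECT line in the file.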

environment variables

http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path

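Set these in ~/.bashrc or /etc/bashrc. They do not seem to work in ~/.bash_profile or /etc/profile, probably because mpirun starts remote processes through non-interactive ssh shells, which do not read the login profile files.

export LD_LIBRARY_PATH=/usr/local/lib64/                  # or wherever the Open MPI libraries are
export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/       # or wherever mpicc and mpirun are

The last path is peculiar to SL 5.2; on all other systems /usr/bin works.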
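
A quick way to see whether the PATH setting took effect is to ask the shell where it finds the MPI tools. A minimal sketch; the function name have_tools is an assumption, not an Open MPI command:

```shell
#!/bin/sh
# Hypothetical helper: report where each named tool is found on PATH.
# Returns non-zero if any tool is missing.
have_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

have_tools mpicc mpirun || echo "PATH does not include the Open MPI bin directory"
```

To check what a remote, non-interactive shell sees, the same test can be run as ssh titan 'command -v mpirun' (titan being one of the hosts used further down).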

Key Log-in

http://www.debian-administration.org/articles/152

ssh-keygen -t rsa
chmod 400 id_rsa

To copy the public key to ~/.ssh/authorized_keys on the remote machine:

ssh-copy-id -i id_rsa.pub steve@steve-laptop
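
mpirun cannot answer a password prompt, so it is worth confirming that key log-in works to every node before running a job. A sketch under the assumption that plain ssh is used; check_node and check_nodes are made-up names, and the SSH variable exists only so the command can be swapped out:

```shell
#!/bin/sh
# Command used to reach a node; override SSH to substitute another client.
SSH=${SSH:-ssh}

# Hypothetical helper: succeed only if log-in works without a prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_node() {
    $SSH -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Hypothetical helper: check several hosts, non-zero if any fails.
check_nodes() {
    failed=0
    for host in "$@"; do
        if check_node "$host" 2>/dev/null; then
            echo "ok:   $host"
        else
            echo "FAIL: $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (not run here): check_nodes steve-laptop
```

A FAIL usually means the public key is missing from authorized_keys on that host, or the host is unreachable.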

suppressing the "no InfiniBand network" error

http://mail.cs.unm.edu/pipermail/cs442/2008/000021.html

Hello,

those of you who are using OpenMPI and get messages like this:

    libibverbs: Fatal: couldn't read uverbs ABI version.
    --------------------------------------------------------------------------
    [0,1,0]: OpenIB on host localhost was unable to find any HCAs.
    Another transport will be used instead, although this may result in
    lower performance.
    --------------------------------------------------------------------------

might be interested in the following. OpenMPI supports many different
networks (simultaneously). The above message says it couldn't find an
Infiniband network. If you don't have one of those, you can tell OpenMPI
up front, so it won't go looking in the first place. Here is how:

    mkdir ~/.openmpi
    echo "btl=tcp,self" > ~/.openmpi/mca-params.conf

if you have root on your system, you can also do this in

    /usr/local/etc/openmpi-mca-params.conf

where it will affect all of the users.

Rolf

AOE user initial setup

begin in home directory

cd
keylogin
ssh-keygen -t rsa

Use the defaults and don't enter a passphrase.

[testaccount@europa ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/grad5/testaccount/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/grad5/testaccount/.ssh/id_rsa.
Your public key has been saved in /home/grad5/testaccount/.ssh/id_rsa.pub.
The key fingerprint is:
81:4f:86:14:07:70:5f:bf:b7:0b:9b:73:93:52:6c:1e testaccount@europa.aoe.vt.edu

copy the public key to the other machine (titan here):

ssh-copy-id -i .ssh/id_rsa.pub titan

[testaccount@europa .ssh]$ ssh-copy-id -i id_rsa.pub titan
10
testaccount@titan's password: 
Now try logging into the machine, with "ssh 'titan'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[testaccount@europa .ssh]$ ssh titan
Last login: Thu Apr  9 20:21:10 2009 from europa.aoe.vt.edu
This computer is operated in accordance with the Acceptable Use Policy of
Virginia Tech. See the following URL for details:
	http://www.policies.vt.edu/acceptableuse.html

chmod 400 id_rsa

set environment vars in .bashrc and .bash_profile

cat >>.bashrc

and enter:

export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/
export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/

ctrl-d to close

cat >>.bash_profile

and enter:

export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/
export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/

ctrl-d to close

This prevents an error message since we are not using a fast interconnect:

mkdir ~/.openmpi
echo "btl=tcp,self" > ~/.openmpi/mca-params.conf

Log out of all shells and log back in.

test code

cd
wget --no-check-certificate http://www.aoe.vt.edu/~stedwar1/mpitest.tar
tar -xvf mpitest.tar
cd mpitest
make
mpirun -np 3 -hostfile hostfile hello-mpi-2.exe 