[[http://www.linux-mag.com/id/5759/]]
[[http://na37.nada.kth.se/mediawiki/index.php/OpenMPI]]
Cluster software packages such as ROCKS or xCAT can be used to automate clustering.
==== yum install ====
needed:
yum install openmpi
yum install openmpi-devel
yum install gcc
yum install gcc-gfortran
yum install gmp
yum install mpfr ??
yum install compat-gcc-34
yum install compat-gcc-34-g77
maybe these:
yum install libmthca-devel
yum install librdmacm-devel
yum install librdmacm
== for x86_64 machines on SL53 ==
mpi-selector --system --verbose --set openmpi-1.2.7-gcc-x86_64
user must log in again after this change
== for x86_64 machines on SL6 ==
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/storage.html
module load openmpi-x86_64
==== Yellow Dog 6.0 ====
also needs:
yum install libsysfs-devel
==== apt-get install ====
sudo apt-get install 'openmpi-*'
sudo apt-get install build-essential
sudo apt-get install openssh-server
==== yast ====
install these
yast -i gcc
yast -i openmpi
yast -i openmpi-devel
yast -i make
Append to ~/.bashrc
export PATH=$PATH:/usr/lib/mpi/gcc/openmpi/bin/
export LD_LIBRARY_PATH=/usr/lib/mpi/gcc/openmpi/lib/
==== test code ====
first check the compiler
hello.c
#include <stdio.h>
int main(int argc, char *argv[])
{
printf("hello MPI user!\n");
return(0);
}
gcc hello.c -o hello.exe
./hello.exe
hello-mpi-2.c
#include <stdio.h>
#include <stdlib.h> /* calloc */
#include <unistd.h> /* gethostname */
#include <mpi.h>
int main(int argc, char *argv[])
{
int tid,nthreads;
char *cpu_name;
double time_initial,time_current,time;
/* add in MPI startup routines */
/* 1st: launch the MPI processes on each node */
MPI_Init(&argc,&argv);
time_initial = MPI_Wtime();
/* 2nd: get this process's id, usually called its "rank";
ranks run from 0 to N-1, and rank 0 is conventionally the master
*/
MPI_Comm_rank(MPI_COMM_WORLD, &tid);
/* 3rd: this is often useful, get the number of processes
launched by MPI; this equals the -np value given to mpirun
*/
MPI_Comm_size(MPI_COMM_WORLD, &nthreads);
/* on EVERY process, allocate space for the machine name */
cpu_name = (char *)calloc(80,sizeof(char));
/* get the machine name of this particular host ... well
at least the first 80 characters of it ... */
gethostname(cpu_name,80);
time_current = MPI_Wtime();
time = time_current - time_initial;
printf("%.3f tid=%i : hello MPI user: machine=%s [NCPU=%i]\n",
time, tid, cpu_name, nthreads);
MPI_Finalize();
return(0);
}
then use the Makefile
### Basic Makefile for MPI
CC = mpicc
CFLAGS = -g -O0
LD = mpicc
LDFLAGS = -g
PROGRAM = hello-mpi-2
all: ${PROGRAM}.exe
${PROGRAM}.exe: ${PROGRAM}.o
	${LD} ${LDFLAGS} $< -o ${PROGRAM}.exe
${PROGRAM}.o: ${PROGRAM}.c
	${CC} ${CFLAGS} -c $< -o ${PROGRAM}.o
clean:
	rm -f ${PROGRAM}.o ${PROGRAM}.exe
and use the command
make
or
make -f Makefile.hello
* compile for platform
If not using InfiniBand, specify the shared memory (sm) byte transfer layer (BTL) explicitly:
mpirun -np 1 -hostfile hostfile --mca btl self ./a.out
mpirun -np 1 -hostfile hostfile --mca btl mx,sm,self ./a.out
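The mpirun commands above read a file named hostfile that this page never shows. A minimal sketch of creating one follows, assuming the usual Open MPI format of one host per line followed by slots=N. The host names titan and europa are borrowed from the AOE section further down; the slot counts and the helper name write_hostfile are placeholders for illustration only.

```shell
# Write a two-host hostfile; each line is "<host> slots=<process count>".
# The slot counts below are placeholders, not measured values.
write_hostfile() {
    # $1 = path of the hostfile to create (overwritten if it exists)
    cat > "$1" <<'EOF'
titan slots=2
europa slots=1
EOF
}

# Demonstrate in a scratch directory so no existing hostfile is touched.
demo_dir="$(mktemp -d)"
write_hostfile "$demo_dir/hostfile"
cat "$demo_dir/hostfile"
```

To use it for real, call write_hostfile with a path next to the executable and set each slots value to the number of processes that machine should run.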
==== open firewall ====
add to /etc/sysconfig/iptables
-A RH-Firewall-1-INPUT -p tcp -s 128.173.105.56 -j ACCEPT
and restart iptables
service iptables restart
Don't use iptables reload, or NFS hangs.
==== environment variables ====
[[http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path]]
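Set these in ~/.bashrc or /etc/bashrc. Setting them in ~/.bash_profile or /etc/profile does not seem to work, probably because mpirun starts remote processes over ssh in non-login shells, which do not read the profile files.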
export LD_LIBRARY_PATH=/usr/local/lib64/ (or whatever)
export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/ (or whatever)
The PATH entry above is specific to SL5.2; on all other systems, /usr/bin works.
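Because the Open MPI bin directory differs between distributions, a .bashrc snippet can guard the PATH change instead of hard-coding one location. The following is only a sketch: the function name add_to_path_once is made up for this example, and the two candidate directories are simply the ones mentioned on this page.

```shell
# Append a directory to PATH only if it exists and is not already listed.
add_to_path_once() {
    dir="${1%/}"                      # drop a trailing slash for comparison
    [ -d "$dir" ] || return 1         # skip directories that do not exist
    case ":$PATH:" in
        *":$dir:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
    export PATH
}

# Candidate locations taken from this page; only the ones that exist are added.
for d in /usr/lib64/openmpi/1.2.5-gcc/bin /usr/lib/mpi/gcc/openmpi/bin; do
    add_to_path_once "$d" || true
done
```

Sourcing .bashrc repeatedly then leaves PATH unchanged after the first time, which keeps it from growing with every nested shell.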
==== Key Log-in ====
http://www.debian-administration.org/articles/152
ssh-keygen -t rsa
chmod 400 id_rsa
To copy the public key into ~/.ssh/authorized_keys on the remote host:
ssh-copy-id -i id_rsa.pub steve@steve-laptop
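If key login still prompts for a password, overly open permissions are a common cause, since sshd with StrictModes enabled ignores keys in directories that others can write to. A small sketch that tightens them is shown below. The function name fix_ssh_perms is hypothetical, and the demonstration runs on a scratch directory standing in for ~/.ssh.

```shell
# Tighten permissions on an SSH directory and the key files inside it.
fix_ssh_perms() {
    ssh_dir="$1"
    chmod 700 "$ssh_dir"
    if [ -f "$ssh_dir/authorized_keys" ]; then chmod 600 "$ssh_dir/authorized_keys"; fi
    if [ -f "$ssh_dir/id_rsa" ]; then chmod 400 "$ssh_dir/id_rsa"; fi
}

# Demonstrate on a scratch directory standing in for ~/.ssh.
scratch="$(mktemp -d)"
touch "$scratch/id_rsa" "$scratch/authorized_keys"
chmod 755 "$scratch"
chmod 644 "$scratch/id_rsa" "$scratch/authorized_keys"
fix_ssh_perms "$scratch"
ls -ld "$scratch" "$scratch/id_rsa" "$scratch/authorized_keys"
```

On a real account the call would be fix_ssh_perms ~/.ssh, run on both the local and the remote machine.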
==== no Infiniband network error suppression ====
[[http://mail.cs.unm.edu/pipermail/cs442/2008/000021.html]]
Hello,
those of you who are using OpenMPI and get messages like this:
libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,0]: OpenIB on host localhost was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
might be interested in the following. OpenMPI supports many different
networks (simultaneously). The above message says it couldn't find an
Infiniband network. If you don't have one of those, you can tell OpenMPI
up front, so it won't go looking in the first place. Here is how:
mkdir ~/.openmpi
echo "btl=tcp,self" > ~/.openmpi/mca-params.conf
if you have root on your system, you can also do this in
/usr/local/etc/openmpi-mca-params.conf
where it will affect all of the users.
Rolf
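Note that the echo command in the message overwrites the whole parameter file. Where the file may already hold other settings, a variant that replaces only the btl line could look like the sketch below. The function name set_btl_tcp is made up for this example, and the demonstration uses a scratch file rather than the real ~/.openmpi/mca-params.conf.

```shell
# Set "btl=tcp,self" in an MCA parameter file, keeping any other lines.
set_btl_tcp() {
    conf="$1"
    mkdir -p "$(dirname "$conf")"
    touch "$conf"
    # Copy every line except an existing btl= setting, then append the new one.
    grep -v '^btl[[:space:]]*=' "$conf" > "$conf.tmp" || true
    echo "btl=tcp,self" >> "$conf.tmp"
    mv "$conf.tmp" "$conf"
}

# Demonstrate on a scratch file with an old btl line and an unrelated comment.
demo_conf="$(mktemp -d)/mca-params.conf"
printf '# existing comment\nbtl=openib,self\n' > "$demo_conf"
set_btl_tcp "$demo_conf"
cat "$demo_conf"
```

Running it twice leaves a single btl line, so it is safe to keep in a setup script.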
==== AOE user initial setup ====
begin in home directory
cd
== keylogin ==
ssh-keygen -t rsa
Use the defaults and don't enter a passphrase.
[testaccount@europa ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/grad5/testaccount/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/grad5/testaccount/.ssh/id_rsa.
Your public key has been saved in /home/grad5/testaccount/.ssh/id_rsa.pub.
The key fingerprint is:
81:4f:86:14:07:70:5f:bf:b7:0b:9b:73:93:52:6c:1e testaccount@europa.aoe.vt.edu
ssh-copy-id -i .ssh/id_rsa.pub titan
[testaccount@europa .ssh]$ ssh-copy-id -i id_rsa.pub titan
10
testaccount@titan's password:
Now try logging into the machine, with "ssh 'titan'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[testaccount@europa .ssh]$ ssh titan
Last login: Thu Apr 9 20:21:10 2009 from europa.aoe.vt.edu
This computer is operated in accordance with the Acceptable Use Policy of
Virginia Tech. See the following URL for details:
http://www.policies.vt.edu/acceptableuse.html
chmod 400 id_rsa
== set environment vars in .bashrc and .bash_profile ==
cat >>.bashrc
and enter:
export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/
export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/
ctrl-d to close
cat >>.bash_profile
and enter:
export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/
export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/
ctrl-d to close
This prevents an error message since we are not using a fast interconnect:
mkdir ~/.openmpi
echo "btl=tcp,self" > ~/.openmpi/mca-params.conf
Log out of all shells and log back in.
== test code ==
cd
wget --no-check-certificate http://www.aoe.vt.edu/~stedwar1/mpitest.tar
tar -xvf mpitest.tar
cd mpitest
make
mpirun -np 3 -hostfile hostfile hello-mpi-2.exe
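The run above asks for 3 processes, so the hostfile should offer at least 3 slots. A rough way to check is sketched below. It assumes the hostfile uses the host slots=N format and that a host listed without slots= counts once. The function name count_slots and the sample file contents are invented for the demonstration.

```shell
# Sum the slots= values in an Open MPI style hostfile.
count_slots() {
    awk '
        /^[ \t]*(#|$)/ { next }           # ignore comments and blank lines
        {
            n = 1                         # assumption: a host without slots= counts once
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/) n = substr($i, 7) + 0
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Scratch hostfile for the demonstration; the slot counts are made up.
sample="$(mktemp)"
printf '# test cluster\ntitan slots=2\neuropa\n' > "$sample"

want=3                                    # matches "mpirun -np 3" above
have="$(count_slots "$sample")"
if [ "$have" -ge "$want" ]; then
    echo "ok: $have slots for $want processes"
else
    echo "too few slots: $have < $want"
fi
```

In the mpitest directory the call would be count_slots hostfile.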