[[http://www.linux-mag.com/id/5759/]]
[[http://na37.nada.kth.se/mediawiki/index.php/OpenMPI]]

Cluster software packages such as ROCKS or xCAT can be used to automate clustering.

==== yum install ====

needed:

  yum install openmpi
  yum install openmpi-devel
  yum install gcc
  yum install gcc-gfortran
  yum install gmp
  yum install mpfr

possibly needed (unconfirmed):

  yum install compat-gcc-34
  yum install compat-gcc-34-g77

maybe these:

  yum install libmthca-devel
  yum install librdmacm-devel
  yum install librdmacm

== for x86_64 machines on SL53 ==

  mpi-selector --system --verbose --set openmpi-1.2.7-gcc-x86_64

The user must log in again after this change.

== for x86_64 machines on SL6 ==

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Technical_Notes/storage.html

  module load openmpi-x86_64

==== Yellow Dog 6.0 ====

also needs:

  yum install libsysfs-devel

==== apt-get install ====

  sudo apt-get install 'openmpi-*'
  sudo apt-get install build-essential
  sudo apt-get install openssh-server

==== yast ====

install these:

  yast -i gcc
  yast -i openmpi
  yast -i openmpi-devel
  yast -i make

Append to ~/.bashrc:

  export PATH=$PATH:/usr/lib/mpi/gcc/openmpi/bin/
  export LD_LIBRARY_PATH=/usr/lib/mpi/gcc/openmpi/lib/

==== test code ====

First check that the compiler works.

hello.c

  #include <stdio.h>

  int main(int argc, char *argv[])
  {
      printf("hello MPI user!\n");
      return 0;
  }

  gcc hello.c -o hello.exe
  ./hello.exe

hello-mpi-2.c

  #include <stdio.h>
  #include <stdlib.h>   /* calloc, free */
  #include <unistd.h>   /* gethostname */
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int tid, nthreads;
      char *cpu_name;
      double time_initial, time_current, time;

      /* add in MPI startup routines */
      /* 1st: initialize MPI in each process started by mpirun */
      MPI_Init(&argc, &argv);
      time_initial = MPI_Wtime();

      /* 2nd: get this process's id, usually called its "rank";
         the master process has rank (tid) == 0 */
      MPI_Comm_rank(MPI_COMM_WORLD, &tid);

      /* 3rd: this is often useful, get the number of processes
         launched by MPI; this equals the -np value given to mpirun */
      MPI_Comm_size(MPI_COMM_WORLD, &nthreads);

      /* on EVERY process, allocate space for the machine name */
      cpu_name = (char *)calloc(80, sizeof(char));

      /* get the machine name of this particular host
         ... well at least the first 79 characters of it,
         so the zeroed buffer always stays null-terminated ... */
      gethostname(cpu_name, 79);

      time_current = MPI_Wtime();
      time = time_current - time_initial;
      printf("%.3f tid=%i : hello MPI user: machine=%s [NCPU=%i]\n",
             time, tid, cpu_name, nthreads);

      free(cpu_name);
      MPI_Finalize();
      return 0;
  }

Then use the Makefile (recipe lines must start with a tab):

  ### Basic Makefile for MPI
  CC      = mpicc
  CFLAGS  = -g -O0
  LD      = mpicc
  LDFLAGS = -g
  PROGRAM = hello-mpi-2

  all: ${PROGRAM}.exe

  ${PROGRAM}.exe: ${PROGRAM}.o
  	${LD} ${LDFLAGS} $< -o ${PROGRAM}.exe

  ${PROGRAM}.o: ${PROGRAM}.c
  	${CC} ${CFLAGS} -c $< -o ${PROGRAM}.o

  clean:
  	rm -f ${PROGRAM}.o ${PROGRAM}.exe

and run

  make

or

  make -f Makefile.hello

  * compile for platform

If not using InfiniBand, select the byte transfer layer (btl) components explicitly, for example shared memory (sm):

  mpirun -np 1 -hostfile hostfile --mca btl self ./a.out
  mpirun -np 1 -hostfile hostfile --mca btl mx,sm,self ./a.out

==== open firewall ====

Add to /etc/sysconfig/iptables:

  -A RH-Firewall-1-INPUT -p tcp -s 128.173.105.56 -j ACCEPT

and restart iptables:

  service iptables restart

Don't use iptables reload, or NFS hangs.

==== environment variables ====

[[http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path]]

Set these in ~/.bashrc or /etc/bashrc. Setting them in ~/.bash_profile or /etc/profile does not seem to work.

  export LD_LIBRARY_PATH=/usr/local/lib64/                  (or whatever)
  export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/       (or whatever)

The PATH entry is only needed on SL5.2, where the install location is unusual; on all other systems /usr/bin works.

==== Key Log-in ====

http://www.debian-administration.org/articles/152

  ssh-keygen -t rsa
  chmod 400 id_rsa

To copy the public key to ~/.ssh/authorized_keys on the remote machine:

  ssh-copy-id -i id_rsa.pub steve@steve-laptop

==== no Infiniband network error suppression ====

[[http://mail.cs.unm.edu/pipermail/cs442/2008/000021.html]]

Hello,

those of you who are using OpenMPI and get messages like this:

  libibverbs: Fatal: couldn't read uverbs ABI version.
  --------------------------------------------------------------------------
  [0,1,0]: OpenIB on host localhost was unable to find any HCAs.
  Another transport will be used instead, although this may result in
  lower performance.
  --------------------------------------------------------------------------

might be interested in the following. OpenMPI supports many different networks (simultaneously). The above message says it couldn't find an Infiniband network. If you don't have one of those, you can tell OpenMPI up front, so it won't go looking in the first place. Here is how:

  mkdir ~/.openmpi
  echo "btl=tcp,self" > ~/.openmpi/mca-params.conf

If you have root on your system, you can also do this in /usr/local/etc/openmpi-mca-params.conf, where it will affect all of the users.

Rolf

==== AOE user initial setup ====

Begin in the home directory:

  cd

== keylogin ==

  ssh-keygen -t rsa

Use the defaults and don't enter a passphrase.

  [testaccount@europa ~]$ ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/home/grad5/testaccount/.ssh/id_rsa):
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /home/grad5/testaccount/.ssh/id_rsa.
  Your public key has been saved in /home/grad5/testaccount/.ssh/id_rsa.pub.
  The key fingerprint is:
  81:4f:86:14:07:70:5f:bf:b7:0b:9b:73:93:52:6c:1e testaccount@europa.aoe.vt.edu

  ssh-copy-id -i .ssh/id_rsa.pub titan

  [testaccount@europa .ssh]$ ssh-copy-id -i id_rsa.pub titan
  10
  testaccount@titan's password:
  Now try logging into the machine, with "ssh 'titan'", and check in:

    .ssh/authorized_keys

  to make sure we haven't added extra keys that you weren't expecting.

  [testaccount@europa .ssh]$ ssh titan
  Last login: Thu Apr 9 20:21:10 2009 from europa.aoe.vt.edu
  This computer is operated in accordance with the Acceptable Use Policy of
  Virginia Tech. See the following URL for details:
  http://www.policies.vt.edu/acceptableuse.html

  chmod 400 id_rsa

== set environment vars in .bashrc and .bash_profile ==

  cat >>.bashrc

and enter:

  export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/
  export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/

ctrl-d to close

  cat >>.bash_profile

and enter:

  export PATH=$PATH:/usr/lib64/openmpi/1.2.5-gcc/bin/
  export LD_LIBRARY_PATH=/usr/lib64/openmpi/1.2.5-gcc/lib/

ctrl-d to close

This prevents an error message, since we are not using a fast interconnect:

  mkdir ~/.openmpi
  echo "btl=tcp,self" > ~/.openmpi/mca-params.conf

Log out of all shells and log back in.

== test code ==

  cd
  wget --no-check-certificate http://www.aoe.vt.edu/~stedwar1/mpitest.tar
  tar -xvf mpitest.tar
  cd mpitest
  make
  mpirun -np 3 -hostfile hostfile hello-mpi-2.exe
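
The mpirun commands on this page all take -hostfile hostfile, but the notes never show what that file contains. If you need to write one yourself, the following is a minimal sketch. It uses Open MPI's one-host-per-line format with an optional slots=N count. The host names are the two machines used on this page, and the slot counts are assumptions chosen only so that they add up to the -np 3 used in the test; adjust both for your own nodes. The write_hostfile helper is a made-up name for illustration.

```shell
#!/bin/sh
# Sketch only: write a two-node Open MPI hostfile.
# Host names and slot counts below are assumptions; substitute your own.
write_hostfile() {
    cat > "$1" <<'EOF'
europa slots=2
titan slots=1
EOF
}

write_hostfile hostfile

# Sum the slots; the total should be at least the -np value given to mpirun.
awk -F'slots=' '{ total += $2 } END { print "total slots:", total }' hostfile
# prints: total slots: 3
```

With this file in the current directory, mpirun -np 3 -hostfile hostfile can place two processes on europa and one on titan.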