===== linpack =====
Benchmark tool for measuring floating-point performance (solves a dense system of linear equations).
===== nis =====
todo:
* set hostname and domainname
* firewall for nis/nfs server
Some good setup info:
[[http://penguin.triumf.ca/recipes/nis-auto/index.html]]
==== nis client ====
# hostname (does it need to be the FQDN?)
# domainname (make sure the NIS domain is set)
# domainname setup (sets the NIS domain to 'setup')
# ypdomainname (same, but from the yp-tools package)
# portmap (should be running)
Edit /etc/yp.conf and add to the end:
domain setup server G4-cluster01.setup.lan
Create /var/yp if it does not exist
mkdir /var/yp
Edit /etc/nsswitch.conf:
...
passwd: files nis
shadow: files nis
group: files nis
...
automount: files nis
...
To start the NIS client, otherwise known as ypbind:
# service ypbind start
To test:
# rpcinfo -p localhost
Output similar to:
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100007 2 udp 758 ypbind
100007 1 udp 758 ypbind
100007 2 tcp 761 ypbind
100007 1 tcp 761 ypbind
# rpcinfo -u localhost ypbind
Output similar to:
program 100007 version 1 ready and waiting
program 100007 version 2 ready and waiting
# ypcat passwd
Should list the NIS users from the passwd map.
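You can also check which server the client is bound to and look up a single map entry ('steve' being the example user from the automount section below):
# ypwhich
# ypmatch steve passwd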
Enable ypbind at boot and start it:
chkconfig ypbind on
service ypbind start
==== nis server ====
Install:
# yum install ypserv
Already installed (add if not already installed):
yp-tools
ypbind
Edit /etc/yp.conf as above.
Add normal Unix users, then:
# /usr/lib/yp/ypinit -m
Optional: on slave NIS servers (not on clients):
# ypinit -s G4-cluster01.setup.lan
Check the NIS server:
# rpcinfo -u localhost ypserv
Output similar to:
program 100004 version 1 ready and waiting
program 100004 version 2 ready and waiting
If it is not running:
# service ypserv start
To update the maps after adding or changing users (don't rerun ypinit -m):
# cd /var/yp
# make
To allow password changes:
# rpc.yppasswdd
To also allow users to change their full name and login shell:
# rpc.yppasswdd -e chfn -e chsh
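Once rpc.yppasswdd is running on the master, users on any client can change their NIS password with yppasswd (part of yp-tools):
$ yppasswd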
/var/yp/securenets
# allow connections from local host -- necessary
host 127.0.0.1
# same as 255.255.255.255 127.0.0.1
#
# allow connections from any host
# on the 192.168.2.0 network
255.255.255.0 192.168.2.0
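ypserv only reads the securenets file at startup, so restart it after editing:
# service ypserv restart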
Start services on boot:
chkconfig ypserv on
chkconfig ypbind on
chkconfig ypxfrd on
chkconfig yppasswdd on
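To verify the runlevel settings:
# chkconfig --list | grep yp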
===== nfs =====
* Edit /etc/exports:
/home \
G4-cluster01.setup.lan(rw,sync,no_root_squash) \
G4-cluster02.setup.lan(rw,sync,no_root_squash) \
G4-cluster03.setup.lan(rw,sync,no_root_squash) \
G4-cluster04.setup.lan(rw,sync,no_root_squash)
/usr \
G4-cluster01.setup.lan(rw,sync,no_root_squash) \
G4-cluster02.setup.lan(rw,sync,no_root_squash) \
G4-cluster03.setup.lan(rw,sync,no_root_squash) \
G4-cluster04.setup.lan(rw,sync,no_root_squash)
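If all of the cluster nodes sit on one subnet, a single network entry can replace the per-host lists (a sketch, assuming the 192.168.2.0/24 network from the securenets example above):
# hypothetical subnet-wide exports; assumes every node is on 192.168.2.0/24
/home 192.168.2.0/24(rw,sync,no_root_squash)
/usr 192.168.2.0/24(rw,sync,no_root_squash)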
* Add to startup services
chkconfig portmap on
chkconfig nfs on
* Start the nfs server
portmap
service nfs start
* Reload any nfs changes to /etc/exports
exportfs -ra
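To verify what the server is exporting:
# showmount -e localhost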
===== auto mount =====
==== If not using automount, add to /etc/fstab ====
G4-cluster01:/home /home nfs defaults 0 0
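To test the entry on a client without rebooting:
# mount /home
# df -h /home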
==== If using automount, on master ====
/etc/auto.master
/home auto.home
/etc/auto.home
steve G4-cluster01.setup.lan:/home/&
or
* G4-cluster01.setup.lan:/home/&
/etc/nsswitch.conf
automount: files nis
Then update the database if yp is already running:
cd /var/yp
make
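Then restart the automounter on each client and trigger a mount ('steve' being the example user from auto.home above):
# service autofs restart
# ls /home/steve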
===== mpi =====
[[mpi]]
===== openmp =====
[[http://www.linux-mag.com/id/4609|openmp in 30 minutes]]
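A minimal compile-and-run sketch using gcc's OpenMP support (hello_omp.c is a hypothetical source file):
# hello_omp.c is a hypothetical OpenMP source file
gcc -fopenmp hello_omp.c -o hello_omp
OMP_NUM_THREADS=4 ./hello_omp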
===== system monitoring =====
[[http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/admin-primer/s1-resource-rhlspec.html|Red Hat Linux-Specific Information]]
[[http://www.redhat.com/cluster_suite/|Red Hat Cluster Suite]]
===== batch system =====
Also refer to the Parallel Matlab section
https://www.aoe.vt.edu/~stedwar1/Steve/doku/dokuwiki-2009-02-14/doku.php?id=aoe:matlab
====pbs====
[[http://euler.phys.cmu.edu/cluster/pbs.html]]
qsub
showq
checkjob
Notes from Shinpaugh:
OpenPBS -> Torque
A free version of Moab is Maui.
To submit your job to the queuing system use the command qsub:
qsub ./JobScript.sh
This will return your job ID, which has the form xxxx.queueserver. Example: 4567.admin01
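A minimal JobScript.sh sketch (the resource request, walltime, and program name are assumptions; adjust for your site):
#!/bin/bash
#PBS -N example
#PBS -l nodes=1:ppn=4
#PBS -l walltime=00:10:00
# Torque starts the script in $HOME; change to the submission directory
cd $PBS_O_WORKDIR
./my_program   # hypothetical executable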
To remove a job from the queue, or stop a running job, use the command `qdel <jobid>`. Example: qdel 4567
To see status information about your job, you can use:
`qstat -f <jobid>`, which is a Torque command that will provide detailed information about the job.
`showstart <jobid>`, which is a Moab command that will tell you expected start and finish times.
and `checkjob -v <jobid>`, which is a Moab command that will provide detailed information about the job.
NOTE: The Moab commands may report an error of the form "ERROR: cannot locate job '<jobid>'" if the scheduler has not yet picked up the newly submitted job. If so, just wait a minute and try again.
When your job has finished running, any output to stdout or stderr will be placed in the files <jobname>.o<jobid> and <jobname>.e<jobid>. These two files will be in the directory from which you submitted the job.
To find information about all of your queued or running jobs, use the commands `qstat` and `showq`. The `qstat` command without an argument shows all Ithaca jobs from the Torque resource manager's perspective. The `showq` command without arguments shows all of the running jobs on all ARC systems from the Moab scheduler's perspective. To view only Ithaca jobs with showq, use `showq -p ITHACA`. NOTE: Users generally find showq more useful than qstat.
==== High Availability Cluster ====
http://olex.openlogic.com/wazi/2011/ensure-high-availability-with-centos-6-clustering/