===== linpack =====

Benchmark tool

===== nis =====

todo:
  * set hostname and domainname
  * firewall for nis/nfs server

Some good setup info: [[http://penguin.triumf.ca/recipes/nis-auto/index.html]]

==== nis client ====

Check the basics first:

  # hostname           (does it need the fqdn?)
  # domainname         (make sure it is set)
  # domainname setup   (to set it to 'setup')
  # ypdomainname       (same, but from the yp tools)
  # portmap            (should be running)

Edit /etc/yp.conf and add to the end:

  domain setup server G4-cluster01.setup.lan

Create /var/yp if it does not exist:

  mkdir /var/yp

Edit /etc/nsswitch.conf:

  ...
  passwd:     files nis
  shadow:     files nis
  group:      files nis
  ...
  automount:  files nis
  ...

To start the NIS client, otherwise known as ypbind:

  # service ypbind start

To test:

  # rpcinfo -p localhost

Output similar to:

     program vers proto   port
      100000    2   tcp    111  portmapper
      100000    2   udp    111  portmapper
      100007    2   udp    758  ypbind
      100007    1   udp    758  ypbind
      100007    2   tcp    761  ypbind
      100007    1   tcp    761  ypbind

  # rpcinfo -u localhost ypbind

Output similar to:

  program 100007 version 1 ready and waiting
  program 100007 version 2 ready and waiting

  # ypcat passwd

Should output the yp users in the password file.

Start ypbind now and at boot:

  chkconfig ypbind on
  service ypbind start

==== nis server ====

Install:

  # yum install ypserv

Already installed (add them if not):

  yp-tools
  ypbind

Edit /etc/yp.conf as above. Add normal unix users, then build the maps:

  # /usr/lib/yp/ypinit -m

Option: on slave yp servers (not on all clients):

  # ypinit -s G4-cluster01.setup.lan

Check the nis server:

  # rpcinfo -u localhost ypserv
  program 100004 version 1 ready and waiting
  program 100004 version 2 ready and waiting

If it is not running:

  # service ypserv start

To update the maps after a change such as adding users, do not rerun ypinit -m; instead:

  # cd /var/yp
  # make

To allow password changes:

  # rpc.yppasswdd

or, to also allow changing of full name and login shell:

  # rpc.yppasswdd -e chfn -e chsh

Restrict access in /var/yp/securenets (/etc/ypserv.securenets on some systems):

  # allow connections from local host -- necessary
  host 127.0.0.1
  # same as 255.255.255.255 127.0.0.1
  #
  # allow connections from any host
  # on the 192.168.2.0 network
  255.255.255.0   192.168.2.0

Start the services on boot:

  chkconfig ypserv on
  chkconfig ypbind on
  chkconfig ypxfrd on
  chkconfig yppasswdd on

===== nfs =====

  * Edit /etc/exports:

  /home \
    G4-cluster01.setup.lan(rw,sync,no_root_squash) \
    G4-cluster02.setup.lan(rw,sync,no_root_squash) \
    G4-cluster03.setup.lan(rw,sync,no_root_squash) \
    G4-cluster04.setup.lan(rw,sync,no_root_squash)
  /usr \
    G4-cluster01.setup.lan(rw,sync,no_root_squash) \
    G4-cluster02.setup.lan(rw,sync,no_root_squash) \
    G4-cluster03.setup.lan(rw,sync,no_root_squash) \
    G4-cluster04.setup.lan(rw,sync,no_root_squash)

  * Add to startup services:

  chkconfig portmap on
  chkconfig nfs on

  * Start portmap and the nfs server:

  service portmap start
  service nfs start

  * Reload any nfs changes to /etc/exports:

  exportfs -ra

===== auto mount =====

==== if not using automount, add to /etc/fstab ====

  G4-cluster01:/home   /home   nfs   defaults 0 0

==== If using automount, on master: ====

/etc/auto.master:

  /home   auto.home

/etc/auto.home:

  steve   G4-cluster01.setup.lan:/home/&

or

  *   G4-cluster01.setup.lan:/home/&

/etc/nsswitch.conf:

  automount: files nis

Then update the database if yp is already running:

  cd /var/yp
  make
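As a quick end-to-end check once the NIS, NFS, and automount pieces above are in place, something like the following should work from a client node (a sketch, assuming the example master G4-cluster01 and the user steve from the sections above):

  ypwhich                      # should print the NIS server, G4-cluster01.setup.lan
  ypcat passwd | grep steve    # is the account visible through NIS?
  showmount -e G4-cluster01    # /home and /usr should appear in the export list
  ls /home/steve               # first access triggers the automounter
  su - steve -c pwd            # NIS login works and lands in the NFS-mounted home

If ypwhich fails, recheck /etc/yp.conf and that ypbind is running; if the mount fails, recheck /etc/exports and rerun exportfs -ra.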
===== mpi =====

[[mpi]]

===== openmp =====

[[http://www.linux-mag.com/id/4609|openmp in 30 minutes]]

===== system monitoring =====

[[http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/admin-primer/s1-resource-rhlspec.html|Red Hat Linux-Specific Information]]

[[http://www.redhat.com/cluster_suite/|Red Hat Cluster Suite]]

===== batch system =====

Also refer to the Parallel Matlab section:
[[https://www.aoe.vt.edu/~stedwar1/Steve/doku/dokuwiki-2009-02-14/doku.php?id=aoe:matlab]]

==== pbs ====

[[http://euler.phys.cmu.edu/cluster/pbs.html]]

  qsub
  showq
  checkjob

Notes from Shinpaugh:

  * openpbs -> torque
  * a free version of Moab is maui

To submit your job to the queuing system, use the command qsub (a sample job script sketch appears at the end of this page):

  qsub ./JobScript.sh

This will return your job name, of the form xxxx.queueserver. Example: 4567.admin01

To remove a job from the queue, or stop a running job, use the command `qdel <jobid>`. Example:

  qdel 4567

To see status information about your job, you can use:

  * `qstat -f <jobid>`, a Torque command that will provide detailed information about the job.
  * `showstart <jobid>`, a Moab command that will tell you expected start and finish times.
  * `checkjob -v <jobid>`, a Moab command that will provide detailed information about the job.

NOTE: The Moab commands may report an error of the form "ERROR: cannot locate job '<jobid>'" if the scheduler has not yet picked up the newly submitted job. If so, just wait a minute and try again.

When your job has finished running, any output to stdout or stderr will be placed in the files <jobname>.o<jobid> and <jobname>.e<jobid>. These two files will be in the directory that you submitted the job from.

To find information about all your queued or running jobs, you can use the commands `qstat` and `showq`. The `qstat` command without an argument will show all ithaca jobs from the Torque resource manager's perspective. The `showq` command without arguments will show all of the running jobs on all ARC systems from the Moab scheduler's perspective. If you wish to view only ithaca jobs with showq, use `showq -p ITHACA`.

NOTE: Users generally find showq to be more useful than qstat.

==== High Availability Cluster ====

[[http://olex.openlogic.com/wazi/2011/ensure-high-availability-with-centos-6-clustering/]]
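Referring back to the pbs section: a minimal sketch of what JobScript.sh might contain on a Torque/Moab system. The job name, node and ppn counts, walltime, and the mpirun line are assumptions to adapt to the local cluster; only the #PBS directives themselves are standard Torque.

  #!/bin/bash
  # JobScript.sh -- minimal Torque job script sketch; the values are examples only
  #PBS -N testjob                # job name
  #PBS -l nodes=2:ppn=2          # assumed layout: 2 nodes, 2 processors per node
  #PBS -l walltime=00:10:00      # 10 minute wallclock limit
  #PBS -j oe                     # merge stderr into the stdout file
  
  cd $PBS_O_WORKDIR              # Torque starts jobs in $HOME; change to the submit directory
  mpirun -np 4 ./hello_mpi       # assumed MPI program -- see the mpi section above

Submit it with `qsub ./JobScript.sh` and monitor it with the qstat/showq commands above; with the directives shown, all output lands in testjob.o<jobid> in the submit directory.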