aoe:enterprise

Account setup

  1. edit /etc/passwd
    • add the user's name (copy the previous line)
    • increment the userid
    • change name
    • change home directory
    • Set the desired shell
  2. edit /etc/shadow
    • add the user's name (copy the previous line)
  3. Change the password with passwd <username>
  4. edit /etc/group
    • add the user's name
    • change the group id to match the id used in /etc/passwd
  5. Create /usr/people/<username>
  6. Set permissions
    • chown <username>:<username> /usr/people/<username>
    • chmod 700 /usr/people/<username>
  7. set quotas
    • edquota <username>
    • fs /usr/people kbytes (soft = 0, hard = 2097152) inodes (soft = 0, hard = 0)
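
The hand edits in steps 1 and 4 can be sketched as a dry run that only prints the lines to paste in. The function names, the "x" password placeholder, the empty group password field, the 60000 uid cutoff, and the per-user group id equal to the uid are all assumptions, not something these notes confirm for this machine. For reference, the quota hard limit of 2097152 kbytes is 2 GiB.

```shell
#!/bin/sh
# Sketch only: print the lines that steps 1 and 4 would add by hand.
# Nothing is written; review the output and paste it in with an editor.

# next_uid FILE: highest uid below 60000 in a passwd-format file, plus one.
# The 60000 cutoff is an assumption meant to skip "nobody"-style accounts.
next_uid() {
    awk -F: 'BEGIN { max = 0 }
        $3 ~ /^[0-9]+$/ && $3 + 0 < 60000 && $3 + 0 > max { max = $3 + 0 }
        END { print max + 1 }' "$1"
}

# passwd_line USER UID SHELL: a passwd entry with an "x" password placeholder
# (assumed), a group id equal to the uid, and a home under /usr/people.
passwd_line() {
    printf '%s:x:%s:%s::/usr/people/%s:%s\n' "$1" "$2" "$2" "$1" "$3"
}

# group_line USER GID: the matching /etc/group entry (empty password field assumed).
group_line() {
    printf '%s::%s:\n' "$1" "$2"
}

# Usage (prints only):
#   uid=$(next_uid /etc/passwd)
#   passwd_line newuser "$uid" /bin/tcsh
#   group_line newuser "$uid"
```

The remaining steps (shadow entry, passwd, mkdir, chown, chmod, edquota) still have to be done by hand as listed above.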

----

Enterprise

swmgr (IRIX Package manager.)
edquota <username>
repquota -a
l2cmd --scdev /hw/module/001c10/L1/controller power
l1cmd --scdev /hw/module/001c10/L1/controller serial all
[/] /usr/cpu/firmware/sysco/flashsc --l2 10.17.175.238 /usr/cpu/firmware/sysco/l1.bin 121.20
$ /usr/diad/bin/pandora

(04:56:52 PM) luke.scharf@im.clusterbee.net: It came via the federal HPC Modernization Program.

(04:57:08 PM) luke.scharf@im.clusterbee.net: I think it was initially owned by the Army Research Lab in Louisiana, but the Air Force now holds the deed.

(04:57:22 PM) luke.scharf@im.clusterbee.net: I might be able to look up some of the e-mails from the HPCMP folks.

(05:01:19 PM) luke.scharf@im.clusterbee.net: It looks like the POC for the project at the time was: Maj Edward M. Williams, PhD HPC Program Manager Air Force Office of Scientific Research 703-696-6566 (DSN 426)

(05:04:38 PM) luke.scharf@im.clusterbee.net: Another contact is Bill Reidy (of BAE Systems and the HPCMP) at 703-812-8205

(05:04:51 PM) luke.scharf@im.clusterbee.net: reidy@hpcmo.hpc.mil

(05:05:18 PM) luke.scharf@im.clusterbee.net: I think Bill Reidy was the main POC – it looks like he answered most of the questions.

(05:10:11 PM) luke.scharf@im.clusterbee.net: It looks like Bill Reidy is really the guy who put it together.

Disk Maintenance

See the section “Creating a New System Disk by Cloning” in the SGI documentation linked below for using xfsdump to copy the disk:

http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=0650&db=bks&srch=&fname=/SGI_Admin/IA_DiskFiles/sgi_html/ch02.html

http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=0650&db=bks&srch=&fname=/SGI_Admin/IA_InstLicns/sgi_html/ch03.html

I cloned the root drive from controller 0 to 7 on 11/20/2009. First, I did

xfsdump -l 0 -f /tmp/xfsdump-oldroot.sfsdump /mnt/tmp

of the mounted spare drive, which contained a dump from when Luke helped recover the machine from the /usr/people array trouble.

Luke may have made this copy back in 2007, after the crash:

xfsdump - / | gzip | ssh root@alexandria.aoe.vt.edu 'cat > /home/sysadmin/enterprise.aoe.vt.edu/backup/root_2007-01-29.xfsdump.gz'
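
A minimal sketch for repeating that backup with a date-stamped file name. It only prints the pipeline, so it can be checked before running. The function names are hypothetical; the host and path are copied from the command above.

```shell
#!/bin/sh
# backup_name DATE: remote file name following the pattern used above.
backup_name() {
    printf 'root_%s.xfsdump.gz\n' "$1"
}

# backup_cmd DATE: print (do not run) the dump pipeline for that date.
backup_cmd() {
    dest="/home/sysadmin/enterprise.aoe.vt.edu/backup/$(backup_name "$1")"
    printf "xfsdump - / | gzip | ssh root@alexandria.aoe.vt.edu 'cat > %s'\n" "$dest"
}

# Usage (prints only):
#   backup_cmd "$(date +%Y-%m-%d)"
```

Restoring should be the reverse pipe (gzip -dc of the file into xfsrestore reading from standard input), but that has not been tried in these notes.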

XVM Manager GUI

xvmgr
vol/people
  subvol/people/data
    stripe/stripe0
      slice/disk3s0
      slice/disk4s0
vol/tmp
  subvol/tmp/data
    stripe/stripe1
      slice/disk1s0
      slice/disk2s0
vol/workspace
  subvol/workspace/data
    concat/workspace
      stripe/workspace1
        slice/xvmdisk_0s0
        slice/xvmdisk_1s0
      stripe/workspace2
        slice/xvmdisk_2s0
        slice/xvmdisk_3s0
      stripe/Workspace3
        slice/xvmdisk_4s0
        slice/xvmdisk_5s0

wipe commands

workspace

dd if=/dev/zero of=/hw/rdisk/dks13d66vol bs=1M
dd if=/dev/zero of=/hw/rdisk/dks13d69vol bs=1M
dd if=/dev/zero of=/hw/rdisk/dks13d70vol bs=1M
dd if=/dev/zero of=/hw/rdisk/dks13d73vol bs=1M
dd if=/dev/zero of=/hw/rdisk/dks13d74vol bs=1M
dd if=/dev/zero of=/hw/rdisk/dks13d77vol bs=1M

tmp

dd if=/dev/zero of=/dev/rdsk/dks4d114vol bs=1M
dd if=/dev/zero of=/dev/rdsk/dks4d117vol bs=1M

people

dd if=/dev/zero of=/dev/rdsk/dks4d115vol bs=1M
dd if=/dev/zero of=/dev/rdsk/dks4d116vol bs=1M
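
Since these dd lines destroy everything on the target, a dry-run loop that only prints them may be safer than retyping device names. This is a sketch; wipe_cmds is a hypothetical helper name and it runs nothing itself.

```shell
#!/bin/sh
# wipe_cmds DEVICE...: print one dd line per raw volume device; runs nothing.
wipe_cmds() {
    for dev in "$@"; do
        printf 'dd if=/dev/zero of=%s bs=1M\n' "$dev"
    done
}

# Example: the two tmp disks listed above.
#   wipe_cmds /dev/rdsk/dks4d114vol /dev/rdsk/dks4d117vol
```

Check the printed device names against the xvm and prtvtoc listings before pasting any of them into a root shell.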

Main Disks

/dev/rdsk/dks0d1vol
/dev/rdsk/dks0d2vol
dd if=/dev/zero of=/hw/disk/dks0d2vol bs=1M
/dev/rdsk/dks7d1vol
/dev/rdsk/dks7d2vol
dd if=/dev/zero of=/hw/disk/dks7d2vol bs=1M

CD-ROM

/dev/rdsk/dks5d1vol
/dev/rdsk/dks8d1vol

SGI O2

xvm command line interface

enterprise 136# xvm
xvm:local> dump *
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
Don't know how to dump object type 9 (unlabeled)
label -name disk1 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c694584-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d114vol
label -name disk2 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c694585-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d117vol
label -name disk3 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c694586-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d116vol
label -name disk4 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c694587-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d115vol
slice -start 0 -length 71683264 -uuid 9c694588-0d9e-102f-86cc-08006911d5db phys/disk1
slice -start 0 -length 71683264 -uuid 9c694589-0d9e-102f-86cc-08006911d5db phys/disk2
slice -start 0 -length 71683264 -uuid 9c69458a-0d9e-102f-86cc-08006911d5db phys/disk3
slice -start 0 -length 71683264 -uuid 9c69458b-0d9e-102f-86cc-08006911d5db phys/disk4
stripe -pieces 2 -unit 128 -tempname -uuid 9c69458c-0d9e-102f-86cc-08006911d5db
stripe -pieces 2 -unit 128 -tempname -uuid 9c69458d-0d9e-102f-86cc-08006911d5db
subvol -uid 0 -gid 0 -mode 0600 -type 2 -tempname -uuid 9c69458e-0d9e-102f-86cc-08006911d5db
subvol -uid 0 -gid 0 -mode 0600 -type 2 -tempname -uuid 9c69458f-0d9e-102f-86cc-08006911d5db
volume -volname people -uuid 9c694590-0d9e-102f-86cc-08006911d5db
volume -volname tmp -uuid 9c694591-0d9e-102f-86cc-08006911d5db
xvm:local>
xvm:local> dump -topology tmp
volume -volname tmp -uuid 9c6945a0-0d9e-102f-86cc-08006911d5db
subvol -uid 0 -gid 0 -mode 0600 -type 2 -tempname -uuid 9c6945a1-0d9e-102f-86cc-08006911d5db
stripe -pieces 2 -unit 128 -tempname -uuid 9c6945a2-0d9e-102f-86cc-08006911d5db
slice -start 0 -length 71683264 -uuid 9c6945a3-0d9e-102f-86cc-08006911d5db phys/disk1
attach -pos 0 9c6945a3-0d9e-102f-86cc-08006911d5db 9c6945a2-0d9e-102f-86cc-08006911d5db
slice -start 0 -length 71683264 -uuid 9c6945a4-0d9e-102f-86cc-08006911d5db phys/disk2
attach -pos 1 9c6945a4-0d9e-102f-86cc-08006911d5db 9c6945a2-0d9e-102f-86cc-08006911d5db
attach -pos 0 9c6945a2-0d9e-102f-86cc-08006911d5db 9c6945a1-0d9e-102f-86cc-08006911d5db
attach -pos 0 9c6945a1-0d9e-102f-86cc-08006911d5db 9c6945a0-0d9e-102f-86cc-08006911d5db
xvm:local> dump -topology people
volume -volname people -uuid 9c6945a5-0d9e-102f-86cc-08006911d5db
subvol -uid 0 -gid 0 -mode 0600 -type 2 -tempname -uuid 9c6945a6-0d9e-102f-86cc-08006911d5db
stripe -pieces 2 -unit 128 -tempname -uuid 9c6945a7-0d9e-102f-86cc-08006911d5db
slice -start 0 -length 71683264 -uuid 9c6945a8-0d9e-102f-86cc-08006911d5db phys/disk3
attach -pos 0 9c6945a8-0d9e-102f-86cc-08006911d5db 9c6945a7-0d9e-102f-86cc-08006911d5db
slice -start 0 -length 71683264 -uuid 9c6945a9-0d9e-102f-86cc-08006911d5db phys/disk4
attach -pos 1 9c6945a9-0d9e-102f-86cc-08006911d5db 9c6945a7-0d9e-102f-86cc-08006911d5db
attach -pos 0 9c6945a7-0d9e-102f-86cc-08006911d5db 9c6945a6-0d9e-102f-86cc-08006911d5db
attach -pos 0 9c6945a6-0d9e-102f-86cc-08006911d5db 9c6945a5-0d9e-102f-86cc-08006911d5db
xvm:local>
xvm:local> dump disk*
label -name disk1 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c6945aa-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d114vol
label -name disk2 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c6945ab-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d117vol
label -name disk3 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c6945ac-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d116vol
label -name disk4 -volhdrblks 3072 -xvmlabelblks 1024 -uuid 9c6945ad-0d9e-102f-86cc-08006911d5db /dev/rdsk/dks4d115vol
slice -start 0 -length 71683264 -uuid 9c6945ae-0d9e-102f-86cc-08006911d5db phys/disk1
slice -start 0 -length 71683264 -uuid 9c6945af-0d9e-102f-86cc-08006911d5db phys/disk2
slice -start 0 -length 71683264 -uuid 9c6945b0-0d9e-102f-86cc-08006911d5db phys/disk3
slice -start 0 -length 71683264 -uuid 9c6945b1-0d9e-102f-86cc-08006911d5db phys/disk4
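
The dumps above show tmp and people each as a two-piece stripe of 71683264-block slices. A rough capacity check, assuming the slice length is counted in 512-byte blocks and ignoring any rounding to the stripe unit (stripe_gib is a hypothetical helper; needs a shell with POSIX arithmetic):

```shell
#!/bin/sh
# stripe_gib PIECES LENGTH: capacity in whole GiB of a stripe built from
# PIECES slices of LENGTH 512-byte blocks each (2097152 blocks per GiB).
stripe_gib() {
    echo $(( $1 * $2 / 2097152 ))
}

# Example: stripe_gib 2 71683264
```

For the values in the dump this comes to 68 GiB per volume, rounded down.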

prtvtoc

enterprise 2# prtvtoc /dev/rdsk/*vh
* /dev/rdsk/dks0d1vh (bootfile "/unix")
*     512 bytes/sector
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 0          xfs  yes     2101248     33445628   /
 1          raw             4096      2097152
 8       volhdr                0         4096
10       volume                0     35546876
* /dev/rdsk/dks0d2vh (bootfile "/unix")
*     512 bytes/sector
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 0          xfs  yes     8392704     27173776
 1          raw             4096      8388608
 8       volhdr                0         4096
10       volume                0     35566480
* /dev/rdsk/dks13d66vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        4096     71683273
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         4096
10       volume                0     71687369
* /dev/rdsk/dks13d69vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        4096     71683273
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         4096
10       volume                0     71687369
* /dev/rdsk/dks13d70vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        4096    286745392
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         4096
10       volume                0    286749488
* /dev/rdsk/dks13d73vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        4096    286745392
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         4096
10       volume                0    286749488
* /dev/rdsk/dks13d74vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        4096    143370642
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         4096
10       volume                0    143374738
* /dev/rdsk/dks13d77vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        4096    143370642
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         4096
10       volume                0    143374738
* /dev/rdsk/dks4d114vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        3072     71684297
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         3072
10       volume                0     71687369
* /dev/rdsk/dks4d115vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        3072     71684297
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         3072
10       volume                0     71687369
* /dev/rdsk/dks4d116vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        3072     71684297
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         3072
10       volume                0     71687369
* /dev/rdsk/dks4d117vh (bootfile "")
*     512 bytes/sector
* Unallocated space:
*       Start         Size
*        3072     71684297
*
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 8       volhdr                0         3072
10       volume                0     71687369
prtvtoc: /dev/rdsk/dks5d1vh: I/O error
* /dev/rdsk/dks7d1vh (bootfile "")
*     512 bytes/sector
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 0          xfs  yes     2101248     33445628
 1          raw             4096      2097152
 8       volhdr                0         4096
10       volume                0     35546876
* /dev/rdsk/dks7d2vh (bootfile "/unix")
*     512 bytes/sector
Partition  Type  Fs   Start: sec    Size: sec   Mount Directory
 0          xfs  yes     8392704     27154172
 1          raw             4096      8388608
 8       volhdr                0         4096
10       volume                0     35546876
prtvtoc: /dev/rdsk/dks8d1vh: I/O error
enterprise 3#
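
To turn a listing like the one above into sizes, a small awk filter can pick out partition 10 (the whole volume) for each disk. This is a sketch: vol_sizes is a hypothetical name, and it relies on the 512 bytes/sector figure that prtvtoc reports above.

```shell
#!/bin/sh
# vol_sizes: read prtvtoc output on stdin; for each "* /dev/rdsk/...vh" header
# remember the device, then print it with the size of partition 10 in MiB
# (sectors * 512 / 1048576, truncated to a whole number).
vol_sizes() {
    awk '
        $1 == "*" && $2 ~ /^\/dev\/rdsk\// { dev = $2 }
        $1 == "10" && $2 == "volume"       { printf "%s %d MiB\n", dev, $4 * 512 / 1048576 }
    '
}

# Usage:
#   prtvtoc /dev/rdsk/*vh 2>/dev/null | vol_sizes
```

Devices that only return an I/O error (dks5d1 and dks8d1 above) produce no line.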

firewire (1394) Maxtor drive

fx -d /dev/rdsk/1394/10b9021153eeca/lun0vol/c5p0
enterprise 141# fx -d /dev/rdsk/1394/10b9021153eeca/lun0vol/c5p0
fx version 6.5, Jul  1, 2005
...opening /dev/rdsk/1394/10b9021153eeca/lun0vol/c5p0
...drive selftest...OK
fx: Warning:  device appears to be a floppy

Scsi drive type == Maxtor

----- please choose one (? for help, .. to quit this menu)-----
[exi]t               [d]ebug/             [l]abel/
[b]adblock/          [exe]rcise/          [r]epartition/
fx> r

----- partitions-----
part  type        blocks            Megabytes   (base+size)
  8: volhdr        0 + 4096           0 + 2    
 10: volume        0 + 1953525168       0 + 953870

capacity is 1953525168 blocks

----- please choose one (? for help, .. to quit this menu)-----
[ro]otdrive        [u]srrootdrive     [o]ptiondrive      [re]size
fx/repartition> o

fx/repartition/optiondrive: type of data partition = (xfs) ?
----- type of data partition-----
[x]fs        [e]fs
fx/repartition/optiondrive: type of data partition = (xfs) x
Warning: you will need to re-install all software and restore user data
from backups after changing the partition layout.  Changing partitions
will cause all data on the drive to be lost.  Be sure you have the drive
backed up if it contains any user data.  Continue? y
fx/repartition/optiondrive: Warning:  device appears to be a floppy


----- partitions-----
part  type        blocks            Megabytes   (base+size)
  7: xfs        4096 + 1953521072       2 + 953868
  8: volhdr        0 + 4096           0 + 2    
 10: volume        0 + 1953525168       0 + 953870

capacity is 1953525168 blocks

----- please choose one (? for help, .. to quit this menu)-----
[ro]otdrive        [u]srrootdrive     [o]ptiondrive      [re]size

fx/repartition> ..

----- please choose one (? for help, .. to quit this menu)-----
[exi]t               [d]ebug/             [l]abel/
[b]adblock/          [exe]rcise/          [r]epartition/
fx> exi
 mkfs_xfs /dev/dsk/1394/10b9021153eeca

/etc/fstab

# firewire drive
/dev/dsk/1394/10b9021153eeca/lun0vol/c5p0 /tmp2 xfs rw 0 0

Bricks


down:

001C10
001c16
008C24
013c10 PIMM 0 VCPU Low V @ 1.678
004c35 off
006c21 No 48V
012c32 off
014c29 off

004c35 off (PIMM issues, Luke 7-7-06)
006c21 off
008c24 powered down, display goofy.
012c32 off SGI Defective Part
014c24 off

console msg:

006r19 error: error reading the router control 110 expander
002c10 1.5v low @ 1.368V
001c24 powered off and back on
131p16 power bay
101i20 Lost Redundancy

L2find L2term

002c10 L1 (Voltage regulator Seems like it needs replacing.)

---- 12/22?/2006

13c10 PIMM0 VCPU low/stabilized@1.1748
11c32 Error Reading monitor node 1 interrupt status 1 : no Ack
101:20 DPS Power Bay 4 predictive fail
NASID 58-59 and 74-75 incomplete memory region
Autoboot failed
dksc(0,1,0)unix:no such file or Directory

14c24 was off

#### from phone conversation in early January with Al Duran(?) from SGI:

Load installation tools. The #1 CD contains installation tools.

esc to quit boot when prompted.

Maintenance Menu
#2 Install

Jump into shell

v: /root/etc
/old

It might be possible to break into the OS using the copy of the drive made during the installation (this was not necessary).

To break in:

passwd
delete second field
shadow
save

restart single user
admin 6.520 
esc to quit boot when prompted.
Maintenance Menu
#2 Install
  1. > shell

2002-2004 was end of life of disk box (d-brick??)

Built 94 or 95

Clarion Mass storage

Initial Installation process:

Load OS, Patch and configure

-xfs Dump restore

raid disk box – drop in – information from Joe or Eric. Does not recommend Apple.

####

To update hardware inventory database(re-inventory):

esc to abort boot

5 command monitor

>> update

Al's number: 804 839 8764

####

l2term to enter the l2 terminal control from the laptop connected to ??

008c24 PIMM 1 B
008c24 PIMM 1 A
014c24 PIMM 0 A
002c10 1.5v low

tty erase (backspace)

114 117
1
015c10
015c16

update klconfig_cpuinfo (error??)

load dksc(0,1,0)/unix : no such file or directory

c for continue goes to sash

sash:
sash: boot (to boot)

from l1

env

env reset (to clear messages)

log

cfs

from pod → reset

enable all

Off:

011c32
012c32
006c21
004c35

####

1-24-2007 still trying to get enterprise going.

0013c13 12 Bias low fault below 8.0V auto powerdown
002-L2 can't read packet from connection to 002c35 Irouter: read failed-read error

----

kill all l2term
startx

don't use the term that comes up.

start xterm

---- from pod, missing memory regions:

30-31
36-37
50-51
58-59
74-75
30 004c32 off
36 013c10 off
59 012c35 off
51 011c35 off
74 006c16 off

Kept turning off the complaining C-bricks until she came up.


ssh enterprise
swmgr
freeware.sgi.com
rsync 2.5.6
pkginfo
screen

####

1-25-07 still trying to get enterprise running. Went down during the night.

008c24 ALERT: Error reading monitor PIMM1 A and B interrupt status 1: "no acknowledge" and "bus busy"
l2term | tee /tmp/l2.log

during boot:

160 objects
116 hubs
44 routers
001c32: IRouter: read failed- read error
004c29 Alert: Error Reading monitor PIMM1 A interrupt status 2: arbitration lost.

l2gui (??)

004c24 Membanks 01 disabled
0016c35: CPU disabled, Reason: One or both cpu's took exception.
004c29 missing
008c24 missing

Incomplete memory regions (NASIDs):

92-93
28-29
122-123
016c36

as of 1-25-2007, these c bricks are turned off:

001c24
004c29
004c32
004c35 Luke 7/7/06 Pimm Issues
006c16
006c21
008c24
011c32
011c35
012c32 SGI Defective
012c35
013c10
013c13
016c35
004c24

1-29-2007

Rebooted and ran update.

113 hubs
44 routers
452 400MHz IP35 processors

Hardware information:

# hinv -v
# topology (needs original path, not freeware)

incomplete memory regions:

122-123
92-93
4-5

6-18-2007 Turned off

002C10
002C13 ?

Moved power connectors for router over left one position.

Jun 14 03:53:17 enterprise unix: |$(0x160)WARNING: 002c10 ATTN: 1.5V level stabilized @  1.354V.
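
To see which bricks show up in warnings like the one above, a sed filter can pull out the brick id and the message. This is a sketch: brick_warnings is a hypothetical name, and the pattern only matches the "WARNING: <rack><c|r><slot> ..." shape of this one logged line.

```shell
#!/bin/sh
# brick_warnings: read syslog lines on stdin; for lines containing
# "WARNING: " followed by a brick id such as 002c10, print "brick message".
brick_warnings() {
    sed -n 's/^.*WARNING: \([0-9][0-9]*[cr][0-9][0-9]*\) \(.*\)$/\1 \2/p'
}

# Usage (log path assumed to be the usual IRIX one):
#   brick_warnings < /var/adm/SYSLOG | cut -d' ' -f1 | sort | uniq -c
```

The usage line counts warnings per brick, which helps decide which brick to power off next.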

March 17, 2008

Commented ftp out of /etc/inetd.conf and sent inetd a HUP (kill -HUP) because the logs were filling up with connection attempts.
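
The same change as a sketch that prints the edited file instead of modifying it, so the result can be reviewed first. comment_out_ftp is a hypothetical helper; it assumes the service name is the first field of the inetd.conf line.

```shell
#!/bin/sh
# comment_out_ftp FILE: print FILE with every active line whose first field
# is "ftp" prefixed by "#". The file itself is left untouched.
comment_out_ftp() {
    awk '$1 == "ftp" { print "#" $0; next } { print }' "$1"
}

# Usage:
#   comment_out_ftp /etc/inetd.conf > /tmp/inetd.conf.new
#   diff /etc/inetd.conf /tmp/inetd.conf.new
```

After copying the reviewed file into place, inetd still needs the kill -HUP to reread its configuration.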

http://www.pimpworks.org/sgi/issues.html


April 11, 2008

Turned off 03c35(complaining) and 03c32(ok?)
Turned off 13c29(complaining) and 13c24(ok?)
Turned off 14c32(complaining) and 14c35(ok?)


May 19, 2008

Turned off 002c29(complaining) and 002c24(ok)
Turned off 012c16(complaining) and 012c21(ok)


July 2, 2008

Turned off 016c29(complaining) and 016c24
Turned off 007c21(complaining) and 007c16


Nov 3, 2008

Turned off 004r38(complaining)
Turned off 001c32(complaining) and 001c35


Dec 15, 2008

Rebooted


Dec 18, 2008

Turned off 003c10(complaining) and 003c13


Dec 19, 2008

011r19 complaining of high voltage

Switched power module with 6r19 and switched 6r19 with 4r38, since 4r38 was complaining worse.

That did not help, so switched 4r38 main board with 11r19. Now personalities are reversed.

Tried turning off 11r19 (which now reports as 4r38) and turning on 4r38.

Dec 23, 2008

Switched 4r38(bad one) with 12r19(good one)

4r38 ALERT: Error reading monitor POWER interrupt status 1: no acknowledge

This fixed 4r38, a repeating router, and allowed 11r19 to be turned off.


Jan 6, 2009

$(0x167)WARNING: /hw/module/016c13/node/cpubus/1/a: INTR: Scache SBE, failing bit 16 CPU F9C3.K3  SRAM  H4A7.P7

Oct 9, 2009

12c24 Error reading monitor Node 2 interrupt status 1: no acknowledge
14c16 PIMM0 High

Oct 16, 2009

12c32 Error reading monitor PIMM0 A interrupt status 1: no acknowledge
12c32 Error reading monitor PIMM0 A interrupt status 2: no acknowledge

Oct 19, 2009

1c24 Fatal PI error interrupt 0x80000000 (from CPU B)

graphviz

aoe/enterprise.txt · Last modified: 1970/01/01 00:00 (external edit)