Set up a VIO server to be more usable

The special padmin user that IBM tried to make us all use was a big mistake. A VIO server is really just an AIX server with a different set of commands. I find it easier to circumvent all of this:

SEED=somegoodhost
scp $SEED:/etc/security/login.cfg /etc/security/login.cfg
chuser su=true login=true rlogin=true shell=/usr/bin/bash root
ln -s /usr/ios/cli/ioscli /usr/sbin/ioscli
chuser shell=/usr/bin/bash groups=staff,buxs rigljoha

All the places hostnames have to get changed in AIX

The hostname gets hardcoded into a couple of nasty places that the ‘hostname’ command doesn’t see or change.  The first is ‘uname -n’ which you set with ‘uname -S newname’.  Oracle tends to use old perl scripts that use this to find the hostname.  To be safe, read the rest of this post also.

The hostname command itself seems to simply read from inet0, but sometimes this doesn’t work:

lsattr -El inet0 | grep hostname

chdev -l inet0 -a hostname=newname
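
When comparing the different places a hostname lives, it can help to pull the value out of the lsattr output instead of eyeballing it. This is a minimal sketch, assuming the usual lsattr -El column layout (attribute name first, value second). The function name inet0_hostname is made up for this example:

```shell
# Print the value column of the "hostname" attribute from
# `lsattr -El inet0` output read on stdin.
# Assumes the attribute name is field 1 and its value is field 2.
inet0_hostname() {
    awk '$1 == "hostname" { print $2; exit }'
}
```

On a live system it would be fed with lsattr -El inet0 | inet0_hostname, and the result compared against the output of hostname and uname -n.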

Apparently if you change the hostname of a system, it doesn’t get changed in the IBM.Host stanza of RSCT, which CRM relies on.  You must run the recfgct command to regenerate it, as in the session below.  Also, if your hostname seems to magically change back to an old name on reboot, check /etc/rc.net to see if it is hardcoded there.

> lsrsrc IBM.Host
Resource Persistent Attributes for IBM.Host
resource 1:
Name = “newnodename”
NumProcessors = 16
RealMemSize = 5033164800
OSName = “AIX”
KernelVersion = “5.3”
DistributionName = “IBM”
DistributionVersion = “5300-06-07-0818”
Architecture = “ppc”
NumOnlineProcessors = 8
EntProcCapacity = 150
NumOnVProcessors = 4
NumActPProcessors = 4
ActivePeerDomain = “”
NodeNameList = {“oldnodename”}
> /usr/sbin/rsct/install/bin/recfgct
(doesn’t return anything but takes a few seconds to regenerate from the HMC)

> lsrsrc IBM.Host
Resource Persistent Attributes for IBM.Host
resource 1:
Name = “newnodename”
NumProcessors = 16
RealMemSize = 5033164800
OSName = “AIX”
KernelVersion = “5.3”
DistributionName = “IBM”
DistributionVersion = “5300-06-07-0818”
Architecture = “ppc”
NumOnlineProcessors = 8
EntProcCapacity = 150
NumOnVProcessors = 4
NumActPProcessors = 4
ActivePeerDomain = “”
NodeNameList = {“correctnodename”}
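
The before/after check on NodeNameList can be scripted. This is a rough sketch, assuming the real lsrsrc output uses plain ASCII double quotes and one NodeNameList line. Both function names are hypothetical helpers, not RSCT commands:

```shell
# Print the node name(s) from the NodeNameList line of `lsrsrc IBM.Host`
# output read on stdin, with the braces and double quotes stripped.
rsct_node_names() {
    sed -n 's/^[[:space:]]*NodeNameList[[:space:]]*=[[:space:]]*{\(.*\)}[[:space:]]*$/\1/p' |
        tr -d '"'
}

# Succeed only if the RSCT node name equals the name given as $1.
# Reads `lsrsrc IBM.Host` output on stdin.
rsct_name_matches() {
    [ "$(rsct_node_names)" = "$1" ]
}
```

For example, lsrsrc IBM.Host | rsct_name_matches newname || echo "RSCT still has the old name" would flag a system that still needs recfgct.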

 

So to recap:

hostname newname
uname -S newname
chdev -l inet0 -a hostname=newname
/usr/sbin/rsct/install/bin/recfgct
check /etc/rc.net to see if the hostname has been hardcoded into it by HACMP or something
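
The recap can be wrapped up so the commands get reviewed before they are run. This is only a sketch: rename_steps prints the four commands rather than executing them, and rc_net_hostname_lines lists the uncommented lines in rc.net that mention hostname. Both names are invented for this example:

```shell
# Print (do not run) the rename steps from the recap for the name in $1.
rename_steps() {
    [ -n "$1" ] || { echo "usage: rename_steps newname" >&2; return 1; }
    echo "hostname $1"
    echo "uname -S $1"
    echo "chdev -l inet0 -a hostname=$1"
    echo "/usr/sbin/rsct/install/bin/recfgct"
}

# List non-comment lines that mention "hostname" in the file given as $1
# (default /etc/rc.net), with line numbers, so a hardcoded name stands out.
rc_net_hostname_lines() {
    grep -n 'hostname' "${1:-/etc/rc.net}" | grep -v '^[0-9]*:[[:space:]]*#'
}
```

Once the printed commands look right, rename_steps newname | sh would run them as root.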

Restart the rmcd daemons on an LPAR

Sometimes when you try to DLPAR something from the HMC, it fails to communicate. This happens because some daemons aren’t running on the client.

I have run this on a VIO server. I don’t know if it will affect a client that is already using RSCT for something else (HACMP):

To reset the daemons:
/usr/sbin/rsct/install/bin/recfgct
/usr/sbin/rsct/bin/rmcctrl -p
/usr/sbin/rsct/bin/rmcctrl -z
/usr/sbin/rsct/bin/rmcctrl -A
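
To confirm the daemons actually came back, the status of the RMC subsystem can be read from lssrc. This is a small sketch, assuming the subsystem is registered as ctrmc and that the status is the last column of the lssrc output. rmc_status is a made-up helper name:

```shell
# Read `lssrc -s ctrmc` output on stdin and print the status column
# ("active", "inoperative", ...). The status is taken as the last field
# because the PID column is blank when the subsystem is not running.
rmc_status() {
    awk '$1 == "ctrmc" { print $NF; exit }'
}
```

Usage would be lssrc -s ctrmc | rmc_status, which should print active before you retry the DLPAR operation from the HMC.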

Use the ioscli command to circumvent padmin

As root on a VIO server, you can run /usr/ios/cli/ioscli followed by whatever padmin command you want to execute.  In this way, you can do things like create scripts:

# /usr/ios/cli/ioscli lsmap -all

Better yet, just get it into your path:

ln -s /usr/ios/cli/ioscli /usr/sbin/ioscli
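
As an example of the kind of script this makes possible, here is a sketch that boils lsmap -all down to one line per mapping. It assumes the default lsmap layout (a vhostN line per adapter, followed by "VTD" and "Backing device" lines per mapped device). lsmap_summary is a hypothetical name:

```shell
# Summarize `ioscli lsmap -all` output read on stdin as
# "vhost VTD backing-device" lines, one per mapped device.
lsmap_summary() {
    awk '
        $1 ~ /^vhost[0-9]+$/ { vhost = $1 }   # adapter line
        $1 == "VTD"          { vtd = $2 }     # virtual target device
        $1 == "Backing" && $2 == "device" {
            # $3 is empty when nothing backs the VTD
            print vhost, vtd, ($3 == "" ? "-" : $3)
        }
    '
}
```

With the symlink in place, ioscli lsmap -all | lsmap_summary gives a compact list that is easy to grep or diff between two VIO servers.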

A weird but useful flag for the lsdev command is -p, which lists the child devices of a parent (here, everything mapped to vhost0):

lsdev -Cp vhost0

Other lsdev commands are:

lsdev -Cs vtdev
lsdev -Cc virtual_target

I got these from the odm:

-bash-3.00$ odmget CuDv | grep disk | tail
        name = "hdiskpower24_"
        PdDvLn = "virtual_target/vtdev/scdisk"
        name = "hdiskpower25_"
        PdDvLn = "virtual_target/vtdev/scdisk"
        name = "hdiskpower26_"
        PdDvLn = "virtual_target/vtdev/scdisk"
        name = "hdiskpower32_"
        PdDvLn = "virtual_target/vtdev/scdisk"
        name = "hdiskpower35_"
        PdDvLn = "virtual_target/vtdev/scdisk"
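
The PdDvLn value is class/subclass/type, which is where the -c and -s arguments above come from. To pair each device name with its PdDvLn without the grep and tail guesswork, something like the following sketch could be used. It assumes the usual odmget stanza layout (key = "value", with name appearing before PdDvLn in each stanza). cudv_by_class is an invented name:

```shell
# Read `odmget CuDv` output on stdin and print "name PdDvLn" for every
# device whose PdDvLn (class/subclass/type) starts with the prefix in $1.
cudv_by_class() {
    awk -v prefix="$1" '
        $1 == "name"   { gsub(/"/, "", $3); name = $3 }
        $1 == "PdDvLn" {
            gsub(/"/, "", $3)
            if (index($3, prefix) == 1) print name, $3
        }
    '
}
```

For example, odmget CuDv | cudv_by_class virtual_target/ would list only the virtual target devices.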

lquerypv (Undocumented command for determining disk info)

Lquerypv simply reads the data from the disk and displays it in a format similar to octal dump (od). In the example below, we see the PVID written to the disk at offset 0x80.  You seem to be able to read anything that you point lquerypv at (I tried /etc/motd and read it just fine).  This is great for reading the PVID of a logical volume on a VIO server that is pretending to be a virtual disk on a client, since you can’t see that information with lspv.

Lquerypv is also a great command for figuring out where disk access issues are.  If lquerypv returns any data, then you can read the disk and it isn’t a reserve issue.  If it can’t read any data, and just hangs or returns nothing, then ABSOLUTELY NO OTHER AIX COMMAND WILL WORK.  At this point you should stop looking at your filesystems, volume groups and logical volumes.  The issue is that you simply can’t read the disks, and you need to either go to the VIO server and see if there is a problem there, or use lsattr -El hdisk0 to check the SCSI reserve (on another system that might be sharing the disk).

If the issue is on your VIO server, or you have direct-attached SAN disks, then ask your SAN administrators to check their stuff. If, however, queries against all of your disks hang, especially during an initial install, then maybe your client SAN software is messed up; you could try to remove it and use the MPIO version, or just re-install it. The clearest sign of one disk with a reserve lock at the SAN level is when lquerypv against that disk returns nothing and lquerypv against other disks works fine.

# lspv
hdisk0          00031691bced4a4e                    oraclevg        active
hdisk2          00cdeaeadfcd0ebc                    oraclevg        active
hdisk1          00031691bcd549a6                    rootvg          active
# lquerypv -h /dev/hdisk0
00000000   C9C2D4C1 00000000 00000000 00000000  |................|
00000010   00000000 00000000 00000000 00000000  |................|
00000020   00000000 00000000 00000000 00000000  |................|
00000030   00000000 00000000 00000000 00000000  |................|
00000040   00000000 00000000 00000000 00000000  |................|
00000050   00000000 00000000 00000000 00000000  |................|
00000060   00000000 00000000 00000000 00000000  |................|
00000070   00000000 00000000 00000000 00000000  |................|
00000080   00031691 BCED4A4E 00000000 00000000  |......JN........|
00000090   00000000 00000000 00000000 00000000  |................|
000000A0   00000000 00000000 00000000 00000000  |................|
000000B0   00000000 00000000 00000000 00000000  |................|
000000C0   00000000 00000000 00000000 00000000  |................|
000000D0   00000000 00000000 00000000 00000000  |................|
000000E0   00000000 00000000 00000000 00000000  |................|
000000F0   00000000 00000000 00000000 00000000  |................|
#
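
Pulling the PVID out of that dump can be automated. This is a minimal sketch, assuming the dump format shown above (offset in the first column, then four 32-bit words). pvid_from_dump is a made-up helper name:

```shell
# Read `lquerypv -h /dev/hdiskN` output on stdin and print the PVID:
# the first two 32-bit words on the 0x80 line, lowercased to match lspv.
pvid_from_dump() {
    awk '$1 == "00000080" { print tolower($2 $3); exit }'
}
```

Run as lquerypv -h /dev/hdisk0 | pvid_from_dump, the dump above would give 00031691bced4a4e, the same PVID that lspv shows for hdisk0. An empty result means either there is no PVID on the device or the read returned nothing.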