The ramifications of non-routable IP addresses

The term non-routable IP means a little bit more than one might think.  There are of course IP ranges not used on the internet that are expected to be used on company intranets, but this isn’t the end of the story.  Oracle RAC, for instance, wants what it calls non-routable IP addresses for its interconnects.  In principle these could be any two addresses that can see each other, like 10.0.0.1 and 10.0.0.2, but Oracle seems to have a soft requirement that the interconnect interfaces use addresses set apart from the rest of the network.

In our environment, we have many different VLANs which are connected to each other through a switch.  Let’s call them VLAN 0, VLAN 4, and VLAN 8.  As a happy coincidence, the vast majority of the IPs on these VLANs follow this convention:

VLAN 0 ( 10.32.0 ) –> If machines all have 255.255.252.0 as their netmask, they could talk to each other within the VLAN if their IPs are between 10.32.0.1 and 10.32.3.254.  The gateway out is always specified as 10.32.0.1 but it doesn’t have to be.  It could be 10.32.2.17, but that would not be a good practice.

VLAN 4 ( 10.32.4 ) –> VLAN 4 starts at 10.32.4.1 and goes to 10.32.7.254, provided that everyone uses netmask 255.255.252.0.  If hosts within this VLAN decide to use a different netmask, such as 255.255.255.0, they will see a smaller subset of hosts without going through the gateway.  They might be able to see other hosts by going through the gateway, but things could get pretty ugly pretty fast, especially when one host treats the other as local and the other does not.

VLAN 8 ( 10.32.8 ) –> By now, you should get the point.  Let’s say, however, that we decide that all of the hosts on VLAN 8 should use netmask 255.255.255.0.  In this case the IP range would be smaller: 10.32.8.1 to 10.32.8.254.  But we also now have the potential to create something which we arbitrarily call VLAN 9 and could begin populating it with IPs 10.32.9.1 – 10.32.9.254 and a netmask of 255.255.255.0.
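The ranges quoted for each VLAN can be checked with a little shell arithmetic. This is only a sketch: the function names ip_to_int, int_to_ip and subnet_range are my own, and it assumes plain dotted-quad IPv4 input with no validation.

```shell
#!/bin/sh
# Sketch: derive the usable host range from one address and its netmask.
# All function names here are illustrative, not from any standard tool.

# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Convert an integer back into dotted-quad form.
int_to_ip() {
    echo "$(( $1 / 16777216 % 256 )).$(( $1 / 65536 % 256 )).$(( $1 / 256 % 256 )).$(( $1 % 256 ))"
}

# Print "first - last" usable host address for a given IP and netmask.
subnet_range() {
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    network=$(( $ip & $mask ))
    broadcast=$(( $network | (4294967295 ^ $mask) ))
    echo "$(int_to_ip $(( $network + 1 ))) - $(int_to_ip $(( $broadcast - 1 )))"
}

subnet_range 10.32.0.17 255.255.252.0   # 10.32.0.1 - 10.32.3.254
subnet_range 10.32.5.9  255.255.252.0   # 10.32.4.1 - 10.32.7.254
subnet_range 10.32.8.20 255.255.255.0   # 10.32.8.1 - 10.32.8.254
```

Any address inside a range gives the same answer, which is why the gateway could just as well be 10.32.2.17 as 10.32.0.1.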

For a long time, I thought that on the switch, where VLAN 4 was specified, some logic was also included to only allow IPs 10.32.4.1 – 10.32.7.254.  This is not the case.  If I put the IPs 10.32.12.23 and 10.32.12.24 on VLAN 4, they will see each other.  It would make sense for me to ratchet the netmask down to cover as few addresses as possible in this case, but the physical hardware of the switch will allow these two interfaces to talk to each other.  They would not see, or be seen by, other IPs on the same VLAN which did not fit into their IP/netmask restriction.  They also wouldn’t see the gateway unless it also fit into their scheme, 10.32.12.1 for example.
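Whether one interface treats another as local comes down to comparing the two addresses under its own netmask. Here is a rough sketch of that test; the names on_local_subnet and reach are hypothetical, and the 10.32.4.10 and 10.32.6.10 hosts are made-up examples on VLAN 4.

```shell
#!/bin/sh
# Sketch: does a host consider a peer to be on its local subnet?
# Function names and the VLAN 4 example hosts are illustrative only.

# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Succeeds when a host with address $1 and netmask $2 sees $3 as local.
on_local_subnet() {
    self=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    peer=$(ip_to_int "$3")
    [ $(( $self & $mask )) -eq $(( $peer & $mask )) ]
}

# Print how the host would try to reach the peer.
reach() {
    if on_local_subnet "$1" "$2" "$3"; then
        echo direct
    else
        echo gateway
    fi
}

reach 10.32.12.23 255.255.255.0 10.32.12.24   # direct
reach 10.32.12.23 255.255.255.0 10.32.4.1     # gateway
# Mismatched netmasks: each side answers the question differently.
reach 10.32.4.10 255.255.252.0 10.32.6.10     # direct
reach 10.32.6.10 255.255.255.0 10.32.4.10     # gateway
```

The last two calls show the ugly case: the first host sends straight to the second, while the second tries to reply through a gateway.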

One could imagine a really interesting configuration of a VLAN with 2 or 3 evenly divided IP ranges that couldn’t see each other.  Each would have its own unique gateway out.  Since I recently read “Gödel, Escher, Bach”, I am reminded of Escher’s image Double Planetoid, which represents two coexisting worlds that never see each other.

And so, in the real world, you probably wouldn’t have a VLAN of two equal parts that couldn’t see each other, the dinosaur IPs and the civilization IPs, but that doesn’t mean that it’s not possible, or that some dinosaur IPs may not exist in the VLAN.  This is the case with our non-routable RAC heartbeats and interconnects. They can’t exist in their own VLAN because they share interfaces and it would be a little silly to create special VLANs for only a few interfaces, yet they can’t really ever see the other addresses on their own VLAN and so the model is valid.

The easiest way to set up ssh without a password

Most people know how to create ssh public and private keys. If you hit enter when it asks for a passphrase, you then have keys that don’t need a password to authenticate. As I show below, you really only need to do this once for each user if you own a whole farm of servers. Only worry about unique keys if you are giving them away to someone else or putting them on a server outside of your control. This may seem lax, but if you tighten up security too much in some places, you end up with unwieldy policies that people find ways to work around.

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa):
Created directory '/home/username/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/username/.ssh/id_rsa.
Your public key has been saved in /home/username/.ssh/id_rsa.pub.
The key fingerprint is:
63:ef:3d:e0:83:86:57:5c:61:57:2c:5c:9f:a4:f2:c6 username@thishost

I like to start by making a link between /etc/ssh/ssh_known_hosts and /home/root/.ssh/known_hosts. This way, when I accept a host as root, it works for everybody. My philosophy is that if root trusts a host to be what it says it is, everyone else can trust it too:

cd /etc/ssh
ln -s /etc/ssh/ssh_known_hosts /home/root/.ssh/known_hosts

Next I use the same idea for users. Instead of making special keys for each server, I simply use the same one. This allows me to copy my user’s id_rsa.pub to authorized_keys:

cd /home/root/.ssh
scp root@trustedhost:/home/root/.ssh/id_rsa id_rsa
scp root@trustedhost:/home/root/.ssh/id_rsa.pub id_rsa.pub
cp id_rsa.pub authorized_keys
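One thing that can quietly break this setup is file permissions: the ssh client refuses a private key that other users can read, and sshd with StrictModes enabled ignores an authorized_keys file that others can write. A small sketch of tightening them, where secure_ssh_dir is a name I made up:

```shell
#!/bin/sh
# Sketch: tighten permissions on an ssh directory after copying keys in.
# secure_ssh_dir is a hypothetical helper, not part of OpenSSH.
secure_ssh_dir() {
    dir=${1:-$HOME/.ssh}
    # Only the owner may enter or list the directory.
    chmod 700 "$dir" || return 1
    # The private key and authorized_keys are owner read/write only.
    for f in "$dir/id_rsa" "$dir/authorized_keys"; do
        if [ -f "$f" ]; then
            chmod 600 "$f" || return 1
        fi
    done
    # The public key can stay world-readable.
    if [ -f "$dir/id_rsa.pub" ]; then
        chmod 644 "$dir/id_rsa.pub" || return 1
    fi
    return 0
}

# Example (not run here): secure_ssh_dir /home/root/.ssh
```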

After that, just run a test to see if it works:

>ssh trustedhost pwd
/home/root
>ssh trustedhost ssh thishost pwd
/home/root
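With a whole farm of servers, the same test can be looped. The sketch below uses ssh's BatchMode option so that a host still wanting a password fails immediately instead of prompting. The function name check_passwordless and the SSH_CMD override variable are my own inventions, the latter only there so the loop can be dry-run without a network.

```shell
#!/bin/sh
# Sketch: report which hosts accept key-based login without prompting.
# check_passwordless and SSH_CMD are hypothetical names for illustration.
check_passwordless() {
    for host in "$@"; do
        # BatchMode=yes disables password prompts; ConnectTimeout avoids
        # hanging on a dead host. The remote command is just "true".
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 \
               "$host" true >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}

# Example (not run here): check_passwordless trustedhost thishost
```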

Linux LVM

The Linux LVM looks a lot like the one in AIX, so it stands to reason that there is a set of AIX-like commands for Linux to use. This shows the basics of creating a filesystem. A more comprehensive introduction to LVM is at:

HowtoForge

I have created 4 new partitions and made them available to Linux for a test:

cd /dev
ls -l sd[c-f]
brw-r----- 1 root disk   8,  32 Nov  3 10:52 sdc
brw-r----- 1 root disk   8,  48 Nov  3 10:52 sdd
brw-r----- 1 root disk   8,  64 Nov  3 10:52 sde
brw-r----- 1 root disk   8,  80 Nov  3 10:52 sdf

And the LVM commands that I find are:

lvm
lvmchange
lvmdiskscan
lvmsadc
lvmsar

pvchange
pvcreate
pvdisplay
pvmove
pvremove
pvresize
pvs
pvscan

vgcfgbackup
vgcfgrestore
vgchange
vgck
vgconvert
vgcreate
vgdisplay
vgexport
vgextend
vgimport
vgmerge
vgmknodes
vgreduce
vgremove
vgrename
vgs
vgscan
vgsplit

I start with pvcreate:

 pvcreate /dev/sdc /dev/sdd /dev/sde /dev/sdf
  Physical volume "/dev/sdc" successfully created
  Physical volume "/dev/sdd" successfully created
  Physical volume "/dev/sde" successfully created
  Physical volume "/dev/sdf" successfully created

Next I display what I have done:

 pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               system
  PV Size               15.00 GB / not usable 0
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              3839
  Free PE               24
  Allocated PE          3815
  PV UUID               Vydyz4-Njvf-6Wyk-Xl2s-UQM1-VcG4-Gbdk9a

  --- NEW Physical volume ---
  PV Name               /dev/sdc
  VG Name
  PV Size               3.75 GB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               dYMnY2-V4By-N21v-s4Bh-nfMz-TKOp-xrmT2s

I already had one volume created (sdb1) and it clearly shows up a little differently. The new ones all say ‘NEW’ for one thing, and show Allocatable as NO, probably because they don’t belong to a volume group yet. So the next step is to create one:

vgcreate testvg /dev/sdc /dev/sdd /dev/sde /dev/sdf
  Volume group "testvg" successfully created

And the display command shows what we did:

vgdisplay
  --- Volume group ---
  VG Name               system
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               15.00 GB
  PE Size               4.00 MB
  Total PE              3839
  Alloc PE / Size       3815 / 14.90 GB
  Free  PE / Size       24 / 96.00 MB
  VG UUID               F8PeFV-TIJj-YHqB-SpYv-Qxgn-UuwH-yE8eQH

Just like in AIX, the volume group can now be split into logical volumes:

lvcreate --name testlv1 --size 2G testvg
  Logical volume "testlv1" created
lvcreate --name testlv2 --size 2G testvg
  Logical volume "testlv2" created
lvdisplay
  --- Logical volume ---
  LV Name                /dev/testvg/testlv1
  VG Name                testvg
  LV UUID                ixjAYJ-A8Uz-CUga-OCmo-HBvv-OaC0-wkR0pN
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                2.00 GB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1

The maps flag shows where each logical extent lives on disk, which is probably something close to AIX’s lslv -m:

lvdisplay --maps
  --- Logical volume ---
  LV Name                /dev/testvg/testlv1
  VG Name                testvg
  LV UUID                ixjAYJ-A8Uz-CUga-OCmo-HBvv-OaC0-wkR0pN
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                2.00 GB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1

  --- Segments ---
  Logical extent 0 to 511:
    Type                linear
    Physical volume     /dev/sdc
    Physical extents    0 to 511
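The extent counts in these displays are just the size divided by the 4 MB PE size, rounded up to a whole extent. A quick sketch to check the numbers above, with extents_for and extents_to_mb being names I picked:

```shell
#!/bin/sh
# Sketch: convert between sizes in MB and LVM extent counts.
# Both function names are illustrative.

# Extents needed to hold size_mb, rounded up to a whole extent.
extents_for() {
    size_mb=$1
    pe_mb=$2
    echo $(( ($size_mb + $pe_mb - 1) / $pe_mb ))
}

# Size in MB covered by a number of extents.
extents_to_mb() {
    echo $(( $1 * $2 ))
}

extents_for 2048 4      # 512, matching "Current LE" for the 2 GB volume
extents_to_mb 3839 4    # 15356 MB, the 15.00 GB "system" volume group
extents_to_mb 24 4      # 96 MB, matching "Free PE / Size"
```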

Even though we can refer to volume groups by name (testvg), logical volumes seem to be named by their device path. To extend and reduce the size of a logical volume, we have to say:

lvextend -L 2.5G /dev/testvg/testlv2
  Extending logical volume testlv2 to 2.50 GB
  Logical volume testlv2 successfully resized

lvreduce -L 2G /dev/testvg/testlv2
  WARNING: Reducing active logical volume to 2.00 GB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce testlv2? [y/n]: y
  Reducing logical volume testlv2 to 2.00 GB
  Logical volume testlv2 successfully resized
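That warning matters once a filesystem lives on the volume. testlv2 was empty here, but had it carried an ext3 filesystem, the filesystem would need to be shrunk before the volume. The sketch below only prints the order I would expect to use (unmount, check, resize2fs, then lvreduce); shrink_plan is a made-up helper and /test2 is a hypothetical mount point.

```shell
#!/bin/sh
# Sketch: print (not run) the order for shrinking an ext3 logical volume.
# shrink_plan and the /test2 mount point are hypothetical.
shrink_plan() {
    dev=$1
    size=$2
    mnt=$3
    echo "umount $mnt"               # ext3 cannot be shrunk while mounted
    echo "e2fsck -f $dev"            # resize2fs wants a freshly checked fs
    echo "resize2fs $dev $size"      # shrink the filesystem first
    echo "lvreduce -L $size $dev"    # then shrink the volume underneath it
    echo "mount $dev $mnt"
}

shrink_plan /dev/testvg/testlv2 2G /test2
```

Doing it in the other order cuts the end off the filesystem, which is exactly what the lvreduce warning is about.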

Next, let’s build a filesystem:

 mkfs.ext3 /dev/testvg/testlv1
mke2fs 1.38 (30-Jun-2005)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
262144 inodes, 524288 blocks
26214 blocks (5.00%) reserved for the super user
First data block=0
16 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

Finally, all we have to do is mount the filesystem and add a line to /etc/fstab for it:

mkdir /test
mount /dev/testvg/testlv1 /test
df -h /test

(add to /etc/fstab)
/dev/testvg/testlv1  /test                ext3       rw,noatime            0 0
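As a recap, the whole walk from raw disks to a mounted filesystem fits in a handful of commands. The sketch below just prints them in order rather than running anything, and lvm_fs_plan is a name of my own choosing.

```shell
#!/bin/sh
# Sketch: print (not run) the command sequence used in this walkthrough.
# lvm_fs_plan is a hypothetical helper name.
lvm_fs_plan() {
    vg=$1
    lv=$2
    size=$3
    mnt=$4
    shift 4                      # the remaining arguments are the disks
    echo "pvcreate $*"
    echo "vgcreate $vg $*"
    echo "lvcreate --name $lv --size $size $vg"
    echo "mkfs.ext3 /dev/$vg/$lv"
    echo "mkdir -p $mnt"
    echo "mount /dev/$vg/$lv $mnt"
}

lvm_fs_plan testvg testlv1 2G /test /dev/sdc /dev/sdd /dev/sde /dev/sdf
```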

Another LVM Howto

Emacs Lisp

I have been playing around with the Lisp built into Emacs, rather than one that you bring up with SLIME and run alongside Emacs.  Of course the first big difference is that so much is tied to how you call the function, pass it variables, and what it does.  Many if not all of the games are written in Emacs Lisp and you can go to help, read about them, and then pull up their Lisp source.

sumacheck – downloads patches from IBM and sends email

Suma (smitty suma) is a pretty cool automated patch downloader for AIX. I tweaked it a little bit to send me emails only when it finds something new. Just put this in cron and don’t worry about the cron setup through suma itself.

#################################################################
# Title      :  sumacheck - Goes to IBM and gets new patches
# Author     :  John Rigler
# Date       :  01-13-2009
# Requires   :  ksh
#################################################################

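# Ask suma to download the latest fixes and capture its output
/usr/suma/bin/suma -x -a RqType=Latest -a Action=Download \
 -a Repeats=n  > /tmp/$$.sumafile

# Keep only the lines that suma marked SUCCEED
grep SUCCEED /tmp/$$.sumafile > /tmp/$$.sumasuccess

# grep exits 0 only when it matched something, so mail only when there is news
if [[ $? -eq 0 ]]
 then
   mail -s "New Packages from IBM" userid@company.com \
   < /tmp/$$.sumasuccess
 fi

# Clean up the temporary files
rm -f /tmp/$$.sumafile /tmp/$$.sumasuccess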

Make sure system will boot OK

1. bosboot -ad /dev/hdisk5

root@nad0019aixd09/dev> bootlist -m normal -o
hdisk5 blv=hd5

ln rhdisk5 ipldevice   (use ln to create a hard link to the raw device if necessary)

These all need to exist and point to rootvg, rhd5, and rhdiskX (X being wherever the boot image is):

root@nad0019aixd09/dev> ls -l | grep -i ipl
crw-rw----   1 root     system       10,  0 Jan 11 2006  IPL_rootvg
crw-rw----   2 root     system       10,  1 Apr 23 13:02 ipl_blv
crw-------   2 root     system       20,  6 Apr 23 12:19 ipldevice

root@nad0019aixd09/dev> ipl_varyon -i


PVNAME          BOOT DEVICE     PVID                    VOLUME GROUP ID
hdisk2          NO              00033f6a7c51d6bd0000000000000000        00cdeaea00004c00
hdisk4          NO              00cdeaea38c938d20000000000000000        00cdeaea00004c00
hdisk5          YES             00033f6a7c4c17d40000000000000000        00cdeaea00004c00  <--- this one is important
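When there are many disks, it can help to pull out just the ones flagged as boot devices. A small sketch that filters ipl_varyon-style output; boot_disks is a name I made up, and the sample rows are the ones shown above.

```shell
#!/bin/sh
# Sketch: list the disks whose BOOT DEVICE column says YES.
# boot_disks is a hypothetical helper; on AIX it would be fed with
#   ipl_varyon -i | boot_disks
boot_disks() {
    awk '$2 == "YES" { print $1 }'
}

# Demo using the sample output from this post.
boot_disks <<'EOF'
PVNAME          BOOT DEVICE     PVID                    VOLUME GROUP ID
hdisk2          NO              00033f6a7c51d6bd0000000000000000        00cdeaea00004c00
hdisk4          NO              00cdeaea38c938d20000000000000000        00cdeaea00004c00
hdisk5          YES             00033f6a7c4c17d40000000000000000        00cdeaea00004c00
EOF
```

The result should match the disk that bootlist -m normal -o reports.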


Don’t panic if your boot hangs on led 538

538 The configuration manager is going to invoke a configuration method.

Tonight, we had to reboot one of our servers after an old version of PowerPath freaked out while discovering LUNs. The LUNs were discovered again on boot and we sat on 538 for about 15 minutes. When you are used to the whole LPAR coming up in less than 10 minutes, this can be scary, but just as we were about to make other plans, the LED moved on and cfgmgr finished.

When trying to bring the same LUNs online in normal mode, the server would hang on 538 forever for some reason. I suspect it is because we are at 5200-08 and PowerPath 3.0.4.0, really old stuff.

Of course something similar happens when installing upgrades: it seems to hang forever.

If the network config is wrong, it will get past config manager and hang on NFS or something like that. In this case, I usually boot up with an alternate profile that doesn’t have any network adapters, then from the console I just rmdev everything and then reboot back with my old profile. Works every time.

We saw this again later and it took more like 40 minutes but then came up.

How to drop into maintenance shell during an mksysb install

I was cloning a system from a NIM mksysb at AIX 5.3 ML 7 and chose the option to reduce the filesystem sizes.  This seemed to work fine, but it reduced /tmp to the point that there was no room to create a boot image when the install got to that step.  No problem: I went to the maintenance shell, removed some useless files, and reran the command.  The bosboot worked fine.  When I exited from the maintenance shell, the install continued and finished successfully.  Unless you get failures during the actual restore command, you can often figure out what went wrong and just continue.  Sysback operates the same way.  The installer appears to have retried the step itself, because when I continued on, it showed ‘Creating boot image’ again:

0301-152 bosboot: not enough file space to create:                             
         bootimage                                                             
         /tmp has 31164 free KB.                                               
         bootimage needs 36242 KB.                                             

BOS Install: Could not create boot image.                                      
   ID#        OPTION                                                           
     1        Continue                                                         
     2        Perform System Maintenance and Then Continue                     
   Enter ID number: 2                                                          

# bosboot -ad /dev/hdisk0                                                      
                                                                               
0301-152 bosboot: not enough file space to create:                             
         bootimage                                                             
         /tmp has 31160 free KB.                                               
         bootimage needs 36242 KB.                                             
# cd /tmp                                                                      

(Remove a bunch of extra stuff you don't need anyway, in this case old Oracle dumps or patch installs)

# bosboot -ad /dev/hdisk0                                                      
                                                                               
bosboot: Boot image is 35867 512 byte blocks.                                  
# exit

         Please wait...

        Approximate     Elapsed time
     % tasks complete   (in minutes)

          87               14      Creating boot image.

   Copyright BULL 1993, 2007.
   Copyright Digi International Inc. 1988-1993.
   Copyright Interactive Systems Corporation 1985, 1991.
   Copyright ISQUARE, Inc. 1990.
   Copyright Mentat Inc. 1990, 1991.
   Copyright Open Software Foundation, Inc. 1989, 1994.
   Copyright Sun Microsystems, Inc. 1984, 1985, 1986, 1987, 1988, 1991.

 All rights reserved.
 US Government Users Restricted Rights - Use, duplication or disclosure
 restricted by GSA ADP Schedule Contract with IBM Corp.

forced unmount of /var
Rebooting . . .

After that everything works fine.
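If you want to know how much to delete from /tmp before rerunning bosboot, the 0301-152 message already contains both numbers. Here is a rough sketch that reads the message text and prints the shortfall in KB. bosboot_shortfall is a hypothetical name, and it assumes the message format shown above.

```shell
#!/bin/sh
# Sketch: compute how many KB must be freed, from a bosboot 0301-152 message.
# bosboot_shortfall is a hypothetical helper; it reads the message on stdin.
bosboot_shortfall() {
    awk '/has [0-9]+ free KB/        { free = $3 + 0 }
         /bootimage needs [0-9]+ KB/ { need = $3 + 0 }
         END { if (need > free) print need - free; else print 0 }'
}

# Demo using the first error message from this post.
bosboot_shortfall <<'EOF'
0301-152 bosboot: not enough file space to create:
         bootimage
         /tmp has 31164 free KB.
         bootimage needs 36242 KB.
EOF
```

For the message above this works out to 5078 KB, so clearing a few megabytes of old dumps is plenty.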