Apr 14, 2010

Open Consulting Article

This is a great article on open-consulting. I find it similar to articles on open source.

Written by admin in: Coffee |
Apr 08, 2010

The ramifications of non-routable IP addresses

The term non-routable IP means a little more than one might think.  There are of course IP ranges that are not used on the internet and are expected to be used on company intranets, but this isn’t the end of the story.  Oracle RAC, for instance, wants what it calls non-routable IP addresses.  These could be any two addresses that can see each other, like 10.0.0.1 and 10.0.0.2, but Oracle seems to have a soft requirement that the interfaces for interconnects have different addresses.

In our environment, we have many different VLANs which are connected to each other through a switch.  Let’s call them VLAN 0, VLAN 4, and VLAN 8.  As a happy coincidence, the vast majority of the IPs on these VLANs follow this convention:

VLAN 0 ( 10.32.0 ) –> If machines all have 255.255.252.0 as their netmask, they could talk to each other within the VLAN if their IPs are between 10.32.0.1 and 10.32.3.254.  The gateway out is always specified as 10.32.0.1 but it doesn’t have to be.  It could be 10.32.2.17, but that would not be a good practice.

VLAN 4 ( 10.32.4 ) –> VLAN 4 starts at 10.32.4.1 and goes to 10.32.7.254, provided that everyone uses netmask 255.255.252.0.  If hosts within this VLAN decide to use a narrower netmask, such as 255.255.254.0, they will see a smaller subset of hosts without going through the gateway.  They might be able to see other hosts by going through the gateway, but things could get pretty ugly pretty fast, especially when two hosts disagree about whether the other one is on the local VLAN.

VLAN 8 ( 10.32.8 ) –> By now, you should get the point.  Let’s say, however, that we decide that all of the hosts on VLAN 8 should use netmask 255.255.255.0.  In this case the IP range would be smaller: 10.32.8.1 to 10.32.8.254.  But we also now have the potential to create something which we arbitrarily call VLAN 9 and could begin populating it with IPs 10.32.9.1 – 10.32.9.254 and a netmask of 255.255.255.0.
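
The ranges above fall straight out of the netmask arithmetic. Here is a minimal sketch in plain sh that prints the first and last usable host address for an IP and netmask. The function names (ip_to_int, int_to_ip, subnet_range) are my own for this example and are not part of any standard tool.

```shell
#!/bin/sh
# Convert a dotted-quad address to a 32-bit integer.
# The body runs in a subshell so IFS and the positional parameters stay local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the first and last usable host address for an IP and netmask.
subnet_range() {
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    net=$(( ip & mask ))                   # network address
    bcast=$(( net + 4294967295 - mask ))   # network plus the inverted mask
    echo "$(int_to_ip $(( net + 1 ))) $(int_to_ip $(( bcast - 1 )))"
}

subnet_range 10.32.5.20 255.255.252.0
subnet_range 10.32.8.77 255.255.255.0
```

Any address inside a range gives the same answer, so it is a quick way to check which hosts a machine will treat as local before you put it on the wire.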

For a long time, I thought that on the switch, where VLAN 4 was specified, some logic was also included to only allow IPs 10.32.4.1 – 10.32.7.254.  This is not the case.  If I put the IPs 10.32.12.23 and 10.32.12.24 on VLAN 4, they will see each other.  It would make sense for me to ratchet down the netmask to be as small as possible in this case, but the physical hardware of the switch will allow these two interfaces to talk to each other.  They would not be seen by, or see, other IPs on the same VLAN which did not fit into their IP/netmask restriction.  They also wouldn’t see the gateway unless it also fit into their scheme, 10.32.12.1 for example.
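
The point is that each host decides for itself, from its own IP and netmask, whether a peer is local or has to be reached through the gateway. A small sketch of that decision follows. The names same_subnet and report are made up for this example, and it only models the host-side check, not anything the switch does.

```shell
#!/bin/sh
# Convert a dotted-quad address to a 32-bit integer (subshell keeps IFS local).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit 0) when both addresses land in the same network under the mask.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Print a one-line verdict for a pair of addresses and a netmask.
report() {
    if same_subnet "$1" "$2" "$3"; then
        echo "$1 and $2 talk directly with netmask $3"
    else
        echo "$1 needs a gateway to reach $2 with netmask $3"
    fi
}

report 10.32.12.23 10.32.12.24 255.255.255.0
report 10.32.12.23 10.32.4.10 255.255.252.0
```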

One could imagine a really interesting configuration of a VLAN with two or three evenly divided IP ranges that couldn’t see each other.  Each would have its own unique gateway out.  Since I recently read “Gödel, Escher, Bach”, I am reminded of Escher’s Double Planetoid, which represents two coexisting worlds that never see each other.

In the real world, you probably wouldn’t have a VLAN of two equal parts that couldn’t see each other, the dinosaur IPs and the civilization IPs, but that doesn’t mean that it’s not possible, or that some dinosaur IPs may not exist in the VLAN.  This is the case with our non-routable RAC heartbeats and interconnects.  They can’t exist in their own VLAN because they share interfaces, and it would be a little silly to create special VLANs for only a few interfaces, yet they can’t really ever see the other addresses on their own VLAN, and so the model is valid.

Written by admin in: Install,Network |
Apr 01, 2010
Mar 04, 2010

bash versus powershell: shell escapes

bash:

/home/coffee> echo "This is the `date`"

This is the Thu Mar  4 10:16:33 CST 2010

powershell:

"This is the $(Get-Date)"

This is the 03/04/2010 10:34:41
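
The shell also accepts the $(...) form of command substitution, which lines up more closely with the PowerShell syntax and nests without the escaping that backticks need. A small sketch (the stamp function name is made up for this example):

```shell
#!/bin/sh
stamp() {
    # $(date) is replaced by the output of the date command
    echo "This is the $(date)"
}

stamp

# Nested substitution: the inner $(pwd) runs first, then basename
echo "Current directory name: $(basename "$(pwd)")"
```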

Written by admin in: Powershell |
Dec 08, 2009

Formula for converting memory pages to gigabytes

vmstat in AIX returns memory statistics in 4096-byte pages. There are 256 of these pages in a megabyte and 1024 megabytes in a gigabyte, so to convert to gigabytes:

number of blocks / 256 / 1024
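
As a quick sketch, the same formula can be wrapped in a shell function. The name pages_to_gb is my own, and it assumes the 4096-byte page size stated above.

```shell
#!/bin/sh
# Convert a count of 4096-byte pages to gigabytes, with two decimal places.
pages_to_gb() {
    awk -v pages="$1" 'BEGIN { printf "%.2f\n", pages / 256 / 1024 }'
}

pages_to_gb 262144      # 262144 pages * 4096 bytes is exactly 1 GB
pages_to_gb 1048576     # four times as many pages
```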

Written by admin in: Memory |
Dec 07, 2009

Interesting article about the future of salespeople

This article is about consumer electronics, but could also apply to many of the products that we use in IT.  As far as I am concerned, I would rather just go to a website to purchase IBM equipment with whatever config I want.  Then I want to go back a year later, pull up my machine, and perhaps add an adapter or two to it.

Link

Written by admin in: Coffee |
Nov 19, 2009

Install Oracle 10g on AIX

Start with a gz file, uncompress it and then use cpio to unpack it:

gunzip 10gr2_aix5l64_database.cpio.gz
cpio -idcmv < 10gr2_aix5l64_database.cpio

I had to use the 'c' flag because I got a cpio error when I tried without.

In a weird coincidence, I found this on a site almost like my own:
Life After Coffee

This will make a 'Disk1' subdirectory, a painful reminder that this is just code from a CD that got shuffled over here:

#cd Disk1/rootpre
#./rootpre.sh
./rootpre.sh output will be logged in /tmp/rootpre.out_09-11-19.10:54:06
Saving the original files in /etc/ora_save_09-11-19.10:54:06....
Copying new kernel extension to /etc....
Loading the kernel extension from /etc

 Oracle Kernel Extension Loader for AIX
       Copyright (c) 1998,1999 Oracle Corporation

 Successfully loaded /etc/pw-syscall.64bit_kernel with kmid: 0x4525300
 Successfully configured /etc/pw-syscall.64bit_kernel with kmid: 0x4525300
The kernel extension was successfuly loaded.

Configuring Asynchronous I/O....

Configuring POSIX Asynchronous I/O....

Checking if group services should be configured....
Nothing to configure.

Now create an oracle userid and a dba and oinstall group:

mkgroup dba
mkgroup oinstall
mkuser -a groups=dba,oinstall oracle
mkdir /apps/oracle
chown oracle:dba /apps/oracle

The rest is a little tricky: you now have to set up an X environment. Since IBM stopped making machines where you could just bebop into the data center and log into CDE, you will probably be using something on your client PC to do this. I use cygwin. Setting up cygwin is almost a different post, but I set up the default for X11 and got an error that twm wasn't found. From my cygwin bash shell I went to /etc/X11/xinit and eventually just replaced my xinitrc by doing this:

echo "xterm" > xinitrc

It is ugly, but I get an X environment with a shell that works.

From that shell, I run 'xhost +' and then 'ipconfig' to get my IP address.

From the AIX session, I create a new /apps filesystem, become the 'oracle' user, and then export my display:

# crfs -v jfs2 -g rootvg -m /apps -a size=5G
File system created successfully.
5242516 kilobytes total disk space.
New File System size is 10485760
# mount /apps
# chown oracle.dba /apps
# su - oracle
$ export DISPLAY=10.32.32.95:0.0
$ xclock

I see a clock in my xterm, so I control-C in AIX to kill it. I now have my environment set up to run the installer:

$ cd /tmp/Disk1
$ ./runInstaller
********************************************************************************

Your platform requires the root user to perform certain pre-installation
OS preparation.  The root user should run the shell script 'rootpre.sh' before
you proceed with Oracle installation.  rootpre.sh can be found at the top level
of the CD or the stage area.

Answer 'y' if root has run 'rootpre.sh' so you can proceed with Oracle
installation.
Answer 'n' to abort installation and then ask root to run 'rootpre.sh'.

********************************************************************************

Has 'rootpre.sh' been run by root? [y/n] (n)
y

Starting Oracle Universal Installer...

No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
Preparing to launch Oracle Universal Installer from /tmp/OraInstall2009-11-19_11-11-29AM. Please wait ...$ Nov 19, 2009 11:11:36 AM java.util.prefs.FileSystemPreferences$2 run
INFO: Created user preferences directory.
Nov 19, 2009 11:11:38 AM java.util.prefs.FileSystemPreferences$3 run
INFO: Created system preferences directory in java.home.

Now I am in the GUI, but I don't feel like posting GUI screen shots, so I will just talk you through it. I change the install path to be under /apps/oracle. Ditto with the next question about the inventory directory:

/apps/oracle
/apps/oracle/oraInventory
I set the operating system group to 'dba'

I can't cut and paste from the GUI window, but the checks give me one warning:

bos.adt.prof 5.3.0.1
bos.cifs_fs.rte 5.3.0.1 <---- this didn't install correctly with NIM, I had to move to
my local system and run installp.

I pulled out my media and installed them, no problem. The retry button doesn't work, so I had to hit back and forward on the GUI to get it to really retry the tests and be successful. It looks like installing everything requires 3.46G.

After it all runs, I am asked to run as root:

# /apps/oracle/oraInventory/orainstRoot.sh
# ./orainstRoot.sh
Changing permissions of /apps/oracle/oraInventory to 775.
Changing groupname of /apps/oracle/oraInventory to dba.
The execution of the script is complete

and also:

# /apps/oracle/product/10.2.0/db_1/root.sh
Running Oracle10 root.sh script...

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /apps/oracle/product/10.2.0/db_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:
Creating /usr/local/bin directory...
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.

I am just going to start with the default instance:
cd /apps/oracle/product/10.2.0/db_1/dbs/
cp init.ora initoracle.ora

Default shared memory isn’t enough:

$ sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.1.0 - Production on Thu Nov 19 12:26:24 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-00371: not enough shared pool memory, should be atleast 123232153 bytes
SQL>

Remove these:

shared_pool_size = 3500000                                            # SMALL
# shared_pool_size = 5000000                                          # MEDIUM
# shared_pool_size = 9000000                                          # LARGE

Replace with:

shared_pool_size = 123232153                                    # Required
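
Oracle reports the minimum in bytes. If you would rather put a rounder number in the init file, a sketch like this rounds the reported value up to the next whole megabyte. The function name round_up_mb is hypothetical, and the 1 MB granularity is only my assumption about what reads well, not an Oracle requirement.

```shell
#!/bin/sh
# Round a byte count up to the next multiple of 1 MB (1048576 bytes).
round_up_mb() {
    mb=$(( ($1 + 1048575) / 1048576 ))
    echo $(( mb * 1048576 ))
}

# The minimum from the ORA-00371 message above
round_up_mb 123232153
```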

$ sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.1.0 - Production on Thu Nov 19 12:32:33 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 163577856 bytes
Fixed Size 2019328 bytes
Variable Size 150994944 bytes
Database Buffers 8388608 bytes
Redo Buffers 2174976 bytes
ORA-00205: error in identifying control file, check alert log for more info

I then become oracle and try to get into sqlplus:

. /usr/local/bin/oraenv
ORACLE_SID = [oracle] ?
ksh: dbhome: not found.
ORACLE_HOME = [] ? /apps/oracle/product/10.2.0/db_1
$ sqlplus "/ as SYSDBA"

SQL*Plus: Release 10.2.0.1.0 - Production on Thu Nov 19 12:05:34 2009

Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to an idle instance.

SQL>

Written by admin in: Oracle |
Nov 16, 2009

oslevel -s gives bogus message; 70324,004

On AIX 6.1, we see the following error:

# oslevel -s
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.487456.430278_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
6100-02-05-0939
#

This may be something we can further debug, if oslevel is a shell script.

First of all, figure out where oslevel is:

# whereis oslevel
oslevel: /usr/bin/oslevel

Next, figure out what it is:

# file /usr/bin/oslevel
/usr/bin/oslevel: shell script  - ksh (Korn shell)

Now that we know it’s a shell script, we put in a set -x. To trace from the very beginning, you
can just put a -x after the ksh line:

Change:

#!/bin/ksh
# IBM_PROLOG_BEGIN_TAG
# This is an automatically generated prolog.
#
# bos53L src/bos/usr/bin/oslevel/oslevel.sh 1.5.6.2
#

To this:

#!/bin/ksh -x
# IBM_PROLOG_BEGIN_TAG
# This is an automatically generated prolog.
#
# bos53L src/bos/usr/bin/oslevel/oslevel.sh 1.5.6.2
#

Then, run it again (I only show the end of the output):

+ [[ 1 = 1 ]]
+ [[ 0 = 1 ]]
+ [[ 0 = 1 ]]
+ [[ 0 = 1 ]]
+ print_current_spack
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.667776.684286_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
6100-02-05-0939
+ exit 0
+ interrupted

It looks like we want a function called ‘print_current_spack’. The set -x doesn’t descend
into functions, so take the original one out and add one inside this function:

Before:

print_current_spack()
{
        trap interrupted INT QUIT TERM
        typeset BUF
        typeset DVRMF
        typeset DSPNO
        typeset DTLNO

After:

print_current_spack()
{
        set -x
        trap interrupted INT QUIT TERM
        typeset BUF
        typeset DVRMF
        typeset DSPNO
        typeset DTLNO

Now we see the next level of debugging show up at the top of the output:

# oslevel -s | more
+ trap interrupted INT QUIT TERM
+ typeset BUF
+ typeset DVRMF
+ typeset DSPNO
+ typeset DTLNO
+ get_known_spacks
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.536746.573476_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
+ + /bin/grep _SP: /tmp/oslevel.0.577656/.oslevel.mlinfo
+ /bin/grep :-:
+ /usr/bin/awk -F: { print $1 }

So we just move on to ‘get_known_spacks’. In this case, I can’t find the output easily on the screen,
so I write to a file:

oslevel -s > /tmp/outfile 2>&1
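
The same two tricks, a set -x inside the function and both output streams sent to a file, can be tried on a toy script first. This is only a sketch: noisy_helper and its messages are stand-ins I made up, not anything from oslevel. The ksh documentation also describes typeset -ft name as a way to switch on tracing for a single function without editing it, which may be worth a try on a real system.

```shell
#!/bin/sh
# Hypothetical stand-in for one of the oslevel helper functions.
noisy_helper() (
    set -x                                   # trace only inside this function
    echo "rpm_share: something failed" >&2   # stand-in for the real error
    echo "6100-02-05-0939"
)

trace_file=${TMPDIR:-/tmp}/trace.$$
noisy_helper > "$trace_file" 2>&1

# Matches both the traced command and the message it produced,
# with line numbers so you can look at what ran just before.
grep -n "rpm_share" "$trace_file"
rm -f "$trace_file"
```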

print_current_rml (this may now be official spaghetti code):

+ typeset MVRMF
+ typeset SPNO
+ typeset TLNO
+ + print_current_rml
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.585812.594022_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
BUF=6100-02-00_SP
+ [[ ! -s /tmp/oslevel.0.684192/.oslevel.mlinfo ]]
+ + /usr/bin/awk -F: $1 ~ /_SP$/ { print $1 } /tmp/oslevel.0.684192/.oslevel.mlinfo
+ /usr/bin/sort -t- -r
+ /usr/bin/uniq

Next function = get_known_rmls:

+ trap interrupted INT QUIT TERM
+ get_known_rmls
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.557148.569454_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
+ + /bin/grep _AIX_ML /tmp/oslevel.0.340030/.oslevel.mlinfo
+ grep :-:
+ /usr/bin/awk -F: { print $1 }
+ /usr/bin/uniq

print_min_rml:

+ trap interrupted INT QUIT TERM
+ typeset BUF
+ + print_min_rml
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.512036.557160_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
BUF=6120-00
+ [[ ! -s /tmp/oslevel.0.553194/.oslevel.mlinfo ]]
+ + /usr/bin/awk -F: $1 ~ /_AIX_ML$/ { print $1 } /tmp/oslevel.0.553194/.oslevel.mlinfo
+ /usr/bin/uniq
+ /usr/bin/sort -r
KWN_RML_LEVS=6100-02_AIX_ML
6100-01_AIX_ML
6100-00_AIX_ML
5300-08_AIX_ML

Now, it might be getting trickier:

+ trap interrupted INT QUIT TERM
+ typeset BUF
+ + print_min_rml
+ typeset ERROR=eval /usr/sbin/inuumsg 216 110 >&2; exit 1
+ typeset BUF
+ [[ 1 -ne 1 ]]
+ + /usr/bin/lslpp -qLc bos.rte
+ LC_ALL=C
rpm_share: 0645-024 Unable to access directory /tmp/.workdir.377012.446524_1
rpm_share: 0645-007 ATTENTION: init_baselib() returned an unexpected result.
BUF=bos:bos.rte:6.1.2.1: : :C:F:Base Operating System Runtime: : : : : : :0:0:/:0920
+ + echo bos:bos.rte:6.1.2.1: : :C:F:Base Operating System Runtime: : : : : : :0:0:/:0920
+ awk -F: {split($3,lev,".");
         print lev[1]lev[2]lev[3]"0"; exit(0)}
+ LC_ALL=C

Now, it gets more difficult. First we see that rpm_share is not something called directly, although some research shows that it is a different script. Set -x in it gets stripped out, so that doesn’t help.

Here is the source:

print_min_rml()
{
   typeset ERROR='eval /usr/sbin/inuumsg 216 110 >&2; exit 1'
   typeset BUF

   [[ $recml -ne 1 ]] && return 0

   # There is no RML information, lets print out
   # VRMF-00.
   BUF=$(LC_ALL=C /usr/bin/lslpp -qLc bos.rte) || $ERROR
   BUF=$(echo "$BUF" | LC_ALL=C awk -F: '{split($3,lev,".");
         print lev[1]lev[2]lev[3]"0"; exit(0)}')
   [[ $? -ne 0 || -z $BUF ]] && $ERROR
   echo "${BUF}-00"
   return 0

}  # end of print_min_rml()

Let’s walk through it from the shell, one line at a time:

# BUF=$(LC_ALL=C /usr/bin/lslpp -qLc bos.rte)
# echo $BUF
bos:bos.rte:6.1.2.1: : :C:F:Base Operating System Runtime: : : : : : :0:0:/:

I include the echo to make sure we got something, so we move onto the next line:

 echo "$BUF" |  LC_ALL=C awk -F: '{split($3,lev,"."); print lev[1]lev[2]lev[3]"0"; exit(0)}'
6120

I would have done that mass of ugliness differently, but now we have BUF=6120
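
For what it is worth, here is one way I might have written it, using cut and the shell's own field splitting instead of awk. This is a sketch only: vrmf_to_rml is a name I made up, and it assumes the version is always the third colon-separated field, as in the lslpp output above.

```shell
#!/bin/sh
# Turn an lslpp -qLc record into the VRM0 form, e.g. 6.1.2.1 becomes 6120.
# The body runs in a subshell so IFS and the positional parameters stay local.
vrmf_to_rml() (
    vrmf=$(echo "$1" | cut -d: -f3)   # third field is V.R.M.F
    IFS=.
    set -- $vrmf                      # $1=V $2=R $3=M $4=F
    echo "$1$2${3}0"                  # drop the fix level, append a 0
)

vrmf_to_rml "bos:bos.rte:6.1.2.1: : :C:F:Base Operating System Runtime: : : : : : :0:0:/:0920"
```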

We are also at the end of this function. It is supposed to return 6120-00, and does so, but also throws an error. Let’s try to hard-code something to isolate it. This means stepping back a function:

get_known_rmls()
{
   trap interrupted INT QUIT TERM
   typeset BUF

#   BUF=$(print_min_rml)
        BUF="6.1.2.0-00"

   if [[ ! -s $mlinfo ]]; then
      KWN_RML_LEVS=$BUF
      return 0

This works fine:

# oslevel -s
6100-02-05-0939

So now we know the problem must be in ‘print_min_rml’. When we removed the hard-coding, it still seemed to work; maybe it cleared out an unknown weirdness.

Written by admin in: Uncategorized |
Nov 11, 2009

The easiest way to set up ssh without a password

Most people know how to create ssh public and private keys. If you hit enter when it asks for a passphrase, you then have keys that don’t need a passphrase to authenticate. As I show below, you really only need to do this once for each user if you own a whole farm of servers. Only worry about unique keys if you are giving them away to someone else or putting them on a server outside of your control. This may seem lax, but if you tighten up security too much in some places, you end up with unwieldy policies that people find ways to work around.

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa):
Created directory '/home/username/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/username/.ssh/id_rsa.
Your public key has been saved in /home/username/.ssh/id_rsa.pub.
The key fingerprint is:
63:ef:3d:e0:83:86:57:5c:61:57:2c:5c:9f:a4:f2:c6 username@thishost

I like to start by making a link between /etc/ssh/ssh_known_hosts and /home/root/.ssh/known_hosts. This way, when I accept a host as root, it works for everybody. My philosophy is that if root trusts a host to be what it says it is, everyone else can trust it too:

cd /etc/ssh
ln -s /etc/ssh/ssh_known_hosts /home/root/.ssh/known_hosts

Next I use the same idea for users. Instead of making special keys for each server, I simply use the same one. This allows me to copy my user’s id_rsa.pub to authorized_keys:

scp root@trustedhost:/home/root/.ssh/id_rsa id_rsa
scp root@trustedhost:/home/root/.ssh/id_rsa.pub id_rsa.pub
cp id_rsa.pub authorized_keys

After that, just run a test to see if it works:

>ssh trustedhost pwd
/home/root
>ssh trustedhost ssh thishost pwd
/home/root
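
If the target already has an authorized_keys file, a plain cp would wipe out whatever keys are in it. Here is a sketch of a gentler version that appends the key only when it is missing and tightens the permissions, since sshd in its default strict mode tends to ignore key files that other users can write to. The function name install_pubkey is made up for this example.

```shell
#!/bin/sh
# Append a public key to authorized_keys unless it is already there.
# $1 is the public key file, $2 is the .ssh directory to install into.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=$2
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
    # -x matches whole lines, -F treats the key as a fixed string
    grep -qxF -e "$(cat "$pubkey_file")" "$ssh_dir/authorized_keys" ||
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
}

# Example: install_pubkey id_rsa.pub "$HOME/.ssh"
```

Running it twice with the same key leaves a single entry in the file.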
Nov 09, 2009

Email 101 in Unix

This sends an email from a file:

mail -s "here is the subject" email@rigler.org,otheremail@rigler.org < /etc/motd

This sends an email from a pipe:

ls -l | mail -s "here is the subject" email@rigler.org,otheremail@rigler.org

When you first install the system, if you have /etc/resolv.conf configured, email will figure out
where to go.

If you want to forward email sent to your unix account to a different account, just add the email
address that you want to use to a file called ~/.forward
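
A minimal sketch of setting that up from the shell follows. The set_forward name and the optional second argument are my own additions for illustration, and the chmod is there because some mailers reportedly ignore a .forward file that other users can write to.

```shell
#!/bin/sh
# Write a forwarding address into a .forward file.
# $1 is the address; $2 is the home directory and defaults to $HOME.
set_forward() {
    target_home=${2:-$HOME}
    echo "$1" > "$target_home/.forward"
    chmod 644 "$target_home/.forward"   # writable by the owner only
}

# Example: set_forward email@rigler.org
```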

Written by admin in: AIX |
