Error reporting test message

Since we have set up errpt to email error messages to us, it is also worth sending a periodic test message to confirm that no system has been left out of that setup. The errlogger command is well suited to this, and it can be combined with dsh to fan out across your entire environment:

dsh sudo errlogger "test message"

Then check your email for a test message from each host:

------------------------------------------------------------------
LABEL:           OPMSG
IDENTIFIER:      AA8AB241

Date/Time:       Thu Oct 30 11:40:51 CDT 
Sequence Number: 85501
Machine Id:      00CDEAEA4C00
Node Id:         nad0019aixp02s1
Class:           O
Type:            TEMP
Resource Name:   OPERATOR        

Description
OPERATOR NOTIFICATION

User Causes
ERRLOGGER COMMAND

		 Recommended Actions
		 REVIEW DETAILED DATA

Detail Data
MESSAGE FROM ERRLOGGER COMMAND
test message
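
Checking the mailbox by eye gets tedious once there are many hosts. The following is a minimal sketch of automating the comparison, not part of the original setup. It assumes you keep a list of expected hostnames in one file and can dump the subject lines of the received mails into another; both file names are hypothetical. The subject format assumed is the "hostname: errpt LABEL TYPE CLASS" form produced by the mail script later in this document.

```shell
#!/bin/sh
# missing_hosts EXPECTED SUBJECTS
# Prints every host listed in EXPECTED (one hostname per line) that has no
# "<host>: errpt OPMSG ..." subject line in SUBJECTS.
# Both files are hypothetical inputs; adapt them to your own mail setup.
missing_hosts() {
    expected=$1
    subjects=$2
    want=/tmp/$$.want
    seen=/tmp/$$.seen

    # Hosts that reported: the text before the first colon on OPMSG subjects.
    sed -n 's/^\([^:]*\): errpt OPMSG .*/\1/p' "$subjects" | sort -u > "$seen"
    sort -u "$expected" > "$want"

    # comm -23 keeps the lines that appear only in the first (expected) list.
    comm -23 "$want" "$seen"

    rm -f "$want" "$seen"
}
```

Called as missing_hosts hosts.txt subjects.txt, it prints one hostname per line for every system that stayed silent, and prints nothing when all hosts reported.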

Two scripts that allow you to mail errpts to yourself

/buxs/bin> more errpt_odmadd
#!/bin/ksh

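# Pull the "##" lines out of this script and strip the marker to build the stanza file.
grep "^##" "$0" | sed 's/^##//' > /tmp/$$.odmadd

# Remove any existing "syslog" entry, then add the fresh one.
odmdelete -o errnotify -q "en_name = syslog"
odmadd /tmp/$$.odmadd
rm /tmp/$$.odmadd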

##errnotify:
##  en_pid = 0
##  en_name = "syslog"
##  en_persistenceflg = 1
##  en_label = ""
##  en_crcid = 0
##  en_class = ""
##  en_type = ""
##  en_alertflg = ""
##  en_resource = ""
##  en_rtype = ""
##  en_rclass = ""
##  en_method = "/admin/bin/errnotify $1"
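
The script above carries its own ODM stanza as "##" lines. The shell treats them as comments, while grep and sed turn them into the input file for odmadd. A small sketch of that extraction step on its own, useful for previewing what would be loaded before touching the ODM (the function name extract_stanza is made up for this illustration):

```shell
#!/bin/sh
# extract_stanza FILE
# Prints the lines of FILE that start with "##", with that two-character
# marker removed. Ordinary "#" comments and the "#!" line are skipped
# because they do not begin with two hash characters.
extract_stanza() {
    sed -n 's/^##//p' "$1"
}
```

Running extract_stanza /buxs/bin/errpt_odmadd should print the errnotify stanza exactly as odmadd would receive it.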

/admin/bin> more errnotify
#!/bin/ksh

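O=/admin/bin/errnotify.txt

# Write the full report for the sequence number passed in as $1.
errpt -a -l $1 > $O

# Pull out label, class and type (ksh runs the final "read" in the current shell).
egrep "LABEL|Class|Type" $O | cut -c 7- | xargs -n3 | read A B C

chmod 755 $O

dt=`date +"%m %e %Y %T"`

# Append a one-line record to the running count file.
echo $A $B $C $dt >> /admin/bin/errcount.txt

chmod 755 /admin/bin/errcount.txt

chown root:buxs /admin/bin/errcount.txt

mail -s "`hostname`: errpt $A $C $B" YOUR_EMAIL_HERE < $O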
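
The egrep / cut -c 7- / xargs pipeline depends on fixed column positions and matches any line that merely contains "LABEL", "Class" or "Type". As a hedged alternative, here is a sketch that anchors on the field names at the start of the line and lets awk split on whitespace. The function name summarize_errpt is invented for this example, and it assumes the "LABEL:", "Class:" and "Type:" layout shown in the sample mail above.

```shell
#!/bin/sh
# summarize_errpt FILE
# Prints "LABEL CLASS TYPE" for the first entry in a saved "errpt -a" report.
# Only lines that begin with the field name are considered, so other lines
# that happen to contain the words "Class" or "Type" are ignored.
summarize_errpt() {
    awk '
        /^LABEL:/ && lbl == "" { lbl = $2 }   # e.g. OPMSG
        /^Class:/ && cls == "" { cls = $2 }   # e.g. O
        /^Type:/  && typ == "" { typ = $2 }   # e.g. TEMP
        END { print lbl, cls, typ }
    ' "$1"
}
```

In the errnotify script this could stand in for the pipeline, for example with summarize_errpt $O | read A B C under ksh.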

Errpt message hack / Broadcom Error

Many months ago I began getting duplicate ARP errors in my error report. The network team tried, on and off for months, to track it down. We finally figured out that it was a bug in Broadcom Windows teaming. It is explained on this website. In the meantime, I wrote this errnotify entry, which deletes the message from the AIX error report as soon as it shows up, because I was getting one every minute:

errnotify:      
               en_name = "BogusARP"      
               en_persistenceflg = 1      
               en_label = "AIXIF_ARP_DUP_ADDR"      
               en_class = "S"      
               en_type = "PERM"      
               en_method = "/usr/bin/errclear -l $1 0"

Instead of doing the above hack, you can also simply turn error reporting off for this message:

errupdate <<EOF
=FE2DEE00:
Log=False
Report=False
EOF

To check what you have turned off:

errpt -t -F Report=0
errpt -t -F Log=0