Configuring kdump on RHEL 6

Some quick notes for configuring kdump on RHEL 6.  Kdump produces a vmcore on a kernel panic, oops, or other condition that our friends at Red Hat support can use to debug kernel level issues.

  1. Make sure /var/crash has space for vmcores.  You need to have enough space for an entire dump of RAM just to be safe.
  2. Add crashkernel=128M to your kernel command line in /boot/grub/grub.conf
  3. Set up /etc/kdump.conf to save vmcores to the right place.  I normally have /var in a separate logical volume so I need to change the default location.  I also tell makedumpfile which memory pages to leave out (-d 31) and to compress the dump (-c).
    # cat /etc/kdump.conf
    ext4 /dev/mapper/Volume00-var
    path /crash
    core_collector makedumpfile -c --message-level 1 -d 31
  4. Make sure the kdump service is set to start on boot, then reboot so the crashkernel reservation takes effect.
  5. Check that there is an initrd in /boot created for kdump.  It will have “kdump” in the file name.
  6. Test your configuration.
    # echo 1 > /proc/sys/kernel/sysrq
    # echo c > /proc/sysrq-trigger
  7. Configure the system to kernel panic on oops or NMI depending on the problem you are attempting to capture. Add these lines to /etc/sysctl.conf and then run sysctl -p as root.
    kernel.panic_on_oops = 1
    kernel.unknown_nmi_panic=1
    kernel.panic_on_unrecovered_nmi=1
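
A quick sanity check for steps 1 and 2 might look like the sketch below.  The function names are mine and not part of kdump, and the optional file arguments are only there so the helpers can be pointed at something other than the live /proc files.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of kexec-tools.

# Succeeds if the given kernel command line (default: /proc/cmdline)
# contains a crashkernel= reservation.
has_crashkernel() {
    grep -q 'crashkernel=' "${1:-/proc/cmdline}"
}

# Prints total RAM in kB from a meminfo-style file (default: /proc/meminfo).
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2 }' "${1:-/proc/meminfo}"
}

# Prints the free space in kB on the file system that holds the given
# directory (default: /var/crash).
free_kb() {
    df -Pk "${1:-/var/crash}" | awk 'NR == 2 { print $4 }'
}
```

After the reboot, something like `has_crashkernel || echo "no crashkernel= on the command line"` and `[ "$(free_kb)" -ge "$(mem_total_kb)" ] || echo "/var/crash is smaller than RAM"` covers the two things I most often forget.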

Dumb Tricks with gPXE

For my first bit of magic with gPXE I decided to replace the boot ISOs I have for folks that are unable to install machines via PXE.  With gPXE I don’t need my RHEL initrd and kernel image to bootstrap myself into an install from a CD or USB stick.  I’ve encoded the TFTP server and the file name to grab and execute into the gPXE image, so as long as the machine can get any type of DHCP lease it will load up my PXELINUX environment.  This makes the boot CD images work identically to doing a real PXE boot…because you are.

Step 1:  I grabbed the gPXE distribution and unpacked it.  I patched its autoboot functionality as described here.  This lets me DHCP automatically even if the first ethernet device is not the one connected to the network.  For gPXE 1.0.1 you can use my patch instead.

Step 2: Make an embedded script file.  This just supplies the information to gPXE that a normal PXE boot would get from the next-server and filename options in the DHCP response.

#!gpxe
autoboot
chain tftp://FQDN/pxelinux.0

Yup, we have DNS support so just add the FQDN of your TFTP server. In my setup I have pxelinux.0 in the root of my TFTP server.

Step 3: Build gPXE with your embedded script.

make EMBEDDED_IMAGE=path/to/your/script
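
If you build images for more than one TFTP server, Step 2 is easy to script.  This is only a sketch: write_gpxe_script is a name I made up, and it assumes pxelinux.0 sits in the root of the TFTP server as it does in my setup.

```shell
#!/bin/sh
# Sketch only: write_gpxe_script is a hypothetical helper name.

# Writes an embedded gPXE script to the file $2 that chains to
# pxelinux.0 in the root of the TFTP server named by $1.
write_gpxe_script() {
    cat > "$2" <<EOF
#!gpxe
autoboot
chain tftp://$1/pxelinux.0
EOF
}
```

For example, `write_gpxe_script tftp.example.com boot.gpxe` (with tftp.example.com standing in for your server) produces a file you can hand to make via EMBEDDED_IMAGE=boot.gpxe.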

Step 4: Burn the resulting ISO onto a CD and PXE boot a PXE-less machine.

Thoughts on gPXE

I maintain a RHEL deployment with automatic installs.  You PXE boot, type in a version string that equates to your RHEL version and arch, and during the install process you get the Red Hat Kickstart that matches your fully qualified host name.

So no images here.  All installs are based on descriptions of what the end result should look like rather than a large image of it.  Especially with a large install base these descriptions scale much better than having a place to store what would be thousands of images.

However, VM Farms and other cloud based technologies really like working with images. They’re easy I guess.  Slap the image down and boot it.  Somehow I need to merge both worlds.  Step in gPXE, formerly Etherboot.  It’s an Open Source PXE boot loader that is highly scriptable.  I could hand out an image to those that want them that, when first booted, would query our Kickstart system and begin the install.  The images are small, the kernel/initrd files are stored on the TFTP server so they can be updated without modifying the gPXE images, and they can trigger fully automated installs.

Did I mention scriptable?  With a bit more magic, I can use a token to reference my Kickstart rather than host name and create custom gPXE images that install by referencing this token.  This solves the issues with cloud environments where we don’t know the target machine’s FQDN until the machine is dynamically provisioned.  But that raises the small question of how to build these custom gPXE images for each customer.  The gPXE source actually has GNU make targets and variables that even automate this process for you.  A small web app to build your gPXE image based on your magic token becomes very easy.

I’ve got some coding to do.

Moving AFS Volumes by Name

I help maintain a fairly large OpenAFS installation.  One of the things I find myself doing often is moving AFS Volumes from server to server.  The command to do so requires you know exactly where the volume currently is before it can be moved to a new server.  There’s a reason for that even if AFS already knows where that volume is.  In any case, while moving one type of volume to server A and a different type to server B I decided I needed a tool to make this easier.

Cue the perl-AFS modules and a bit of perl code.  I wrote the attached script that will locate the volumes for you and move them to your stated destination.  You can do things like:

$ perl move_volumes_by_name.pl -t server00.example.com \
-P /vicepa -c example.com volume1 volume2 volume3 ...

That will move the three volumes given on the command line to server00. Or, you can pipe the volume names in on standard input and give - in place of the list:

$ echo "volume1 volume2 volume3" | perl move_volumes_by_name.pl \
-t server00.example.com -P /vicepa -c example.com -

Use the -d switch (for debug) to see a list of actions that the script would perform without actually performing them.

 http://linuxczar.net/code/afs/move_volumes_by_name.pl
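
For the curious, the core idea can also be sketched with the stock vos command instead of the perl-AFS modules.  This is not the attached script.  It assumes the usual vos listvldb entry layout, where the read/write site shows up as a line like "server fs1.example.com partition /vicepa RW Site", and the function names and the DRY_RUN variable are my own inventions.

```shell
#!/bin/sh
# Sketch only: not the attached Perl script. Helper names are made up.

# Reads a VLDB entry (as printed by "vos listvldb -name VOLUME") on stdin
# and prints "<server> <partition>" for the read/write site.
rw_site() {
    awk '$1 == "server" && $3 == "partition" && $5 == "RW" { print $2, $4; exit }'
}

# move_volume VOLUME TOSERVER TOPARTITION CELL
# Looks up where VOLUME currently lives and moves it. With DRY_RUN set,
# the vos move command is printed instead of run, like the -d switch.
move_volume() {
    site=$(vos listvldb -name "$1" -cell "$4" | rw_site)
    [ -n "$site" ] || { echo "cannot locate $1" >&2; return 1; }
    from_server=${site% *}      # text before the space
    from_part=${site#* }        # text after the space
    ${DRY_RUN:+echo} vos move -id "$1" \
        -fromserver "$from_server" -frompartition "$from_part" \
        -toserver "$2" -topartition "$3" -cell "$4"
}
```

The lookup is the only step the plain vos move command leaves to you; everything else is passing arguments through.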

Awesome

As the user interfaces for Gnome/KDE advance I sometimes miss the old days when you had to choose your own window manager, compile it, and carefully configure it to have a generally useful desktop.  I remember putting hours into setting up Window Maker just the way I wanted it to be.

As I became a professional System Administrator I started using the desktop environment that I “support” or otherwise encourage my own users to use.  If I can’t stand my own dog food, I know I’ve got work to do.

While reading LWN I came across the Awesome Window Manager.  I was very intrigued by LWN’s review.  It’s a modern window manager meant to get out of your way, and windows are organized and dealt with in much the same way as GNU Screen operates.  I’m a big fan of GNU Screen.  (Yes, I know about tmux.  Tmux and Awesome, I think, may be even more similar.)  Great keyboard support, less mousing around.  Best yet, completely programmable.  If I put the time and effort into setting it up well I may have a very functional and efficient desktop.  Better yet, no one will know how to use it but me.  ;-)

On the flip side, using and configuring other terminal applications like urxvt can be tricky.  I like having my terminals 80 characters wide, but I still have 1280×1024 resolution monitors.  Either terminal fonts are way too small, or I can get 79 columns across with two terminals side by side with the Terminus fonts.  I haven’t found a winning combination there.  But it may be worth the research.  Suggestions?

Recovering RAID 5 Arrays With Multiple Failed Devices

Failed Linux MD RAID devices. That’s what I got to deal with yesterday. The ext3 file system produced scary errors and remounted the file system as read-only. A quick look at /proc/mdstat showed

# cat /proc/mdstat
Personalities : [raid1] [raid5]
md1 : active raid5 sdi1[8] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sda1[0]
   645136128 blocks level 5, 256k chunk, algorithm 2 [10/8] [U_UUUUUUU_]

That’s bad.  A second hard drive had failed in a RAID 5 array with no spares.  Our mission: get the data back as best we can.
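
Counting underscores by eye works, but it is easy enough to script if you want to alert on it.  A rough sketch, assuming the two-line mdstat layout shown above; missing_members is a name I made up.

```shell
#!/bin/sh
# Sketch only: missing_members is a hypothetical helper name.

# Prints the number of missing members for array $1 (e.g. md1), taken
# from the [UU_U] status map in an mdstat-style file (default: /proc/mdstat).
missing_members() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { found = 1; next }
        found {
            map = $NF                   # e.g. [U_UUUUUUU_]
            print gsub(/_/, "", map)    # gsub returns how many "_" it replaced
            exit
        }
    ' "${2:-/proc/mdstat}"
}
```

For a RAID 5 array, any answer above 1 means there are not enough devices left to run it.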

This was a fileserver that was in use.  I pulled emergency maintenance and rebooted the server into single user mode for safety.  After the reboot /dev/md1 was listed as inactive.  There were not enough working devices to bring up the array.  Obviously, it wasn’t mounted either.  This was exactly where I wanted to be.  (You could also unmount the broken file system and use mdadm --stop /dev/md1 to stop the array.)

Next, use mdadm to force assemble the array.

# mdadm --assemble --force /dev/md1
mdadm: forcing event count in /dev/sdb1 from X to Y

Now that should bring the array online.  The event counter for /dev/sdb1 was the least out of date and mdadm just fudged it.  This means we have introduced corruption.  Once the second device fails, it’s not long before the array fails.  So, provided that you bring back the second failed disk (not the first which failed 3 years ago, right?) you should introduce minimal corruption.  However, you now have a working array.

Next, we back up the array.  Mount /dev/md1 as a read-only file system.  Use rsync or another tool to copy off the data.  Don’t try to add more disks and rebuild the array before you back it up.
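
The backup step, roughly, as a sketch.  The function name, the DRY_RUN variable, and the rsync options are my own choices here, and the mount point and destination in the example are placeholders.

```shell
#!/bin/sh
# Sketch only: backup_array is a hypothetical helper name.

# backup_array DEVICE MOUNTPOINT DESTINATION
# Mounts DEVICE read-only on MOUNTPOINT and copies its contents to
# DESTINATION. With DRY_RUN set, the commands are printed instead of run.
backup_array() {
    run=${DRY_RUN:+echo}
    $run mkdir -p "$2" &&
    $run mount -o ro "$1" "$2" &&
    $run rsync -aH --numeric-ids "$2/" "$3"
}
```

Something like `backup_array /dev/md1 /mnt/recovery backup.example.com:/srv/md1/` would then do the copy, stopping at the first step that fails.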

Our mission is accomplished.  We have data.

Encryption Types Order in Kerberos

Things I don’t wish to forget the next time I need them.  In a Kerberos Realm with multiple encryption types available the KDC will use the first type in the list that’s compatible with the client.  So let’s say you are adding encryption types to your KDC; of course you need to add new keys to the krbtgt/<REALM> principal.  Once you do so, check out the list of keys.  What will the KDC choose to use if things are in this order?

Key: vno 2, DES cbc mode with CRC-32, no salt
Key: vno 2, AES-256 CTS mode with 96-bit SHA-1 HMAC, no salt
Key: vno 2, ArcFour with HMAC/md5, no salt
Key: vno 1, DES cbc mode with CRC-32, no salt

You guessed it, des-cbc-crc!  The weakest type.  Even when the AES encryption is compatible with the client.

This is where I began to bang my head on my desk.  The order is, of course, set when you use the change_password command in kadmin or via the kpasswd tool.  The order is read from the supported_enctypes variable in your kdc.conf.  So make sure the encryption types listed for that variable are listed in the order of their strength.

Unfortunately, there’s no way to reorder the existing keys in place.  You need to correct the order in your kdc.conf file, restart the kadmin server, then use the change_password command again (with -keepold) to rekey the krbtgt/<REALM> principal.  Or use the kpasswd tool to update your principal.  Yes, the restart in there is required.
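
To check the result without squinting at the key list, a small filter helps.  This is a sketch that assumes the "Key: vno N, <description>, no salt" layout shown above; first_key_enctype is a name I made up.

```shell
#!/bin/sh
# Sketch only: first_key_enctype is a hypothetical helper name.

# Reads kadmin "getprinc" output on stdin and prints the description of
# the first key listed, i.e. the one that ends up at the head of the list.
first_key_enctype() {
    sed -n 's/^Key: vno [0-9]*, //p' | head -n 1 | sed 's/, no salt$//'
}
```

Feed it the output of getprinc for your krbtgt principal, for example from kadmin.local -q.  If the answer is a DES type, the order in kdc.conf still needs fixing.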

Build Systems: Making RHEL5 Packages on RHEL6

In my cobbled together build system I’ve never been able to build RHEL 5 (much less RHEL 4) packages on my RHEL 6 host.  I knew this was because RHEL 6 uses newer file digest algorithms and compression algorithms in its version of RPM.  When Mock put together the RHEL 5 build chroot, the version of RPM there could not understand the source package built by the host.

The time came to stop working around this annoyance and fix it.  All one needs to do is pass a couple of extra defines when calling RPM to assemble the source package.

--define "_source_filedigest_algorithm md5"
--define "_binary_filedigest_algorithm md5"

My Makefile simply checks whether the dist I’m building for needs these extra defines with the following code.

ifneq (, $(findstring $(DIST), "EL3 EL4 EL5"))
RPMDEFINES := $(RPMDEFINES) \
    --define "_source_filedigest_algorithm md5" \
    --define "_binary_filedigest_algorithm md5"
endif
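
The same check works outside of make.  Here is a sketch in shell, assuming the source package is assembled with rpmbuild -bs; build_srpm and DRY_RUN are names I made up for illustration.

```shell
#!/bin/sh
# Sketch only: build_srpm is a hypothetical helper name.

# build_srpm DIST SPECFILE
# Builds a source package, adding the md5 file digest defines for the
# older dists. With DRY_RUN set, the command is printed instead of run
# (the printed form loses the quotes around each define).
build_srpm() {
    case "$1" in
        EL3|EL4|EL5)
            ${DRY_RUN:+echo} rpmbuild -bs \
                --define "_source_filedigest_algorithm md5" \
                --define "_binary_filedigest_algorithm md5" "$2" ;;
        *)
            ${DRY_RUN:+echo} rpmbuild -bs "$2" ;;
    esac
}
```

One small difference from the make version: case only matches whole dist names, while findstring also matches a substring such as a bare EL.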