Extending an LVM Volume to Its Max

Unless a system administrator specifies a partitioning or LVM layout (which junior-level folks may not be able to do), I use a default layout in their Kickstarts.  It does a basic layout using about 20G, leaving the rest unallocated in a single LVM volume group.  So what’s a common question I get?  You guessed it: how do you resize a volume so that it takes up the rest of the available space?

Usually it’s the volume mounted at / that folks want to enlarge.

lvresize -l+100%FREE /dev/mapper/Volume00-root

The easy part to remember is how to do an online resize of the file system afterward (resize2fs handles ext2, ext3, and ext4 file systems).

resize2fs /dev/mapper/Volume00-root
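
For scripting the two steps together, here is a minimal Python sketch. The function names and the dry-run flag are my own illustration and not part of any LVM tooling; the sketch assumes an ext file system and volume names without hyphens.

```python
import subprocess


def extend_commands(volume_group, logical_volume):
    """Return the two commands from the text as argument lists."""
    # Assumes neither name contains a hyphen; device-mapper doubles
    # hyphens inside names when it builds /dev/mapper paths.
    device = "/dev/mapper/{}-{}".format(volume_group, logical_volume)
    return [
        ["lvresize", "-l", "+100%FREE", device],  # grow the LV into all free extents
        ["resize2fs", device],                    # grow the ext file system to match
    ]


def extend_to_max(volume_group, logical_volume, dry_run=True):
    """Print the commands, or run them (as root) when dry_run is False."""
    for command in extend_commands(volume_group, logical_volume):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # stop on the first failure


extend_to_max("Volume00", "root")  # dry run: only prints both commands
```

Keeping the dry run as the default means nothing touches a volume until you explicitly pass dry_run=False.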

spacewalk-clone-by-date

The Red Hat Network (RHN) and RHN Satellites are part of my daily grind.  They have not always been the easiest to work with when deploying RHEL to a large university with many business, research, and education use cases.

Finally, Red Hat is shipping a tool to help clone RHN channels at a specific point in time and get them right. Pinning a server to a specific patch set is not my preferred method of maintaining systems, but a lot of people depend on working in that manner.  This tool goes a long way toward making that possible.

Altering KVM Virtual Disk Images

I wanted to alter a file that was a disk image for a KVM virtual machine.  With a physical machine I use dd fairly often to save and alter the partition table and boot loader.  I wanted to do that to a KVM image. The problem is that when you use dd to write to a regular file, dd truncates that file by default. So I would lay down a new partition table and boot loader on my KVM image and find the image was now only a few kilobytes long. It used to be 20 gigabytes in size!

To do this we need to use the loop device:

# dd if=/dev/zero of=/var/lib/libvirt/images/foo.img bs=1024 count=20971520
# losetup /dev/loop0 /var/lib/libvirt/images/foo.img
# dd if=/tmp/bar.img of=/dev/loop0

Now you can use fdisk or other tools to examine the hard disk image on /dev/loop0. When you are done, tear down the loopback.

# losetup -d /dev/loop0

Now boot your KVM.
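
If you would rather skip the loop device, dd also has a conv=notrunc option that leaves the rest of the output file alone. The same idea can be sketched in Python. This is an alternative to the loop device approach above, not a replacement for it, and write_in_place is a hypothetical helper name of my own.

```python
import os


def write_in_place(image_path, data, offset=0):
    """Overwrite len(data) bytes at offset without changing the image size."""
    size_before = os.path.getsize(image_path)
    if offset + len(data) > size_before:
        raise ValueError("write would extend past the end of the image")
    # "r+b" opens for reading and writing and never truncates the file.
    with open(image_path, "r+b") as image:
        image.seek(offset)
        image.write(data)
    return os.path.getsize(image_path)
```

Calling write_in_place("/var/lib/libvirt/images/foo.img", boot_sector_bytes) would replace only the first bytes of the image and leave its length untouched.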

Why do I like to do this? Think about automated ways to install DBAN, to lay down a gPXE boot loader to reinstall or reprovision the machine, or to fix a corrupt partition table or MBR.  Doing things this way allows for a high level of automation.

Bad Experiences With Fedora

I normally run Fedora on my personal systems at home.  I usually enjoy it and it helps keep me up to date with all the new toys that will eventually be a part of the RHEL machines I sysadmin as a professional.  I’ve been running Fedora 15 and had switched to Xfce because the new GNOME 3 user interface and I didn’t get along.  It was past time I updated to Fedora 16.

I’m a professional systems administrator.  If there is one thing that job has taught me, and that I’d like to teach everyone else, it is that all hard drives fail.  Not if, but when.  So most machines I use (save for laptops) have 2 hard drives installed using Linux’s software RAID 1 to mirror them.  (Not two identical hard drives either.  Find ones that are different.)  Needless to say, my workstation at home is configured this way.  The /boot partition is md0 and everything else runs in LVM on top of md1.

I installed Fedora 16 on my workstation after backing up my data.  I normally do a clean install and reformat everything except for the /home logical volume.  I get a brand new system and don’t lose my data.  Everything appeared to go well during the install.  When I rebooted I was greeted with an unfriendly Grub2 Rescue Mode.  The new boot loader couldn’t boot my system.

I’m quite familiar with the older Grub and using its shell mode to recover my system.  Boy, was I in for a surprise.  Grub 2’s command shell is completely different, and it has no “help” or “?” command to boot!  At this point the Grub2 rescue shell has 4 commands: ls, set, unset, and insmod.  Helpful, isn’t it?

I was already using Google (from my smart phone).  There’s not a lot of quality documentation about recovering a system with Grub2, though there are quite a lot of Ubuntu articles.  This is a big problem: Grub2 should have more visible documentation, and so should Fedora.

Learn More About Grub2 Rescue Mode

Running ls showed me the problem after I had figured out how Grub2 was working.  It only listed “(hd0) (hd1)”.  Grub didn’t see the /boot partitions because they were RAID 1 partitions.

It turns out Fedora has never “supported” /boot on a RAID 1 device.  It has worked for years and has allowed me to recover broken systems many times.  The Fedora 16 installer does not have the Grub modules loaded to support Linux software RAID devices.  (Yes, RAID 1, where you can mount one of the mirrors as a normal ext4 filesystem in a pinch.)  There is a warning buried in my install logs that Grubby didn’t complete, but the install itself reported no errors.  Needless to say, I am disappointed in the Fedora project in many ways.

Fortunately, the Fedora 16 Common Bugs page had most of the solution.  If the install completes and Grub2 then ends up in the rescue shell, this is how to recover your system.

  1. Boot the Fedora 16 install media and use its Rescue Mode.
  2. Tell the rescue mode to find and mount up your existing Linux system.  In my case the rescue mode couldn’t find my system.  I dropped down to the shell and ran
    # mdadm --assemble --scan
    # vgchange -ay

    This loaded up my Software RAID devices and the LVMs on top of them. Mount them properly under /mnt/sysimage.

  3. You need to set up a chroot to run the Grub2 install program correctly.  Because the automated rescue didn’t locate my system and I mounted it by hand, I needed to create enough nodes in /mnt/sysimage/dev so I could continue.
    # cp -a /dev/* /mnt/sysimage/dev/
  4. Do the chroot.
    # chroot /mnt/sysimage
  5. Add the following lines to /boot/grub2/grub.cfg:
    insmod raid
    insmod mdraid09
    insmod mdraid1x

    At the top of the file is fine.

  6. Now run the following commands:
    # grub2-install /dev/sda
    # grub2-install /dev/sdb

    At this point you should see a successful install of Grub 2 on both mirrors of the RAID 1. You should be able to reboot and have the system come up from the hard disks.
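
Step 5 is easy to get wrong by hand in a rescue shell, so here is a minimal Python sketch of it. The add_raid_modules helper is hypothetical and only edits text; it does not run Grub. Keep in mind that grub2-mkconfig regenerates grub.cfg, so a manual edit like this may not survive a later regeneration.

```python
RAID_MODULES = ("raid", "mdraid09", "mdraid1x")


def add_raid_modules(cfg_text, modules=RAID_MODULES):
    """Return cfg_text with any missing 'insmod <module>' lines prepended."""
    present = {line.strip() for line in cfg_text.splitlines()}
    missing = ["insmod " + m for m in modules if "insmod " + m not in present]
    if not missing:
        return cfg_text  # already patched, so leave the file alone
    return "\n".join(missing) + "\n" + cfg_text
```

Read /boot/grub2/grub.cfg inside the chroot, pass its contents through add_raid_modules, and write the result back before running grub2-install. Running it twice is harmless because lines that already exist are skipped.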

Getting Started with Python and Genshi

I’ve used the Genshi templating toolkit for a few years now.  It’s a great templating engine for producing XML, HTML, and plain text output.  In fact, it’s the engine that powers my Web-Kickstart system at work.

I recently stumbled across some of the first “Hello World” Python code I wrote to figure out how Genshi works.  (I was also experimenting with how Genshi would handle different behaviors with custom classes.) I thought this might be useful to others, especially for folks looking to automate plain text files (like Red Hat Kickstarts).

IPTables: The MARK Target

Load balancing for High Availability and Disaster Recovery with LVS and Keepalived is fun, and quite powerful.  One of the most useful aspects is that you can use IPTables with the MARK target to select what traffic is routed to a set of real servers.  It’s a lot more powerful than simple IP or IP/port combinations.

For example, a specific service may have a web site as well as another protocol.  Printing uses the IPP protocol and we have a web site documenting our printing system.  With the above trick you can create one virtual IP and have web traffic directed to a pool of web servers doing virtual hosting of many sites.  IPP traffic on a different port gets routed to a pool of CUPS servers that do not maintain any web infrastructure.  End users only have to remember one DNS name.

However, remember that the MARK target is one of IPTables’ non-terminating targets.  It doesn’t stop the packet from being processed by later iptables rules and possibly other MARK targets, so the last matching rule sets the final mark.  Your iptables rules therefore need to be in order from least specific to most specific.  With the above example, let’s say all traffic goes to the CUPS pool and only web traffic gets redirected to the Apache pool.  Your snippet that lives in /etc/sysconfig/iptables will look something like this:

*mangle
-A PREROUTING -d 10.0.0.5 -p tcp -m tcp -j MARK --set-mark 0xC
-A PREROUTING -d 10.0.0.5 -p tcp -m tcp --dport 80 -j MARK --set-mark 0xA
-A PREROUTING -d 10.0.0.5 -p tcp -m tcp --dport 443 -j MARK --set-mark 0xA

Here 10.0.0.5 is the print service’s Virtual IP, and our LVS configuration is set to direct firewall mark 0xC to the CUPS pool and 0xA to the web pool.
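
To see why the ordering matters, here is a small Python model of how non-terminating MARK rules behave. It is a toy simulation of the three rules above, not how the kernel is implemented, and the rule dictionaries are my own invention for illustration.

```python
def final_mark(rules, dst, dport):
    """Apply every matching MARK rule in order; the last match wins."""
    mark = 0  # packets start unmarked
    for rule in rules:
        if rule["dst"] != dst:
            continue
        if rule.get("dport") not in (None, dport):
            continue  # rule names a port and it is not this packet's port
        mark = rule["mark"]  # MARK is non-terminating, so keep evaluating
    return mark


RULES = [
    {"dst": "10.0.0.5", "mark": 0xC},                # all TCP to the virtual IP
    {"dst": "10.0.0.5", "dport": 80, "mark": 0xA},   # HTTP
    {"dst": "10.0.0.5", "dport": 443, "mark": 0xA},  # HTTPS
]

print(hex(final_mark(RULES, "10.0.0.5", 631)))  # IPP traffic: prints 0xc
print(hex(final_mark(RULES, "10.0.0.5", 80)))   # web traffic: prints 0xa
```

Reverse the list and port 80 ends up with mark 0xC, because the catch-all rule now runs last and overwrites the more specific mark.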

Yum API: Reloading Repos

While working with the Yum driver for Bcfg2 (a configuration management tool) I came across an interesting problem.  Bcfg2 would load the driver and Yum would read all its configuration and .repo files.  Next, Bcfg2 would attempt to install files and packages according to its specification.  Bcfg2 would gladly install additional .repo files for Yum, but the Yum driver would have no knowledge of them and wouldn’t be able to install packages from the new repo.  You would have to run the Bcfg2 client again for the Yum driver to properly install the new packages.  A bug, and it was time to fix it.

It turns out that, using the Yum API, there isn’t a way to reset or reload the .repo files without a lot of work.  All my attempts just removed all repos from Yum’s little mind, rendering it unable to install any package.  A suggestion from the Yum-devel mailing list was to simply reinstantiate the Yum API. So,

import yum

b = yum.YumBase()
# ...do interesting things...

# Reinstantiate so Yum picks up .repo files installed since the first instance.
b = yum.YumBase()
# ...do interesting things with updated .repo files...

As I was cleanly between transactions at that point in the code, this worked very well.
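
If the handle is shared across a larger driver, the reinstantiation can be wrapped up so callers don’t hold on to a stale object. This is a generic sketch of that pattern, not code from Bcfg2 or Yum; ReloadableHandle is a hypothetical class, and on a machine with Yum installed you would pass yum.YumBase as the factory.

```python
class ReloadableHandle:
    """Hold an object built by `factory` and rebuild it on demand."""

    def __init__(self, factory):
        self._factory = factory
        self.handle = factory()

    def reload(self):
        # Drop the old object and build a fresh one so that it picks up
        # its configuration from scratch.
        self.handle = self._factory()
        return self.handle
```

Callers always go through the wrapper’s handle attribute, so a reload() between transactions swaps the object out for everyone at once.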

My Bcfg2 patches and work can be found at GitHub: https://github.com/jjneely/bcfg2

Repairing Users’ Accounts

How does one go about repairing a user’s UNIX/Linux account?  We very often run into this issue where things have gone wrong in a user’s account: a corrupted Firefox profile, a bad .xsession script, or any number of other problems.  How do other people go about resetting a user’s account back to its default configuration at scale?

We have several tools available on the web or through the GDM session manager that all end up calling one very small script.  This script has a collection of files and directories it moves safely out of the way and attempts to repair Firefox and Chrome profiles.

repair_dotfiles.sh
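
To give a feel for the “move safely out of the way” part, here is a minimal Python sketch. It is not the script above: the target list and the backup directory name are hypothetical, and it makes no attempt at the Firefox and Chrome profile repair.

```python
import os
import shutil
import time

# Hypothetical list; the real script's list is not shown in this post.
DEFAULT_TARGETS = (".xsession", ".mozilla", ".config/google-chrome")


def move_aside(home, targets=DEFAULT_TARGETS, stamp=None):
    """Move each existing target under home into a dated backup directory."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backup_dir = os.path.join(home, "dotfiles-backup-" + stamp)
    moved = []
    for target in targets:
        source = os.path.join(home, target)
        if not os.path.lexists(source):
            continue  # nothing to move for this entry
        destination = os.path.join(backup_dir, target)
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        shutil.move(source, destination)
        moved.append(target)
    return backup_dir, moved
```

Nothing is deleted, so a user who wants an old bookmark or setting back can still dig it out of the backup directory.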

What does your IT shop do?