Archive for the ‘Uncategorized’ Category

Linux Woes

Tuesday, November 6th, 2007

I’m bored. There are two things that I’m looking at working on. The first is a long-standing quest to turn NCSU’s Linux website into a dynamic site that is easier for me and others to maintain. I’ve been poking at building the site completely in the existing MoinMoin wiki or redoing it in MediaWiki, although there don’t seem to be many tools to convert a Moin wiki into MediaWiki. I’ve also been looking at using WordPress and Drupal. WordPress is inviting as it’s easy to maintain and seems to put a lot of effort into looking really good. The problem is theming the CMS properly. Unfortunately, CSS and similar magic isn’t really my cup of tea. All solutions will take quite a bit of work.

In a similar vein, what value exactly does a separate wiki get me? Is there value in having an “official” website with a tacked-on wiki? Should the website be one or the other completely?

The second project is configuration management. I’ve been researching the existing tools such as Bcfg2 and Puppet and have been fairly impressed, but neither seems to fit what I need. I have some brief requirements written up here:

I keep thinking that what I want is something based on a distributed SCM such as Git, where local admins can rebase the configuration tree to achieve what they need. Mostly, that means marking a specific file as “I know I purposely changed this, don’t change it back.”

I’ve also been keeping tabs on Func, which is very interesting. It may well play a role in my monitoring, but it doesn’t seem to be heading in the CM direction either. It’s close…perhaps if I thought enough about how to implement some CM ideas on top. Perhaps after Func can pull as well as push.

RHEL 5 Vesa Bug

Wednesday, August 29th, 2007

For a couple days I’ve been looking at a bug in RHEL 5 where the VESA driver is unable to drive a Dell LCD panel connected to a Dell Optiplex 745 using a rather new ATI X1300 card. I was getting bad resolutions, X starting once and then failing to restart until the machine is rebooted, and lots of errors with “no screens found.” On my 745 in my office running in x86_64 mode I could not duplicate the problem even when using the identical monitor and configuration.

It turns out there is a bug that affects how the X server gets the DDC information about the monitor. This is overly well documented as #10238 in Freedesktop.org’s Bugzilla and as #236416 in Red Hat’s Bugzilla. All non-i386 arches and virtual machines (Xen) use emulation to get the DDC information. It turns out that using the same emulation works in this case for i386 RHEL 5 machines.

To work around the bug, edit your xorg.conf file and add the following to the ServerLayout section:

Option "Int10Backend" "x86emu"
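If the same edit has to go out to many machines, it could be scripted. This is only a minimal sketch: the function name add_int10_option is made up for illustration, it works on the text of xorg.conf rather than the file itself, and it assumes the file already has a ServerLayout section.

```python
import re

# The line from the workaround above, indented like a typical xorg.conf entry.
OPTION_LINE = '\tOption "Int10Backend" "x86emu"'


def add_int10_option(conf_text):
    """Return conf_text with the Int10Backend option added to ServerLayout.

    Hypothetical helper: the text is returned unchanged when the option is
    already present or when no ServerLayout section header is found.
    """
    if re.search(r'^[ \t]*Option[ \t]+"Int10Backend"', conf_text, re.M | re.I):
        return conf_text
    header = re.compile(r'^([ \t]*Section[ \t]+"ServerLayout"[^\n]*\n)', re.M | re.I)
    # Insert the option on the line right after the section header.
    return header.sub(lambda m: m.group(1) + OPTION_LINE + "\n", conf_text, count=1)
```

Reading /etc/X11/xorg.conf, passing its contents through this function, and writing the result back (after keeping a backup) would apply the workaround.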

Collecting Usage Statistics

Thursday, July 19th, 2007

One of many goals I have is to be able to collect usage statistics for Linux machines at the university. I want to see the utilization of a computer lab over time and calculate the amount of usage of a group of computers. Which means I need to find out how long each login session on each computer lasts and eventually get that into a graph of the computers I’m interested in. Sounds easy.

However, it’s rather hard to find other bits of code on the internet from people collecting usage statistics from a group of Linux machines. I had to write a PAM module not long ago and it was easy to use the session functionality to make a report for each login of how long that session lasted and send that to the XMLRPC interface that collects all this mess. It works, but why was adding this functionality into a PAM module the easiest thing to do? Isn’t there an easy way to lift this information from the system with a bit of Python?

I’ve been looking at wtmp. It stores the data I want but it’s horrible to work with. I either need yet more C code to work with a bad API or a C Python module to still work with a bad API and grok wtmp’s strangeness. I could screen scrape the “last” command, but that’s really prone to error with the way it represents dates.

How do other folks mine this kind of data?
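One possible answer is to read the wtmp records directly with Python’s struct module and skip the C API entirely. The sketch below is not a finished tool. It assumes the 384-byte struct utmp layout that glibc uses on Linux x86 and x86_64 (other platforms differ), and the function names are made up for illustration. It pairs each login record with the next logout on the same terminal line and ignores reboots, so a session cut off by a crash is simply dropped.

```python
import struct

# Assumed layout: glibc's struct utmp on Linux x86/x86_64 (384 bytes).
# Fields: type, pid, line, id, user, host, exit status (2 shorts),
# session, tv_sec, tv_usec, addr_v6 (4 ints), 20 unused bytes.
UTMP = struct.Struct("<hxxi32s4s32s256shhiii4i20x")
USER_PROCESS, DEAD_PROCESS = 7, 8  # ut_type values for login and logout


def _text(raw):
    """Decode a NUL-padded C string field."""
    return raw.split(b"\0", 1)[0].decode("utf-8", "replace")


def read_records(data):
    """Yield (type, line, user, host, seconds) for each record in raw wtmp bytes."""
    usable = len(data) - len(data) % UTMP.size  # ignore a truncated tail
    for offset in range(0, usable, UTMP.size):
        f = UTMP.unpack_from(data, offset)
        yield f[0], _text(f[2]), _text(f[4]), _text(f[5]), f[9]


def sessions(data):
    """Return (user, line, host, duration_seconds) for each completed session."""
    open_logins = {}
    finished = []
    for ut_type, line, user, host, when in read_records(data):
        if ut_type == USER_PROCESS:
            open_logins[line] = (user, host, when)
        elif ut_type == DEAD_PROCESS and line in open_logins:
            # Logout records usually carry no user name, so use the stored one.
            user, host, start = open_logins.pop(line)
            finished.append((user, line, host, when - start))
    return finished
```

Opening /var/log/wtmp in binary mode and passing its contents to sessions() gives a list of durations that could then be sent on to whatever collects the statistics.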

More T61 Goodness

Wednesday, July 4th, 2007

I have built patched modules to drive the Intel 4965 wireless card in the ThinkPad T61. A lesser version of the same code is available in later Fedora 7 kernels but the PCI ID is different for the card in the T61. I’ve built Fedora kmod packages that build the iwl4965 module from the iwlwifi-0.0.32 package. It’s not the neatest package I’ve built, but it works.

Also, there is a small bug in the new Intel 965GM video driver where the driver attempts to scale the image even though the LCD panel is running at its native resolution. This produces an image that doesn’t appear “crisp” or appears out of focus. So here are some rebuilt xorg-x11-drv-i810 packages with the proper patch. I found this out from the following post: http://www.spinics.net/lists/xorg/msg25117.html

The packages are here: http://linuxczar.net/code/t61

ThinkPad T61 and Fedora

Saturday, June 30th, 2007

I’m the proud owner of a new Lenovo ThinkPad T61. It has the new 965GM graphics chipset as well as the Intel 4965 a/b/g/n wireless. The T61 is currently only available in widescreen and I have the 14.1″ 1440×900 model. So far I’ve been fairly impressed, but it being a new laptop there are always a few tricks to get it working.

Fedora 7 has the new Intel drivers that drive the 965GM and X seems to work fine. However, it appears as if Gnome doesn’t understand the widescreen resolution. GDM, the graphical boot loader, and Gnome itself seem to only want to work in a 1024×768 window in the upper left corner of the screen. In Gnome, I can re-adjust the panels (except the top panel), move windows, and use the space outside the box, but the image below is what things look like by default. You can see that the resolution selection app seems to believe that 1024×768 is as high as we go. Does anyone know a solution to this?

The Intel 4965 wireless does not work out of the box. However, Intel has released drivers as part of the iwlwifi kernel modules. The most recent Fedora 7 kernel (2.6.21-1.3228.fc7) has an older set. It looks like using a newer snapshot of this project and getting the microcode for the wireless card should enable this to work fairly well. This article has some details for getting the iwlwifi code to work on Fedora 7.

Configuration Management

Sunday, June 17th, 2007

Configuration Management (CM) is a critical part of doing large scale, or even small, systems administration. It’s more than important that your various machines pick up new and updated configuration files easily and in a timely manner. At NCSU, I’ve been doing what counts as CM using a Python project I call Realmconfig. Okay, Realmconfig has been around at NCSU managing Linux machines longer than I’ve been its maintainer. As Realmconfig developed, it gained more and more CM-like features such as an arbitrary collection of modules that run at boot to handle initial configuration. One of these modules “manages” a selection of files: if a file isn’t identical to the gold copy, it’s replaced with the gold version. Generally, it’s worked well for initial configuration, but pushing out changes hurt. They hurt bad.
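The gold-copy check is simple enough to sketch in a few lines. This is not Realmconfig’s actual code. The function name and arguments are made up for illustration, and the only behavior assumed is the one described above: compare the managed file to the gold copy and overwrite it on any difference.

```python
import filecmp
import shutil


def enforce_gold_copy(gold_path, managed_path):
    """Replace managed_path with the gold copy unless the two are identical.

    Hypothetical helper. Returns True when the file was replaced.
    """
    filecmp.clear_cache()  # avoid stale results for files rewritten moments ago
    try:
        same = filecmp.cmp(gold_path, managed_path, shallow=False)
    except FileNotFoundError:
        same = False  # a missing managed file counts as drift
    if not same:
        shutil.copyfile(gold_path, managed_path)
    return not same
```

The sketch also shows the weakness: the managed file is overwritten no matter who changed it or why.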

Said module that “manages” files can either run once, run every boot, or run only when I bump its version, which requires a new Realmconfig package. Run once handles initial configuration but ignores any updates. Run every boot sees updates, provided I’ve included them in a new package, but is draconian in applying those updates. Certain systems have modified configuration files in place and need to keep them. Only running when the version is bumped is a compromise, but I end up with the worst of all the problems.
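The three run policies boil down to a small decision function. This is an illustration only, not Realmconfig’s real interface: the mode names, the arguments, and the use of integer versions are all assumptions.

```python
def should_run(mode, has_run_before, last_version, current_version):
    """Decide whether the file-managing module runs on this boot.

    Hypothetical sketch of the three policies described above.
    """
    if mode == "once":
        return not has_run_before  # initial configuration only, updates are ignored
    if mode == "every_boot":
        return True  # always enforce, even over deliberate local edits
    if mode == "on_version_bump":
        return current_version > last_version  # only after a new package ships
    raise ValueError("unknown mode: %r" % (mode,))
```

None of the three branches can tell a deliberate local edit from drift, which is the real problem.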

Obviously, I need to move away from a haphazard collection of simplistic scripts to something that can scale to an environment of thousands of machines, does not require a new RPM package for every update (unless one decides to do CM solely by RPM), propagates updates easily without weird scripts, can handle restarting services and running small scriptlets, and allows administrators to override or replace aspects of the configuration I’ve provided.

The last point sounds a bit odd. Allow my configuration to be overridden? Most CM systems scale well but are designed around centralized administration. That’s a little different than what I will call centralized management. Centralized administration is one or a group of system administrators that act as one, unified entity to manage machines. In this case they are all trusted and working together to build and maintain their infrastructure. Any configuration changes made from outside the centralized administration are, by definition, not approved and should be quickly reconciled with the known good configuration. Or, more simply, you have a compromised machine.

However, I work at a university with many fiefdoms. Seldom is anything done that may be perceived as giving away direct control of something to another fiefdom. Fortunately, systems administrators are normally smart folks and understand that by working together across fiefdoms they can achieve bigger and better things. Some, however, don’t. So what we have at the university are modified versions of Solaris, Windows, and Linux that we (the central IT folks) make available to the university. The colleges and departments can deploy these “kits” as they need and leverage centralized management of the machines to deploy their own labs, workstations, and services. Most importantly, they can still be “in control” of the machines themselves.

So, where does this leave us in the realm of Configuration Management? I require a system where I can push out changes to all the managed Linux machines on campus. Also, local systems administrators who may not be trusted with all the configuration of all machines may wish to add configuration and have it enforced on their machines. It’s possible that there might be a third layer as well. Also, if a local administrator decides to manage a file I also manage, we need to do something a lot smarter than replace it with the global copy. We need to merge, or make sure that their files remain intact and ignore the global changes.
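The simpler of those two options, “their file wins,” can be sketched as a lookup through ordered layers. This is a toy model under stated assumptions: each layer is just a name plus a dictionary of path to content, the layer names are invented, and no merging is attempted.

```python
def resolve(path, layers):
    """Return (layer_name, content) from the most specific layer managing path.

    layers is ordered from most general to most specific, for example
    global, then college, then the local admin. Returns None when no
    layer manages the path. Hypothetical sketch, not an existing tool.
    """
    for name, files in reversed(layers):
        if path in files:
            return name, files[path]
    return None
```

For example, with layers [("global", {...}), ("college", {...}), ("local", {...})], a file managed by both the global and college layers resolves to the college copy, and the global copy is ignored on those machines.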

I’m not aware of any CM tool that is this flexible. I’ve been looking at Bcfg2 and will spend some more time with it as well. For a CM tool it seems well designed: it stays away from inventing new languages, is scalable, and is written in Python. We’ll see how it plays out in my testing. An important part of a useful CM tool is that there is a community around it rather than some random code I wrote. Bcfg2 has a very active community and maintainer.

Now we get into my crazy ideas. Toss in the hopper that configuration should be managed by some sort of SCM so that we have backups, machines can have their configuration rolled back, and a log is kept of why the configuration was updated. Suddenly, to me at least, we have the use case of a distributed SCM. Each machine has its own repository where configuration changes can be made locally and the machine can pull its configuration from any other machine, by default a master repository. An easy way to make a configuration hierarchy. We just need to be smart about automation and conflicts.

Using Git and pretending we have a useful data schema and tools to make use of it, how do we manage the magic local repository based on another machine’s?

  1. git clone SOURCE_REPO
  2. git branch upstream origin
  3. git pull origin :upstream
  4. git merge -s ours upstream
  5. System distributes configuration files. Local admins can commit their own configuration to HEAD.
  6. Goto step 3.

And so the local changes override the upstream default in an automated way, and each machine can pull its configuration information from any arbitrary point.
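One caveat about step 4: the “ours” merge strategy (-s ours) records a merge but keeps the local tree exactly as it was, so upstream changes to files nobody touched locally would never arrive. Current versions of Git also have a strategy option, -X ours, that merges upstream changes normally and only prefers the local side where hunks conflict, which seems closer to the intent here. The sketch below builds the commands for one sync pass that way. The function names and the origin/master ref are assumptions, and -X ours only settles content conflicts, so cases such as a file modified on one side and deleted on the other would still stop the merge.

```python
import subprocess


def sync_commands(upstream_ref="origin/master"):
    """Return the git commands for one sync pass, as argv lists.

    upstream_ref is an assumed default; use whatever ref holds the
    configuration this machine should follow.
    """
    remote = upstream_ref.split("/", 1)[0]
    return [
        ["git", "fetch", remote],
        # Take upstream changes, but keep the local side of any conflicting hunk.
        ["git", "merge", "--no-edit", "-X", "ours", upstream_ref],
    ]


def sync(repo_dir, upstream_ref="origin/master"):
    """Run one sync pass inside repo_dir, raising if a git command fails."""
    for argv in sync_commands(upstream_ref):
        subprocess.run(argv, cwd=repo_dir, check=True)
```

Calling sync() from cron or at boot would stand in for the “Goto step 3” loop.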

Probably complete crack.

Red Hat Summit

Wednesday, May 16th, 2007

Saturday I returned from another extremely successful Red Hat Summit in San Diego. Lots of fun, lots of interesting sessions. The real highlight was AMD’s announcement about making the GPU part of the CPU and opening the specifications. Real Open Source graphic drivers. I spoke a couple times with Ted Donnelly from AMD. (I’ve probably horribly misspelled that.) He spoke several times about what a huge risk this move is for AMD. If the Open Source community doesn’t “catch” AMD then it’s going to be a hard landing. But with the support that Ted was giving the Open Source community, truly free graphics on Linux will be very, very welcome. Okay, about 1,400 people wanted to hug some AMD folks. However, there weren’t any specifics on when and where, but the future looks bright. Check out LWN for the full press details.

One of the many very interesting things talked about was the ability for the average user to create Fedora live CDs and spin their own Fedora based distributions. Very neat stuff. The presenter of one session had burned a handful of mini-CDs of a live CD of just enough rawhide to get Yum working. These were very limited, so I wanted to post the 175MB ISO.

Installation Number Followup

Sunday, March 18th, 2007

It’s possible my earlier post was…a bit much. Probably being frustrated all day means I shouldn’t post at 1:00 AM.

I still believe that these INs are a terrible idea. They serve to make my life more complicated and it’s a very “corporate” way to control your products compared to the Open Source methodology of choice.

I’m drawing comments. I’ll take that as a good sign. To correct myself and respond to a few comments, is this IN thing a form of DRM? Wikipedia defines DRM as

an umbrella term that refers to any of several technologies used by publishers or copyright owners to control access to and usage of digital data or hardware, and to restrictions associated with a specific instance of a digital work or device.

Assuming that Wikipedia is fairly accurate here, is there a case where we could compare this to DRM? INs definitely don’t qualify as copy protection or technical protection measures. There is really nothing to prevent you copying the work in question. However, I think there is a point that INs control access to the software. Can you install it later via Yum (provided you have the proper contracts) or use RPM to install the packages from the CDs? Yes. Would every single person installing RHEL 5 understand that you have to go back and install more stuff? I have folks that I can’t get to understand that you must register your system with RHN to even get any updates. What about the software on the CDs that is not covered by my contracts? I can just make my own Yum repository of that too.

Does this potentially violate the GPLv3? I am not a lawyer so insert grain of salt here. In a more lucid state I would definitely say no. The distribution can be completely reproduced via the source without loss of functionality.

Another commenter had the audacity to say “installation numbers are a convenience.” I laughed. I laughed a lot. Thanks, ’cause I needed that. I do more than install a few RHEL 5 clients and servers. I maintain well over 1,000 RHEL based machines. So my environment is highly automated. The admins I work with understand well the existing system of choosing package sets. Now I need to somehow figure out from their package set choices what IN I should use. Or do I distribute INs to the admins and let them combine yet another code on top of the RHN activation key and our internal configuration management keys and make the same package choices twice? There are also the folks that want to install their own version of RHEL without our local modifications. I see the next 6 months are going to be spent putting out a lot of fires of “I can’t install RHEL 5” rather than doing useful things. No matter which way turns out best for handling INs, more complexity has been introduced.

The same comment’s corollary seems to indicate that based on your contracts you are not permitted to install some “binary bits” found on the installation media. That seems like “technologies used by publishers or copyright owners to control access to and usage of digital data” to me.

Perhaps what really set me off last week is that we use the academic contracts to purchase our RHEL subscriptions. So on Red Hat’s website I see my line items that look like “RHEL WS/AS Site Subscription” or “RHN Academic Site Subscription.” None of my line items are specific to Server or Client, much less the virtualization, workstation, or cluster options. The web site gives me several INs, all of which are the same: Server, 2 sockets, and 4 VMs. I’ve seen lots of noise on mailing lists about academic customers not having the INs that cover what they purchased. I’ve spoken with various folks in and outside of Red Hat and I know they are honestly working to correct the situation.

The solution here is choice. If I’m doing onesy-twosy installs, let me choose what options I want to install. This is one reason we have installclasses. During firstboot and RHN registration I could be presented with a well designed dialog explaining that I have installed software options not covered by my entitlement, if that were the case. This would warn about security issues and give me options for what I should do. In my automated environment I have control over what can be installed and can configure my local configuration management tools to not install options that we do not have access to, right beside the code that knows how to register the machine with RHN. I do that already, so that adds minimal complexity.

The solution is choice.

Red Hat’s Installation Numbers

Saturday, March 17th, 2007

I’m normally a big fan of Red Hat. Both for the ideals and progress in the Fedora community and for the advantages of using RHEL in the enterprise world. RHEL as a distribution tends to Not Suck(tm).

With the introduction of RHEL 5, Red Hat now wants you to use an activation code, called an Installation Number, to install. (How long did it take someone to come up with that name?) You can install without a code but you only get the core Client or Server installed. In most cases this is somewhat less than useful. But, to be clear, to have the installer install much of the Open Source software on the CD sets you must enter a 16-digit hexadecimal code that configures the installer to install the options you purchased.

I’m very insulted. The ideals that Red Hat holds so highly are flushed down the toilet at the sight of something green. How is this not DRM? How would this be legal with the GPL v3? It’s really a horrid, evil idea. At the very least, it makes me as a sysadmin do much more work to deploy RHEL 5. I remember when (back in the day) new distributions were easier to maintain and deploy than older ones.

This is Open Source software. It’s all about choice. Why did Red Hat choose to not give the user a choice of what flavor of Server/Client they want to install? At RHN registration time the admin could be alerted that he or she has installed features not covered under the contract and be given options for what to do. Possibly, buy the missing support? No…that would be too hard. Instead, we must give them the complete set of software and then restrict how it can be used. Bad Red Hat.

This is Open Source software. The installer needs to know how to parse these installation numbers. The RHN tools on the system need that knowledge as well to communicate with RHN. This is Open Source software, Red Hat. You cannot hide the details of these codes. In fact, I have already learned all I need to generate Installation Numbers myself with any feature set I so desire.

You can find some quick code to generate these numbers in genkey.py. I have also written a more in depth article about creating installation numbers and a few examples.

PBR Followup

Saturday, February 17th, 2007

Well, my Dell 1950 isn’t printing out “Bad PBR signature.” anymore. I figured out how to fix the below error, how to recreate it, and how to fix it again. There seems to be some issue with Grub involving the device mappings. While the device mappings that Anaconda wrote to disk look fine, something isn’t right.

I’ve been using my Grub CD to boot the machine and attempt to correct the Grub installation issues. Also the Grub CD is handy to boot the system by using the ‘configfile’ option. Once booted into the system proper I started up a Grub shell and ran the following:

device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)

The last 3 lines are for my RAID 1 configuration for /boot. I rebooted and the box booted up like a champ. When I boot off my Grub CD and reinstall Grub to the MBR I get the old non-bootable behavior. It’s strange. I wonder what’s going on…

Some references: