Over the years, I’ve developed some rules of thumb for systems administration and operations. They serve me well and might serve others too. Here they are in no particular order:

1. Invest In Openness

Most of my solutions involve Open Source Software or Open Standards. In my experience they build the most scalable and rigorous infrastructure. Participating in the Open Source community improves the tools all of us use and is a far more productive way to solve problems than working alone.

2. Backups Are Sacred

Data is our most valuable asset. Make sure it’s recoverable. See rule #4 and rule #5. A backup kept in a single location isn’t always recoverable.
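As a rough illustration of "more than one location, and actually recoverable", here is a minimal Python sketch that copies a file to several destinations and checks each copy against the original's checksum. The function names (`sha256_of`, `back_up_and_verify`) and the directory layout are hypothetical, and a real backup should also live on separate hardware or with a separate provider and be restore-tested.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_and_verify(source: Path, destinations: list[Path]) -> dict[Path, bool]:
    """Copy `source` into each destination directory and verify every copy.

    Returns a mapping of copy path -> True if its checksum matches the source.
    The destination directories are illustrative; in practice they would be
    separate disks, hosts, or providers.
    """
    expected = sha256_of(source)
    results = {}
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)  # copies contents and file metadata
        results[copy] = sha256_of(copy) == expected
    return results
```

A matching checksum only shows the copy is intact right now; it does not replace a periodic test restore.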

3. Use a Versioning Tool (Git)

Use a versioning tool with all documentation, code, scripts, configuration, packages, everything. It makes things easier to back up (although versioning tools alone are not backups). It gives the power to locate exactly what changed to cause a bug, who made the change, and why. It gives the ability to travel in time, which is always handy.

4. Failure Will Happen – Plan For It

From hard drives to cloud providers, failure will happen. The crucial question is not if, but when. Know the technologies and tools well enough to plan for failure and be able to handle it gracefully.

5. Automate Everything

If you might do a task again, it’s worth automating. Repeating the task becomes significantly more efficient, and the process for rarely performed work will not be lost or forgotten.

6. Testing Is a Ritual

Test, test, and automate testing. Don’t touch production with an untested process. Don’t write untested documentation either.

7. Never Be Without a Pen/Pencil

Some things are best kept on paper and few bits of software can adequately hold and represent what is in the brain. Keep a notebook and writing instrument handy.

8. The Scotty Factor

Multiply time estimates by 4. The task will usually take longer than first thought. Sometimes the reputation of a miracle worker follows.

9. Network With Peers

There is nothing more valuable than your own network of IT folks. Maintain friendships and find peers to bounce ideas off of.

10. Read Only Friday

Use the last day of the week for documentation, coding, or anything other than making changes to anything that’s remotely production. Time is needed to catch up on these tasks, and no one likes weekend pages due to a Friday change, or evening pages due to a late-day change.

The documentation written is invaluable for those who handle pages, including yourself.
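One way to make Read Only Friday stick is to have deployment tooling check the calendar before touching production. The sketch below is only an illustration: `change_allowed` is a hypothetical helper, and the 16:00 "late day" cutoff is an assumption rather than part of the rule.

```python
from datetime import datetime

FRIDAY = 4  # datetime.weekday() counts Monday as 0, so Friday is 4
LATE_DAY_HOUR = 16  # assumed cutoff; pick whatever "late day" means for your team


def change_allowed(now: datetime) -> bool:
    """Return True if a production change may proceed at local time `now`."""
    if now.weekday() == FRIDAY:
        return False  # Read Only Friday: no production changes
    return now.hour < LATE_DAY_HOUR  # avoid late-day changes and evening pages
```

A deploy script could call `change_allowed(datetime.now())` first and exit with a reminder to write documentation instead.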

Other Quotes to Live By

  1. “NO system should EVER rely on user behavior to remain stable.”
  2. “You either do your job well, or you do your job continuously.” – King’s Law
  3. “Every time I fix a problem by rebooting (rather than knowing the real cause and fixing it) I feel a little bit of me dies inside. It hurts our industry and our profession when we develop bad habits like guessing instead of knowing.” – Tom Limoncelli