Raam Dev’s Weblog

Look for what you do well, and excel.

Yahoo DNS Issues Cause Problems in the United States

Yahoo! appears to be inaccessible to people in the US. Visiting yahoo.com redirects to www.yahoo.com and fails to load. I confirmed it was at least somewhat limited to the US by trying the connection from a shell account on a server in Europe.

Using dig (a Unix DNS lookup utility), we can see from within the United States that there is a problem with DNS. The ANSWER section stops at a CNAME and never lists an A record with an IP address:

;; QUESTION SECTION:
;www.yahoo.com. IN A

;; ANSWER SECTION:
www.yahoo.com. 129 IN CNAME www.wa1.b.yahoo.com.

And from the server in Europe:

;; QUESTION SECTION:
;www.yahoo.com. IN A

;; ANSWER SECTION:
www.yahoo.com. 272 IN CNAME www.wa1.b.yahoo.com.
www.wa1.b.yahoo.com. 33 IN CNAME www-real.wa1.b.yahoo.com.
www-real.wa1.b.yahoo.com. 33 IN A 209.191.93.52

;; AUTHORITY SECTION:
wa1.b.yahoo.com. 273 IN NS yf2.yahoo.com.
wa1.b.yahoo.com. 273 IN NS yf1.yahoo.com.
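The comparison above can be automated. What follows is only a sketch: the function name has_a_record is made up for this example, and it assumes dig's usual layout (a ";; ANSWER SECTION:" header followed by one record per line with the class in the third field and the type in the fourth).

```shell
# has_a_record: read dig output on stdin and exit 0 only if the
# ANSWER section contains at least one A record.
# (Hypothetical helper, not part of dig.)
has_a_record() {
    awk '
        /^;; ANSWER SECTION:/ { in_answer = 1; next }  # answer section begins
        /^;;/ || /^$/         { in_answer = 0 }        # another header or a blank line ends it
        in_answer && $3 == "IN" && $4 == "A" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Feed it the answer seen from inside the US (CNAME only, no A record).
has_a_record <<'EOF' || echo "no A record in answer"
;; QUESTION SECTION:
;www.yahoo.com. IN A

;; ANSWER SECTION:
www.yahoo.com. 129 IN CNAME www.wa1.b.yahoo.com.
EOF
```

Against a live resolver the same check would be `dig www.yahoo.com A | has_a_record`.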

If you try connecting directly to the missing IP address, you should at least be able to get the main Yahoo page: http://209.191.93.52. If you want to keep using the FQDN, you can also temporarily add an entry for it to /etc/hosts (or C:\Windows\system32\drivers\etc\hosts on Windows).
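A temporary hosts entry is one line with the IP address followed by the name. The sketch below appends such a line only if it is not already present. The function name add_hosts_entry is hypothetical, and the demo writes to a scratch file; pointing it at the real /etc/hosts needs root, and the entry should be removed once DNS recovers.

```shell
# add_hosts_entry IP NAME [FILE]
# Append "IP<TAB>NAME" to FILE (default /etc/hosts) unless that exact
# line is already there. Hypothetical helper for illustration.
add_hosts_entry() {
    hosts_file=${3:-/etc/hosts}
    entry=$(printf '%s\t%s' "$1" "$2")
    grep -qxF "$entry" "$hosts_file" 2>/dev/null ||
        printf '%s\n' "$entry" >> "$hosts_file"
}

# Try it on a scratch file first rather than the live hosts file.
scratch=$(mktemp)
add_hosts_entry 209.191.93.52 www.yahoo.com "$scratch"
cat "$scratch"
rm -f "$scratch"
```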

UPDATE: As of 15:50 EST, Yahoo appears to be working again. The outage appeared to start around 15:11 EST, so that’s a good 40 minutes of downtime.

HOWTO: Count Files Recursively with Exclusion on Linux

Find all files in this directory, including the files in sub-directories, and exclude all files that start with a period (dot files) and any directories named .thumbs. Then pass the list of results to the wc command to get a total count:

find . ! -name ".*" ! -path "*.thumbs*" -type f | wc -l
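To see exactly what the expression counts, here is a small demonstration on a scratch directory. The function name count_files and the file names are invented for the example. Note that `! -name ".*"` tests only each file's own name, so a regular file inside some other dot directory (here .cache) is still counted.

```shell
# count_files [DIR]: the same find expression as above, with the
# starting directory passed as an argument (defaults to ".").
count_files() {
    find "${1:-.}" ! -name ".*" ! -path "*.thumbs*" -type f | wc -l
}

# Build a throwaway tree to count.
demo=$(mktemp -d)
mkdir -p "$demo/photos/.thumbs" "$demo/.cache"
touch "$demo/a.txt" "$demo/.hidden" "$demo/photos/b.jpg" \
      "$demo/photos/.thumbs/b_small.jpg" "$demo/.cache/index"

# Counts a.txt, photos/b.jpg and .cache/index; skips .hidden and
# everything under .thumbs.
count_files "$demo"   # prints 3
rm -rf "$demo"
```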

Mounting HFS+ with Write Access in Debian

When I decided to reformat my Mac Mini and install the latest testing version of Debian (lenny, at the time of this writing), I discovered that I couldn’t mount my HFS+ OS X backup drive with write access:

erin:/# mount -t hfsplus /dev/sda /osx-backup
[ 630.769804] hfs: write access to a journaled filesystem is not supported, use the force option at your own risk, mounting read-only.

This warning puzzled me because I was able to mount fine before the reinstall and, since the external drive is to be used as the bootable backup for my MBP, anything with “at your own risk” was unacceptable.

I had already erased my previous Linux installation so I had no way of checking what might have previously given me write access to the HFS+ drive. A quick apt-cache search hfs revealed a bunch of packages related to the HFS filesystem. I installed the two that looked relevant to what I was trying to do:

hfsplus - Tools to access HFS+ formatted volumes
hfsutils - Tools for reading and writing Macintosh volumes

No dice. I still couldn’t get write access without that warning. I tried loading the hfsplus module and then adding it to /etc/modules to see if that would make a difference. As I expected, it didn’t. I was almost ready to give up, but there was another HFS package in the list that, even though it seemed unrelated to what I was trying to do, seemed worth a shot:

hfsprogs - mkfs and fsck for HFS and HFS+ file systems

It worked! I have no idea how or why (and I’m not interested enough to figure it out), but after installing the hfsprogs package I was able to mount my HFS+ partition with write access.

Understanding the Linux Load Averages

I have been using Linux for several years now and although I have looked at the load averages from time to time (either using top or uptime), I never really understood what they meant. All I knew was that the three different numbers stood for averages over three different time spans (1, 5, and 15 minutes) and that under normal operation the numbers should stay under 1.00 (which I now know is only true for single-core CPUs).

Earlier this week at work I needed to figure out why a box was running slow. I was put in charge of determining the cause, whether it be excessive heat, low system resources, or something else. Here’s what I saw for load averages when I ran the top command on the box:

load average: 2.86, 3.00, 2.89

I knew that looked high, but I had no idea how to explain what “normal” was and why. I quickly realized that I needed a better understanding of what I was looking at before I could confidently explain what was going on. A quick Google search turned up this very detailed article about Linux load averages, including a look at some of the C functions that actually do the calculations (this was particularly interesting to me because I’m currently learning C).

To keep this post shorter than the aforementioned article, I’ll simply quote the two sentences that gave me a clear-as-day explanation of how to read Linux load averages:

The point of perfect utilization, meaning that the CPUs are always busy and, yet, no process ever waits for one, is the average matching the number of CPUs. If there are four CPUs on a machine and the reported one-minute load average is 4.00, the machine has been utilizing its processors perfectly for the last 60 seconds.

The machine I was checking at work was a single-core Celeron machine. This meant that, with a continuous load of almost 3.00, the CPU was being stressed far more than it should be. Theoretically, a dual-core machine would drop this load to around 1.50 per core and a quad-core would drop it to 0.75 per core.

There is a lot more behind truly understanding the Linux load averages, but the most important thing to understand is that they do not represent CPU usage as a percentage. Rather, they represent the demand on the CPU: the average number of processes that are running or waiting for their chance to run (on Linux, processes in uninterruptible sleep, typically waiting on disk I/O, are counted as well). If you still can’t get your brain away from thinking in terms of percentages, consider 1.00 to be 100% load for single-core CPUs, 2.00 to be 100% load for dual-core CPUs, and so on.
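The per-core rule of thumb is just a division. Here is a minimal sketch; load_per_cpu is a made-up helper name, and the figures are the rounded 3.00 load from the example above.

```shell
# load_per_cpu LOAD CPUS: print LOAD divided by the number of CPUs,
# rounded to two decimals. 1.00 means the CPUs are exactly saturated.
load_per_cpu() {
    LC_ALL=C awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

load_per_cpu 3.00 1   # single-core: prints 3.00
load_per_cpu 3.00 2   # dual-core:   prints 1.50
load_per_cpu 3.00 4   # quad-core:   prints 0.75
```

On a Linux box with GNU coreutils, live numbers could be fed in with `load_per_cpu "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"`.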

Creating a Bootable OS X Backup on Linux: Impossible?

I’ve had plans for a while now to set up a backup system using a Debian Linux server and rsync to back up my MacBook Pro laptop. At first glance, it seemed like it would be pretty straightforward. I’ve been able to make a bootable copy of my entire MBP using nothing but rsync (thanks to some very helpful directions by Mike Bombich, the creator of the popular, and free, Carbon Copy Cloner software). And by bootable copy I mean I could literally plug in the USB drive and boot my MBP from the drive (hold down the Alt/Option key while booting). Restoring a backup is as simple as running the rsync command again, but in the reverse direction. I know this solution works because I used it when I upgraded to a 320GB hard drive.

To start, I needed to create a big enough partition on the external USB drive using Disk Utility (formatted with Mac OS Extended (Journaled)). I then made a bootable copy of my MBP with one rsync command:

sudo rsync -aNHAXx --protect-args --fileflags --force-change \
--rsync-path="/usr/local/bin/rsync" / /Volumes/OSXBackup

But my dream backup system was more unattended. I wanted something that would periodically (a couple times a day) run that rsync command over SSH (in the background) and magically keep an up-to-date bootable copy of my MBP on a remote server.

I love Linux and I jump at any opportunity to use it for something new, especially in a heterogeneous network environment. So when I decided to set up a backup server, I naturally wanted to make use of my existing Debian Linux machine (which just so happens to be running on an older G4 Mac Mini).

So, after making a bootable copy of my MBP using the local method mentioned above, I plugged the drive into my Linux machine, created a mount point (/osx-backup), and added an entry to /etc/fstab to make sure it was mounted on boot (note the filesystem type is hfsplus):

/dev/sda /osx-backup hfsplus rw,user,auto 0 0

All that’s left to do now is to run the same rsync command as earlier but this time specifying the remote path in the destination (root@myserver.example.com:/osx-backup/). This causes rsync to tunnel through SSH and run the sync. Unfortunately, this is where things started to fall apart.

OS X uses certain file metadata which must be copied for the backup to be complete (again, we’re talking about a true bootable copy that looks no different than the original). Several of the flags used in the rsync command above are required to maintain this metadata and unfortunately Linux doesn’t support all the necessary system calls to set this data. In particular, here are the necessary flags that don’t work when rsyncing an OS X partition to Linux:

-X (rsync: rsync_xal_set: lsetxattr() failed: Operation not supported (95))
-A (recv_acl_access: value out of range: 8000)
--fileflags (on remote machine: --fileflags: unknown option)
--force-change (on remote machine: --force-change: unknown option)
-N (on remote machine: -svlHogDtNpXrxe.iL: unknown option)

According to the man page for rsync on my MBP, the -N flag is used to preserve create times (crtimes) and the --fileflags option requires the chflags system call. When I compiled the newer rsync 3.0.3 on my MBP, I had to apply two patches to the source that were relevant to preserving Mac OS X metadata:

patch -p1 <patches/fileflags.diff
patch -p1 <patches/crtimes.diff

I thought that maybe if I downloaded the source to my Linux server, applied those same patches, and then recompiled rsync, that it would be able to use those options. Unfortunately, those patches require system-level function calls (such as chflags) that simply don’t exist in Linux (the patched source wouldn’t even compile).

So I tried removing all unsupported flags even though I knew lots of OS X metadata would be lost. After the sync finished, I tried booting from the backup drive to see if everything worked. It booted into OS X, but when I logged into my account lots of configuration was gone and several things didn’t work. My Dock and Desktop were both reset and accessing my Documents directory gave me a “permission denied” error. Obviously that metadata is necessary for a viable bootable backup.

So, where to from here? Well, I obviously cannot use Linux to create a bootable backup of my OS X machine using rsync. I read of other possibilities (like mounting my Linux drive as an NFS share on the Mac and then using rsync on the Mac to sync to the NFS share) but they seemed like a lot more work than I was looking for. I liked the rsync solution because it could easily be tunneled over SSH (secure) and it was simple (one command). I can still use the rsync solution, but the backup server will need to be OS X. I’ll be setting that up soon, so look for another post with those details.

WHM Whitelist to Exclude from Exim Sender Verify Callbacks

Sender verification is an important feature used by email servers to help prevent spam. When sender verification is enabled, the receiving email server checks to make sure the sender exists. Various email servers have different ways of handling this feature. Exim, for example, uses a mechanism called ‘sender callouts’ or ‘callbacks’. (When the sending server does not accept a verification request, it does not comply with RFC 2821.)

However, if the network route from the receiving email server to the originating email server is broken (or a firewall blocks the connection), the result can be a bit confusing. The receiving email server treats a callout it could not complete the same way as a verification that failed, regardless of whether it could even connect to the originating server. This means the email never comes through to the recipient. After all, as far as the email server knows, it’s spam.

One of my hosting clients was experiencing this “lost email” problem and a quick grep of /var/log/exim_mainlog confirmed the problem (hosts and IPs changed for obvious reasons):


2008-11-17 15:02:27 [30121] H=relay1.example.com (qsv-spam1.example.com) [67.26.151.59]:36752 I=[69.161.211.25]:25 sender verify defer for <administrator@customer.example.com>: could not connect to customer.example.com [163.112.75.15]: Connection timed out
2008-11-17 15:02:27 [30121] H=relay1.example.com (qsv-spam1.example.com) [67.26.151.59]:36752 I=[69.161.211.25]:25 F=<administrator@customer.example.com> temporarily rejected RCPT <raam@mydomain.com>: Could not complete sender verify callout
2008-11-17 15:02:27 [30120] H=relay1.example.com (qsv-spam1.example.com) [67.26.151.59]:36751 I=[69.161.211.25]:25 incomplete transaction (RSET) from <administrator@customer.example.com>

As you can see, the email server was unable to connect to customer.example.com to verify the existence of the sender (administrator@customer.example.com). This doesn’t mean the sending server refuses verification callbacks, but rather that the network connection from my server to the sending server could not be established.

Most of what I found online about solving this problem on a server running WHM explains how to modify exim.conf to add special whitelist rules. Luckily, my server is running WHM 11.23.2 and has a whitelist option that makes it really easy to exclude a particular IP address from sender verification without any manual changes to exim.conf:

1. Click Service Configuration -> Exim Configuration Editor
2. Under Access Lists, find “Whitelist: Bypass all SMTP time recipient/sender/spam/relay checks” and click [EDIT]
3. Add the IP address of the sending server for which you wish to skip sender verification (as the note at the bottom explains, hostnames cannot be used in this list)
4. Click Save
5. Click Save again near the bottom of the Exim Configuration Editor page

That’s it! Now any emails from that IP that were failing to come through because of a sender verification failure will come through without a problem (again, you can watch /var/log/exim_mainlog to confirm).
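To find candidate IP addresses for the whitelist, the deferred callouts can be pulled out of the log. This is a rough sketch that assumes log lines shaped like the excerpt above ("H=... [ip]:port I=... sender verify defer ..."). The function name deferred_hosts is hypothetical, and it also assumes the address to whitelist is that of the connecting host in the H= field.

```shell
# deferred_hosts [LOGFILE...]
# List connecting IPs whose mail hit "sender verify defer", most
# frequent first. Reads stdin when no file is given.
deferred_hosts() {
    sed -n 's/.* H=[^[]*\[\([0-9.]*\)\]:[0-9]* I=.*sender verify defer.*/\1/p' "$@" |
        sort | uniq -c | sort -rn
}

# Example with one line in the format shown above:
printf '%s\n' '2008-11-17 15:02:27 [30121] H=relay1.example.com (qsv-spam1.example.com) [67.26.151.59]:36752 I=[69.161.211.25]:25 sender verify defer for <administrator@customer.example.com>: could not connect to customer.example.com [163.112.75.15]: Connection timed out' |
    deferred_hosts
```

On the server itself this would be run as `deferred_hosts /var/log/exim_mainlog`; each output line is a count followed by an IP address.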

HOWTO: Install md5sum & sha1sum on Mac OS X

I was a bit surprised to learn that my Mac didn’t have the md5sum and sha1sum tools installed by default. A quick search and I found a site that provides the source. The sources compiled successfully on my Mac (OS X 10.5.5, Xcode tools installed).

The only quirk appears in the last step:

$ ./configure
$ make
$ sudo make install
cp md5sum sha1sum ripemd160sum /usr/local/bin
chown bin:bin /usr/local/bin/md5sum /usr/local/bin/sha1sum \
/usr/local/bin/ripemd160sum
chown: bin: Invalid argument
make: *** [install] Error 1

The make install command tries to change the ownership of the files to the bin user. Since that user doesn’t exist on my system, the chown step fails. This isn’t a problem though: the binaries had already been copied to /usr/local/bin/ before the chown ran, and both md5sum and sha1sum work perfectly.
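A quick way to confirm the install is to hash a file with a known digest. The sketch below assumes the installed sha1sum prints the digest as the first field of its output, as the GNU coreutils version does. The function name verify_sha1 is invented for this example.

```shell
# verify_sha1 FILE EXPECTED: exit 0 if the SHA-1 digest of FILE
# matches EXPECTED (lowercase hex).
verify_sha1() {
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# An empty file has a well-known SHA-1 digest, which makes a handy smoke test.
empty=$(mktemp)
verify_sha1 "$empty" da39a3ee5e6b4b0d3255bfef95601890afd80709 && echo "sha1sum works"
rm -f "$empty"
```

The same pattern applies to checking a download against the digest its publisher lists.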

  • I just finished installing DD-WRT on a Linksys WRT54GL router for the office and all I can say is wow. I remember when replacing the firmware on a Linksys router was like doing surgery in the dark with a butcher knife and a wrench. Now I just download the DD-WRT firmware, use the Upgrade Firmware section of the Linksys configuration page on my router, and BAM! I have DD-WRT installed. The extra features provided by DD-WRT are invaluable and make the router’s usefulness increase exponentially. I’ve got to install this on a router at home.

Googlebot Relentlessly Using Bandwidth

When one of my hosting clients complained about continuously running out of bandwidth on his low-traffic site, I took a peek at the access logs and discovered that Googlebot was indexing every single possible day on a simple calendar addon for the phpBB2 forum software installed on the site. (Googlebot is the program that crawls the web indexing everything so you can search for it using Google.)

The logs showed thousands of Googlebot requests for the forum calendar:

66.249.71.39 - - [01/Sep/2008:17:09:12 -0400] "GET /forums/calendar.php?m=7&d=21&y=1621&sid=79b643b30eer7140adcd2ba76732688a HTTP/1.1" 200 44000 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.40 - - [01/Sep/2008:17:09:33 -0400] "GET /forums/calendar.php?m=4&d=2&y=2188&sid=e4da1ee0a488096e3897a8f15c31cea2 HTTP/1.1" 200 43997 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.40 - - [01/Sep/2008:17:09:44 -0400] "GET /forums/calendar.php?m=12&d=4&y=1624&sid=cc5d5084d158457ce3c7a9d38263f553 HTTP/1.1" 200 44076 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.41 - - [01/Sep/2008:17:10:05 -0400] "GET /forums/calendar.php?m=10&d=15&y=1621&sid=a4e8af0d20715g965b3e616ae6f95004 HTTP/1.1" 200 43751 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.41 - - [01/Sep/2008:17:10:15 -0400] "GET /forums/calendar.php?m=9&d=13&y=2187&sid=80c79b2491ddf3d8d46076d48a6282d1 HTTP/1.1" 200 43896 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.40 - - [01/Sep/2008:17:10:26 -0400] "GET /forums/calendar.php?m=5&d=30&y=1618&sid=f0619ba6517an57bcd6a7e9ca6289a32 HTTP/1.1" 200 43820 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.39 - - [01/Sep/2008:17:10:38 -0400] "GET /forums/calendar.php?m=11&y=2189&d=30&sid=97c0a58bbd2b3914dbf255ea0a2b1a4c HTTP/1.1" 200 44107 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
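To put numbers on entries like these, the log can be summarized per URL. This is a sketch that assumes the combined log format shown above (request path in field 7, response size in field 10). The function name googlebot_usage and the access_log file name in the usage line are placeholders.

```shell
# googlebot_usage [LOGFILE...]
# For requests whose line mentions Googlebot, print
# "path hits bytes" per path (query string stripped), largest first.
googlebot_usage() {
    awk '
        /Googlebot/ {
            split($7, url, "?")        # drop the query string
            hits[url[1]]++
            bytes[url[1]] += $10       # response size in bytes
        }
        END {
            for (p in hits) printf "%s %d %d\n", p, hits[p], bytes[p]
        }
    ' "$@" | sort -k3,3nr
}

# Usage (file name is a placeholder): googlebot_usage access_log | head
```

Matching on the user-agent string alone trusts whatever the client claims to be, which is good enough for a bandwidth estimate.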

A quick Google search turned up many others who’ve had the same problem:

Just found exactly the same on one of my client’s sites. They were complaining that despite being a small site, they’d apparently used all of their bandwidth within 4 days.

They had one of these PHP calendars on their site, where you click the day and it tells you what’s on. Googlebot had tried to index EVERY SINGLE POSSIBLE DAY. And, in the first four days of September, had used up all this site’s bandwidth, clocking up an impressive 19,000 hits and 800MB of bandwidth.

You can use robots.txt to tell all decent robots to push off. I’ve just done that. Let’s see if it works!

So I added a file to the root web directory for the site and named it robots.txt. Inside, I put the following:

User-agent: *
Disallow: /forums/calendar.php

Sure enough, the next time the Googlebot came through it ignored /forums/calendar.php and didn’t use up ridiculous amounts of bandwidth indexing something that need not be indexed.
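Before waiting for the next crawl, the rule can be sanity-checked locally. The sketch below is a deliberately simplified matcher: it only understands `User-agent: *` groups and plain prefix `Disallow:` rules, and it ignores `Allow:`, wildcards, and agent-specific groups. The function name is_disallowed is hypothetical.

```shell
# is_disallowed ROBOTS_FILE PATH
# Exit 0 if PATH starts with a Disallow prefix in a "User-agent: *" group.
is_disallowed() {
    awk -v path="$2" '
        tolower($1) == "user-agent:" { applies = ($2 == "*"); next }
        applies && tolower($1) == "disallow:" && $2 != "" && index(path, $2) == 1 { hit = 1 }
        END { exit (hit ? 0 : 1) }
    ' "$1"
}

# Recreate the two-line robots.txt from above in a scratch file.
robots=$(mktemp)
printf 'User-agent: *\nDisallow: /forums/calendar.php\n' > "$robots"

is_disallowed "$robots" "/forums/calendar.php?m=7&d=21&y=1621" && echo "calendar: blocked"
is_disallowed "$robots" "/forums/index.php" || echo "index: allowed"
rm -f "$robots"
```

Because matching is by prefix, every query-string variation of calendar.php is covered by the single Disallow line.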

I can’t blame the Googlebot though. It was just doing its job. The fault goes to the creators of the calendar addon. What they should have done was add rel="nofollow" to all the links in the calendar. You can add the nofollow attribute value to individual links to ask Googlebot not to follow them. Google started using nofollow as a method of preventing comment spam back in 2005.