So there I was, happily working away, when Time Machine pops up and tells me, "Time Machine has not successfully completed a backup in 18 days." "That's strange," I thought, and proceeded to look into what could possibly be wrong.
I won't bore you with my deep satisfaction with Macs and Time Machine. That's not what this article is about. However, what I discovered was that Time Machine was failing to mount the sparse bundle in which the backup is stored. After poking at this for a couple of minutes I decided to simply reformat the Time Machine partition and be done with it.
After doing so I told Time Machine to run a backup. 250 gigabytes into the backup it failed again. That's when I realized the truth... I was dealing with a dying disk.
Normally this wouldn't bother me too much. I've been in the IT and Security fields for more than 20 years so I'm fully invested in backing up data. Unfortunately, on this particular drive there resides about 200 gigabytes of data that isn't conveniently backed up anywhere else. Of course I should have known better but these things happen. The real question is what can be done to recover the data. Fortunately for me I work for a data forensics and recovery company!
After disassembling the drive array (it's a Western Digital MyBook with dual 500 gig SATA drives inside) I hooked each of the drives up to a Tableau write blocker to see what was what. In terms of data recovery, I'm actually doing some diagnosing at this point. I'm looking to determine first whether or not both drives are dying and second whether or not there are any signs of physical problems.
By physical problems I'm talking about all of those lovely sounds that hard drives can make when they're dying. Anyone who has experienced this knows these sounds. Everything from the magnetic sucking sound of the voice coil pulling the heads in and shooting them back across the drive in a frantic effort to read your data to the grinding sound of heads crashing into the platter. A good sign in this case is that the drives both spin up and make no unusual sounds. By the way, take a look here for an excellent sampling of the different sounds dying drives make!
When it comes to a first attempt at data recovery I'm an enormous fan of ddrescue. This handy utility takes all of the effort out of using dd to manually carve data off of a dying drive. I hooked the first drive up to the write blocker with a spare 500 gig drive as a target and unleashed ddrescue. After watching for a few minutes it was clear that there was no problem with this drive. Data was flying from one drive to the other at more than 12,138 kB/s. This is at least some consolation since I only have to recover one drive now. *whew*!
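For anyone who hasn't used ddrescue before, here is a minimal sketch of what a typical two-pass run might look like, written as a small Python helper that only builds the command lines. The device names `/dev/sdX` and `/dev/sdY` and the mapfile name `rescue.map` are placeholders, and the choice of options is an assumption for illustration, not a record of what I actually typed that morning.

```python
import shlex


def ddrescue_passes(source, target, mapfile):
    """Build two GNU ddrescue command lines: a quick first pass, then a retry pass.

    The option choices here are illustrative assumptions, not the exact
    invocation used in the recovery described in this post.
    """
    # Pass 1: -f permits overwriting an output device, -n skips the slow
    # scraping phase so the easy-to-read areas are copied first.
    first = ["ddrescue", "-f", "-n", source, target, mapfile]
    # Pass 2: -d uses direct disc access, -r3 retries bad sectors up to
    # three times. Reusing the same mapfile lets ddrescue resume where it left off.
    second = ["ddrescue", "-f", "-d", "-r3", source, target, mapfile]
    return [first, second]


if __name__ == "__main__":
    # Placeholder device names: double-check which disk is which before running anything.
    for cmd in ddrescue_passes("/dev/sdX", "/dev/sdY", "rescue.map"):
        print(shlex.join(cmd))
```

The helper prints the commands rather than running them on purpose. Pointing ddrescue at the wrong target device destroys whatever is on it, so it is worth reading the command twice before executing it.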
Now I hook up the second drive. The drive is recognized easily enough, but when I start the data copy more and more read errors are detected. While I start out with a fairly low error rate (perhaps 1 bad block in every 1000), it quickly escalates to every other block. Not so good. Here's where experience pays off, however...
The data recovery process, at this rate, was looking to be pretty spotty and would likely take more than 10 days to complete its run. Looking at the facts so far, particularly that I had few read errors while the drive was cool and tons of them once it was hot, I decided to stop and restart the process. After the restart, reads that had previously succeeded were failing. In fact, every block was being marked as bad! This might sound like bad news, but coupled with the fact that there are no physical sounds of a dying drive it's actually excellent news. The most likely problem is that the controller board is dying.
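To put the "more than 10 days" figure in perspective, here is a back-of-the-envelope sketch of the arithmetic. It assumes decimal units (500 GB as 500 × 10^9 bytes, 1 kB as 1000 bytes) and a constant transfer rate, which a failing drive never really delivers, so treat the numbers as rough estimates only. The function names are made up for this example.

```python
DRIVE_BYTES = 500 * 10**9  # a 500 GB drive, assuming decimal gigabytes


def hours_to_copy(total_bytes, rate_kb_per_s):
    """Hours needed to copy total_bytes at a steady rate given in kB/s (1 kB = 1000 bytes)."""
    seconds = total_bytes / (rate_kb_per_s * 1000)
    return seconds / 3600


def rate_for_duration(total_bytes, days):
    """Average rate in kB/s at which copying total_bytes takes exactly `days` days."""
    seconds = days * 24 * 3600
    return total_bytes / seconds / 1000


if __name__ == "__main__":
    # The healthy drive copied at a bit over 12,138 kB/s.
    print(f"Healthy drive: about {hours_to_copy(DRIVE_BYTES, 12138):.1f} hours")
    # A 10-day run implies the average rate has collapsed to roughly this:
    print(f"10-day run: about {rate_for_duration(DRIVE_BYTES, 10):.0f} kB/s on average")
```

In other words, the healthy drive is an overnight job, while a 10-day estimate means the sick drive is averaging only a few percent of that speed.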
The best news of the morning is that, if you remember, this MyBook has two identical drives in it. Taking my handy precision ESD screwdrivers, I take the controller off of the bad drive and replace it with the controller from the good drive. The result is that ddrescue now finds only 155 kB worth of bad data on a 500 gigabyte drive!
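If you want to check a number like that for yourself, ddrescue records the state of every region of the disk in its mapfile. Below is a minimal sketch that totals the bytes per status from a mapfile, assuming the usual GNU ddrescue layout: comment lines starting with `#`, one status line, then `pos size status` rows where `+` means rescued and `-` means bad sector. The `SAMPLE` contents are invented for illustration and are not the mapfile from this recovery.

```python
from collections import Counter

# Hypothetical mapfile contents, made up for this example.
SAMPLE = """\
# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x30001000     +               1
#      pos        size  status
0x00000000  0x10000000  +
0x10000000  0x00001000  -
0x10001000  0x20000000  +
"""


def summarize_mapfile(text):
    """Return a Counter mapping each status character to its total size in bytes."""
    totals = Counter()
    status_line_seen = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not status_line_seen:
            # The first non-comment line holds the current position, not a data block.
            status_line_seen = True
            continue
        _pos, size, status = line.split()[:3]
        totals[status] += int(size, 0)  # base 0 accepts the 0x-prefixed hex values
    return totals


if __name__ == "__main__":
    totals = summarize_mapfile(SAMPLE)
    print(f"rescued: {totals['+']} bytes, bad: {totals['-']} bytes")
```

Running the same function over a real mapfile (read with `open(path).read()`) gives a quick tally of how much data is still unrecovered after each pass.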
So what's the lesson for us? I learned a long time ago, much to my wife's chagrin, to never throw away hard drives. I've actually got a cabinet full of them representing more than fifteen years of hard drive history. Now don't misunderstand, I don't store bad drives. What I store are working drives. You never know when a drive will come across your desk with serious problems, all of which are related to the controller board. If you're a forensic investigator working in the information security field I'd strongly recommend that you also start such a collection.
For my business I need to maintain a stock of a lot of drives. I have very little control over what will come in. If you're working as an internal security, incident response or forensics person, however, you need far fewer drives. What you really need to do is make sure that whenever a computer system is purchased, especially a widely deployed system, at least one (more is better, of course) spare drive is purchased for that model of system. For instance, if you decide to push out a stack of Dell PowerEdge servers with terabyte drives, then at least one spare drive that matches the model of drive in the server should be purchased and stored as part of a response kit. Even if you never need them for an actual forensic incident, the drives easily pay for themselves when you need to recover data off of a drive that has suddenly become unreadable and was not backed up as it should have been!
Just as a final closing thought, if the drive you are looking at is making evil sounds and you need to get something off of it, you should power the drive off and stop poking at it. Call a professional. When there is a physical defect, continuing to try to work with the drive will almost always make matters worse, perhaps to the point of being unrecoverable.
This is just one of the many topics discussed and taught hands on in David Hoelzer's class, "Advanced System & Network Auditing", available through The SANS Institute. David is a Senior Fellow with The SANS Institute and the principal examiner for Enclave Forensics. You can find a variety of topics on his blog.