Recently, I had the opportunity to do forensic analysis on an HDD extracted from a Canon ImageRunner Advanced C5240 Multifunction Copier. After a story was broken by CBS News back in 2010, it seemed likely that less would be available than is described in the copier forensic write-ups here and here. Nonetheless, I was hopeful. As you will see, my results were somewhat mixed, but the security enhancements put in place after that article could certainly have been much more complete.
The Good:
The first new security feature I noticed, when beginning my examination of the drive, was that the drive itself was protected using an ATA firmware password. This isn't actually the showstopper it might appear to be, however. Such passwords can be fairly simple to remove, if you have access to the right tools. One such commercial tool is Atola Insight. While it's relatively expensive, it's a tool frequently owned and used by various data recovery firms, who will use it to remove firmware passwords as a standard service offering. For the do-it-yourselfer, I also found this link (translated), which described in detail the process of recovering an ATA password from a Western Digital drive (not the maker of my test device). It also mentions that, in at least the case it specifically covers, the firmware password can be found written in the clear on the drive's internal system partition. For those interested in more detail on HDD internals, I refer you to Scott Moulton's Hard Drive Diagnostics presentation.
One thing further that I should mention on this subject is that many forensic investigators are apparently unaware of even how to verify that an ATA password has been set on a drive. While the use of such a password is supported on most newer laptops, it's actually rather unusual, in my experience, to see it enabled. I've also been somewhat surprised to find that few forensic utilities seem to even check for it, just repeatedly failing to read blocks on the device when attempting to image it. Maybe such checking requires specific hardware support in the controller and driver used to connect to the target drive. In any case, one hardware/software combination that does work for this is a Tableau Write-Blocker paired with Tableau's free TIM imaging software. Once the target device is connected via the Tableau write-blocker, and TIM is started up, you can open up the disk details window, and examine the HPA/DCO tab. If the device being examined has an ATA password set, the 'Security in Use' field will be set to 'Yes'.
In any case, once I received the drive back from a data recovery firm, with the password removed, I was able to begin my examination in earnest. The OS itself appeared to be implemented using some variety of bootable Java. There were 14 partitions present. Ten of these hosted ext3 filesystems, one was a Linux swap partition, and the other three were of unknown type. The unknown partitions appeared to contain nothing but a header of some kind (in two of the three cases), followed by long strings of 00 or FF, suggesting that wiping activity of some sort had taken place. The unknown partition with header "NadaFs_FastVctlTable_V0403" is the one that the first report listed above suggested would have carveable copy page data in it, so it appears that the (very) specific issue referenced in the CBS article has been addressed.
The Bad:
However, this is not to say that I was unable to carve any interesting image data from this drive. In fact, I was able to get a number of JPG and PDF files from unallocated space on one of the ext3 partitions. These, or at least some of them, appeared to be data from scan jobs that were run on the device. In particular, I was able to recover a JPG and a PDF of the exact same document in a couple of cases. My conclusion is that the device uses a different mechanism for spooling scan data than it does for copy data, and that it first creates a JPG from a scanned document, and then converts that file into a PDF for subsequent transmission to the scanning user via email. I also recovered two PDF documents from unallocated space on a different partition. I'm guessing these were processed via some other path through the device. If this is the case, then one of the other functions of the device (Email? Fax? Printing?) uses yet another spooling system.
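As a rough illustration of what header/footer carving involves, here is a minimal Python sketch. It is not the tool used in this examination, and the function name carve_by_signature and the max_size limit are made up for the example. It assumes the unallocated space has already been exported to a byte buffer, and it ignores fragmentation and embedded thumbnails, both of which real carvers have to deal with.

```python
# Naive signature carving: find a header, then the next footer after it.
# Assumption: files are contiguous and small enough to fit within max_size.

JPEG_HEADER, JPEG_FOOTER = b"\xff\xd8\xff", b"\xff\xd9"
PDF_HEADER, PDF_FOOTER = b"%PDF-", b"%%EOF"


def carve_by_signature(data: bytes, header: bytes, footer: bytes,
                       max_size: int = 10 * 1024 * 1024):
    """Return a list of (offset, carved_bytes) for each header/footer pair."""
    found = []
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            break
        # Only look for the footer within max_size bytes of the header.
        end = data.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range; keep scanning
            continue
        end += len(footer)
        found.append((start, data[start:end]))
        pos = end
    return found


if __name__ == "__main__":
    # Synthetic buffer standing in for exported unallocated space.
    blob = (b"\x00" * 4 + b"\xff\xd8\xff\xe0abc\xff\xd9"
            + b"\x00" * 3 + b"%PDF-1.4 body %%EOF" + b"\xff" * 2)
    for offset, chunk in carve_by_signature(blob, JPEG_HEADER, JPEG_FOOTER):
        print("JPEG candidate at offset", offset, "length", len(chunk))
    for offset, chunk in carve_by_signature(blob, PDF_HEADER, PDF_FOOTER):
        print("PDF candidate at offset", offset, "length", len(chunk))
```

Anything recovered this way is only a candidate and still needs to be opened and checked by hand.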
The Ugly:
In addition to the previously processed pages I was able to carve from unallocated space, I also looked over the file data carefully, searching for any log files that might illuminate usage patterns for the device in its various modes. I found several such files, but all were written in various binary formats. After several days of staring blearily at meaningless jumbles of hex, I was finally able to unravel the date formats used in these files, which I document here for the edification of all. I'm not sure why this data is all in binary, but my best guess is that at least some of it represents part of the backend database used by the device's SNMP agent. In any case, there are two directories on one of the disk's partitions that contain log data of interest: /nvmem and /VAR/ADM/PIPITLOG.
I began my analysis with the PIPITLOG folder, because it contained the most potentially useful data: specifically, a file named LMCOM.LOG, whose records list email addresses, the string "Attached Image" (which appears as the subject on scan emails sent from the device), the string "MAIL", and four separate suspected timestamp values. All of the interesting values except for the timestamps are obvious strings that occur at fixed offsets, and thus would be fairly simple to extract. Unfortunately, without the decoded timestamps, the remaining values are of strictly limited utility.
After trying various standard timestamp decoding methods, and getting nowhere, I copied all the file data out as hex, arranged it in lines of the same length as the apparent record size, and then examined the various columns of my suspected timestamp fields. After a while, I started trying to encode various date parameters in various ways and look for them in the data. Eventually, I noticed a (mostly) invariant "Dx 7D" (where x is a varying hex digit) which, for a section of the file, morphed to "Ex 7D". Read as little-endian, these decode to 7DDx and 7DEx. 7DD is 2013 in decimal, and 7DE is 2014. Hypothesizing that this was intended to be the year, I quickly identified two other columns (by their restricted ranges) as the hour and minute respectively. The month took a little longer. It was the x value that formed the lower nibble of the least significant year byte (Dx and Ex). Its restricted range suggested that it might be the month, but nothing immediately stood out as a day-of-month. Meanwhile, the byte immediately to its left looked like some sort of counter that repeatedly climbed to or near FF and then dropped back to 1, just as the nibble incremented, which suggested that the nibble was really just the most significant 4 bits of that counter.
Ultimately, it turned out that this was, in fact, the month, but the byte immediately to the left of it was split into a 5-bit quantity for the day-of-month (the upper bits) and a 3-bit quantity for the day-of-week (the lower bits).
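The column-by-column inspection described above can be roughed out in a few lines of Python. This is only a sketch of the idea, not what I actually ran. The helper name column_ranges is made up, and record_size is something you have to work out for yourself from the file, since I don't give the LMCOM.LOG record length here.

```python
# Slice a binary log into fixed-size records and report the value range of
# each byte column. Narrow ranges hint at fields like hour or minute.
# Assumption: records are fixed-length and start at offset 0.

def column_ranges(data: bytes, record_size: int):
    """Return a list of (min, max) byte values, one entry per column."""
    if record_size <= 0:
        raise ValueError("record_size must be positive")
    # Keep only complete records; a trailing partial record is dropped.
    records = [data[i:i + record_size]
               for i in range(0, len(data) - record_size + 1, record_size)]
    # zip(*records) walks the records column by column.
    return [(min(col), max(col)) for col in zip(*records)]


if __name__ == "__main__":
    # Synthetic three-byte records, purely for demonstration.
    sample = bytes([9, 0x1F, 0xAA, 10, 0x20, 0xBB, 11, 0x00, 0xCC])
    for index, (low, high) in enumerate(column_ranges(sample, 3)):
        hint = "possible minute/hour" if high <= 59 else ""
        print(f"column {index}: {low:02X}-{high:02X} {hint}")
```

A column that never exceeds 23 is a reasonable hour candidate, and one that never exceeds 59 a reasonable minute candidate, but a range test alone can't confirm either.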
So for example, in this file, the value "49 D9 7D 48 71 1F 0F" would decode as follows:
49 = 0100 1001 in binary. The upper 5 bits (day of month) are 01001, which is 9 in decimal. The lower 3 bits (day of week) are 001, or 1 in decimal. 0 would be Sunday, so this is Monday, the 9th of the month.
D9 7D is 7DD9 in little endian. The first three nibbles of that are 7DD, which is the year, 2013, and the last nibble is 9, which is September.
To the right of this value, we ignore two bytes (48 71). BTW, the 2nd nibble of this quantity is very oddly restricted: it's always either 0 or 8. Anyone have a suggestion what it might be? Neither of these bytes is restricted to 0-59, so they can't be seconds.
Then 1F is minutes, 31 in decimal.
And the last byte, 0F, is hours, 15 in decimal.
Thus, the final date is Monday, Sept 9th, 2013, at 15:31.
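The decoding steps for this format can be expressed as a short Python sketch. The function name decode_lmcom_timestamp is mine, and the layout is only what I inferred from this one device, so treat it as a starting point rather than a specification. The two bytes I couldn't identify are skipped.

```python
from datetime import datetime

# Inferred 7-byte layout (assumption based on a single device's LMCOM.LOG):
#   byte 0     : day-of-month (upper 5 bits) | day-of-week (lower 3 bits)
#   bytes 1-2  : little-endian 16-bit value; upper 12 bits year, lower 4 month
#   bytes 3-4  : unknown, ignored
#   byte 5     : minute
#   byte 6     : hour


def decode_lmcom_timestamp(raw: bytes):
    """Return (datetime, day_of_week) where day_of_week 0 means Sunday."""
    if len(raw) != 7:
        raise ValueError("expected exactly 7 bytes")
    day_of_month = raw[0] >> 3
    day_of_week = raw[0] & 0x07
    year_month = int.from_bytes(raw[1:3], "little")
    year = year_month >> 4
    month = year_month & 0x0F
    minute = raw[5]
    hour = raw[6]
    return datetime(year, month, day_of_month, hour, minute), day_of_week


if __name__ == "__main__":
    stamp, dow = decode_lmcom_timestamp(bytes.fromhex("49 D9 7D 48 71 1F 0F"))
    print(stamp, "day-of-week field:", dow)
    # Sanity check: datetime.weekday() uses Monday=0, the log uses Sunday=0.
    print("weekday consistent:", (stamp.weekday() + 1) % 7 == dow)
```

Comparing the stored day-of-week against the one computed from the date is a cheap way to catch a mis-aligned field when running this over a whole file.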
The /nvmem directory contains the files that I suspect (because of the presence of one file named MIB2.dat in that directory) to be associated with the device's SNMP agent. The files of interest have names such as JobScn.xxx (where xxx is a number from 000 through 029), JobPdl.xxx (where xxx is a number from 000 through 049), and JobCpy.xxx (where xxx is a number from 000 through 049). There are other files present as well, some of them formatted similarly, but on this particular device, most were empty of useful data. Of all these files, the JobPdl.xxx files appeared to contain the most, and most useful information; specifically, account names, print job names, and document filenames, in conjunction with multiple timestamps. The only useful information I identified in the other files were timestamps, presumably those of individual copy or scan jobs being scheduled, started, and/or completed.
The timestamps in these files were actually a bit easier to decode, since they don't rely on bit-level fields within the data values. The date format for these begins with a two-byte little-endian quantity for the year, then a byte for the Month, one for Day of Week, one for Day of Month, one for Hour, and one for Minute.
Thus in this file, the date sequence "DD 07 03 04 0E 10 25" would be decoded thusly:
DD 07 = little-endian 07DD = 2013
03 = March
04 = Thursday
0E = 14th
10 = 16 (hour)
25 = 37 (minute)
So the full decoded date is Thursday, March 14th, 2013, at 16:37.
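Because every field in this format sits on a byte boundary, a sketch of a decoder is very short. As before, the function name decode_nvmem_timestamp is made up, and the field order is simply what I observed in the Job*.xxx files on this one device.

```python
import struct
from datetime import datetime

# Inferred 7-byte layout (assumption based on a single device's /nvmem files):
#   little-endian 16-bit year, then one byte each for month, day-of-week,
#   day-of-month, hour, and minute.
NVMEM_TIMESTAMP = struct.Struct("<H5B")


def decode_nvmem_timestamp(raw: bytes):
    """Return (datetime, day_of_week) where day_of_week 0 means Sunday."""
    year, month, day_of_week, day_of_month, hour, minute = (
        NVMEM_TIMESTAMP.unpack(raw)
    )
    return datetime(year, month, day_of_month, hour, minute), day_of_week


if __name__ == "__main__":
    stamp, dow = decode_nvmem_timestamp(bytes.fromhex("DD 07 03 04 0E 10 25"))
    print(stamp, "day-of-week field:", dow)
```

Passing anything other than exactly 7 bytes raises struct.error, and an out-of-range month or day raises ValueError, which is handy for rejecting offsets that aren't really timestamps.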
Happy Forensicating!
John
As always, please feel free to leave commentary if you liked this article or want to call me on the carpet for some inaccuracy.
John McCash, GCFA Silver #2816, is currently a Forensic Investigator - previously employed by a Fortune 500 telecommunications equipment provider, and now making a foray into the world of consulting.