Earlier this year, a lifetime ago in internet years, I published a series of posts on the FAT file system. Over the next few months, I'll be publishing a similar series on NTFS. Much of the information in these posts will come from Brian Carrier's excellent book, File System Forensic Analysis, as well as articles from Microsoft and other sources. Where applicable, specific sources will be cited within each blog post.
On day one of SANS Sec 508: Computer Forensics, Investigation and Response, we cover the most common file systems in detail. Almost without fail, someone asks whether the material is really important, or how many times I've had to break out a hex editor and manually parse file system data structures. The truth is that I have yet to encounter a drive image that required me to process data structures with a hex editor, but I know more experienced investigators who have, and on many occasions. Regardless of your experience, I believe understanding how file systems work and how common tools parse them will make you a better forensic investigator. Naturally, this series will contain hex dumps and lots of screenshots.
Compared with FAT, NTFS is a more advanced file system. At the start of a FAT partition is the boot sector, the first 512 bytes of the partition. Following the boot sector, there may be a number of reserved sectors, then the primary and secondary File Allocation Tables. Depending on the particular flavor of FAT, the root directory may lie outside the portion of the disk called the cluster area, where file content actually resides.
In NTFS, there are no reserved sectors. Even the boot sector is referenced by NTFS's metadata structure, the Master File Table (MFT).
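To make that a little more concrete, here is a rough Python sketch that pulls a few fields out of an NTFS boot sector. The byte offsets follow the commonly documented boot sector layout; the function name, the dictionary keys and the synthetic sample sector are my own inventions for illustration, so treat this as a sketch and not as how any particular tool does it.

```python
import struct


def parse_ntfs_boot_sector(sector: bytes) -> dict:
    """Extract a few layout fields from the first 512 bytes of an NTFS partition."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    if sector[3:11] != b"NTFS    ":
        raise ValueError("OEM ID is not 'NTFS    '")
    # Offset 11: bytes per sector (2), sectors per cluster (1), reserved sectors (2).
    bytes_per_sector, sectors_per_cluster, reserved_sectors = struct.unpack_from(
        "<HBH", sector, 11
    )
    # Offsets 48 and 56: starting clusters of the MFT and of its backup copy.
    mft_cluster, mft_mirror_cluster = struct.unpack_from("<QQ", sector, 48)
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "bytes_per_cluster": bytes_per_sector * sectors_per_cluster,
        "reserved_sectors": reserved_sectors,
        "mft_start_cluster": mft_cluster,
        "mft_mirror_start_cluster": mft_mirror_cluster,
    }


# Synthetic boot sector built in memory; every value here is illustrative only.
sample = bytearray(512)
sample[3:11] = b"NTFS    "
struct.pack_into("<HBH", sample, 11, 512, 8, 0)
struct.pack_into("<QQ", sample, 48, 4515, 6773)
sample[510:512] = b"\x55\xaa"  # end-of-sector signature

info = parse_ntfs_boot_sector(bytes(sample))
for key, value in info.items():
    print(f"{key}: {value}")
```

Against a real image you would read the 512 bytes at the start of the partition instead of building them by hand. Note that the reserved sectors field comes back as zero here, in keeping with the point above.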
One of the first tools I reach for when processing a disk image is fsstat from Carrier's The Sleuth Kit. Below is a screenshot of fsstat's output against a newly formatted NTFS partition:
At the top of the output is the file system information, which is self-explanatory. Below that is the metadata for the partition. Think of the metadata information like a card catalog entry in a library. Metadata tells us where a file's contents can be found on disk, who owns the file, when it was created, last accessed and modified, and a slew of other information that we'll eventually talk about.
In the metadata section, we see that the MFT starts in cluster 4515 and the backup copy of the MFT begins in cluster 6773. Remember that clusters are made up of one or more sectors, are addressable and can be allocated or unallocated. MFT entries, which Microsoft calls File Record Segments (FRS), are like card catalog entries for files in the file system. In this series, I'll adopt Carrier's terminology and refer to an FRS as an MFT entry, but be aware that you will see FRS in other sources. Every file on an NTFS partition will have at least one MFT entry. We'll pick apart MFT entries in future posts.
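If you ever do need to find those clusters in a hex editor, the arithmetic is simple. The sketch below converts a cluster number into a sector number and a byte offset. The sector and cluster sizes are assumptions on my part (512-byte sectors, 8 sectors per cluster); take the real values from fsstat's output for your image. The offsets are relative to the start of the partition, not the start of the disk.

```python
def cluster_to_sector(cluster, sectors_per_cluster=8):
    """Return the first sector of a cluster, relative to the partition start."""
    return cluster * sectors_per_cluster


def cluster_to_byte_offset(cluster, bytes_per_sector=512, sectors_per_cluster=8):
    """Return the byte offset of a cluster, relative to the partition start."""
    return cluster_to_sector(cluster, sectors_per_cluster) * bytes_per_sector


# Cluster numbers from the fsstat output discussed above.
# The default sector and cluster sizes are assumed, not taken from the output.
for label, cluster in (("MFT", 4515), ("MFT backup", 6773)):
    print(
        f"{label}: cluster {cluster}, "
        f"sector {cluster_to_sector(cluster)}, "
        f"byte offset {cluster_to_byte_offset(cluster)}"
    )
```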
Next, fsstat tells us index records are 4096 bytes. What are index records? According to Carrier's book (see pages 290 - 296), NTFS indexes are B-trees that store certain attributes, and an index record is the fixed-size block that holds the entries for a node of such a tree. B-trees are self-balancing tree structures that keep their data sorted, so they can be searched quickly and managed efficiently. File names are an attribute type that is indexed via B-trees.
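To show why sorted tree nodes make for fast lookups, here is a minimal, generic B-tree search in Python. This is not the NTFS on-disk index format. The Node class, the search function and the file names are all hypothetical, and the names are compared as plain Python strings.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                    # sorted names held in this node
    children: list = field(default_factory=list)  # empty for a leaf, else len(keys) + 1


def search(node, name):
    """Return True if name is stored somewhere in the tree rooted at node."""
    while True:
        # Binary search inside the node: index of the first key >= name.
        i = bisect_left(node.keys, name)
        if i < len(node.keys) and node.keys[i] == name:
            return True
        if not node.children:
            return False  # reached a leaf without finding the name
        # Descend into the only child whose key range can contain the name.
        node = node.children[i]


# A tiny hand-built tree: one root node with three leaf children.
root = Node(
    keys=["hosts.txt", "pagefile.sys"],
    children=[
        Node(keys=["autoexec.bat", "boot.ini"]),
        Node(keys=["notes.doc", "ntldr"]),
        Node(keys=["readme.txt", "setup.exe"]),
    ],
)

print(search(root, "ntldr"))
print(search(root, "missing.dll"))
```

Each step discards every subtree that cannot hold the name, so a lookup touches only one node per level of the tree.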
Following that we see the current range of MFT entries, 0 - 64. The previous values in the metadata section were given in cluster numbers or bytes, but the values here (0 - 64) are MFT entry numbers. More MFT entries will be created as more files are created in the file system. Finally, in the metadata section, we see that MFT entry five contains the metadata for this partition's root directory.
That's enough for today. We'll delve deeper in the next post, and at some point in the series we'll give away some books courtesy of Syngress Publishers. If you're interested in this type of information and want to learn more, I'll be teaching SANS Security 508: Computer Forensics, Investigation and Response in South Lake Tahoe, CA from January 25 through January 30.
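An MFT entry number can be turned into a byte offset with a little arithmetic. The sketch below assumes 1024-byte MFT entries, which is the usual size but should be confirmed for your image, and it assumes the entry you want sits in the first contiguous run of the MFT. The function names and the synthetic in-memory image are hypothetical, there only so the example runs without a real disk image.

```python
import io

MFT_ENTRY_SIZE = 1024  # assumption: the usual size, confirm it for your image


def mft_entry_offset(entry_number, mft_start_cluster, cluster_size,
                     entry_size=MFT_ENTRY_SIZE):
    """Byte offset of an MFT entry, relative to the start of the partition.

    Only valid while the entry lies in the first contiguous run of the MFT.
    """
    return mft_start_cluster * cluster_size + entry_number * entry_size


def read_mft_entry(image, entry_number, mft_start_cluster, cluster_size,
                   entry_size=MFT_ENTRY_SIZE):
    """Read one MFT entry from a seekable file-like object and check its signature."""
    image.seek(mft_entry_offset(entry_number, mft_start_cluster,
                                cluster_size, entry_size))
    data = image.read(entry_size)
    if data[:4] != b"FILE":
        raise ValueError(f"entry {entry_number} does not start with 'FILE'")
    return data


# Synthetic 16 KB "partition": MFT at cluster 2, 4096-byte clusters (made-up values).
raw = bytearray(16384)
offset = mft_entry_offset(5, mft_start_cluster=2, cluster_size=4096)
raw[offset:offset + 4] = b"FILE"

entry = read_mft_entry(io.BytesIO(raw), 5, mft_start_cluster=2, cluster_size=4096)
print(offset, len(entry), entry[:4])
```

With a real image you would open the partition (or seek past the partition's starting offset within a full disk image) and pass in the MFT starting cluster and cluster size that fsstat reports.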
Dave Hull, GCFA, GCIH, GREM, CISSP, is founder of Trusted Signal and describes his working life as "on the Venns" of incident response, digital investigations and web application security.