One of the basics of forensics is gathering the ASCII and Unicode strings from a file system and searching them for keywords. On Linux, we can gather both the ASCII and the Unicode strings with the strings command.
To Gather the ASCII Strings
# strings -td /dev/sdb > sdb.ascii
Note: The "-td" in the above line tells strings to print, in decimal, the byte offset at which each string starts.
To Gather the Unicode Strings
# strings -td -el /dev/sdb > sdb.unicode
Note: The "-el" option will have the strings command handle 16-bit little endian encoding. Strings can handle other types of encoding such as 32-bit big/little endian. See the man page on strings and the -e option.
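The two commands above can be wrapped in a small helper so that both passes always use the same options. This is only a minimal sketch: the function name gather_strings and the PREFIX.ascii / PREFIX.unicode naming are assumptions made for illustration, and it assumes the GNU binutils version of strings.

```shell
#!/bin/sh
# gather_strings SOURCE PREFIX   (hypothetical helper, not a standard tool)
# Writes PREFIX.ascii and PREFIX.unicode. Each output line starts with the
# decimal byte offset of the string within SOURCE.
gather_strings() {
    src=$1
    prefix=$2
    # Pass 1: single-byte strings (the default encoding for strings)
    strings -td "$src" > "$prefix.ascii" || return 1
    # Pass 2: 16-bit little-endian strings
    strings -td -el "$src" > "$prefix.unicode" || return 1
}

# Example call (commented out because it needs a real device or image):
# gather_strings /dev/sdb sdb
```

Calling it as "gather_strings /dev/sdb sdb" would produce the same sdb.ascii and sdb.unicode files as the two commands above.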
Below is a sample output from the command:
192301896 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlNone">
192301972 <summary>This field is deprecated. Deprecated components of Microsoft DirectX 9.0 for Managed Code are considered obsolete. While these components are still supported in this release of DirectX 9.0 for Managed Code, they may be removed in the future. When writing new applications, you should avoid using these deprecated components. When modifying existing applications, you are strongly encouraged to remove any dependency on these components.Deprecated.</summary>
192302446 </member>
192302461 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlRtsDtr">
192302539 <summary>This field is deprecated. Deprecated components of Microsoft DirectX 9.0 for Managed Code are considered obsolete. While these components are still supported in this release of DirectX 9.0 for Managed Code, they may be removed in the future. When writing new applications, you should avoid using these deprecated components. When modifying existing applications, you are strongly encouraged to remove any dependency on these components.Deprecated.</summary>
192303013 </member>
192303028 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlXonXoff"SZDD
Now that we have the output we can use a variety of tools to search for keywords in the output files. Some examples are:
grep -i keyword sdb.ascii > sdb.ascii.keyword
"-i" tells grep to ignore case. This is a pretty useful option, as we do not always know how the keyword will be capitalized.
grep -i -f keywords.txt sdb.ascii > sdb.ascii.keywords
The "-f" option in the above command tells grep to read its patterns from a keyword file, one per line, so you can list all of the keywords you are looking for in one place.
egrep --color -i -f keywords.txt sdb.ascii
Egrep is equivalent to doing a "grep -E". It allows for extended regular expressions, which is a topic in itself. The key thing to pick up on in the above command is the --color option (note the two dashes). This prints any matching keyword in a different color. On my Fedora systems, the keyword is in red. One thing to note: if you pipe the egrep output to another command or redirect it to a file, you will lose the color on the matching text. It is a nice way to make a keyword pop out when doing a quick search.
Perl programs such as those at https://blogs.sans.org/computer-forensics/2008/12/03/perl-and-forensics/ and http://www.citadelsystems.net/index.php/forensics-tools/36-word-search/53-wordsearchpl can also be used to search the output.
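The grep examples above can be tried end to end on a small sample. The sketch below is illustrative only: the temporary directory, the three sample lines (taken from the output shown earlier) and the two keywords are assumptions standing in for a real sdb.ascii and keywords.txt. It also adds "-F", which makes grep treat each keyword as a literal string instead of a regular expression.

```shell
#!/bin/sh
# Hypothetical sample data standing in for real strings output.
workdir=$(mktemp -d)
cat > "$workdir/sdb.ascii" <<'EOF'
192301896 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlNone">
192302446 </member>
192303028 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlXonXoff"SZDD
EOF

# Keyword file: one keyword per line.
printf '%s\n' 'directx' 'xonxoff' > "$workdir/keywords.txt"

# -i ignores case, -F matches the keywords literally, -f reads them from a file.
grep -i -F -f "$workdir/keywords.txt" "$workdir/sdb.ascii" > "$workdir/sdb.ascii.keywords"

# Print only the offsets (first field) of the matching lines.
awk '{ print $1 }' "$workdir/sdb.ascii.keywords"
```

Be careful not to leave a blank line in the keyword file, since an empty pattern matches every line of the input.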
Offset Math
Sometimes you want to take a closer look at the cluster or block where your keyword was found. Using the offsets listed in the strings output, you can quickly figure out where the keyword sits on the drive or in the file. For example:
192303028 <member name="F:Microsoft.DirectX.DirectPlay.Address.FlowControlXonXoff"SZDD
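The offset here is 192303028 for our DirectX keyword. For this NTFS file system, the cluster size is 4096 bytes. Keep in mind that the offset is counted from the start of whatever strings read. If that was a whole disk such as /dev/sdb rather than a single partition, subtract the partition's starting byte offset first so that the result lines up with the file system's cluster numbering. To figure out which cluster DirectX is in, divide the offset by the cluster size and drop the fractional part:
Offset / cluster size, or
192303028 / 4096 = 46948.9814453125, or cluster 46948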
If you wanted the sector where the keyword is located:
192303028 / 512 = 375591.8515625 or sector 375591
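The same arithmetic can be done directly in the shell. This is a minimal sketch using the numbers from the example above; the variable names are made up for illustration. Shell arithmetic works on integers only, so the fractional part is dropped automatically, and the remainder shows how far into the cluster the string starts.

```shell
#!/bin/sh
offset=192303028      # byte offset reported by strings -td
cluster_size=4096     # bytes per cluster for this NTFS file system
sector_size=512       # bytes per sector

# Integer division discards the fraction, giving the cluster/sector number.
cluster=$(( offset / cluster_size ))
sector=$(( offset / sector_size ))

# Remainder: byte position of the string inside its cluster.
offset_in_cluster=$(( offset % cluster_size ))

echo "cluster=$cluster sector=$sector offset_in_cluster=$offset_in_cluster"
# prints: cluster=46948 sector=375591 offset_in_cluster=4020
```

With the cluster number in hand, something like "dd if=/dev/sdb bs=4096 skip=46948 count=1 | xxd" should dump that one cluster for a closer look, provided the device named is the same one strings was run against.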
Figuring Out Cluster Size
You can use the "ntfsinfo" command to figure out the cluster size for an NTFS file system. To do this use:
# ntfsinfo --mft /dev/sda1
Volume Information
Name of device: /dev/sda1
Device state: 11
Volume Name:
Volume State: 1
Volume Version: 3.1
Sector Size: 512
Cluster Size: 4096
Volume Size in Clusters: 13181323
In the above output, the "Sector Size" and "Cluster Size" lines give the sector size and the cluster size.
For Linux ext2/ext3/ext4 file systems, the block size can be found with the "tune2fs" command. I have piped it to grep as the output can be lengthy.
# tune2fs -l /dev/sda2 | grep Block
Block count: 12799788
Block size: 4096
Blocks per group: 32768
Again, the "Block size" line gives the block size.
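If you want to feed the cluster or block size straight into the offset math, the value can be pulled out of either command's output with awk. The helper below is a sketch under assumptions: the name parse_size is invented for illustration, and the sample text is copied from the output shown above rather than read from a live device.

```shell
#!/bin/sh
# parse_size LABEL   (hypothetical helper)
# Reads "Label: value" lines on stdin and prints the numeric value for LABEL.
parse_size() {
    awk -F: -v label="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim whitespace around the label
        }
        key == label {
            val = $2
            gsub(/[^0-9]/, "", val)            # keep only the digits
            print val
            exit
        }'
}

# Sample text copied from the ntfsinfo and tune2fs output above.
ntfs_sample='Sector Size: 512
Cluster Size: 4096
Volume Size in Clusters: 13181323'

ext_sample='Block count: 12799788
Block size: 4096
Blocks per group: 32768'

printf '%s\n' "$ntfs_sample" | parse_size "Cluster Size"
printf '%s\n' "$ext_sample" | parse_size "Block size"
```

On a live system the same helper could be used as "ntfsinfo --mft /dev/sda1 | parse_size 'Cluster Size'" or "tune2fs -l /dev/sda2 | parse_size 'Block size'".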
There you have it, the basics of using the strings command and how to calculate the cluster/block/sector for where the keyword can be found.
Keven Murphy, GCFA Gold #24, is the Senior Forensics Specialist for a Fortune 100 defense contractor.