Summary
In digital forensic analysis it is sometimes necessary to determine whether an e-mail has been falsified. This paper reviews certain properties of the Outlook Messaging Application Programming Interface (MAPI) that can help in identifying falsified e-mails or altered appointments in a Microsoft Outlook/Exchange environment.
About the libpff Project
In 2008 Joachim Metz, a forensic investigator at Hoffmann Investigations, started the libpff project. At that time the best source on the Personal Folder File (PFF) format in the public domain was the libpst project. The libpst project dated back to 2002 and had been contributed to and maintained by David Smith, Joe Nahmias, Brad Hards and Carl Byington.
However, at that time libpst was not a library and had no support for recovering deleted items in PST and OST files. The initial goal of the libpff project was to create a shared library for PST and OST files with support for recovering deleted items. Recovering deleted items requires detailed knowledge of the inner structures of the PFF format. This was the beginning of an interesting journey, in which additional information about the inner structures was still being discovered recently, such as the 6c and 8c tables and the use of indirection in large tables.
In March 2009 PFF forensics was first discussed as part of Microsoft Office forensics in the Hoffmann Advanced Forensic Sessions (HAFS). A paper titled 'Personal Folder File (PFF) forensics' was published as part of the HAFS. This paper explains the basics of the PFF format, which can be quite a challenge to understand. One of the main conclusions of both the paper and the seminar was that different forensic tools provide different results when recovering deleted items in PST and OST files.
In the meantime the libpff project has evolved. Continued analysis of the PFF format and several contributions have uncovered new aspects of the file format, among them the PFF items that contain information about the recipients, sub folders, sub messages and sub associated items.
A lot of information about the MAPI has also been made available. The OpenChange project provides libmapi, which contains an open source implementation of the MAPI, and the MFCMAPI project has provided a lot of MAPI information that is now available on MSDN.
Within Hoffmann Investigations libpff has been put to work for two purposes: first, as a tool to cross-reference findings from other forensic tools, and second, as a tool that can provide more information about PST and OST files than those forensic tools. PFF forensics will therefore once more be the subject of discussion at the upcoming Hoffmann Advanced Forensic Sessions in November 2009. In the meantime, several of the interesting findings are provided in this paper.
1. Introduction
Wouldn't it be nice to have your forensic analysis software filter out falsified e-mails and appointments for you? Unfortunately, most current forensic tools provide little information about the authenticity of e-mail messages and appointments, so certain analyses have to be done manually. This paper will give you an understanding of parts of the Outlook Messaging Application Programming Interface (MAPI) to help identify falsified e-mails in Microsoft Outlook/Exchange environments.
1.1. Background
If you are a forensic investigator working in corporate environments, you are probably dealing with Microsoft Outlook and Exchange most of the time. What you might not know is that both make heavy use of the MAPI. The MAPI is not only a programming interface but also a useful source of information about the properties of e-mail messages. For those of you not familiar with analyzing the Personal Folder File format used by Microsoft Outlook for PST and OST files, I advise reading [METZ09] before reading this paper.
2. Falsified e-mail message
In a recent investigation we had to determine whether a user had sent an e-mail at a certain date and time. We started by establishing whether the e-mail existed in the mailboxes of both the sender and the recipients. But there were other characteristics that were highly interesting from a forensic point of view.
A certain e-mail dated March 10, 2009 was forwarded on March 17, 2009. The original e-mail could not be found in any of the mailboxes. The first indication of falsification was a discoloring of the day of the month in a print-out of the forwarded e-mail: the 0 in 'March 10' was gray, while the surrounding text was clearly black.
2.1. The e-mail body
In Outlook/Exchange an e-mail message can contain RTF and/or HTML body text. Both the RTF and HTML formats use formatting codes. Using these formatting codes we did a low-level analysis of the body text. Most of the available forensic tools do not provide access to these formatting codes, but fortunately libpff and its tools do.
We compiled libpff with verbose and debug output and had pffexport export the PST file with the verbose option (-v), which gave us a detailed debug log file. In this log file we looked up the e-mail and its RTF body. In the RTF body the following information was found:
{\*\htmltag84 <b>}\htmlrtf {\b \htmlrtf0 Sent: {\*\htmltag92 </b>}\htmlrtf }\htmlrtf0 Tuesday March 1 {\*\htmltag84 <span style='color:#1F497D'>}\htmlrtf {\htmlrtf0 0 {\*\htmltag92 </span>}\htmlrtf }\htmlrtf0 , 2009 13:48 {\*\htmltag116 }\htmlrtf \line \htmlrtf0 {\*\htmltag4 \par }
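Using other forwarded e-mails as a reference, we established that the span formatting code, which sets a separate color on the 0, should not be there. The bold formatting of 'Sent:' is normal for a forwarded e-mail.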
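As a rough illustration of how such formatting codes can be surfaced, the following Python sketch lists the HTML tags embedded in the \*\htmltag groups of an RTF body and flags styled span tags. The function names are hypothetical and not part of libpff, and the regular expression only handles simple, non-nested groups like those in the fragment above.

```python
import re

# Matches simple groups such as {\*\htmltag84 <b>} and captures the HTML part.
# Assumption: the group itself contains no nested braces.
HTMLTAG_GROUP = re.compile(r"\{\\\*\\htmltag\d+ ?([^{}]*)\}")


def embedded_html_tags(rtf_body):
    """Return the HTML fragments stored in \\*\\htmltag groups, in order."""
    return [match.group(1).strip() for match in HTMLTAG_GROUP.finditer(rtf_body)]


def styled_spans(rtf_body):
    """Return the opening span tags that carry an inline style attribute."""
    return [
        tag for tag in embedded_html_tags(rtf_body)
        if tag.lower().startswith("<span") and "style=" in tag.lower()
    ]


# The RTF fragment from the debug log, as a raw string.
fragment = (
    r"{\*\htmltag84 <b>}\htmlrtf {\b \htmlrtf0 Sent: "
    r"{\*\htmltag92 </b>}\htmlrtf }\htmlrtf0 Tuesday March 1 "
    r"{\*\htmltag84 <span style='color:#1F497D'>}\htmlrtf {\htmlrtf0 0 "
    r"{\*\htmltag92 </span>}\htmlrtf }\htmlrtf0 , 2009 13:48 "
    r"{\*\htmltag116 }\htmlrtf \line \htmlrtf0 {\*\htmltag4 \par }"
)

for tag in styled_spans(fragment):
    print("styled span found:", tag)
```

A styled span is not suspicious by itself. It only becomes a lead when comparable messages from the same client, such as other forwarded e-mails, do not contain it at that position.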
2.2. Conversation index
Looking at existing e-mail messages, we hypothesized that the original e-mail was not created on March 10, 2009 but was in fact an e-mail created on March 17, 2009 that had been altered. We wanted proof beyond the absence of the original e-mail message from the mailboxes of the sender and the recipients.
An MSDN article titled 'Tracking conversations' provided us with a fairly reliable answer.
[MSDN] states that:
PR_CONVERSATION_INDEX (PidTagConversationIndex) indicates the position of the message within a particular conversation. It is a client's responsibility to set PR_CONVERSATION_INDEX for each outgoing message, whether it is a new message, a forwarded message, or a reply. Clients can set this property manually or call ScCreateConversationIndex, a utility function provided by MAPI. ScCreateConversationIndex generates the value of a conversation index for any outgoing message. ScCreateConversationIndex implements the index as a header block that is 22 bytes in length, followed by zero or more child blocks each 5 bytes in length.

The header block is composed of 22 bytes, divided into three parts:
* One reserved byte. Its value is 1.
* Five bytes for the current system time converted to the FILETIME structure format.
* Sixteen bytes holding a GUID, or globally unique identifier.

Each child block is composed of 5 bytes, divided as follows:
* One bit containing a code representing the difference between the current time and the time stored in the header block. This bit will be 0 if the difference is less than .02 second and greater than two years and 1 if the difference is less than one second and greater than 56 years.
* Thirty one bits containing the difference between the current time and the time in the header block expressed in FILETIME units. This part of the child block is produced using one of two strategies, depending on the value of the first bit. If this bit is zero, ScCreateConversationIndex discards the high 15 bits and the low 18 bits. If this bit is one, the function discards the high 10 bits and the low 23 bits.
* Four bits containing a random number generated by calling the Win32 function GetTickCount.
* Four bits containing a sequence count that is taken from part of the random number.
Reverse-engineering this description against the PFF format, I found that the 'One reserved byte' with a value of 1 in the header block is actually the first byte of the filetime. The filetime therefore occupies 6 bytes, not 5. The date and time in the header block of the conversation index match the creation date and time of e-mail messages. Contrary to the MSDN specification, the child block contains the difference between the current time and the previous time, not the time stored in the header block. This was validated using the creation date and time of multiple e-mails.
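The layout described above can be sketched in Python as follows. This is a minimal illustration and not the libpff implementation. It assumes that the 6 header bytes are the most significant bytes of a big-endian FILETIME, that the GUID bytes can be read in plain order, that the random number sits in the high nibble of the last child byte, and that each child delta is relative to the previous time, as found during validation. The function names are hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a 64-bit FILETIME (100 ns units since 1601-01-01 UTC)."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def parse_conversation_index(data):
    """Return (header_time, guid, children) for a conversation index value."""
    if len(data) < 22 or (len(data) - 22) % 5 != 0:
        raise ValueError("unexpected conversation index size")

    # Header: upper 6 bytes of a FILETIME (the low 16 bits are not stored),
    # followed by a 16-byte GUID. The GUID byte order is an assumption.
    header_filetime = int.from_bytes(data[0:6], "big") << 16
    guid = uuid.UUID(bytes=bytes(data[6:22]))

    children = []
    filetime = header_filetime
    for offset in range(22, len(data), 5):
        value = int.from_bytes(data[offset:offset + 4], "big")
        # The first bit selects how many low bits of the delta were discarded.
        shift = 23 if value & 0x80000000 else 18
        # Assumption from the validation above: deltas accumulate from the
        # previous time instead of always counting from the header time.
        filetime += (value & 0x7FFFFFFF) << shift
        last_byte = data[offset + 4]
        children.append({
            "time": filetime_to_datetime(filetime),
            "random": last_byte >> 4,       # assumed to be the high nibble
            "sequence": last_byte & 0x0F,   # assumed to be the low nibble
        })
    return filetime_to_datetime(header_filetime), guid, children
```

Because the low bits of every time value are discarded, the reconstructed dates are approximations, so comparisons against creation times should allow for a small tolerance.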
The conversation index for the specific e-mail translates to:
0x0071 (PidTagConversationIndex : Conversation index)
0x0102 (PT_BINARY : Binary data)
  Header block:
    Filetime : Mar 17, 2009 10:13:04 UTC
    GUID : 11111111-2222-3333-4444-555555555555
  Child block: 1
    Filetime : Mar 17, 2009 10:18:03 UTC
    Random number : 2
    Sequence count : 0
  Child block: 2
    Filetime : Mar 17, 2009 10:24:01 UTC
    Random number : 9
    Sequence count : 0
  Child block: 3
    Filetime : Mar 17, 2009 10:42:39 UTC
    Random number : 9
    Sequence count : 0
  Child block: 4
    Filetime : Mar 17, 2009 10:45:36 UTC
    Random number : 14
    Sequence count : 0
  Child block: 5
    Filetime : Apr 17, 2009 07:19:08 UTC
    Random number : 8
    Sequence count : 0
Note that the precision of the date and time difference in the child blocks varies, and that the last child block does not match the creation date and time of the message. The actual reason for this variation is as yet unknown.
0x3007 (PidTagCreationTime : Creation time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Apr 17, 2009 08:41:20 UTC
However, the date March 10, 2009 does not appear anywhere in the conversation index. Based on the conversation indexes of other forwarded and replied-to e-mail messages, we would expect the date of the original e-mail to appear there. Note that the GUID '11111111-2222-3333-4444-555555555555' in this example was altered for publication. Using the actual GUID we found corresponding e-mails with the same GUID in their conversation index. Most of these e-mails had different content, and all of them had a creation date of March 17, 2009. These findings supported our hypothesis: it was plausible that the e-mail with the discolored zero in 'March 10' had been falsified using another e-mail created on March 17, 2009. When confronted with the findings in an interview, the sender of the e-mail admitted that he had altered the e-mail.
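Finding e-mails that share a conversation GUID can be sketched as below. This is a minimal Python illustration that assumes the messages are already available as pairs of an identifier and the raw conversation index bytes. How those pairs are obtained from a PST or OST file is left out, and the function name is hypothetical.

```python
from collections import defaultdict


def group_by_conversation_guid(messages):
    """Group message identifiers by the GUID in the conversation index header.

    `messages` is an iterable of (identifier, conversation_index_bytes) pairs.
    The GUID occupies bytes 6 up to 22 of the 22-byte header block.
    """
    groups = defaultdict(list)
    for identifier, conversation_index in messages:
        if len(conversation_index) < 22:
            continue  # too short to contain a complete header block
        guid_hex = bytes(conversation_index[6:22]).hex()
        groups[guid_hex].append(identifier)
    return dict(groups)
```

Messages that end up in the same group but differ in content or in creation date are candidates for closer manual inspection.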
3. The modified appointment
In another investigation we found an appointment whose conversation topic contained one of the keywords we were looking for. However, the appointment had an entirely different subject, and the last modification date and time already indicated that the appointment had been modified at a later date.
We needed to be certain that this behavior was caused by modifying an appointment. Using Outlook we created a PST file with an appointment. Libpff provided us with the following information about the subject and the conversation topic:
0x0037 (PidTagSubject : Subject)
0x001f (PT_UNICODE : UTF-16 Unicode string)
  Unicode string : ^A^ATest1
0x0070 (PidTagConversationTopic : Conversation topic)
0x001f (PT_UNICODE : UTF-16 Unicode string)
  Unicode string : Test1
And about the date and time values:
0x0039 (PidTagClientSubmitTime : Client submit time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:07:47 UTC
0x0071 (PidTagConversationIndex : Conversation index)
0x0102 (PT_BINARY : Binary data)
  Header block:
    Filetime : Jul 23, 2009 14:07:47 UTC
    GUID : 11111111-2222-3333-4444-555555555555
0x0e06 (PidTagMessageDeliveryTime : Message delivery time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:07:47 UTC
0x3007 (PidTagCreationTime : Creation time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:04:28 UTC
0x3008 (PidTagLastModificationTime : Last modification time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:07:50 UTC
The ^A characters in the subject are control characters and can be ignored. Note that the creation and last modification date and time are not equal.
Next we modified the appointment and had libpff provide us with information about the subject and the conversation topic:
0x0037 (PidTagSubject : Subject)
0x001f (PT_UNICODE : UTF-16 Unicode string)
  Unicode string : ^A^AModified1
0x0070 (PidTagConversationTopic : Conversation topic)
0x001f (PT_UNICODE : UTF-16 Unicode string)
  Unicode string : Test1
And about the date and time values:
0x0039 (PidTagClientSubmitTime : Client submit time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:07:47 UTC
0x0071 (PidTagConversationIndex : Conversation index)
0x0102 (PT_BINARY : Binary data)
  Header block:
    Filetime : Jul 23, 2009 14:07:47 UTC
    GUID : 11111111-2222-3333-4444-555555555555
0x0e06 (PidTagMessageDeliveryTime : Message delivery time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:07:47 UTC
0x3007 (PidTagCreationTime : Creation time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:04:28 UTC
0x3008 (PidTagLastModificationTime : Last modification time)
0x0040 (PT_SYSTIME : Windows Filetime (64-bit))
  Filetime : Jul 23, 2009 14:08:37 UTC
As you can see, the conversation topic and the conversation index do not change when an appointment is modified. Only the subject and the last modification date and time are updated.
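The comparison of subject and conversation topic can be sketched in Python as follows. This is a minimal illustration with hypothetical function names. It assumes both values are already available as strings and that the leading control characters in the subject (shown as ^A above) can simply be skipped.

```python
def strip_control_prefix(subject):
    """Drop leading control characters (code points below 0x20)."""
    index = 0
    while index < len(subject) and ord(subject[index]) < 0x20:
        index += 1
    return subject[index:]


def topic_differs_from_subject(subject, conversation_topic):
    """Return True when the visible subject no longer matches the topic."""
    return strip_control_prefix(subject) != conversation_topic


# Values taken from the test appointment before and after modification.
print(topic_differs_from_subject("\x01\x01Test1", "Test1"))
print(topic_differs_from_subject("\x01\x01Modified1", "Test1"))
```

A mismatch is only a lead for further analysis. This sketch does not account for subject prefixes such as 'RE:' or 'FW:', which can make the subject of replies and forwards differ from the conversation topic for legitimate reasons.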
In this example the last modification date and time give little indication that the appointment was modified, mainly because we made the modification right after creating the appointment.
4. Conclusion
E-mails and appointments in Outlook/Exchange have certain properties that can be useful in the digital forensic analysis of e-mails, such as the conversation index and the multiple formatted body texts. Other useful properties may be the conversation topic and the creation and last modification dates and times.
Appendix A. References
[METZ09]
Title: Personal Folder File (PFF) forensics
Subtitle: Analyzing the horrible reference file format
Author(s): Joachim Metz
URL: http://kent.dl.sourceforge.net/sourceforge/libpff/PFF_forensics.pdf
[MSDN]
Title: Tracking conversations
URL: http://msdn.microsoft.com/en-us/library/cc765583.aspx