OK, like everyone, I joined facebook just to get updates on my high school reunion. (Who knew you could also use it as a possible alibi?)
But then, after writing pdgmail and pdymail and seeing all the neat personal information in facebook...tada pdfbook! Memory parsing to grab facebook info.
Like its predecessors pdgmail and pdymail, pdfbook follows the simple premise that memory strings are easy to get to and yield a treasure of information given today's web 2.0 world of javascript, dhtml, json, etc. Facebook, it turns out, doesn't seem to cough up xml like yahoo or json like gmail, but rather unique class ID strings in its html.
What does this mean to forensics? Well, with a memory dump from any of the popular memory dumping tools, strings -el, and pdfbook, you can get:
- status updates
- facebook emails
- lists of friends
- likely owners of the memory image
Friends come with their unique facebook IDs, like:
Story from friend: id:6815841748: Name:Barack Obama
Facebook emails are raw html with authors, dates, etc., like so:
FacebookEmailDetail author: Storm Large url: http://www.facebook.com/stormlarge
FacebookEmailDetail Date: October 29 at 9:41am
FacebookEmailDetail Body: Nov 19.2009 - 8:30PM Molly Malones - Los Angeles, California More info:
Facebook recent activity is like so:
RecentActivity:Jeff became a fan of Fishbone.
Status updates show up like so:
StoryMessage:Jeff Bryner 2 gamble @the airport or not, that is the question.
If you're really lucky, the memory image will contain enough html to produce what pdfbook recognizes as a 'delete' button, which is only handed out to the owner of the html content. In other words, since you are only allowed to delete your own posts on facebook, pdfbook recognizes the delete button and your facebook userid, correlates the two and deduces that the likely owner of the memory image is:
Likely Owner of fbook memory artifacts: FacebookUserID:1421688057 Name:Jeff Bryner
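The owner deduction above can be sketched roughly as a tally: every html fragment that carries a delete button casts a vote for the userid found in it. This is only an illustration of the idea, not pdfbook's actual code. The marker `example_delete_button` and the `example_userid=` attribute are made-up stand-ins, not facebook's real markup.

```python
import re
from collections import Counter

# HYPOTHETICAL markup: these names are invented for illustration and are
# not the class IDs facebook or pdfbook actually use.
USER = re.compile(r'example_userid=(\d+)" title="([^"]+)"')
DELETE_MARKER = "example_delete_button"

def likely_owner(fragments):
    """Return the (userid, name) seen most often next to a delete button."""
    votes = Counter()
    for fragment in fragments:
        if DELETE_MARKER not in fragment:
            continue  # no delete button, so this content belongs to someone else
        for user_id, name in USER.findall(fragment):
            votes[(user_id, name)] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]

fragments = [
    '<a href="profile?example_userid=1421688057" title="Jeff Bryner">Jeff Bryner</a>'
    ' <a class="example_delete_button">x</a>',
    '<a href="profile?example_userid=6815841748" title="Barack Obama">Barack Obama</a>',
]
owner = likely_owner(fragments)
```

Only the first fragment carries the delete marker, so only its userid gets a vote.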
A sample usage:
On a windows or linux box, use pd from www.trapkit.de, a la:
pd -p 2345 > 2345.dump
where 2345 is the process ID of the running instance of IE/firefox/browser of your choice.
You can also use any memory imaging software like mdd, win32dd, etc. to grab the whole memory on the box rather than just one process. Common memory repositories like pagefile.sys, hiberfil.sys, etc. work as well.
I'll refer the reader to the memory imaging tool reference at the forensic wiki.
Transfer the dumped memory to linux and do:
strings -el 2345.dump > memorystrings.txt
pdfbook -f memorystrings.txt
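For the curious, here is roughly what the strings -el step does: it pulls out runs of printable characters stored as 16-bit little-endian text, which is how much of a browser's text sits in memory on windows. The sketch below is my own approximation, not the code of strings itself. It treats only plain ASCII as printable, uses a minimum run of 4 characters to match the strings default, and expects the whole dump in memory, which is fine for a single-process dump but not for a full memory image.

```python
import re

# One printable ASCII byte followed by a NUL byte is one UTF-16LE character
# of basic Latin text. Require at least 4 in a row, like the strings default.
UTF16LE_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")

def utf16le_strings(data: bytes):
    """Yield printable strings stored as 16-bit little-endian characters."""
    for match in UTF16LE_RUN.finditer(data):
        yield match.group().decode("utf-16-le")

# Binary junk, then a 16-bit string, more junk, then an 8-bit string
# that this pass is expected to skip.
sample = (
    b"\x01\x02"
    + "StoryMessage:hello".encode("utf-16-le")
    + b"\xff\xff"
    + b"plain ascii"
)
found = list(utf16le_strings(sample))
```

To try it on a real dump, something like `open("2345.dump", "rb").read()` would supply the bytes.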
pdfbook will find what it can out of the memory image and spit out its findings to standard out. Grep your way to facebook happiness or redirect the output to a file for later viewing.
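If grep gets tedious, a few lines of python can sort saved output by record type. This is a minimal sketch that assumes the output lines look like the samples shown earlier, with a label before the first colon. The function name `group_by_prefix` is my own and not part of pdfbook.

```python
from collections import defaultdict

def group_by_prefix(lines):
    """Group pdfbook-style output lines by the label before the first colon."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if ":" not in line:
            continue  # not a labeled finding, skip it
        label, _, rest = line.partition(":")
        groups[label.strip()].append(rest.strip())
    return dict(groups)

sample_output = [
    "Story from friend: id:6815841748: Name:Barack Obama",
    "RecentActivity:Jeff became a fan of Fishbone.",
    "StoryMessage:Jeff Bryner 2 gamble @the airport or not, that is the question.",
]
groups = group_by_prefix(sample_output)
```

With the findings redirected to a file, passing the open file object to `group_by_prefix` works the same way, since it only needs an iterable of lines.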
As this is mostly html parsing, it's very brittle, meaning that a change in the classID of one of the facebook UI components breaks this program. As a matter of fact, it has already broken once, with the UI rework of 10/2009. So it will work for a while, until they redesign and I'm out of sync. Maybe I'll post it to sourceforge or github so you all can update as you see fit.
Along those lines, look for the diary of pdfbook's creation, with an explanation of its regex goodness, at digitalforensicsmagazine.com, newly created this month! Dissect and contribute your own regex hacks for finding stuff you recognize in your own facebook memory images.
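To make the brittleness concrete, here is a small sketch of class-based matching. The class names are invented for illustration and are not the ones facebook or pdfbook use. Keeping the patterns in one table at least means a redesign is fixed by editing a single spot.

```python
import re

# HYPOTHETICAL class names, for illustration only; not facebook's real markup.
PATTERNS = {
    "StoryMessage": re.compile(r'<span class="example_story_message">(.*?)</span>'),
    "RecentActivity": re.compile(r'<div class="example_recent_activity">(.*?)</div>'),
}

def scan(text, patterns=PATTERNS):
    """Return 'Label:value' lines for every pattern hit in text."""
    hits = []
    for label, pattern in patterns.items():
        for value in pattern.findall(text):
            hits.append(f"{label}:{value}")
    return hits

old_markup = '<span class="example_story_message">at the airport</span>'
new_markup = '<span class="example_story_msg_v2">at the airport</span>'

before = scan(old_markup)  # the class matches the table, so one hit
after = scan(new_markup)   # the class was renamed, so nothing is found
```

The content is identical in both cases; only the class string changed, and that alone is enough to make the finding disappear.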
Jeff Bryner, GCFA Gold #137, also holds the CISSP and GCIH certifications, occasionally teaches for SANS, performs forensics, intrusion analysis, and security architecture work on a daily basis, and runs p0wnlabs.com just for fun.