Introduction
The field of incident response, forensics, and malware analysis is full of thrilling hunts and exciting investigations where you have an opportunity to aggressively pursue the activities of adversaries. While technical acumen certainly supports these efforts, a truly successful execution requires both a well-crafted process and detailed documentation of the journey through that process. Meticulous documentation allows you to easily retrace your analysis flow (particularly important if the work supports any litigation), and it facilitates information sharing so others can benefit from your analysis approach and results. More importantly, if a malware analysis effort continues for any substantial period of time, tracking what you've done and what is yet to be done is difficult without comprehensive notes. Generating documentation is clearly one of the less glamorous parts of malware analysis, but it's absolutely necessary to be an effective analyst.
Approaches
Documentation approaches I've heard of include using a word processing program, a wiki, a mind map, a dirty napkin, and hope (that you'll just remember). Each of these styles has pros and cons - some are too structured, some are not structured enough. Just like the analysis process itself, the documentation cannot be too rigid and must allow for some creative freedom; otherwise, it will simply go unused. The goal, of course, is not to find a perfect format, but to use one consistently that allows you to track the investigation process and results. My personal rule is if you don't document it, it didn't happen.
An Example
Here is a Word document template I created to record analysis details when performing manual malware analysis of Windows executable files. I've found that a structured Word document provides the organization I need to quickly note my observations and screenshots without restricting my analysis approach. Not only does the template include the key components of malware analysis, but it also refers to specific tools that can be helpful in gathering certain data. Please note that the document is intended to be a guide, and it should not be used as a comprehensive process document.
The template is divided into several sections:
- Background: This is where you can record basic contextual information such as the date the file was discovered, why it was brought to the analyst's attention (e.g., an IDS alert), its location on the network, and timestamp data.
- Static Analysis: As you learn everything you can about a sample without executing it, log your observations here. In addition to file characteristics, you can also document any open source research you perform using information such as the file hash or string references.
- Behavioral Analysis: Record any file system, network, or memory artifacts in this section. Be sure to consider and document dependencies that impact behavior; for example, malware may require that a particular browser be launched to perform its activity. Samples may also depend upon escalated privileges or only trigger upon reboots. It's important to perform iterative testing to discover such dependencies.
- Code Analysis: When it's necessary to disassemble code and debug its execution, detail your flow and observations here. The level of effort required to perform useful code analysis can vary greatly between samples. Incorporating clues from static and behavioral analysis and focusing on key functionality can help accelerate this process.
- Analysis Summary: As you perform deep technical analysis to prove or disprove your theories about a sample, it's helpful to record key findings in a separate section that summarizes your work. This section is particularly helpful when management asks for an impromptu update.
You're welcome to use the template and tailor it to your own analysis process and other file types. I'd be grateful for any feedback or revised templates.
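If you prefer plain-text notes to a Word document, the same section layout can be sketched in a few lines of Python. This is only an illustration, not part of the template itself: the function names `file_hashes` and `build_notes_skeleton` and the output layout are hypothetical, and the script assumes you want the file size and hashes pre-filled under Static Analysis. It only reads the sample's bytes and never executes it.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

# Section names taken from the template described above.
SECTIONS = [
    "Background",
    "Static Analysis",
    "Behavioral Analysis",
    "Code Analysis",
    "Analysis Summary",
]


def file_hashes(path):
    """Return MD5 and SHA-256 hex digests of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large samples are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


def build_notes_skeleton(sample_path):
    """Build a plain-text notes skeleton for one sample (hypothetical layout)."""
    sample = Path(sample_path)
    hashes = file_hashes(sample)
    lines = [
        f"Analysis notes: {sample.name}",
        f"Notes started (UTC): {datetime.now(timezone.utc).isoformat()}",
        "",
    ]
    for section in SECTIONS:
        lines.append(section)
        lines.append("-" * len(section))
        if section == "Static Analysis":
            # Pre-fill the facts that can be gathered without running the sample.
            lines.append(f"File size: {sample.stat().st_size} bytes")
            lines.append(f"MD5: {hashes['md5']}")
            lines.append(f"SHA-256: {hashes['sha256']}")
        lines.append("")
    return "\n".join(lines)
```

Calling `build_notes_skeleton` with the path to a sample returns text you can save next to the file and fill in as the analysis proceeds. The hashes are included because they are the usual starting point for the open source research mentioned under Static Analysis.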
Final Thoughts
Developing a documentation medium you're comfortable with is an iterative process, so if it's unclear what fits your style, try one approach and customize it as necessary. If it's not working, move on and try something new. Developing a habit of reflection (i.e., What worked? What didn't work?) will not only help you create better documentation, but it will also improve your analysis methodology.
If you would like to learn more about dissecting malware, I encourage you to join me at an upcoming FOR610 course.
-Anuj Soni
Anuj Soni teaches FOR610 Reverse-Engineering Malware for the SANS Institute. He is also a Senior Incident Responder at Booz Allen Hamilton, where he focuses on hunting threats and double-clicking malware all day long. You can find him on Twitter at @asoni.