The Event Query Language (EQL) is a standardized query language (similar to SQL) to evaluate Windows events. Written by Ross Wolf, EQL is an amazing tool to normalize Windows log events for consistent access and query.
In practice, EQL is most effective when working with Windows Event Log and Sysmon logging data as part of your threat hunting tactics. In this article I'll demonstrate some ways to get started with EQL to assess the tactics of an attacker from a compromised system.
Installing EQL
EQL works equally well on Windows, Linux, and macOS, and requires Python. You can install EQL with pip3 install eql, or build it from the GitHub repository.
Alternatively, you can download and run Slingshot Linux, where EQL is already installed and ready to go!
Getting Started
EQL works best with Sysmon logs converted to JSON format. From a system where you're using a Sysmon configuration to capture detailed system events, download the EQL [scrape-events.ps1](https://eqllib.readthedocs.io/en/latest/guides/sysmon.html#getting-sysmon-logs-with-powershell) PowerShell script. Import it and write the Sysmon data as a JSON file, as shown here:
# Import the functions provided within scrape-events
Import-Module .\scrape-events.ps1
# Get all the Sysmon logs from Windows Event Logs
Get-WinEvent -filterhashtable @{logname="Microsoft-Windows-Sysmon/Operational"} `
-Oldest | Get-EventProps | ConvertTo-Json | Out-File -Encoding ASCII `
-FilePath my-sysmon-data.json
Note that in this example I've broken up this long command into multiple lines with a backtick at the end of each line per the PowerShell convention. If you type this on one long line, omit the backticks.
EQL includes two important utilities, eql and eqllib:
- eql is a command line tool to interrogate your data
- eqllib is a command line tool to format your data in a consistent manner
Working from the my-sysmon-data.json file created with PowerShell, convert the Sysmon-structured data to the EQL schema using eqllib:
$ eqllib convert-data my-sysmon-data.json -s "Microsoft Sysmon" querydata.json
The querydata.json file will be your data source for interrogation with EQL.
Get the Demo Files
To follow the examples in this article, download the sample data files, unzip, and change to the eql-data-samples directory.
slingshot@slingshot:~$ wget https://www.willhackforsushi.com/articles/eql-data-samples.zip
slingshot@slingshot:~$ unzip -q eql-data-samples.zip
slingshot@slingshot:~$ cd eql-data-samples
slingshot $
Threat Hunting: regsvr32.exe
To search through the eqllib-normalized JSON files with EQL, you will craft SQL-like queries that name an event type followed by a where condition.
This is best shown in examples. Let's start with the file querydata.json. We'll begin by looking for any instances where the DLL registration utility regsvr32 is run:
slingshot $ eql query -f querydata.json "process where process_name = 'regsvr32.exe'"
{"command_line": "\"C:\\Windows\\syswow64\\regsvr32.exe\" /s .\\meterpreter.dll", "event_type": "process", "logon_id": 180388, "parent_process_name": "powershell.exe", "parent_process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe", "pid": 5208, "ppid": 1500, "process_name": "regsvr32.exe", "process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe", "subtype": "create", "timestamp": 132039702277510000, "unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}", "unique_ppid": "{AC6A4E42-064B-5CF4-0000-00106FB21900}", "user": "SEC504STUDENT\\Sec504", "user_domain": "SEC504STUDENT", "user_name": "Sec504"}
{"event_type": "process", "pid": 5208, "process_name": "regsvr32.exe", "process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe", "subtype": "terminate", "timestamp": 132039702279730000, "unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}"}
This is ... less than beautiful. If you haven't already, install jq to pretty-print the output data:
slingshot $ sudo apt-get install -y jq
slingshot $ eql query -f querydata.json "process where process_name = 'regsvr32.exe'" | jq
{
"command_line": "\"C:\\Windows\\syswow64\\regsvr32.exe\" /s .\\meterpreter.dll",
"event_type": "process",
"logon_id": 180388,
"parent_process_name": "powershell.exe",
"parent_process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
"pid": 5208,
"ppid": 1500,
"process_name": "regsvr32.exe",
"process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe",
"subtype": "create",
"timestamp": 132039702277510000,
"unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}",
"unique_ppid": "{AC6A4E42-064B-5CF4-0000-00106FB21900}",
"user": "SEC504STUDENT\\Sec504",
"user_domain": "SEC504STUDENT",
"user_name": "Sec504"
}
{
"event_type": "process",
"pid": 5208,
"process_name": "regsvr32.exe",
"process_path": "C:\\Windows\\SysWOW64\\regsvr32.exe",
"subtype": "terminate",
"timestamp": 132039702279730000,
"unique_pid": "{AC6A4E42-07B3-5CF4-0000-0010719C1D00}"
}
This is much more useful output! Here we see that someone launched regsvr32.exe from a PowerShell session, passing the command line parameters /s .\meterpreter.dll. Probably not good news if this is a box you rely on.
We can break down the arguments in this query as shown:
Query Component | Description |
---|---|
eql query | Run the eql command, execute a query |
-f querydata.json | Read from the specified file |
"process where process_name = 'regsvr32.exe'" | The EQL query for the process event |
\| jq | Send the JSON results to the jq utility to print output nicely |
The query syntax process where process_name = ... is used often with EQL. The initial keyword process indicates that we are querying the process data in the normalized JSON. Other keywords for interrogation include file, network, registry, and image_load.
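To make the shape of a query concrete, here is a minimal Python sketch of what process where process_name = 'regsvr32.exe' does conceptually: keep only events of the named event type, then apply the where condition. This is not how EQL is implemented. The filter_events and is_regsvr32 helpers are hypothetical, the sample records are trimmed copies of the events shown above plus one made-up network event, and the case-insensitive comparison is my assumption.

```python
import json

# Trimmed copies of the two regsvr32.exe events shown above, plus one
# made-up network event so the filter has something to exclude.
SAMPLE_LINES = [
    '{"event_type": "process", "pid": 5208, "process_name": "regsvr32.exe", "subtype": "create"}',
    '{"event_type": "process", "pid": 5208, "process_name": "regsvr32.exe", "subtype": "terminate"}',
    '{"event_type": "network", "pid": 4, "process_name": "System", "subtype": "outgoing"}',
]


def filter_events(lines, event_type, condition):
    """Yield parsed events of one event type that satisfy a condition.

    Hypothetical helper mirroring the shape "<event type> where <condition>".
    """
    for line in lines:
        event = json.loads(line)
        if event.get("event_type") == event_type and condition(event):
            yield event


def is_regsvr32(event):
    # Assumption: compare case-insensitively, since Windows file names are.
    return event.get("process_name", "").lower() == "regsvr32.exe"


matches = list(filter_events(SAMPLE_LINES, "process", is_regsvr32))
for event in matches:
    # indent=2 gives output similar to piping the results through jq
    print(json.dumps(event, indent=2, sort_keys=True))
```

Swapping "process" for "network" and supplying a different condition function follows the same pattern as changing the leading keyword in an EQL query.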
Threat Hunting: ntdsutil
An attacker with privileged access to a Windows Domain Controller can use ntdsutil to create an accessible backup of the domain password hashes, which is bad news for the security of the Windows Domain. For this example, we can reference the T1003-CredentialDumping-ntdsutil_eql.json file:
slingshot $ eql query -f T1003-CredentialDumping-ntdsutil_eql.json \
"process where process_name == 'ntdsutil.exe' \
and command_line == '*create*' \
and command_line == '*ifm*'" | jq
{
"command_line": "\"C:\\Windows\\system32\\ntdsutil.exe\" \"ac i ntds\" ifm \"create full c:\\hive\" q q",
"event_type": "process",
"logon_id": 301152,
"parent_process_name": "powershell.exe",
"parent_process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
"pid": 5680,
"ppid": 628,
"process_name": "ntdsutil.exe",
"process_path": "C:\\Windows\\System32\\ntdsutil.exe",
"subtype": "create",
"timestamp": 132046718142390000,
"unique_pid": "{8a215c30-bc46-5cfe-0000-0010ae451200}",
"unique_ppid": "{8a215c30-b80d-5cfe-0000-0010e96a0d00}",
"user": "Wardrobe99\\Administrator",
"user_domain": "Wardrobe99",
"user_name": "Administrator"
}
Note that in this example I've broken up this long command into multiple lines with a backslash at the end of each line. If you type this on one long line, omit the backslashes.
Here we see another example of using the process keyword to search for instances of the ntdsutil process. This by itself is probably enough to warrant further investigation, but we can confirm it further by also checking the command line for the create and ifm arguments using wildcard (*) matching.
Ntdsutil can also be invoked without arguments and run interactively, eliminating the command line detail. Don't rely on the presence of the additional command line arguments to indicate suspicious ntdsutil use.
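As a rough illustration of the wildcard comparisons in that query, the following Python sketch treats * as "any run of characters" and checks the whole command line against each pattern. The wildcard_match helper is hypothetical, and the case-insensitive matching is my assumption rather than a statement about EQL internals. The command line is the one from the event above, with the JSON escaping removed.

```python
import re


def wildcard_match(value, pattern):
    """Return True if value matches pattern, where * matches any run of characters."""
    # Escape everything, then turn the escaped * back into a regex wildcard.
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None


command_line = r'"C:\Windows\system32\ntdsutil.exe" "ac i ntds" ifm "create full c:\hive" q q'

# Both wildcard conditions from the query must hold.
suspicious = wildcard_match(command_line, "*create*") and wildcard_match(command_line, "*ifm*")
print(suspicious)

# An interactive launch carries no arguments, so the same test misses it.
interactive = r'"C:\Windows\system32\ntdsutil.exe"'
print(wildcard_match(interactive, "*ifm*"))
```

The second check shows why the command line conditions narrow the results but should not be the only signal: the interactive launch only matches on the process name.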
Anomalous Command Lines
Instead of looking for specific processes, we can also use EQL to look for anomalous behavior. For example, we can search for very long command lines, often used to pass encoded PowerShell scripts to bypass Set-ExecutionPolicy restrictions:
slingshot $ eql query -f normalized-rta.json "process where \
length(command_line) > 200 and not process_name in ('chrome.exe', 'ngen.exe')" \
| jq "{process_name,parent_process_name}"
{
"process_name": "cvtres.exe",
"parent_process_name": "csc.exe"
}
{
"process_name": "cvtres.exe",
"parent_process_name": "csc.exe"
}
{
"process_name": "powershell.exe",
"parent_process_name": "python.exe"
}
Notice here how we can call EQL's length function to calculate the length of the command line. I've also added a second clause, and not process_name in, to eliminate common return values where long command lines are normal and not necessarily indicative of an attack. Finally, I added some jq syntax to display only the process_name and parent_process_name values for each response event.
The return values here aren't that exciting, though we see three events in the log that have a command line longer than 200 characters. Let's modify the jq syntax to get the detail from the command_line member:
slingshot $ eql query -f normalized-rta.json 'process where length(command_line) > 200 and not process_name in ("chrome.exe", "ngen.exe")' | jq "{process_name,parent_process_name,command_line}"
{
"process_name": "cvtres.exe",
"parent_process_name": "csc.exe",
"command_line": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\cvtres.exe /NOLOGO /READONLY /MACHINE:IX86 \"/OUT:C:\\Users\\alice\\AppData\\Local\\Temp\\RES5673.tmp\" \"c:\\Users\\alice\\AppData\\Local\\Temp\\eexr0kqp\\CSCE6E4328451414E5C89B772D1F2FFE5F8.TMP\""
}
{
"process_name": "cvtres.exe",
"parent_process_name": "csc.exe",
"command_line": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\cvtres.exe /NOLOGO /READONLY /MACHINE:IX86 \"/OUT:C:\\Users\\alice\\AppData\\Local\\Temp\\RES575D.tmp\" \"c:\\Users\\alice\\AppData\\Local\\Temp\\5gcbnfh4\\CSC48F3F5A831E04AC989C6D1D0A2C1DE4D.TMP\""
}
{
"process_name": "powershell.exe",
"parent_process_name": "python.exe",
"command_line": "powershell.exe -ec RwBlAHQALQBQAHIAbwBjAGUAcwBzACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAAIAAgACAAIAAgACAAIAAgACAAa...snip"I
}
EQL reveals a suspicious PowerShell command that we would want to investigate further!
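A natural next step is to decode the argument. PowerShell's -ec (short for -EncodedCommand) option takes Base64-encoded UTF-16LE text, so a few lines of Python are enough to recover it. The sketch below decodes only the first 32 characters of the value shown above, since the rest is snipped in the output. The decode_powershell helper name is my own.

```python
import base64

# First 32 characters of the -ec argument shown above; the remainder is
# snipped in the output, so only this prefix is decoded here.
ENCODED_PREFIX = "RwBlAHQALQBQAHIAbwBjAGUAcwBzACAA"


def decode_powershell(encoded):
    """Decode a PowerShell -EncodedCommand value (Base64 over UTF-16LE text)."""
    return base64.b64decode(encoded).decode("utf-16-le")


decoded = decode_powershell(ENCODED_PREFIX)
print(repr(decoded))  # 'Get-Process '
```

When decoding a partial value like this, cut it at a multiple of four characters so the Base64 padding stays valid.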
Data Exploration
Apart from process information, the EQL-normalized Sysmon logs can reveal additional attributes about the system for interrogation. You can get a summary of the available data by asking:
slingshot $ eql query -f normalized-rta.json "any where true | count event_type"
{"count": 87, "key": "network", "percent": 0.0027321546336714505}
{"count": 240, "key": "file", "percent": 0.007536978299783312}
{"count": 574, "key": "process", "percent": 0.018025939766981754}
{"count": 9811, "key": "image_load", "percent": 0.30810539207989196}
{"count": 21131, "key": "registry", "percent": 0.6635995352196715}
Here I used EQL to retrieve events of any type; the where true clause is necessary to return the records, because EQL requires a condition that evaluates to true. The event_type member tells me the type of each event, and the count pipe tallies how many events of each type are present.
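If the count output is unfamiliar, this small Python sketch reproduces the same kind of tally on a made-up list of events. It is an approximation for illustration, not EQL's implementation, and count_by is a hypothetical helper. Note that percent is a fraction of the total rather than a percentage: in the real output above, 21131 registry events out of 31843 total gives roughly 0.66.

```python
from collections import Counter

# Made-up miniature event list; only event_type matters for this tally.
events = [
    {"event_type": "registry"},
    {"event_type": "process"},
    {"event_type": "registry"},
    {"event_type": "network"},
]


def count_by(events, field):
    """Tally events by one field, returning rows shaped like the count output."""
    counts = Counter(event[field] for event in events)
    total = sum(counts.values())
    rows = [
        {"count": n, "key": key, "percent": n / total}
        for key, n in counts.items()
    ]
    # Least common first, matching the ordering seen in the output above.
    return sorted(rows, key=lambda row: (row["count"], row["key"]))


rows = count_by(events, "event_type")
for row in rows:
    print(row)
```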
The object members in each record will be different for process vs. network events (you wouldn't expect destination_port to be a member in the process event, for example). You can read about the data structures in the EQL documentation, or you can ask EQL to tell you what it knows:
slingshot $ eql query -f normalized-rta.json "network where true | head 1" | jq
{
"destination_address": "192.168.162.135",
"destination_port": "445",
"event_type": "network",
"pid": 4,
"process_name": "System",
"process_path": "System",
"protocol": "tcp",
"source_address": "192.168.162.135",
"source_port": "50456",
"subtype": "outgoing",
"timestamp": 131883575711730000,
"unique_pid": "{9C977984-B294-5C05-0000-0010EB030000}",
"user": "NT AUTHORITY\\SYSTEM",
"user_domain": "NT AUTHORITY",
"user_name": "SYSTEM"
}
EQL supports the head pipe (note that this is part of the EQL query, not a command-line argument) to limit the number of events returned by a query. In the output we see that the network event type reveals information about the system including source and destination addresses and ports, protocol information, the process ID, domain, and user information.
We can put this information to good use, summarizing the destination port information that the system is using:
slingshot $ eql query -f normalized-rta.json "network where subtype = 'outgoing' | \
count destination_port | sort count"
{"count": 1, "key": "137", "percent": 0.023255813953488372}
{"count": 1, "key": "138", "percent": 0.023255813953488372}
{"count": 1, "key": "53155", "percent": 0.023255813953488372}
{"count": 1, "key": "53159", "percent": 0.023255813953488372}
{"count": 1, "key": "53355", "percent": 0.023255813953488372}
{"count": 1, "key": "80", "percent": 0.023255813953488372}
{"count": 2, "key": "139", "percent": 0.046511627906976744}
{"count": 4, "key": "445", "percent": 0.09302325581395349}
{"count": 4, "key": "49667", "percent": 0.09302325581395349}
{"count": 5, "key": "49669", "percent": 0.11627906976744186}
{"count": 9, "key": "135", "percent": 0.20930232558139536}
{"count": 13, "key": "8000", "percent": 0.3023255813953488}
This output reveals that nearly a third of the activity is destined to TCP/8000. We can investigate this further, identifying any port 8000 activity where the process name is an executable file:
slingshot $ eql query -f normalized-rta.json "network where process_name = '*.exe' \
and destination_port = '8000'" | jq "{process_path,user,timestamp,destination_port}"
{
"process_path": "C:\\Windows\\System32\\mshta.exe",
"user": "RTA-DESKTOP\\alice",
"timestamp": 131883576881820000,
"destination_port": "8000"
}
{
"process_path": "C:\\Windows\\System32\\msiexec.exe",
"user": "NT AUTHORITY\\SYSTEM",
"timestamp": 131883577020100000,
"destination_port": "8000"
}
{
"process_path": "C:\\Windows\\System32\\msiexec.exe",
"user": "NT AUTHORITY\\SYSTEM",
"timestamp": 131883577024280000,
"destination_port": "8000"
}
{
"process_path": "C:\\Windows\\System32\\rundll32.exe",
"user": "RTA-DESKTOP\\alice",
"timestamp": 131883577304160000,
"destination_port": "8000"
}
Some suspicious activity is going on here. First, the Microsoft HTA execution utility (mshta.exe) is launched by Alice, then msiexec is run twice as SYSTEM in quick succession, followed seconds later by Alice running rundll32.exe. Definitely worth investigation.
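To put that timeline in human terms, the timestamp values can be converted to dates. The sketch below assumes the timestamps are Windows FILETIME values (100-nanosecond intervals since 1601-01-01 UTC). That assumption is mine, though it yields plausible dates for this data. The filetime_to_datetime helper is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps count 100-nanosecond ticks since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert 100-nanosecond ticks since 1601-01-01 UTC to a datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# (process, timestamp) pairs copied from the port 8000 results above
observations = [
    ("mshta.exe", 131883576881820000),
    ("msiexec.exe", 131883577020100000),
    ("msiexec.exe", 131883577024280000),
    ("rundll32.exe", 131883577304160000),
]

previous = None
gaps = []
for name, ts in observations:
    when = filetime_to_datetime(ts)
    # Seconds elapsed since the previous event (0.0 for the first one)
    gap = (when - previous).total_seconds() if previous is not None else 0.0
    gaps.append(gap)
    print(f"{when.isoformat()}  {name:<13} +{gap:.3f}s")
    previous = when
```

Under that assumption, the two msiexec.exe connections are less than half a second apart, which is the kind of detail that is hard to spot in the raw integers.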
Conclusion
EQL is a powerful tool, with a lot of significant benefits for defenders to leverage when threat hunting. While the search syntax can be a little confusing at first, with a little practice it becomes second nature, making it possible to explore logging data in a consistent, simple format with minimal fuss.
Got a fantastic query you use for threat hunting with EQL? Please let me know! Until then, use the sample data files, and explore the secrets hidden in Sysmon logs with EQL.