When I started working with PowerShell, I immediately wanted to use PowerShell to do things the way I do them in Bash and Zsh shells. I wanted to find the PowerShell versions of cut/awk/sed/tr/grep. While PowerShell can do many of the things those tools do, it's not really The PowerShell Way.
For example, let's look at working with an IIS web server log file:
PS C:\Users\Sec504> gci .\u_ex220608.log

    Directory: C:\Users\Sec504

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----          6/9/2022  12:00 AM         416083 u_ex220608.log

PS C:\Users\Sec504> Get-Content .\u_ex220608.log | Select-Object -First 5
#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2022-06-08 16:25:40
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
2022-06-08 16:25:40 ::1 GET / - 80 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/102.0.5005.63+Safari/537.36+Edg/102.0.1245.33 - 200 0 0 1008
After the first 4 header lines, each IIS log entry looks similar to other ASCII log files, capturing the date, source, HTTP verb, URI, and more. I want to identify the IP address with the most requests from this log file.
I started to collect information by cutting columns and treating the log file as text:
PS C:\Users\Sec504> Get-Content .\u_ex220608.log | Select-Object -Skip 4 | ForEach-Object { ($_ -Split(" "))[8] }
...
172.30.48.149
172.30.48.149
172.30.48.149
172.30.48.149
Let's break down this command, piece-by-piece:
- Get-Content .\u_ex220608.log |: Retrieve the content of the IIS server log file, start a pipeline
- Select-Object -Skip 4 |: Skip the first 4 lines of the log file (these are header lines in IIS log files)
- ForEach-Object {: Start a loop where the commands within the {} block will execute for each line in the log file
- (: Open a parenthesized expression so that the code inside is evaluated first
- $_ -Split(" "): Using $_ to refer to the current line of log data, split the line into multiple elements delimited by a space
- ): Close the parenthesized expression
- [8]: From the array returned by the () expression, access the 9th element (index 8, counting from 0), which is the IP address of the client in the log file
- }: Close the code block executing in the ForEach-Object loop
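To see the split-and-index step in isolation, here is a minimal sketch that applies it to a single log entry copied from the sample output above. The variable names $line, $parts, and $clientIp are only for illustration.

```powershell
# One log entry, copied from the sample output earlier in this article.
$line = '2022-06-08 16:25:40 ::1 GET / - 80 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/102.0.5005.63+Safari/537.36+Edg/102.0.1245.33 - 200 0 0 1008'

# -split returns an array of strings, one per space-delimited field.
$parts = $line -split ' '

# Index 8 is the ninth field, c-ip (the client IP address).
$clientIp = $parts[8]

# Prints: 15 fields; client IP is ::1
"{0} fields; client IP is {1}" -f $parts.Count, $clientIp
```

The index is the fragile part of this approach: if the server is configured to log a different set of fields, position 8 may no longer be the client IP.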
This approach to accessing the log file data isn't ideal though, since it only produces a list of IP addresses. As an alternative, let's look at converting the log file into a PowerShell object that allows us to interrogate it using standard PowerShell commands and the pipeline.
First, let's look at the first few header lines in the log file:
PS C:\Users\Sec504> Get-Content .\u_ex220608.log | Select-Object -First 4
#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2022-06-08 16:25:40
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
The last line shows us the field names for each of the rows of data that follow. We can use this data to create an array of field names:
PS C:\Users\Sec504> $fields = "date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken" -Split " "
PS C:\Users\Sec504>
Here I declared a variable, $fields, and I copied and pasted the list of field names within quotation marks to treat it as a string. Using the -Split operator, PowerShell converts this string to an array, where each field name is a different element in the array.
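As an alternative to copying the names by hand, the same array can be derived from the #Fields: header line itself. This is a sketch rather than part of the original walkthrough; the $header variable is an assumption that holds the header text as a literal string so the example is self-contained.

```powershell
# The #Fields: header line, copied from the log file shown above.
$header = '#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken'

# Strip the "#Fields: " prefix, then split the remainder on spaces.
$fields = ($header -replace '^#Fields: ', '') -split ' '

$fields.Count   # 15 field names
$fields[8]      # c-ip
```

Against the real file, the header line could be fetched with something like Get-Content .\u_ex220608.log | Where-Object { $_ -like '#Fields:*' } | Select-Object -First 1, which keeps the field list in step with whatever the server is configured to log.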
Once we have the column names in an array, we can convert the log file to an array of custom PowerShell objects using ConvertFrom-String. We also need to skip the lines that begin with # as part of our pipeline with Select-String -NotMatch:
PS C:\Users\Sec504\Desktop> $weblog = get-content .\u_ex220622.log | Select-String -NotMatch "^#"| ConvertFrom-String -PropertyNames $fields
PS C:\Users\Sec504\Desktop>
- $weblog =: Declare a variable, $weblog, which will hold the array of custom PowerShell objects
- get-content .\u_ex220622.log |: Read the log file, start the pipeline
- Select-String -NotMatch "^#"|: Skip the lines that begin with # using the regular expression marker ^ (which means the beginning of the line), followed by #
- ConvertFrom-String -PropertyNames $fields: Convert the data to a custom PowerShell object using the property names defined in the $fields array
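To my knowledge, ConvertFrom-String ships with Windows PowerShell 5.1 but not with PowerShell 7. Where it is unavailable, a similar result can be sketched with ConvertFrom-Csv, treating each entry as a space-delimited record. This swaps in a different cmdlet from the one used above, and the $lines array is a stand-in for Get-Content output; its second log entry is made up for illustration.

```powershell
$fields = 'date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken' -split ' '

# Stand-in for Get-Content output. The first entry comes from the log shown
# earlier; the second entry is made up for illustration.
$lines = @(
    '#Version: 1.0'
    '#Date: 2022-06-08 16:25:40'
    '2022-06-08 16:25:40 ::1 GET / - 80 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/102.0.5005.63+Safari/537.36+Edg/102.0.1245.33 - 200 0 0 1008'
    '2022-06-08 17:54:55 172.30.48.149 GET /status - 80 - 172.30.48.149 ExampleAgent/1.0 - 200 0 0 12'
)

# Drop the comment lines, then parse the rest as space-delimited records.
$weblog = $lines |
    Where-Object { $_ -notmatch '^#' } |
    ConvertFrom-Csv -Delimiter ' ' -Header $fields

# Show a few of the resulting properties for each entry.
$weblog | Select-Object -Property 'date', 'c-ip', 'cs-uri-stem', 'time-taken'
```

One difference to keep in mind: ConvertFrom-Csv leaves every property as a string, so numeric fields such as time-taken need a cast before any numeric comparison or sort.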
Now, $weblog is an array of custom PowerShell objects that we can access using the pipeline and Select-Object:
PS C:\Users\Sec504> $weblog | Select-object -property c-ip

c-ip
----
::1
::1
172.30.48.149
172.30.48.149
...
We can count the requests from each unique IP address in the c-ip (client IP) property using Group-Object:
PS C:\Users\Sec504> $weblog | Group-Object c-ip

Count Name                      Group
----- ----                      -----
  635 172.30.48.149             {@{date=6/8/2022 12:00:00 AM; time=17:54:55; s-ip=172.30.48.149; cs-method=GET; cs-u...
  412 172.30.48.1               {@{date=6/8/2022 12:00:00 AM; time=18:53:29; s-ip=172.30.48.149; cs-method=GET; cs-u...
   26 172.30.48.148             {@{date=6/8/2022 12:00:00 AM; time=22:15:42; s-ip=172.30.48.149; cs-method=GET; cs-u...
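To answer the original question directly (which client made the most requests), the groups can be sorted by their Count property. The sketch below uses a handful of made-up objects as a stand-in for $weblog so that it runs on its own; the $top variable is only for illustration.

```powershell
# Stand-in for $weblog: made-up objects carrying only the c-ip property.
$weblog = @(
    [pscustomobject]@{ 'c-ip' = '172.30.48.149' }
    [pscustomobject]@{ 'c-ip' = '172.30.48.1' }
    [pscustomobject]@{ 'c-ip' = '172.30.48.149' }
    [pscustomobject]@{ 'c-ip' = '172.30.48.148' }
    [pscustomobject]@{ 'c-ip' = '172.30.48.149' }
)

# -NoElement drops the Group column; sorting by Count puts the busiest client first.
$top = $weblog |
    Group-Object -Property 'c-ip' -NoElement |
    Sort-Object -Property Count -Descending |
    Select-Object -First 1

# Prints: 172.30.48.149 made 3 requests
"{0} made {1} requests" -f $top.Name, $top.Count
```

Changing -First 1 to a larger number gives a top-N list of clients instead of a single winner.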
We can build complex queries, such as identifying the top 5 endpoint URIs that took the longest to process:
PS C:\Users\Sec504> $weblog | Sort-Object -Property time-taken -Descending | Select-Object -First 5 -Property cs-uri-stem

cs-uri-stem
-----------
/
/dashboard/overview
/DBL1
/DBL2
/status

PS C:\Users\Sec504>
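A sort like this only ranks entries correctly when time-taken holds numbers. If the property holds strings instead, as happens when objects come from a CSV-style parser such as ConvertFrom-Csv, then "95" sorts above "1008". A hedged sketch of one way to guard against that is to cast inside a script block; the records and timing values below are made up for illustration.

```powershell
# Made-up records whose time-taken values are strings, not numbers.
$weblog = @(
    [pscustomobject]@{ 'cs-uri-stem' = '/';       'time-taken' = '1008' }
    [pscustomobject]@{ 'cs-uri-stem' = '/status'; 'time-taken' = '95' }
    [pscustomobject]@{ 'cs-uri-stem' = '/DBL1';   'time-taken' = '230' }
)

# Casting inside a script block makes Sort-Object compare integers,
# so 1008 ranks above 230 and 95.
$slowest = $weblog |
    Sort-Object -Property { [int]$_.'time-taken' } -Descending |
    Select-Object -First 2 -Property 'cs-uri-stem', 'time-taken'

$slowest
```

With the cast in place, the two slowest entries are / and /DBL1. Without it, a plain string sort would put /status first.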
The ability to convert text into custom PowerShell objects is really powerful, since it allows us to leverage all of the other PowerShell commands to interrogate and process the data. For the first time, I'm seeing something in PowerShell that makes me think I could make the switch away from Unix text processing tools.
-Joshua Wright
Joshua Wright is the author of SANS SEC504: Hacker Tools, Techniques, and Incident Handling, a faculty fellow for the SANS Institute, and a senior technical director at Counter Hack.