Every organization, team, and individual is trying to find the most effective way to harness the value of Generative AI (GenAI). There are many obvious options, such as chatbots, document creation, and writing code. While these applications can be valuable to your Security Operations Center (SOC) as well, we will explore other options.
Using a strategy that is often applied to SOAR, your SOC can find tremendous value in GenAI: Automate repeatable tasks to get rid of the boring work and focus on the interesting work. Admittedly, it is not a glamorous way to say it, but it gets the point across.
While you may want to consider spicing up the language when presenting to leadership, regardless of how you sell it, the concept is the same: look at the work that analysts are doing and identify where GenAI can be inserted to handle the mundane tasks. In my previous blog post, “How SOAR Transforms Security Operations,” I cover this point in detail. I also mention that GenAI can be used to handle Data Loss Prevention (DLP) alerts. I will get to that use case in a moment.
First, before any AI work can begin, steps need to be taken to ensure it is being done safely. SANS Fellow Frank Kim and SANS Head of Innovation Dan deBeaubien do an excellent job in the course AIS247: AI Security Essentials for Business Leaders of outlining how to prepare your organization for using AI. For the purposes of this blog post, we are going to skip over the risk assessment, policy creation, and protections, and move on to the practical application content. One item from that class that does need to be pointed out: the use cases in this post should leverage a closed large language model (LLM). To that point, it should only be open to the SOC (or teams authorized to have access to SOC documentation/work product).
In my discussions with software providers, many out-of-the-box options are being developed. The most common development is leveraging GenAI to create queries applicable to a particular investigation. For example, an endpoint alert fires for malware in your environment. GenAI reviews the alert and recommends certain queries that can be run to look for other infected systems. The SOC might be tempted to simply run the GenAI-suggested queries and populate the output in tickets for manual review. I would recommend caution with this approach until your LLM and prompts have been fully trained and vetted. Instead, have the analyst review the suggested queries prior to running them. This adds a layer of verification to the work output of GenAI.
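The approval gate described above can be sketched in a few lines. This is a minimal, hypothetical example: the `SuggestedQuery` structure and the `approve` callback stand in for whatever your ticketing or chat-ops integration actually provides, and the example queries are illustrative only.

```python
"""Sketch: gate GenAI-suggested hunt queries behind analyst review.

All names here (SuggestedQuery, review_queue) are hypothetical
placeholders, not a specific vendor API.
"""
from dataclasses import dataclass


@dataclass
class SuggestedQuery:
    text: str            # the query the LLM proposed
    rationale: str       # why the LLM thinks it is relevant
    approved: bool = False


def review_queue(suggestions, approve):
    """Return only the queries an analyst explicitly approved.

    `approve` is a callable standing in for the human review step
    (e.g., a ticket comment or a chat-ops button press).
    """
    for s in suggestions:
        s.approved = bool(approve(s))
    return [s for s in suggestions if s.approved]


# Example: an alert produces two suggestions; the reviewer rejects
# the overly broad one (here simulated with a simple rule).
suggestions = [
    SuggestedQuery("process_name=mimikatz.exe", "known credential tool"),
    SuggestedQuery("event_type=* earliest=-30d", "broad sweep"),
]
approved = review_queue(suggestions, approve=lambda s: "*" not in s.text)
```

The key design point is that nothing executes until the human decision is recorded; the rejected suggestion never reaches the query engine.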
Another application that my team has been working on goes after DLP alerts. DLP alerts can be some of the noisiest alerts for your SOC. There are limited solutions for how to handle them without tuning out true positives. When the overall workflow is reviewed, there are a few standout candidates for the application of GenAI. The manual workflow looks like this:
1. An action happens that triggers a DLP alert. This could be the act of sharing a large group of files with a public email address.
2. A ticket is created, populated with relevant data, and assigned to an analyst.
3. The SOC analyst reviews the file names or the content of the files for sensitive information.
4. The SOC checks the person’s role within the company.
5. Previous alerts and tickets are reviewed to identify trends and prior actions.
6. The ticket is updated with a summary of analysis and it is determined that this is expected behavior.
In my experience, this is usually the point where the analyst turns to the analyst next to them and asks when they can tune out alerts for “Bob from Enterprise Accounts.” It seems like every single month there is an alert for Bob sending a massive invoice, SOW, or other relevant document to an enterprise customer. Eventually Bob becomes the subject of a rather funny meme that is routinely sent around the SOC. Of course, if the alerts are a regular occurrence, User Behavior Analytics (UBA) should eventually identify this as expected behavior using AI/ML, but there is a risk in automatically dismissing this activity as expected.
My team’s research has focused on leveraging GenAI during steps 2-5. The relevant data from the alert (file names, content, the user’s role, and previous tickets) is fed into the LLM to create a summary for the analyst to review and decide how to proceed. Here is an example of what the prompt might look like:
Generate a detailed summary of Data Loss Prevention (DLP) alert with the ID DLP-2024-0917-001. Include the following information:
- Previous Tickets: Consider whether the user has had any prior DLP alerts or security-related tickets. Provide details on these tickets, such as their resolution and frequency.
- File Content: Summarize the content of the files that triggered the alert (e.g., sensitive information like PII, financial data, intellectual property). If multiple files were involved, include a breakdown of the content for each.
- User Role: Provide context on the user's role within the company, including their department, access level, and whether the flagged data is relevant to their job function.
- Past DLP Alerts: Summarize any patterns in the user’s previous DLP alerts (e.g., similar file types or destinations, repeated policy violations, etc.). Indicate whether the user is a repeat offender or if this is an isolated incident.
Ensure that the summary provides sufficient context for the security team to assess the risk and take appropriate action.
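A prompt like the one above is easiest to keep consistent when it is rendered from a template rather than written by hand per ticket. The sketch below assumes a normalized alert record; the field names and the example data are hypothetical, and in practice each value would come from your SIEM or ticketing API.

```python
"""Sketch: assemble the DLP-summary prompt from ticket fields.

Field names are hypothetical; substitute your SIEM/ticketing
API calls when populating the alert record.
"""

PROMPT_TEMPLATE = """\
Generate a detailed summary of Data Loss Prevention (DLP) alert with the ID {alert_id}.
Include the following information:
- Previous Tickets: {previous_tickets}
- File Content: {file_summaries}
- User Role: {user_role}
- Past DLP Alerts: {past_alerts}
Ensure that the summary provides sufficient context for the security team
to assess the risk and take appropriate action."""


def build_prompt(alert: dict) -> str:
    """Render the prompt from a normalized alert record."""
    return PROMPT_TEMPLATE.format(
        alert_id=alert["id"],
        previous_tickets="; ".join(alert.get("previous_tickets", [])) or "none",
        file_summaries="; ".join(alert.get("file_summaries", [])) or "none",
        user_role=alert.get("user_role", "unknown"),
        past_alerts="; ".join(alert.get("past_alerts", [])) or "none",
    )


# Illustrative record mirroring the example alert in this post.
prompt = build_prompt({
    "id": "DLP-2024-0917-001",
    "previous_tickets": ["#12345 resolved with a warning"],
    "file_summaries": ["Q3-Sales_Report.xlsx: client PII and financials"],
    "user_role": "Senior Account Manager, Sales",
    "past_alerts": ["July: pricing doc shared via personal email"],
})
```

Templating also gives you one place to vet and version the prompt as part of the tuning effort mentioned earlier.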
Here's a simulated output based on the prompt provided:
DLP Alert Summary
Alert Details
- Alert ID: DLP-2024-0917-001
- Date: September 17, 2024
- Triggered by: John Doe (jdoe@company.com)
- Department: Sales
- Role: Senior Account Manager
- Access Level: Moderate - Access to customer financial data and sales reports
Previous Tickets
1. Ticket #12345 (August 5, 2024):
- Incident: Attempt to upload an internal sales report containing client financial data to an unauthorized cloud storage platform.
- Resolution: Warning issued; user educated on data handling policies.
2. Ticket #11234 (July 15, 2024):
- Incident: Sharing of a document containing sensitive pricing information with an external consultant via personal email.
- Resolution: File removed from external location, and user received mandatory data security training.
File Content
- File Name: Q3-Sales_Report.xlsx
- File Type: Excel Spreadsheet
- Content Summary: Contains sensitive client data, including contact information, financial performance, and internal notes on client strategies.
- DLP Policy Trigger: The file contains PII (personally identifiable information) and financial data flagged by the DLP system when an attempt was made to transfer it to a personal email address (johndoe.personal@gmail.com).
User Role Context
- John Doe's role as a Senior Account Manager requires access to sales reports and customer financials to manage client relationships and prepare proposals. However, the attempt to transfer this document to a personal email address suggests a potential violation of internal data handling protocols. His role does not justify sending sensitive data externally via unapproved methods.
Previous DLP Alerts for John Doe
- Frequency: This is the third DLP alert in the past 3 months.
- Pattern: Previous incidents have involved attempts to share sensitive data through unauthorized channels (cloud storage and personal email). Each alert has been associated with client-related financial information.
- Escalation: The pattern of behavior suggests either a lack of adherence to company policy or deliberate attempts to bypass security measures. Further investigation and potentially stricter disciplinary actions are recommended.
Recommendation
Given the recurrence of similar incidents and the sensitive nature of the data involved, it is advised to:
- Conduct a detailed investigation into whether this is intentional data exfiltration.
The output could be populated into the ticket, but if the output is inaccurate (due to bias or hallucination), it might be better to send the response to another medium. This assumes that tickets are considered a source of truth and immutable. Consider sending the output via email, a shared document, or Slack to the assigned analyst for review. This is an important step: a human should review the output from GenAI before any action is taken or the output is documented. Even with the time taken to review the output, significant efficiencies can be realized.
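That routing decision can be made explicit in code. In this minimal sketch, the review channel and ticket system are simulated with lists; in a real deployment `post_to_review` and `update_ticket` would be your chat and ticketing integrations, and `reviewed_ok` would come from an analyst's sign-off rather than a hardcoded flag.

```python
"""Sketch: keep the ticket untouched until a human approves the
GenAI summary. The callables here are hypothetical stand-ins for
chat (Slack/email) and ticketing integrations.
"""

def route_summary(summary, reviewed_ok, post_to_review, update_ticket):
    """Send the draft to a review channel; only write to the
    ticket after an analyst has signed off."""
    post_to_review(f"GenAI draft (needs review):\n{summary}")
    if reviewed_ok:
        update_ticket(summary)
        return "ticket updated"
    return "held for review"


review_channel = []   # stand-in for a Slack channel or shared doc
ticket_log = []       # stand-in for the immutable ticketing system

status = route_summary(
    "John Doe: third DLP alert in 3 months; recommend investigation.",
    reviewed_ok=False,                 # analyst has not signed off yet
    post_to_review=review_channel.append,
    update_ticket=ticket_log.append,
)
```

Because the ticket write is behind the `reviewed_ok` gate, an unreviewed (and potentially hallucinated) summary can never become part of the record of truth.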
As mentioned earlier, this work should be done using an LLM that is isolated to conduct SOC (or SOC-adjacent) work specifically. These alerts and data will train the LLM and improve its capabilities for the SOC, but if access is not restricted to the SOC, sensitive details could be exposed to people not authorized to see them.
There are many other applications for GenAI in a SOC, CFC, or infosec team. Stay tuned as we continue to explore how to leverage these tools to improve the work we do.
To learn more about concepts like practical applications of GenAI in a SOC, check out our SANS course, LDR512: Security Leadership Essentials for Managers. Learn more about the blog author and LDR512 instructor, Shawn Chakravarty, here.