SIEM, SOAR and Security Analytics

Video Transcript

Lesson 4.2: SIEM, SOAR, and Security Analytics. The objectives for this lesson include understanding the best practices when using a security information and event management, or SIEM, tool; understanding the best practices when tracking events and incidents within your environment; discussing the best practices and importance of tuning alerts and the consequences of alert fatigue; understanding security analytics and how they impact the ability to detect and respond to a cyber incident; and, finally, how to use security orchestration, automation, and response, or SOAR. For a CSIRT, a SIEM tool is one of the most powerful tools you can have in incident response. In fact, I would argue that it's one of the first tools you should consider implementing if you don't have one already. When I came into a cybersecurity program once and was asked to rebuild the program, there was no SIEM tool in place whatsoever. Essentially, they were alerted on things through Linux scripts and some other custom things, but they had very little visibility into what was going on.

After investing in a SIEM tool, tuning it, and getting agents out to report information into the SIEM, we increased our visibility into endpoint and network activity and events by 1800%. The amount of information we were getting, and our ability to detect and respond to things, was completely different than before. A SIEM tool allows you to correlate and aggregate logs from a variety of sources. So just imagine having a database, if you will, of all the information coming into the SIEM tool from endpoints, firewalls, IDS sensors, devices, servers, and applications. You ingest all of that, and then you can correlate, aggregate, and search across all the events in order to find out what's going on and to build a timeline of activity. One thing to be very careful of, though, is the time settings in the SIEM tool. If you have a national or a global footprint and you've got systems and users coming in from different time zones, you can really cause problems, or at least limit your ability to easily find events with the SIEM tool.
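To make the time zone problem concrete, here is a minimal Python sketch of putting events from different sites onto one clock before building a timeline. The source names and timestamps are made up for illustration, and it assumes each log line already carries its UTC offset; a real SIEM tool would do this at ingestion with its own settings.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries an offset and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # A timestamp without an offset is ambiguous, so refuse to guess.
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc)


# Hypothetical events, each logged in the local time of its site.
events = [
    ("firewall-nyc", "2024-03-05T09:15:00-05:00"),
    ("ids-london", "2024-03-05T14:10:00+00:00"),
    ("server-tokyo", "2024-03-05T23:12:00+09:00"),
]

# Sorting on the UTC value gives the true order of activity.
timeline = sorted((to_utc(stamp), source) for source, stamp in events)
for when, source in timeline:
    print(when.isoformat(), source)
```

Read by local clock, the Tokyo event looks like it happened many hours after the others. Once everything is on one clock, all three fall within five minutes of each other.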

So I always recommend normalizing all of the time information into UTC or GMT, or Zulu time. They all mean the same thing, essentially Coordinated Universal Time, where everything is set to that. That way, when you search for an activity, it doesn't matter what local time zone those devices are in. It will search the same time across all the data that you have. That's one of the really important keys to a SIEM tool. Also, use checklists to ensure that data feeds, dashboards, and alerts are working daily. I remember one time, and it's one of the reasons I learned to do checklists, before we had them, we had firewall logs coming into our SIEM tool, which is a great idea. It gives you a lot of visibility, and you can really create some neat searches and dashboards based on that. But after a couple of days, we realized that we weren't getting our normal reports or alerts, and you never want to be in a position of thinking, "Hmm, I haven't seen such-and-such alert in a couple of days." That's the position we were in.

As it turned out, IT had upgraded the firmware on the firewall devices, and that firmware update changed the structure of the logs just enough that it broke all of our searches. So the data was actually still coming in to our SIEM tool, but our searches and fields weren't working anymore. As a result, we had to go in and, not customize really, but change our tags and fields in order to have those working again. This is a great example of the collaboration that IT and cybersecurity need to have. I mentioned earlier in this course that I was going to give some stories about that, and this is just one example of why security and IT really need to be joined well together and understand what's going on within each other's environment.

So if that had been done under a change management process, and cybersecurity was part of that process, it would have been easy to know ahead of time. But what this taught me was really the need to have a dashboard, if you will, or a checklist that says, OK, I expect to get data from endpoints, Windows systems, Mac systems, Linux systems, Windows servers, Linux servers, firewalls, IDS, all these different sources. And it takes only very easy, simple rules to see the last date and time that an event came in from each source or source type that you get information from. So I quickly created a checklist.
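The daily checklist described above could be sketched roughly like this in Python. The source names, the 24-hour threshold, and the field names are all assumptions for illustration; in practice the last-seen times and sample events would come from a SIEM search rather than a hard-coded dictionary.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(last_seen, now, max_silence=timedelta(hours=24)):
    """Return the source types whose newest event is older than max_silence."""
    return sorted(name for name, seen in last_seen.items() if now - seen > max_silence)


def broken_fields(sample_event, expected_fields):
    """Return the expected fields that are missing or empty in a parsed event."""
    populated = {key for key, value in sample_event.items() if value not in (None, "")}
    return set(expected_fields) - populated


# Hypothetical snapshot: when did each source type last send an event?
now = datetime(2024, 3, 5, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "windows_endpoints": now - timedelta(minutes=3),
    "linux_servers": now - timedelta(minutes=10),
    "firewalls": now - timedelta(days=2),
    "ids": now - timedelta(hours=30),
}
print("Silent sources:", stale_sources(last_seen, now))

# Hypothetical firewall event after a log format change: data arrives, fields do not parse.
firewall_event = {"src_ip": "10.0.0.5", "dst_ip": None, "action": ""}
print("Broken fields:", sorted(broken_fields(firewall_event, {"src_ip", "dst_ip", "action"})))
```

The second check matters for the firmware story: the firewall data never stopped arriving, so a last-seen check alone would not have caught it, while a check on expected fields would.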

I made it part of our daily processes within the IR team to go through the SIEM and make sure that we were getting the data that we expected and that the fields and tags were still working like we hoped. So that's just an anecdotal, real-life story, but something you might consider. And then along those lines, you really need to have somebody own the SIEM tool. They need to be the system owner, the person that's responsible for making sure it's patched and updated, that all the alerts are working, and that any maintenance or operations that need to be done on it are, in fact, done. Along the lines of SIEM, there is another thing to consider that's also very important, and that's how you track events and incidents going on in the environment. You need some way to track what's happening. Who are the repeat offenders? What are the IP addresses that keep coming up? Do you have historical information on users and systems that you may need to reference during another incident response?

So you should be tracking incidents and events. Remember, the definitions are critically important, so make sure you have those defined. What are the sources of alerts? One thing that I used to track was the cost per alert and the false positive rate. So, for instance, if you have a device that you get two alerts a month from, but you're paying $125,000 a year for this device, you can quickly do the math and figure out what you're paying per alert. Now, if you layer on top of that the false positive rate, and you find out one out of every two is a false positive, that's a pretty expensive tool to be paying for. You're not getting a lot of return on your investment. But you also need to look at: is it tuned right? Are we using it right? Is it implemented correctly? So it's not just this, but it is something to look at, and it is a way to refine your security portfolio and make sure you have good, actionable information from tools that are solid. If you have a tool that's giving a lot of great intel, and you realize you could use it more if you just had the time and money, maybe you scrap one of the tools that isn't as effective and invest more in the one that is.

Again, make sure you're tracking IP address information historically, and that you're able to add commentary and workflow for incident responders: assigning tickets, making sure that people are working on the right things, that you've got them triaged, and that you know what the comments are, the dispositions of investigations, et cetera. And then notifications of assignments. If I, as the CSIRT manager, assign you as an IR member to go do forensics on something or to be the lead investigator, I should be able to annotate that and have the tool automatically send you a notification that you have a case assigned to you. One thing I will say back on this slide here is there's a link at the bottom of the graphic I have for SCOT, the Sandia Cyber Omni Tracker. It's an open source project by Sandia National Labs, part of the Department of Energy and National Nuclear Security Administration's lab system.
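The cost-per-alert arithmetic mentioned above can be worked through in a few lines of Python. The function name is hypothetical, and the figures are the ones from the example: a $125,000-a-year device producing two alerts a month, half of them false positives.

```python
def cost_per_alert(annual_cost, alerts_per_month, false_positive_rate=0.0):
    """Annual tool cost divided by the alerts per year that are not false positives."""
    alerts_per_year = alerts_per_month * 12
    useful_alerts = alerts_per_year * (1 - false_positive_rate)
    if useful_alerts <= 0:
        raise ValueError("no useful alerts, so cost per alert is undefined")
    return annual_cost / useful_alerts


# Figures from the example in the lesson.
raw = cost_per_alert(125_000, alerts_per_month=2)
useful = cost_per_alert(125_000, alerts_per_month=2, false_positive_rate=0.5)

print(f"Cost per alert: ${raw:,.2f}")
print(f"Cost per true alert: ${useful:,.2f}")
```

That works out to roughly $5,208 per alert, and about $10,417 per alert once the false positives are removed. As noted above, this number is only one input; tuning and implementation still need a look before a tool is scrapped.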

The link is there to get SCOT. It's a free tool for managing your incidents and doing all the things that I just discussed. Also, be careful about the amount of noise that your analysts are exposed to, and by noise, I just mean alerts and frequent alerts, especially if you know you have high false positives. We see this all the time. It's no different than the fire department or the police department going to a building that continuously has false fire alarms or false burglar alarms. After a while, people either don't respond at all or they get lazy. Instead of sending five engines, let's just send one to go check it out. Well, one of these times you show up and the whole building's on fire, and now you have to wait for everyone to show up. The same is true for alerts.

You have to be careful. Make sure you have that owner of the SIEM tool making sure that alerts are valid and that, if they need to be tuned to cut out false positives, you're constantly doing that, because you don't want an analyst to just say, "I've seen that alert 100 times. I'm not going to worry about it." If we look at the Target breach a few years ago, this is what happened. There was a FireEye alert, and it was ignored because it had been a problem in the past. And there actually was a breach, obviously. So make sure that you're taking that into consideration. Look for things like poor prioritization and loss of focus, if you're seeing that from analysts, or also just inefficient systems or engineering problems that you can resolve to make the information for your analysts more actionable. With security analytics, there are a few things to look out for when you have somebody trying to sell you on artificial intelligence or machine learning for cybersecurity.

There are certainly some vendors out there that claim that, and there are some tools that do a pretty good job with some of this. But I want to walk you through the four major items to be aware of from a security analytics standpoint. The first type is descriptive analytics. This is really what happened in the environment. Information for descriptive analytics may come from an intrusion detection system, a SIEM tool, or threat intelligence. The next type is diagnostic: why did this happen? This could be network traffic analysis or user and entity behavior analytics, things like looking at full packet captures or traffic analysis to determine why a certain event was either allowed to happen or actually did occur. Predictive analytics are about what will happen, and fraud detection is a great example of this. You have probably been the recipient of some sort of message, a text message or an email from your bank, saying, "Is this a legitimate charge?

Click this for yes or click this for no," or something along those lines. "Did you actually log in to this account?" It's looking for certain signs that don't look right, and it's able to then detect and predict that there might be fraudulent activity about to occur. And then, finally, prescriptive: what should I do about it? This is where a device or software tool takes action on its own. One example of this is SOAR, security orchestration, automation, and response, where it's already been programmed with playbooks, and if it sees certain activities, it takes action automatically. There are several approaches for anomaly detection in a CSIRT that I'll go through. One is rule based, and we all know this one. Does it match an IDS rule? Does it match an antivirus signature? Do we have threat intelligence on this? Are there statistics to help us determine whether or not it's possible for an account to log on 30 times within two minutes from five different workstations, for example? Historical analysis is another way to get data and identify anomalies.
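The logon statistic mentioned above, 30 logons within two minutes from five different workstations, can be expressed as a simple sliding-window rule. This is only an illustrative Python sketch: the function name, the event tuple layout, and the account and workstation names are assumptions, and a SIEM tool would normally express this as a correlation search instead.

```python
from collections import deque
from datetime import datetime, timedelta


def logon_burst(events, window=timedelta(minutes=2), min_logons=30, min_hosts=5):
    """Flag accounts with at least min_logons logons from at least min_hosts
    distinct workstations inside any time window of the given length.

    events must be (timestamp, account, workstation) tuples sorted by timestamp.
    """
    recent = {}  # account -> deque of (timestamp, workstation) still inside the window
    flagged = set()
    for timestamp, account, workstation in events:
        queue = recent.setdefault(account, deque())
        queue.append((timestamp, workstation))
        # Drop logons that have aged out of the window.
        while timestamp - queue[0][0] > window:
            queue.popleft()
        hosts = {host for _, host in queue}
        if len(queue) >= min_logons and len(hosts) >= min_hosts:
            flagged.add(account)
    return flagged


# Hypothetical data: one account hammering five workstations, one behaving normally.
start = datetime(2024, 3, 5, 8, 0, 0)
noisy = [(start + timedelta(seconds=4 * i), "svc_backup", f"ws{i % 5}") for i in range(30)]
normal = [(start + timedelta(minutes=10 * i), "josh", "ws1") for i in range(5)]

print(logon_burst(sorted(noisy + normal)))
```

Only the account that crosses both thresholds inside the window is flagged; the account with a handful of logons spread over 40 minutes is left alone.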

Things like activity scoring. Josh normally logs in, let's say, from California. But today we see him log in from Texas, and then within an hour of logging in from Texas, he logged in from North Dakota, and there's no way to get from Texas to North Dakota in an hour, so this can't be legitimate. His account must be compromised. Things like that. Machine learning is the third type, and it's really about learning the system, learning based on the information you give it, and making decisions based on that. Is this good? Is this bad? So there are detection algorithms, and then supervised and unsupervised learning within machine learning. There's a range of analytics to look at for incident responders. First is known knowns. These are, for example, lists of websites, DNS entries, and IP addresses that are good, that are OK for the organization to go to, as well as correlation rules and IP address or URL matches. These are things that we just know because we have the information. We know bad DNS names. We know bad IP addresses. We have lists of devices that are OK on the network, that sort of thing. Then we have known unknowns.

This is where machine learning can help us out, and within that we really have two categories of machine learning. Supervised is usually what we see with vendors now, and these are known bads. We've taught the system: if you see these kinds of activities, that's never a good thing. Unsupervised is really looking through a bunch of data to try to find anomalies within that data, because the system learns, "I normally see Josh do this, this, and this, and I've watched him do that for the last month. But now this one time he did something different. Let's flag that for review." So you're not teaching it that something is for sure bad, but you are teaching it, or it's learning on its own, that something's different here and you need to take a look. Then you just have unknown unknowns. This is where threat hunting can help you, but also some deep learning software, labeling data as you learn what it might be.

And there are models that can help you extract information. But across all these types of analytics, and especially on the unknown unknown side, it's not very mature, and it would typically fall into things like artificial intelligence to help you determine what might be bad versus what's OK activity on the network. Now, I've talked about SOAR a couple of times. Usually this is something we just see in really mature organizations. But SOAR allows you to integrate with other security solutions, so there are typically API calls between different devices on your network. It can push or pull, and there's also the ability to abstract tools, so it doesn't require analysts to be experts in everything. So if you have multiple tools available to you, firewalls and gateways and proxies and IDS/IPS and SIEM, having SOAR can help make things a little bit easier when you're trying to take action. It can help standardize your processes within incident response, and it also gives you the ability to create playbooks.

So, some quiz questions. How would a SIEM assist a CSIRT in detecting and responding to a cyber incident? A, correlation and aggregation of logs from a variety of sources; B, full automation of the IR process; or C, scanning hosts for vulnerabilities. The answer to this is A: correlation and aggregation of logs from a variety of sources is how it would help incident response teams. Next question. Why is alert fatigue a concern for IR teams? A, alert fatigue is not a concern. B, it may cause too many analysts to look at the same event. C, analysts may begin to ignore alerts because of too many false positives and miss an actual incident. The answer to this is C. I mentioned the example of the fire department not going to fire alarms anymore, or only sending a single engine, and finding out that, in fact, this time the building really is on fire. The same goes for incident responders.

We need to make sure that people take alerts seriously and that, if there's a problem with multiple false positives, somebody's tuning that rule to get rid of the problem. So, in summary, in this session we talked about the best practices when using a SIEM tool, best practices when tracking events and incidents, the importance of alert tuning, and the consequences of alert fatigue. I talked to you about security analytics and how they may impact the ability to detect an incident, and a little bit about SOAR, the security orchestration, automation, and response used by CSIRTs, and how you can set it up to talk to multiple different tools and actually take action. So, for instance, if you had a network device that all of a sudden was reaching out to a known bad domain, SOAR could automatically take that device, move it to a different virtual LAN, and limit its communications, all without any interaction from an analyst whatsoever.
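The quarantine example above can be sketched as a tiny playbook in Python. Everything here is hypothetical: the domain list, the VLAN number, and the `FakeSwitch` class stand in for whatever threat intelligence feed and network device API a real SOAR product would call.

```python
KNOWN_BAD_DOMAINS = {"malware-c2.example"}  # hypothetical threat intelligence list
QUARANTINE_VLAN = 999  # hypothetical restricted VLAN


class FakeSwitch:
    """Stand-in for a network device API that a SOAR tool would integrate with."""

    def __init__(self):
        self.vlan_by_device = {}

    def move_to_vlan(self, device, vlan):
        self.vlan_by_device[device] = vlan


def quarantine_playbook(dns_event, switch, audit_log):
    """If a device looked up a known bad domain, move it to the quarantine VLAN.

    Returns True when action was taken, so the case can still be reviewed later.
    """
    domain = dns_event["domain"].lower().rstrip(".")
    if domain not in KNOWN_BAD_DOMAINS:
        return False
    switch.move_to_vlan(dns_event["device"], QUARANTINE_VLAN)
    audit_log.append(f"quarantined {dns_event['device']} after lookup of {domain}")
    return True


switch = FakeSwitch()
audit_log = []
quarantine_playbook({"device": "laptop-42", "domain": "Malware-C2.example."}, switch, audit_log)
quarantine_playbook({"device": "laptop-7", "domain": "intranet.example"}, switch, audit_log)
print(switch.vlan_by_device)
print(audit_log)
```

Keeping an audit log of each automated action is a design choice in this sketch, not something the lesson prescribes; it gives the incident tracker a record of what was done without an analyst involved.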

Intermediate

Course link:

Incident Response Lifecycle

Want to build on your foundational knowledge of cybersecurity incident response? By taking this Incident Response Lifecycle course, you will learn how to prepare an incident response plan, triage and categorize events, recover from cyber incidents, and effectively report on incidents to senior executives.

Instructed by

Josh Moulin

Since 2003, Josh Moulin has been helping people and organizations, ranging from small businesses to Fortune 500 companies and the U.S. military, understand complex technology and cybersecurity challenges.