Finding the Right Balance in Cyber Security Operations

February 15, 2017 | Views: 3792


Estimated reading time: 4 minutes

Security monitoring is a complex business, and if the organization has a global reach, the size and architecture of its network add to the complexity. Apart from hunting for threats that use known Indicators of Compromise (IoC), the logs generated by computer systems and networks can be examined for anything suspicious. However, the number of logs that the systems of a global organization generate daily is staggering. These logs have to be analyzed by a rule engine that converts them into incidents according to the logic each rule specifies. Security specialists then analyze these incidents to determine whether they are genuine incidents or false alarms. This is where the real challenge lies: how many incidents are generated and, more importantly, how many of them are actually True Positives (TP).

Every incident created by a rule has to be analyzed by security professionals, which consumes time and resources. It is therefore essential that each rule has a high detection (success) rate. If there are too many False Positives (FP), analysts can be overwhelmed by the volume, and actual incidents can slip through because they are not dealt with straight away. However, if the threshold is raised so that fewer false positives are generated, the rule engine may overlook something potentially malicious and classify it as harmless. Classifying an actual (malicious) incident as harmless is a False Negative (FN), known as a Type-II error; it could have very serious consequences and should be avoided at all costs. Conversely, flagging a harmless or benign incident as an actual incident is a False Positive, known as a Type-I error. It is less serious, as the most that can happen is that an analyst investigates it and rejects it. This is shown in the Figure below:

These parameters are calculated as:

  • True Positive Rate (TPR) = No. of Harmful Incidents Detected Correctly / Total No. of Harmful Incidents
  • False Positive Rate (FPR) = 1 - Specificity
  • where Specificity = No. of Benign Incidents Identified Correctly / Total No. of Benign Incidents
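As a minimal sketch of the formulas above, the snippet below computes TPR, Specificity and FPR from the four outcome counts of a single rule. The class name, function names and the tallies are hypothetical and chosen only for illustration; they are not taken from any real rule engine.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # harmful incidents detected correctly
    fn: int  # harmful incidents missed (Type-II errors)
    fp: int  # benign incidents wrongly flagged (Type-I errors)
    tn: int  # benign incidents identified correctly


def true_positive_rate(c: ConfusionCounts) -> float:
    """TPR = harmful incidents detected correctly / total harmful incidents."""
    harmful = c.tp + c.fn
    if harmful == 0:
        raise ValueError("no harmful incidents in the sample")
    return c.tp / harmful


def specificity(c: ConfusionCounts) -> float:
    """Specificity = benign incidents identified correctly / total benign incidents."""
    benign = c.tn + c.fp
    if benign == 0:
        raise ValueError("no benign incidents in the sample")
    return c.tn / benign


def false_positive_rate(c: ConfusionCounts) -> float:
    """FPR = 1 - Specificity."""
    return 1.0 - specificity(c)


# Hypothetical tallies for one rule over a review period (assumed numbers).
counts = ConfusionCounts(tp=45, fn=5, fp=50, tn=950)

print(f"TPR         = {true_positive_rate(counts):.2f}")
print(f"Specificity = {specificity(counts):.2f}")
print(f"FPR         = {false_positive_rate(counts):.2f}")
```

With these assumed counts, the rule catches 45 of 50 harmful incidents and wrongly flags 50 of 1000 benign ones. In absolute terms the analysts still see more false alarms than real incidents, which is why both rates have to be considered together.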

A balance has to be reached between the acceptable False Positive Rate (FPR) and the True Positive Rate (TPR). But what is the right balance? This is best understood with the help of a ROC curve. ROC stands for Receiver Operating Characteristic, and the curve was first used by electrical engineers to measure radar performance during World War II. The ROC curve is a plot of the True Positive Rate (success rate) against the False Positive Rate. In the figure below, the blue diagonal dotted line represents the characteristics of a random guess (a coin toss), with an equal probability of classifying an event as a malicious or a benign incident. The performance of any rule or method can be plotted on this graph once its True Positive and False Positive Rates have been calculated. As we move away from the diagonal line towards the top left corner, performance improves; as we move away below the line, towards the bottom right corner, performance worsens. The top left corner represents the ideal or perfect detection score, with zero false positives and 100% detection of targets. The bottom right corner represents no detection and all false positives. The ROC curve is shown below.
The TPR and FPR of each rule/method can be measured over a period of time and under various conditions, and the resulting points trace out a curve. It is clear that any method/rule must have its characteristic curve above the dotted line; otherwise it is useless as a rule and results in wasted effort. It can be seen that points A and B are better at detecting actual incidents than points C and D. Point A has a very high TPR and a very low FPR, whereas point B's TPR is higher than its FPR, which places it above the dotted line. Point C has a high FPR and a very low TPR, whereas point D has a very high FPR and a lower TPR; both have a lower TPR than FPR and therefore fall below the line.
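The comparison against the dotted line can be sketched in a few lines. Because the figure is not reproduced here, the (FPR, TPR) coordinates given to points A to D below are assumed values picked only to match the descriptions in the text, and the function names are hypothetical.

```python
def diagonal_margin(fpr: float, tpr: float) -> float:
    """Vertical distance of an operating point from the random-guess diagonal.

    Positive means the point lies above the line (TPR > FPR),
    negative means it lies below the line (TPR < FPR).
    """
    return tpr - fpr


def assess(fpr: float, tpr: float) -> str:
    """Classify a rule by where its operating point sits relative to the diagonal."""
    margin = diagonal_margin(fpr, tpr)
    if margin > 0:
        return "better than random"
    if margin < 0:
        return "worse than random"
    return "no better than random"


# Assumed (FPR, TPR) coordinates; the real values would be read off the ROC plot.
points = {
    "A": (0.05, 0.95),  # very high TPR, very low FPR
    "B": (0.40, 0.60),  # TPR higher than FPR
    "C": (0.70, 0.15),  # high FPR, very low TPR
    "D": (0.95, 0.80),  # very high FPR, lower TPR
}

for name, (fpr, tpr) in points.items():
    print(f"{name}: FPR={fpr:.2f} TPR={tpr:.2f} -> {assess(fpr, tpr)}")
```

Under these assumptions A and B come out above the line and C and D below it, which matches the reading of the figure given in the text.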

Finding the right balance in incident creation is crucial for running successful and effective Cyber Security Operations, and it comes down to how effective the rules are, i.e. how successfully they detect malicious events and differentiate them from benign ones. Whatever balance is chosen, the TPR should always be higher than the FPR for any method/rule.

  1. nice article..

  2. Nice! Enjoyed this article, especially the ROC Curve graph. More and more I need to see and “visualize” data. Thanks for this!

  3. Great information you have here.

  4. Very nice statistical analysis of threats over a medium to long period of time. I never thought about this before… We learn so much here eh?
