March 6, 2017
Security Plus - Compliance and Operational Security
Compliance and Operational Security 2.1
Based on CompTIA’s list of Security + exam objectives (their PDF list of domains is found here: ), this article covers the second domain, Compliance and Operational Security 2.0, with its first sub-heading (2.1).
This document is only an add-on to your normal avenue of Security + training. Here I try to use plain language and examples to help you understand the concepts, rather than diving into the technical details (which are a must for a security practitioner).
Best wishes on your exam!
2.1 Explain the importance of risk related concepts.
• Control types
Security thrives on control. There are three types of controls to know. No single control is sufficient on its own; you need multiple layers of control, called Defense in Depth.
- Technical
These controls can be hardware or software and are logical and technological. A physical lock is not considered technical, but encryption is. Firewalls and IDSs are technical, too. Technical may sound intimidating, but it's really just the stuff you use all the time to keep your network, data, and users safe and secure.
- Management
The focus is on decision-making and managing risk, typically by the organization's managers and executives. There's a lot of risk assessment, planning, written policy creation, and project management that goes on. This needs to happen because they're the business leaders, and IT needs to support the business.
- Operational
These controls are used by people (typically not the managers) and, unlike Technical controls, are not automated. They include incident response, personnel security, and training. While they're not automated, these controls can certainly have alerts set up so it's not completely manual.
These controls are needed because the Technical controls don’t know about people and can’t turn on a dime, and Management is busy dealing with the overall business needs.
Operational control includes your own training by studying for Security Plus!
• False positives
Ever felt something crawling on your arm, then you realize it’s just a stray thread from your shirt? That’s a false positive – an indicator that something’s wrong when, in reality, all is well.
An example in IT would be apparent intrusive traffic on your network. I saw that once and wondered what was going on; with further investigation, it turned out to be one of our devs running tests.
The thing with false positives is that they show up on your radar – there’s an indicator. Not so with the next point.
• False negatives
Now that you’re used to stray shirt threads (see the previous point), you may not notice that you actually do have something crawling on your arm.
In IT, you don’t always get an indicator. Because it doesn’t show on your radar, it doesn’t get caught. This is where the importance of log monitoring, a form of Operational Control, comes in. In the end, a human needs to view the data to see if it’s really a threat.
Here is where data exfiltration can take place. If the level of exfiltration is so low that it appears to the Technical Control as normal traffic, it won’t be picked up.
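To make the idea concrete, here's a minimal sketch (with made-up numbers) of how a simple threshold alert produces a false negative against low-and-slow exfiltration:

```python
# A hypothetical per-hour egress alert: fire only when hourly outbound
# traffic exceeds the threshold.
ALERT_THRESHOLD_MB = 100

# The attacker trickles a little data out each hour, staying under the radar.
hourly_egress_mb = [8, 9, 7, 10, 8]

# No single hour crosses the threshold, so no alert ever fires...
alerts = [mb for mb in hourly_egress_mb if mb > ALERT_THRESHOLD_MB]

# ...yet a meaningful amount of data still left the network.
total_mb = sum(hourly_egress_mb)
print(alerts, total_mb)  # [] 42 -- zero alerts, 42 MB exfiltrated
```

This is exactly why a human reviewing the logs matters: the totals tell a story that the per-event alerts miss.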
• Importance of policies in reducing risk
If you don't write it, it doesn't exist! Your organizational leadership will determine which policies you have, but there are some that you need to be aware of. The legal need for the policies depends upon your company's industry and how competitive they want to be, but from a computer security standpoint, you need as many as you can put in place.
- Privacy policy
This is where a company defines how to handle information generated by employees, clients, customers, etc. You'll see it as something like this: "Your privacy is important to us. We won't sell your information to any third party, nor will we disclose your information unless you've consented…" If you have a website that takes any kind of information from people, then you need this.
Be familiar with: PII (personally identifiable information)
- Acceptable Use
Aka AUP, this is found in pretty much every employee handbook. It stipulates how an employee may use the hardware and software to which they have access. Some examples: if you get a smartphone, you may and may not do such-and-such; if you use social media, you may not speak negatively about our company or competitors; if you use the internet, don't do anything illegal; if you use e-mail, you may not mail bomb anyone.
- Security policy
High-level and strategic, this type of document outlines how the organization protects its data, network, equipment, and people. An example is a privileged password policy that outlines who is allowed to have privileged access and what the password requirements are; these requirements are different from the typical company-wide user password policy, which is also a security policy.
- Mandatory vacations
This isn't simply or primarily for the employee's rest, but for the company's protection! Not every company can or does enforce this, but making people take time off is another layer of securing the network. If someone is doing something wrong, the time off can help reveal who is doing it. With this policy, the mandatory time off is typically one or two weeks.
- Job Rotation
This ensures that someone can't 100% hide their activities. It's not done haphazardly but systematically. And it's done on purpose, not just to pick on a particular person. Job rotation, mandatory vacations, and separation of duties are good ways to prevent issues and track down troubles.
- Separation of Duties
Here you separate the critical business functions (e.g., print checks and sign checks). Lots of authority carries the temptation, or possible coercion, to do wrong things. In SMEs, it's common to have people doing multiple duties. And in sole proprietorships and very small businesses, there's no way around having a few people do many things or everything. But the larger the corporation is, the more it is able to reduce the number of people who have multiple roles.
- Least Privilege
This goes across the board (and it can upset people!). The CTO doesn't get access to your HR files. Each manager doesn't get access to Accounting files. And a manager doesn't get access to another manager's documents. With LP you get only the permissions you need to do your job. It might be annoying, but in reality it shields you from any kind of defamation, where someone may say, "Alice is a receptionist, and I bet she looked at everyone's yearly reviews." Least privilege protects everyone.
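As a toy illustration, least privilege can be thought of as an explicit permission check: a role is allowed an action only if that action was deliberately granted. The roles and actions here are hypothetical:

```python
# Hypothetical role-to-permission mapping: each role gets only what its
# job requires, nothing more.
permissions = {
    "receptionist": {"calendar:read", "visitors:write"},
    "hr_manager": {"calendar:read", "reviews:read", "reviews:write"},
}

def allowed(role: str, action: str) -> bool:
    """Grant an action only if the role was explicitly given it."""
    return action in permissions.get(role, set())

# Alice the receptionist never had review access, so she's in the clear.
print(allowed("receptionist", "reviews:read"))  # False
print(allowed("hr_manager", "reviews:read"))    # True
```

The default-deny shape of the check (unknown role or action means "no") is the important part.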
A VERY important point (not for the test, but for real life): once you implement any or all of these, it's VITAL that you keep them up-to-date. It's one thing to tell the auditors or attorneys, "We don't have that policy." It's another thing entirely to tell them, "We have that policy, but we don't A. abide by it, B. monitor it, or C. update it." While policies are important, it's better not to have one than to have one that's not applied. So 1. be careful how you craft your policies, and 2. remain diligent in keeping up with them.
• Risk calculation
Your money is at risk. Your ideas are at risk. Your designs are at risk. Your data – all the stuff that’s written by, about, and for you – is at risk. So it goes beyond just money. Here are ways to calculate the value of your company’s data. Why is this important? In security, you can help the business in a couple ways with these calculations: you can help show your directors/executives/board: A) what such-and-such data/asset is worth (they may be shocked!) and B) show them rough numbers that “if we spend X amount of money, we have a much better chance at protecting Y and Z, and X is a lot cheaper than losing Y and Z.”
- Likelihood
This is the imprecise probability that a risk will occur. E.g., it's more likely that your data center will have at least one power interruption, however minimal or noticeable, in one or two years than it is that all 100 of your computers will die at once. Not precise, but important to think about.
- Impact
This is the amount of damage done when a risk actually takes place (called "when a risk materializes"). As an example: if 10 of your computers failed right now, how much would it cost to replace them (also think of things like the cost to ship overnight)?
- SLE
Single Loss Expectancy equals your Asset Value x your Exposure Factor (AV x EF). AV is the dollar value that the asset is worth. EF is how much of that asset will actually be lost if a risk materializes. E.g., you make an average of $10,000/hour with your websites (AV). If one of your websites goes down for an hour, you'll lose 50% of your hourly gain (EF). So your SLE is $10,000 x 50%, or $5,000.
- ARO
The Annualized Rate of Occurrence is where you calculate how often something might happen in a year. Will at least 2 of your web servers go down one time each every year?
- ALE
Your Annual Loss Expectancy (ALE) equals Single Loss Expectancy (SLE) x Annualized Rate of Occurrence (ARO). Keeping with the web servers idea: your SLE is $5,000, and your ARO is 2, so your ALE is $10,000. It might be worth it to have 1 or more web servers handy to greatly reduce the downtime.
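The SLE and ALE math above can be sketched in a few lines of Python, using the article's hypothetical website numbers:

```python
def single_loss_expectancy(asset_value, exposure_factor):
    """SLE = AV x EF: the expected dollar loss from one occurrence."""
    return asset_value * exposure_factor

def annual_loss_expectancy(sle, aro):
    """ALE = SLE x ARO: the expected dollar loss per year."""
    return sle * aro

# Websites earn $10,000/hour (AV); an hour of downtime loses 50% of that (EF).
sle = single_loss_expectancy(10_000, 0.5)  # $5,000 per incident
# We expect two such outages per year (ARO).
ale = annual_loss_expectancy(sle, 2)       # $10,000 per year
print(sle, ale)
```

That $10,000/year figure is exactly the kind of number you can put in front of management when arguing for a spare server.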
- MTTR
Mean Time To Repair (or Restore). This covers two related ideas: the time an asset will be out of service for repair each time, and the time it takes to get back to normal. It can be fairly easy to estimate. E.g., if you have a spare laptop in your IT closet, and you always have it prepped, then when someone in your office has a laptop fail, your MTTR might be about 15 minutes – as long as it takes to get that prepped laptop renamed and hooked in.
- MTTF
Mean Time To Failure. This is the average time that something takes to fail (maybe not completely, but enough to degrade performance). It is, of course, an estimate, but estimates are necessary for planning ahead. A personal example is a laptop. In my experience, laptops get about 2 years of use before something (e.g., hard drive, optical drive) goes bad. It could be 1 year, or it could be 3, so I estimate 2 years. Another example is how long a network device (e.g., router) will go before it needs to be replaced. You might not need to have 1 hot device and 1 cold device right away, but maybe, based on experience and research, order a replacement in a year.
- MTBF
Mean Time Between Failures (MTBF) is used to estimate how often to expect a failure across all of your devices. Maybe you expect one item to go out next year, but you also have a device that's been in place almost a year and could need replacing in 6 months. Then add the other devices. If you have a lot of devices, you might end up with an MTBF of 3 weeks, 3 days, 3 hours, or 3 minutes.
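Here's a rough sketch of estimating these averages from made-up incident records; in real life you'd pull the numbers from your asset or ticketing system:

```python
# Hypothetical failure history for one device class:
# each tuple is (hours running before failure, hours spent repairing).
incidents = [(700, 4), (650, 6), (750, 5)]

# MTTF: average running time before a failure.
mttf = sum(run for run, _ in incidents) / len(incidents)

# MTTR: average time spent repairing after a failure.
mttr = sum(fix for _, fix in incidents) / len(incidents)

# For a repairable system, one full run-then-repair cycle is the
# time between failures, so MTBF = MTTF + MTTR.
mtbf = mttf + mttr

print(mttf, mttr, mtbf)  # 700.0 5.0 705.0
```

Even with only three data points, this gives you something concrete to plan spares and replacement orders around.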
• Quantitative Analysis vs. Qualitative Analysis
Qualitative – Qualitative Analysis involves a lot of subjectivity: you rate each risk's likelihood and impact on a simple scale (low, medium, high). When you assess something as a high risk, that's when you definitely need to move to Quantitative Analysis for those items/areas.
Quantitative – Quantitative Analysis is objective, usually using a dollar amount. However it goes, its goal is to provide an objective numerical value. You can see with a little calculation that it's very hard, if not impossible, to give numbers to many risks. E.g., what are all of those spreadsheets worth? What will it cost to replace the building next year? Whose data is more valuable – HR or the Executives? What about replacing the servers, since that model is no longer available, and models and prices will certainly change even by the end of the year?
• Vulnerabilities
These are weaknesses in security controls, and they are great in number. Lack of up-to-date patches, an open door, employees clicking on phishing links, default or no passwords on devices, an easily compromised motion-detection-operated door, non-segregated guest wi-fi, USB drives planted by bad guys – the list goes on and on. And it's your job to fix it all! You can use things like port scanners and network mappers to identify weaknesses.
Be familiar with these concepts: port scanners, network mappers, protocol analyzers, password crackers, vulnerability scanners
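As a taste of what a port scanner does under the hood, here's a bare-bones TCP connect scan. The host and ports below are placeholders, and of course you should only scan systems you're authorized to test:

```python
import socket

def scan(host, ports, timeout=0.5):
    """Return the subset of ports that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an error.
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Example (placeholder target): scan("127.0.0.1", [22, 80, 443])
```

Real tools like nmap add stealthier scan types, service fingerprinting, and timing controls, but the core idea is this simple connect-and-check loop.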
• Threat vectors
These are the channels that a bad guy uses to get to you. They don't have to be weak (those are vulnerabilities); they're just the possible avenues by which an attacker can reach you. Two examples are your mobile device (via SMiShing) and email (via phishing). It could also be tailgating.
• Probability/threat likelihood
Once you’ve identified the threats, how likely is it that those threats will materialize?
• Risk avoidance, transference, acceptance, mitigation, deterrence
Avoidance – If you want to avoid trouble with personal devices, then specify that no personal devices are allowed. Have you ever heard of those restaurants that were sued, in years past, because really hot drinks spilled on people and burned them? Some of those restaurants later avoided that risk by no longer serving hot drinks.
Transference – an example is insurance (e.g., cyber liability insurance) – if there’s trouble, then you let someone else pay SOME of the damages. I emphasize SOME because no one can cover it all. Things like deductibles help make people take care of their things a little more responsibly.
Acceptance – This is risk that you have (thoughtfully) decided to retain. As a simplified example: If paying for insurance for your computers over 3 years is $500,000, but replacing them would only cost $300,000, then you might accept the risk (i.e., not buy any insurance) and be willing to save up and pay $300,000. This is where your ALE calculation can come in handy. If you expect that you might have 1 big loss every 5 years, then it’s a calculated risk (it’s not a gamble if you’ve thought it through carefully) that retaining or accepting the risk will be the better deal.
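The accept-versus-transfer comparison above can be sketched with the article's hypothetical numbers, assuming one big loss about every five years:

```python
# Hypothetical figures from the example above.
insurance_3yr = 500_000   # cost of insuring the computers for 3 years
big_loss = 300_000        # cost to replace everything after one big loss
aro = 1 / 5               # one big loss expected every five years

# ALE tells you the expected loss per year if you retain the risk.
ale = big_loss * aro                 # $60,000 expected loss per year
annual_premium = insurance_3yr / 3   # ~$166,667 of premium per year

# Accept the risk when the expected loss is cheaper than transferring it.
decision = "accept" if ale < annual_premium else "transfer"
print(round(ale), round(annual_premium), decision)
```

With these (made-up) numbers, retaining the risk and saving toward the replacement cost is the calculated, better deal.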
Mitigation – This is reducing the likelihood or impact. To mitigate flooding, find a data center on high ground; to mitigate someone getting data off of stolen hard drives, use drive encryption. To mitigate network attacks, your firewall’s inherent Deny All works well, forcing you to specify what ports are opened and what traffic is allowed.
Deterrence – This is very much like Mitigation above, but the difference is that it's an obstacle meant to scare people off – something like a guard dog or a barbed wire fence.
• Risks associated with cloud computing and virtualization
There’s a popular saying: “The cloud is just someone else’s computer.” Think through what it means to have your data, your hardware, and your software in someone else’s hands. You’re placing a lot of trust in that third-party, so you need to make sure that you have assurances that they’ll keep those things as safe as possible.
How reliable is that cloud host regarding recovering from downtime (e.g., when a controller goes bad)? What about outages (power, internet connection from your office to theirs)? How do they protect from hackers/crackers (someone gaining access to the hypervisor)? How often does that third-party perform updates to hardware/software, firewall, IDS/IPS, and other things? Do you have any control over the firewall or IPS? It’s possible that the third-party has access to your encryption keys. You’ll want to look for things like ISO 27001 compliance when looking at cloud security.
• Recovery Time Objective and Recovery Point Objective
The RPO – Recovery Point Objective: How much data can your org lose before it cannot recover from disaster? It's part of your business continuity plan, and it measures the amount of data lost. E.g., if you backed up your data at 2 AM, and the disaster happened at 2 PM, then you lost everything generated between those hours. Is that allowable? It depends on the size of your org, but someone needs to decide how many lost hours of data are acceptable.
Think of it this way, sort of a time-warp idea: At what point in the past is your org able to restore its data to in the present without drastically affecting its business?
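As a small sketch, here's the 2 AM backup / 2 PM disaster example checked against a hypothetical 4-hour RPO:

```python
from datetime import datetime

# Hypothetical: management decided at most 4 hours of data may be lost.
rpo_hours = 4

last_backup = datetime(2017, 3, 6, 2, 0)   # last good backup at 2 AM
disaster = datetime(2017, 3, 6, 14, 0)     # failure hits at 2 PM

# The data generated between the backup and the failure is gone.
lost_hours = (disaster - last_backup).total_seconds() / 3600
print(lost_hours, lost_hours <= rpo_hours)  # 12.0 False -- the RPO was missed
```

The fix here isn't code; it's scheduling backups frequently enough that the worst-case gap fits inside the RPO.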
The RTO – Recovery Time Objective – How long should it really take for your org to fully recover from a failure? This is different from the MTTR (mean time to restore/repair). MTTR is just an average for, typically, a single device (e.g., your web server). MTTR is part of disaster recovery, while RTO and RPO are part of business continuity*. In business continuity, your org needs to set a maximum number of hours or days by which everything will be back up and running. It might occur in phases (remember, there are all kinds of scenarios): CS available in 15 minutes, email up in 30, management at home and online in 45, vital systems and data recovered in 4 hours, etc.
*Here's a good article that describes the differences between Disaster Recovery (DR) and Business Continuity Planning (BCP): https://www.isaca.org/Groups/Professional-English/business-continuity-disaster-recovery-planning/Pages/ViewDiscussion.aspx?PostID=72