Problem Management

Description
This unit covers problem management. Problem management is about responding to issues in a timely manner, following defined procedures such as escalation. Problem management is needed to manage the following:
  • Procedures vs actual work
  • Inefficient and ineffective controls
  • Acceptable use policy (AUP) violations
  • Job Accounting
  • Training
This unit also covers the incident handling process, focusing on the phases, who is involved, and the IS auditor's role in handling incidents. Participants will also learn about digital forensics, which examines computer data in a laboratory.
[toggle_content title="Transcript"]
Now let's move on to problem management. This is basically the idea that when issues arise, which they always do, the organization can resolve them in a timely manner. The more mature the organization is, and the more experience it has with resolving various problems, the more efficiently it should be able to deal with problems when they crop up.

So when do we need problem management? It's needed at all times, obviously, but sometimes more than others. We want to think about analyzing our procedures, for one thing. If you've gone to the trouble to document the job responsibilities and roles, what those people do and how they do it, we want to make sure that there's good alignment between what was documented and what actually gets done. If you've got great procedures but people aren't actually doing the job as it's described, then there's a disconnect. There's some failure there that needs to be analyzed to understand why this happened and what to do about it.

Sometimes we implement controls that are inefficient or ineffective, which might only be discovered during some kind of risk analysis or an audit. When controls are discovered that are not meeting the objectives they were designed for, then something needs to be done. So having a problem management methodology is where we would want to point ourselves.

What about acceptable use policies? A violation could be something as simple as someone spending too much time surfing the Internet for their own personal interests, reading personal email, wasting time on Facebook and so on. When those policy violations happen, what is the result?
What are the consequences for the violator?

We need to think about job accounting. This is a job as in a processing job, not as in someone's role. So if we're running a job like payroll, for instance, and it doesn't complete or there are problems with it, how do we deal with those exceptions? This ties in with the exception report, of course. We also want to think about how the organization responds when there's an issue during one of its critical jobs or critical processes.

Of course training can't be underestimated or ignored. We need to know that all of our staff are trained properly for the roles that they find themselves in. If there's job rotation then training needs to address that as well. You can't expect somebody to rotate into a different position that they haven't been trained for. So that's a known quantity to plan for in training budgets and timelines. One of the most important types of training that everybody gets, regardless of their job, is security awareness. This is vital for everyone on the organization's staff. Everybody needs to be aware of the different pitfalls and problems that come and go in the normal course of doing business.

What about incident handling? We've got an incident response team. Most organizations do. We have to think about whether or not that team has the right form and function. It could be a centralized team, if the organization is of a certain size. That might make the most sense. You might have various offices scattered around the country or even the globe, but your incident response team might be based at your headquarters. Or you could have a decentralized team. So each office might have its own incident response team, and then they all report to the central management area, or the headquarters, if you will. There needs to be coordination and good communication.

Overall, when there's an incident, we have four main phases of activity.
We start with preparing for the incident response, getting as much information as possible about what needs to be done, what the options are, what the contingency plans are, and so on. Then we move to phase two, where there is detection of an incident and some initial analysis is done. It could be that the detection is done by a user and not by some automated monitoring program. In either case, once the incident is known, the analysis begins.

In some cases the incident needs to be contained: in the event of a worm or virus outbreak, for instance. You want to isolate that system and make sure that the problem doesn't spread to other areas of the organization. Then, once it's isolated, we work to eradicate or mitigate the problem and then try to recover from the issue as quickly as possible.

Lastly, we can't forget about our lessons learned, or post-incident activities. If it's a larger incident especially, the organization might have several meetings discussing what happened. Did we perform as expected? Where are the areas where we could use some improvement? Organizations that do this step properly will continue to grow and mature, and will use their resources more efficiently in the future.

So why would an auditor be interested in incident handling? One of the most important things to think about right off the bat would be how long it took to respond. When the problem was discovered, how long did it take before the help desk was notified? How long from that point until the first level of support was notified? What about your first-level engineers, second-level support, or third-level developer support? These are all different metrics that might be generated in the course of dealing with an incident. Seeing the trends over time helps the auditor understand whether the business has the right policies, procedures and training in place.

What about the members of the incident response team?
Are they formally appointed, or is it an ad hoc group that just comes together when needed? Do they have professional training for their different roles, or is everyone just sort of a subject matter expert on the needed areas, doing the best they can? These are questions that need to be answered in order for the auditor to fully understand the incident response team's abilities and capabilities.

Moving on to digital forensics, this is an important capability for an organization, especially when there are problems that relate to hacking, or incidents that happen due to disgruntled internal employees, privileged insiders and the like. When this happens, one of the first things we think about is the acquisition of information. We want to get it from the best possible sources, meaning that we want first-hand, direct evidence, not second-hand or indirect evidence.

We have to pay attention to those data sources that are volatile. So if you've got a system that was involved in a hacking attack, or some kind of illegal activity, we don't want to turn it off, right? Maybe we unplug it from the network, or disable the network connection, but we want to leave the system running so we can get an image of memory. Maybe then the system can be imaged so we can get a bit-by-bit copy of the hard drive to capture all of the relevant information that might be needed for an investigation. Non-volatile data, of course, is on the long-term storage, like the internal hard disk.

Then the information gets examined after it's acquired. One of the important things about forensics is gathering the information correctly and dealing with the chain of custody correctly, so that when we get to the point where it's being examined, we know with certainty that none of the information has been tampered with or changed in any way. It has integrity. This is critical. That's why we always work with read-only copies of the information.
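One common way to check that a working copy has the same content as the acquired image is to compare cryptographic hashes of the two files. The following is a minimal sketch in Python, not a forensic tool; the function names `sha256_of_file` and `copy_matches_original` are hypothetical and chosen only for illustration.

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    # Open read-only in binary mode so the evidence file is never modified.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copy_matches_original(original_path, copy_path):
    """Return True when both files produce the same SHA-256 digest."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)
```

Recording the digest at acquisition time and recomputing it before analysis gives the examiner a simple integrity check that can be noted alongside the chain-of-custody record.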
In any case, if you're analyzing the information, you should be using a copy if possible. We wouldn't want to do analysis on an original hard drive that was involved in some fraudulent activity. You make a copy, and you do the analysis with the duplicate. All of the contents of the memory, any network connections, and any files, folders and directories that are involved need to be considered. So do all the log files, transaction logs, and application-related logging. There could be temporary files that get generated during the course of using an application or the system itself.

Then we think about the utilization of the results of the analysis. How does this piece together to reconstruct what happened? That might be needed for a legal prosecution, or it could just be performed to understand what went wrong and how to prevent it from happening again.

Finally, there's a review process. We look again for lessons-learned opportunities to see what went right, what went wrong, and where there's room for improvement.
[/toggle_content]
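The response-time questions the transcript raises for auditors (how long from discovery to help desk notification, and from there to each support level) can be expressed as simple interval metrics. The sketch below is illustrative only: the stage names in `STAGES`, the function `escalation_intervals`, and the timestamps in `example` are all made up for the example and do not come from the course.

```python
from datetime import datetime

# Hypothetical stage names, listed in escalation order.
STAGES = [
    "discovered",
    "help_desk_notified",
    "first_level_notified",
    "second_level_notified",
]

def escalation_intervals(incident):
    """Return minutes elapsed between each pair of consecutive recorded stages."""
    recorded = [(name, incident[name]) for name in STAGES if name in incident]
    intervals = {}
    for (prev_name, prev_time), (next_name, next_time) in zip(recorded, recorded[1:]):
        minutes = (next_time - prev_time).total_seconds() / 60
        intervals[f"{prev_name} -> {next_name}"] = minutes
    return intervals

# Made-up timestamps for a single incident.
example = {
    "discovered": datetime(2024, 1, 8, 9, 0),
    "help_desk_notified": datetime(2024, 1, 8, 9, 25),
    "first_level_notified": datetime(2024, 1, 8, 9, 40),
}

print(escalation_intervals(example))
# {'discovered -> help_desk_notified': 25.0, 'help_desk_notified -> first_level_notified': 15.0}
```

Collecting these intervals for every incident and comparing them over time is one way to see the trends the transcript mentions.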