3.1 Introduction to Data Processing

Time
4 hours 42 minutes
Difficulty
Advanced
CEU/CPE
5
Video Transcription
00:00
Hello and welcome again to the Advanced Cyber Threat Intelligence course. This video is an introduction to the second module, data processing.
00:09
In most cases, the data that we collect from multiple sources comes in various formats, and this is due to the different nature of the sources. In other words, we are combining two or more data sources, including internal and external ones, or finished reports and threat feeds. This combination is a necessity
00:29
to keep an eye on the full picture, or the full threat landscape.
00:33
But you will want to make sure that you don't generate duplicate alerts. This is why going through the processing phase is essential.
00:43
In this short video, we will introduce data processing, the different phases involved in the processing of data, and why it is important for threat intelligence.
00:55
Let me start with a quick definition.
00:58
Data processing is the transformation of the collected data into a format usable by the organization.
01:04
Almost all raw data collected needs to be processed in some manner, whether by humans or machines.
01:15
Keep in mind that if you are collecting your data from multiple sources with different formats, then you will need different approaches to processing,
01:23
as the time consumed in obtaining the desired result depends on the operations that need to be performed on the collected data and on the nature of the output that is required.
01:36
At a high level, the most common approaches used for automated processing today include basic patterns, such as regular expressions, to identify data that is or is not of interest;
01:49
statistical or probability algorithms to identify things which are or are not similar;
01:57
machine learning algorithms to provide statistical classifications around what is or is not normal or expected; or natural language processing of human-produced text to extract sentiment, intent, purpose, target, or topic.
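As an illustration of the first approach, pattern matching with regular expressions, here is a minimal Python sketch. The patterns and the function name `extract_indicators` are assumptions made for this example and are deliberately simplified; they are not taken from the course and would need stricter validation before use on real feeds.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# The IPv4 pattern does not check that each octet is <= 255.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")


def extract_indicators(text):
    """Return candidate indicators found in free text, grouped by type."""
    return {
        # set() removes repeats within the same text; sorted() keeps output stable
        "ipv4": sorted(set(IPV4_RE.findall(text))),
        "email": sorted(set(EMAIL_RE.findall(text))),
        "md5": sorted(set(MD5_RE.findall(text))),
    }


report = (
    "Beacon to 203.0.113.7 and again 203.0.113.7; "
    "phish sent from billing@example.com; "
    "dropper hash d41d8cd98f00b204e9800998ecf8427e"
)
print(extract_indicators(report))
```

A pattern like this only finds indicators that are written out explicitly, which is one reason the transcript goes on to stress the role of the human analyst.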
02:14
When it comes to limitations, even with machine learning and expert systems, there is still today no replacement for the human analyst, and thus there is no fully automated way to produce high-quality, tailored threat intelligence. Now let's talk about the human-based approach. In this method,
02:34
data is processed manually without the use of a machine.
02:38
This reliance on humans as part of the process arises from the unique trait that we have over computers: our ability for adaptive reasoning,
02:47
or in other words, our ability for problem solving and our ability to think laterally.
02:55
In the case of finished reports, it is difficult to make software automate the extraction of indicators, because some of them are uncommon items. Some reports may describe incidents without explicitly mentioning IOCs. So an analyst creates
03:13
an HTTP indicator based on this report, while a tool will probably be unable to classify or normalize
03:21
the threats properly.
03:23
As a result, threat intelligence analysts are able to go beyond what any fully automated system can do nowadays in terms of finding related events, observables, tactics, techniques, procedures, and actors, while also providing valuable context and meaning to the business.
03:42
Data processing is a composite phase, and it is considered a combination of sorting and filtering, normalization, and storage and integration. Sorting and filtering is often referred to as preprocessing, and it is the stage at which raw data is cleaned up
04:01
and organized for the following stage of data processing. Basically, if you are collecting data from several sources,
04:10
you will need to make sure to eliminate bad data, including duplicates and incomplete or incorrect data. The second stage is normalizing, and here we are going to choose the standard or format that is the most suitable for our requirements. In other words, if the output
04:29
is an indicator that will be added to a watch list,
04:31
then the format should be compatible with the SIEM solution used in our organization. For this, the threat intelligence community has defined multiple standards to describe threats and manipulate threat data.
04:44
By the end of this stage, raw data takes the form of usable information. The final stage of data processing is storing and integration. We are going to see this stage in more detail in a future video.
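The first two stages described above, sorting and filtering followed by normalization, can be sketched in Python as follows. This is only an illustration under stated assumptions: the input records, the function names `preprocess` and `normalize`, and the output schema are hypothetical and do not correspond to any particular standard or product.

```python
import ipaddress


def preprocess(records):
    """Drop incomplete, incorrect, and duplicate IP indicator records."""
    seen = set()
    cleaned = []
    for rec in records:
        value = (rec.get("value") or "").strip()
        if not value:
            continue  # incomplete: no indicator value
        try:
            ip = ipaddress.ip_address(value)
        except ValueError:
            continue  # incorrect: not a valid IP address
        key = str(ip)
        if key in seen:
            continue  # duplicate already seen from another source
        seen.add(key)
        cleaned.append({"value": key, "source": rec.get("source", "unknown")})
    # Sort by the string form of the value so the output order is stable
    return sorted(cleaned, key=lambda r: r["value"])


def normalize(record):
    """Map a cleaned record to a hypothetical common watch-list schema."""
    return {
        "type": "ipv6-addr" if ":" in record["value"] else "ipv4-addr",
        "value": record["value"],
        "source": record["source"],
    }


# Hypothetical raw input combining several feeds
raw = [
    {"value": " 198.51.100.9 ", "source": "feed-a"},
    {"value": "198.51.100.9", "source": "feed-b"},   # duplicate
    {"value": "", "source": "feed-a"},               # incomplete
    {"value": "999.1.1.1", "source": "feed-c"},      # incorrect
    {"value": "192.0.2.1", "source": "internal"},
]
watch_list = [normalize(r) for r in preprocess(raw)]
print(watch_list)
```

In practice the target schema would be whatever format the organization's own tooling expects, which is the point the transcript makes about choosing the most suitable standard.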
04:59
All of these stages can be done by a single piece of software or a combination of software, whichever is feasible or required by your company. Nowadays, more and more data is collected from multiple sources, free and paid ones, including network traffic, files, malware samples
05:16
and their sandboxing results, finished reports about incidents,
05:20
lists of email addresses used for phishing campaigns, malicious domains, malicious IPs, et cetera.
05:28
Dealing with unprocessed data is time consuming, and sometimes it's difficult or even impossible for analysts to correlate events and make assessments based only on raw data.
05:42
This is why the processing of collected data is really, really important. This is all for this introduction. In this video, we saw a definition of data processing, some approaches to processing, the different stages of data processing, and why it is important.
06:00
This video was a quick introduction to the second module, data processing. In the next lesson, we're going to discover together some examples of common standards used in cyber threat intelligence.