Data Discovery

Time: 12 hours 57 minutes
Difficulty: Intermediate
CEU/CPE: 13
Video Transcription
>> We talked about why you might want to classify data and how data should be classified, but how do we discover data within our organization? We're going to talk about the common data labeling methods, data discovery technology, and some of the benefits that come with these data labeling methods.
In terms of data discovery, an organization that wants to ensure its data is properly classified may employ a number of strategies for identifying its data and where that data resides. The goal is to make sure there is no undue risk of data exposure resulting from data that was never properly classified or labeled to begin with.
Data discovery can also refer to eDiscovery. If there's litigation and it involves a cloud-based service, you may be legally required to produce certain types of data and label it accordingly so that it can be used in the legal proceedings.
There are really three main methods of data labeling that are discussed in the CCSP. The first is label-based. We've talked about this before, when we covered the public versus the private scheme for classifying data. When data is created, it should be labeled appropriately to its level of sensitivity.
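To make the label-based idea concrete, here is a minimal sketch in which a document cannot be created without a sensitivity label. The level names, the `LabeledDocument` type, and the sharing rule are all hypothetical and chosen only for illustration; a real organization would substitute its own classification scheme.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical classification levels; a higher value is more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class LabeledDocument:
    name: str
    body: str
    label: Sensitivity  # required field: the label is applied at creation time


def may_share_externally(doc: LabeledDocument) -> bool:
    # Illustrative policy (an assumption): only PUBLIC data may leave the organization.
    return doc.label == Sensitivity.PUBLIC


press_release = LabeledDocument("launch.txt", "We are launching...", Sensitivity.PUBLIC)
payroll = LabeledDocument("payroll.csv", "name,salary", Sensitivity.RESTRICTED)
```

Because the label travels with the document, any later handling decision can check it instead of guessing at the contents.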
Then there's metadata-based data labeling. Metadata refers to data about data. In many computer processes, when data is created or used, information about it, such as its location, its use, and the program that was used to create it, is often generated and associated with the data. This metadata can be used to label the data as appropriate and to identify it if you need to discover it later using the various data discovery techniques.
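A rough sketch of metadata-based labeling follows. It suggests a label from two pieces of metadata, the storage location and the creating program, without ever reading the file contents. The field names (`location`, `created_by`), the folder names, the `payroll-app` identifier, and the label strings are assumptions made up for this example.

```python
from pathlib import PurePosixPath


def suggest_label(metadata: dict) -> str:
    """Suggest a label from metadata alone (hypothetical rules)."""
    location = PurePosixPath(metadata.get("location", ""))
    application = metadata.get("created_by", "")

    # Anything stored under an "hr" folder or produced by the payroll
    # system is assumed to be confidential.
    if "hr" in location.parts or application == "payroll-app":
        return "confidential"
    # Anything stored under a "public" folder is assumed to be public.
    if "public" in location.parts:
        return "public"
    # Fall back to a cautious default when the metadata says nothing useful.
    return "internal"
```

For example, `suggest_label({"location": "/shares/hr/reviews.docx"})` would return `"confidential"` under these rules, purely because of where the file lives.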
Then there is content-based discovery. Content-based discovery can be done through a number of automated tools that scan the contents of files and use algorithms to match content to well-known phrases or terminology that often appear in sensitive documents. Analyzing the content can produce a suggested data label. There are a lot of very interesting data labeling tools that use this content-based strategy: they match every file the software scans against well-known content patterns that indicate data requiring protection.
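The content-based approach can be sketched with a few regular expressions. This is a toy illustration, not how any particular commercial tool works: the pattern names, the patterns themselves, and the label strings are assumptions, and real scanners use far richer rule sets and validation.

```python
import re

# Hypothetical patterns for content that often appears in sensitive documents.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "confidential_marker": re.compile(
        r"\b(confidential|attorney[- ]client)\b", re.IGNORECASE
    ),
}


def scan_content(text: str) -> dict:
    """Return the names of the patterns found and a suggested label."""
    hits = sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
    # Any hit yields a suggestion only; a person or policy makes the final call.
    label = "sensitive" if hits else "unclassified"
    return {"hits": hits, "suggested_label": label}
```

The result is a suggested label rather than a final one, which matches the point above that content analysis produces a suggestion.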
Okay, let's reflect a moment. How does your organization identify and label data? This varies from organization to organization. In highly regulated or highly sensitive industries, there should be a greater sense of awareness with each piece of information or data that's shared or handled. Everybody should really be aware of what the data is, and data labeling also creates awareness among people that they are handling something important.
Without data labeling, organizations face a lot of potential risks related to data being mishandled or sent where it shouldn't be, because it's not clear to the individual handling the data what their responsibility is given the sensitivity of the information.
How can we leverage the increasing importance of data analytics to improve data security? As I said before, metadata, the data about data, is becoming even more important, and organizations are catching on that if they have lots of data, it can be mined for insights that help improve the business and reveal different trends regarding their customers' activities and desires. That interest in leveraging data to provide insights, which is also referred to as data analytics, creates an opportunity to improve security.
Organizations want to know where their data is and what data they have, and that creates an opportunity to utilize data discovery to ensure that data is labeled appropriately to its sensitivity and protected accordingly. That's a great opportunity for anyone out there in the analytics space: partner with security and make sure that data is not only used, but properly labeled and secured as well.
In summary, we talked about the importance of data labeling and data discovery techniques, and we talked about the main data labeling approaches. I'll see you in the next lesson.