In this video, we'll discuss the components of Splunk.
First, we'll talk about the data pipeline,
then Splunk components,
and have a little bit of a discussion on distributed versus non-distributed versus clustered environments.
The data pipeline Splunk uses is made up of input, parsing, indexing, and searching.
The input part of that is just what you might think: Splunk is getting data
at this stage. There's also metadata added, like source, host, and sourcetype.
But the main focus is getting input, the data coming in.
Each of these stages corresponds to different actual Splunk components. We'll talk about each of these, but input goes along with forwarders, universal or heavy forwarders,
and it can also be done at the indexer level.
At the next stage, we have parsing.
Data is getting turned into events at this stage.
This could be line breaks happening or data being transformed based on certain rules.
This can occur on an indexer or heavy forwarder.
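As a rough sketch of what those parsing rules can look like, a props.conf stanza might tell Splunk how to break the stream into events and where to find timestamps. The sourcetype name here is a made-up example, not something from the video:

```ini
# props.conf -- example parsing rules for a hypothetical sourcetype
[my_custom_log]
# Break into events at newlines followed by a date like 2024-01-31
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}
SHOULD_LINEMERGE = false
# Tell Splunk where the event timestamp starts and how it's formatted
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
```

Settings like these would live on whichever component does the parsing, so an indexer or a heavy forwarder.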
Indexing is taking the parsed events and putting them into an index for later use.
When you get to the searching stage, there's some interaction between the search head and indexers. The search head is responsible for search management.
This is where you would go to run a search, and that search request gets sent to the indexers, and then the results get sent back to the search head for you to view and work with.
At the searching level, you can have scheduled searches, alerts and dashboards
Along the side here is a common setup you might see.
I have UF for universal forwarder.
You can think of a universal forwarder as something like an agent,
maybe installed on a server and set up to collect Windows Event Logs. It's taking those Windows Event Logs and sending them on to the indexer.
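As a sketch of that setup, assuming a universal forwarder on a Windows server and an indexer listening on the default receiving port 9997 (the hostname below is a placeholder), the forwarder-side configuration might look like:

```ini
# inputs.conf on the universal forwarder -- collect Windows Event Logs
[WinEventLog://Security]
disabled = false

[WinEventLog://System]
disabled = false
```

```ini
# outputs.conf on the universal forwarder -- send data on to the indexer
# (replace idx.example.com with your indexer's hostname)
[tcpout:primary_indexers]
server = idx.example.com:9997
```

The indexer side would need receiving enabled on that port for this to work.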
On the indexer, you have parsing and indexing:
taking the data,
breaking it into events, and organizing it in a place the search head can easily query.
search heads are what users typically interact with.
They perform search management.
You can, for example,
go into this box
and run a basic search, which the search head then distributes as requests to the different indexers, and then displays the results.
You can do things like have
custom dashboards, alerts, and reports.
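For instance, a basic search you might run from the search head could look like the following. The index name and field values here are just illustrative, not from the video:

```
index=firewall sourcetype=cisco:asa action=blocked
| stats count by src_ip
| sort -count
```

The search head sends this request out to the indexers, each indexer returns its matching events and partial counts, and the search head merges and displays the results.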
Indexers receive, index, and store data.
They can provide the search head with needed information.
There's a bit of complexity around the term index, so I wanted to break down the different definitions.
Index, as a noun, is a data repository.
By breaking up data into different indexes, you can improve performance, apply different data retention policies and limit access to different sets of data.
For example, if you're collecting firewall logs, you may have an index titled firewall logs, where logs from multiple firewalls get stored.
This could make it easier to limit your searches to the type of information you're looking for.
And if you have another team, say help desk,
that needs to look at authentication logs, but maybe not web traffic, you could easily limit them from viewing this data.
If you need to keep firewall logs for a set amount of time for audits,
you could specify this in the data retention policies.
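As a rough sketch, assuming an index named firewall, an indexes.conf stanza on the indexer could define the index and a retention period, here roughly one year expressed in seconds:

```ini
# indexes.conf on the indexer -- a dedicated firewall index
[firewall]
homePath   = $SPLUNK_DB/firewall/db
coldPath   = $SPLUNK_DB/firewall/colddb
thawedPath = $SPLUNK_DB/firewall/thaweddb
# Roll events to frozen (deleted by default) after ~365 days
frozenTimePeriodInSecs = 31536000
```

Because retention is set per index, keeping firewall logs in their own index lets you give them a different retention policy than, say, web traffic.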
Index, as a verb, is the process
of indexing raw data, as in taking the data and handling and organizing it.
An indexer is a particular Splunk instance that indexes data.
This sentence might help you remember the different meanings:
an indexer indexes data and puts it in an index.
Forwarders, like I mentioned,
you can kind of think of them like agents. You install them on a host, and they send data onward.
There are several different types of forwarders. A light forwarder is deprecated, meaning there are newer versions of it, but it does exist.
Universal forwarders are typically what you want to install when possible. They have a pretty light footprint and mostly just work to send data onward. You can do some filtering with universal forwarders, such as
by blacklisting certain event types. But if you want to do any more complex filtering, you're probably going to need to set up a heavy forwarder.
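As an example of the kind of simple filtering a universal forwarder can do, a Windows Event Log input can blacklist specific event codes. The event IDs below are illustrative, not from the video:

```ini
# inputs.conf on a universal forwarder -- drop noisy event IDs at the source
[WinEventLog://Security]
disabled = false
# Skip events with these EventCodes before forwarding
blacklist = 4662,5156
```

Anything more involved than matching on fields like this, such as rewriting or routing events based on their contents, is where a heavy forwarder comes in.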
There are also different server roles. We're not going to talk about them too much in this course, but I want you to know they exist.
For example, things like a deployment server can help you manage forwarders and send apps out by groups.
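As a hedged sketch of how a deployment server groups forwarders, a serverclass.conf file might map hosts to a server class and assign a deployment app to it. The class name, hostname pattern, and app name here are all made up for illustration:

```ini
# serverclass.conf on the deployment server
[serverClass:windows_servers]
# Match forwarders whose hostnames start with win-
whitelist.0 = win-*

# Send this deployment app to everything in the class
[serverClass:windows_servers:app:windows_inputs]
restartSplunkd = true
```

Forwarders configured as deployment clients would then pick up the app, letting you push input settings to a whole group at once.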
Distributed environments are basically ones where different components of Splunk are broken out.
The setup in this course will be a simple non-distributed environment, where our search head, indexer, and license master are all combined.
For larger companies, or if you're handling a lot of data, you'll probably need to separate these pieces out.
This is sometimes thought of as horizontal scaling: as you grow, you can add different parts to scale the environment.
With the idea of different deployment scales, if you have a very small office working with less than 20 gigs a day and fewer than 100 forwarders, you could probably get away with a non-distributed environment like we're doing for this course.
For a larger company, you're probably going to need a distributed environment.
Clustering is a more advanced topic, but you should know what it is. At a basic level, it replicates data between different components to create redundancy, so that there is duplicate data across multiple instances.
This is good to look at if you can't have any downtime in your environment or if you're worried about disaster recovery or the potential of losing data.
Question time: a universal forwarder deals with the blank part of the data pipeline.
The answer is input.
A universal forwarder helps to bring data into the Splunk environment.
As a review: forwarders send data,
indexers turn data into events and place them in indexes, and search heads send search requests and display data.
A larger company will likely need a distributed environment, but for this course we will set up a simple non distributed environment.
Clustering also won't be covered in this course as it is a more advanced topic.
But you should know that it provides redundancy and is a good option for high availability and disaster recovery.
In the next video, we will install Splunk.