Index Structure

6 hours 3 minutes
Video Transcription
Hello and welcome back to the Splunk Enterprise Certified Administrator course on Cybrary. This is the beginning of module five, where we'll be discussing indexing. So this is gonna be a deep dive into essentially everything you need to know about indexes.conf: what an index is and how we use it to restrict access and also to apply data retention requirements on the data that's coming into Splunk.
So this is a follow-up to our configuration files module. Now we're getting into a specific configuration file, a specific area of Splunk that we need to configure. Indexing makes sense to set up as one of the first things because it really prepares your environment to receive data. If you don't have your indexes well thought out and created in advance, then you don't have anywhere to put data. And if you don't have data, you can't do any searching or anything else, so it makes sense to me to put it in this position in the course.
So let's jump into the first lesson for this module, 5.1, where we'll discuss the structure of an index.
So the learning objectives here are gonna be to cover what the purpose of an index is, understand what components comprise an index, and then define and list the three types of buckets. You'll figure out what buckets are as we go.
So why are we learning this? Like I said, this is really important because we need to understand what an index is before we define an index or define our indexing strategy. And as I said previously, it just makes sense to do it now because it's a core step in preparing our environment to ingest and receive data. Also, your indexes need to be in place because the index is one of the metadata fields that you're gonna associate with incoming data. So you can't even set up your forwarders until your indexes already exist, since you need to specify them.
So what is an index? Essentially, it's just a logical storage location for data. You're just making a name and assigning it some attributes to define how long it holds data, what data it's gonna end up holding, and who can access it. But really, all that translates to is a folder, a directory on disk for the indexers, and then it will be specified as a metadata field for incoming data to tell Splunk, "Hey, where should I write this data? This index." And an index is defined by indexes.conf.
So that's the configuration file that we'll be talking about later on to explore more about indexes. And then I just put in a note, which I already quickly went over: your index will be specified in inputs.conf or via props and transforms to tell Splunk where to store any data that you're ingesting.
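As a rough illustration of how those two files relate, the Python sketch below renders a minimal indexes.conf stanza and an inputs.conf monitor stanza that points at it. The index name `security`, the monitored path, and the sourcetype are hypothetical examples. `homePath`, `coldPath`, `thawedPath`, and `index` are real Splunk settings, but treat the values as placeholders rather than a recommended layout.

```python
def render_stanza(name, settings):
    """Render one .conf stanza: a [name] header followed by key = value lines."""
    lines = [f"[{name}]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


# Hypothetical index name, used only for illustration.
INDEX_NAME = "security"

# indexes.conf: defines the index and where its buckets live on disk.
indexes_conf = render_stanza(INDEX_NAME, {
    "homePath": f"$SPLUNK_DB/{INDEX_NAME}/db",          # hot and warm buckets
    "coldPath": f"$SPLUNK_DB/{INDEX_NAME}/colddb",      # cold buckets
    "thawedPath": f"$SPLUNK_DB/{INDEX_NAME}/thaweddb",  # restored (thawed) data
})

# inputs.conf: tags incoming data with the index it should be written to.
# The monitored path and sourcetype are assumptions for this example.
inputs_conf = render_stanza("monitor:///var/log/secure", {
    "index": INDEX_NAME,
    "sourcetype": "linux_secure",
})

if __name__ == "__main__":
    print("# indexes.conf")
    print(indexes_conf)
    print()
    print("# inputs.conf")
    print(inputs_conf)
```

The point of the sketch is the ordering: the stanza name in indexes.conf has to exist before an input can reference it through `index = ...`.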
So what does an index do? It stores data. It also groups like data. So basically, you've got multiple source types in a given index; that's what I mean by grouping data. It specifies retention periods. So, on an index-by-index basis, you specify how much data you want Splunk to store in a given index and how long you want to store that data for. Those are two separate configurations. And basically it decides the term that data stays in Splunk for. It also allows you to restrict visibility. So you can break out indexes based on how sensitive the data is: group the most sensitive data into a single index or multiple indexes and only allow security personnel or other people who absolutely need access to that data to see it.
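To make the "two separate configurations" for retention concrete, here is a small sketch that converts a retention policy into the two indexes.conf settings that correspond to it: `frozenTimePeriodInSecs` (the age limit) and `maxTotalDataSizeMB` (the size limit). The 90-day and 200 GB figures are made-up inputs, and the helper name `retention_settings` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MB_PER_GB = 1024


def retention_settings(max_age_days, max_size_gb):
    """Translate a human-friendly policy into indexes.conf key/value pairs.

    frozenTimePeriodInSecs caps how old data may get; maxTotalDataSizeMB caps
    how large the index may grow. They are independent limits.
    """
    return {
        "frozenTimePeriodInSecs": max_age_days * SECONDS_PER_DAY,
        "maxTotalDataSizeMB": max_size_gb * MB_PER_GB,
    }


if __name__ == "__main__":
    # Hypothetical policy: keep 90 days of data, but never more than 200 GB.
    for key, value in retention_settings(90, 200).items():
        print(f"{key} = {value}")
```

Because the limits are independent, an index can age data out on size well before the time limit is reached, so both numbers are worth sizing deliberately.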
So what makes up an index? Like I said, an index is just a directory on disk. And then within that directory there's going to be a number of buckets, each of which is basically just a subdirectory that stores data based on time. So for certain windows of time, events will be in one bucket, then subsequently the next bucket, the next bucket, the next bucket. Then, within buckets, you have a file, or rather another directory, that will contain all your compressed raw data. You'll have tsidx files, which are time series index files. That's basically the way that Splunk takes keywords out of data and then points back to slices of that compressed raw data, so it allows Splunk to only have to decompress certain slivers of raw data to speed up searching.
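The idea behind a tsidx file can be sketched as a tiny inverted index. This is a toy model, not Splunk's actual on-disk format: the event list, the whitespace tokenizer, and the function names are all assumptions made for illustration. It only shows how a keyword lookup can point back to specific events so the rest never need to be read.

```python
from collections import defaultdict


def build_index(events):
    """Map each lowercase keyword to the set of event positions containing it."""
    index = defaultdict(set)
    for position, event in enumerate(events):
        for token in event.lower().split():
            index[token].add(position)
    return index


def search(index, events, keyword):
    """Return only the events whose position is listed under the keyword."""
    positions = sorted(index.get(keyword.lower(), ()))
    return [events[p] for p in positions]


if __name__ == "__main__":
    # Stand-in for raw data; in Splunk this would be compressed on disk.
    raw_events = [
        "login failed for user alice",
        "disk usage at 91 percent",
        "login succeeded for user bob",
    ]
    idx = build_index(raw_events)
    print(search(idx, raw_events, "login"))
```

In this toy version, a search for `login` touches only the first and third events; the second is never examined, which is the same kind of saving the transcript describes for slices of compressed raw data.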
And then there are some other additional metadata files in there as well, like bloom filters. And each of your actual indexed fields will be written to disk in their own individual files, but the additional metadata is kind of out of scope for this course. You don't really need to know all that minutiae for the administrator exam. That would be more for, like, architect or consultant.
So what is a bucket? Like I said, it's a subdirectory of an index. I give you a sample naming convention here. Those two numbers in the middle are epoch timestamps. So each is just a number of seconds, and together they tell you what the earliest and latest events in that bucket are. Splunk will use that to eliminate buckets entirely from searches when you have a time specification in your search. And then there are three types of buckets. We're not gonna get too deep into this because we'll really do a deep dive on the data life cycle in a following lesson. But basically there are hot buckets, where data is actively written. There are warm buckets, which hold recently written data, but the bucket is no longer writable. And then there's cold, which is essentially for older data, and then you can move it to slower storage just to save on cost.
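Here is a sketch of how those timestamps let a search skip whole buckets. It assumes the directory naming `db_<newest_time>_<oldest_time>_<localid>`, the convention used for non-clustered warm and cold buckets. The sample names and the helper functions are hypothetical.

```python
from typing import NamedTuple


class BucketSpan(NamedTuple):
    newest: int  # epoch seconds of the latest event in the bucket
    oldest: int  # epoch seconds of the earliest event in the bucket
    local_id: int


def parse_bucket_name(name):
    """Parse an assumed name of the form db_<newest>_<oldest>_<localid>."""
    prefix, newest, oldest, local_id = name.split("_")[:4]
    if prefix != "db":
        raise ValueError(f"not a warm/cold bucket name: {name}")
    return BucketSpan(int(newest), int(oldest), int(local_id))


def overlaps(bucket, earliest, latest):
    """True if the bucket could hold events inside [earliest, latest]."""
    return bucket.oldest <= latest and bucket.newest >= earliest


def buckets_to_search(names, earliest, latest):
    """Keep only the buckets whose time span intersects the search window."""
    return [n for n in names if overlaps(parse_bucket_name(n), earliest, latest)]


if __name__ == "__main__":
    # Made-up bucket names for illustration.
    names = [
        "db_1700003600_1700000000_0",
        "db_1700007200_1700003601_1",
        "db_1700010800_1700007201_2",
    ]
    # Search window that only touches the second bucket.
    print(buckets_to_search(names, 1700004000, 1700005000))
```

The check needs nothing but the directory name, which is why a narrow time range in a search can rule out most buckets before any data is opened.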
So in summary, we talked about what an index is. It's just a directory on disk that groups data. Its purpose: it groups data, it stores it, it specifies retention, and it specifies access requirements.
And then an index is composed of buckets, each of which is also composed of a number of things, such as the actual compressed data, the tsidx files, and a number of metadata fields.
And then we talked about what a bucket is: a subdirectory. I explained that already, and then that bucket's subcomponents and the three types of buckets: hot, warm, and cold. So that's everything you need to get your first precursory glance into what indexes are, and in the subsequent lessons we'll do more deep dives onto certain topics.
And we'll also have a lab where we really get into indexes and even setting one up. So that's gonna be the end of this lesson, and we'll see you in the next video.