Data Input Options

Video Transcription
Hello and welcome back to the Splunk Enterprise Certified Administrator course on Cybrary. This is module nine, where we'll be discussing the different ways to get data into Splunk.
So we're getting to the end of the course. This is the ninth out of 10 modules. In the previous sections we've already set up our Splunk Enterprise environment, so it's totally ready to receive data and allow our users to log in and search that data. So now we need to actually get data populated in Splunk.
So let's first talk about what input options we have available to us, and that will be lesson 9.1.
So the learning objectives here will be to review what the common inputs.conf configurations are, and also what the common input types are that you'll see with Splunk.
And the reason that we're going to learn this is because before we start bringing data in, it's important to understand what options we have available to us so that we can properly identify the data that we can bring into Splunk in our environment. And then it's also important to go through how to configure it so that we know we're bringing the data in properly.
So before we get into the actual examples of inputs, let's first start with just a definition of what an input is. So essentially, it's any data that's going to be sent to Splunk.
And in terms of an input as far as configuration in Splunk, it's basically the configurations that explain to Splunk how to see the data that we want to monitor, and then also, for each subset of data that we want to monitor, what metadata to associate with that data.
So some of the common inputs.conf configurations are going to be, first, what defines the data that we're looking at, which will be the stanza name in Splunk. And then after that, we'll also need a destination where we save this data to, which, within each stanza, you'll define with an index attribute that just points to one of your existing indexes.
And then you'll also want to define your metadata, so your source type, source, and host. Source type is very important because the way that configurations are applied to the data, such as field parsing, tags, event types, and things of that nature, is generally by the source type name.
So normally, when you bring data in, you find a TA on Splunkbase, and it will have a source type that it's expecting. And you need to make sure that when you ingest your data, your inputs specify that proper source type as an attribute value in the stanza.
Then you'll have your source, which generally defines the file path where the data is originating from. And then you have the host, which defines the device that actually produced the data to begin with.
So those are the basic configurations you'll want to be making to inputs.conf on a regular basis.
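As a rough sketch of how those basic settings might sit together in one inputs.conf stanza (the file path, index name, and host value here are made-up placeholders, not values from the course):

```ini
# inputs.conf (sketch; path, index, and host are placeholder values)
# The stanza name defines what data this input reads.
[monitor:///var/log/secure]
# Destination: should point to an index that already exists.
index = os_linux
# Metadata attached to events from this input.
sourcetype = linux_secure
host = web01
# For a monitored file, source defaults to the file path, so it is usually left unset.
disabled = 0
```

Note that comments go on their own lines starting with `#`; a comment placed after a value on the same line would be read as part of the value.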
And now let's get into what the actual common input methods are. So here's a list of them. There are five, and we'll be doing a deep dive in the subsequent slides, just working through each of these options.
So the first option is a file and directory input. This will allow you to monitor a specific file, all files within a directory, or specifically named files within a directory on an endpoint, and it will just read the file. Some notes for this: your path can include wildcards. You can reference inputs.conf for more detail on that. There are a couple of options for how you can do this, but it just makes your monitor stanzas a little bit more flexible. So if you're monitoring a large number of different systems where the exact file path might vary, you can use wildcards to basically work around that.
And the way that this would look in an inputs.conf file would be this kind of stanza, where it starts with monitor:// (monitor, colon, forward slash, forward slash) and then the path of whatever file or directory you want to monitor. And then, obviously, underneath that, it would have any attributes like index, source type, source, and host that we discussed earlier.
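To make the wildcard idea a bit more concrete, here is a small sketch of two monitor stanzas. The directory layouts, index names, and the `app_log` source type are assumptions made up for illustration; the two wildcard forms (`*` and `...`) are the ones inputs.conf supports in monitor paths.

```ini
# "*" matches anything within a single path segment,
# e.g. /var/log/siteA/access.log and /var/log/siteB/access.log.
[monitor:///var/log/*/access.log]
index = web
sourcetype = access_combined

# "..." recurses through any number of subdirectories.
[monitor:///opt/apps/.../logs/*.log]
index = app_logs
sourcetype = app_log
```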
Next is a network input. This is going to be used to monitor either UDP or TCP traffic that's coming directly to Splunk over a port. So it would look like either tcp:// (tcp, colon, forward slash, forward slash) followed by the device you would expect it to come from and the port, or udp://, just depending on which one of those protocols you're using.
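A minimal sketch of what those two stanza forms might look like in inputs.conf, assuming syslog-style traffic. The port numbers, the sender address, and the index name are placeholders chosen for the example.

```ini
# Listen for UDP traffic on port 514 from any sender.
[udp://514]
index = network
sourcetype = syslog
# Set the host field from the sender's IP address.
connection_host = ip

# Accept TCP on port 9514 only from one device (address is a placeholder).
[tcp://10.1.2.3:9514]
index = network
sourcetype = syslog
```

Leaving the device out of the stanza name, as in the first stanza, means the input accepts data from any sender on that port.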
One note on this is that generally we don't receive data directly from a network input. So one use case for this might be if you're monitoring syslog. But best practice for that is to actually have the data sent to a syslog server, written to disk, and then you just use file and directory input monitors. But it's still good to know that this is an option in case you need to leverage it. And it could be any port, too, so it doesn't just have to be syslog data.
And then there are also Windows inputs, which are used to monitor Windows-generated events.
So these are very complicated. Splunk has their own TA that they put out for this, and basically we'll have to go into more detail on this one in the labs because it is just one of the more complex input types you can have, which probably isn't surprising to most people who use Windows. The logging is very verbose, and it's also a bit convoluted.
We'll work through a lab later on how to do it, so for now, basically note that, yes, that's an option. We can definitely monitor all of our Windows data. It's just a matter of how.
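Just to give a feel for the shape of a Windows input ahead of the lab, here is a very small sketch of an event log stanza. It assumes a Splunk instance or forwarder running on Windows, and the index name is a placeholder; the Splunk-provided TA ships with far more complete versions of this.

```ini
# Collect the Security event log from the local Windows host.
[WinEventLog://Security]
index = wineventlog
disabled = 0
```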
Then another option is scripted input. So basically, if you have a custom script you want to run to generate log messages, you can configure it this way, by writing the script up and then referencing it in either a script or a powershell stanza, depending on what the script is written in.
So some special settings would be interval, or schedule if it's PowerShell, and that just tells Splunk how often to run the script. There are a couple of different options for that. You can do negative one, which just means it'll run on startup and then never again. Or you could do any number of seconds to have it actually run on an interval.
And for schedule, you could set a cron schedule specifically as well, and then there's script, if you're using PowerShell, to specify the name of the script that you're using.
So some notes for this: the script stanza works for Bash, batch, or Python, and the powershell stanza is used for PowerShell, and those are the stanza names.
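The settings just described might look roughly like this in inputs.conf. The app name, script names, index names, and source types are all hypothetical, and the PowerShell `schedule` is shown as a number of seconds on the assumption that it accepts either that or a cron schedule, as described above.

```ini
# Run a shell script every 300 seconds; its standard output becomes event data.
[script://$SPLUNK_HOME/etc/apps/my_app/bin/disk_usage.sh]
interval = 300
index = os_linux
sourcetype = disk_usage

# interval = -1 runs the script once at startup and never again.
[script://$SPLUNK_HOME/etc/apps/my_app/bin/inventory.py]
interval = -1
index = os_linux
sourcetype = host_inventory

# PowerShell uses its own stanza, with "script" and "schedule" settings.
[powershell://ServiceStatus]
script = . "$SplunkHome\etc\apps\my_app\bin\service_status.ps1"
schedule = 300
index = windows
sourcetype = service_status
```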
And then finally, there's the HTTP Event Collector. So what this does is it sends data directly to Splunk from an app, and it's agentless. So basically, on whatever device you're trying to forward the data from, you just configure it to send over HTTP or HTTPS with a generated token from Splunk to authenticate with. And that way you can cut out the agent altogether. So some special settings: as I mentioned, you do have to generate and specify your token. And then I linked a document here that goes more in depth on how to do this configuration.
But we'll also be doing this in a lab, so you don't necessarily need to check that out now. It's just kind of for reference.
And as far as the stanza, this one's a little weird because you don't set this up through inputs.conf. You just do it through Splunk Web. It's just a lot easier to do it that way.
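Even when the token is created in Splunk Web, the result is typically saved on disk as inputs.conf stanzas, so it can help to recognize them when reading an existing configuration. The sketch below is only an illustration of that shape: the token name, index, and source type are hypothetical, and the token value is left as a placeholder rather than a real generated value.

```ini
# Global HTTP Event Collector settings (8088 is the default port).
[http]
disabled = 0
port = 8088

# One stanza per token; the name and token value are placeholders.
[http://my_app_token]
token = <generated-token-guid>
index = app_events
sourcetype = my_app_json
disabled = 0
```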
So that basically wraps up everything you need to know as far as what inputs are. We defined what an input is in Splunk. At a high level, it's just any data that's coming in, and then, a little bit more granularly, we talked about how an input also defines how to find the data and associates metadata with it as well.
Then we talked about which settings an input should determine: mostly the metadata fields, the destination, which is the index, and it should also define what data it's looking at. And then we talked about the five primary data input methods, those being TCP or UDP port, file or directory monitor, scripted input, HEC, and Windows input.
So that covers all of the basic information you need to know about inputs, and we'll see you in the next lessons, where we'll start actually going through and configuring these.