6 hours 3 minutes
Hello and welcome back to the Splunk Enterprise Certified Administrator course on Cybrary. In this video, we're going to talk about the fishbucket and show you how you can manipulate it to re-ingest data. We'll also go over how,
if the contents of two files match despite their names being different,
Splunk won't be able to tell the difference and won't ingest the second one. Then we'll look at how it uses initCrcLength: if the first 256 bytes of two files match, only one will be ingested. So let's get started and look at some
data I've already brought in.
I'm just using inputs.conf on this device for speed, so we can do it a little faster. Technically, this isn't
the best way to be doing it,
but as you can see, we have these three files,
and they've been ingested one time,
but we want to re-ingest them.
The way we're going to do this
is we're going to clear the fishbucket and re-ingest, and then we'll come back and compare
this data to see the number increase.
So let's go over to our search head. It's
/opt/splunk/bin/splunk. There are multiple ways we could do this, so I'll walk through each one.
First, we can run clean eventdata
with -index _thefishbucket.
The clean eventdata command will empty any index you want, or all of them, though I don't see why you would. But if you specify _thefishbucket, it'll just clear the fishbucket. That's what keeps the trackers that tell Splunk which files it has read and how far it has read into them.
So if you delete this...
we've got to stop Splunk first.
So if we stop Splunk
and then clean the event data from the fishbucket, when we start Splunk back up and look in there, we'll see double whatever numbers we saw for the data to start with.
"Are you sure you'd like to erase all events from the fishbucket?" Yes, I am.
So now we'll start this back up.
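The stop, clean, start sequence just described might look like this at a shell prompt. It is a minimal sketch, not the exact session from the video: the /opt/splunk default for SPLUNK_HOME and the `run` helper are assumptions for illustration. With DRY_RUN left at 1 the commands are only printed, not executed.

```shell
#!/bin/sh
# Assumption: Splunk lives under /opt/splunk unless SPLUNK_HOME says otherwise.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
SPLUNK="$SPLUNK_HOME/bin/splunk"

# Hypothetical helper: print each command by default; set DRY_RUN=0 to really run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run "$SPLUNK" stop                                   # Splunk must be stopped first
run "$SPLUNK" clean eventdata -index _thefishbucket  # asks for confirmation when run for real
run "$SPLUNK" start                                  # monitored files are read again from the start
```

Clearing the whole fishbucket makes every monitored file look new, so this fits a test box better than a production indexer.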
We'll just run that same search and see
how many events we have now. We had
five initially. So if we run this search...
let's see, All time.
All time is a little too much, but
let's see by time.
We have nine new events by time.
So let's just look at the ones that were ingested here,
Wednesday, July 29th, 2020.
That wasn't the best way to do this, just because I didn't search All time to begin with. So let me redo it
and just compare this number now. So we have...
This should not work because Splunk is about to restart.
Okay, so we'll clean eventdata again.
Yes. Sorry, it's stopped.
And we'll check. We should go up from 44 events once it loads.
Okay, so let's run this again,
and we're up to 49. You can see it did ingest new data. We could
do some trickier work to show you what was just indexed. Say I eval indextime equals _indextime, sort descending
on indextime, then strftime on indextime, something like this. We'll see what events were just brought in.
Let's do a table
of _time and indextime.
So you see, this data was just brought in when we did that. It's 7:47 now, or 7:46. And if I add source here, this will be a better way to see that this data did get re-ingested.
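The search built up above could be written roughly as follows. This is a sketch under assumptions: the index name `main` is a placeholder, and the exact field names and time format used in the video are not recoverable from the audio. The script only prints the SPL so it can be pasted into the search bar.

```shell
#!/bin/sh
# Assumption: Splunk lives under /opt/splunk unless SPLUNK_HOME says otherwise.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
# Hypothetical index name; use whichever index the monitored files land in.
SEARCH_INDEX="${SEARCH_INDEX:-main}"

# Copy _indextime into a visible field, sort newest-indexed first,
# then format it as a readable timestamp next to _time and source.
QUERY="index=$SEARCH_INDEX
| eval indextime=_indextime
| sort - indextime
| eval indextime=strftime(indextime, \"%Y-%m-%d %H:%M:%S\")
| table _time indextime source"

echo "$QUERY"
# To run it from the CLI instead of the search bar (prompts for credentials):
#   "$SPLUNK_HOME/bin/splunk" search "$QUERY"
```

Events whose indextime is much later than their _time are the ones that were just re-ingested.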
Cool. So that proved it. The only one we don't see is the one with the tilde at the end. That just means it was a swap file. That file doesn't exist right now, which is why it's not getting re-ingested. But this clearly demonstrates that it worked. So let me show you another way
to do this. There are two more ways in addition to clean eventdata on the fishbucket. This time we're just going to delete the data out of the fishbucket
and show you how it works. So we'll just do rm
-rf /opt/splunk...
the fishbucket? No.
Yeah, it was fishbucket.
We just do that.
Start Splunk up again, and we should see
those files ingest yet another time now that we've done that.
So we'll go back here.
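The delete-the-directory approach just shown can be sketched like this. The fishbucket path below is the usual default location under SPLUNK_HOME, which is an assumption about this install. The `run` helper is hypothetical and only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumption: Splunk lives under /opt/splunk unless SPLUNK_HOME says otherwise.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
SPLUNK="$SPLUNK_HOME/bin/splunk"
# Usual default fishbucket location; check the path before deleting anything.
FISHBUCKET="$SPLUNK_HOME/var/lib/splunk/fishbucket"

# Hypothetical helper: print each command by default; set DRY_RUN=0 to really run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run "$SPLUNK" stop            # never remove the directory while Splunk is running
run rm -rf "$FISHBUCKET"      # drops every file-tracking record
run "$SPLUNK" start           # the directory is recreated and files are re-read
```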
Writing this search out the way I did definitely makes this a lot easier to see visually. So we'll run this. We'll give it a couple of seconds so it actually has time to read that file. Another cool trick, if you didn't know: you can hit Control and the pipe key (Ctrl+\) and it'll auto-format your SPL. I'm kind of
obsessive about that, so you'll probably see me do that a lot.
But you can see this data was ingested back at :46. And then after we stopped and ran that command, the data was ingested yet again. One more way to do this is
to use the btprobe command. With that one, we can specify a specific
file or monitor that we want to reset.
So this is the most...
I mean, I kind of like the word "probe." It's the most
precise.
Yeah, it's the most precise. It's the best way to do this if you need to do it granularly, on a single input.
So let's do help btprobe
to get an idea of what this command does. Never mind, they don't offer us help for that. So let's look it up, because I honestly do not use this often and I don't know how to do it off the top of my head. So just Google "btprobe command Splunk."
That might be a good one too.
Okay, btprobe. So it looks like
we can... oh, it looks like it has its own help command.
So it looks like we'll specify
the directory
or a file name, and I feel like file name would be easier.
Let's see some examples.
Aha! Lookie, lookie. It gives us the exact example of how to reset the fishbucket.
What is this -k?
A key, or all. We're not going to use that. We'll use...
Okay, cool. So I'm going to copy this
as a baseline.
We'll keep the directory, except we'll change the forwarder part of the path, because we're not on a forwarder.
Then the private DB, and the file.
What is our file that we want? Let's go back and look at a source.
/opt/logs, Apache, sample Apache access log. Let's do that one.
Apache access log
.txt, and --reset.
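A command of the shape being assembled here might look like the sketch below. The target file path is a placeholder, not the path from the video, and the `splunk_private_db` directory is the usual default under SPLUNK_HOME, assumed rather than confirmed for this install. The hypothetical `run` helper only prints commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Assumption: Splunk lives under /opt/splunk unless SPLUNK_HOME says otherwise.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
SPLUNK="$SPLUNK_HOME/bin/splunk"
# Hypothetical path to the single monitored file whose record should be reset.
TARGET_FILE="${TARGET_FILE:-/path/to/monitored.log}"

# Hypothetical helper: print each command by default; set DRY_RUN=0 to really run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run "$SPLUNK" stop
# Reset only the tracking record for TARGET_FILE; other inputs are untouched.
run "$SPLUNK" cmd btprobe \
  -d "$SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db" \
  --file "$TARGET_FILE" --reset
run "$SPLUNK" start
```

The value passed to --file has to match the source path exactly as Splunk recorded it, otherwise no record is found to reset.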
Let's see... that did not work.
It's logs instead. So that was probably the error there: that directory didn't exist. Okay, so it looks like it reset it. So now we can start Splunk back up, and that should
show more logs for
just that single file.
Give it a second,
and yes, you can see just this one file got re-ingested. So those are all the different ways you can manipulate the fishbucket so that
your files get re-ingested. Maybe you ingested them with the wrong settings initially and you want to change them. You'll mostly do this in dev or in your test index, when you're trying to get your props configurations, line breaking, and so on correct. The only other thing
that I can think of that would be important to know
is how to use crcSalt and also initCrcLength.
Let me demo how to do that. First we're going to need logs
where this would
cause an issue in the first place. So let me copy
the sample Apache access log .txt file, and we'll call the copy
sample Apache access log 2 .txt.
Now Splunk should automatically pick this up because, if you remember, we're just monitoring... I think I can just show you, too. We're monitoring this whole directory, so technically anything in here should automatically get captured.
I forget what we called this one, so we'll just grep for
Apache. Maybe I'll do -i, so case doesn't matter.
So now I can use that
to pick out the specific one.
Okay, so yeah, you can see we're capturing everything in here,
and the source type should be sample, too, so this file should automatically get registered.
If we search this source, you'll see we don't get a new file. And the reason is that
this is the exact same file as the other one that we copied. Splunk basically checks the first 256 characters of a file and hashes them. The initCrcLength is
how many characters it should check, which by default is 256. Then it creates that hash, and if it's the same hash, it doesn't matter if the file names are different: it will not ingest the new file. A way around that is to use a CRC salt, and I'll just pull up the documentation on how to do this.
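The check described above can be illustrated outside Splunk. In this sketch `cksum` is only a stand-in for Splunk's internal CRC, and the scratch directory and file names are made up. It shows that a copied file with a new name yields the same checksum over its first 256 bytes.

```shell
#!/bin/sh
# Scratch directory and file names below are hypothetical.
tmp=$(mktemp -d)

# Build a small log with well over 256 bytes of content.
i=0
while [ "$i" -lt 20 ]; do
  echo "127.0.0.1 - - \"GET /page$i.html HTTP/1.1\" 200 1024" >> "$tmp/access_log.txt"
  i=$((i + 1))
done
cp "$tmp/access_log.txt" "$tmp/access_log2.txt"   # same content, different name

# cksum stands in for Splunk's CRC; only the first 256 bytes are hashed.
crc1=$(head -c 256 "$tmp/access_log.txt" | cksum)
crc2=$(head -c 256 "$tmp/access_log2.txt" | cksum)

if [ "$crc1" = "$crc2" ]; then
  echo "initial CRCs match: the copy would look like a file already seen"
else
  echo "initial CRCs differ: the copy would look like a new file"
fi

rm -rf "$tmp"
```

Because the file name plays no part in the checksum, renaming or copying a file does not make it look new.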
So this goes in inputs.conf,
but we're going to have to...
/opt/splunk/etc/apps. Let's see what we have.
Apache inputs, default, inputs.conf.
Let's find the CRC salt setting.
It's just crcSalt, and we're going to use this value. What this is going to do is take the
path of the file it's monitoring, add that to the initCrcLength bytes, and then create a hash, so that hash will be unique per file.
So go here
and paste that: crcSalt
equals <SOURCE>. And now if I restart Splunk,
we'll see that
it will ingest both files, because the hash won't match, because now it's comparing the first 256 bytes and also the file name.
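As a rough illustration of why the salt helps, the sketch below hashes the first 256 bytes together with the file's path. `cksum` again stands in for Splunk's CRC, and how Splunk actually mixes the salt in is not shown in the video, so treat the `salted_crc` function as a hypothetical model. The monitor path in the printed stanza is a placeholder.

```shell
#!/bin/sh
# Scratch directory and file names below are hypothetical.
tmp=$(mktemp -d)
i=0
while [ "$i" -lt 20 ]; do
  echo "127.0.0.1 - - \"GET /page$i.html HTTP/1.1\" 200 1024" >> "$tmp/access_log.txt"
  i=$((i + 1))
done
cp "$tmp/access_log.txt" "$tmp/access_log2.txt"   # identical content

# Hypothetical model of a salted CRC: first 256 bytes plus the full source path.
salted_crc() {
  { head -c 256 "$1"; printf '%s' "$1"; } | cksum
}

crc1=$(salted_crc "$tmp/access_log.txt")
crc2=$(salted_crc "$tmp/access_log2.txt")
[ "$crc1" != "$crc2" ] && echo "salted CRCs differ: both files would be ingested"

# The corresponding inputs.conf setting (stanza path is a placeholder):
cat <<'EOF'
[monitor:///path/to/logs]
crcSalt = <SOURCE>
EOF

rm -rf "$tmp"
```

The literal string `<SOURCE>` is what goes in the file; Splunk substitutes each monitored file's path for it.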
Now, the time where this would not be a good thing is if you have a rolling log file and every time it rolls, the name changes. If you use crcSalt there with <SOURCE>, you would be re-ingesting all your data, and it would be terrible. So don't do that
if you run into...
Aha, see? So now we have it ingesting both. So that worked.
The other way to do this, which I'll describe as well, because this would be a good way to do it if
you were concerned about it being a rolling log file and re-ingesting your data over and over:
you can adjust
initCrcLength
to whatever you need. I'm not going to demonstrate this completely, but say I took those two files and added something a little different at the end of each of them, and I set this value to be bigger than the part that's the same between the two files. Then it would check
the header information that's all the same, say that's 300
bytes or something, and then it would keep going into the next characters. Since we've got a length that's bigger than the header, it will get into the unique information, and that will cause the hash from those characters to be different. So that's just another way to do it, just another way to get around
some of Splunk's
built-in tools that help it track its monitor inputs. It's important to know how to manipulate that. It's also important to know what could be causing those problems so that you can troubleshoot: you can check the file and see, okay, does it have a header, and are the headers the same? If so, either increase initCrcLength or add a crcSalt.
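The header scenario described above can be tried with two scratch files. This sketch assumes a 300-byte shared header and uses 1024 as an example initCrcLength value. `cksum` is a stand-in for Splunk's CRC and the file names are made up.

```shell
#!/bin/sh
# Scratch directory and file names below are hypothetical.
tmp=$(mktemp -d)

# A 300-byte header shared by both files, followed by one differing line each.
header=$(head -c 300 /dev/zero | tr '\0' '#')
printf '%s\nevent from host A\n' "$header" > "$tmp/a.log"
printf '%s\nevent from host B\n' "$header" > "$tmp/b.log"

# crc_of_first BYTES FILE: checksum of the first BYTES bytes of FILE.
crc_of_first() {
  head -c "$1" "$2" | cksum
}

short_a=$(crc_of_first 256 "$tmp/a.log")
short_b=$(crc_of_first 256 "$tmp/b.log")
long_a=$(crc_of_first 1024 "$tmp/a.log")
long_b=$(crc_of_first 1024 "$tmp/b.log")

if [ "$short_a" = "$short_b" ]; then
  echo "256 bytes: CRCs match, only the shared header was read"
fi
if [ "$long_a" != "$long_b" ]; then
  echo "1024 bytes: CRCs differ, the unique data was reached"
fi
# In inputs.conf this would correspond to a line such as: initCrcLength = 1024

rm -rf "$tmp"
```

Unlike a source-based salt, a longer initCrcLength does not depend on the file name, so a rolled log that keeps its content is still recognized.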
And then if you're ingesting data and you're trying to
tweak your configs and you need to re-ingest to test again, use any of the three ways that I showed you to clear the fishbucket. That's pretty much everything you need to know about the fishbucket and how you can use these tools to help you troubleshoot monitor inputs. So that's going to be the end of this video, and I'll see you in the next one.