Fishbucket Lab


Time
6 hours 3 minutes
Difficulty
Intermediate
CEU/CPE
6
Video Transcription
00:00
Hello, and welcome back to the Splunk Enterprise Certified Administrator course on Cybrary. In this video, we're going to talk about the fishbucket and show you how you can manipulate the fishbucket to re-ingest data, and also go over how,
00:17
in certain cases,
00:21
if the contents of two files match despite their names being different,
00:28
Splunk won't be able to tell the difference and won't ingest the second one. Then we'll also look at how it uses initCrcLength: if the first 256 bytes of two files match, only one will be ingested. So let's just get started and we'll look at some
00:47
data I've already brought in.
00:50
I'm just going to use inputs I configured on this device, just for speed, so we can do it a little bit faster. Technically, this isn't
00:59
the best way to be doing it,
01:00
but as you can see, we have these three files
01:03
and they've been ingested one time,
01:07
but we want to re ingest them.
01:11
And so the way that we're going to do this
01:14
is we're gonna clear the fish bucket and re ingest, and then we'll come back and compare
01:19
this data to see the number increase.
01:23
So let's go over to our search head. This is
01:30
/opt/splunk/bin/splunk. There are multiple ways that we could do this, so I'll walk through each one.
01:38
So we can run clean eventdata
01:44
-index _thefishbucket.
01:47
This clean eventdata command will empty any index you want, or all of them if you wanted to do that, though I don't see why you would. But if you specify _thefishbucket, it'll clear just the fishbucket. That's what keeps the trackers that tell Splunk which files it has read and how far it has read into them.
02:06
So if you delete this,
02:09
we've got to stop Splunk first.
02:16
So if we stop Splunk
02:19
and then we clean the event data from the fishbucket, when we start Splunk back up and look in there, we'll see double whatever numbers we saw with the data to start with.
02:31
Are you sure you'd like to erase all events from the fishbucket? Yes, I am.
02:37
So now we'll start this back up.
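The stop, clean, start sequence described here can be sketched in Python. This is a minimal sketch, not an official tool: the function name `fishbucket_clean_commands` is hypothetical, and `/opt/splunk` is assumed as the install path because that is the path used in the video. The sketch only builds the command lines and does not run them.

```python
def fishbucket_clean_commands(splunk_home="/opt/splunk"):
    """Return the stop / clean / start sequence as argv lists.

    Hypothetical helper for illustration; it builds commands only.
    """
    splunk = f"{splunk_home.rstrip('/')}/bin/splunk"
    return [
        [splunk, "stop"],  # Splunk must be stopped before cleaning
        [splunk, "clean", "eventdata", "-index", "_thefishbucket"],
        [splunk, "start"],
    ]


if __name__ == "__main__":
    for argv in fishbucket_clean_commands():
        print(" ".join(argv))
```

To actually execute the sequence on a test instance, each list could be passed to `subprocess.run`. Note that the clean step asks for confirmation, as seen in the video.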
02:42
We'll just run that same search and see
02:46
how many events we have now. We had
02:50
five initially. So if we rerun this search,
02:57
still five
03:04
see all time.
03:07
All time is a little too much. But
03:13
let's see by time
03:15
usually have nine new events by time.
03:19
So let's just look at the ones that were ingested here
03:38
Wednesday, July 29th 2020
03:40
20.
03:45
That wasn't the best way to do this, just because I didn't search all time to begin with. So let me redo it
03:52
and just compare this number now. So we have
03:55
44 events.
04:11
This search won't work yet because Splunk's about to restart.
04:14
Okay, so we'll clean eventdata again.
04:17
Yes. Sorry, stop first.
04:21
Start,
04:24
and we'll check. We should go up from 44 events once it loads.
04:35
Don't say
04:38
Okay, so let's run this again
04:43
and we're up to 49. You can see it did ingest new data. We could
04:49
do some sort of trickery to show you what was just indexed. I'll do, like, eval indextime equals _indextime, sort descending
05:03
indextime, strftime on indextime, something like this. We'll see what events were just brought in.
05:15
Um,
05:15
let's do a table
05:17
of _raw and indextime.
05:24
So you see, this data was just brought in when we did that. It's 7:47 now, and that was 7:46. So if I add source here, this will be a better way to see that this data did get re-ingested.
05:38
Cool. So that proved it. The only one we don't see is the one with the tilde at the end. That just means it was a swap file, so that file doesn't exist right now, which is why it's not getting re-ingested. But this clearly demonstrates that it worked. So let me show you another way
05:55
to do this. There are two more ways in addition to clean eventdata on the fishbucket. This time we're just going to delete the data out of the fishbucket
06:08
and show you how it works. So we'll just do rm
06:12
-rf /opt/splunk/
06:15
var/lib/
06:16
splunk/
06:18
the fishbucket? No.
06:23
Yeah, it was fishbucket.
06:27
We'll just do that.
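The delete-the-directory approach can be sketched in Python as well. This is a hedged illustration under assumptions: `remove_fishbucket` is a hypothetical helper name, the `var/lib/splunk/fishbucket` layout follows the path spoken in the video, and the demo runs against a throwaway temporary directory rather than a real install. As with the other methods, Splunk should be stopped before the directory is removed.

```python
import shutil
import tempfile
from pathlib import Path


def remove_fishbucket(splunk_home):
    """Delete <splunk_home>/var/lib/splunk/fishbucket.

    Returns True if the directory existed and was removed, else False.
    Hypothetical helper for illustration only.
    """
    target = Path(splunk_home) / "var" / "lib" / "splunk" / "fishbucket"
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


if __name__ == "__main__":
    # Demonstrate on a fake directory tree, not on a real Splunk install.
    with tempfile.TemporaryDirectory() as fake_home:
        bucket = Path(fake_home) / "var" / "lib" / "splunk" / "fishbucket"
        (bucket / "splunk_private_db").mkdir(parents=True)
        print(remove_fishbucket(fake_home))
        print(bucket.exists())
```

Removing the whole directory resets tracking for every monitored file, which is why this method re-ingests everything rather than a single source.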
06:28
And now
06:31
start Splunk up again and we should see
06:34
those files ingest yet another time now that we've done that. So
06:41
we'll go back here.
06:42
Yeah, writing this search out the way I did definitely makes this a lot easier to see visually. So we'll run this. We'll give it a couple of seconds so it actually has time to read that file. Another cool trick, if you didn't know: you can hit Ctrl and the pipe key (Ctrl+\), and it'll auto-format your SPL. I'm kind of, like,
07:00
obsessive about that. So you'll see me do that a lot, probably.
07:02
But you can see this data was ingested back at 46. And then after we stopped and ran that command, the data ingested yet again. One more way to do this is
07:18
to use the btprobe command. And with that one, we can specify a specific
07:24
file or monitor that we want to reset.
07:29
So this is like the most
07:30
I mean, I kind of like the word probe. It's the most
07:33
precise. It's the most...
07:38
Yeah, it's the most precise. It's the best way to do this if you need to do it granularly on a single input.
07:46
So let's do help btprobe
07:48
to get an idea of what this command does. Never mind, they don't offer us help for that. So let's look it up, because I honestly do not use this often and I don't know how to do it off the top of my head. So just Google "btprobe command Splunk".
08:09
to this.
08:09
That might be a good one too.
08:15
Okay, btprobe. So it looks like
08:18
we can... oh, it looks like it has its own help command.
08:22
So it looks like we'll specify
08:24
the directory of
08:28
or a file name, and I feel like file name would be easier.
08:31
Let's see some examples.
08:37
Aha! Lookie, lookie. It gives us the exact example of how to reset the fishbucket.
08:43
What is this -k?
08:48
A hex key, or ALL. We're not going to use that; we'll use
08:54
--file.
08:56
Okay, cool. So I'm going to copy this
09:00
as a baseline
09:01
and
09:09
we'll keep the directory, except we'll change the splunkforwarder part because we're not on a forwarder:
09:22
splunk_private_db, --file.
09:26
What is the file that we want? Let's go back and look at source.
09:33
/opt/logs/apache, sample apache access log. Let's do that one.
09:39
/opt/
09:43
log/
09:46
apache/
09:50
sample
09:52
apache access log
09:56
.txt, --reset.
10:11
Let's see, that did not work.
10:26
It's logs, with an s, so that's probably the error there: that directory didn't exist. Okay, so it looks like it reset it. So now we can start Splunk back up, and that should
10:43
show more logs for
10:46
just that single
10:50
source.
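The per-file reset can be sketched in Python as a command builder. This is a sketch under stated assumptions: `btprobe_reset_command` is a hypothetical helper, the command is assumed to be invoked through `splunk cmd btprobe`, and the monitored file path in the demo is a made-up example rather than the exact file name from the video. The `-d`, `--file`, and `--reset` options and the `splunk_private_db` directory are the ones named in the walkthrough.

```python
def btprobe_reset_command(splunk_home, monitored_file):
    """Build a btprobe command that resets tracking for one monitored file.

    Hypothetical helper for illustration; it builds the command only.
    """
    home = splunk_home.rstrip("/")
    return [
        f"{home}/bin/splunk", "cmd", "btprobe",
        "-d", f"{home}/var/lib/splunk/fishbucket/splunk_private_db",
        "--file", monitored_file,  # must match the monitored path exactly
        "--reset",
    ]


if __name__ == "__main__":
    # Example path is an assumption, not the file used in the video.
    argv = btprobe_reset_command("/opt/splunk", "/opt/logs/example_access_log.txt")
    print(" ".join(argv))
```

As the typo in the video shows, the file path has to match the monitored source exactly, so copying it from the `source` field of a search result is the safer choice.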
10:54
Give it a second,
11:00
and yes, so you can see just this one file got re-ingested. So those are all the different ways that you can manipulate the fishbucket so that your searches
11:09
or your files will get re-ingested, if you ingested them with the wrong settings initially and you want to change that. You'll mostly do this in dev or in your test index when you're trying to get your props configurations, line breaking and so on correct. The only other thing
11:28
that I can think of that would be important to know
11:31
is how to use the salt, crcSalt, and also initCrcLength.
11:41
So
11:43
let me I guess demo how to do that. So first we're gonna need logs
11:48
where this would
11:50
cause an issue in the first place. So let me see, Let me copy,
11:56
sample apache access log .txt and we'll call it
12:05
sample apache access log 2 .txt.
12:09
Now, Splunk should automatically pick this up because, if you remember, we're just monitoring, and I think I can show you too, we're monitoring this whole directory. So technically, anything in here should automatically get captured.
12:24
And
12:26
I forget what we called this one, so we'll just grep for
12:31
Apache. Maybe I'll do -i so case doesn't matter.
12:39
Okay,
12:41
so now I can use that
12:43
to specify a specific.
12:50
Okay, so, yeah, so you can see Yeah, we're capturing everything in here,
12:56
and the sourcetype should be sample too, so this file should automatically get registered.
13:03
But
13:05
if we search this source, you will see we don't get a new file. And the reason is that
13:13
this is the exact same file as the other one that we copied. Splunk basically checks the first 256 bytes of a file and hashes that. initCrcLength is
13:31
how many bytes it should check, which by default is 256. Then it creates that hash, and if it's the same hash, it doesn't matter if the file names are different: it will not ingest the new file. But a way around that is to use crcSalt, and I'll just pull up the documentation on how to do this.
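The check described here can be modelled in a few lines of Python. This is a simplified model, not Splunk's implementation: `zlib.crc32` is a stand-in for whatever checksum Splunk actually computes, and `head_checksum` and `should_ingest` are hypothetical names. The log line in the demo is made up.

```python
import zlib

DEFAULT_INIT_CRC_LENGTH = 256  # default number of bytes checked, per the video


def head_checksum(content, length=DEFAULT_INIT_CRC_LENGTH):
    """Checksum of the first `length` bytes (zlib.crc32 is a stand-in)."""
    return zlib.crc32(content[:length])


def should_ingest(seen, content):
    """Return True and record the checksum if this file head is new."""
    key = head_checksum(content)
    if key in seen:
        return False  # same head as a file already tracked: skipped
    seen.add(key)
    return True


if __name__ == "__main__":
    original = b"127.0.0.1 - - GET /index.html 200\n" * 20  # made-up log
    renamed_copy = bytes(original)  # same bytes under a different file name
    seen = set()
    print(should_ingest(seen, original))
    print(should_ingest(seen, renamed_copy))
```

The file name never enters the calculation in this model, which is why the copy is skipped even though its name differs.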
13:50
So this is an inputs.conf
13:52
setting,
13:56
so we're going to have to go to
14:00
/opt/splunk/etc/apps. Let's see what we have.
14:03
Apache inputs, default, inputs.conf.
14:09
So
14:11
let's find initCrcSalt...
14:16
a salt. Okay, it's just crcSalt, and we're going to use this value. What this is going to do is take the
14:28
path of the file it's monitoring and add that to the initCrcLength bytes and then create a hash, so that hash will be unique per file.
14:41
So go here.
14:45
Oops
14:46
and paste that: crcSalt
14:50
equals <SOURCE>. And now if I restart Splunk,
15:00
we'll see that
15:03
now it will ingest both files, because the hash won't match: now it's comparing the first 256 bytes and also the file name.
15:16
Now, the time where this would not be a good thing is if you have a rolling log file and every time it rolls, the name changes. If you use crcSalt there with <SOURCE>, you would be re-ingesting all your data and it would be terrible. So don't do that
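Both the benefit and the pitfall of salting with the source path can be illustrated with a small Python model. It is only a sketch under assumptions: `zlib.crc32` stands in for Splunk's real checksum, prepending the path is one simple way to mix it in and may not match how Splunk combines them, and `salted_checksum` and all file paths are hypothetical.

```python
import zlib


def salted_checksum(content, source_path, length=256):
    """Checksum of the source path plus the first `length` bytes of the file."""
    return zlib.crc32(source_path.encode("utf-8") + content[:length])


if __name__ == "__main__":
    data = b"same header and same events\n" * 50

    # Two identical copies under different names now get different keys,
    # so both would be ingested.
    first = salted_checksum(data, "/opt/logs/sample_a.txt")
    second = salted_checksum(data, "/opt/logs/sample_b.txt")
    print(first != second)

    # The pitfall: a rolled file keeps its bytes but changes its name,
    # so it looks brand new and would be read again from the start.
    before_roll = salted_checksum(data, "/opt/logs/app.log.1")
    after_roll = salted_checksum(data, "/opt/logs/app.log.2")
    print(before_roll != after_roll)
```

In this model the same property that separates two identical files also makes every renamed file look new, which is the re-ingestion problem with rolling logs.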
15:33
if you ran into
15:39
Aha, see? So now we have it ingesting both. So that worked.
15:45
The other way to do this, which I'll demonstrate as well, because this would be a good way to do it
15:52
if you were concerned about it being a rolling log file and re-ingesting your data over and over:
15:58
you can adjust
16:03
initCrcLength
16:08
to whatever you need it to be. I'm not going to demonstrate this completely, but say I took those two files and added something a little different to each of them at the end, and I set this value to be bigger than the part that's the same between the two files. Then it would check
16:26
the header information that's all the same, say that's 300
16:30
bytes or something, and then it would keep going into the next characters. Since we set a length that's bigger than the header, it will get into the unique information, and that will cause the hash from those characters to be different. So that's just another way to do it, just another way to get around
16:49
some of Splunk's
16:52
built-in tools that help it track its monitor inputs. So it's important to know how to manipulate that. It's also important to know what could be causing those problems so that you can troubleshoot: you can check the file and see, okay, does it have a header, and are these the same? If so, either increase initCrcLength or add a crcSalt.
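The header scenario described above can be checked with a short Python model. As before, this is a sketch rather than Splunk's implementation: `zlib.crc32` is a stand-in checksum, `head_checksum` is a hypothetical name, and the 300-byte header and the value 512 are illustrative numbers taken from or chosen to fit the spoken example.

```python
import zlib


def head_checksum(content, length):
    """Checksum of the first `length` bytes (zlib.crc32 is a stand-in)."""
    return zlib.crc32(content[:length])


if __name__ == "__main__":
    header = b"#" * 300             # identical header, 300 bytes as in the example
    file_a = header + b"event A"    # unique content starts after the header
    file_b = header + b"event B"

    # With 256 bytes checked, only the shared header is hashed: same key.
    print(head_checksum(file_a, 256) == head_checksum(file_b, 256))

    # With a length beyond the header, the unique bytes are included.
    print(head_checksum(file_a, 512) == head_checksum(file_b, 512))
```

Unlike salting with the source path, this approach does not depend on the file name, so a renamed rolling log keeps the same key.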
17:11
And then if you're ingesting data and you're trying to
17:15
tweak your configs and you need to re-ingest to test again, use any of the three ways that I showed you to clear the fishbucket. That's pretty much everything that you need to know about the fishbucket and how you can use these tools to help you troubleshoot monitor inputs. So that's going to be the end of this video, and I'll see you in the next one.