Time
6 hours 3 minutes
Difficulty
Intermediate
CEU/CPE
6

Video Transcription

00:00
Hello and welcome back to the Splunk Enterprise Certified Administrator course on Cybrary. This is Lesson 9.3, where we will be talking about the fish bucket.
00:09
So the learning objectives in this lesson are going to be to discuss what the fish bucket is, how to reset the fish bucket, and why you would reset the fish bucket. So let's quickly, before we jump into the content, discuss why we're learning this. In the previous video, we went through a lab on how to set up file monitors and directory monitors.
00:29
And this issue, or this concept, is directly related to file monitors and directory monitors.
00:35
Essentially, the fish bucket tracks certain key information about file inputs. And so, if you are setting up file inputs, it's a very important topic to understand, as it's going to be essential for troubleshooting or re-indexing your file monitor inputs.
00:53
So what is the fish bucket? It is an internal Splunk DB, which essentially just means that it is a Splunk index.
01:02
It tracks info on your monitor inputs, such as the last read location, what's called the initCrc, and some other information that we'll discuss in the following slides. Essentially, what it does is it tracks
01:19
where in the file
01:22
Splunk has already read, so that you don't re-index events, and then it also keeps, like, thumbprints of the files so that it doesn't accidentally re-index a file.
01:34
So, the problems associated with the fish bucket: if you're seeing monitor inputs where the file is simply not being indexed, that is most likely originating from, well, either an error in your input, where you're not specifying the file or directory properly. But if it's not that, most likely
01:52
it's going to be a problem with the fish bucket, where for some reason,
01:56
it is misidentifying that file as another input, and so it won't read it. Or you may be reprocessing your data: you might have a file input and you're working on your props configurations or something, and you need to re-ingest the data but are unable to, and that would be another instance where
02:15
that problem is most likely originating from the fish bucket.
02:21
So how do we fix those things? If the file is not being indexed, most likely that's because of the following. Splunk takes what it calls an initCrc, where CRC stands for cyclic redundancy check.
02:37
But basically it takes the first 256 bytes of the file by default and hashes them, and it tracks that.
02:44
And so if you have two files where they have, like, a very long header,
02:47
then only the first file would be ingested, because it will take that header and create the initCrc from that. And then when another file comes in, it will create the same initCrc hash, and so then it won't index it.
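To make the header collision concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Splunk's implementation: `zlib.crc32` stands in for whatever checksum Splunk actually computes, and `init_crc`, the 300-byte header, and the 1024-byte length are hypothetical choices for the example. The only detail taken from the lesson is the default of 256 bytes, which corresponds to the `initCrcLength` setting in inputs.conf.

```python
import zlib


def init_crc(data: bytes, init_crc_length: int = 256) -> int:
    """Checksum of the first init_crc_length bytes of a file.

    zlib.crc32 is only a stand-in for Splunk's real checksum.
    """
    return zlib.crc32(data[:init_crc_length])


# Two hypothetical files that share a header longer than 256 bytes.
header = b"#" * 300
file_a = header + b"event from file A\n"
file_b = header + b"event from file B\n"

# With the default length, only the shared header is checksummed.
same_at_default = init_crc(file_a) == init_crc(file_b)

# With a longer length, the differing event data is included.
same_at_1024 = init_crc(file_a, 1024) == init_crc(file_b, 1024)

print(same_at_default, same_at_1024)
```

With the default length the two files look identical to the tracker, so the second one would be skipped. Once the length reaches past the shared header, the checksums differ.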
03:04
So the solution there is to just increase the initCrc length to something longer than the header, so that the hash will be unique across files. Another way to address that would be to add a CRC salt. And so what that does is it adds a salt
03:22
value to that initCrc input
03:24
before it computes the hash. So a lot of times, you can just use
03:30
the host or the source value. So it'll just
03:37
take whatever the file path of the file is, prepend that to the normal initCrc data, and then perform the hash. So then, basically, for each unique file path, it will create a unique hash and still ingest each file.
03:55
The only problem with that solution
03:58
is if you have rolling log files, then each time the file rolls, it will generate a new hash, and the file will be totally re-indexed. So in those instances, definitely manipulate the initCrc length rather than the CRC salt.
04:15
But in other cases, manipulating CRC salt can be fine.
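The trade-off with the salt can be sketched the same way. This is again a hedged illustration, not Splunk's code: `zlib.crc32` is a stand-in checksum, `salted_crc` and the file paths are hypothetical, and the salt is simply the source path prepended to the first bytes of the file, which mirrors what the lesson describes for `crcSalt = <SOURCE>` in inputs.conf.

```python
import zlib


def salted_crc(data: bytes, source_path: str, init_crc_length: int = 256) -> int:
    """Checksum of the source path followed by the first bytes of the file.

    zlib.crc32 is only a stand-in for Splunk's real checksum.
    """
    salted = source_path.encode("utf-8") + data[:init_crc_length]
    return zlib.crc32(salted)


# Identical content, as with two files that share a long header.
content = b"#" * 300 + b"payload\n"

crc_a = salted_crc(content, "/var/log/app/a.log")
crc_b = salted_crc(content, "/var/log/app/b.log")

# The same file after a hypothetical log rotation renames it.
crc_rolled = salted_crc(content, "/var/log/app/a.old")

print(crc_a != crc_b)       # different paths are tracked separately
print(crc_a != crc_rolled)  # a renamed file looks new and would be re-indexed
```

The first comparison is the benefit: files with the same leading bytes but different paths no longer collide. The second is the cost: a rolled file gets a new path, so it is treated as new data.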
04:17
But those are both ways that you can fix it if your file
04:23
is not being monitored
04:26
or not being indexed by Splunk. Those are two possible solutions for that.
04:30
Now, another issue that you may have is that you can't re-index the data. And that's because, as we mentioned, the fish bucket is tracking all that information, and so it sees those files as things it has already ingested. So to fix that, you have to clear the fish bucket or reset the fish bucket
04:49
so it doesn't have that tracking information anymore.
04:53
And there are three ways that you can do that. You can use a built-in Splunk command that's specifically meant to do that, called btprobe. So I posted the syntax for that here. So you just do splunk cmd btprobe
05:10
and then --file to specify which
05:14
monitored file you want to remove the fish bucket entry for, --reset to tell it that what you want to do is reset it, and then -d to specify the actual fish bucket directory so that it
05:31
can remove the fish bucket entry.
05:34
That is one way to do it. Another way is to actually just manually delete the fish bucket, which is much less specific and much less elegant than the btprobe example. But you could just do an rm -rf
05:53
on the fish bucket directory, and then it would re-ingest
05:57
all of your monitored inputs.
06:00
And then also you could use the Splunk CLI. You could clean event data from the fish bucket, and I've included the syntax for that as well.
06:08
So that's similar to the previous command in that it will reset all of your monitors, versus btprobe, which allows you to do it a little bit more granularly. But those are your three options. So if you run into an issue where,
06:24
you know, you indexed your data initially and it came out all broken, maybe the line breaking wasn't
06:29
right, and then you change your config and you need to retest it, use one of these methods to clear that fish bucket so that Splunk will re-ingest the data, recognizing it as a new input.
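The slide syntax is not captured in this transcript, so the following Python sketch only assembles the three command lines as described in the lesson. The names and paths are assumptions: `/opt/splunk` as the install location, the default fish bucket location under `var/lib/splunk/fishbucket`, its `splunk_private_db` subdirectory as the `-d` argument, and `_thefishbucket` as the index name. `fishbucket_reset_commands` is a hypothetical helper. Nothing is executed; the commands are printed for review, and they are typically run with Splunk stopped.

```python
import posixpath
import shlex


def fishbucket_reset_commands(monitored_file: str, splunk_home: str = "/opt/splunk") -> dict:
    """Build (but do not run) the three reset commands from the lesson.

    splunk_home and the fish bucket paths are assumed defaults.
    """
    splunk = posixpath.join(splunk_home, "bin", "splunk")
    fishbucket = posixpath.join(splunk_home, "var", "lib", "splunk", "fishbucket")
    private_db = posixpath.join(fishbucket, "splunk_private_db")
    return {
        # Granular: reset the entry for a single monitored file.
        "btprobe": [splunk, "cmd", "btprobe", "-d", private_db,
                    "--file", monitored_file, "--reset"],
        # Blunt: delete the whole fish bucket directory.
        "rm": ["rm", "-rf", fishbucket],
        # CLI: clean the event data of the fish bucket index.
        "clean": [splunk, "clean", "eventdata", "-index", "_thefishbucket"],
    }


commands = fishbucket_reset_commands("/var/log/app/app.log")
for name, argv in commands.items():
    print(name, "->", shlex.join(argv))
```

Only the btprobe variant is scoped to one file. The other two reset tracking for every monitored input, so everything still on disk would be re-ingested.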
06:42
So in summary, we talked about what the fish bucket is. It's a Splunk internal index for tracking information on your file monitors. We talked about what potential problems could be caused by the fish bucket, which would be either
06:58
you have a file that is incorrectly just not being indexed at all,
07:01
or you can't re-index data.
07:06
We talked about how to fix, basically, failed file monitors. So if it's because of the initCrc not being correct, we talked about how you can increase the length of that, or you can add a salt to it.
07:24
So those are your two options for fixing that. Then we also talked about the three methods for resetting the fish bucket. You can either use the clean eventdata command, the btprobe command, or you could just use a normal rm -rf to totally remove that directory's contents altogether.
07:42
But those are the things you need to know for
07:45
the fish bucket in order to hopefully help you with troubleshooting any of your file monitor inputs. So that wraps up this lesson, and we're going to jump back into the labs after this, so see you in those videos.

Splunk Enterprise Certified Administrator

The course is designed around the guidelines provided in Splunk’s Test Blueprint for the Certified Administrator certification, Splunk Docs, the Splunk Data and System Admin courses, and the experience of a Splunk Professional Services Consultant.

Instructed By

Anthony Fecondo
Splunk Professional Service Consultant
Instructor