Storage Troubleshooting

Time
21 hours 25 minutes
Difficulty
Intermediate
CEU/CPE
21
Video Transcription
00:00
>> Hey there Cybrarians and welcome back to the Linux+
00:00
course here at Cybrary. I'm your instructor Rob Gills.
00:00
In today's lesson we're going to
00:00
be discussing storage troubleshooting.
00:00
Upon completion of this lesson,
00:00
you're going to be able to understand the types
00:00
of storage issues that you may need to troubleshoot,
00:00
as well as locate files and use
00:00
utilities to troubleshoot storage performance.
00:00
Storage issues are just
00:00
really incredibly varied in Linux.
00:00
We're going to touch on several types of
00:00
issues we may see in this lesson.
00:00
For example, we're going to look
00:00
at missing devices, volumes,
00:00
and mount points, as well as
00:00
performance issues and resource exhaustion.
00:00
Then in later lessons we'll go into more depth
00:00
on some other issues that can impact storage,
00:00
such as network adapter issues and storage integrity.
00:00
Off the bat, the first thing we should look
00:00
at is the missing device issue.
00:00
This issue can have a lot of causes.
00:00
Maybe we just have a device that's not showing up,
00:00
we run a command like lsblk, and it's not there.
00:00
Well, one thing we could do is run lspci -M,
00:00
and this will perform a scan of
00:00
all devices that are on the PCI bus,
00:00
and then hopefully we should be able to see that device
00:00
again when we run lsblk one more time.
00:00
Now we might also have network
00:00
attached storage go missing,
00:00
and this could be due to network issues.
00:00
We have network attached storage,
00:00
we have network issues.
00:00
Let's go ahead and troubleshoot
00:00
the network connectivity and see what's going on there.
00:00
Maybe we just have modules
00:00
that aren't loaded into memory,
00:00
and we need to have these kernel modules in
00:00
some cases to run the underlying storage.
00:00
Well, we can verify kernel modules are
00:00
loaded by using the lsmod command,
00:00
and then there are a handful of other causes.
00:00
For example, we may have
00:00
a removable device that's not attached or attached
00:00
incorrectly or not powered on if it's
00:00
a removable storage device
00:00
that has to have separate power.
00:00
Maybe we also have a missing
00:00
or damaged device connection,
00:00
so the device can be physically damaged or maybe it's
00:00
not correctly connected to the system,
00:00
and then finally,
00:00
we can have an incorrect device filename,
00:00
maybe something's just not configured
00:00
right in the /dev directory.
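
The kernel-module check above can also be scripted. What follows is a minimal sketch, assuming a Linux system where `lsmod` gets its list from `/proc/modules`; the helper names and the example module names (`nvme`, `dm_mod`) are illustrative assumptions, not something from the lesson. Drivers compiled directly into the kernel will not appear in this list, so a "missing" result is a hint rather than proof.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first column is the module name
    return names


def missing_modules(required, proc_modules_text):
    """Return the required module names that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    # Names may be written with '-' or '_'; the kernel reports them with '_'.
    return [m for m in required if m.replace("-", "_") not in loaded]


if __name__ == "__main__":
    modules_file = Path("/proc/modules")  # only present on Linux
    if modules_file.exists():
        text = modules_file.read_text()
        # Example names only; substitute the modules your storage depends on.
        print(missing_modules(["nvme", "dm_mod"], text))
```

If a module turns up in the missing list, loading it with `modprobe` and re-running `lsblk` would be the natural next step.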
00:00
Now the next issue we'll look at is
00:00
the concept of missing volumes.
00:00
A missing volume issue occurs when
00:00
a disk in a logical volume group fails,
00:00
or maybe that disk got accidentally removed.
00:00
This issue can be detected by
00:00
running the pvscan command, and
00:00
pvscan will return a "Couldn't find device"
00:00
message if you have a missing volume,
00:00
and it'll also tell you the UUID for the missing disk.
00:00
To resolve this, you have to replace
00:00
the failed disk and then recreate the failed physical volume.
00:00
You use pvcreate to recreate the failed volume,
00:00
you can restore the volume group metadata by running
00:00
the command vgcfgrestore,
00:00
and then you can rediscover the group with vgscan and
00:00
activate the group once again with vgchange.
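
To make the order of those steps concrete, here is a small sketch that only builds the command lines and prints them, without running anything. The volume group name, device path, UUID placeholder, and archive file path are all hypothetical values chosen for illustration; the `--uuid` and `--restorefile` options to `pvcreate` and the `-ay` option to `vgchange` are standard LVM options, but check the man pages for your distribution before running any of this.

```python
import shlex


def lvm_recovery_plan(vg_name, new_device, missing_uuid, restore_file):
    """Build (but do not run) the commands for recovering a volume group."""
    return [
        # Recreate the physical volume on the replacement disk, reusing the
        # UUID that pvscan reported as missing.
        ["pvcreate", "--uuid", missing_uuid,
         "--restorefile", restore_file, new_device],
        # Restore the volume group metadata from its backup.
        ["vgcfgrestore", vg_name],
        # Rescan for volume groups, then activate the recovered group.
        ["vgscan"],
        ["vgchange", "-ay", vg_name],
    ]


if __name__ == "__main__":
    # All four arguments are placeholders, not values from the lesson.
    plan = lvm_recovery_plan(
        vg_name="vg_data",
        new_device="/dev/sdc1",
        missing_uuid="<uuid-reported-by-pvscan>",
        restore_file="/etc/lvm/archive/<archive-file>.vg",
    )
    for command in plan:
        print(shlex.join(command))
```

Printing the plan first gives you a chance to review each command before running it by hand as root.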
00:00
Now a missing mount point occurs for a couple of reasons.
00:00
But generally what you'll see is that
00:00
the mount command will return
00:00
a message saying it can't find the mount
00:00
point on the file system.
00:00
In this situation, you get a message from
00:00
the mount command saying mount point does not exist.
00:00
It could be something like the mount point
00:00
not being created and that's easy to resolve.
00:00
We can just recreate the mount point
00:00
with the mkdir command.
00:00
But the other side of it is,
00:00
maybe we don't have the device that we
00:00
need to connect to that mount point,
00:00
and that's really the issue:
00:00
the mount point's there but the device isn't.
00:00
In that case, we'd have to see the previous section for
00:00
information on how we troubleshoot
00:00
missing devices before moving forward.
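
The two cases described here, a missing directory versus a missing device, can be told apart with a quick check. This is a minimal sketch under the assumption that you know which mount point and device node you expect; the function names are made up for this example, and the demo uses a temporary directory so it is safe to run anywhere.

```python
import os
import tempfile


def diagnose_mount(mount_point, device):
    """Report which half of a mount is missing: the directory or the device."""
    problems = []
    if not os.path.isdir(mount_point):
        problems.append("mount point missing")
    if not os.path.exists(device):
        problems.append("device missing")
    return problems


def ensure_mount_point(mount_point):
    """Create the mount point directory, like 'mkdir -p' would."""
    os.makedirs(mount_point, exist_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        # Hypothetical paths inside a scratch directory, for illustration only.
        mount_point = os.path.join(scratch, "mnt", "data")
        device = os.path.join(scratch, "no-such-device")
        print(diagnose_mount(mount_point, device))
        ensure_mount_point(mount_point)
        print(diagnose_mount(mount_point, device))
```

If only "device missing" remains after the directory is created, the problem is on the device side and the missing-device checks apply.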
00:00
Now another type of issue, unrelated to missing
00:00
storage, is performance issues.
00:00
These can also have a big impact
00:00
on a system and its applications.
00:00
Troubleshooting storage performance often requires
00:00
a baseline, and without one,
00:00
it's hard to determine if the performance has changed.
00:00
But let's go through a couple of
00:00
performance troubleshooting tools and
00:00
utilities that we've seen and
00:00
we've covered some of these already.
00:00
For example, we talked about ioping,
00:00
we could do ioping for a directory,
00:00
and what that'll do is it'll show us
00:00
the I/O latency for the directory.
00:00
We could also do iostat -dh,
00:00
and that will display I/O device
00:00
statistics in a human-readable format.
00:00
We could also do sar -b or sar -d.
00:00
sar -b will give us overall I/O activity.
00:00
sar -d will give us
00:00
individual block I/O device activity.
00:00
Then we can also use a few other commands
00:00
like the dd command,
00:00
the disk destroyer command, as we like to refer to it.
00:00
What we could do with the dd command is generate
00:00
a large file and just see how long
00:00
it takes for that to complete.
00:00
For example, we could generate a one gig file
00:00
using the /dev/zero device as our source.
00:00
We can say dd with an input file, if, of /dev/zero,
00:00
and then the output file, of, is going to /var/test,
00:00
so we're creating a one gig file,
00:00
with a block size of one gig, in the /var/test file, just once,
00:00
and we want to do that directly and just stream that
00:00
data right over from /dev/zero into that test,
00:00
that /var/test file.
00:00
Let's see how long it takes to create that file.
00:00
That'll give us an idea of what our performance
00:00
looks like in writing to a disk.
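
The same idea, writing a block of zeros and timing it, can be sketched in Python. This is an analog of the dd test rather than dd itself: it calls `fsync` so the data reaches the device before the clock stops, but it does not bypass the page cache the way a direct-I/O write would, so the numbers are not directly comparable. The function name, the 64 MiB demo size, and the temporary file location are assumptions for illustration.

```python
import os
import tempfile
import time

# The dd invocation the lesson appears to describe would look something like:
#   dd if=/dev/zero of=/var/test bs=1G count=1 oflag=direct
# (an interpretation of the spoken description, not a quoted command).


def time_write(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of zeros to path; return (seconds, MiB per second)."""
    block = bytes(block_size)  # a buffer of zero bytes, like reading /dev/zero
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before timing ends
    elapsed = time.perf_counter() - start
    mib = total_bytes / (1024 * 1024)
    rate = mib / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "write-test.bin")
        seconds, rate = time_write(target, 64 * 1024 * 1024)
        print(f"wrote 64 MiB in {seconds:.3f} s ({rate:.1f} MiB/s)")
```

As with any of these tools, a single number means little without a baseline taken on the same system when it was healthy.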
00:00
Then finally, we can also use
00:00
the hdparm command to determine a drive's read speeds.
00:00
To do this, we just do hdparm and then we
00:00
specify -t and the disk device.
00:00
Finally, storage resource exhaustion
00:00
can be caused by a handful of things.
00:00
We might have no disk space
00:00
remaining and we can use df to find this,
00:00
or maybe the device is out of inodes,
00:00
and we know every file
00:00
needs an inode in order to store metadata.
00:00
If there are no inodes left,
00:00
no files can be created,
00:00
and so we just use df here as
00:00
well, as we've talked about previously.
00:00
We can just use df -i to display the inode information.
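
The numbers that `df` and `df -i` report are also available programmatically. Here is a minimal sketch using `os.statvfs`, which is available on Unix-like systems; the function names and the choice of `/` as the demo path are assumptions. Some file systems report zero total inodes, so the percentage helper returns `None` in that case rather than dividing by zero.

```python
import os


def usage_report(path):
    """Return block and inode totals for the file system containing path."""
    st = os.statvfs(path)
    return {
        "bytes_total": st.f_blocks * st.f_frsize,
        "bytes_available": st.f_bavail * st.f_frsize,  # space for non-root users
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
    }


def percent_used(total, free):
    """Return the percentage used, or None when the total is reported as 0."""
    if total == 0:
        return None
    return 100.0 * (total - free) / total


if __name__ == "__main__":
    report = usage_report("/")
    print("inode use %:",
          percent_used(report["inodes_total"], report["inodes_free"]))
    print("bytes available:", report["bytes_available"])
```

A file system with plenty of free bytes but no free inodes will still refuse to create new files, which is why both numbers are worth checking.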
00:00
There may also be a quota limit on
00:00
a device, or the user or group has hit their quota,
00:00
and so we can check /etc/fstab for usrquota or
00:00
grpquota on a device and see if that's
00:00
on the line for that device in /etc/fstab,
00:00
or we could use repquota or
00:00
repquota -a to examine quotas.
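
Checking the fstab file for quota options is easy to automate. The sketch below parses fstab-style text and reports which mount points carry the `usrquota` or `grpquota` options; the function name is made up for this example, and it deliberately takes the file contents as a string so it can be tried on sample text before pointing it at the real `/etc/fstab`.

```python
import os


def quota_options(fstab_text):
    """Map each mount point to the quota options found on its fstab line."""
    result = {}
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point = fields[1]
        options = fields[3].split(",")
        # Options may carry a value (option=value); compare on the name only.
        result[mount_point] = [
            o for o in options if o.split("=")[0] in ("usrquota", "grpquota")
        ]
    return result


if __name__ == "__main__":
    if os.path.exists("/etc/fstab"):
        with open("/etc/fstab") as f:
            for mount_point, quota in quota_options(f.read()).items():
                print(mount_point, quota or "no quota options")
```

A mount point with quota options listed is a candidate for a closer look with `repquota`.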
00:00
With that, in this lesson, we talked about
00:00
the types of storage issues
00:00
you may need to troubleshoot,
00:00
and then we talked about locating files and
00:00
using utilities to troubleshoot storage performance.
00:00
Thanks so much for being here and I look
00:00
forward to seeing you in the next lesson.