One of the keys to having a good troubleshooting process is to actually have a process
and not just randomly
attack an issue.
Sometimes we don't think enough about that, and we should.
So we'll talk through this. What I liken it to is this: if you take your car to a mechanic, do you want him to just randomly replace parts until he actually gets it to run again? Or would you rather he had a process?
And sometimes we lose sight of that. We need to remember that the people we're supporting
would like us to also have a process, so that they have some assurance that things are going to get fixed and handed back to them
in a reasonable time without a lot of extra effort.
So, the steps to follow.
These are worth remembering.
First, identify the problem,
and that's not as easy as it seems
when somebody says to you, or you get a trouble ticket that says, "The Internet's down."
You know, okay, the entire Internet worldwide just went away?
Well, that's probably not true, right?
It may not even be that their access to the Internet went down.
I've had multiple situations where somebody told me that
their machine couldn't get to the Internet, and the reason they were sure it couldn't ended up being that their default web page didn't work and they never tried another one.
The reason their default web page didn't work is that that web server was down
and everything else on the Internet was still working. But because they launched their web browser and it came up with "unreachable," they assumed that the connection was bad,
when it was really the other end that had the problem, not ours.
Sometimes you need to help people with identifying the problem.
Ask them to describe the activity:
"Tell me. Tell me what you're seeing."
Because too often they will try to diagnose the problem and tell you what's broken rather than tell you what
symptoms they're seeing.
So the first step, identifying the problem,
means listening to what the person is saying and then asking the challenging questions, because what you're really trying to pin down is this: at some point something was working, and then something isn't working.
So what's the difference? What parts of their system are still functional? You know, "The Internet's down." "Okay, is your computer on?"
Probably the wrong place to start,
but sometimes it does get that funny.
I'll tell you a personal story.
I spent two weeks trying to help somebody. This has got to be over 10 years ago now.
I spent two weeks trying to help somebody
who couldn't get the computer on.
Okay, it won't turn on for them.
And this was in a classroom that had two instructors who shared access to the system.
Only one of the two people was complaining the computer wouldn't turn on. I actually called the other instructor, and the other instructor said, "No, everything's fine."
But I'd get back to the first person, and the first person said, "No, it still won't turn on for me."
And one day, after going over a couple of times and finding everything worked just fine,
I went over to the instructor who was having trouble and said, "You know, I can't find this. Let me watch you.
Maybe that'll help."
Here's what I found.
That instructor's
other team member had changed over the summer,
and what had happened is the person the previous year
had left the computer on all year long
and only shut the screen off.
So instructor number two came into the room,
and all they needed to do was
punch the button to turn the screen on, and there was the computer.
The partner this year, when he was done with the computer,
went and shut the system down.
So the person who was complaining about the problem didn't know how to turn a computer on.
They only knew how to turn the screen on. So when they turned the screen on and didn't see anything, they assumed that the computer wasn't coming on for some reason.
So always identify the problem. Say, you know, if "the Internet's down,"
maybe, if you've got an in-house website, ask them, "Hey, can you try the local website inside?"
And if they can't get to that either, then maybe they've got a connection problem right on their own system.
If they can get to the inside site but not the outside site, now you're beginning to identify the problem better.
You need to do that so you can get to the next step: establish a theory of probable cause.
In other words, come up with, "Well, I think it's this," right?
You do that based on experience. You can do that based on the person's explanation.
But ultimately you need some idea of probable cause so you know what to investigate, what to look for, to try to fix it.
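The inside-site and outside-site checks can be turned into a first theory of probable cause. What follows is only a minimal sketch in Python: the function name `probable_cause` and the wording of each theory are assumptions made for illustration, not part of the talk.

```python
def probable_cause(inside_ok: bool, outside_ok: bool) -> str:
    """Map two reachability checks to a first theory of probable cause.

    inside_ok:  the user can reach the in-house website
    outside_ok: the user can reach a website on the Internet
    The returned theories are illustrative assumptions, not a complete list.
    """
    if inside_ok and outside_ok:
        return "connection looks fine; suspect the one site the user tried"
    if inside_ok and not outside_ok:
        return "local network works; suspect the path to the outside"
    if not inside_ok and outside_ok:
        return "outside works; suspect the in-house web server"
    return "nothing reachable; suspect the user's own connection"


# Neither site is reachable, so start with the user's own connection.
print(probable_cause(inside_ok=False, outside_ok=False))
```

The result is a starting theory, not a diagnosis. It still has to be tested.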
And then step three: test the theory to determine the actual cause. Is it really what you thought it was?
If they can't reach the outside site and they can't reach the inside site,
you think, "Okay, the network cable must be unplugged." So
go back and see: is the network cable really unplugged?
If the network cable was unplugged, plug it back in,
see if that theory is confirmed, and move on to the resolution. See if that fixes it.
If the network cable is plugged in, though,
then the first theory is wrong, and maybe there's a second theory.
Maybe it's unplugged on the other end, where the switch is.
So there it is: move step by step, have some process, have some
pattern where you're getting closer and closer to the solution.
And if you have to, if you don't have another theory,
that's what escalation is for.
When you're out of reasonable ideas, that's when it's time to share it with a second person.
They don't even necessarily have to be higher than you,
but sometimes a second set of eyes will help solve a problem right away.
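The test-then-escalate pattern can be sketched as a loop over theories, tried in order. This is a rough illustration under stated assumptions: `find_cause`, `next_action`, and the two example checks are hypothetical names, and the lambdas stand in for a real inspection such as looking at the cable.

```python
from typing import Callable, List, Optional, Tuple

# A theory is a description plus a check that returns True when confirmed.
Theory = Tuple[str, Callable[[], bool]]


def find_cause(theories: List[Theory]) -> Optional[str]:
    """Test each theory in order; return the first one that is confirmed."""
    for name, is_confirmed in theories:
        if is_confirmed():
            return name
    return None  # every theory was wrong


def next_action(theories: List[Theory]) -> str:
    """Decide what to do after testing: plan the fix, or escalate."""
    cause = find_cause(theories)
    if cause is None:
        return "escalate: ask a second person to look"
    return f"plan the fix for: {cause}"


# Hypothetical checks: the first theory fails, the second is confirmed.
theories = [
    ("cable unplugged at the PC", lambda: False),
    ("cable unplugged at the switch", lambda: True),
]
print(next_action(theories))
```

Ordering the list from most likely to least likely keeps each test moving closer to the solution, and an empty or exhausted list is the signal to escalate.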
Once you know what the
actual cause is, then establish the plan to resolve the problem and implement the solution.
Do the repair. See if full functionality has returned.
If it hasn't,
then look for a second problem.
When you do find that full functionality has returned, one of the most important things is to see if you can implement preventive measures.
If the reason the network cable was unplugged was because the little retention clip was broken off,
shoving the cord back in and still not having a retention clip isn't going to solve the problem. It's going to
come out again next week or the week after.
So look for the preventive measure to make sure, whenever possible, that this is a permanent solution and not just a stopgap waiting to break again next week.
And then finally, document your findings, actions, and outcomes. For most of us, that means closing the trouble ticket:
record everything that happened, close the ticket.
Because sometimes the problem you see is going to be seen by five other people or 10 other people or 100 other people,
and there's a systemic problem behind it. Maybe the systemic problem is
that there's a virus going through the network.
The problem may be that the switch in the closet is flaky and works sometimes and not others. But if you don't record the findings, the actions, and the outcomes,
nobody else can get that bigger picture that says, "Oh, we've seen 15 instances of this
in that building this month. We should go see what's the matter with the infrastructure."
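Recorded tickets are what make that bigger picture visible. The sketch below counts closed tickets per building and symptom and flags any pair that keeps recurring. The ticket fields (`building`, `symptom`), the function name `systemic_candidates`, and the default threshold of 15 (borrowed from the example above) are all assumptions for illustration, not a real ticketing system's schema.

```python
from collections import Counter
from typing import Dict, List, Tuple


def systemic_candidates(
    tickets: List[Dict[str, str]], threshold: int = 15
) -> Dict[Tuple[str, str], int]:
    """Return (building, symptom) pairs seen at least `threshold` times."""
    counts = Counter((t["building"], t["symptom"]) for t in tickets)
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical month of tickets: 15 in building A, 2 in building B.
tickets = (
    [{"building": "A", "symptom": "no link"}] * 15
    + [{"building": "B", "symptom": "no link"}] * 2
)
print(systemic_candidates(tickets))  # only building A crosses the threshold
```

None of this works unless each ticket is closed with the findings, actions, and outcomes filled in.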