Common Green Belt Distributions Part 1: Normal

9 hours 53 minutes
Video Transcription
Hi, guys. Welcome back to your Lean Six Sigma Green Belt. I'm Katherine MacGyver, and hallelujah, we made it. This module is all about distributions.
So let's pause and celebrate for a minute, because this is your get-down as a Green Belt. This is going to be your jam. Your normal distribution. This is what we've worked towards. We've talked about it for the last two modules. Heck, we even, or two courses, we even named the discipline after it. So in today's
lesson, you'll be able to understand and recognize normal distributions.
Now, I'm going to stop for a minute here and give you a hard stare if you are not already able to recognize a normal distribution, because it has shown up throughout the past two courses. You are, however, going to learn something new. You're going to learn the requirements for a normal distribution,
and we're also going to go over the spectrum of the bell curve. And what that means is,
how does the data statistically break out?
So the first thing that you need to know: as much as we love the normal distribution, it does need to be continuous data. So if you think back to our Yellow Belt course, where we had a graphic with the faucet where we had a stream versus drips, you know the difference between our discrete and continuous data.
But it's important that it's continuous data
because we're talking about reporting on a spectrum. We're starting to look at probabilities, but most importantly, because it is a spectrum, this is where we're starting to see our slices.
This is not to say that our histograms with our discrete data are not helpful.
But when we use this as a feeder into statistical process control, we really need that spectrum aspect.
You remember from our distributions module that mean equals median. Our normal distribution is symmetrical. We don't care about how tall or wide it is, so we're not looking at the data spread or our statistics of dispersion,
but we are looking at our measures of central tendency.
Does our mean equal our median? With that said, the statistics of dispersion are important, because what we're looking at is how wide it is, which tells us our standard deviation, which plays into our Cp, Cpk, Pp, and Ppk.
So if you think back to our voice of the
process modules, when we were talking about Cp, Cpk, that sort of thing, what we were talking about is how well your normal distribution fits within your upper and lower spec limits. So we'll look at that in a couple of slides. So, the important thing: mean equals median.
This is how you are going to be such a great Green Belt. You're going to be a bad A
when you go in and you're like, "This is a normal distribution." It does need to be symmetrical. I'm going to give you a little teeny, tiny hint, because this is your Green Belt, and this is your get-down and this is your jam.
If you look at data that is not necessarily a normal distribution, all data normalizes with a larger sample size. So if you get data and you're like, "Whoa, this is not a normal distribution," that's okay. Get a larger sample size.
Student T, who was one of our great statisticians (he wrote under his pen name, Student T,
and we'll talk about him when we talk about probability a little bit), told us that about 600 data points is when you'll start seeing that normalizing effect. If for some reason you don't have 600 data points, you can always plug through a multiplier and also start seeing that normalizing effect. But,
well, normalizing data is a Black Belt skill. You, as a Green Belt who's super savvy,
will say, "If my data doesn't look right, I'll get a little bit more. It'll probably look like a normal distribution then."
So this is it. This is our normal distribution. So a couple of things that you need to note. Remember, we talked about this being probability. You'll see these patterns in your discrete data, and we can use some of the tools to analyze our discrete data,
but what we're looking at here is a probability, which means we do need that continuous data.
So you'll notice right down the middle, that's going to be our median or average or mean. Excuse me, our mean or average. That's going to be when you take all of your numbers, you create that average, and then each of the chunks away from it.
So we have our mu, which is our average,
and then you'll notice there's plus or minus this funky little Greek symbol. That's our sigma symbol. So if you take your average plus one sigma, which, if you remember from how we named Six Sigma, means standard deviation,
you're going to see about 34.13%. What that means is 68-and-change percent of your data should be within one standard deviation of your mean.
So when we talked about our presidential height and I said that we have
about a two-inch standard deviation on either side when we were talking about measurement scales, what that will tell us is we have about 68% of our data within here. If you do not, then this is where you are going to see that there are atypical distributions, and you're going to want to do those interventions
that you are developing in your Improve phase.
We have a module on atypical distributions, just to help you recognize them, coming up in a few lessons. Two standard deviations from the mean, you're going to add an additional 13.59% on each side. So what that tells us is that within
two standard deviations, plus or minus, of the mean,
you should be looking at around 95% of your data. And then add the additional third standard deviation from your mean, and that's going to push you up to 99. And I see 99.8. It depends on how many significant figures you go out,
but 99-and-change percent of your data should be within
three standard deviations of your mean. So what that tells us is that as we decrease our variation, because remember, Six Sigma is about decreasing variation, our bell curve is going to get narrower. The width of our bell curve
is driven directly by our standard deviation, which is a measure of variation.
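The slices described above can be reproduced from the normal curve itself. This is a minimal sketch using only the Python standard library; the function name `share_within` is made up for illustration and is not from the course.

```python
import math

def share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

previous = 0.0
for k in (1, 2, 3):
    total = share_within(k)
    # Each new band is split evenly between the left and right side of the mean.
    per_side = (total - previous) / 2
    print(f"within {k} sigma: {total * 100:.2f}%  (adds {per_side * 100:.2f}% per side)")
    previous = total

# within 1 sigma: 68.27%  (adds 34.13% per side)
# within 2 sigma: 95.45%  (adds 13.59% per side)
# within 3 sigma: 99.73%  (adds 2.14% per side)
```

These totals do not depend on the mean or on how wide the curve is, which is why the same percentages apply to a tall, narrow bell curve and a flat, wide one.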
So when we talk about our average and our central tendency, that should be right between our upper and lower customer spec limits. We did talk about how
there are scenarios where you don't have a lower or you don't have an upper, but you know where your average is. This is your average time to complete, your average number of units in a shift,
the average phone calls. These are the things that you are counting on, time to completion being a really big one if you're reducing your cycle time. But with that, where you really are interested in the Six Sigma aspect of this is in your variation, or the width of your standard deviations.
That was really, really great and very statistical. I prefer to think of data in terms of something that is more relevant for me. So now let's look at a bell-shaped curve for how I think of people. So about the middle chunk, we'll say about 68% of people,
I don't have a positive or negative preference for. These are going to be the people right smack dab in the middle. Then if we go out to 95% of people, you'll see that now I have some positive or negative preferences. My bell curve is starting to drop. We've got friends. We've got sports teams.
And then, if we go out three standard deviations from the mean,
we should get all levels of awesomeness. So this comes back to telling us that we're thinking about a spectrum when we're looking at our data, where you will have these points within it, but it all indicates positive and negative around your average.
So when we're thinking about our normal distribution, we want to remember: all data normalizes at about 600. You do need greater than 25 data points for an indicative graph. We'll talk a little bit in statistical process control about how many data points you need to recognize a trend.
But this is one of those situations where larger is better. You do want to have greater than 25. If you have upwards of 600, you're going to feel great.
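To get a feel for why more data points help, here is a rough simulation sketch. It assumes the underlying process really is normal, and the mean of 70 and standard deviation of 2 are hypothetical values chosen for illustration, as is the function name `share_within_one_sigma`. It only shows how steady the "about 68% within one standard deviation" reading is at 25 versus 600 points.

```python
import random
import statistics

def share_within_one_sigma(data):
    """Fraction of points within one sample standard deviation of the sample mean."""
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(1 for x in data if abs(x - m) <= s) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (25, 600):
    # Hypothetical normally distributed measurements: mean 70, standard deviation 2.
    sample = [rng.gauss(70, 2) for _ in range(n)]
    print(f"n={n}: {share_within_one_sigma(sample) * 100:.1f}% within one sigma")
```

With only 25 points a single value moves the share by four percentage points, so the reading can land well away from 68%; at 600 points it typically settles within a few points of it.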
There can be an infinite number of shapes. This can be, you know, all the colors under the rainbow: tall, short, wide, narrow. But when you look at a normal distribution, you know it is a normal distribution, mathematically, when mean equals median.
It does not matter if this is pancake flat
or straight up and down. If the mean does not equal the median, you do not have a normal distribution.
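The mean-versus-median check is easy to run on real numbers. The sketch below uses two small made-up data sets, and the helper name `mean_median_gap` is hypothetical. Keep in mind that matching mean and median is a symmetry check: it can rule a normal distribution out, but other symmetric shapes pass it too.

```python
import statistics

def mean_median_gap(data):
    """Gap between mean and median, in units of the sample standard deviation."""
    gap = abs(statistics.mean(data) - statistics.median(data))
    return gap / statistics.stdev(data)

# Hypothetical data sets, for illustration only.
symmetric = [68, 69, 70, 70, 71, 71, 71, 72, 72, 73, 74]  # e.g. heights in inches
skewed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 9, 15]                # e.g. cycle times in days

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(f"{name}: mean={statistics.mean(data):.2f}, "
          f"median={statistics.median(data)}, gap={mean_median_gap(data):.2f} sigma")
```

In the first set the mean and median are both 71. In the second the long right tail pulls the mean above the median, which is the signal that the data is not a normal distribution.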
Let's talk about what good looks like for a minute. So remember back to Cpk and Ppk, and we're going to use the K because that tells us that it is centered. So with a normal distribution,
when we have our upper and lower specification limits, if we're working at three sigma, what you see is you have data points outside of those spec limits.
That means that this is waste. This is scrap. This is rework. This is not acceptable to the customer. Same with 4.5, which is where we get everything functioning within the limits, but you'll notice we have no room for error. So if we have a bad day and our operators aren't paying as close attention, or I haven't had my coffee,
there is a possibility we're going to get something outside of our customer requirements.
Six sigma is ideal because we've got this big old cushion between our upper and our lower that gives us some breathing room if we do see incidental variation, or what we call special cause variation. We'll talk about it later in the course. But this is where
our conversation about the voice of the process comes in: how is our voice of the process functioning with our customer requirements?
The normal distribution is what tells us this.
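The comparison of three, 4.5, and six sigma can be sketched numerically. The example below assumes a perfectly centered, normally distributed process with a hypothetical mean of 100 and standard deviation of 2, and places the spec limits at the stated number of standard deviations from the mean. The function names are made up for illustration; `cpk` uses the usual definition, the distance from the mean to the nearer spec limit divided by three standard deviations.

```python
import math

def cpk(mean, sigma, lsl, usl):
    """Distance from the mean to the nearer spec limit, in units of 3 sigma."""
    return min(usl - mean, mean - lsl) / (3 * sigma)

def share_outside(mean, sigma, lsl, usl):
    """Share of a normal process that falls outside the spec limits."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))
    return cdf(lsl) + (1 - cdf(usl))

mean, sigma = 100.0, 2.0  # hypothetical process values
for k in (3, 4.5, 6):
    lsl, usl = mean - k * sigma, mean + k * sigma
    print(f"limits at {k} sigma: Cpk={cpk(mean, sigma, lsl, usl):.2f}, "
          f"outside spec={share_outside(mean, sigma, lsl, usl):.2e}")
```

With limits at three sigma, roughly 0.27% of output falls outside spec even on a good day. Widening the gap between the process and its limits drives that share toward zero, which is the cushion described above.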
So today we went over normal distributions. We read some normal distributions, and we understand the requirements: greater than 25 data points, mean equals median, all colors under the rainbow. And in our next module, we're going to go over the binomial distribution.