Hi, guys. Welcome to Measure Phase: Types of Data. I'm Catherine MacGyver, and today you're going to understand the types of data used in process improvement. You will be able to understand the differences between continuous and discrete data, and you'll have an awareness of the measurement implications of each type of data.
So, jumping right in: the reason why this is important to you is that the type of data you are collecting is an input, and a very large factor, in your data collection plan, which is one of the major tollgate items and one of the major deliverables from your Measure phase. So when you complete this phase of the DMAIC project, you will have a plan that says, this is how we're going to collect our data, this is what the baseline is, and we're going to use that same plan to remeasure as we move to our Improve and Control phases. So it's very important for you as the practitioner to understand the differences in the data, because there are different ways that you collect this information. Now, that being said, there is no one specific right way to do it. What you're looking for when you're doing data collection is that it is reproducible and repeatable, so you have some stability and consistency in that plan.
Most of the time, the majority of these actually do have some semblance of a measurement plan. So, jumping right into data, we have two major types: categorical and numerical. For those of you who have taken any form of science classes or any statistics, you will recognize these as qualitative and quantitative. Categorical is going to be data that is based off of language and words, not necessarily having any numbers assigned to it unless you code them. Numerical data is going to be data that is based in numbers. So these are going to be things that would be eligible for descriptive statistics: what you would find in a math class, as compared to what you would find in maybe a social studies class.
Diving more into categorical data, we have two subcategories of that. We have nominal data, which is data that has no rank or order. What you're looking at here is that it's not apples to apples, or one apple being better than the other apple. You're looking at a cat or a dog or a goldfish. There is really no relationship between them, but you can count frequencies on them when you're measuring them. This is going to be something like complaints. Complaints tend to always be nominal data because they give us a snapshot in time of that person's experience.
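To make the "you can count frequencies" point concrete, here is a minimal sketch in Python. The complaint categories are hypothetical, made up purely for illustration; they are not from the course material.

```python
from collections import Counter

# Hypothetical complaint categories, invented for illustration only.
complaints = ["billing", "wait time", "billing", "staff", "wait time", "billing"]

# Nominal data has no rank or order, so the meaningful summary
# is how often each category shows up.
counts = Counter(complaints)

# Print categories from most frequent to least frequent.
for category, n in counts.most_common():
    print(f"{category}: {n}")
```

Notice there is no average here. There is nothing sensible between "billing" and "staff", so counting is as far as the numbers go.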
Conversely, you have ordinal data, which is ranked data, and when we talk about ranked data, what we're talking about is that there is a sequence. I mentioned briefly that categorical data has no numbers unless you code the data, which is where you assign a numerical value to a non-numerical piece of data. An example is the customer satisfaction Likert scale. So you have most dissatisfied, dissatisfied, neutral, satisfied, and most satisfied. There's a rank order. There is a progression that can happen here. Another way to think of a ranked order is that you can have elementary school, middle school, high school, college, and so forth. So there is some relationship between the data. Recapping: nominal has no relationship between the data. This is cats and dogs. Ordinal has a relationship between the data, but it's still a category, hence categorical. This person is an elementary school aged child. When they move further, they will be a middle school aged child. So there is a loose association.
With that being said, try to keep in mind that applying data coding, so saying that a customer satisfaction of most satisfied is a five, doesn't mean that the data is subject to descriptive statistics, because the categories are independent of each other, even if there is some sort of sequence in your ordinal data.
So, conversely, we have numerical data. This is our get-down. This is what we love. Lean Six Sigma practitioners really like it because you can put statistics on it and you can graph it, and this is super cool. You get your stopwatch and you do your time studies and all of this. This is numerical data. It is commonly referred to as the gold standard, and specifically continuous data is the gold of the gold. This is the platinum standard. I want to put a caution out there, though, for you to not think of it strictly as a gold standard. Continuous data is the best and we love it, and especially as we start getting into higher-level, advanced statistics, this is the only thing we can work with. That being said, a lot of organizations work in categorical data land too. You have customer satisfaction surveys. You have customer complaint information. All of this is categorical. So don't rule it out just because we like numerical data better because it makes prettier graphs. So,
off of my soapbox: the difference between continuous and discrete data. We're going to drill into that on our next slide, but continuous data is something to think about on a spectrum. Discrete data is something that you would see on an interval, or more specifically, if you look at the image on the left, our discrete data is the number of drops. We had 10 drops that came out of the faucet. We could count 1, 2, 3, 4. These are going to be our intervals. They're also considered integers, if you guys think back to middle school math. When we talk about continuous data, we're talking about measurements. This is a spectrum. You can have infinite pieces within it. So temperature, time, distance, these sorts of things are all continuous data. If you look to the right, I have two example graphs that show the difference between a continuous and a discrete distribution curve. So, a bell-shaped curve, when we're starting to look at things like above or below customer expectations. The blue on the bottom is going to be your discrete, where you have individual data points that we can plot in a sequence, but they're not necessarily on a spectrum. If you look at the top, you have your continuous data. This is going to be a spectrum. It's going to be 1.1, 1.2, 1.3, et cetera, et cetera. So, again, looking at it in a graphical way, continuous data is easiest to graphically display and to work with from a statistics standpoint.
So really, really quick, test your knowledge. I want you to think for a second: what type of data is our client satisfaction survey? Look at this one specifically. All right. So this one is actually a tricky one. This is one that I have tried to reinforce with you guys, because we're starting to talk about the ways that data looks funny. This one is actually ordinal data. There is a loose association: you have "not likely" to "very likely." However, when you are looking at these scores, when you're doing your data analysis, this looks a lot like discrete data. So you've got, you know, four number twos and five number fives, and you end up with an average somewhere between three and four, doing math really fast, and now it looks like you could do descriptive statistics on it. Unfortunately, that's not the way that categorical data works, because the values are not necessarily numerically derived. This is what their experience is. So when you're digging into the data, just because it's coded doesn't necessarily mean that it is quantitative or numerical data.
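The quick math in that example can be sketched in Python. The nine scores come straight from the example (four 2s and five 5s). The choice to summarize with frequency counts and a median instead is an assumption about what a reasonable alternative looks like, not something prescribed in this module.

```python
from collections import Counter
from statistics import mean, median

# Coded survey responses from the example: four 2s and five 5s.
scores = [2] * 4 + [5] * 5

# The arithmetic works, which is exactly why coded data looks numerical.
mean_score = mean(scores)  # 33 / 9, somewhere between three and four

# But nobody actually answered 3 or 4, so the average describes
# an experience that no respondent reported.
counts = Counter(scores)

# Summaries that respect the rank order without treating the codes
# as measurements: frequency per category and the middle response.
median_score = median(scores)

print(f"mean of codes: {mean_score:.2f}")
print(f"responses per code: {dict(counts)}")
print(f"median response: {median_score}")
```

The average lands between three and four even though every respondent picked either a 2 or a 5, which is the trap the coding creates.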
All right, so coming back to our types of data summary: continuous data is considered the gold standard, and that has to do with the statistical implications that go along with it. If you were to, say, get into projections or regression analysis, you must have continuous data, which comes back to this: the type of data dictates the type of collection that you do. So when we're talking about categorical data, you're going to be looking at things like client satisfaction. When we're talking about discrete data, we're talking about checklists and tick marks, which will become histograms. When you're talking about continuous data, you're going to be talking about data that's on a spectrum and can be statistically analyzed.
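As a rough sketch of how tick marks become a histogram, here is a small Python example. The check sheet values and the `text_histogram` helper are hypothetical, invented for illustration only.

```python
from collections import Counter

# Hypothetical check sheet: number of defects counted on each inspected unit.
defects_per_unit = [0, 1, 0, 2, 1, 0, 3, 1, 0, 1]

# Tally how many units had 0 defects, 1 defect, and so on.
tally = Counter(defects_per_unit)

def text_histogram(tally):
    """Return one line per counted value, with one '#' per tick mark."""
    return [f"{value} | {'#' * tally[value]}" for value in sorted(tally)]

for line in text_histogram(tally):
    print(line)
```

Each `#` is one tick mark from the check sheet, so the printed rows are the histogram bars turned on their side.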
All right, guys. Thank you, and I look forward to seeing you in the next module.