Regression Analysis

9 hours 53 minutes
Video Transcription
Hi, guys. Welcome back. I'm Katherine MacGyver, and this is your Lean Six Sigma Green Belt. So today we're going to go over regression analysis.
So if you remember, through our hypothesis testing, we went down the continuous data path, and we said we as Green Belts are going to be spending a lot of time looking at relationships. Regression analysis is your primary relationship tool. So today we're going to introduce the idea, and then I want you to understand why you care about regression analysis and how this is going to help you as a practitioner and potentially just as a business person.
I have to tell you that regression analysis is my most commonly used analysis tool. When I'm looking at disparate data sets, regression analysis is the most useful tool. And let me tell you why. First, I love it, so it's obviously going to be useful. But aside from me loving it, what I really love about regression analysis is that it gives us an ability to measure the relationship between two data sets, and it gives us a sense of the magnitude of that relationship.
So regression analysis is fundamentally the relationship tool. You can be tricked once in a while and see things that have a great correlation but don't necessarily have causation. However, because you're going to do really great pilots, you're going to have before-and-after data, or concurrent data with different data sets to look at, so you're going to be OK in that whole correlation-versus-causation relationship.
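To make the "tricked by correlation" point concrete, here is a minimal sketch using simulated data. Everything in it is an assumption for illustration: the names `hidden_driver`, `measure_a`, and `measure_b` are hypothetical, and the numbers are made up. Two measures that never influence each other still correlate strongly because a third factor drives both, and the correlation largely disappears once that factor is accounted for.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical hidden factor that drives both measures (made-up data).
hidden_driver = rng.normal(size=500)

# Neither measure depends on the other; each depends only on the
# hidden factor plus its own random noise.
measure_a = 2.0 * hidden_driver + rng.normal(scale=0.5, size=500)
measure_b = 1.5 * hidden_driver + rng.normal(scale=0.5, size=500)

# Correlation between the two measures as observed.
raw_corr = np.corrcoef(measure_a, measure_b)[0, 1]


def residuals(y, x):
    """Remove the straight-line effect of x from y using least squares."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (slope * x + intercept)


# Correlation that remains once the hidden factor is accounted for.
adjusted_corr = np.corrcoef(
    residuals(measure_a, hidden_driver),
    residuals(measure_b, hidden_driver),
)[0, 1]

print(f"raw correlation:      {raw_corr:.2f}")
print(f"adjusted correlation: {adjusted_corr:.2f}")
```

In a real project the hidden factor is usually not something you have measured, which is one reason a pilot with before-and-after or concurrent data is a safer way to argue for causation.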
This is the best tool for a Green Belt when we're measuring our Xs and our Ys: our inputs and outputs, our independent variables and our dependent variable.
We love it. Correlation, of course, is not causation. But here's a hint: literally, I have never been able to find the statistician who said this. So I guess, apparently, you don't want that to be your claim to fame. But something to keep in mind is that ice cream sales and murder rates, or shark attacks, do grow together when you have warm weather. Like I said, it is a big driver. But, you know, take it with a grain of salt. Think about other factors as well as you are thinking about this godsend, which is the greatest thing ever. So
regression analysis mathematically describes the relationship between your dependent and your independent variables. So these are the things that I can change; these are the outputs that I see. If I change X, what happens to Y? If you remember, we talked about Y equals f of X. This is what a process is. We are now at a point where we can measure the magnitude, or whether or not, if I do something to X, something happens to Y. This is going to be your hypothesis test as a Green Belt.
What you're going to need to do is, ideally, you want two or more continuous data sets. You can do it with continuous and discrete. So if you are looking at, say, downtime of systems and customer satisfaction scores, I want to warn you that it's not super great, because you have a many-to-one relationship on your discrete data. So if you say your discrete data is worth one, and you have this spectrum of continuous data that relates to it, it's not as refined as it could be. It can be done.
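As a rough illustration of why a discrete measure is "not as refined," the sketch below simulates downtime and a satisfaction measure, then records that same measure on a 1-to-5 survey scale. The data, the coefficients, and the names (`downtime_hours`, `satisfaction_continuous`, `satisfaction_score`) are all made up for this example. It fits a straight line to each version and compares how much of the variation the line accounts for.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Simulated, made-up data: hours of system downtime (continuous input).
downtime_hours = rng.uniform(0.0, 10.0, size=300)

# A hypothetical continuous satisfaction measure that falls as downtime rises.
satisfaction_continuous = (
    5.0 - 0.4 * downtime_hours + rng.normal(scale=0.3, size=300)
)

# The same measure as it would be recorded on a 1-to-5 survey scale.
satisfaction_score = np.clip(np.round(satisfaction_continuous), 1, 5)


def r_squared(x, y):
    """R-squared of a least-squares straight line fitted to y against x."""
    slope, intercept = np.polyfit(x, y, deg=1)
    residual = y - (intercept + slope * x)
    return 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)


r2_continuous = r_squared(downtime_hours, satisfaction_continuous)
r2_discrete = r_squared(downtime_hours, satisfaction_score)

# Many different downtime values collapse onto the same few scores.
print(f"distinct continuous values: {np.unique(satisfaction_continuous).size}")
print(f"distinct survey scores:     {np.unique(satisfaction_score).size}")
print(f"R^2 with continuous output: {r2_continuous:.3f}")
print(f"R^2 with 1-to-5 scores:     {r2_discrete:.3f}")
```

The relationship is still visible with the survey scores, which matches the point that it can be done, but the rounding throws away detail that the continuous version keeps.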
It's just not ideal. It's one of the ones where you're really going to want to look at it with a grain of salt and ask yourself whether there may be a different data set that you can work from that would be more indicative, because ideally you want two continuous sets together. But it just mathematically describes our relationship, which is really, really exciting. It also gives us the sense of magnitude in those relationships.
When I say "or more," remember we talked about how you're going to have a lot of root causes for your potential problem statement. You want a measure for your problem. It's why we keep talking about things in these kind of finite ways: cycle time, throughput, takt time, et cetera, et cetera.
You want that. You want to have a measure for your problem, because this is going to be your Y. So ideally continuous is best, but you're going to work with the data that you have. Then you're going to want to have a measure for each of your hypotheses.
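With one measure for the problem (Y) and one for a hypothesis (X), the "Y equals f of X" idea can be sketched as a straight-line fit. This is only an illustration under stated assumptions: the numbers are invented, the variable names are hypothetical, and a straight line fitted by least squares is just one simple choice for f. The slope is the sense of magnitude: how much Y moves, on average, for each one-unit change in X.

```python
import numpy as np

# Made-up illustration data (not from the lesson):
# x = a hypothesized input, y = the problem measure, e.g. cycle time in minutes.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([12.1, 13.9, 16.2, 17.8, 20.1, 22.0, 23.8, 26.1])

# Least-squares straight line: y is approximately intercept + slope * x.
slope, intercept = np.polyfit(x, y, deg=1)

# R-squared: share of the variation in y that the line accounts for.
predicted = intercept + slope * x
ss_residual = np.sum((y - predicted) ** 2)
ss_total = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_residual / ss_total

print(f"y = {intercept:.2f} + {slope:.2f} * x")
print(f"R^2 = {r_squared:.3f}")
```

With several hypotheses, the same fit can be repeated once per X measure against the same Y to compare which inputs appear to matter most.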
So if you remember, we were talking about our hypothesis development. You want to have a measure for your X, so either before and after, if you're looking at something where you're piloting, or your X has a unique measure and your Y has a unique measure, but you can influence your X. You want two data sets. Again, before and after is really best, because it gives you a sense of piloting, or you can have two sets concurrently, like: here is one department using the new and improved process, but this department is still using the old process, and we're going to compare those. Those are the kinds of data sets you want to look for for regression analysis.
So with that, it's been a while since you guys have had a homework assignment. So I want you guys to think through what are some processes in your workplace that you can apply regression analysis to. You're going to need two data sets that are related, or that we're hypothesizing are related, preferably continuous. For the sake of this exercise, we're going to say continuous. And start thinking about what that relationship could look like.
Today we went over an overview of regression analysis. You know that it is a mathematical way to demonstrate relationships. You know that I am super geeked out about it, and this is what you're going to do a lot of as a Green Belt. Regression analysis is done in phases, so our next couple of lessons are actually going to be the how-to in those phases, starting with creating and reading a scatter plot. So I will see you guys there.