9 hours 41 minutes
I welcome back to module six. This is less than 6.1 data types. This lesson We're gonna review data types and how they're implemented in my sequel,
and we're gonna discuss their roles as well. Let's go ahead and walk through these bullet points. The 1st 1 is data types, data types, data types. And the reason we say that is because data types are common concept in most, if not all programming languages. There are some high level languages where you don't have to indicate
what the data type is. Python comes to mind. You don't have to indicate that a type of a variable,
now. Data types affect the amount of memory used by the computer. For example, taxes typically more expensive than ended your data types. So if we have an energy data types, the computer might expect the maximum number of that to be 128 whereas if we indicate a text at a type,
the computer might be preparing
to hold data that goes, all that fills up an entire article, which is much larger than a simple imager. Now for specific implementation of the data types in the My sequel Database Engine, you can visit the following link listed on the slide and it
it's a good document, and it goes over all the data types that exists and how they are implemented in the my sequel database Engine.
So data types
effective use plays a large role in efficiency, especially as data volume increases.
Bad decisions may go unnoticed in small applications, while the same bad decisions may break large applications. A quick example would be to consider the Amazon database for customers. They most certainly have a database that is supporting hundreds of thousands, if not millions of customers.
And there's a big difference between
control and millions of records efficiently and controlling 10 or 20 records efficiently or not so efficiently and it not mattering that might happen at a small company
Now, with that said, let's go ahead and jump over to the development environment, take a quick look at these data types in D Beaver.
I welcome back to the development environment and get started. I wanted to bring this up, and this is taken straight from the my sequel documents and you weaken. Read that in 11.9. Choosing the right type for a column for optimum storage. You should try to use the most precise type in all cases.
For example, if a manager column is used for values in the range from 1 to 99999 medium and unsigned is the best type
of the types of represent all the required values. This type uses the least amount of storage.
So what my sequel, ISS saying here Oracle, who owns more? My sequel is saying that the data type you choose for a column has an impact, and that impact will grow as more and more data is added. So if
a poor choices made, you're going to end up using a lot of unnecessary space. So let's take a look at the data base that we built because there are some inefficient choices in that design.
So I'm looking at the buildings table and will notice that the zip code is a bar chart to 55 now. This is definitely larger than it needs to be, because even if I did want to use a var, char, a zip code is never going to get to the size of 255 characters.
Now, to make some of these decisions, you do have to have some domain knowledge and domain knowledge would be knowing that a zip code
I won't ever get that big because you know the format and the size of a zip code.
Let's take a look at another table
if we can.
Time stamp time some
Some of these choices leave some room for subjective this now. It wasn't that long ago that lay arable if it when it created a i. D column by default, he used imager on signed in Not Big Imager on Sign. Now big energy or unsigned definitely prepares you for the possibility of gathering a large
volume of data.
But on a table of states, maybe where there's 50 states and, you know, maybe extra states get added. But they will never be that many big energy on signs really wouldn't make a lot of sense.
you know, I I don't think we are making too big of a sacrifice by having our i d Be big energy on sign, just in case our database grows to a large size.
Now, some other examples is course code would probably never get to the size of 2 55
And here's another one. Letter grade will never be a bar chart to 55 have probably be more appropriate to make this a bar char
well, one if it's following a traditional letter grading system, which is a, B, C, D and F
great score with decimal eight to makes sense.
So there's not a lot of bad decisions in this database, which is good. You should definitely get a little concern if you open a database up, go through all the tables and everything is text, because that would typically be assigned that no thought at all was given two data types,
which is, ah could be a precursor of things to come.
So the big take away here is that when designing a database, you should definitely be given some thought to dio data types of your columns. You should be trying to
select data types that capture the data that is necessary without leaving a bunch of wasted space, and that will help your data applications scale upwards so that if lots of that is added, it doesn't slow down. Now. Data types is not the one answer to answer everything,
even if you do an excellent job of data types. But then you leave out indexes than the lack of indexes will slow you down.
So there's all these little pieces of the database that you want to try to line up so that you have a really fast application at the end of the day.
Now, with that said, let's go ahead and head over to the summary, and that brings us to the 6.1 data types summary thought it might be useful to list some of the common mistakes I see on the field with data types, and these include, but aren't limited to using text when imager was appropriate. Some examples of that includes Social Security number,
date parts. If you're gonna have three columns one for the year, one for the month and one for the day, you can get away with small energy for that.
Sometimes these occur because it's just too easy to use a text column.
In addition, using debt types of their too large, for example, using a regular inner jer for a boolean value zero and one now as a new program are coming in. If I see a regular inner jer value,
I think you're storing some kind of number.
But if they're being used for billions, then it takes a little bit of additional investigation to realize that that feels being used in place of another data type. For some, some reason.
Another example is using big managers for dates. You can use regular size managers for dates. If you're doing the four digit year, the two digit month, two digit day format
and one more additional issue I see is not using texts when appropriate and you want to. You typically use the text or bar chart data type. When you're dealing with unknowns. Now you might be asking yourself Well, if I'm designing the database, when would I be dealing with an unknown? Well, a good example of that is, if
you were given a task to scan emails and farm data from emails
that are arriving from some third party. That third party might not be exposing, how they store their data, how they deliver their data, which makes their data kind of an unknown. You might see a few examples and say, Yeah, it's usually a number, but I'd rather not crash the insert on my program, so I'll just make it
and that would be a good example when to use text is when you're dealing with unknowns.
In any case, that brings this lesson to an end, and I hope to see you in the next thank you.
The Microsoft Azure Test is a premium Cybrary assessment created by Interview Mocha. The exam ...
AWS Data Engineering
The AWS Data Engineering test is a premium Cybrary assessment test created by Interview Mocha. ...