6.1 Introduction to Cosmos DB

Video Transcription
Welcome back. This is the first episode in Module 6. We're going to talk about an introduction to Cosmos DB.
The objectives include understanding what Cosmos DB is and taking a look at some of its features, such as consistency levels, request units, and some additional components we need to know about.
So, first, what is Cosmos DB? Cosmos DB is a globally distributed, multi-model database service, but what does that really mean?
Cosmos DB allows you to dynamically scale the database service across multiple Azure regions worldwide.
What Cosmos DB does in the background is replicate the data to wherever the user is, so they can work with the data closest to them.
Unlike something like a virtual machine, which is stored in a specific Azure region, the data in Cosmos DB is replicated to multiple places to be closer to the end user. You can manage Cosmos DB to be located in multiple Azure regions at any time; when a new region is added, the Cosmos DB service will automatically start replicating that data to the region.
Cosmos DB is designed and ready for various applications like web, mobile, gaming, and IoT (Internet of Things) that require handling lots of data reads and writes.
Some of the benefits of using Cosmos DB include the ability to have it always on. By replicating data to multiple regions, Cosmos DB offers 99.999% availability, or five nines, for both reads and writes,
and you can also programmatically invoke a failover of the Cosmos account to another region,
so your application can remain available in the event of a regional outage or disaster.
Cosmos DB also has immense scalability,
and you can use multi-master replication, meaning each data server has the ability to write; the data will then be replicated to the other data servers. With this model, along with the data residing in multiple Azure regions, an application can scale to millions of requests per second across the globe. Cosmos DB also offers low latency:
it's extremely responsive, with less than 10-millisecond latency for reads and writes, so applications can query data very quickly. Cosmos DB is also very compliant, meeting ISO, FedRAMP, HIPAA, and PCI compliance.
To get started with Azure Cosmos DB, we need to create an Azure Cosmos account.
This account is associated with our Azure subscription and has a unique DNS name that we'll use to access it programmatically.
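As a small sketch of what that unique DNS name looks like in practice, the snippet below builds the documents endpoint URL for a hypothetical account name (`mycosmosacct` is made up for illustration):

```python
# Sketch: an Azure Cosmos account exposes a unique DNS endpoint derived from
# its account name. "mycosmosacct" below is a hypothetical account name.
def cosmos_endpoint(account_name: str) -> str:
    """Return the documents endpoint URL for a Cosmos account name."""
    return f"https://{account_name}.documents.azure.com:443/"

print(cosmos_endpoint("mycosmosacct"))
```

This is the endpoint an SDK or REST client would be pointed at when connecting programmatically.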
In order to scale our data globally, we need to add Azure regions to our account, and this can be done at any time in our configuration. One thing to note is that there is currently a limit of 100 Azure Cosmos accounts under a single subscription.
With Cosmos DB, you're not restricted to using a single API to work with the data stored in it.
There are a variety of APIs available, such as SQL, which is considered the core API.
We also have a couple of different flavors of NoSQL, such as Cassandra, MongoDB, or Gremlin, and then we can also use Azure Table storage. By supporting these multiple APIs, you can take an existing application and easily migrate the back end to Cosmos DB without the need to rewrite significant portions of the application.
Next, we have consistency levels. Traditionally, with distributed databases, you had to choose between having strong consistency in the data,
which often leads to higher latency, or getting better performance with eventual consistency, though it was harder to program applications this way.
With Cosmos DB, we have the option of choosing from multiple data consistency levels, from stronger consistency to more relaxed options.
In our diagram we'll be moving left to right, and as we move to the right we get higher availability, lower latency, and higher throughput. Our first consistency level is strong, where data read requests are guaranteed to return the most recently committed version of the data.
So you know you're always reading the most up-to-date data.
Next, we have bounded staleness, where you have lower latency to access the data, but you choose how stale the data can be.
This can be configured based on time constraints or the number of operations that are waiting to complete.
Next, we have session, and this is the most widely used option, as it gives the best experience for a user in their current session: the user will be able to see any data they have written and have a consistent read experience.
We also have consistent prefix. When reading the data, it may not be fully up to date with what is available, but when you receive the data, it's going to be in order.
This gives lower latency but requires more reads to get all the data that is being requested.
And finally, we have eventual. When reading the data, there's no guaranteed order in which you receive it, and written data will eventually converge and synchronize across all the regions, but the writes may not show up in order as well.
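To make the bounded staleness idea concrete, here's a toy model (not the real replication protocol, just a sketch under simplified assumptions): a replica is allowed to lag the primary's write log, but never by more than a configured number of operations, and what it reads is always a consistent prefix of that log.

```python
# Toy model of bounded staleness: the replica may lag the primary's committed
# log, but never by more than `max_lag` operations. This is an illustration,
# not Cosmos DB's actual replication mechanism.
class BoundedStalenessReplica:
    def __init__(self, max_lag: int):
        self.max_lag = max_lag
        self.primary_log = []   # committed writes, in order
        self.applied = 0        # how many writes the replica has applied

    def write(self, value):
        self.primary_log.append(value)
        # Enforce the staleness bound: force the replica to catch up
        # whenever it falls too far behind the primary.
        while len(self.primary_log) - self.applied > self.max_lag:
            self.applied += 1

    def read(self):
        # The replica always sees a consistent prefix of the primary's log.
        return self.primary_log[:self.applied]

replica = BoundedStalenessReplica(max_lag=3)
for i in range(10):
    replica.write(i)
lag = len(replica.primary_log) - len(replica.read())
print(lag)  # lag stays within the bound of 3
```

Setting `max_lag=0` in this toy model behaves like strong consistency (reads always see the latest committed write), while a very large bound approaches eventual consistency.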
When we're discussing operations taking place in Cosmos DB, we need to understand what request units, or RUs, are. Request units summarize the cost of database operations occurring in Azure Cosmos DB. The RU is the currency for throughput and is measured per second.
This way, you don't have to worry about how many CPU cycles,
IOPS, or how much memory your database operations require; it's strictly based on data being read. One RU equals the cost to read one kilobyte of data, and this is independent of which of the APIs we choose when interacting with our Cosmos DB,
and it's also regardless of the operation, whether it's a read, a write, or just a query. RUs are provisioned in increments of 100 RUs per second, and you can scale this at any time to increase or decrease the number of RUs needed.
In our screenshot on the right, you can see where we can configure the amount of throughput for our containers.
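As a rough back-of-the-envelope sketch, the snippet below applies the rule of thumb from this lesson (one RU is about the cost of reading one kilobyte) and rounds the required throughput up to the 100 RU/s provisioning increment. Real RU charges vary by operation type and indexing, so treat this as an estimate only:

```python
import math

# Rough estimate only: per the lesson, reading ~1 KB costs ~1 RU. Actual RU
# charges in Cosmos DB depend on the operation, item size, and indexing.
def estimate_read_rus(item_size_kb: float) -> float:
    return max(1.0, item_size_kb)  # ~1 RU per KB read, minimum 1 RU

def provisioned_throughput(required_ru_per_sec: float) -> int:
    """Round up to the next 100 RU/s provisioning increment."""
    return math.ceil(required_ru_per_sec / 100) * 100

# Example: 250 reads/sec of 2 KB items needs about 500 RU/s.
needed = 250 * estimate_read_rus(2)
print(provisioned_throughput(needed))  # 500
```

Scaling the provisioned number up or down at any time is what lets you match throughput to demand without thinking about CPU, IOPS, or memory.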
Speaking of containers, let's talk about items, containers, and databases.
First, items are just individual records being stored in the database, and depending on which API you choose, an item can represent different things, like a document in a collection, a row in a table, or a node in a graph.
Next, we have containers. Containers are the main storage inside of Cosmos DB, and they hold the items we have uploaded. Containers are the unit of scalability for throughput and storage.
They're partitioned and replicated across our multiple Azure regions. When creating a container, there are two modes to choose from for the throughput. First, we have dedicated provisioned, where the throughput provisioned on a container is used exclusively for that container and is backed by SLAs, or we have shared provisioned,
where the container shares the provisioned throughput with other containers inside the database.
And finally, we have databases, and a database is just a namespace that holds multiple containers. If you look at the diagram on the right, hopefully it illustrates how our hierarchy works.
We first have our Cosmos account. Then inside of it we'll create a database, or namespace, which will hold a collection of containers, and each container will hold multiple items. And while we're just showing single entities here, a Cosmos account can have multiple databases,
a database can have multiple containers, and a container can definitely have multiple items.
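The hierarchy described above can be sketched as a simple in-memory structure: one account holds databases, each database holds containers, and each container holds items. All the names here (`mycosmosacct`, `SchoolDB`, `Students`) are hypothetical:

```python
# Minimal in-memory model of the Cosmos DB hierarchy from the lesson:
# account -> databases -> containers -> items. Names are hypothetical.
account = {
    "name": "mycosmosacct",
    "databases": {
        "SchoolDB": {
            "containers": {
                "Students": {"items": [{"id": "1", "studentId": "S-001"}]},
                "Courses":  {"items": []},
            }
        }
    },
}

containers = account["databases"]["SchoolDB"]["containers"]
print(len(containers))  # 2 containers in this database
```

Even this toy version shows the one-to-many relationship at each level: one account, many databases; one database, many containers; one container, many items.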
Next up is partitioning,
and Azure Cosmos DB uses partitioning in order to scale our containers and databases to meet our application's performance requirements.
When we partition, we're taking items in a container and dividing them into subsets called logical partitions.
These logical partitions are formed based on a partition key that each item in the container has. That means every item in a logical partition has the same partition key.
So let's say we have a container which is holding 500 items; let's say they're student records.
Each of these 500 student records has a unique value for the student ID. If we use student ID as our partition key, then we're going to have 500 logical partitions created for the container. Along with the partition key that determines the item's logical partition, each item also has an item ID,
which is unique within that logical partition.
When you combine the partition key with the item ID, it creates an index for the item, which is how we uniquely identify it.
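The point above can be sketched in a few lines: the item ID only has to be unique within its logical partition, so the pair (partition key, item ID) is what uniquely identifies an item. The field names below are illustrative:

```python
# Sketch: within a logical partition the item id must be unique, so the pair
# (partition key, item id) uniquely identifies every item in the container.
items = [
    {"studentId": "S-001", "id": "a"},
    {"studentId": "S-002", "id": "a"},  # same id, different partition: allowed
    {"studentId": "S-001", "id": "b"},
]

index = {(it["studentId"], it["id"]): it for it in items}
assert len(index) == len(items)  # every (partition key, id) pair is distinct
print(index[("S-001", "b")])
```

Note that two items share the id `"a"` without conflict because they live in different logical partitions.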
An important thing to note when choosing a partition key is that a single logical partition has an upper limit of 10 gigabytes of storage.
Going back to our student record example, let's say, instead of choosing a partition key like a unique student ID, we decide to use the first name of the students. We could end up with a very large logical partition in the event we have a very common first name, say John or Mary.
So we'll want to choose something a little bit more unique.
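The skew problem can be demonstrated with a toy dataset: partitioning hypothetical student records by a unique `studentId` spreads items evenly, while partitioning by `firstName` piles every "John" into one oversized logical partition.

```python
from collections import Counter

# Toy comparison of two candidate partition keys for made-up student records:
# half the students are named "John" to simulate a common first name.
records = [{"studentId": f"S-{i:03d}",
            "firstName": "John" if i % 2 == 0 else f"Name{i}"}
           for i in range(100)]

def largest_partition(records, key):
    """Size of the biggest logical partition under the given partition key."""
    return max(Counter(r[key] for r in records).values())

print(largest_partition(records, "studentId"))  # 1: perfectly even spread
print(largest_partition(records, "firstName"))  # 50: half the items in "John"
```

A high-cardinality key keeps every logical partition comfortably below the storage limit; a low-cardinality key risks hot, oversized partitions.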
Well, that does it for some of the concepts for Cosmos DB. There's definitely a lot more information that goes into some of these, so I highly encourage you to check out the Microsoft docs or other articles.
But coming up next, we're going to take some of these concepts, go out to our Azure portal, and create a Cosmos account and some containers and databases for it. See you in the next episode.