Business Continuity Part 1

Time
9 hours 59 minutes
Difficulty
Intermediate
CEU/CPE
10
Video Transcription
00:00
>> Welcome back. In this video,
00:00
we are going to be all about business continuity,
00:00
dealing with failure in the Cloud,
00:00
specific things you'll want to take into
00:00
account and the particular strategies
00:00
you'll want to employ to keep
00:00
your systems up and running in the Cloud.
00:00
In the Cloud, you need to architect for failure.
00:00
The thing is, single assets aren't as reliable
00:00
in the Cloud as they were
00:00
in the traditional data-center model.
00:00
The Cloud provider's environment is
00:00
highly complicated and virtualized,
00:00
you often have many different tenants.
00:00
If you can imagine you have
00:00
these virtual machines running on physical machines
00:00
and it's an environment where there's
00:00
an expectation that things are running 24/7.
00:00
If the Cloud provider needs to make
00:00
changes to a physical machine,
00:00
they need to take actions to divert and move
00:00
virtual machines off of that physical machine
00:00
and onto other machines within the cluster.
00:00
In early 2020, daredevil Nick Wallenda traversed
00:00
a tightrope across Nicaragua's Masaya volcano,
00:00
with lava boiling beneath him
00:00
at 2,300 degrees Fahrenheit,
00:00
wind blowing and toxic gas in his face.
00:00
He made his way across the rope step-by-step.
00:00
You may have noticed in the picture that he had
00:00
an emergency cord set up to
00:00
catch him if he completely fell off the line.
00:00
Additionally, he had this very large balancing pole
00:00
to adjust to the wind drafts that
00:00
he was encountering during his trip.
00:00
Then finally, he had a special mask on to
00:00
deal with that volcanic and toxic gas coming up
00:00
at him, because as you can imagine,
00:00
if you pass out on a tightrope,
00:00
it doesn't go well for you.
00:00
Now, even with all those safety precautions
00:00
myself, could I do this?
00:00
No way. Could you do this? I'm not sure.
00:00
Unless you are a professional daredevil yourself,
00:00
likely not.
00:00
Just as I am scared to walk
00:00
across that kind of a tightrope,
00:00
Nick may be scared of the Cloud.
00:00
Fortunately, the Cloud is your profession.
00:00
Be like Nick, but in the Cloud, prepare for failure.
00:00
If failure happens, your systems
00:00
don't have to be running at 100 percent.
00:00
In fact, it's usually not
00:00
cost-effective to do that except for
00:00
the most critical workloads, but assuming failure
00:00
is the way to build reliable Cloud-native systems.
00:00
Before moving forward, let's define a few terms:
00:00
business continuity, business continuity planning.
00:00
This is a playbook to address large-scale failures.
00:00
I'm talking about buildings
00:00
collapsing or being untenable.
00:00
We're talking about electricity outages,
00:00
maybe for short periods of time and
00:00
you kick on a backup generator,
00:00
maybe for real long periods of time,
00:00
we're talking about natural disasters.
00:00
I live in California,
00:00
everybody talks about earthquakes out here.
00:00
A few years back, the U.S. territory of
00:00
Puerto Rico encountered
00:00
a massive hurricane, and
00:00
the company I was working for
00:00
had a large presence there.
00:00
We really had to do a lot
00:00
of disaster recovery at its most extreme.
00:00
A new area where we're just figuring
00:00
out how that'll work is pandemics,
00:00
large-scale viral or bacterial outbreaks.
00:00
The goal of business continuity planning
00:00
is to get people and
00:00
processes critical for business working
00:00
within an acceptable amount of time.
00:00
But let's talk about disaster recovery.
00:00
That's similar, but a little bit different.
00:00
Disaster recovery is more of
00:00
a tactical plan that you're going to use to restore
00:00
the technology systems that are critical to
00:00
those key people and processes of your business.
00:00
When you consider backup strategy
00:00
in its most basic form,
00:00
it comes down to three major methods.
00:00
You have the hot backup strategy.
00:00
This is the highest cost,
00:00
but it has the least downtime.
00:00
This is where you have hardware, software, data,
00:00
people, everything ready to cut over
00:00
to this new location at a moment's notice.
00:00
It's tricky to manage in the sense of replicating data
00:00
and in the sense of ensuring there is
00:00
enough capacity at the failover site
00:00
if you're managing all of this yourself.
00:00
In the traditional data-center model,
00:00
it was very difficult to manage something like this.
00:00
You had to have separate facilities to make sure
00:00
that you had not only data replicating,
00:00
but you had enough hardware at both locations.
00:00
Often, you needed hardware of similar models and
00:00
similar nature to make sure that when everything
00:00
failed over, it would operate at
00:00
a reasonable level.
00:00
Going down the chain, we have the warm site.
00:00
This is a compromise between hot and cold.
00:00
You don't have everything running at
00:00
an alternative site up and ready
00:00
to just fail over and go.
00:00
But you do have the necessary servers,
00:00
applications,
00:00
operating systems,
00:00
and even some ongoing data replication.
00:00
If things do go south and
00:00
your primary region goes down, then with
00:00
some manual or automated effort,
00:00
but with some level of effort and a little bit of time,
00:00
you can get that new location up and running and
00:00
hosting the role of the region that went down.
00:00
Then last, we have the cold site.
00:00
This is the lowest cost,
00:00
but also the slowest to recover.
00:00
In its most extreme situation,
00:00
you'll have a data center room with
00:00
some racks and Internet connections,
00:00
maybe sitting there, and the necessary
00:00
electricity and cooling,
00:00
but you haven't even installed the servers.
00:00
Oftentimes, companies will still have a cold site,
00:00
but they don't want to be quite that cold.
00:00
They'll have a handful of servers sitting there.
00:00
They have virtualization technologies,
00:00
but there's really no active efforts underway to
00:00
be replicating any of the servers and
00:00
applications that are going
00:00
to be assumed by this cold site when failover
00:00
does happen. It's not
00:00
until the main site falls over
00:00
that the efforts are made
00:00
to maybe start restoring
00:00
backups from a certain location
00:00
and rebuilding all of those machines,
00:00
restoring the data from
00:00
other cross-site backups and so forth.
00:00
It just takes a lot longer.
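To make the hot, warm, and cold trade-off concrete, here is a minimal Python sketch that picks the cheapest strategy able to recover within a given downtime limit. The class name, the function name, the cost ranks, and the recovery times are all hypothetical placeholders chosen only to preserve the ordering described above. They are not measured figures.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class SiteStrategy:
    name: str
    relative_cost: int            # illustrative rank only: higher means more expensive
    typical_recovery: timedelta   # placeholder value, not a measured figure


# Hypothetical numbers; only the ordering (hot fastest and priciest) comes from the talk.
STRATEGIES = [
    SiteStrategy("cold", 1, timedelta(days=3)),
    SiteStrategy("warm", 2, timedelta(hours=4)),
    SiteStrategy("hot", 3, timedelta(minutes=5)),
]


def cheapest_strategy(max_downtime: timedelta) -> Optional[SiteStrategy]:
    """Return the lowest-cost strategy whose typical recovery fits the limit."""
    candidates = [s for s in STRATEGIES if s.typical_recovery <= max_downtime]
    # default=None covers the case where no strategy is fast enough.
    return min(candidates, key=lambda s: s.relative_cost, default=None)
```

With these placeholder values, a system that can tolerate eight hours of downtime would land on the warm site, while one that can only tolerate ten minutes would need the hot site.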
00:00
In traditional IT, when you managed the data centers,
00:00
any of these strategies really
00:00
carried a lot of overhead.
00:00
The hot being the highest obviously,
00:00
but in the Cloud you have the resiliency trade-off
00:00
that still exists in terms of
00:00
hot being more expensive than cold.
00:00
But the cost to achieve even the most basic
00:00
of these strategies is significantly less.
00:00
We're going to talk about how
00:00
you achieve these different things.
00:00
But before we dive into those specifics,
00:00
it's worth noting that not all systems are equal.
00:00
You want to focus your efforts on
00:00
those that are the most critical.
00:00
Business impact assessment.
00:00
A BIA is a questionnaire-based tool that you can
00:00
create to capture and set
00:00
expectations for each system within your business.
00:00
Different companies make these.
00:00
I'm sure you could even find a simple one if you
00:00
were to just type a little Google search string.
00:00
What it does is it allows you to
00:00
justify the high costs associated with
00:00
those systems that really are system
00:00
critical and demand resiliency.
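One way to picture what a BIA captures is as a small record per system that you can then rank. This is only a sketch in Python: the field names, the `BiaEntry` class, and the `rank_by_criticality` helper are assumptions made for illustration, and a real questionnaire would usually ask for much more.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BiaEntry:
    system: str
    owner: str
    max_tolerable_downtime_hours: float    # answer from the questionnaire
    max_tolerable_data_loss_hours: float   # answer from the questionnaire


def rank_by_criticality(entries: List[BiaEntry]) -> List[BiaEntry]:
    """Sort so systems that tolerate the least downtime (then data loss) come first."""
    return sorted(
        entries,
        key=lambda e: (e.max_tolerable_downtime_hours, e.max_tolerable_data_loss_hours),
    )


# Hypothetical answers for three made-up systems.
entries = [
    BiaEntry("internal-wiki", "IT", 72.0, 24.0),
    BiaEntry("payments", "Finance", 1.0, 0.1),
    BiaEntry("crm", "Sales", 8.0, 4.0),
]
ranked = [e.system for e in rank_by_criticality(entries)]
```

The systems at the top of the ranked list are the ones where the higher resiliency spend is easiest to justify.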
00:00
The recovery time objective
00:00
is the amount of time between when a system
00:00
goes down and when it needs to be back
00:00
up and running after a disaster.
00:00
It's really all to avoid unacceptable consequences
00:00
associated with a break in business continuity.
00:00
RTO is the answer to the question,
00:00
how much time did it take to recover
00:00
after notification of business process disruption?
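As a small illustration of that question, the Python sketch below measures the time from notification to restoration and compares it with the objective. The function names and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_time(notified_at: datetime, restored_at: datetime) -> timedelta:
    """Elapsed time between disruption notification and service restoration."""
    if restored_at < notified_at:
        raise ValueError("restored_at must not be earlier than notified_at")
    return restored_at - notified_at


def met_rto(notified_at: datetime, restored_at: datetime, rto: timedelta) -> bool:
    """True when the actual recovery time is within the recovery time objective."""
    return recovery_time(notified_at, restored_at) <= rto


# Hypothetical incident: notified at 02:00, restored at 05:30 the same day.
notified = datetime(2024, 1, 1, 2, 0)
restored = datetime(2024, 1, 1, 5, 30)
within_four_hours = met_rto(notified, restored, timedelta(hours=4))
within_two_hours = met_rto(notified, restored, timedelta(hours=2))
```

In this made-up incident the recovery took three and a half hours, so a four-hour RTO is met and a two-hour RTO is missed.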
00:00
A close cousin to the RTO is
00:00
the recovery point objective, the RPO.
00:00
This is where you define the amount of data that you
00:00
can lose in the event of some disaster.
00:00
Let's say you are doing some data backup procedure,
00:00
backing up all databases every night,
00:00
and then taking those backups off-site or
00:00
using a Cloud provider's functions and having
00:00
the backups moved to a different region.
00:00
In that circumstance, it would be
00:00
just fine as long as the RPO
00:00
is 24 hours because you
00:00
could lose up to one day's worth of data.
00:00
But for some systems,
00:00
that's not acceptable. In the event of a disaster,
00:00
there's a lot less data that they're willing
00:00
to lose, and so in those circumstances,
00:00
you're going to need to have a much shorter RPO.
00:00
The most critical traffic sites can even
00:00
reach down to five-minute RPO expectations.
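The nightly-backup reasoning can be sketched in a few lines of Python. This is a simplification that assumes the worst-case data loss equals the interval between backups and ignores how long a backup takes to run or copy off-site. The function names are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Simplifying assumption: a disaster strikes just before the next backup."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case loss stays within the recovery point objective."""
    return worst_case_data_loss(backup_interval) <= rpo


nightly = timedelta(hours=24)
ok_for_daily_rpo = meets_rpo(nightly, timedelta(hours=24))
ok_for_five_minute_rpo = meets_rpo(nightly, timedelta(minutes=5))
```

Under that assumption, nightly backups satisfy a 24-hour RPO but not a five-minute one, which is why the shorter objectives call for more frequent or continuous replication.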
00:00
In a SaaS model,
00:00
you don't have as active a role controlling
00:00
the technical implementation regarding RTOs and RPOs.
00:00
However, we talked about the key tool of the contract.
00:00
These contractual agreements are
00:00
an awesome place to put RTO and RPO expectations,
00:00
as well as remedies in the case that
00:00
the SaaS provider fails to meet those expectations.
00:00
The geo-redundancy capabilities of
00:00
Cloud providers are extremely powerful,
00:00
especially when we're talking about disaster recovery.
00:00
In the next video, we will continue
00:00
this discussion and look at the different mechanisms
00:00
Cloud providers give that allow you to
00:00
achieve the necessary disaster recovery.
00:00
We'll also talk about
00:00
the reality of Cloud provider outages,
00:00
as well as options for
00:00
portability across different Cloud providers.