Part 6 Defenses



9 hours 31 minutes
Video Description

This lesson discusses a two-fold mitigation strategy: input validation (whitelists, blacklists, and regular expressions, or regex) and output encoding. The instructor offers samples and discusses what to include when coding defensively, emphasizing that a whitelist is preferable to a blacklist because a blacklist is limited. It is also crucial not to use a single regex for all fields, as that nullifies the validation.

Video Transcription
Hello and welcome to the Cybrary secure coding course. My name is Sunny Wear, and this is OWASP Top 10 for 2013: cross-site scripting mitigations, countermeasures and defenses.
Now, for the defenses overview, we're basically going to look at the two-fold mitigation strategy,
and those two components include input validation and output encoding.
Now the reason why I'm putting them together is because in addressing any of the cross-site scripting attacks that we looked at,
you need to incorporate both of these in order to properly mitigate the problem. Now, for input validation, we're speaking only in terms of server-side input validation. Never do we want to have
any validation done on our client side because, as we've seen in our demos,
it is very easy for attackers to manipulate that information,
to manipulate any kind of validation checks that are done within the browser.
We're going to look at whitelisting, blacklisting, and regular expressions, which are common ways that input validation is done if you don't have a built-in framework method or function available,
and then we'll take a look at a code sample showing some input validation.
The other main area is output encoding.
Now, output encoding is going to neutralize any kind of special characters that might be misinterpreted by either the web server or web browser that might be contained in an HTTP response.
So what output encoding essentially does is it forces a conversion of all characters
to some sort of character scheme or character set.
Now why is this advantageous? Because that forces the characters to be treated as data instead of the web server or the web browser actually executing malicious scripts unintentionally.
Generally speaking, URL encoding is used for output encoding, and it usually conforms to UTF-8, and I'll explain why in just a moment.
We will also look at a code sample of that. And also I'm going to go through the output encoding web page contexts that you need to make sure you include when doing your defensive coding. Now, input validation techniques.
As I mentioned, if you do not have something built into your framework already, these are some alternatives: whitelists, blacklists, and regular expressions. Now, whitelists we spoke about in A1 Injection as another defense there.
These are the same ones. Basically, you're going to define your whitelist to be as specific and tight as you possibly can,
yet wide enough to still allow for the business functionality to be maintained in the page. Now what do I mean by that? What I mean is, if you have a text box that only receives a name,
then you should only receive
alphabetic characters. You should not receive any numbers. That's just one example.
So obviously we've gone through whitelists and what they do is they only
allow acceptable values to pass through,
and those acceptable values can actually be defined in an array, in an enumeration, possibly in constants, maybe defined in a header file or common area,
and, of course, in regular expression pattern matching.
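As a minimal sketch of a whitelist held in a set of constants, the following Java class accepts only values that appear in a fixed list. The class name `CountryWhitelist`, the field it guards, and the listed values are all hypothetical and are not from the lesson.

```java
import java.util.Set;

// Hypothetical example: a whitelist for a "country" field (not from the lesson).
final class CountryWhitelist {
    // The only values this field is allowed to carry (illustrative list).
    private static final Set<String> ALLOWED = Set.of("US", "CA", "MX");

    private CountryWhitelist() {}

    // Returns true only when the submitted value is one of the allowed constants.
    // The null check comes first because Set.of(...) rejects null lookups.
    static boolean isAllowed(String value) {
        return value != null && ALLOWED.contains(value);
    }
}
```

Anything outside the list, including script fragments, is rejected without the code having to know what an attack looks like.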
Now, blacklists are exactly the opposite. They're going to reject malicious characters that are defined in the listing, but realize that those are just the malicious characters that are known at the time of the writing,
and that makes blacklists very limited in their effectiveness. And remember that every time there's a new exploit, a new way to get around a blacklist, that's going to require more manual updating of your code and recompiling of the code, etcetera.
So the preference is to always go with the whitelist over a blacklist.
And then, finally, you can use regular expressions. Of course,
regular expressions could be part of your white list if you like.
This does require some learning of regex pattern building. You can consult some very good books by O'Reilly on the subject, but to give an example, here I have a caret, the less-than sign, the greater-than sign, the ampersand, some slashes, and a tick.
This actually is a regular expression,
the caret meaning not,
and then the other characters being literal. And so this could be used in a blacklist fashion, for example, to not accept any of these characters. But as we talked about before, you can use your URL encoding to easily get around a blacklist of this type.
The illustration is just to show you what is involved in creating and understanding how to build regular expressions.
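The transcript does not preserve the exact pattern from the slide, so the sketch below uses an assumed reconstruction: a negated character class listing the less-than sign, greater-than sign, ampersand, forward slash, backslash, and single quote. The class and method names are hypothetical. It also shows the weakness just described: a URL-encoded payload contains none of the listed characters and passes the check.

```java
import java.util.regex.Pattern;

final class BlacklistDemo {
    // Assumed reconstruction of the slide's pattern. Inside brackets, a leading
    // "^" means "not": match any character except <, >, &, /, \ and '.
    private static final Pattern NO_LISTED_CHARS = Pattern.compile("[^<>&/\\\\']*");

    private BlacklistDemo() {}

    // True when the whole input is made up of characters outside the blacklist.
    static boolean passesBlacklist(String input) {
        return input != null && NO_LISTED_CHARS.matcher(input).matches();
    }
}
```

With this pattern, `passesBlacklist("<script>")` is false, but `passesBlacklist("%3Cscript%3E")` is true, so the blacklist alone does not stop the encoded form.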
One other caveat.
Please do not use one regex pattern
to have all of the fields in your entire web application go through.
That basically makes the
validation completely nullified. So that's why I was saying, go back to the business purpose for a particular field.
See if you can identify what would be acceptable and then create a special whitelist for that field. And you have repeated fields in every application: name, address, et cetera.
And so you should probably come up with a common library that can address those common types so that you can reuse given regex patterns.
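One possible shape for such a common library is an enum with one whitelist pattern per field type. This is only a sketch: the enum name `FieldType`, the field types chosen, and every pattern below are illustrative assumptions, and real patterns would have to come from each field's business rules.

```java
import java.util.regex.Pattern;

// Hypothetical shared library: one reusable whitelist pattern per field type.
enum FieldType {
    NAME("[A-Za-z]{1,50}"),      // letters only, as in the name example
    ZIP_CODE("[0-9]{5}"),        // assumed five-digit postal code
    PHONE("[0-9]{10}");          // assumed ten-digit phone number

    private final Pattern pattern;

    FieldType(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // True only when the entire value matches this field type's whitelist.
    boolean isValid(String value) {
        return value != null && pattern.matcher(value).matches();
    }
}
```

A caller would then write `FieldType.NAME.isValid(input)` for every name field instead of repeating or loosening the pattern.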
So moving now to our input validation code sample, as we saw before in our explanation section, we have a variable from our client side called first name,
and we can get the value stored in that variable very easily. From our request object,
we can call the getParameter method, and then we can take that value, whatever might be stored in that variable, and assign it to a local variable. In this case, it's called first name parameter. Now, instead of immediately starting to use first name parameter,
what we would do instead is pass that value through some sort of input validation. And in this case, the whitelist is made from a regular expression. Now, the regular expression I have here is declared as my data validation whitelist.
It's set to receive only alpha characters,
so it will receive a through z lowercase and A through Z uppercase, every single letter.
That means that should there be some sort of character or number or special character that is contained in the first name variable that does not conform to this pattern, it will be rejected.
And so, as you look at the method must-pass-whitelist check, you can see how that's done. We have a Pattern.matches that passes in the value that we received and then compares it against our regex. If something should go wrong and it does not match,
then I'm actually throwing
a custom exception, a whitelist failure exception.
So this is an example of how you could do the same in your code.
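The slide itself is not reproduced in the transcript, so the following is a sketch reconstructed from the spoken description rather than the instructor's exact code. The identifier spellings (`DATA_VALIDATION_WHITELIST`, `mustPassWhiteListCheck`, `WhiteListFailureException`) and the class name `FirstNameValidator` are assumptions. To stay self-contained it takes the value as a plain string; in a servlet, that string would come from `request.getParameter(...)`.

```java
import java.util.regex.Pattern;

// Custom exception, named after the one described in the lesson (spelling assumed).
class WhiteListFailureException extends Exception {
    WhiteListFailureException(String message) {
        super(message);
    }
}

final class FirstNameValidator {
    // Whitelist: one or more letters, a-z or A-Z, and nothing else.
    static final String DATA_VALIDATION_WHITELIST = "[a-zA-Z]+";

    private FirstNameValidator() {}

    // Returns the value unchanged if it matches the whitelist;
    // otherwise throws instead of letting the value be used.
    static String mustPassWhiteListCheck(String value) throws WhiteListFailureException {
        if (value == null || !Pattern.matches(DATA_VALIDATION_WHITELIST, value)) {
            throw new WhiteListFailureException("Input failed whitelist check");
        }
        return value;
    }
}
```

`Pattern.matches` requires the entire input to match, so a value such as `Alice<script>` is rejected even though it starts with letters.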
Now let's talk about output encoding. So as I mentioned, output encoding helps to neutralize characters, particularly in our HTTP response that we send back out into the browser,
and URL encoding in particular is the general standard that's used by web servers for the display of characters
as well as the receiving of characters, which is why you see a lot of attacks done with URL encoding for the input, if you've noticed. And that is for this very same reason, just the flip side of the coin, if you will.
And that is to ensure that the web server actually understands the special characters. And so the attacker will
send that input through URL encoding.
Well, it can be used in a defensive way as well, for the HTTP response.
And so URL encoding actually replaces characters in a string with one or more character triplets. Now what do I mean by triplets? Basically, it's going to be a percent sign, followed by two hexadecimal digits.
So, for example, percent 2E is actually the dot.
So dot, meaning current directory. Now you should realize that UTF-8 is the most predominant character set that's generally used these days
for URL encoding. And the major reason why is because the first 128 values of UTF-8 actually map directly to the US-ASCII codes. If you take a look at the excerpts I have below,
you can see the Unicode code point,
the actual character, which is what we recognize, right, the character literal,
and then you can see the UTF-8 hexadecimal number.
And so
what we have here is a less-than sign, which is going to be represented in UTF-8 as 3C, which means in URL encoding it's going to be percent 3C. If we look at the greater-than sign,
it's the same idea. We have the greater-than sign,
but in UTF-8 it's going to be represented as 3E. Of course, making it URL encoded means percent 3E.
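To make the triplet mapping concrete, here is a small sketch that writes every UTF-8 byte of a string as a percent sign followed by two hexadecimal digits. The class and method names are hypothetical. It deliberately encodes every character so the mapping is visible; a real URL encoder leaves unreserved characters such as letters and digits as they are.

```java
import java.nio.charset.StandardCharsets;

final class PercentTriplets {
    private PercentTriplets() {}

    // Converts each UTF-8 byte of the input into a "%XX" triplet.
    static String encodeAll(String input) {
        StringBuilder out = new StringBuilder();
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0-255 so bytes above 0x7F format as two hex digits.
            out.append(String.format("%%%02X", b & 0xFF));
        }
        return out.toString();
    }
}
```

Here `encodeAll("<")` gives `%3C`, `encodeAll(">")` gives `%3E`, and `encodeAll(".")` gives `%2E`, matching the values in the lesson.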
And so taking that same example, and continuing with our output encoding, we can see how we can take the response that we're going to send back from this HTML page that's going to echo hello and someone's name
and see how we can instead send it through our regex whitelist to sanitize it
and also, on the response on the way out, send it through some sort of encoder.
Now, the encoder that I'm showing on the slide is actually a Java encoder library that's available through OWASP,
but many
frameworks these days have built in encoders that you can call and so you would pass in your sanitized variable name into that encoding method or function
prior to sending it back out to your Web page.
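The sketch below illustrates the "encode on the way out" step for the hello page. It is not the OWASP library from the slide: it is a hand-rolled stand-in that applies HTML entity encoding for the HTML body context, so that it runs with no dependencies. The class and method names are hypothetical. In real code a maintained encoder, such as the `Encode.forHtml` method of the OWASP Java Encoder project, would be the better choice.

```java
// Illustrative stand-in for a library encoder; covers the HTML body context only.
final class HtmlBodyEncoder {
    private HtmlBodyEncoder() {}

    // Replaces the characters that HTML treats as markup with entities,
    // so the browser renders them as text instead of interpreting them.
    static String forHtmlBody(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Builds the echoed greeting, encoding the name just before output.
    static String greeting(String sanitizedName) {
        return "<p>Hello " + forHtmlBody(sanitizedName) + "</p>";
    }
}
```

Even if a value slipped past input validation, a `<script>` tag in the name would reach the browser as `&lt;script&gt;` and be displayed rather than executed.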
And then, finally, I want to make you aware of all the many contexts that need to be encoded,
and I know that this could be a bit dizzying because
you don't realize just how many attack vectors are available on a web page until you start watching a lot of demos or seeing a lot of exploits done. But
this is to make you aware that it's very important that you provide encoding
for all of the different contexts that might be available in your web page.
So at a minimum, you should look at these contexts: HTML body,
HTML attributes, URLs,
any JavaScript that's in there, including any JSON or jQuery type calls,
and cascading style sheets.
Now, generally, you would have some sort of library that would be able to encode for all of these different contexts for you. What I'm showing is the drop-down list for the encoder that I had just mentioned on the previous slide.
But look inside of your framework and make sure that the different contexts can be encoded for.
Now, we're going to move on to the lab portion of this module.