Investigative Process md5sum Lab

For the md5sum lab, we show where to find and download the md5sum utility. md5sum is an integrity-checking utility; the instructor downloads it from Etree, the source he trusts most. You’ll observe a run of the utility and a sample integrity check of a basic text file, a search for the resulting hash on Google, another run after the file’s contents are modified, and a final run after the file is changed back. The tool reports the state of integrity of a given file, where that state is defined by whether the file has been changed, modified, or altered.

[toggle_content title="Transcript"] Hi, Leo Dregier here. I want to talk about md5sum. Md5sum is a very, very easy utility to use, and what I find surprising about this is that not a lot of people know how to do it. It’s a very easy utility to run; anybody should be able to go to the Internet and download it. So, to get the tool to begin with, either you could have it from one of your forensics hacking kits or a toolkit of some sort, or just google md5sum. The one that I’ve routinely gone to for years and years is the one from Etree. Etree has been around, it’s been a project for years and years, and they have an md5sum.exe that is real easy to get right here on their website. I know it, I trust it; easy download, reliable, every time, all the time.
So, once you install it, or download it, what I’ve actually done is I just like to copy it to the root of my hard drive. So you can see here I have the md5 utility right here. So what we’re gonna do is open up the command prompt and use the utility. So, down to command prompt, okay. We’re gonna clear the screen, and we’re gonna need a file to analyze, so let’s do a Notepad, you know, password.txt, okay?
And you can see, I’ve got a password file in here, and what I’ll do is I’ll just simplify this at first, and I’ll just put the word ‘password’ in here, I’ll save the file, and we’ll just minimize it for now. Okay, so it’s md5sum, and I like to tab my way through it, space, password.txt, and here is the message digest. So what these integrity algorithms do is they analyze this file without changing, modifying or altering it; it’s just an algorithm that kinda passively analyzes it and then gives you the output.
Now what’s interesting about this message digest is if I copy that and then go over here to Google, and then just put the hash inside of Google and then hit search, you can see that something as simple as ‘password’ is easily indexable by Google. Now, when I’m in class, I always ask the students, you know, “Well, how did Google find it?” Did it reverse it? Did it find the trapdoor? Did it decrypt it? What exactly happened here? And the class always kinda scratches their head and thinks, “Well, wait a second, how did it find it?” So, all Google did, realistically, was find a reference to somebody else having a file with the same exact contents in it, and then also the reference of the hash, and then Google just referenced that. Google didn’t do anything to this file. But nonetheless, you can see that for all of these different password options there’s already previously-defined values for the word ‘password’, okay. So that’s a very, very common one to say the least, as you might expect, as it’s the word ‘password’. So it’s easily indexable, and we’ll come back to that later, because when I show you a harder version, or when I salt it, I’ll show you that it pretty much defeats this type of attack right away. Okay? Alright. So, let’s just run the algorithm again and prove that it hasn’t changed.
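As a sketch outside the video, the digest and the "how did Google find it?" point can be reproduced in Python. This assumes the standard `hashlib` module rather than the md5sum.exe used in the lab, and that Notepad saved the file as the eight bytes `password` with no trailing newline. The `known_digests` table is a hypothetical stand-in for a search index: it holds digests that somebody computed earlier and published next to the original text, so a lookup is a match against a reference, not a reversal of the hash.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 message digest of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for a search index: digests published alongside
# the text they were computed from.
known_digests = {
    md5_hex(word.encode()): word for word in ["password", "letmein", "qwerty"]
}

digest = md5_hex(b"password")
print(digest)                          # 5f4dcc3b5aa765d61d8327deb882cf99
print(md5_hex(b"password") == digest)  # True: same data, same digest
print(known_digests.get(digest))       # password (found by reference only)
```

Running the hash again on the same bytes returns the same 32 characters, which is the repeatability the next part of the lab relies on.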
Now, I can do this literally 100 times here; it is not going to change, okay? If I use the right command. And, for this lab, you guys can just pay attention to the last two digits of this hash, okay? So, if I open up this file and take the word ‘password’ and I add one space to it, this little space right here, what’s gonna happen if I run the integrity algorithm again? Is it gonna end in 99? Is it gonna be something different? What’s gonna happen? And if you’re thinking it’s gonna be changed, well, let’s get the right command, here we go, md5sum: it’s exactly the same, because I didn’t change the file, okay. You still have to actually save the file, so I actually proved that. Haha, you know, I got you. So now that I’ve actually saved the file, if I do it again, you can see now that it ends in 75. And I can take this right back over to Google: I just highlight this, hit enter, go back over to Google, put my pasted value in here, search for that, and it should still find this no problem. And it’s just ‘password’ with a space added to it, and as a matter of fact they’ve actually found it with SHA and things like that as well. Alright, so still not hard at all, okay?
Now, one of the tricks to this is that we could actually go back to the word ‘password’ and take out the space, File, Save, so no tricks this time, right? Now what’s gonna happen? Is it gonna end in 99? Is it gonna end in 75, or am I gonna get something different? You guys decide, you vote. I’ll give you a second to think about it. Alright, got your answer? So md5sum password.txt, it goes back to 99. I’d say a good half of the class, about 95% of the class, always goes, “Well, it’s not gonna be 75,” but a good ¾ of the class, almost 90% of the class, generally chooses something different.
But it actually goes back to 99, and this is important to realize when we start talking about the state of integrity, because integrity is just a snapshot in time. In this case what I’ve done is I’ve taken a file, I’ve moved away from that file so I haven’t made changes, modifications and alterations. Notice I did not say encrypting, decrypting, ciphering, substitution, transposition, permutation; I didn’t say any of that. What I did say is change, modify and alter. Change, modify and alter have to do with the principles of integrity. Encrypt, decrypt, transpose, substitute, all that stuff has to do with confidentiality. We’re talking about integrity here, which is critical in the beginning part of forensics because we don’t want to change the integrity of our files, folders and hard drives, okay? So it actually changes back to 99, okay?
So, then people start going, “Well, what happens if you do things like attrib?” Okay, what if I do attrib on password.txt, right? You can see what attributes are set on it. If I go to the hard drive and grab my password.txt file and look at the properties, now I did change the timestamps on it, but the md5sum utility itself doesn’t care about any of this stuff. It only cares about the data of the file, the contents of the file; it doesn’t care about created, modified, accessed, read-only, hidden, any of the advanced options here, or the name of the file. Any of the security permissions could have been changed, anything over here could’ve been added, you coulda went in Advanced over here; none of that matters specifically in md5sum. What is interesting is that if you compare that on, like, a Linux platform and you use another hashing algorithm, other platforms may include some of those attributes or things like that. But specifically in Windows, md5sum does not. Okay?
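The sequence so far (hash, add a space, hash, remove the space, hash, then touch only the metadata) can be sketched in Python. This is not the md5sum.exe from the video: it assumes the standard `hashlib` module and a hypothetical helper, `md5_of_file`, that hashes only the bytes read from the file. The timestamp and file-name changes stand in for the attrib and Properties steps in the lab.

```python
import hashlib
import os
import shutil
import tempfile
from pathlib import Path


def md5_of_file(path: Path) -> str:
    """Hash only the file's bytes, reading in chunks; metadata is never read."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


workdir = Path(tempfile.mkdtemp())
target = workdir / "password.txt"

target.write_bytes(b"password")
original = md5_of_file(target)

target.write_bytes(b"password ")      # add one trailing space and save
changed = md5_of_file(target)

target.write_bytes(b"password")       # take the space back out and save
restored = md5_of_file(target)

# Change metadata only: timestamps first, then the file name.
os.utime(target, (1_000_000_000, 1_000_000_000))
renamed = target.with_name("renamed.txt")
target.rename(renamed)
after_metadata = md5_of_file(renamed)

print(original != changed)            # True: the contents changed
print(restored == original)           # True: same contents, same digest
print(after_metadata == original)     # True: metadata is not hashed

shutil.rmtree(workdir)                # clean up the temporary directory
```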
So, you can see that it basically goes back to the original because it doesn’t take that into consideration. So, now what we can talk about is, you know, what do these values mean? Now, I pretty much cracked the word ‘password’ in this file, because I can easily find a reference to what the hash is. But how do I defeat this? Well, let’s talk about the power of salting. Salting is basically another value that is added, typically on Unix systems or Solaris systems or any system that supports salt. It adds this prefix, which is just random numbers and letters, to the beginning of the password file, and then pretty much your password. So it now has two components: it’s got the original value, plus it’s got this random prefix to it. Now, as you may have guessed, if I do an md5sum on the password file after saving it, I get something completely different. But if I copy this value here and then just go try to find a quick match to it, using, you know, our favorite oracle [unclear 09:57] here, and search for that, it’s gonna come up null; it’s not gonna be able to do it. So, I know it’s not exactly the same, but as a brief sanity check, I’m basically defeating dictionary-based attacks or rainbow table-based attacks; not necessarily dictionary, more rainbow, ’cos rainbow’s more hashing, and dictionary is more trying to decrypt, more cryptography, more confidentiality, or matching a list. This is specifically trying to find a reference. Alright, so I can prove that I don’t get an easy one, alright?
And just to kinda build upon this, just to make this lab really, really interesting, I wanna prove one other thing here. I wanna search for online – let’s get my mouse back – online hashing calculator. Okay. And I really like this one; it’s normally the first one that comes up if you search for online hashing calculator. Right, so you click on that.
Give it a second to load. Once it loads, we can put some text right here. I gotta wait for the webpage to finish loading; it’s still loading the components of it, you can tell because it’s actually scrolling down here; just give it a second. It’s probably gonna work faster on your system than this. Alright, here we go. So if I just take my text string in here and do ‘password’, and then select hash, you will see that it basically just runs ‘hash?text=password’, and then go ahead and scroll down and look at your results.
What’s really, really cool about this tool is it’ll tell you the value of the data in the form of the hash, through just about every major hashing algorithm out there that you would need to know, right in the results. So it tells you your original text, it tells you what the original bytes look like in hex, it runs it through Adler32, gives you the value, runs it through cyclic redundancy check-32, gives you the value, runs it through HAVAL, which we commonly talk about in CISSP classes, it goes through MD2, 4, and 5; here’s the value of 5. Look familiar? Ends in 99. And it shows that all of these are 128 bits or 32 characters; here’s your RIPEMD-128 and -160, notice 160 is a little longer; SHA, and the respective sizes of SHA, so here’s 160, here’s 256, here’s 384 and here’s 512; and it gives it to you in Tiger and in Whirlpool. Okay? So, what I like about this is it just tells you all of ’em right away, and that’s basically what a hashing calculator does. You can just put in some text here and you can upload a file, and the website works pretty well. Alright, so, in summary, the only thing I’m proving with this is that ‘password’, specifically in reference to what we’re doing, ends in 99. So I supplied the data, ‘password’, it supplied the hashing algorithm, and we got the same exact result every single time.
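A rough local equivalent of the hashing calculator can be sketched in Python. This is not the website from the video; it assumes the standard `hashlib` and `zlib` modules and covers only the algorithms they are guaranteed to provide. HAVAL, MD2, MD4, RIPEMD, Tiger and Whirlpool are left out because `hashlib` does not guarantee them. The point of the sketch is the one made above: each algorithm has a fixed output size, and each hex character carries 4 bits.

```python
import hashlib
import zlib

data = b"password"

# zlib checksums are 32-bit integers; format them as 8 hex characters.
print("adler32", format(zlib.adler32(data), "08x"))
print("crc32  ", format(zlib.crc32(data), "08x"))

sizes = {}
for name in ["md5", "sha1", "sha256", "sha384", "sha512"]:
    hex_digest = hashlib.new(name, data).hexdigest()
    sizes[name] = len(hex_digest) * 4   # each hex character encodes 4 bits
    print(f"{name:7} {sizes[name]:3} bits  {hex_digest}")
```

The MD5 line is 32 characters (128 bits) and ends in 99, matching both the command-line run and the calculator; the SHA lines grow to 160, 256, 384 and 512 bits.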
So, it’s very, very important to understand in the fundamentals: the md5 algorithm has to work exactly the same for the data every single time, all the time, no exceptions, because if I send some data to you, you need to be able to verify that data no matter where you are. As long as we use the same algorithm, we will get the same result, okay? So let’s go back over to ‘password’, and let’s take out these values here, save the file, run the algorithm one last time, and you can see it goes back to 99, okay? So, this lab is one of my classic, most famous md5 labs for mastering the concepts of integrity.
Please remember, in the world of confidentiality, you do want to change. You want to change plaintext into ciphertext. In the world of integrity, you do not want to change, you cannot change. It wouldn’t make sense if we were analyzing a file here, and we were changing it in the process of analyzing it to prove that it hasn’t been changed; if every time we analyze it, we’re changing it, how could anybody ever verify that it’s never been changed? Right? So what this algorithm has to do is never change your file. Okay? That is the sole principle of integrity. It’s the lack of change that we’re detecting here, or, when was it changed? We need to be able to detect that when we changed it here and it ended in 75, it changed; we detected that. When it always stayed at 99, like if I just keep doing it right now – not that – if I just keep doing it right now, over and over again, okay, it’s gonna be 99 a thousand times if I do it. So I’m proving nothing has been changed, modified, altered, nothing has been added, nothing has been subtracted; the algorithms are gonna be consistent, which is what we want, every single time, all the time.
So I hope this helps separate in your head, from a principle point of view, the differences between confidentiality and integrity. This lab is so powerful in the fundamentals of, one, cryptography, and two, a lot of these certifications that you’re gonna go after, because when they ask about the differences between confidentiality and integrity, most people simply can’t understand the difference. They can talk about them, but they don’t know how to back up their talking technically. So what I want you to do is try some hands-on labs on integrity, and then some of the confidentiality tools, and see that, in the confidentiality world, we do change plaintext into ciphertext, we run it through transpositions, substitutions, permutations, we take out, replace; but in the world of integrity, we do not, and it’s just that simple. My name’s Leo Dregier, thank you for watching. Don’t forget to check us out on Facebook, LinkedIn, YouTube, and Twitter. [/toggle_content]
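The salting step from the transcript can also be sketched in Python. The details here are assumptions for illustration, not what the video's system does: the salt is 16 random hex characters from the standard `secrets` module, it is prepended to the word ‘password’ as in the lab, and `precomputed` is a hypothetical table standing in for digests that are already indexed somewhere.

```python
import hashlib
import secrets


def md5_hex(data: bytes) -> str:
    """Return the MD5 message digest of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


plain = md5_hex(b"password")

# Assumed salt format: 16 random hex characters used as a prefix.
salt = secrets.token_hex(8).encode()
salted = md5_hex(salt + b"password")

# Hypothetical table of digests somebody has already published.
precomputed = {md5_hex(w.encode()): w for w in ["password", "letmein", "qwerty"]}

print(precomputed.get(plain))                 # password
print(precomputed.get(salted))                # None: no reference to find
print(md5_hex(salt + b"password") == salted)  # True: still repeatable
```

The salted digest is still deterministic for the same salt and data, so integrity checking works as before; what changes is that a lookup against previously published digests no longer finds a match.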