Obfuscation Part 3: base64

3 hours 41 minutes
Video Transcription
All right. So as we continue our discussion of malware challenges, let's now learn about obfuscation using Base64 encoding.
Base64 is a binary-to-text encoding scheme. More specifically, it allows us to take a sequence of 8-bit binary data and format it in a radix-64 representation.
This allows malware authors to encode binary data into an ASCII format.
Understand, though, that not all Base64 implementations are malicious. Today, you see Base64 pretty often. It's used in HTTP and email attachments, as the encoding scheme is very fast and easy to reverse.
The standard Base64 format employs the following characters that you see here in our radix. We've got a combination of letters and numbers, as well as the plus and slash at the end. Also, the Base64 format processes data in groups of 24 bits, or three bytes, of binary data,
and each three bytes is translated into four characters from our radix.
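The three-bytes-to-four-characters ratio can be checked with Python's standard `base64` module. This is only a quick sketch; the sample inputs and the `lengths` name are chosen for illustration.

```python
import base64

# Map input length in bytes to encoded length in characters.
lengths = {}
for data in (b"DOG", b"DOGDOG", b"DOGDOGDOG"):
    encoded = base64.b64encode(data)
    lengths[len(data)] = len(encoded)
    print(len(data), "bytes ->", len(encoded), "characters")
```

Every group of three input bytes comes out as four characters, so encoded data is roughly a third larger than the original.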
So let's understand how data is encoded.
The first thing that we need to do is look at our radix, and here we've got 64 values: zero is A, through slash, which is 63.
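As a rough sketch, the standard table can be rebuilt in Python so that a character's position in the string is its value. The name `B64_ALPHABET` is just a label used here, not something from the video.

```python
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then '+' and '/'.
B64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# Each character's index in the string is its 6-bit value.
print(len(B64_ALPHABET))        # 64
print(B64_ALPHABET[0])          # A
print(B64_ALPHABET[63])         # /
print(B64_ALPHABET.index("R"))  # 17
```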
So now let's say we want to convert the message DOG that we used earlier to Base64.
So the first thing that we want to do is convert each letter to binary. We could do this a few ways, but to speed up the process, I've already looked at the ASCII table,
and we know that the hex representation for D is 44. Hex 44 in binary is 01000100.
The letter O is 01001111 and G is 01000111.
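Instead of reading the ASCII table by hand, the same lookup can be sketched in Python. The variable names `message` and `rows` are illustrative.

```python
message = "DOG"

# For each letter: the character, its ASCII code in hex, and in 8-bit binary.
rows = [(ch, format(ord(ch), "02X"), format(ord(ch), "08b")) for ch in message]

for ch, hex_value, bits in rows:
    print(ch, hex_value, bits)
# D 44 01000100
# O 4F 01001111
# G 47 01000111
```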
Now what we do is put all these values in a line together. This is a 24-bit, three-byte piece of data. This is exactly how Base64 processes data.
So now what we can do is split the binary digits into four 6-bit groups, and then we convert those groups to their decimal equivalents. And then we look up their equivalents in the table.
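The split-and-convert step can be sketched in a few lines of Python. The names `bits`, `groups`, and `values` are illustrative only.

```python
# Concatenate the three bytes of "DOG" into one 24-bit string.
bits = "".join(format(byte, "08b") for byte in b"DOG")

# Cut the 24 bits into four groups of six bits each.
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]

# Convert each 6-bit group to its decimal value.
values = [int(group, 2) for group in groups]

print(bits)    # 010001000100111101000111
print(groups)  # ['010001', '000100', '111101', '000111']
print(values)  # [17, 4, 61, 7]
```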
So if we convert our first group of six digits, we have a value of 17.
If we convert our next six digits, we have a value of four.
When we convert our third set of digits, we have a value of 61. And if we convert our fourth set of digits, we have seven. Once we have the decimal equivalents, we simply look up the Base64 equivalents:
17 is R,
4 is E,
61 is 9, and 7 is H, giving us a Base64 string of RE9H.
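Putting the steps together, a minimal hand-rolled encoder for a single three-byte chunk might look like the sketch below, checked against Python's standard `base64` module. The helper `encode_triplet` is a hypothetical name, and it deliberately ignores padding and inputs of other lengths.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_triplet(chunk: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles three-byte chunks")
    # Treat the three bytes as one 24-bit number.
    number = int.from_bytes(chunk, "big")
    # Pull out four 6-bit values, most significant first, and look each up.
    return "".join(ALPHABET[(number >> shift) & 0x3F] for shift in (18, 12, 6, 0))


manual = encode_triplet(b"DOG")
library = base64.b64encode(b"DOG").decode("ascii")
print(manual, library)  # RE9H RE9H
```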
Okay, so why is this all important? Why did I go through the steps to show you how to convert to Base64? There are many tools that can help you locate and reverse Base64 encoding. You could use Python, CyberChef, which is a favorite of mine, or even converter net.
But understanding the Base64 scheme that we've just reviewed can help you recognize and identify custom encoding schemes.
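One common shape a custom scheme can take is the standard algorithm with a rearranged alphabet. The sketch below assumes a made-up alphabet, `CUSTOM`, that simply swaps the upper- and lower-case blocks. It is not taken from any particular sample, and the helper names are hypothetical. Once the alphabet is recovered, translating back to the standard one lets ordinary tooling do the decoding.

```python
import base64
import string

STANDARD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
# Hypothetical custom alphabet: same 64 characters, different order.
CUSTOM = string.ascii_lowercase + string.ascii_uppercase + string.digits + "+/"


def encode_custom(data: bytes, custom_alphabet: str) -> str:
    """Standard Base64, then swap each character into the custom alphabet."""
    table = str.maketrans(STANDARD, custom_alphabet)
    return base64.b64encode(data).decode("ascii").translate(table)


def decode_custom(encoded: str, custom_alphabet: str) -> bytes:
    """Map custom-alphabet characters back to standard ones, then decode."""
    table = str.maketrans(custom_alphabet, STANDARD)
    return base64.b64decode(encoded.translate(table))


sample = encode_custom(b"DOG", CUSTOM)
print(sample)                         # re9h
print(decode_custom(sample, CUSTOM))  # b'DOG'
```

A standard decoder would turn `re9h` into the wrong bytes, which is why spotting the alphabet string in the binary matters.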
In general, you want to look for the Base64 character set as a string in your programs. To look for Base64 encoding, you can use any of the tools that I mentioned, but probably the fastest way is to use strings or a PE tool like PEStudio. But IDA Pro can also help us in these situations.
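If you'd rather script the search than eyeball strings output, a simple approach is to scan the raw bytes for the standard character set. This is a minimal sketch: `find_base64_alphabet` and `sample_blob` are hypothetical names, the blob is fabricated for the demonstration, and a custom or reordered alphabet would not match this exact search.

```python
import string

B64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
).encode("ascii")


def find_base64_alphabet(blob: bytes) -> list:
    """Return every offset where the standard Base64 alphabet appears."""
    offsets = []
    start = blob.find(B64_ALPHABET)
    while start != -1:
        offsets.append(start)
        start = blob.find(B64_ALPHABET, start + 1)
    return offsets


# Fabricated stand-in for a binary: 4 junk bytes, the alphabet, then padding char.
sample_blob = b"\x00\x01MZ" + B64_ALPHABET + b"=\x00rest"
print(find_base64_alphabet(sample_blob))  # [4]
```

For a real file, the blob would come from something like `open(path, "rb").read()`.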
To look for Base64 encoding, we can use the strings feature in IDA Pro,
so let's first load our binary.
If you're already working with a database, such as I am, you can simply double-click your database. Now, once the database is loaded, we can look for strings by navigating to the View menu item and then Subviews and then Strings.
You can then look for the base 64 character set.
Here we have one.
We can double-click the item, and it will bring us to the section of the program where the character set is referenced. As you can see by looking at the cross references listed on the right, this string is being referenced in a function.
If we'd like to navigate to this function, we can simply double-click on the references,
or we can press X on our keyboard to view all of the cross references.
Once we find the function that we're most interested in, we can navigate to that function by double-clicking it. As you can see, IDA Pro navigates us to a Base64 function. If you notice here on the right, we also have an equal sign.
This is because there's a number of alphabets that use 64 characters, and this additional character indicates padding. To see where the Base64 encoding is happening, we can see what function calls this one by again using the cross references. We can do this by navigating to the top, clicking on the function name,
and pressing X.
We only have one function, so we can double-click and navigate to it. In IDA Pro, the function we navigated from is highlighted in our graph.
From here, you could start to analyze your code by seeing how the Base64 is being implemented.
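As a side note on the equal sign seen next to the character set: padding appears whenever the input length is not a multiple of three bytes. A quick sketch with Python's standard `base64` module shows the effect. The inputs are just shortened versions of the earlier DOG example, and `padded` is an illustrative name.

```python
import base64

# Encode inputs of 3, 2, and 1 bytes to see when '=' is appended.
padded = {data: base64.b64encode(data).decode("ascii") for data in (b"DOG", b"DO", b"D")}

for data, encoded in padded.items():
    print(data, "->", encoded)
# b'DOG' -> RE9H
# b'DO' -> RE8=
# b'D' -> RA==
```

Trailing equal signs on a string are therefore another hint that Base64, or something close to it, is in use.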
Okay, so that wraps up our Base64 session. In the next session, let's look at how we can identify encryption.
Advanced Malware Analysis: Redux

In this course, we introduce new techniques to help speed up analysis and transition students from malware analyst to reverse engineer. We skip the malware analysis lab set up and put participants hands on with malware analysis.
