In the malware components section, we're going to be looking at the basics of the operating system while understanding common characteristics of malware, such as its persistence and different types of payloads. We'll also look at the different tools we can use to identify malware's hidden components.
As malware analysts and reverse engineers, we need to understand where the malware that we want to analyze is being run.
So because we're looking at Windows malware, this requires us to know the major parts of the Windows operating system. As it relates to Windows, the software components we'll be looking at most of the time are the file system, Windows memory, and the registry. So let's take a brief look at these individually before we look at some malware characteristics.
To begin, we have the Windows file system. This is where data in the form of files and directories is stored on the physical disk drive. Now, the Windows operating system has supported various file systems over the years, for instance the File Allocation Table, or FAT.
In addition to FAT, all Windows operating systems since Windows NT was released in 1993 support the newer file system called New Technology File System, or NTFS.
Newer versions of Windows also support exFAT, which is designed for flash drives and was first included in Windows CE. In Windows Server 2012 and newer, there's support for newer file systems that focus on data availability and scalability, such as the Resilient File System, or ReFS,
as well as support for Linux file systems.
Now, although Windows supports different formats, what each file system stores is information about the directory paths and the files themselves. This information includes the file name, the size of the file, time and date stamps, as well as permissions.
These will be pretty important as you're conducting your malware investigations. For instance, by using a tool like FileAlyzer, we can examine a file, or maybe a malicious sample, to understand when a piece of malware was created or modified on the file system.
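As a rough illustration of the metadata just described, here is a minimal Python sketch that reads a file's size, timestamps, and permission bits with the standard library. It is not how any particular analysis tool works; the function name `file_metadata` and the throwaway sample file are assumptions made for the example.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Return basic file system metadata for one file (hypothetical helper)."""
    info = os.stat(path)

    def to_utc(timestamp):
        # Convert a POSIX timestamp to an ISO 8601 string in UTC.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": to_utc(info.st_mtime),
        "accessed_utc": to_utc(info.st_atime),
        # On Windows st_ctime has historically been the creation time;
        # on Linux it is the time of the last metadata change.
        "ctime_utc": to_utc(info.st_ctime),
        "permissions_octal": oct(info.st_mode & 0o777),
    }


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as handle:
    handle.write(b"This is a test")
    sample_path = handle.name

metadata = file_metadata(sample_path)
for key, value in metadata.items():
    print(f"{key}: {value}")

os.remove(sample_path)
```

Keep in mind that timestamps can be altered by malware, so they are a lead to follow up on rather than proof on their own.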
As malware analysts, we examine one or many different types of files during an investigation, so understanding the various file formats and how to identify them is pretty key.
So computers understand binary,
where every file, every bit of every piece of data, is represented by a one or a zero.
This means that any type of data, whether it's an executable, a text file, a Word document, or whatever, is stored in the form of a binary file.
Now, as we know, we can view these files in their hex form by opening them in a hex editor. For example, when we open up our test file here that we've got in our malware lab, the hex editor displays the file bytes in hex.
In the file, the hex codes range from 0 to 9 and A through F, and these are on the left,
and on the corresponding right
is the ASCII printed version of the characters.
So here you can see that 54 is the hex equivalent of capital T, 68 is the hex equivalent of lowercase h, and so on and so forth. So we should remember that hex is an alternate representation of bits. Now, usually, as common users or programmers, we don't deal with files at the hex or binary level that much.
But as malware analysts,
we always need to be looking deeper into samples, so we really should be very comfortable with hex. Now, later on, as we look at x86-64 internals, we'll do some conversions from binary to decimal, from binary to hex, and vice versa, to get more comfortable.
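To make the hex view concrete, here is a small Python sketch of what a hex editor displays: hex codes on the left and printable ASCII on the right. The `hex_dump` function and the sample bytes are assumptions for illustration, not the contents of the lab's test file.

```python
def hex_dump(data, width=16):
    """Format bytes as hex on the left and printable ASCII on the right."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        # Non-printable bytes are shown as a dot, as most hex editors do.
        ascii_part = "".join(
            chr(byte) if 32 <= byte < 127 else "." for byte in chunk
        )
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {ascii_part}")
    return "\n".join(lines)


sample = b"This is a test"  # stand-in for the contents of a test file
print(hex_dump(sample))

# The same value in three notations: 'T' is 54 in hex,
# 84 in decimal, and 01010100 in binary.
value = ord("T")
print(hex(value), value, format(value, "08b"))

# Converting back from binary and hex strings to decimal.
print(int("01010100", 2), int("54", 16))
```

The first two bytes of the dump are 54 and 68, matching the capital T and lowercase h mentioned above.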
Also, as malware analysts, we understand that there are many different types of files on our computers, so we need a way to uniquely identify them.
We can't simply use the file name, because we could have two files with the same name on the same computer, and the two files could even contain the same types of content and properties while still being different files.
So this is where hashing comes in. To uniquely identify a malware sample or file, we can use a hashing algorithm.
Now, a hashing algorithm employs a hash function, which is a mathematical function that converts an input value of any size into a fixed-size numeric value, or hash value.
Now the output depends on the hashing algorithm, but a very popular one that we use is the MD5 message digest algorithm. And so, as you can see in our diagram here, we have our file, which is used as an input value,
and the hash function operates on our input, and the output, or value returned by the hash
function, is a 128-bit message digest, or hash value, which is usually written as 32 hex characters.
It's these hashing algorithms, and others like them, such as SHA-1 and SHA-256,
that allow us to use the numerical output as a checksum for a file and verify its data integrity.
This has the added benefit of allowing us to use this unique identifier when we write reports, or when we search the Internet for information about a specific file that we're researching.
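The hashing step can be sketched in a few lines of Python with the standard `hashlib` module. The helper name `file_hashes` and the sample bytes are assumptions for illustration; with a real sample you would read the file in binary mode and pass its bytes in.

```python
import hashlib


def file_hashes(data):
    """Return MD5, SHA-1 and SHA-256 digests of a bytes object as hex strings."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


original = b"This is a test"
modified = b"This is a tesT"  # a single character changed

original_hashes = file_hashes(original)
modified_hashes = file_hashes(modified)

for name, digest in original_hashes.items():
    # Each hex character encodes 4 bits, so 32 hex characters is 128 bits.
    print(f"{name}: {digest} ({len(digest) * 4} bits)")

# Any change to the input produces a different digest, which is
# what makes the hash usable as an integrity check.
print("digests differ:", original_hashes["sha256"] != modified_hashes["sha256"])
```

MD5 and SHA-1 are still widely used as sample identifiers in reports and lookups, but both have known collision weaknesses, so recording the SHA-256 alongside them is a reasonable habit.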
There are, of course, other ways to identify files. The first way is to look at the file extension. However, as you know, malware authors typically employ social engineering techniques, such as renaming the file extension so that users think they're clicking on a PDF when in fact they're actually executing a piece of malware.
Of course, a more foolproof way of identifying a file is by looking at the magic header, which we discussed previously. Or we can use the file command from our Linux terminal, which in essence does the same thing. The file command looks at the magic header bytes to tell you the file type. So usually I like to employ all three methods.
So first, we can look at the file extension and see whether it's an EXE. Then we can open up our file in a hex editor and be certain it's an executable by examining the magic header bytes. And third, we'll use the file command in our Cygwin terminal to verify the file is an EXE.
And as you can see here, the file command outputs that the malware.txt file
is, in fact, a PE32 executable.
Now, as we're on the topic of file identification, it's important to note that the magic bytes are not located randomly in the file. The magic bytes are part of the file header. So here we have a file, and every file has a file header, which contains a structure or format that defines how data should be stored in the file.
The structure is usually defined by its headers, and it's these headers that hold metadata on the data stored in the file. So when you parse the header and the magic bytes, this process lets you identify the file format and the type.
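The extension check and the magic-byte check described above can be sketched together in Python. This is a minimal example under stated assumptions: the signature table holds only a handful of well-known values, the names `identify_by_magic` and `check_file` are hypothetical, and the header bytes stand in for the first bytes of a real sample. The file command knows far more signatures and inspects more of the header, which is how it can report details such as PE32.

```python
import os

# A few well-known signatures found at the very start of a file.
MAGIC_SIGNATURES = {
    b"MZ": "Windows executable (MZ/PE)",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive (also used by DOCX/XLSX)",
    b"\x89PNG\r\n\x1a\n": "PNG image",
}


def identify_by_magic(header):
    """Return a description based on the leading bytes of a file."""
    for magic, description in MAGIC_SIGNATURES.items():
        if header.startswith(magic):
            return description
    return "unknown"


def check_file(filename, header):
    """Pair the claimed extension with the type the header suggests."""
    extension = os.path.splitext(filename)[1].lower()
    return extension, identify_by_magic(header)


# Stand-in for the first bytes of a sample that was renamed to .txt.
sample_header = b"MZ\x90\x00\x03\x00\x00\x00"
extension, detected = check_file("malware.txt", sample_header)
print(f"extension says: {extension}")
print(f"header says:    {detected}")
```

When the extension and the header disagree, as in this example, trust the header and treat the mismatch itself as a sign that something may be trying to hide.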
All right, so I hope you're with me so far. Remember that the whole purpose of this section is to make sure that you can successfully form an analysis strategy based on the file identification. Keep in mind that the strategy employed to analyze a malicious PDF is going to be different from the strategy you employ to analyze an executable.
And so, knowing this information is going to help you as you perform analysis.
In the next video, let's take a look at the PE file specifically, and at Windows memory.
Advanced Malware Analysis: Redux
In this course, we introduce new techniques to help speed up analysis and transition students from malware analyst to reverse engineer. We skip the malware analysis lab setup and get participants hands-on with malware analysis.