Malware Components Part 2: PE Files and Memory


Time
3 hours 41 minutes
Difficulty
Advanced
CEU/CPE
5
Video Transcription
00:00
In the last section, we took a look at how to identify files. However, the question you may ask yourself is how a file or piece of malware gets from just being a file on the hard drive to running as a process in memory.
00:13
To obtain answers to our questions, let's learn about the Portable Executable format and Windows memory.
00:21
Previously we discussed that every file has a format, and of course the same goes for an executable file running on Windows. This file is called a Portable Executable, or PE.
00:32
Hence the file format for an executable is called the PE file format.
00:37
Now, the PE file format contains various headers that define the structure of the file, its code and data, and the various other resources that it needs to run on a Windows operating system. It also contains different fields that show how much memory it needs when it becomes a process and where the process copies
00:56
various code, data and other resources.
00:58
Now, the PE format is rather large and sometimes a little bit intimidating. It's got a number of fields, but when you start to understand the sections and where they're mapped into memory, it really gives you a good idea of where you can focus your analysis.
01:15
To look at the PE file in detail, we're going to use a tool called CFF Explorer. There are a lot of different tools out there that you can use to look at the PE header, like PEview, PE-bear and even Python, so really you can use whichever of these tools you feel the most comfortable with. When we open up our sample in CFF Explorer,
01:34
we can see that the left side displays the headers in a tree view.
01:38
Now, as we mentioned, the PE file has different components. It's got the headers, and it has sections, which also have headers. Headers are meant to store meta information, and the sections are meant to store the code, data and other resources.
01:53
One of the elements of our view here is the DOS header. This contains fields like the magic field, which corresponds to the magic bytes of the file. We also have the PE header. This is called the NT header and is split into further headers such as the file header and the optional header.
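To make the header chain concrete, here is a minimal Python sketch of how the magic bytes and the NT header signature can be located. It works on a small synthetic buffer rather than a real sample; the helper names and the offset 0x80 are assumptions made for illustration, while the "MZ" magic, the 0x3C position of the pointer to the NT headers, and the "PE\0\0" signature come from the PE format itself.

```python
import struct


def build_minimal_pe_stub() -> bytes:
    """Build a synthetic buffer holding only the fields this sketch reads."""
    e_lfanew = 0x80  # hypothetical offset of the NT headers
    buf = bytearray(e_lfanew + 4)
    buf[0:2] = b"MZ"  # DOS header magic bytes
    struct.pack_into("<I", buf, 0x3C, e_lfanew)  # pointer to the NT headers
    buf[e_lfanew:e_lfanew + 4] = b"PE\x00\x00"  # NT header signature
    return bytes(buf)


def read_signatures(data: bytes):
    """Return (DOS magic, NT header offset, NT signature) from raw bytes."""
    magic = data[0:2]
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    signature = data[e_lfanew:e_lfanew + 4]
    return magic, e_lfanew, signature


magic, e_lfanew, signature = read_signatures(build_minimal_pe_stub())
print(magic, hex(e_lfanew), signature)
```

On a real file you would pass the bytes read from disk to `read_signatures` instead of the synthetic stub.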
02:12
If you click any of the fields on the left side,
02:14
you can see the corresponding fields and their values on the right.
02:17
In the file header, we have information which includes the time date stamp.
02:23
These are displayed as values, or we might also have an offset to where the data is present in the different PE sections. For instance, we have the pointer to the symbol table, which is displayed as a hex offset calculated from the beginning of the file.
02:42
Now, the data in these headers
02:43
is useful for different reasons. As an example, we can use the data in the number of sections field to confirm that there are actually 15 sections in the PE file. Now, how do we know this? Because F in hex is equal to 15.
03:00
And so if we click the section headers
03:04
and we count the sections, we should have 15. So let's count them up.
03:08
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15.
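The same check can be sketched in Python. The block below packs and unpacks a 20-byte file header with the standard field order (machine, number of sections, time date stamp, pointer to symbol table, number of symbols, size of optional header, characteristics). All field values are made up for illustration, apart from the number of sections being set to 0xF to match the sample in the video.

```python
import struct
from datetime import datetime, timezone

# Layout of the file header: two 16-bit fields, three 32-bit fields,
# then two more 16-bit fields, all little-endian.
FILE_HEADER = struct.Struct("<HHIIIHH")

# Hypothetical header values; only NumberOfSections mirrors the video.
raw = FILE_HEADER.pack(0x8664, 0x000F, 1_600_000_000, 0, 0, 0x00F0, 0x0022)

(machine, number_of_sections, time_date_stamp,
 pointer_to_symbol_table, number_of_symbols,
 size_of_optional_header, characteristics) = FILE_HEADER.unpack(raw)

# The time date stamp is a count of seconds since 1970-01-01 UTC.
built = datetime.fromtimestamp(time_date_stamp, tz=timezone.utc)

print(f"sections: 0x{number_of_sections:X} = {number_of_sections}")
print(f"time date stamp: {built.isoformat()}")
```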
03:16
As malware analysts, we have other information in the PE file that's pretty interesting to us. In the file header, we've got the characteristics field. This holds a two-byte value that gives us different properties of the PE file. If we click in the box here, CFF Explorer lists those properties,
03:37
such as whether it's an executable or a DLL,
03:39
or whether it's a 32-bit or 64-bit executable.
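As a rough sketch of what a tool does with that two-byte value, the characteristics field can be decoded as a set of bit flags. Only a handful of the documented flags are listed here, and the function name and the two sample values are assumptions for illustration.

```python
# A subset of the documented file header characteristics flags.
CHARACTERISTICS = {
    0x0002: "EXECUTABLE_IMAGE",
    0x0020: "LARGE_ADDRESS_AWARE",
    0x0100: "32BIT_MACHINE",
    0x2000: "DLL",
}


def decode_characteristics(value: int) -> list:
    """Return the names of the known flags that are set in value."""
    return [name for bit, name in CHARACTERISTICS.items() if value & bit]


# Hypothetical values: a 32-bit DLL and a 64-bit style executable.
print(decode_characteristics(0x2102))
print(decode_characteristics(0x0022))
```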
03:44
In the optional header, we also have information that's important, such as the address of the entry point and the image base.
03:52
These are important because Windows uses them to map the process into virtual memory. Now there's other sections and information we can explore here. But let's pause on the examination of the P E file because we want to talk about how a file gets loaded into memory and becomes a process.
04:12
In most operating systems, including Windows, we have computer memory.
04:15
Computer memory, or RAM, gives applications a place to quickly store and access data on a short-term basis, and then frees that memory when the application is completed.
04:26
Now, before a program can be executed, we need to load it into the computer's main memory.
04:32
We'll look at processes in a second and how those work. But as it relates to malware analysis, we need to be concerned with two different types of memory: physical and virtual. Physical memory exists on chips and on storage devices such as hard disks,
04:50
and virtual memory is a storage location which
04:54
exists through software
04:56
Now, in the past, and to some extent now, we've had problems when running applications because the memory requirements of the program would call for more resources than our hardware would allow.
05:09
But through software, computer scientists invented a solution called virtual memory. It was implemented to simulate the physical memory of the system, and it's employed in all modern operating systems. Now, virtual memory is pretty complex, and we're not going to cover all of it now.
05:26
However, basically what it does is create an illusion for a process
05:30
that there is a large amount of RAM available to that process,
05:35
and it doesn't have to share that memory with any other processes.
05:40
The virtual memory allows data to be exchanged between the physical memory very quickly and allows the use of much larger programs.
05:49
So what that means is, with 32-bit operating systems, we've got four gigabytes of virtual memory assigned to each process. It doesn't matter if the size of the RAM of the computer is 16 gigabytes or 32: each running process is assigned its own dedicated four gigabytes of virtual memory,
06:10
so processes can execute
06:11
all of their instructions without interfering with each other's memory.
06:16
When a program is double-clicked and it's under execution, it's defined as a process.
06:21
This program execution involves three components: the computer's CPU, the RAM and the hard disk. When a Windows program is executed, the system allocates the memory space, reads the program from the disk and writes it at some locations in the allocated memory. Once this is completed, the program is executing, and it's a process.
06:41
Now, with tools like Process Hacker,
06:44
we can inspect these processes and where they reside in memory.
06:47
The physical and virtual memory spaces are what we call addressable. Virtual memory, just like physical memory, is addressable, meaning that every byte in the memory of the process has an address.
07:01
In four gigabytes of virtual memory, the addresses start at zero and end at a decimal number equal to 2 to the 32nd power minus one. However, when we're looking at various tools, we're dealing with these numbers in hex. So in a 32-bit system,
07:17
the first byte is represented as eight zeros, and the last byte is represented as eight Fs. As we can see in our image, our low memory addresses start at zero in hex, and they grow to the higher memory addresses, represented as eight Fs in hex.
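The arithmetic behind that range is short enough to check directly. This is only a worked example of the numbers mentioned above; the variable names are chosen for illustration.

```python
ADDRESS_BITS = 32

address_count = 2 ** ADDRESS_BITS   # how many distinct byte addresses exist
first_address = 0
last_address = address_count - 1    # addresses are numbered from zero

print(f"first: {first_address:08X}")
print(f"last:  {last_address:08X}")
print(f"size:  {address_count // 2**30} GiB")
```

Formatting with `08X` pads to eight hex digits, which is how 32-bit addresses usually appear in analysis tools.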
07:34
Our process is mapped into memory within this space.
07:39
We might ask ourselves: how does Windows know where it's going to allocate the space in virtual memory, and at what address? This is where the image base field in the PE header can help.
07:51
The answer is that usually the image base field is used to tell Windows where to begin mapping the PE file and its sections into memory.
08:01
So to view the allocated memory for a running process, we can use a tool like Process Hacker in our malware lab.
08:09
With Process Hacker, we can view the properties of a running process, such as its file name and path, performance and other detailed information.
08:18
If we click on the Memory tab, we can view the memory regions allocated to the process. Windows divides the virtual memory of a process into memory chunks that we call pages.
08:28
These pages are made up of the process's data and instructions, which, if you remember, are part of the program's executable file as it resides on disk.
08:39
So when the operating system loads the program as a process, the data and instructions are transferred into memory by splitting them into these pages.
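A small sketch can show what splitting into pages means numerically. It assumes a page size of 4 KiB (4096 bytes), which the video does not state; the image size and the address below are hypothetical too.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def pages_needed(size_in_bytes: int) -> int:
    """Number of whole pages required to hold size_in_bytes."""
    return -(-size_in_bytes // PAGE_SIZE)  # ceiling division


def page_base(address: int) -> int:
    """Start address of the page that contains address."""
    return address & ~(PAGE_SIZE - 1)


# Hypothetical image of 0x5001 bytes, and an address inside some page.
print(pages_needed(0x5001))
print(hex(page_base(0x401234)))
```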
08:48
Now, the pages are mapped onto frames, which helps with how the memory is moved about between the OS, the process and the hard disk. You could think of these as different buckets that the physical memory sets up for virtual memory to occupy.
09:05
The virtual memory of a process is also split into two separate areas. We have the user space and the kernel space.
09:13
In 32-bit Windows, the total addressable range, as we mentioned before, is zero through eight Fs in hex.
09:22
However, this is a total representation of the virtual memory space, which the user space and the kernel space
09:30
both occupy. So in essence, the process runs in two areas: the user space and the kernel space.
09:37
When we view a process in Process Hacker, as here in our lab, we can see that the process is occupying a space from zero to a hex number that begins with 0x7F and runs to the end of the user space, which is
09:56
an additional 15 Fs in hex.
09:58
This is the user space, and the kernel space would start at 8 followed by 15 zeros and would end with all Fs. Now, if all of this is a little bit confusing to you, not to worry: in the next module we'll take a look at converting numbers into hex and how 32-bit and 64-bit addressing works.
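The split described here can be sketched as a simple comparison against the midpoint of the address space. This follows the simplified model from the video, where the kernel space starts at 8 followed by 15 zeros. The real boundaries differ between Windows versions and configurations, so treat the function names and the split point as assumptions for illustration.

```python
def kernel_start(bits: int = 64) -> int:
    """First kernel-space address in the simplified half-and-half model."""
    return 1 << (bits - 1)


def region(address: int, bits: int = 64) -> str:
    """Classify an address as 'user' or 'kernel' under that model."""
    if not 0 <= address < (1 << bits):
        raise ValueError("address does not fit in the address space")
    return "kernel" if address >= kernel_start(bits) else "user"


print(f"{kernel_start(64):016X}")          # 8 followed by 15 zeros
print(region(0x00007FF6A0001000))          # hypothetical user-mode address
print(region(0xFFFFF80000000000))          # hypothetical kernel-mode address
```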
10:16
For now, just remember that the process is mapped into a 32- or 64-bit address space that's divided into two areas: the user space and the kernel space. When we look at our process in memory, Process Hacker groups the memory blocks of the same type into pages, which you can expand
10:33
here. Process Hacker is showing us the different types of data that are stored in pages,
10:39
and the pages are categorized based on the type of data that they store.
10:43
There are three types of pages: private, image and mapped.
10:48
Private pages are exclusive to the process and hold process-related data structures such as the process stack, the process environment block and the thread environment block.
10:58
These private pages will be pretty important to us as we dissect malware later.
11:03
Image pages contain the modules of the main executable and DLLs, and mapped pages contain different files needed by the process, such as ones that reside on the hard disk and get mapped into virtual memory. Virtual memory pages also have a state.
11:18
The page states tell us if a page has some physical memory allocated for it or not,
11:26
Pages can be in a committed, reserved or free state, and it's listed here in Process Hacker. When a page is in a reserved state, this means that the page has some virtual memory allocated in the process, but it doesn't have any corresponding physical memory.
11:43
A committed page is an extension of a reserved page, but now the page does have physical memory allocated to it.
11:50
And free pages are address ranges in virtual memory that haven't been assigned to the process yet.
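Under the hood, the page type and state shown in a memory viewer come from numeric constants that Windows reports for each region. The sketch below decodes them from plain integers. The constant values are the documented ones, while the function name and the sample regions are assumptions for illustration, not output from a real process.

```python
# Documented region state values.
MEM_COMMIT, MEM_RESERVE, MEM_FREE = 0x1000, 0x2000, 0x10000
# Documented region type values.
MEM_PRIVATE, MEM_MAPPED, MEM_IMAGE = 0x20000, 0x40000, 0x1000000

STATE_NAMES = {MEM_COMMIT: "committed", MEM_RESERVE: "reserved", MEM_FREE: "free"}
TYPE_NAMES = {MEM_PRIVATE: "private", MEM_MAPPED: "mapped", MEM_IMAGE: "image"}


def describe_region(state: int, region_type: int) -> dict:
    """Translate raw state and type values into the terms used in the video."""
    return {
        "state": STATE_NAMES.get(state, "unknown"),
        "type": TYPE_NAMES.get(region_type, "none"),  # free regions have no type
        "has_physical_storage": state == MEM_COMMIT,
    }


# Hypothetical regions: a committed stack page and a reserved image range.
print(describe_region(MEM_COMMIT, MEM_PRIVATE))
print(describe_region(MEM_RESERVE, MEM_IMAGE))
print(describe_region(MEM_FREE, 0))
```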
11:56
As we mentioned, memory pages contain code as well as data. Some pages contain code that needs to be executed by the CPU, while others contain data that needs to be read. Sometimes the process wants to read or write some data in a page, but it needs to be granted permission to do so. So pages have read, write and execute permissions,
12:16
and they're displayed here in the Protection column.
12:20
Process Hacker uses R, W and X to indicate the permissions of the page.
12:24
As malware analysts, we should be pretty mindful of what type of permissions a page has because, for example, the stack and the heap of a process are only meant to store data and shouldn't contain any executable code. As you can see here, the stack of our process is only marked as read/write.
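That rule of thumb can be sketched as a small check: translate the numeric protection value of a region into R, W and X, then flag private regions that are executable. The protection constants are the documented Windows values. The three-character strings, the function names and the "worth a look" heuristic are this sketch's own simplification, and the strings are not necessarily what Process Hacker displays.

```python
MEM_PRIVATE = 0x20000  # documented region type value for private memory

# Documented base protection values mapped to a simple R/W/X string.
BASE_PROTECTIONS = {
    0x01: "---",  # PAGE_NOACCESS
    0x02: "R--",  # PAGE_READONLY
    0x04: "RW-",  # PAGE_READWRITE
    0x08: "RW-",  # PAGE_WRITECOPY (copy-on-write)
    0x10: "--X",  # PAGE_EXECUTE
    0x20: "R-X",  # PAGE_EXECUTE_READ
    0x40: "RWX",  # PAGE_EXECUTE_READWRITE
    0x80: "RWX",  # PAGE_EXECUTE_WRITECOPY (copy-on-write)
}


def to_rwx(protect: int) -> str:
    """Drop modifier bits such as PAGE_GUARD (0x100) and map to R/W/X."""
    return BASE_PROTECTIONS.get(protect & 0xFF, "???")


def worth_a_look(region_type: int, protect: int) -> bool:
    """Flag private memory (stack, heap, allocations) that is executable."""
    return region_type == MEM_PRIVATE and "X" in to_rwx(protect)


print(to_rwx(0x04), worth_a_look(MEM_PRIVATE, 0x04))  # a normal stack page
print(to_rwx(0x40), worth_a_look(MEM_PRIVATE, 0x40))  # executable private page
```

Executable private memory is not proof of malicious code on its own, since some legitimate programs generate code at run time, so this is a starting point for closer inspection.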
12:41
Okay, so you may be asking yourself: he's explained all this memory to me, but really, how is this useful as I'm performing my malware analysis? The answer is that the memory of a process really has a lot of data, and some of that data is in memory as human-readable strings like URLs, domain names, file names and IP addresses.
13:00
So with Process Hacker, we can view the data present in various pages
13:03
by double clicking a particular memory block.
13:07
For instance, if we double-click inside our image base address, you can see the strings of our DOS header written within the memory page.
13:16
However, going through all of the memory pages one by one could take you a bit of time. So instead you can click the Strings button at the top right of the Memory tab. This will give you the option to select the pages and to see the strings inside those pages.
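A strings search over a memory page can be approximated in a few lines of Python. This minimal sketch looks only for runs of printable ASCII characters of a minimum length. The buffer contents, the minimum length of 4 and the function name are assumptions for illustration, and this is not necessarily how Process Hacker implements its own search.

```python
import re


def extract_ascii_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII characters at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Synthetic stand-in for the bytes of one memory page.
page = b"\x00\x01MZ\x90\x00http://example.test/a\x00\xff\xfe192.168.1.10\x00ab\x00"

for text in extract_ascii_strings(page):
    print(text)
```

Short runs such as "MZ" and "ab" fall below the minimum length and are skipped, which keeps the noise down at the cost of missing very short strings.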
13:31
Okay, so I know there's been a lot to take in with memory so far, but it's actually pretty important that we get to know how Windows memory works and how executables reside in that memory. However, before we move on, I'd like to discuss relative virtual addressing, or RVA.
13:48
RVA is a method used to locate parts of the PE file after they're mapped into memory.
13:56
It's a simple concept to understand when you realize that an executable's base address won't be known to us before it's loaded into memory.
14:03
The image base field, as we saw previously, is merely a suggestion for this mapping, and the file isn't always mapped at that address. If you notice, when we view the address of entry point field of the malware PE file in CFF Explorer, instead of an address value like 0x4000 it simply gives us an offset.
14:22
This offset is relative to wherever the program will be loaded into memory. It's similar to giving directions.
14:30
For instance: when you get to the right house, take a left five houses after. In our PE, the entry point of the file will be 14E0 from wherever the file is loaded, whatever that base address may be.
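The relationship is a single addition: virtual address equals load base plus RVA. The sketch below uses the entry point RVA of 14E0 from the video. The two base addresses are hypothetical, chosen only to show that the same RVA resolves to different virtual addresses depending on where the file is loaded.

```python
ENTRY_POINT_RVA = 0x14E0  # entry point offset mentioned in the video


def rva_to_va(base_address: int, rva: int) -> int:
    """Virtual address of an item, given where the image was loaded."""
    return base_address + rva


def va_to_rva(base_address: int, virtual_address: int) -> int:
    """Inverse conversion, useful when reading addresses in a debugger."""
    return virtual_address - base_address


# Hypothetical load addresses for the same file on two different runs.
for base in (0x00400000, 0x7FF6A0000000):
    print(hex(base), "->", hex(rva_to_va(base, ENTRY_POINT_RVA)))
```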
14:43
Okay, that was a little bit of a long video, but I hope you're still with me. In the next section, let's look at some Win32 API internals and the Windows Registry.
Up Next
Advanced Malware Analysis: Redux

In this course, we introduce new techniques to help speed up analysis and transition students from malware analyst to reverse engineer. We skip the malware analysis lab set up and put participants hands on with malware analysis.
