3 hours 41 minutes
In the last session, we looked at how to convert numbers between bases. So let's build upon what we've reviewed and examine x86/x64 computer architecture.
In previous sessions, we've stated that all the information in a computer is stored in ones and zeros, or bits.
These bits can represent a number, a character or any other piece of information.
In x86/x64 architecture, we've got some fundamental data types. We've discussed these previously: four bits is a nibble, eight bits is a byte, and so on and so forth. As an example, on the left, we have one byte. This is represented as the hex number 55.
Now, just keep in mind, we're going to be using hex to display all of our bytes of data from here on out.
Now, using these fundamental units, we can form other data types. For instance, if we group two bytes together, we'll form a word. If we group four bytes together, we'll form a double word. If we group eight bytes together, we'll form a quad word. And in x64 architecture, we also have one double quad word, which is 16 bytes.
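To make those sizes concrete, here is a minimal Python sketch. The `SIZES` table and the variable names are just illustrative choices, not anything from the course material. It uses the standard `struct` module to confirm the byte counts and splits the example byte 55 into its two nibbles.

```python
import struct

# Size in bytes of each unit named above, using struct's standard sizes.
SIZES = {
    "byte": struct.calcsize("<B"),         # 1
    "word": struct.calcsize("<H"),         # 2
    "double word": struct.calcsize("<I"),  # 4
    "quad word": struct.calcsize("<Q"),    # 8
}
# struct has no 16-byte integer code, so the double quad word is computed.
SIZES["double quad word"] = 2 * SIZES["quad word"]  # 16

# The example byte, hex 55, is two nibbles of four bits each.
one_byte = 0x55
high_nibble = one_byte >> 4
low_nibble = one_byte & 0xF

for name, size in SIZES.items():
    print(f"{name:<17} {size:>2} bytes = {size * 8:>3} bits")
print(f"hex {one_byte:02X} = binary {one_byte:08b}, nibbles {high_nibble:X} and {low_nibble:X}")
```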
Now, one thing you want to keep in mind is that one byte or sequence of bytes could be interpreted differently. For instance, if I give you a binary number of 11000011, this could represent the hex value C3, decimal 195, or it could even mean RET in assembly.
It all depends on the context.
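As a rough illustration of that point, the Python sketch below reads the same byte several ways. The one-entry `ONE_BYTE_OPCODES` table is a made-up stand-in for a real disassembler, and the signed reading is an extra interpretation added here for comparison.

```python
raw = 0b11000011  # the byte from the example

as_hex = f"{raw:02X}"   # "C3"
as_decimal = raw        # 195 when read as an unsigned number

# Assumption for illustration: a one-entry lookup instead of a real disassembler.
ONE_BYTE_OPCODES = {0xC3: "ret"}
as_instruction = ONE_BYTE_OPCODES.get(raw, "(unknown)")

# Extra interpretation, not from the lecture: the same bits as a signed byte.
as_signed = int.from_bytes(bytes([raw]), "little", signed=True)  # -61

print(as_hex, as_decimal, as_instruction, as_signed)
```

The bits never change; only the context we read them in does.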
Okay, So in the context of data types and memory and bits and bytes, why is this all important?
Well, because you should start to understand that what we're doing here is we're building larger data structures from smaller ones. Bits are grouped together to form bytes. Bytes are grouped together to form complex data types. And our data types are grouped together to form information, such as assembly language, that we can understand from a high level.
Now, to process these data structures, we have computer architectures.
Computer architecture is just a set of methods and rules that describe the organization, the functionality and the implementation of a computer system.
Now, in general, we can break up the study of computer architectures into three main categories. We've got the instruction set architecture, the microarchitecture and the system design.
Now, the instruction set architecture, or the ISA, defines the machine code that the processor reads and acts on, as well as the word size, the memory addressing, processor modes, registers and data types that we looked at earlier.
Now, in modern computer architectures, the central processing unit executes machine code. The main memory of the system, this is the RAM, stores all of the data as well as code. And the input/output system directly interfaces with devices such as monitors, keyboards and hard drives.
The ALU, or the arithmetic logic unit, executes instructions from RAM and puts results in memory or registers, and we'll hit registers in a second. Now, the processor handles all of the logical, arithmetic and control activities, but a processor only understands machine language.
But machine language is a purely numerical language. It's too obscure and complex for software development.
So what we did is we created mnemonics for this language, and that's called assembly language.
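A small Python sketch can show the gap between the numbers the processor sees and the names we read. The `MNEMONICS` table and the `to_mnemonics` helper are hypothetical and cover only a few single-byte opcodes as they decode in 64-bit mode. Real x86 instructions vary in length, so this is not how an actual disassembler works.

```python
# Hypothetical lookup for a few single-byte x86-64 opcodes (illustration only).
MNEMONICS = {
    0x55: "push rbp",
    0x90: "nop",
    0xC3: "ret",
    0xCC: "int3",
}

def to_mnemonics(machine_code):
    """Translate each byte to a mnemonic, or to a raw 'db' line if it is not in the table."""
    return [MNEMONICS.get(b, f"db 0x{b:02X}") for b in machine_code]

# The machine language is just numbers; the mnemonics are what we read.
listing = to_mnemonics(bytes.fromhex("55 90 c3"))
print(listing)
```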
Now, before we get into the next session, which will be about assembly language, I want to talk about main memory, because the main memory stores machine code and data for the computer.
The main memory, as it's displayed here, is an array of bytes shown in hex format. Now, when we view memory, lower addresses appear at the bottom and they increase towards the top to higher addresses. Now, in x64 and 32-bit x86 processors, data is stored in what we call little-endian format.
Now, little-endian format deals with how we place this data into memory. When we place data into memory using little-endian, the bytes of a word are stored starting from the least significant byte to the most significant byte. For example, if we place our word into memory, the least significant byte is stored first, in lower addresses, followed by the most significant byte in higher addresses. Now, how do we know which is the least significant byte? Simple. It's the rightmost byte.
How do we know which is the most significant byte? It's the leftmost byte. Now, the reason why Intel computers use little-endian is so that the processor can perform operations on the data in memory faster. However, when viewing data in memory, you only need to know that little-endian systems store the bytes of a value in the reverse of the order that we humans read them: right to left.
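Here is a minimal Python sketch of that byte ordering. The word value 0x1234 and the base address 0x1000 are arbitrary assumptions picked for the example. `struct.pack` with the `<` prefix produces the little-endian layout.

```python
import struct

word = 0x1234                     # most significant byte 0x12, least significant byte 0x34
stored = struct.pack("<H", word)  # "<" = little-endian, "H" = 2-byte unsigned word

base = 0x1000                     # hypothetical address, for display only
# Print higher addresses first so lower addresses appear at the bottom.
for offset in range(len(stored) - 1, -1, -1):
    print(f"0x{base + offset:04X}: {stored[offset]:02X}")

# Reading the bytes back with the right byte order recovers the word;
# reading them as big-endian gives the byte-swapped value instead.
round_trip = struct.unpack("<H", stored)[0]
misread = int.from_bytes(stored, "big")
print(f"little-endian read: 0x{round_trip:04X}, big-endian read: 0x{misread:04X}")
```

The least significant byte, 34, lands at the lower address 0x1000, and 12 follows at 0x1001.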
All right. So before we review the fundamentals of assembly language, let's talk about the CPU. Okay, so the CPU is responsible for executing instructions, and these instructions are in the form of machine code, which, of course, make up our programs. The CPU is made up of the control unit, the ALU,
the I/O device control and, most importantly, registers.
Now, in programming, we need access to data and a way to process variables. So registers are a temporary storage area that's used by the CPU to hold various pieces of data that are referenced by instructions when they're executed.
On Intel architecture, we've got four categories of registers. We've got general purpose registers, segment registers, flag registers and instruction pointers. Now, as malware analysts, we're gonna want to know what the various registers do. So in this section we'll describe general purpose registers and pointers, and then we'll filter in the other ones
as they come up.
Now, in x64 architecture, we have 16 general purpose 64-bit registers. The first eight are RAX through RSP, and the second eight are named R8 through R15. These registers support access to data of different sizes and locations.
Now, in this course, we're not going to be looking at registers R8 through R15 very much, but the other registers enable us to move data into a location that suits our needs. For instance, if we need to store a quad word, we use the entire 64-bit register. If we need to store a dword, we can access the x86 EAX portion of the RAX register.
This will accommodate our four bytes. If we need to store a word, we can access the two-byte AX portion of the RAX register.
And finally, we can even store a single byte in the RAX register by utilizing the AL register. Now, typically, these general purpose registers have a designated purpose. However, this really isn't always the case. But in general, we've got the RAX register. It's called the accumulator, and it's typically used to store results from subroutines and functions.
The RBX register is used by instructions for indexing and addressing calculations.
We've got an RCX register, and this is typically used for counters in loops.
The RDX register is used for input/output operations and for various arithmetic operations. The RBP register refers to the stack frame of the currently executing function, and we'll take a look at that later.
And then we also have the RSI and the RDI registers. These point to addresses in memory for indexing purposes. The RSI register is called the source index register and RDI is called the destination index register, and these are just commonly used for data-transfer-related operations,
like transferring content among strings and arrays and so forth.
The RSP register is a pointer to the top of the stack, and we're gonna talk about that later as well.
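To go back to the RAX, EAX, AX and AL views for a moment, here is a rough Python sketch that models them with bit masks. The register value and the `store_al` helper are assumptions made up for the example. On real hardware these are names for the low bits of the same physical register, not separate variables.

```python
# Arbitrary 64-bit value standing in for the contents of RAX.
rax = 0x1122334455667788

eax = rax & 0xFFFFFFFF  # low 32 bits: a dword, four bytes
ax = rax & 0xFFFF       # low 16 bits: a word, two bytes
al = rax & 0xFF         # low 8 bits: a single byte

def store_al(rax_value, new_al):
    """Hypothetical helper: replace only the lowest byte, leaving the upper 56 bits unchanged."""
    return (rax_value & 0xFFFFFFFFFFFFFF00) | (new_al & 0xFF)

print(f"RAX = 0x{rax:016X}")
print(f"EAX = 0x{eax:08X}")
print(f"AX  = 0x{ax:04X}")
print(f"AL  = 0x{al:02X}")
print(f"RAX after storing AB in AL = 0x{store_al(rax, 0xAB):016X}")
```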
Okay, now that we've covered a lot of the necessary computer architecture information, let's move on to assembly language and get into our lab and start playing around with more malware.
Advanced Malware Analysis: Redux
In this course, we introduce new techniques to help speed up analysis and transition students from malware analyst to reverse engineer. We skip the malware analysis lab set up and put participants hands on with malware analysis.