Assembly

Course
Time: 13 hours 15 minutes
Difficulty: Beginner
CEU/CPE: 14

Video Transcription

00:01
Hello. This is Dr Miller, and this is Episode 2.1 of Assembly.
00:07
Today we're gonna learn about assembly language and assembly instructions, identifiers, directives, and then creating a listing of a file.
00:18
Assembly language.
00:20
So machine language is the low level that a machine uses in order to execute instructions.
00:27
And so, for example, if we would like to add EAX to EBX and then store the result in EAX,
00:34
we would run the machine instruction 03 C3.
00:39
Now, clearly, that's not very readable. And it changes with different processor types,
00:45
and so a better
00:46
language is needed, and that language is assembly.
00:50
So given the machine language 03 C3, in order to do the movement of data, we have the assembly language: add eax, ebx. So add those two registers together and then store the result in EAX.
01:06
Clearly, this is more readable, and it is a higher level language than writing something
01:11
in binary or in hex.
01:12
From a previous lecture, you should recognize that this is hex, given the 0x and the letter C in there.
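As a rough illustration of why the bytes 03 C3 correspond to add eax, ebx, the sketch below decodes them in Python. It assumes the standard 32-bit x86 encoding, in which opcode 03 is the register form of add and the second byte is a ModRM byte. The function name decode_add and the register table are made up for this example, and only this one register-to-register form is handled.

```python
# Register numbers used by the x86 ModRM byte (32-bit registers).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_add(code: bytes) -> str:
    """Decode a two-byte 'add r32, r32' instruction (hypothetical helper)."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x03:
        raise ValueError("this sketch only understands opcode 03")
    mod = modrm >> 6          # top two bits: addressing mode
    reg = (modrm >> 3) & 0x7  # middle three bits: destination register
    rm = modrm & 0x7          # low three bits: source register
    if mod != 0b11:
        raise ValueError("this sketch only handles register operands")
    return f"add {REG32[reg]}, {REG32[rm]}"


print(decode_add(bytes.fromhex("03c3")))  # add eax, ebx
```

The byte C3 is 11 000 011 in binary, so the mode is register-to-register, the destination is register 0 (EAX), and the source is register 3 (EBX).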
01:21
And then, for different architectures, we can keep the same code and end up getting the right result.
01:27
Additionally, optimizations can be made, so if you have code that is not working or is not needed, the assembler and/or compiler can get rid of it or pick a different instruction at the low level that will be faster.
01:42
And then the format of assembly instructions is we have a mnemonic, or something that you can remember, and then that is followed by operands.
01:52
Now there is a difference between an assembler and a compiler. So with an assembler, generally there's a one-to-one ratio of assembly instructions to machine code,
02:02
and an assembler is machine dependent. So if I write something in assembly for x86, I have to write completely different code for ARM or for SPARC.
02:13
Whereas if we have a compiler, again, it's one higher level of abstraction,
02:17
and so one high-level line to create a variable or move some data may generate many machine and assembly instructions.
02:28
But the benefit of it being higher level is that it's more portable, so you can write something in C, and if you have a compiler, it can compile to x86 code or it can compile to ARM code, and you don't have to change the high-level code.
02:42
Now. This is not always the case, but generally it is the case.
02:46
So the assembler we're using for this class is NASM, and so with that we're actually, underneath it, using the GNU Compiler Collection. So this is a compiler that will compile code.
03:00
And then we're also using the C programming language to interface with GCC.
03:05
And then we should also be using make. So part of our template uses make in order to build the program.
03:14
Assembly instructions.
03:15
So again we have a mnemonic, and then that is followed by operands. So the mnemonic is add, whereas the operands are what it's going to do that operation on.
03:25
We have three general types. So we have registers, as we saw: EAX, EBX, ECX are registers;
03:32
memory locations, which we'll look at later;
03:36
and an immediate value, so that's an actual value that's hard-coded inside of the binary.
03:43
Additionally, we do have implied operands. So, for example, an increment will
03:47
affect a register by default. It will add one to that register, or a decrement will decrease the register by one.
03:58
So example instructions. So we have our mnemonic, which is mov.
04:01
And then we have the register, EBX, and we also have the hard-coded value, 7. Right? So the 7 is an immediate, and then EBX is a register.
04:14
Here we have two registers, so mov eax, ebx.
04:16
So that says that EBX's value is going to get copied into EAX.
04:23
Or increment EAX or decrement EAX. So those again would increment or decrement that register.
04:31
And the way that instructions are in NASM is that we have an optional label (the brackets mean optional),
04:40
followed by a mnemonic, so we have to have a mnemonic. And then if a mnemonic doesn't have implied operands, we need some operands, and then we have an optional comment. And so here are some example mnemonics: mov, add,
04:53
sub, mul,
04:55
jmp, call.
04:57
Notice that they aren't necessarily spelled right. So there's no e on mov.
05:00
Um, jmp doesn't have a u.
05:02
Um, but those are examples of mnemonics, and then all of the mnemonics that we'll learn.
05:08
And then comments. So a comment starts with a semicolon. So if you're used to programming in C,
05:14
you might just put a semicolon at the end of your statements. But that is a NASM comment.
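The layout described above (optional label, mnemonic, operands, optional comment) can be sketched as a small Python function that splits one source line into those parts. This is only a toy model, not how NASM itself parses: the name parse_line is made up, it assumes labels always end with a colon, and it does not handle semicolons or colons inside strings.

```python
def parse_line(line: str) -> dict:
    """Split one assembly source line into label, mnemonic, operands, comment."""
    # Everything after the first semicolon is the comment.
    code, _, comment = line.partition(";")
    code = code.strip()

    # Assumption for this sketch: a label, if present, ends with a colon.
    label = None
    if ":" in code:
        label, _, code = code.partition(":")
        label = label.strip()
        code = code.strip()

    # The first word is the mnemonic; the rest is a comma-separated operand list.
    parts = code.split(None, 1)
    mnemonic = parts[0] if parts else None
    rest = parts[1] if len(parts) > 1 else ""
    operands = [op.strip() for op in rest.split(",") if op.strip()]

    return {
        "label": label,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip() or None,
    }


print(parse_line("start: mov ebx, 7 ; load seven"))
print(parse_line("inc eax"))
```

The second call shows an instruction with a single operand and no label or comment.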
05:21
Okay,
05:23
Directives.
05:25
So directives are not actual instructions. They just tell the assembler that you want to do something, for example, set the size of the stack, define some memory, define some constants.
05:36
So directives are for humans to make things easier, but they aren't actual assembly instructions.
05:45
So, for example, %define.
05:47
So if you've programmed in a language like C, #define allows us to create constants. So %define does also. So it says %define SIZE 10, so every place that we'll see SIZE, it just replaces it with the actual value of 10.
06:03
So this instruction would be the same as mov eax, 10.
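To make the text-replacement idea concrete, here is a minimal Python sketch that mimics what a %define does to later lines. It is a simplified model, not NASM's preprocessor: the function name expand_defines is made up, and it only handles simple "name value" definitions with whole-word matching.

```python
import re


def expand_defines(source: str) -> str:
    """Replace names introduced by '%define NAME VALUE' in the lines that follow."""
    defines = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*%define\s+(\w+)\s+(.+)", line)
        if match:
            # Remember the constant; the directive itself produces no instruction.
            defines[match.group(1)] = match.group(2).strip()
            continue
        for name, value in defines.items():
            # Whole-word replacement, so SIZE does not match inside RESIZE.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, v=value: v, line)
        output.append(line)
    return "\n".join(output)


print(expand_defines("%define SIZE 10\nmov eax, SIZE"))  # mov eax, 10
```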
06:11
We have data directives, which we will have to do a lot of,
06:15
and so the D stands for define. So we've got define a byte (db), define a word (dw), and define a dword (dd). Remember, a byte is one byte, which is 8 bits. A word is two bytes, or 16 bits, and a double word is four bytes, or 32 bits.
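As a quick check on those sizes, the following Python sketch uses the standard struct module to show how many bytes a byte, a word, and a double word occupy. Mapping the format codes B, H, and I to db, dw, and dd is an assumption made for this illustration; the "<" prefix asks struct for fixed standard sizes in little-endian order.

```python
import struct

# Assumed mapping from data directive to an unsigned struct format of the same width.
SIZES = {
    "db": struct.calcsize("<B"),  # byte
    "dw": struct.calcsize("<H"),  # word
    "dd": struct.calcsize("<I"),  # double word
}

for directive, nbytes in SIZES.items():
    print(f"{directive}: {nbytes} byte(s), {nbytes * 8} bits")
```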
06:33
Identifiers.
06:35
So throughout the course, you're going to be creating different variables or identifiers. And so we need to know how to define them and what are actually allowed.
06:46
So we use them for variables, constants, procedures or functions
06:51
and labels.
06:54
And so
06:55
the main key here is that you need to start your variables basically with a letter. That's the easiest way to start it. And then you can add some of these additional characters. But generally, most of my variables
07:09
would just have letters, possibly numbers within them, and then maybe underscores to make things more readable.
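A minimal Python sketch of such a check is shown below. The exact character sets are an assumption taken from the NASM manual's description of valid label characters (letters, digits, _, $, #, @, ~, . and ?, with only a letter, ., _ or ? allowed first), not from the slide in this lecture. The function name is made up, and reserved words such as register names are not rejected.

```python
import re

# Assumed rule: first character is a letter, '.', '_' or '?';
# later characters may also be digits, '$', '#', '@' or '~'.
IDENTIFIER = re.compile(r"[A-Za-z._?][A-Za-z0-9_$#@~.?]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if the whole name matches the assumed identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ("total_1", "loop_start", "1total", "my var"):
    print(candidate, is_valid_identifier(candidate))
```

Sticking to letters, digits, and underscores, as suggested above, always satisfies this rule as long as the name starts with a letter.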
07:18
All right, so here are some examples of addition and subtraction. So mov eax, 100 in hex,
07:25
add eax, 400 hex. So one plus four should give us five,
07:30
and then sub eax, 200 hex. Right, so that should give us 300 hex.
07:36
So if you open up your editor, you can type this code in and see what the result is.
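If you want to check the arithmetic before assembling anything, the same three steps can be mirrored in Python. This is only a model of the register: the variable eax and the 32-bit mask are stand-ins for the real hardware register, not part of the course template.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a real register

eax = 0x100                    # mov eax, 100h
eax = (eax + 0x400) & MASK32   # add eax, 400h  -> 500h
eax = (eax - 0x200) & MASK32   # sub eax, 200h  -> 300h

print(hex(eax))  # 0x300
```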
07:43
Um, the dump_regs 1 will just print off what all the registers are.
07:47
And then it puts the number one in front of it. So it does
07:53
hashtag number one. And so you can call dump_regs multiple times, and for each one of those you can put a different number in there.
08:01
So if you have the template already set up, I would go ahead and pause the video, and I would enter this in and see what it produces, and maybe change the numbers in here and see if you get a different result.
08:16
Generating a listing.
08:18
So a listing will give us information about offsets and binary code for commands. And it also will help us when we're trying to find errors.
08:28
So this is to tell us that we're using the program NASM, and we're using ELF because I'm on Linux.
08:33
I'm naming the listing, and then this is the example that I want to do for it.
08:41
So I'm gonna go ahead and switch to a
08:43
terminal window and go ahead and do that,
08:54
so I'm going to create my project here.
08:58
I'll just call it 2.1
09:01
Now if I look inside my directory of my projects (so I already entered that before I started), I have a directory called 2.1.
09:11
And now I should have my assembly file, so I'll go ahead and edit it.
09:18
And so let's put some
09:20
code in here.
09:26
So I'm just going to move the value of negative 10 into EAX. Maybe I do another one.
09:41
I'll go ahead and write and quit.
09:43
I can run make in order to make sure that my program will assemble and compile.
09:50
So you can see, right, make is the program that we've been using. It's running the command nasm, and it's also running gcc in order to generate our output.
09:58
And then driver.c is the program that
10:03
provides a skeleton so that our program can run
10:07
And so if we want to generate a listing
10:16
All right, so nasm -f elf. So that's the type of executable.
10:22
Um, -l is the listing file that we're going to generate, and then 2.1.asm is the file that I want to use for that.
10:31
So now I can look at that. I can use a program like cat, which means concatenate, or print to the terminal,
10:41
so we can see in here that
10:43
this is our asm_main, and we can see the different instructions.
10:48
So
10:50
the instruction to move something into EAX
10:54
is this B8,
10:56
and then we can see we've got a bunch of Fs and a 6.
11:00
So again, remember that this is stored in two's complement, which we talked about in a previous lecture.
11:05
And it also stores that a little bit backwards. Right? So the first byte is right here.
11:13
The second byte is right here. The third byte is right here, and the fourth byte is right here,
11:16
and we'll talk in subsequent lectures about little endian versus big endian and why it looks like this.
11:24
But this is the actual number negative 10 stored in two's complement. And then we can see that in hex
11:31
in our listing for the number 20. So 20 is 14 in hex. It's a 16 plus a 4,
11:39
and then we can see minus 30.
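The bytes seen in the listing can be reproduced with a short Python sketch. It assumes the usual x86 encoding in which B8 is followed by a 32-bit immediate stored in little-endian order; the function name encode_mov_eax is made up for this illustration.

```python
def encode_mov_eax(value: int) -> bytes:
    """Build the machine code bytes for 'mov eax, value' (32-bit immediate)."""
    # Masking to 32 bits turns a negative number into its two's complement pattern.
    immediate = (value & 0xFFFFFFFF).to_bytes(4, "little")
    return bytes([0xB8]) + immediate


for value in (-10, 20, -30):
    print(f"mov eax, {value:>3} -> {encode_mov_eax(value).hex(' ')}")
```

For negative 10 this gives b8 f6 ff ff ff: the opcode B8, then the bytes of FFFFFFF6 with the lowest byte first, which is why the value looks backwards in the listing.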
11:41
So if you want practice on your two's complement, pick a number that you want to do. So say negative 76: try and figure out what its binary representation is, figure out what its hexadecimal representation is, and then go ahead and
11:56
convert it, and then build a program that does it, generate a listing, and then you can see what the result is.
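To check your hand conversion for an exercise like negative 76, a small Python sketch can compute the same pattern two ways: by masking to a fixed width and by the invert-and-add-one method. Both function names are made up for this example, and the 32-bit default width is an assumption matching the EAX register.

```python
def twos_complement(value: int, bits: int = 32) -> int:
    """Return the unsigned bit pattern that represents value in the given width."""
    return value & ((1 << bits) - 1)


def invert_and_add_one(magnitude: int, bits: int = 32) -> int:
    """Negate a positive magnitude by flipping every bit and adding one."""
    mask = (1 << bits) - 1
    return ((~magnitude & mask) + 1) & mask


pattern = twos_complement(-76)
print(f"{pattern:032b}")  # binary representation
print(f"{pattern:08X}")   # hexadecimal representation
print(pattern == invert_and_add_one(76))
```

Compare the printed values with your own answer, then with the bytes that show up in the listing.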
12:11
So here is an example that I had run before. Again, you see the B8, which is move something into EAX, and then we have the number 01, which is the number one.
12:24
Additionally, we have commands that we can run in order to view data that's in an executable.
12:28
So, like a listing, xxd is going to give us raw hex. By default, it prints the actual data that's in the file. And it also can convert from its output back into binary if you need to.
12:46
So, for example, my executable is called 2.1, so I can say xxd
12:52
2.1
12:54
and it'll print all of the raw bytes, and we can actually see some ASCII characters
13:00
in our file
13:01
that represent different things that are included inside of the binary file,
13:07
so that's kind of interesting.
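To see what a raw hex view involves, here is a minimal Python sketch of a hex dump. It is loosely modeled on the offset, hex bytes, and ASCII columns described above and does not reproduce the exact layout of any particular tool. The function name hex_dump and the sample bytes are made up for this illustration.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as lines of offset, hex values, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


# Sample data: the four-byte ELF magic number followed by some readable text.
print(hex_dump(b"\x7fELF" + b"driver.c and asm_main"))
```

To try it on a real file, read the file in binary mode and pass the bytes to hex_dump.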
13:11
Additionally, we can run objdump, so it'll print off the disassembly of a binary program. And so if you have a binary program that you want to look at and you want to see the assembly of,
13:20
we can run objdump. So objdump, and then we are using Intel syntax, so you need to set the disassembler options to Intel.
13:30
The -d is to do disassembly, and then the a.out is the name of the program that you want to run it on.
13:43
So let's go ahead and run it on our program
13:50
and so you can scroll through here and you can look at the actual disassembly of that program
13:56
or the assembly
13:58
that a disassembler figured out. So a disassembler is just a program that is going to
14:05
go and convert all of the
14:07
functions and the data into assembly code.
14:11
So we can see here that we put in our numbers as negative numbers or positive numbers,
14:16
and the disassembly is going to show us that raw number and not necessarily
14:22
the number in base 10 that we're used to.
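The difference between the raw number and the base 10 value we typed can be shown with a short Python sketch. It takes the four immediate bytes for negative 10 as they are stored in the file and reads them back both ways; the variable names are made up for this example.

```python
# The four immediate bytes of 'mov eax, -10' as stored in the file.
raw = bytes.fromhex("f6ffffff")

unsigned = int.from_bytes(raw, "little", signed=False)
signed = int.from_bytes(raw, "little", signed=True)

print(hex(unsigned))  # 0xfffffff6, the raw pattern
print(unsigned)       # 4294967286, read as an unsigned number
print(signed)         # -10, read as two's complement
```

The bytes are identical in every case; only the interpretation changes.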
14:28
All right, so in summary, we learned about assembly language and assembly instructions,
14:33
identifiers and directives, and then creating a listing of a file and looking at the contents of a file that you might have that you want to look at.
14:43
Looking forward, we're going to talk about logical operators, segments and functions.
14:48
If you have questions, you can contact me at Miller MJ at UNK dot edu, or you can find me on Twitter at Milhouse 30.

Assembly

This course will provide background and information related to programming in assembly. Assembly is the lowest level programming language which is useful in reverse engineering and malware analysis.

Instructed By

Matthew Miller
Assistant Professor at the University of Nebraska at Kearney
Instructor