00:03
>> Hi, welcome to Cybrary. My name is Sean Pierce.
00:03
I'm the subject matter expert for
00:03
introduction to malware analysis.
00:03
Today, we're going to be covering
00:03
calling conventions in x86 assembly.
00:03
What exactly are calling conventions?
00:03
In cdecl, the C declaration convention, when a function call is made with parameters,
00:03
the assembly code will do a series of
00:03
pushes to push those parameters
00:03
onto the stack in reverse order.
00:03
Then it'll execute the call instruction,
00:03
and the call instruction has as its operand,
00:03
the second part of the instruction,
00:03
>> the location of the function being called.
00:03
The caller is responsible for cleaning up the stack.
00:03
The instruction right after
00:03
that call instruction is usually
00:03
a sub ESP instruction
00:03
where the stack is being cleaned up.
00:03
Because the caller had
00:03
just pushed a bunch of values onto the stack.
00:03
If it pushed three values
00:03
on the stack and popped three values,
00:03
that would work just as well.
00:03
I'm sorry, it's not sub ESP, it's add ESP.
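The cdecl sequence just described can be sketched as a toy model. This is only an illustration in Python, not real assembly: the stack is a list, ESP is a plain integer, and the starting address and argument names are made up.

```python
# Toy model of a 32-bit x86 stack for a cdecl call.
# The last list element is the top of the stack; all values are hypothetical.

def cdecl_call(args, esp=0x1000):
    stack = []
    start_esp = esp

    # Caller pushes the parameters in reverse order (right to left).
    for arg in reversed(args):
        esp -= 4
        stack.append(arg)

    # CALL pushes the return address, then jumps to the function.
    esp -= 4
    stack.append("return address")

    # ... the callee runs, then a plain RET pops the return address ...
    stack.pop()
    esp += 4

    # Back in the caller: ADD ESP, n removes the parameters it pushed.
    cleanup = 4 * len(args)
    esp += cleanup
    del stack[len(stack) - len(args):]

    return start_esp, esp, cleanup, stack


start, end, cleanup, stack = cdecl_call(["fmt", "var1", "var2"])
print(hex(cleanup), start == end, stack)
```

With three parameters the caller's cleanup is 12 bytes, and ESP ends up exactly where it started.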
00:03
The other major calling convention is standard call.
00:03
It's actually something made up by Microsoft.
00:03
It's how its API works.
00:03
It works pretty much the same way,
00:03
where it pushes the parameters in reverse order,
00:03
and the only difference is,
00:03
the callee is responsible for cleaning up the stack.
00:03
You'll see in a standard call
00:03
there is, push, push, push,
00:03
and then a call, and then right under the call there are
00:03
no other parameters and no other
00:03
>> stack instructions like
00:03
>> add ESP or add something to ESP,
00:03
or pop, pop, pop or whatever else.
00:03
Inside that function call,
00:03
that function cleans up the stack.
00:03
It does pop, pop, pop at the end or it'll do a RET
00:03
4 or whatever, where it is explicitly saying,
00:03
''I am going to fix the stack.''
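To make the difference concrete, here is a small sketch of where ESP sits on the instruction right after the call returns. It assumes a 32-bit stack with 4-byte slots; the function name and the starting address are hypothetical.

```python
def esp_after_return(num_args, ret_imm, esp=0x1000):
    """ESP as the caller sees it on the instruction right after CALL."""
    esp -= 4 * num_args   # the caller's pushes
    esp -= 4              # CALL pushes the return address
    esp += 4              # RET pops the return address...
    esp += ret_imm        # ...and RET n also adds n to ESP
    return esp


START = 0x1000
stdcall_esp = esp_after_return(4, 16, START)  # callee ends with RET 16
cdecl_esp = esp_after_return(4, 0, START)     # callee ends with a plain RET
print(hex(stdcall_esp), hex(cdecl_esp))
```

In the standard call case ESP is already back where it started, so the caller has nothing to fix. In the cdecl case the caller still owes 16 bytes of cleanup.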
00:03
You might think, that doesn't matter very much.
00:03
You're right, it doesn't,
00:03
the compiler just needs to choose one
00:03
and be consistent, or not.
00:03
For instance, it can do some pretty cool optimizations,
00:03
like with recursive function calls.
00:03
We're not going to look at some
00:03
of the more advanced cases of
00:03
how compilers can optimize code really well,
00:03
but you should know it is rather
00:03
expensive to push timewise,
00:03
to push something onto the stack or to pop it off.
00:03
We're going to look at an instance where
00:03
a compiler might choose to do it a different way.
00:03
>> For now, we're going to look at the difference between
00:03
>> a standard call and a C declaration call.
00:03
We can actually do both in the same program
00:03
with instructions right next to each other.
00:03
Then I'm going to show you the different ways that data
00:03
can be put onto the stack as
00:03
just a preference thing by different compilers.
00:03
We saw some differences between
00:03
GCC and Visual Studio last time.
00:03
I will show those really quickly again this time.
00:03
Here we have our Hello World function again.
00:03
This time we're going
00:03
to have two variables that are going to be
00:03
passed in to the printf function along
00:03
with another parameter, the format string.
00:03
It's just going to print Hello World.
00:03
Then right under that is
00:03
an API function call called MessageBoxA,
00:03
which just pops up a message box.
00:03
We're going to just debug this. I build it.
00:03
I'm going to go and generate the disassembly.
00:03
[NOISE] Look at it side-by-side if we want to.
00:03
Up here, we see push EBP,
00:03
move ESP into EBP and sub ESP.
00:03
These are the beginnings of most functions.
00:03
Most, or what we would call procedures or sub-routines,
00:03
is what the old terminology is for it.
00:03
Push EBP takes the current base pointer
00:03
and then pushes it onto the stack.
00:03
It's saving that value.
00:03
Then it takes the current ESP to the top of the stack,
00:03
and it puts it into EBP.
00:03
What you're essentially saying is,
00:03
I am going to save where
00:03
my current EBP is and then wherever my ESP is,
00:03
that's my new base of the frame.
00:03
That's where I'm going to start counting my values now.
00:03
As soon as this move happens, EBP and ESP,
00:03
they're pointing to the same location in memory.
00:03
Move isn't what
00:03
>> you might think of in terms of cut and paste,
00:03
>> move is more like copy.
00:03
It doesn't zero out anything.
00:03
When you have these two values equal the same thing,
00:03
you have a stack frame of no size.
00:03
Then we're subtracting 0D8 hex bytes from ESP.
00:03
We're extending our stack.
00:03
Now, we have D8 hex bytes
00:03
available on the stack that we can
00:03
use for storage of variables.
00:03
Mostly local variables.
00:03
That's pretty much [LAUGHTER]
00:03
only local variables should be stored on the stack.
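The three prologue instructions can be sketched as follows. This is a toy model in Python with registers in a dictionary and memory as a dictionary keyed by address; the register values are made up, and only the D8 frame size comes from the disassembly.

```python
def prologue(regs, stack, local_bytes):
    # push ebp: save the caller's base pointer on the stack
    regs["esp"] -= 4
    stack[regs["esp"]] = regs["ebp"]
    # mov ebp, esp: a copy, so both now hold the same address
    regs["ebp"] = regs["esp"]
    # sub esp, local_bytes: extend the stack to make room for locals
    regs["esp"] -= local_bytes
    return regs


regs = {"esp": 0x0019FF00, "ebp": 0x0019FF40}  # hypothetical starting values
stack = {}
old_ebp = regs["ebp"]
prologue(regs, stack, 0xD8)
frame_size = regs["ebp"] - regs["esp"]
print(hex(frame_size), hex(stack[regs["ebp"]]))
```

After the prologue the gap between EBP and ESP is the D8 bytes of local storage, and the old EBP is saved at the address EBP now points to.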
00:03
What does it do then? There's a push, push, push.
00:03
It's saving more registers onto the stack.
00:03
This is another type of convention.
00:03
It's not the caller or callee cleanup convention.
00:03
It's the compiler deciding that
00:03
the callee should save the registers.
00:03
The code cannot rely on the fact that,
00:03
if it makes a call instruction,
00:03
that its registers will still be the same values.
00:03
These are general purpose registers
00:03
like we were talking about last time.
00:03
That means they can be used by the code for anything.
00:03
You really shouldn't think that if you call
00:03
something or jump somewhere or do something,
00:03
that the registers will
00:03
still be the same value afterwards.
00:03
A lot of compilers use
00:03
this convention where it saves the registers
00:03
it's going to mess with onto the stack.
00:03
Then right before it's going to exit,
00:03
it'll pop them back off.
00:03
If you go down to the bottom, we
00:03
can see that it's popping
00:03
those original three values back into their registers.
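A rough sketch of that save and restore pattern is below. The register names EBX, ESI, and EDI are an assumption (the walkthrough only says push, push, push), and the clobber value is made up.

```python
def callee(regs):
    saved = []
    # push ebx / push esi / push edi: save whatever the caller had in them
    for name in ("ebx", "esi", "edi"):
        saved.append(regs[name])

    # The function body is free to use the registers for anything.
    regs["ebx"] = regs["esi"] = regs["edi"] = 0xDEADBEEF

    # pop edi / pop esi / pop ebx: restore in the reverse order of the pushes
    for name in ("edi", "esi", "ebx"):
        regs[name] = saved.pop()
    return regs


before = {"ebx": 1, "esi": 2, "edi": 3}
after = callee(dict(before))
print(after == before)
```

The pops have to mirror the pushes in reverse, because the last value pushed is the first one to come back off the stack.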
00:03
It doesn't know what's in those registers.
00:03
It may not matter, it may not
00:03
care. It's just a safety thing.
00:03
Just in case the code calling this one, main,
00:03
was relying on the fact that it could still use
00:03
those registers, which it shouldn't.
00:03
It's saving that. It's just being good.
00:03
Good compilers may optimize this out.
00:03
It's convention. The x86 official manual by Intel
00:03
doesn't expect that these registers will be saved,
00:03
but compilers try to save them anyway.
00:03
What's it doing next? There's LEA,
00:03
which is load effective address.
00:03
This is really a shortcut,
00:03
used a lot of the time,
00:03
to do some quick math.
00:03
Because you can combine multiple additions,
00:03
subtraction, multiplication operations together,
00:03
and it's pretty fast.
00:03
Here it's saying, ''What is EBP minus my stack size?''
00:03
Then it grabs that location,
00:03
the address itself rather than the value at
00:03
that location, and puts it into EDI.
00:03
What's it using it for? I don't really know just yet.
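The address arithmetic that LEA performs can be sketched like this. The helper name and the EBP value are hypothetical, and the second example (multiplying by five) is a common use of the instruction rather than something taken from this disassembly.

```python
def lea(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 32 bits. No memory is read."""
    assert scale in (1, 2, 4, 8)
    return (base + index * scale + disp) & 0xFFFFFFFF


ebp = 0x0019FEFC                          # hypothetical frame base
edi = lea(base=ebp, disp=-0xD8)           # lea edi, [ebp-0D8h]
times5 = lea(base=7, index=7, scale=4)    # lea eax, [eax+eax*4] with eax = 7
print(hex(edi), times5)
```

The point is that the result is just a number: one addition, one scaled multiply, and one displacement folded into a single instruction.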
00:03
>> The next instruction puts a value into a register,
00:03
and it's moving this value into EAX.
00:03
I know what this is now because since I built
00:03
this program using debug right up here,
00:03
it's adding some extra safety net type code.
00:03
If this code is executed by accident,
00:03
if we have a buffer overrun
00:03
or something goes wrong,
00:03
executing the stack code, it'll break here.
00:03
CC is the instruction INT 3,
00:03
which is interrupt 3,
00:03
which is software interrupt,
00:03
which means it will signal the
00:03
debugger that something has gone wrong.
00:03
This is a repeat string instruction.
00:03
That has to do with EDI,
00:03
with the destination register and the source register.
00:03
We can see here that
00:03
it's going to use the value at this address.
00:03
Then whatever ESI is
00:03
pointing to is generally
00:03
what the string operation will use,
00:03
or generally the value in ESI and EDI are
00:03
what these string instructions use.
00:03
These repeat instructions act as a loop.
00:03
Not a whole lot of compilers will produce
00:03
these instructions because they
00:03
like to have more control,
00:03
or they'd like to manipulate the logic
00:03
and their loops a bit more,
00:03
but the string operations were
00:03
>> made by Intel because they
00:03
>> noticed a lot of the processing power
00:03
their chips were doing was string operations.
00:03
They weren't that much
00:03
faster than the other ways compilers
00:03
figured out how to do them.
00:03
Not that common, but you still see them
00:03
>> every once in a while.
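Here is a minimal sketch of what a repeated store-string instruction does when it fills the frame with CC bytes. It assumes the direction flag is clear, that the repeat count lives in ECX, and that the count is simply the D8-byte frame divided into 4-byte values; those details are assumptions for illustration, not read off the screen.

```python
import struct


def rep_stosd(memory, edi, eax, ecx):
    """Store the 32-bit value EAX at [EDI], ECX times, advancing EDI by 4."""
    while ecx:
        struct.pack_into("<I", memory, edi, eax)
        edi += 4
        ecx -= 1
    return edi


frame = bytearray(0xD8)      # the D8-byte local area, modeled from offset 0
count = 0xD8 // 4            # how many 4-byte values fit in the frame
end = rep_stosd(frame, 0, 0xCCCCCCCC, count)
print(hex(count), end == len(frame), set(frame) == {0xCC})
```

One instruction with a repeat prefix stands in for the whole loop, which is why it acts like a loop even though there is no jump in the listing.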
00:03
>> We can see hello being stored into var1.
00:03
That's what this instruction down here,
00:03
where it's basically storing var1.
00:03
There's a reference here.
00:03
This is a memory reference
00:03
to wherever this string is stored in memory.
00:03
It's storing that address
00:03
>> into location of variable one.
00:03
>> I'm doing the same for variable two.
00:03
Here is really what I wanted to show you.
00:03
printf is taking three parameters.
00:03
The first one is going to be var2.
00:03
It's going to do this maneuver,
00:03
where it's going to move
00:03
the location into EAX and then push EAX.
00:03
A string is pretty much a character array.
00:03
At least if we're keeping it simple,
00:03
it's a character array of one byte ASCII values.
00:03
When we push from right
00:03
to left and we're pushing it in the opposite order,
00:03
we're going to go from var2 to var1.
00:03
We could have used EAX again,
00:03
but the compiler decided to store var1 in ECX,
00:03
and then it pushes ECX,
00:03
and then it pushes this last value.
00:03
This memory value must contain
00:03
a reference to this format string.
00:03
Then this call is performed,
00:03
where printf is then executed.
00:03
Then right after it,
00:03
we see an add ESP, 0C.
00:03
0C was the damage done to the stack by this push,
00:03
this push, and this push.
00:03
That was 12 bytes that were pushed,
00:03
four here, eight here, 12 here,
00:03
and 12 in hex is C.
00:03
0x0C is the standard way of writing it,
00:03
or you can put an H at the end of it.
00:03
This is pretty common.
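The byte counting above is simple enough to check in a few lines. This sketch assumes 4 bytes per push on 32-bit x86; the four-parameter count for MessageBoxA comes from the pushes shown later in the walkthrough.

```python
WORD = 4  # bytes per push on 32-bit x86

printf_args = 3       # format string, var1, var2
messagebox_args = 4   # window handle, text, caption, type

printf_cleanup = printf_args * WORD
messagebox_cleanup = messagebox_args * WORD

# The same number written three ways: decimal, 0x prefix, h suffix.
print(f"{printf_cleanup} = 0x{printf_cleanup:02X} = {printf_cleanup:02X}h")
print(f"{messagebox_cleanup} = 0x{messagebox_cleanup:02X}")
```

So the number after add ESP is a quick way to count how many parameters a cdecl function was given.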
00:03
This is the standard C libraries that it's using.
00:03
That's the cdecl calling convention.
00:03
Standard call, made by
00:03
Microsoft and pretty much only used by Microsoft,
00:03
not really standard at all
00:03
as it's called, does the opposite.
00:03
For this line of code,
00:03
we're going to do something similar.
00:03
I don't know why that instruction was produced.
00:03
It's going to do a push.
00:03
That was this null byte or zero.
00:03
Then it's going to do another push
00:03
for the reference to the string.
00:03
Another push for the reference to the string.
00:03
Then it's going
00:03
to do another push for this zero.
00:03
Then it's going to do a call.
00:03
Now, this call looks a little weird.
00:03
That's because it's stored
00:03
the location of this function at this location.
00:03
This ds is data segment and dword means 32-bit,
00:03
and PTR means pointer.
00:03
It's a 32-bit pointer in the data segment.
00:03
This is part of our IAT or our import address table.
00:03
If you don't know what that is, that's okay.
00:03
It's part of the PE file structure. It is important.
00:03
>> You should get to know it down the line.
00:03
>> For now, just know that
00:03
the location of this function is stored here.
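A toy model of that indirect call through the import address table might look like this. Both addresses are hypothetical; on a real system the loader writes the actual address of the function into the table when the program starts.

```python
# Hypothetical addresses; a real loader fills these in at load time.
IAT_SLOT = 0x0040A0B4
MESSAGEBOXA_ADDR = 0x75D80000

memory = {IAT_SLOT: MESSAGEBOXA_ADDR}        # the IAT entry holds a pointer
functions = {MESSAGEBOXA_ADDR: "MessageBoxA"}


def call_dword_ptr(slot):
    """Model of `call dword ptr ds:[slot]`: read the pointer, then call it."""
    target = memory[slot]
    return functions[target]


print(call_dword_ptr(IAT_SLOT))
```

The call instruction never names the function directly. It names the table slot, and the slot's contents decide where execution goes.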
00:03
We can see it a bit more elegantly
00:03
>> here in a minute.
00:03
>> Afterwards, there is no add ESP.
00:03
This function, this MessageBoxA
00:03
function, cleaned up the stack.
00:03
It at some point did
00:03
a minus 16 bytes or it
00:03
added 16 bytes to ESP to clean it up.
00:03
You might be curious to see what other code is here.
00:03
Your compilers will insert code for you.
00:03
A lot of that has to do with performance
00:03
or security checks to
00:03
make sure the stack isn't corrupted or something,
00:03
or to set exception handlers,
00:03
SEH handlers, especially in debugging code,
00:03
because I built it in debug mode.
00:03
It will produce extra security checks
00:03
and wrapper functions just to double-check things.
00:03
Then this last instruction,
00:03
this instruction is very, very common.
00:03
You should really get to know it. XOR EAX,
00:03
EAX will make EAX zero.
00:03
The return value is kept in EAX.
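A quick check of why XOR of a register with itself always gives zero, plus the usual reason compilers prefer it. The byte sequences shown are one standard encoding of each instruction; the sample register values are arbitrary.

```python
def xor32(a, b):
    return (a ^ b) & 0xFFFFFFFF


# Any value XORed with itself cancels out bit by bit.
for eax in (0, 1, 0xCCCCCCCC, 0xFFFFFFFF):
    assert xor32(eax, eax) == 0

XOR_EAX_EAX = bytes([0x33, 0xC0])                    # xor eax, eax
MOV_EAX_0 = bytes([0xB8, 0x00, 0x00, 0x00, 0x00])    # mov eax, 0
print(len(XOR_EAX_EAX), len(MOV_EAX_0))
```

Both instructions leave zero in EAX, but the XOR form is two bytes instead of five.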
00:03
Return zero means turn
00:03
EAX to zero and then use the RET instruction.
00:03
Here's the pop, pop,
00:03
pop that we saw earlier.
00:03
Another double-check to
00:03
make sure that the ESP is correct.
00:03
While we were in debugging mode,
00:03
we're trying to figure out why something isn't
00:03
working like a particular function,
00:03
third-party libraries aren't working,
00:03
you can work out maybe why.
00:03
As I was saying, when I mentioned these CC bytes before,
00:03
these are the CC bytes disassembled as instructions.
00:03
If you looked at this in a hex editor,
00:03
it'd just be CC, CC,
00:03
CC, CC, CC, CC, CC and etc.
00:03
This is also in case our code kept executing beyond
00:03
this return value or maybe a jump
00:03
was off and jumped down here or something like that.
00:03
It happens from time to time.
00:03
If we want to look at this
00:03
in a disassembler without an IDE,
00:03
if we didn't have the source code.
00:03
>> We often don't for malware,
00:03
we would find this very useful.
00:03
I'm going to go hello.
00:03
There's a project we made last time.
00:03
I'm going to go to debug
00:03
because that's what we were just looking at.
00:03
IDA is saying that it noticed
00:03
that there is some debugging information in there.
00:03
Does it want to retrieve the file?
00:03
I'm going to say no because
00:03
we usually don't have that file,
00:03
I'm going to say no, I don't really
00:03
like the proximity view.
00:03
This might seem pretty innocent.
00:03
You might click through it and try to find
00:03
where your few lines of code were.
00:03
Remember, when I was
00:03
saying there's a lot of added code,
00:03
a lot of protections built-in for the stack,
00:03
for performance to help debuggers, this is it.
00:03
Last time we went and viewed the strings,
00:03
and found the strings that we knew were referenced.
00:03
I'm going to show you another technique that's
00:03
very popular with reverse engineers and probably,
00:03
the most common technique, is that we
00:03
go view the imports.
00:03
These are all the functions that
00:03
our program has requested to call.
00:03
You might look at this and you're just like, "Hey,
00:03
I didn't call that function, QueryPerformanceCounter,"
00:03
or GetCurrentThreadId,
00:03
or any of that.
00:03
No, that's some of the debugging code and
00:03
just other boilerplate code that Microsoft
00:03
will insert in there when it's inserting
00:03
its C libraries or whatever else.
00:03
We called the function MessageBoxA.
00:03
I'm going to go there, I'm going to
00:03
see that it's at that location.
00:03
I can see where else
00:03
this memory address is referenced in memory.
00:03
Over on the right here,
00:03
you'll be able to see
00:03
that it's referenced in two places,
00:03
but if I want to get a list of all of them,
00:03
I can hit "X." I can see that it's called right here.
00:03
These two values are the same address.
00:03
If you look over at type,
00:03
one is a pointer and the other one is being read.
00:03
That's what the R stands for and the P stands for.
00:03
Below it, there's a jump,
00:03
>> something that is jumping there.
00:03
>> We're just going to go with where we
00:03
know this thing is being called.
00:03
If you remember, this function was
00:03
a thing built into this program by
00:03
the compiler and it was like check ESI
00:03
or ESP or whatever it was checking.
00:03
It was checking something about the stack
00:03
but we pretty much get
00:03
the same instructions that we
00:03
saw with the same CC values,
00:03
the same push, push,
00:03
push, push, push, push.
00:03
IDA doesn't actually know what the stack was used for.
00:03
These weren't assembly instructions,
00:03
but it's figured out that based on how
00:03
the assembly was referencing
00:03
the various values in the stack,
00:03
it said, "Okay, there's probably
00:03
three local variables in here."
00:03
It got that right, sometimes it gets it
00:03
wrong, especially with arrays
00:03
but it also figured out that
00:03
this function was MessageBoxA and it said,
00:03
"Okay, MessageBoxA according
00:03
to Microsoft, has these parameters.
00:03
It knows it's calling them or it's going
00:03
>> to push them onto the stack
00:03
>> from right to left in reverse order."
00:03
>> It said, "What were the last four pushes?"
00:03
It's like, the parameter to that is uType.
00:03
The parameter to this is caption,
00:03
the parameter to this is text,
00:03
and the parameter to this is window handle.
00:03
It's very kindly provided hints as to what
00:03
the parameters of this MessageBoxA
00:03
is but be warned that
00:03
sometimes the analysis gets a little messed
00:03
up and it might be a little off.
00:03
These are helpful, but you should
00:03
not just completely rely on them.
00:03
It's also very helpful that it
00:03
put a little comment here and said, "Hey,
00:03
this is referencing a pointer
00:03
and that pointer is pointing to an ASCII string."
00:03
It's going to guess that this is
00:03
a string and it names it
00:03
A for ASCII and then whatever the string is.
00:03
That's very helpful.
00:03
We can also rename things in IDA.
00:03
If I click this and hit "N,"
00:03
I can rename this variable to var 1.
00:03
I can rename this variable to var
00:03
2 and then I'm going to just name this format string.
00:03
I know this is printf.
00:03
I saw it in my debugger and that's what it said in my source code,
00:03
but this is showing up as sub,
00:03
short for subroutine and then
00:03
a number which is the memory address it's going to.
00:03
I can click on it and then scroll
00:03
down and get a peek of what's there.
00:03
It looks like there's just a jump there.
00:03
I can follow this jump by double-clicking on
00:03
the label and I can find all this other stuff.
00:03
I like to highlight the calls and jumps,
00:03
and I can see that it's
00:03
calling into several other things.
00:03
I'm just thinking, is this printf?
00:03
Is this something else?
00:03
Microsoft doesn't actually include
00:03
the C libraries that are standard.
00:03
Microsoft implemented its own C runtime libraries.
00:03
Usually, they're bundled with
00:03
Visual C runtime libraries or
00:03
VC runtime or whatever.
00:03
There's a lot of programming
00:03
libraries out there that programmers use.
00:03
They're pretty standard, but this is not it.
00:03
This is more debugging code
00:03
and it's also more code to check certain things.
00:03
It checks whether the parameters that you gave printf were safe,
00:03
>> because there have been vulnerabilities,
00:03
>> but they can't change the function
00:03
calls because maybe that
00:03
might break old software if you try to recompile it.
00:03
We know somewhere down this line there's going to be
00:03
a printf or equivalent thereof.
00:03
I happen to know what exactly this function call is.
00:03
I'm going to go over and look in the imports tab.
00:03
It's going to be underscore underscore
00:03
something because that's generally what they name it.
00:03
I can search by hitting
00:03
Control F and then just type in print,
00:03
and I can see it right here towards the bottom,
00:03
but I'm going to type in print anyway.
00:03
There's two function calls,
00:03
__stdio_common_vfprintf,
00:03
and then above it is sprintf_s. The compiler is
00:03
doing its best to protect us from
00:03
being dumb programmers, and that's very nice.
00:03
If we hit "Escape," we can back out of
00:03
our analysis and we can get to our original stuff.
00:03
You'll notice that this function wasn't named main.
00:03
With the X, you can say,
00:03
what has a reference to this location?
00:03
You can see there's a jump.
00:03
What has a reference to this location?
00:03
Jump further back, what
00:03
has a reference to this location,
00:03
and then you can keep going like that.
00:03
>> Meant to click that, hit "X" on that,
00:03
and see what calls that function.
00:03
We can see there's more checking code for argv,
00:03
argc, and preparation code for us,
00:03
and we can jump to see what calls this,
00:03
and we are in another function that calls this.
00:03
If I was curious to see what functions this calls,
00:03
I can go under View and then Graphs,
00:03
or I can right-click and say Xrefs graph from.
00:03
I can see that this function will call this function,
00:03
and then this function, and then this function.
00:03
That can be useful if you're trying to get
00:03
an idea of what a function
00:03
does because you can use
00:03
this graphing recursively and you can say,
00:03
"This function has a bunch of functions it calls.
00:03
What are they? How many of them are libraries?"
00:03
Because the pink labeled function calls or
00:03
function names are those libraries or functions
00:03
that IDA has referenced or IDA has recognized.
00:03
If we wanted to see this actually
00:03
executing and we didn't have the original source code,
00:03
we can always crack open OllyDbg, the debugger.
00:03
>> Here Olly is loading in the libraries
00:03
>> and analyzing them.
00:03
>> We might not always know exactly where our code is.
00:03
Just similarly to us
00:03
not knowing where this code was in IDA.
00:03
We might not know where the code we want is in OllyDbg.
00:03
We can do the same thing where we can right-click and
00:03
then search for all intermodular calls.
00:03
That basically looks for call instructions.
00:03
We can find the call instructions to MessageBoxA.
00:03
We could go there and see
00:03
the same push call and set a breakpoint.
00:03
We can do that with F2.
00:03
This will set a soft breakpoint.
00:03
Behind the scenes it's actually putting
00:03
an INT 3 instruction there or a CC byte.
00:03
Then when this code gets executed,
00:03
it will stop there and then the debugger will take
00:03
over and then replace
00:03
that byte with the original byte that was there,
00:03
and execute as if nothing had ever happened.
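The byte swap behind a soft breakpoint can be sketched in a few lines. This is a simplified model, not how OllyDbg is actually implemented: the class and method names are hypothetical, and a bytearray stands in for the process's code.

```python
INT3 = 0xCC


class SoftBreakpoints:
    def __init__(self, code):
        self.code = code      # bytearray standing in for process memory
        self.saved = {}       # address -> original byte

    def set(self, addr):
        # Remember the real byte, then overwrite it with INT 3.
        self.saved[addr] = self.code[addr]
        self.code[addr] = INT3

    def hit(self, addr):
        """Called when INT 3 fires: put the original byte back so it can run."""
        self.code[addr] = self.saved.pop(addr)


code = bytearray([0x55, 0x8B, 0xEC])   # push ebp / mov ebp, esp
bp = SoftBreakpoints(code)
bp.set(0)
patched = bytes(code)
bp.hit(0)
print(patched.hex(), bytes(code).hex())
```

This is also why malware can sometimes detect a debugger: while the breakpoint is set, the first byte of the instruction really is CC in memory.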
00:03
I'm going to put a breakpoint here where the entrance
00:03
to this main function
00:03
is which Olly was kind enough to recognize.
00:03
I'm going to remove this breakpoint,
00:03
I'm going to hit "Play" and it ran,
00:03
and it has now stopped at my breakpoint.
00:03
Now, I can do the same step
00:03
over that I could with Visual Studio's debugger.
00:03
Now, this is important.
00:03
My breakpoint is still there,
00:03
>> so I'm going to hit "Play" and
00:03
>> it's going to go breakpoint.
00:03
>> The first instruction was push EBP,
00:03
the next one was move EBP, ESP.
00:03
As I was saying earlier,
00:03
this is important because
00:03
[NOISE] here the stack is going to be manipulated.
00:03
This thing that's at the top of the stack,
00:03
AKA a lower memory address,
00:03
is the return address from the previous caller.
00:03
The previous function that called this one did a call, and
00:03
it pushed EIP onto the stack.
00:03
EIP was going to execute this instruction next,
00:03
but instead the call forced it to come to this address.
00:03
Now, I'm going to make my own stack here.
00:03
How do I do that safely? The first step
00:03
is, [NOISE] push EBP.
00:03
I save the current base pointer.
00:03
I'm going to move ESP to EBP.
00:03
We can see EBP was just overwritten here,
00:03
and is now the same value as ESP.
00:03
[NOISE] Now, I'm going to make room on the stack.
00:03
I step, and it took D8 bytes.
00:03
The stack has now allocated all of this for
00:03
us to use for local variables, and
00:03
>> see how they're all filled with CCs.
00:03
>> That was the debugging code that
00:03
was added in because
00:03
we didn't say this was a release build ready for use.
00:03
This thing probably has
00:03
bugs and we're going to want to fix them.
00:03
Just in case any of this code gets executed,
00:03
the debugger will kick in and say,
00:03
oh, something wrong happened.
00:03
Now, we have a local stack frame.
00:03
[NOISE] Now, we're about to execute another function.
00:03
Now, we're saving the registers.
00:03
Push. It doesn't know what was in them, doesn't care.
00:03
It's just trying to make sure
00:03
that those values are now saved.
00:03
This is important, where the string
00:03
hello is being stored in a local variable on the stack.
00:03
We can do step over.
00:03
Now, we can see a reference to
00:03
the hello string is right here.
00:03
Step over again and now world is again being used.
00:03
Now, if we were debugging the release version,
00:03
these CC bytes would not be here.
00:03
Step over is going to push EAX,
00:03
it's going to push world and then it's going to push
00:03
the format string, and OllyDbg
00:03
was nice enough to say, "Oh,
00:03
printf? Okay, you at least have to have
00:03
one parameter in this format string," because it
00:03
doesn't know how many parameters are
00:03
going to get pushed into
00:03
the printf function call or pushed as parameters.
00:03
Now, I'm going to make this call.
00:03
I'm just going to step over.
00:03
I could step in and watch it
00:03
actually do exactly what it says it's going to do.
00:03
I can even flip over to
00:03
the tab and see, oh, it did print out.
00:03
We notice that the values are still on the stack here.
00:03
Now, as I step over,
00:03
>> the next instruction did this add ESP, 0C.
00:03
>> Now, the stack frame has decreased.
00:03
The data is still there on the stack but
00:03
the extended stack pointer ESP is now pointed here.
00:03
It's now pointing at lower. As far
00:03
as the compiler is concerned,
00:03
as far as all these instructions are concerned,
00:03
this is now the top of the stack.
00:03
These other values don't
00:03
matter, they don't care about them.
00:03
They're going to overwrite them
00:03
>> if they do anything else,
00:03
>> they no longer exist.
00:03
It's going to push 0,
00:03
and then call MessageBoxA.
00:03
Watch carefully because while
00:03
ESP is up here as soon as we're going to call it,
00:03
the callee function is going to clean up the stack.
00:03
[NOISE] New modules are
00:03
being loaded and Olly is
00:03
analyzing them and disassembling all the code.
00:03
Those are some pretty big DLLs.
00:03
Give it a minute. Now, there's a pop-up
00:03
>> that's happened and that's what this function does.
00:03
>> I hit "Okay," now that function finished.
00:03
You can notice the stack has been cleaned up.
00:03
It didn't need to do that add ESP.
00:03
That's pretty much the major difference between
00:03
those calling conventions and
00:03
you should be aware of them.
00:03
I don't really care what
00:03
happens to the rest of this function.
00:03
I'm just going to terminate the process and
00:03
take a quick look at the release version.
00:03
>> It's stopped in what looks to be
00:03
simply some slightly different boilerplate code.
00:03
Here's the SEH exception handling code.
00:03
We're going to find the intermodular calls:
00:03
find references to,
00:03
search for intermodular calls.
00:03
Go to MessageBoxA, see Hello World.
00:03
Looks like it's calling printf from
00:03
there. It looks about right.
00:03
[NOISE] Here's some more code
00:03
up here and separating the code is INT 3,
00:03
just in case something overran where it was supposed to go,
00:03
and we can see it doing this stack manipulation.
00:03
It didn't subtract nearly as much.
00:03
[NOISE] There it's doing some checking.
00:03
Here's where it's pushing the parameters to printf
00:03
and that special printf function
00:03
that Microsoft made, that's a little safer.
00:03
[NOISE] It does some more parameter
00:03
pushing for another little
00:03
function call it's going to make,
00:03
which means that it printed to the screen.
00:03
Now, it's fixing up the stack [NOISE] and it's done.
00:03
You'll notice it says add ESP 06
00:03
here [NOISE] and it's cleaning up the stack.
00:03
Now, we're about to start
00:03
pushing for the next function call,
00:03
which is MessageBoxA.
00:03
Push, call, and I'm going to switch
00:03
over to wherever it made the call. There it is.
00:03
Then it's ready to end this function,
00:03
and the easy, quick way to do it is
00:03
XOR EAX, EAX, as I said before.
00:03
I'm pretty much done with this.
00:03
I'm just going to exit out.
00:03
[NOISE] We noticed that
00:03
the compiler in Visual Studio changed
00:03
what functions it was calling based on the release and
00:03
what wrapper functions it had.
00:03
Cygwin uses the GCC compiler,
00:03
the GNU C compiler is what that stands for, and
00:03
[NOISE] it will produce different code as well.
00:03
Every compiler has its own little
00:03
[NOISE] quirks from the programmers that made it, so it's important.
00:03
Here I've demonstrated that I've compiled this
00:03
>> program just like I did in Visual Studio.
00:03
>> Now, I'm going to go find
00:03
that program Cygwin, home, Sean, a.exe.
00:03
>> I'm going to crack it open in IDA and
00:03
>> see what code it produced.
00:03
It inserted a lot of extra code.
00:03
This looks to be some parameter checking,
00:03
but I'm going to go and find
00:03
the code that I'm most interested in.
00:03
I see MessageBoxA right there.
00:03
I'm going to go and hit "X". I'm going say,
00:03
okay, it's referenced in two places.
00:03
One is a call instruction
00:03
and the other is a move instruction.
00:03
The call instruction is what I'm interested in.
00:03
This is something that
00:03
I really want to show you in that,
00:03
like I said, push instructions
00:03
are very expensive time-wise.
00:03
Some compilers have moved away from them.
00:03
The GCC compiler in this case,
00:03
instead of pushing onto the stack,
00:03
simply does a move.
00:03
It moves the variable
00:03
onto the stack, because you can do that.
00:03
You can just move your register or the pointer to
00:03
the string and the pointer is stored in
00:03
var_C. You can just move that into the stack.
00:03
This way it doesn't have to push,
00:03
but it does have to make sure the stack is
00:03
the right size by the time it calls printf.
00:03
It's a bit more advanced in how it's trying to do things.
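To see that the two styles put the same data in the same place, here is a small comparison. It is a toy model with memory as a dictionary keyed by address; the function names, starting ESP, and argument names are made up, and real GCC output usually reserves the space once in the function prologue rather than before each call.

```python
def with_pushes(args, esp, mem):
    # Visual Studio style: one PUSH per parameter, right to left.
    for arg in reversed(args):
        esp -= 4
        mem[esp] = arg
    return esp


def with_moves(args, esp, mem):
    # GCC style: reserve the space once, then MOV each value into its slot.
    esp -= 4 * len(args)              # sub esp, n
    for i, arg in enumerate(args):
        mem[esp + 4 * i] = arg        # mov [esp + 4*i], arg
    return esp


args = ["fmt", "var1", "var2"]
mem_a, mem_b = {}, {}
esp_a = with_pushes(args, 0x1000, mem_a)
esp_b = with_moves(args, 0x1000, mem_b)
print(esp_a == esp_b, mem_a == mem_b)
```

The function being called cannot tell the difference. It finds its first parameter just above the return address either way.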
00:03
Similarly, when it calls MessageBoxA,
00:03
it does another little trick,
00:03
which malware will do frequently,
00:03
in which they'll have a function pointer
00:03
and store that function pointer in
00:03
a register and then call that register.
00:03
This makes it very hard for disassemblers to
00:03
figure out what functions are being
00:03
called from where unless
00:03
it's more intelligent like IDA Pro.
00:03
Even then IDA Pro will frequently
00:03
not catch this whole trick.
00:03
Afterwards, you'll see sub ESP.
00:03
It's not cleaning up after
00:03
this function call because
00:03
MessageBoxA will clean up after itself.
00:03
It's cleaning up for all the instructions before it.
00:03
These two function calls.
00:03
GCC is trying to be a bit faster than
00:03
Microsoft Visual Studio compiler.
00:03
That's it for this demo.
00:03
Thank you for watching. We covered
00:03
standard call versus cdecl.
00:03
We looked at different ways code is
00:03
generated and we stepped through a lot of assembly there.
00:03
We saw how compilers will choose to do certain things.
00:03
How in one instance,
00:03
a lot of push instructions were used to get data onto
00:03
the stack and then in another instance with GCC,
00:03
it'll just move the data onto the stack
00:03
instead of pushing, and then it'll
00:03
clean up the stack when it's done with it.
00:03
Next time, we're going to do some stack analysis on
00:03
some actual malware. Hope to see you there.