Basic Static Analysis Part 3

Video Activity


Time
9 hours 10 minutes
Difficulty
Advanced
Video Description

Welcome to Basic Static Analysis - Part 3, which covers calling conventions. We'll begin by explaining what calling conventions are. In a cdecl (C declaration) call, the caller is responsible for cleaning up the stack. In a stdcall (standard call), the convention Microsoft defined and uses for its Windows API, the callee is responsible for cleaning up the stack. The hands-on demonstration in this module covers the following:

  • The key differences between a cdecl call and stdcall.

  • Different ways assembly code is generated.

  • Several assembly instructions, such as PUSH, and how they work.

  • How compilers, debuggers, and disassemblers work with assembly.

Video Transcription
00:03
>> Hi, welcome to Cybrary. My name is Sean Pierce.
00:03
I'm the subject matter expert for
00:03
introduction to malware analysis.
00:03
Today, we're going to be covering
00:03
calling conventions and x86 assembly.
00:03
What exactly are calling conventions?
00:03
In simple terms,
00:03
when a function call is made with parameters,
00:03
the assembly code will do a series of
00:03
pushes to push those parameters
00:03
onto the stack in reverse order.
00:03
Then it'll execute the call instruction,
00:03
and the call instruction's operand,
00:03
the second part of the instruction, is
00:03
the location of the function being called.
00:03
In most code,
00:03
the caller is responsible for cleaning up the stack.
00:03
The instruction right after
00:03
that call instruction is usually
00:03
an add ESP instruction,
00:03
where the stack is being cleaned up,
00:03
because the caller had
00:03
just pushed a bunch of values onto the stack.
00:03
If it pushed three values
00:03
onto the stack and then popped three values,
00:03
that would work just as well.
00:03
The other major calling convention is standard call, or stdcall.
00:03
It's actually something made up by Microsoft.
00:03
It's how its API works.
00:03
It works pretty much the same way,
00:03
where it pushes the parameters in reverse order,
00:03
and the only difference is that
00:03
the callee is responsible for cleaning up the stack.
00:03
You'll see in a standard call
00:03
there is push, push, push,
00:03
and then a call, and right under the call there are
00:03
no other
00:03
stack instructions like
00:03
add ESP, or add something to ESP,
00:03
or pop, pop, pop, or whatever else.
00:03
Inside that function call,
00:03
that function cleans up the stack.
00:03
It does pop, pop, pop at the end, or it'll do a RET 4
00:03
or whatever, where it is explicitly saying,
00:03
"I am going to fix the stack."
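The difference between the two conventions can be sketched with a toy stack model. This is only an illustration, not code from the demo: `ToyStack`, `call_cdecl`, and `call_stdcall` are hypothetical names, the starting ESP of 0x1000 is made up, and each pushed value is assumed to take 4 bytes, as on 32-bit x86.

```python
WORD = 4  # bytes per stack slot on 32-bit x86


class ToyStack:
    """Minimal model: ESP starts high and moves down on every push."""

    def __init__(self, esp=0x1000):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= WORD
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem.pop(self.esp)
        self.esp += WORD
        return value


def call_cdecl(stack, args):
    for arg in reversed(args):          # push parameters right to left
        stack.push(arg)
    stack.push("return address")        # CALL pushes the return address
    stack.pop()                         # callee: plain RET pops only that
    cleanup = WORD * len(args)
    stack.esp += cleanup                # caller: add esp, cleanup
    return f"caller: add esp, {cleanup:#x}"


def call_stdcall(stack, args):
    for arg in reversed(args):          # same right-to-left pushes
        stack.push(arg)
    stack.push("return address")
    stack.pop()                         # callee: RET N pops the return address...
    cleanup = WORD * len(args)
    stack.esp += cleanup                # ...and also releases N bytes of parameters
    return f"callee: ret {cleanup:#x}"


stack = ToyStack()
start = stack.esp
print(call_cdecl(stack, ["format", "var1", "var2"]), stack.esp == start)
print(call_stdcall(stack, [0, "text", "caption", 0]), stack.esp == start)
```

In both cases ESP ends up where it started; the only thing that changes is which side of the call performs the adjustment.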
00:03
You might think that doesn't matter very much.
00:03
You're right, it doesn't;
00:03
the compiler just needs to choose one
00:03
and be consistent, or not.
00:03
For instance, it can do some pretty cool optimizations,
00:03
like with recursive function calls.
00:03
We're not going to look at some
00:03
of the more advanced cases of
00:03
how compilers can optimize code really well,
00:03
but you should know it is rather
00:03
expensive, timewise,
00:03
to push something onto the stack or to pop it off.
00:03
We're going to look at an instance where
00:03
a compiler might choose to do it a different way.
00:03
For now, we're going to look at the difference between
00:03
a standard call and a C declaration call.
00:03
We can actually do both in the same program
00:03
with instructions right next to each other.
00:03
Then I'm going to show you the different ways that data
00:03
can be put onto the stack, which is
00:03
just a preference thing for different compilers.
00:03
We saw some differences between
00:03
GCC and Visual Studio last time.
00:03
I will show those really quickly again this time.
00:03
Here we have our Hello World function again.
00:03
This time we're going
00:03
to have two variables that are going to be
00:03
passed in to the printf function along
00:03
with another parameter, the format string.
00:03
It's just going to print Hello World.
00:03
Then right under that is
00:03
an API function call to MessageBoxA,
00:03
which just pops up a message box.
00:03
We're going to just debug this. I build it.
00:03
I'm going to go and generate the disassembly.
00:03
[NOISE] We can look at it side by side if we want to.
00:03
Up here, we see push EBP,
00:03
move ESP into EBP, and sub ESP.
00:03
These are the beginnings of most functions,
00:03
or what we would call procedures or subroutines,
00:03
which is the old terminology for them.
00:03
Push EBP takes
00:03
the current base pointer
00:03
and pushes it onto the stack.
00:03
It's saving that value.
00:03
Then it takes the current ESP, the top of the stack,
00:03
and it puts it into EBP.
00:03
What you're essentially saying is,
00:03
I am going to save where
00:03
my current EBP is and then wherever my ESP is,
00:03
that's my new base of the frame.
00:03
That's where I'm going to start counting my values now.
00:03
As soon as this move happens, EBP and ESP
00:03
are the same value;
00:03
they're pointing to the same location in memory.
00:03
Move isn't what
00:03
you might think of in terms of cut and paste;
00:03
move is more like copy.
00:03
It doesn't zero out anything.
00:03
When you have these two values equal the same thing,
00:03
you have a stack frame of no size.
00:03
Then we're subtracting 0D8 hex bytes from ESP.
00:03
We're extending our stack.
00:03
Now, we have D8 hex bytes
00:03
available on the stack that we can
00:03
use for storage of variables.
00:03
Mostly local variables.
00:03
That's pretty much [LAUGHTER]
00:03
only local variables should be stored on the stack.
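The three prologue instructions can be traced with a small register model. This is a sketch under stated assumptions: the starting ESP and EBP values are made up for illustration, and only the 0xD8 frame size comes from the disassembly discussed above.

```python
# Toy trace of: push ebp / mov ebp, esp / sub esp, 0D8h
regs = {"esp": 0x0019FF00, "ebp": 0x0019FF40}   # made-up starting values
stack = {}

# push ebp: save the caller's base pointer on the stack
regs["esp"] -= 4
stack[regs["esp"]] = regs["ebp"]

# mov ebp, esp: copy ESP into EBP (ESP itself is unchanged)
regs["ebp"] = regs["esp"]
frame_size_before = regs["ebp"] - regs["esp"]    # a frame of no size

# sub esp, 0D8h: the stack grows downward, so subtracting reserves space
regs["esp"] -= 0xD8
frame_size_after = regs["ebp"] - regs["esp"]

print(frame_size_before, hex(frame_size_after))
```

The saved base pointer sits at the address EBP now holds, which is what lets the function restore the caller's frame before it returns.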
00:03
What does it do then? There's a push, push, push.
00:03
It's saving more registers onto the stack.
00:03
This is another type of convention.
00:03
It's not the caller/callee cleanup convention.
00:03
It's the compiler deciding that
00:03
the callee should save the registers.
00:03
The code cannot rely on the fact that,
00:03
if it makes a call instruction,
00:03
that its registers will still be the same values.
00:03
These are general purpose registers
00:03
like we were talking about last time.
00:03
That means they can be used by the code for anything.
00:03
You really shouldn't think
00:03
that if you call
00:03
something or jump somewhere or do something,
00:03
that the registers will
00:03
still be the same value afterwards.
00:03
A lot of compilers use
00:03
this convention where the function saves the registers
00:03
it's going to mess with on the stack.
00:03
Then right before it's going to exit,
00:03
it'll pop them back off.
00:03
If you go down to the bottom, we
00:03
can see that it's popping
00:03
those original three values back into their registers.
00:03
It doesn't know what's in those registers.
00:03
It may not matter, it may not
00:03
care; it's just a safety thing.
00:03
Just in case the code calling this one, main,
00:03
was relying on the fact that it could still use
00:03
those registers, which it shouldn't.
00:03
It's saving that. It's just being good.
00:03
Good compilers may optimize this out.
00:03
It's convention, and in the official x86 manual by Intel,
00:03
they say do not
00:03
expect that these registers will be saved,
00:03
but compilers try to save them anyway.
00:03
What's it doing next? There's LEA,
00:03
which is load effective address.
00:03
This is really a shortcut,
00:03
used a lot of the time
00:03
to do some quick math,
00:03
because you can combine multiple addition,
00:03
subtraction, and multiplication operations together,
00:03
and it's pretty fast.
00:03
In this context,
00:03
it's saying, "What is EBP minus my stack size?"
00:03
Then it takes that address,
00:03
not the value
00:03
stored at
00:03
that location, and puts it into EDI.
00:03
What's it using it for? I don't really know just yet.
00:03
The next instruction puts the value
00:03
36 hex into ECX,
00:03
and then it's moving this value into EAX.
00:03
I know what this is now: since I built
00:03
this program using the Debug configuration right up here,
00:03
it's adding some extra safety-net type code.
00:03
If this memory is executed by accident,
00:03
if we have a buffer overrun
00:03
or something goes wrong,
00:03
and EIP starts
00:03
executing from the stack, it'll break here.
00:03
CC is the instruction INT 3,
00:03
which is interrupt 3,
00:03
which is a software interrupt,
00:03
which means it will signal the
00:03
debugger that something has gone wrong.
00:03
This is a repeat string instruction.
00:03
That has to do with EDI,
00:03
or rather it has to do
00:03
with the destination register and the source register.
00:03
We can see here that
00:03
it's going to use the value at this address.
00:03
Whatever ESI is
00:03
pointing to is generally
00:03
what the string operation will read from,
00:03
so generally the values in ESI and EDI are
00:03
what these string instructions use.
00:03
These repeat instructions act as a loop.
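The loop behavior of the repeat prefix can be sketched as follows. This is a toy model rather than the demo's code: `rep_stosd` is a hypothetical name, the fill value 0xCCCCCCCC is assumed from the CC bytes discussed above, and `edi` is an index into a Python `bytearray` instead of a real address.

```python
def rep_stosd(memory, edi, ecx, eax):
    """Model of `rep stos dword ptr es:[edi]` with the direction flag clear:
    store EAX at [EDI], advance EDI by 4, decrement ECX, stop when ECX is 0."""
    while ecx != 0:
        memory[edi:edi + 4] = eax.to_bytes(4, "little")
        edi += 4
        ecx -= 1
    return edi, ecx


frame = bytearray(0xD8)   # the 0xD8 bytes reserved by sub esp
edi, ecx = rep_stosd(frame, edi=0, ecx=0x36, eax=0xCCCCCCCC)
print(hex(edi), ecx, frame[:8].hex())
```

The count in ECX lines up with the frame size: 0x36 stores of 4 bytes each cover exactly 0xD8 bytes.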
00:03
Not a whole lot of compilers will produce
00:03
these instructions because they
00:03
like to have more control,
00:03
or they'd like to manipulate the logic
00:03
and their loops a bit more,
00:03
but the string operations were
00:03
made by Intel because they
00:03
noticed a lot of the processing power
00:03
their chips were using went to string operations.
00:03
They weren't that much
00:03
faster than the other ways compilers
00:03
figured out how to do them.
00:03
Not that common, but you still see them
00:03
every once in a while.
00:03
We can see hello being stored into var1.
00:03
That's what this instruction down here does,
00:03
where it's basically storing var1.
00:03
There's a reference here.
00:03
This is a memory reference
00:03
to wherever this string is stored in memory.
00:03
It's storing that address
00:03
into the location of variable one.
00:03
It's doing the same for variable two.
00:03
Here is really what I wanted to show you.
00:03
printf is taking three parameters.
00:03
The first one is going to be var2.
00:03
It's going to do this maneuver,
00:03
where it's going to move
00:03
the location into EAX and then push EAX.
00:03
This is a string,
00:03
and a string is pretty much a character array.
00:03
At least if we're keeping it simple,
00:03
it's a character array of one byte ASCII values.
00:03
Since we push from right
00:03
to left, in the opposite order,
00:03
we're going to go from var2 to var1.
00:03
We could have used EAX again,
00:03
but the compiler decided to store var1 in ECX,
00:03
and then it pushes ECX,
00:03
and then it pushes this last value.
00:03
This memory value must contain
00:03
a reference to this format string.
00:03
Then this call is performed,
00:03
where printf is then executed.
00:03
Then right after it,
00:03
we see an add ESP, 0C.
00:03
0C was the damage done to the stack by this push,
00:03
this push, and this push.
00:03
That was 12 bytes that were pushed:
00:03
four here, eight here, 12 here,
00:03
and 12 in hex is C.
00:03
0x0C is the standard way of writing it.
00:03
Also,
00:03
if you put an h at the end of it,
00:03
we know it's hex.
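As a quick check of that arithmetic, here is a short sketch. The variable names are illustrative only; the three pushes of four bytes each come from the printf call described above.

```python
pushes = 3            # var2, var1, and the format string
bytes_per_push = 4    # each push moves ESP by 4 on 32-bit x86
cleanup = pushes * bytes_per_push

c_style = f"0x{cleanup:02X}"     # the 0x prefix form
asm_style = f"{cleanup:02X}h"    # the trailing-h form seen in disassembly
print(cleanup, c_style, asm_style)
```

Both spellings name the same number, so `add esp, 0Ch` undoes exactly the three pushes.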
00:03
That was cdecl.
00:03
This is pretty common.
00:03
This is what the standard C libraries use.
00:03
That's the calling convention.
00:03
Standard call, made by
00:03
Microsoft and pretty much only used by Microsoft,
00:03
not really standard at all
00:03
despite what it's called, does the opposite.
00:03
For this line of code,
00:03
we're going to do something similar.
00:03
I don't know why that instruction was produced.
00:03
It's going to do a push.
00:03
That was this null value, or zero.
00:03
Then it's going to do another push
00:03
for the reference to this string,
00:03
another push for the reference to the other string,
00:03
and then it's going
00:03
to do another push for this other zero.
00:03
Then it's going to do a call.
00:03
Now, this call looks a little weird.
00:03
That's because it's stored
00:03
the location of this function at this location.
00:03
This ds is data segment and dword means 32-bit,
00:03
and PTR means pointer.
00:03
It's a 32-bit pointer in the data segment.
00:03
This is part of our IAT or our import address table.
00:03
If you don't know what that is, that's okay.
00:03
It's part of the PE file structure. It is important.
00:03
>> You should get to know it down the line.
00:03
>> For now, just know that
00:03
the location of this function is stored here.
00:03
We can see it a bit more elegantly
00:03
in here in a minute.
00:03
Afterwards, there is no add ESP.
00:03
This function, this MessageBoxA
00:03
function, cleaned up the stack.
00:03
At some point, it
00:03
added 16 bytes, four bytes for each of the four pushes,
00:03
to ESP to clean it up.
00:03
You might be curious to see what other code is here.
00:03
Your compilers will insert code for you.
00:03
A lot of that has to do with performance
00:03
or security checks to
00:03
make sure the stack isn't corrupted or something,
00:03
or to set exception handlers,
00:03
SEH handlers, especially in debugging code,
00:03
because I built it in debug mode.
00:03
It will produce extra security checks
00:03
and wrapper functions just to double-check things.
00:03
Then this last instruction,
00:03
this instruction is very, very common.
00:03
You should really get to know it: XOR.
00:03
XOR EAX,
00:03
EAX will make EAX zero.
00:03
The return value is kept in EAX.
00:03
Return zero means set
00:03
EAX to zero and then use the RET instruction,
00:03
which is down here.
00:03
Here's the pop, pop,
00:03
pop that we saw earlier.
00:03
Another double-check
00:03
to make sure that ESP is correct.
00:03
While we're in debugging mode,
00:03
if we're trying to figure out why something isn't
00:03
working, like a particular function
00:03
or a third-party library,
00:03
we can maybe work out why.
00:03
As I was saying when I mentioned these CC bytes before,
00:03
these are the CC bytes disassembled as instructions.
00:03
If you looked at this in a hex editor,
00:03
it'd just be CC, CC,
00:03
CC, CC, CC, CC, CC, etc.
00:03
This is also in case our code kept executing beyond
00:03
this return value or maybe a jump
00:03
was off and jumped down here or something like that.
00:03
It happens from time to time.
00:03
If we want to look at this
00:03
in a disassembler without an IDE,
00:03
we can with IDA.
00:03
If we didn't have the source code,
00:03
and we often don't for malware,
00:03
we would find this very useful.
00:03
I'm going to go to hello.
00:03
That's the project we made last time.
00:03
I'm going to go to Debug
00:03
because that's what we were just looking at.
00:03
Disassemble it. It's
00:03
saying that it noticed
00:03
that there is some debugging information in there.
00:03
Do I want it to retrieve the file?
00:03
I'm going to say no because
00:03
we usually don't have that file.
00:03
I'm going to say no again; I don't really
00:03
like the proximity view.
00:03
This might seem pretty innocent.
00:03
You might click through it and try to find
00:03
where your few lines of code were.
00:03
Remember, when I was
00:03
saying there's a lot of added code,
00:03
a lot of protections built-in for the stack,
00:03
for performance to help debuggers, this is it.
00:03
Last time we went and viewed the strings,
00:03
and found the strings that we knew were referenced.
00:03
I'm going to show you another technique that's
00:03
very popular with reverse engineers and probably,
00:03
the most common technique, is that we
00:03
go view the imports.
00:03
These are all the functions that
00:03
our program has requested to call.
00:03
You might look at this and you're just like, "Hey,
00:03
I didn't call that function, QueryPerformanceCounter,
00:03
or GetCurrentThreadId,
00:03
or any of that."
00:03
No, that's some of the debugging code and
00:03
just other boilerplate code that Microsoft
00:03
will insert in there when it's inserting
00:03
its C libraries or whatever else.
00:03
We called the function MessageBoxA.
00:03
I'm going to go there, I'm going to
00:03
see that it's at that location.
00:03
If I hit "X,"
00:03
I can see where else
00:03
this memory address is referenced in memory.
00:03
Over on the right here,
00:03
you'll be able to see
00:03
that it's referenced in two places,
00:03
but if I want to get a list of all of them,
00:03
I can hit "X." I can see that it's called right here.
00:03
These two values are the same address.
00:03
If you look over at type,
00:03
one is a call and the other one is a read.
00:03
That's what the P, for procedure call, and the R stand for.
00:03
Below it, there's a jump,
00:03
>> something that is jumping there.
00:03
>> We're just going to go with where we
00:03
know this thing is being called.
00:03
If you remember, this function was
00:03
a thing built into this program by
00:03
the compiler and it was like check ESI
00:03
or ESP or whatever it was checking.
00:03
It was checking something about the stack
00:03
but we pretty much get
00:03
the same instructions that we
00:03
saw with the same CC values,
00:03
the same push, push,
00:03
push, push, push, push.
00:03
IDA doesn't actually know what the stack was used for.
00:03
These weren't assembly instructions,
00:03
but it's figured out that based on how
00:03
the assembly was referencing
00:03
the various values in the stack,
00:03
it said, "Okay, there's probably
00:03
three local variables in here."
00:03
It got that right, sometimes it gets it
00:03
wrong, especially with arrays
00:03
but it also figured out that
00:03
this function was MessageBoxA and it said,
00:03
"Okay, MessageBoxA, according
00:03
to Microsoft, has these parameters.
00:03
It knows it's going
00:03
to push them onto the stack
00:03
from right to left, in reverse order."
00:03
It said, "What were the last four pushes?"
00:03
It's like, the parameter to that is uType.
00:03
The parameter to this is the caption,
00:03
the parameter to this is the text,
00:03
and the parameter to this is the window handle.
00:03
It's very kindly provided hints as to what
00:03
the parameters of this MessageBoxA
00:03
are, but be warned that
00:03
sometimes the analysis gets a little messed
00:03
up and it might be a little off.
00:03
These are helpful, but you should
00:03
not just completely rely on them.
00:03
It's also very helpful that it
00:03
put a little comment here and said, "Hey,
00:03
this is referencing a pointer
00:03
and that pointer is pointing to an ASCII string."
00:03
It's going to guess that this is
00:03
a string and it names it
00:03
A for ASCII and then whatever the string is.
00:03
That's very helpful.
00:03
We can also rename things in IDA.
00:03
If I click this and hit "N,"
00:03
I can rename this variable to var1.
00:03
I can rename this variable to
00:03
var2, and then I'm going to just name this format string.
00:03
I know this is printf.
00:03
That's what it said
00:03
in my debugger and that's what it said in my source code,
00:03
but this is showing up as sub,
00:03
short for subroutine and then
00:03
a number which is the memory address it's going to.
00:03
I can click on it and then scroll
00:03
down and get a peek of what's there.
00:03
It looks like there's just a jump there.
00:03
I can follow this jump by double-clicking on
00:03
the label and I can find all this other stuff.
00:03
I'd like to highlight the calls and jumps,
00:03
and I can see that it's
00:03
calling into several other things.
00:03
I'm just thinking, is this printf?
00:03
Is this something else?
00:03
As it turns out,
00:03
Microsoft doesn't actually include
00:03
the C libraries that are standard.
00:03
Microsoft implemented its own C runtime libraries.
00:03
Usually, they're bundled with
00:03
visual C runtime libraries or
00:03
VC runtime or whatever.
00:03
There are a lot of
00:03
libraries out there that programmers use.
00:03
They're pretty standard, but this is not it.
00:03
This is more debugging code,
00:03
and it's also more code to check certain things,
00:03
like whether the parameters that you gave printf were safe,
00:03
because there have been vulnerabilities,
00:03
>> but they can't change the functionality
00:03
of standard library
00:03
calls because maybe that
00:03
might break old software if you try to recompile it.
00:03
We know somewhere down this line there's going to be
00:03
a printf or equivalent thereof.
00:03
I happen to know what exactly this function call is.
00:03
I'm going to go over and look in the imports tab.
00:03
It's going to be underscore underscore
00:03
something because that's generally what they name it.
00:03
I can search by hitting
00:03
Control F and then just type in print,
00:03
and I can see it right here towards the bottom,
00:03
but I'm going to type in print anyway.
00:03
There are two function calls,
00:03
__stdio_common_vfprintf,
00:03
and then above it is sprintf_s. The compiler is
00:03
doing its best to protect us from
00:03
being dumb programmers, and that's very nice.
00:03
We can see if we hit "Escape,"
00:03
we can go back in
00:03
our analysis and we can get to our original stuff.
00:03
You'll notice that this function wasn't named main.
00:03
With the X, you can say,
00:03
what has a reference to this location?
00:03
You can see there's a jump.
00:03
What has a reference to this location?
00:03
Jump further back, what
00:03
has a reference to this location,
00:03
and then you can keep going like that.
00:03
>> Oops, excuse me.
00:03
>> Meant to click that, hit "X" on that,
00:03
and see what calls that function.
00:03
We can see there's more checking code for argv,
00:03
argc, and preparation code for us,
00:03
and we can jump to see what calls this,
00:03
and we can see that
00:03
we are in another function that calls this.
00:03
If I was curious to see what function calls
00:03
this function made,
00:03
I can go under View and then Graphs,
00:03
or I can right-click and choose Xrefs graph from.
00:03
I can see that this function will call this function,
00:03
and then this function, and then this function.
00:03
That can be useful if you're trying to get
00:03
an idea of what a function
00:03
does because you can use
00:03
this graphing recursively and you can say,
00:03
"Okay, I know
00:03
this function has a bunch of functions it calls.
00:03
What are they? How many of them are libraries?"
00:03
Because the pink labeled function calls or
00:03
function names are those libraries or functions
00:03
that IDA has referenced or IDA has recognized.
00:03
If we wanted to see this actually
00:03
executing and we didn't have the original source code,
00:03
we can always crack open Olly, the debugger.
00:03
Sorry, hello.exe.
00:03
>> Here Olly is loading in the libraries
00:03
>> and analyzing them.
00:03
>> We might not always know exactly where our code is.
00:03
Just as with us
00:03
not knowing where this code was in IDA,
00:03
we might not know where the code we want is in OllyDbg.
00:03
We can do the same thing where we can right-click and
00:03
then search for all intermodular calls.
00:03
That basically looks for call instructions.
00:03
We can find the call instructions to MessageBoxA.
00:03
We could go there and see
00:03
the same push call and set a breakpoint.
00:03
We can do that with F2.
00:03
This will set a soft breakpoint.
00:03
Behind the scenes it's actually putting
00:03
an INT 3 instruction there or a CC byte.
00:03
Then when this code gets executed,
00:03
it will stop there and then the debugger will take
00:03
over and then replace
00:03
that byte with the original byte that was there,
00:03
and then begin to
00:03
execute as if nothing had ever happened.
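The byte swap behind a soft breakpoint can be sketched with a toy model. This is an illustration only and not how OllyDbg is implemented internally: `ToyDebugger` and its methods are hypothetical names, and the three code bytes are assumed to be a typical `push ebp` / `mov ebp, esp` prologue.

```python
INT3 = 0xCC  # one-byte opcode for the INT 3 breakpoint instruction


class ToyDebugger:
    def __init__(self, code):
        self.code = bytearray(code)
        self.saved = {}                  # address -> original byte

    def set_breakpoint(self, addr):
        self.saved[addr] = self.code[addr]   # remember what was there
        self.code[addr] = INT3               # patch in the CC byte

    def on_hit(self, addr):
        # When execution traps at addr, put the original byte back so the
        # real instruction can run as if nothing had happened.
        self.code[addr] = self.saved.pop(addr)


# 55 = push ebp, 8B EC = mov ebp, esp
dbg = ToyDebugger(bytes([0x55, 0x8B, 0xEC]))
dbg.set_breakpoint(0)
patched = bytes(dbg.code)
dbg.on_hit(0)
restored = bytes(dbg.code)
print(patched.hex(), restored.hex())
```

A real debugger also has to rewind EIP by one byte and usually re-arms the breakpoint after single-stepping; both steps are left out of this sketch.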
00:03
I'm going to put a breakpoint here where the entrance
00:03
to this main function
00:03
is which Olly was kind enough to recognize.
00:03
I'm going to remove this breakpoint,
00:03
I'm going to hit "Play" and it ran,
00:03
and it has now stopped at my breakpoint.
00:03
Now, I can do the same step
00:03
over that I could with Visual Studio's debugger.
00:03
Now, this is important.
00:03
Let me start over.
00:03
[NOISE] I say yes.
00:03
My breakpoint is still there,
00:03
>> so I'm going to hit "Play" and
00:03
>> it's going to go breakpoint.
00:03
The first instruction was push EBP,
00:03
the next one was
00:03
move ESP into EBP.
00:03
As I was saying earlier,
00:03
this is important because
00:03
[NOISE] here the stack is going to be manipulated.
00:03
This thing that's at the top of the stack,
00:03
AKA a lower memory address,
00:03
is the return address into the previous caller.
00:03
The previous function that called this one said,
00:03
call 101780.
00:03
It pushed EIP onto the stack.
00:03
EIP was going to execute this instruction next,
00:03
but instead the call forced it to come to this address.
00:03
Now, I'm going to make my own stack here.
00:03
How do I do that safely? The first step
00:03
is, [NOISE] push EBP.
00:03
I save the current base pointer.
00:03
I'm going to move ESP to EBP.
00:03
We can see EBP was just overwritten here,
00:03
is now the same value as ESP.
00:03
[NOISE] Now, I'm going to make room on the stack.
00:03
Step over the sub,
00:03
and it took D8 bytes.
00:03
We can see here that
00:03
the stack has now allocated all of this for
00:03
us to use for local variables, and you can
00:03
see how they're all filled with CCs.
00:03
That was the debugging code that
00:03
was added in because
00:03
we didn't say this was a release build.
00:03
This thing probably has
00:03
bugs and we're going to want to fix them.
00:03
Just in case any of this code gets executed,
00:03
the debugger will kick in and say,
00:03
oh, something wrong happened.
00:03
Now, we have a local stack frame.
00:03
[NOISE] Now, we're about to execute another function.
00:03
Now, we're saving the registers.
00:03
Push. It doesn't know what was in them, doesn't care.
00:03
It's just trying to make sure
00:03
that those values are now saved.
00:03
This is important, where the string
00:03
hello is being stored in a local variable on the stack.
00:03
We can do step over.
00:03
Now, we can see a reference to
00:03
the hello string is right here.
00:03
Step over again and now world is being stored as well.
00:03
Now, if we were debugging the release version,
00:03
these CC bytes would not be here.
00:03
Step over, and it's going to push EAX,
00:03
it's going to push world, and then it's going to push
00:03
the format string, and
00:03
OllyDbg was nice enough to say, "Oh,
00:03
printf? Okay, you have to have at least
00:03
one parameter, this format string," because it
00:03
doesn't know how many parameters are
00:03
going to get pushed for
00:03
the printf function call.
00:03
Now, I'm going to make this call.
00:03
I'm just going to step over.
00:03
I could step in and watch it
00:03
actually do exactly what it says it's going to do.
00:03
I can even flip over to
00:03
the other tab and see what it printed out.
00:03
We notice that the values are still on the stack here.
00:03
Now, as I step over,
00:03
the next instruction is this add ESP, 0C.
00:03
Now, the stack frame has shrunk.
00:03
The data is still there on the stack but
00:03
the extended stack pointer ESP is now pointed here.
00:03
It's now pointing lower down. As far
00:03
as the compiler is concerned,
00:03
as far as all these instructions are concerned,
00:03
this is now the top of the stack.
00:03
These other values don't
00:03
matter, they don't care about them.
00:03
They're going to overwrite them
00:03
>> if they do anything else,
00:03
>> they no longer exist.
00:03
I step over again.
00:03
It's going to push 0,
00:03
push hello world,
00:03
push alert, push 0,
00:03
and then call MessageBoxA.
00:03
Watch carefully because while
00:03
ESP is up here as soon as we're going to call it,
00:03
the callee function is going to clean up the stack.
00:03
[NOISE] New modules are
00:03
being loaded and Olly is
00:03
analyzing them and disassembling all the code.
00:03
Those are some pretty big DLLs.
00:03
Give it a minute. Now, there's a pop-up
00:03
>> that's happened and that's what this function does.
00:03
I hit "OK," and now that function has finished.
00:03
You can notice the stack has been cleaned up.
00:03
It didn't need to do that add ESP.
00:03
That's pretty much the major difference between
00:03
those calling conventions and
00:03
you should be aware of them.
00:03
I don't really care what
00:03
happens to the rest of this function.
00:03
I'm just going to terminate the process and
00:03
take a quick look at the release version.
00:03
[NOISE]
00:03
>> It's stopped in what looks to be
00:03
simply some slightly different boilerplate code.
00:03
Here's the SEH exception handling code.
00:03
We're going to find the module call:
00:03
find references to,
00:03
search for intermodular calls.
00:03
Go to MessageBoxA, see Hello World.
00:03
Looks like it's calling printf from
00:03
there. It looks about right.
00:03
[NOISE] Here's some more code
00:03
up here, and separating the code is INT 3,
00:03
just in case something overran where it was supposed to go.
00:03
We'll step over
00:03
and we can see it doing this stack manipulation.
00:03
It didn't subtract nearly as much.
00:03
[NOISE] There it's doing some checking.
00:03
Here's where it's pushing the parameters to printf
00:03
and that special printf function
00:03
that Microsoft made, that's a little safer.
00:03
[NOISE] It does some more parameter
00:03
pushing for another little
00:03
function call it's going to make,
00:03
which is
00:03
vfprintf,
00:03
and it finished,
00:03
which means it printed to the screen.
00:03
Now, it's fixing up the stack [NOISE] and it's done.
00:03
You'll notice it says add ESP 06
00:03
here [NOISE] and it's cleaning up the stack.
00:03
Now, we're about to start
00:03
pushing for the next function call,
00:03
which is MessageBoxA.
00:03
Push, call, and I'm going to switch
00:03
over to wherever it made the call. There it is.
00:03
Then it's ready to end this function,
00:03
which is return 0,
00:03
and the easy, quick way to do that is
00:03
XOR EAX, EAX, as I said before.
00:03
I'm pretty much done with this.
00:03
I'm just going to exit out.
00:03
Quickly, in Cygwin:
00:03
[NOISE] we noticed that
00:03
the compiler in Visual Studio changed
00:03
what functions it was calling based on the release build and
00:03
what wrapper functions it had,
00:03
and Cygwin uses the GCC compiler,
00:03
the GNU C compiler is what that stands for, and
00:03
[NOISE] it will produce different code as well.
00:03
Every compiler reflects the
00:03
[NOISE] programmers that made it, so it's important.
00:03
Here I've demonstrated that I compiled this
00:03
program just like I did in Visual Studio.
00:03
>> Now, I'm going to go find
00:03
that program Cygwin, home, Sean, a.exe.
00:03
>> I'm going to crack it open in IDA and
00:03
>> see what code it produced.
00:03
>> [NOISE]
00:03
Cygwin with GCC
00:03
inserted a lot of extra code.
00:03
This looks to be some parameter checking,
00:03
but I'm going to go and find
00:03
the code that I'm most interested in.
00:03
I see MessageBoxA right there.
00:03
I'm going to go and hit "X". I'm going to say,
00:03
okay, it's referenced in two places.
00:03
One is a call instruction
00:03
and the other is a move instruction.
00:03
The call instruction is what I'm interested in.
00:03
This is something that
00:03
I really want to show you in that,
00:03
like I said, push instructions
00:03
are very expensive time-wise.
00:03
Some compilers have moved away from them.
00:03
The GCC compiler in this case,
00:03
instead of pushing onto the stack,
00:03
simply does a move.
00:03
It moves the variable
00:03
onto the stack, because you can do that.
00:03
You can just move your register or the pointer to
00:03
the string, and the pointer is stored in
00:03
var_C. You can just move that onto the stack.
00:03
This way it doesn't have to push,
00:03
but it does have to make sure the stack is
00:03
the right size by the time it calls
00:03
printf. The compiler is
00:03
being a bit more
00:03
advanced in how it's trying to do something.
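The two ways of getting parameters onto the stack can be compared with a toy model. This is a sketch, not compiler output: the function names and the starting ESP are made up, and the space that the move style reserves is shown as one subtraction here, whereas a real compiler typically folds it into the function's prologue.

```python
WORD = 4  # bytes per stack slot on 32-bit x86


def pass_args_with_push(esp, args):
    """Push style: one PUSH per parameter, right to left."""
    mem = {}
    for arg in reversed(args):
        esp -= WORD              # every push moves ESP
        mem[esp] = arg
    return esp, mem


def pass_args_with_mov(esp, args):
    """Move style: reserve the space once, then store each parameter
    at a fixed offset, as in mov [esp + offset], value."""
    mem = {}
    esp -= WORD * len(args)      # one adjustment up front
    for index, arg in enumerate(args):
        mem[esp + WORD * index] = arg   # ESP does not move per parameter
    return esp, mem


args = ["format", "hello", "world"]
pushed = pass_args_with_push(0x1000, args)
moved = pass_args_with_mov(0x1000, args)
print(pushed == moved)
```

Both styles leave the first parameter at the lowest address with the same final ESP, so the called function cannot tell which one the caller used.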
00:03
Similarly, when it calls MessageBoxA,
00:03
it does another little trick,
00:03
which malware will do frequently,
00:03
in which they'll have a function pointer
00:03
and store that function pointer in
00:03
a register and then call that register.
00:03
This makes it very hard for disassemblers to
00:03
figure out what functions are being
00:03
called from where unless
00:03
it's more intelligent, like IDA Pro.
00:03
Even then IDA Pro will frequently
00:03
not catch this whole trick.
00:03
Afterwards, you'll see sub ESP.
00:03
It's not cleaning up after
00:03
this function call because
00:03
MessageBoxA will clean up after itself.
00:03
It's cleaning up for all the instructions before it,
00:03
these two function calls.
00:03
GCC is trying to be a bit faster than
00:03
the Microsoft Visual Studio compiler.
00:03
That's it for this demo.
00:03
Thank you for watching. We covered
00:03
standard call versus cdecl.
00:03
We looked at different ways code is
00:03
generated and we stepped through a lot of assembly there.
00:03
We talked about how
00:03
compilers will choose to do certain things.
00:03
How in one instance,
00:03
a lot of push instructions were used to get data onto
00:03
the stack and then in another instance with GCC,
00:03
it'll just move the data onto the stack
00:03
instead of pushing, and then it'll
00:03
clean up the stack when it's done with it.
00:03
In the next video,
00:03
we're going to do some stack analysis on
00:03
some actual malware. Hope to see you there.