Basic Static Analysis Part 4A

Video Activity


Time
9 hours 10 minutes
Difficulty
Advanced
CEU/CPE
9
Video Description

In this module, we'll cover some basic malware tricks using reverse engineering techniques. As many of you might know, malware uses certain tricks to confuse analysts, make analysis more difficult, or even break the disassembler. We'll demonstrate static analysis of a malware sample to understand the malware and its attribution, identify Indicators of Compromise (IOCs), confirm our dynamic analysis results, and discover the anti-debugging code.

Video Transcription
00:03
>> Welcome to Cybrary. Hi, my name is Sean Pierce,
00:03
and I'm the subject matter expert for
00:03
introduction to malware analysis.
00:03
Today we are going to be covering
00:03
the basics of static analysis, Part 4,
00:03
where we will actually be
00:03
reverse engineering a piece of malware.
00:03
We're going to be looking at some of
00:03
the tricks commonly employed by malware.
00:03
Sometimes in an effort to break disassemblers,
00:03
sometimes in an effort to just be
00:03
confusing or make analysis more difficult,
00:03
whether it be automated or just to try to trick a person.
00:03
We have looked at Illusion Bot before,
00:03
we looked at it last time.
00:03
I'm going to just extract it here,
00:03
which is something you should be a bit careful
00:03
with. Be careful where you click.
00:03
I'm just going to hit F2 and
00:03
rename these files so they
00:03
don't accidentally get executed.
00:03
I'm going to pull up IDA Pro and start
00:03
analyzing the botbinary.exe because
00:03
that is the executable that actually does the infection.
00:03
Some pretty fast analysis.
00:03
Here, we see something immediately down here in red,
00:03
it says SP analysis failed.
00:03
That's the stack pointer analysis.
00:03
Last time we were talking about
00:03
the stack a lot and we were talking about how
00:03
the compiler will generate
00:03
instructions to clean up the stack or make
00:03
functions that will save
00:03
registers and then push them in the beginning and pop
00:03
them off at the end or will do things
00:03
so that it can clean up the stack before
00:03
or after different functions are called, depending on
00:03
whether it's stdcall or [inaudible].
00:03
Here, we can see this very basic trick
00:03
where a lot of very basic disassemblers will just break,
00:03
is where the stack
00:03
doesn't match up from beginning to end.
00:03
We see here, push ebp,
00:03
mov ebp, esp,
00:03
standard introduction to
00:03
a standard prologue to a subroutine.
00:03
Then there's no sub esp.
00:03
It's not making room on the stack
00:03
except it is doing this push, push, push.
00:03
Usually, this is done in
00:03
an effort to save these registers.
00:03
But you'll notice there's no popping here at the end.
00:03
It does this load effective address
00:03
and it does this subroutine.
00:03
But the location to the subroutine,
00:03
I figured out this was executable code,
00:03
puts it into this address,
00:03
and it pushes this address,
00:03
and then does this return.
00:03
Now, think about that for just a minute.
00:03
The return instruction will go and find
00:03
whatever return address it thinks it's supposed to go
00:03
to because it thinks some function call called it,
00:03
so it has to return to that address.
00:03
It will be this.
00:03
The return instruction, once it executes,
00:03
it will actually execute this function.
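As a rough sketch of the push/ret trick just described, here is a toy stack model in Python. It is not the sample's real code: the block label "start" and the tiny instruction format are assumptions made for illustration, and only the name realMain comes from the walkthrough. It shows that ret jumps to whatever address is on top of the stack, whether or not a call put it there.

```python
# Toy model of an x86-style stack: "ret" simply pops an address and jumps to it.
def run(program, entry):
    """Execute a tiny instruction list and return the blocks visited, in order."""
    stack = []
    visited = []
    pc = entry
    while pc is not None:
        visited.append(pc)
        for op, arg in program[pc]:
            if op == "push":
                stack.append(arg)
            elif op == "ret":
                # ret does not know who called it; it trusts the top of the stack.
                pc = stack.pop() if stack else None
                break
        else:
            pc = None  # fell off the end of the block without a ret
    return visited


program = {
    # Hypothetical entry block: push an address, then "return" into it.
    "start": [("push", "realMain"), ("ret", None)],
    # Stack is empty by the time this ret runs, so the toy machine stops.
    "realMain": [("ret", None)],
}

trace = run(program, "start")
print(trace)
```

No call instruction ever targets realMain in this model, which is why a disassembler that only follows calls and jumps would not find a reference to it.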
00:03
If we hit Space,
00:03
we can see that there might have
00:03
been something else down here, maybe code.
00:03
I like to hit C and find out.
00:03
I'll try to analyze it as code.
00:03
But it seems like it failed.
00:03
I'm going to switch over to this hex view,
00:03
which is basically a hex editor.
00:03
I like to look at this data here.
00:03
This data in between these two functions,
00:03
these two bits of code.
00:03
I like to usually look to see if they're strings
00:03
or any usable data or if it's random junk.
00:03
That looks to be random junk
00:03
because there's no references to
00:03
it, so just from the sight of it,
00:03
it looks like there is no part
00:03
of the program trying to access it.
00:03
But it could do that dynamically because like we said,
00:03
"There's no way to know
00:03
what the code is going to do until
00:03
we actually execute it."
00:03
Here, I recognize that there's a reference to this,
00:03
and it was likely some code.
00:03
This is the function here,
00:03
so I'm going to hit N and call this realMain.
00:03
We can see that it's also called right here.
00:03
I don't know how this code gets executed at all.
00:03
I'm going to hit X to see if there's
00:03
any references to it and there's not,
00:03
at least not the kind IDA has found.
00:03
That might be interesting,
00:03
but it's not important to us right now,
00:03
so we'll look into it later.
00:03
I'm interested in what this function does,
00:03
what the subroutine does.
00:03
It was more proper.
00:03
It looks like it calls three functions and it uses
00:03
some strings like Win98 so perhaps it's
00:03
testing to see what version
00:03
of the operating system it's using.
00:03
Now, we see this constant here.
00:03
That would be something I would google for.
00:03
Perhaps someone else has done
00:03
some analysis that we can leverage.
00:03
I keep looking down and I see
00:03
this registry call or see this path.
00:03
We can see this.
00:03
But notice that there's
00:03
no names in any of these functions.
00:03
That's not uncommon,
00:03
especially if something is statically linked.
00:03
But what is interesting is here,
00:03
it's calling a dword.
00:03
Usually you call addresses,
00:03
but there's dwords here everywhere,
00:03
so that means it's data.
00:03
At some point this is manipulated.
00:03
I like to see what's there.
00:03
We look at it and we see it's just a reference to
00:03
some memory address that hasn't been filled in.
00:03
I'm going to see what references there are to this memory address.
00:03
I'm going to hit X and
00:03
see it's referenced in three places.
00:03
Here, we can see on the right-hand side
00:03
the instructions associated with it.
00:03
One instruction, this mov instruction,
00:03
is the only one that writes to it.
00:03
That's curious.
00:03
The only place it's modified
00:03
in the whole program is right here.
00:03
I double-click on it and go to it.
00:03
This is very interesting.
00:03
These are all function calls.
00:03
There's this GetProcAddress function call
00:03
here and if we scroll up,
00:03
we can see there's this LoadLibrary call and
00:03
it's loading these DLLs.
00:03
It's getting the functions it needs from these DLLs,
00:03
rather than have all these DLLs
00:03
or all these function names
00:03
populated in the import address table.
00:03
Under Imports, we see it's actually just
00:03
using a few of these function calls.
00:03
You only need LoadLibrary and
00:03
GetProcAddress to resolve all the other function calls.
00:03
LoadLibrary is used to get a handle to
00:03
a library and then GetProcAddress is
00:03
used to get the address
00:03
to that function call, as you might guess.
00:03
We can flip back over here
00:03
and we can start looking at this pattern.
00:03
LoadLibrary is called,
00:03
it needs a parameter.
00:03
It's doing the same mov, it's doing the same push.
00:03
It looks like eax was being stored in var_8.
00:03
We can see that LoadLibrary was called up here as well.
00:03
I'm going to say there is
00:03
a push: wininet was pushed onto the stack.
00:03
This move doesn't really affect
00:03
that and LoadLibrary was called.
00:03
LoadLibrary was called and a handle was returned.
00:03
A handle was put into eax because remember,
00:03
eax is where a function's return value goes.
00:03
eax holds the value of whatever the function returned.
00:03
There is a GetProcAddress
00:03
and it's moving that into
00:03
esi because LoadLibrary is not using it anymore.
00:03
It's going to push two things,
00:03
ebx and a ProcName
00:03
and mov eax into var_4.
00:03
eax was last modified when it called LoadLibrary.
00:03
We now know that var_4 is the wininet DLL handle.
00:03
I'm going to just say handle.
00:03
GetProcAddress will return a value in the eax register.
00:03
If you look at that,
00:03
it'll be a FARPROC,
00:03
and that's a pretty standard Windows API return.
00:03
But what we really want is the dwords to
00:03
say send or whatever else.
00:03
If we call GetProcAddress
00:03
here and store the result in eax,
00:03
we can guess that send,
00:03
the function address for send is stored there
00:03
[NOISE] and make it a little
00:03
clean and say, [NOISE] function send.
00:03
Let's double check that, so GetProcAddress.
00:03
It'll return. I think it returns.
00:03
Oh, the good thing about this is,
00:03
we can always google [NOISE] and check.
00:03
[NOISE]
00:03
Retrieves the address of an exported function or
00:03
variable from the specified dynamic link library or DLL.
00:03
>> It takes in two parameters,
00:03
hModule, which is a handle to the library,
00:03
and a long pointer
00:03
to the ProcName, as a long pointer to a C string.
00:03
If it succeeds, the return value is
00:03
the address of the exported function,
00:03
so yes, and if it's null,
00:03
then it failed, but there
00:03
doesn't seem to be any error checking.
00:03
We can see this pattern,
00:03
GetProcAddress was called,
00:03
so we know that eax was then the send function.
00:03
Let's see. ProcAddress was moved,
00:03
so last call was the library.
00:03
When was this receive used?
00:03
Pushed the hModule, called GetProcAddress.
00:03
This send was a parameter to the next one,
00:03
and this receive was
00:03
actually the parameter to this GetProcAddress.
00:03
That makes sense. This was actually receive.
00:03
What tipped me off to this was it loaded up wininet,
00:03
and then used this string.
00:03
Send and receive are pretty common functions
00:03
used by network-aware software,
00:03
and they're pretty common in malware as well.
00:03
We can continue the same type of process
00:03
where eax was stored here,
00:03
send was the next one, so function_send.
00:03
Good to be consistent. We could
00:03
do all of this down the line.
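The naming pattern above (a string is pushed, GetProcAddress is called, then eax is written to a dword) can be sketched in plain Python. This is only an illustration that runs outside IDA: the listing format, the dword addresses, and the helper name are assumptions, not the sample's actual disassembly or an IDA API.

```python
def name_resolved_pointers(listing):
    """Map each dword written after a GetProcAddress call to function_<name>."""
    names = {}
    last_string = None  # most recent string constant pushed
    pending = None      # export name whose address currently sits in eax
    for mnemonic, operand in listing:
        if mnemonic == "push" and operand.startswith('"'):
            last_string = operand.strip('"')
        elif mnemonic == "call" and operand == "GetProcAddress":
            # The name is fixed at call time, so later pushes cannot confuse it.
            pending = last_string
        elif mnemonic == "call":
            pending = None  # any other call overwrites eax
        elif mnemonic == "mov" and operand.endswith(", eax") and pending:
            dest = operand.split(",")[0].strip()
            if dest.startswith("dword_"):
                names[dest] = "function_" + pending
    return names


# Hypothetical listing: the next name is pushed before the previous result
# is stored, which is the interleaving that made the first guess wrong.
listing = [
    ("push", '"send"'),
    ("push", "esi"),
    ("call", "GetProcAddress"),
    ("push", '"recv"'),
    ("mov", "dword_40A000, eax"),
    ("push", "esi"),
    ("call", "GetProcAddress"),
    ("mov", "dword_40A004, eax"),
]

renames = name_resolved_pointers(listing)
print(renames)
```

Binding the name when the call happens, rather than when eax is stored, is the design choice that avoids labeling the send pointer as the receive one.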
00:03
If you're really good,
00:03
you can use IDC or
00:03
Python to write a small script to do this for you,
00:03
but what are we really after again?
00:03
Well, we might like to know a Mutex that's created,
00:03
we can see it's pushed,
00:03
it will be the parameters to this.
00:03
We now know that eax is a result of GetProcAddress,
00:03
so we know that this is
00:03
CreateMutexA. [inaudible] I'm going to hit X,
00:03
what else uses this?
00:03
Only two functions use this,
00:03
so if we see here call CreateMutexA function,
00:03
there are three parameters that were pushed.
00:03
This one being the first,
00:03
I'm willing to bet that if we Googled CreateMutexA,
00:03
the final parameter, the rightmost parameter,
00:03
will be the name of the Mutex.
00:03
I'm willing to bet this is probably zero, yeah.
00:03
xor esi, esi,
00:03
so that zeroes out esi,
00:03
compares it to something but reuses the value down here.
00:03
So it's calling CreateMutexA with 0, 0, and then this value.
00:03
Let's Google CreateMutexA.
00:03
If you see these functions like MessageBox,
00:03
MessageBoxA, MessageBoxW, CreateMutex,
00:03
CreateMutexA, CreateMutexW,
00:03
A at the end stands for ANSI, the 8-bit character version,
00:03
W at the end stands for Unicode.
00:03
I don't know why. Anyway, I was right,
00:03
the rightmost parameter is a long pointer to a C string,
00:03
and it's pointing to the name of the Mutex.
00:03
This is another indicator of compromise.
00:03
If this Mutex appears on your computer,
00:03
then you might be compromised with this piece of malware.
00:03
This is pretty good knowledge to have when you're
00:03
fighting malware because I
00:03
pulled that out in a few minutes.
00:03
So if your organization had
00:03
the capability to search across your entire enterprise
00:03
for all running Mutexes
00:03
with some software like [inaudible]
00:03
or whatever agents are listening on your devices,
00:03
you could immediately pass that over
00:03
to a hunt team or something like that,
00:03
or the SOC team or whoever
00:03
does that, and you can say, hey,
00:03
I'm not done with my analysis,
00:03
but look for this Mutex and then you can keep going.
00:03
I'm interested to know what part of the program
00:03
is checking the Mutex
00:03
to see if there's another infection going on,
00:03
and we can see that it was called by realMain.
00:03
If I'm not mistaken,
00:03
realMain also called CreateMutex here.
00:03
So is this the same string?
00:03
We hit Escape and we can just go back,
00:03
forward, back, forward,
00:03
back, forward, back,
00:03
it looks like the same string.
00:03
It's checking in two different places in its code to
00:03
make sure another instance of infection isn't running.
00:03
I'd like to know what this code is because depending
00:03
on what the CreateMutexA function returns,
00:03
it's going to do something else,
00:03
because it makes another function call
00:03
and then compares the output of that to this value.
00:03
I'm going to go find out what this is,
00:03
it's only written to once, from eax;
00:03
eax was modified by
00:03
this call instruction for GetProcAddress,
00:03
and the parameter to that GetProcAddress was GetLastError.
00:03
That's pretty useful. It's probably called
00:03
in more than one place.
00:03
Following that path of logic again,
00:03
GetLastError was pushed as a parameter to call this,
00:03
and which affected eax,
00:03
which was inserted into here,
00:03
GetLastError, and then it compares that to 0xB7.
00:03
Do some more Googling, say Windows error codes.
00:03
Let's just say, I hope they have hex, 0xB7.
00:03
Perfect. ERROR_ALREADY_EXISTS. We know
00:03
that it's comparing that to ERROR_ALREADY_EXISTS,
00:03
and if it already exists, it does this.
00:03
This is probably ExitProcess.
00:03
Terminate or do something else.
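The single-instance check described above can be modeled as follows. This is a minimal sketch, not the sample's code: FakeKernel is a hypothetical stand-in for the operating system so the logic runs anywhere, and the mutex name is a placeholder rather than the real indicator. The one value taken from the walkthrough is 0xB7 (ERROR_ALREADY_EXISTS).

```python
ERROR_SUCCESS = 0x00
ERROR_ALREADY_EXISTS = 0xB7  # 183, the value the code compares against


class FakeKernel:
    """Hypothetical stand-in for the OS: named mutexes plus a last-error value."""

    def __init__(self):
        self._mutexes = {}
        self._last_error = ERROR_SUCCESS
        self._next_handle = 4

    def create_mutex(self, security_attrs, initial_owner, name):
        if name in self._mutexes:
            # Creating an existing named mutex still returns a handle,
            # but the last-error value records that it already existed.
            self._last_error = ERROR_ALREADY_EXISTS
            return self._mutexes[name]
        handle = self._next_handle
        self._next_handle += 4
        self._mutexes[name] = handle
        self._last_error = ERROR_SUCCESS
        return handle

    def get_last_error(self):
        return self._last_error


def should_keep_running(kernel, mutex_name):
    """Mirror the pattern: CreateMutexA(0, 0, name), then GetLastError() vs 0xB7."""
    kernel.create_mutex(0, 0, mutex_name)
    return kernel.get_last_error() != ERROR_ALREADY_EXISTS


kernel = FakeKernel()
first = should_keep_running(kernel, "example_mutex_name")   # placeholder name
second = should_keep_running(kernel, "example_mutex_name")
print(first, second)
```

The second attempt sees the existing name and bails out, which is also why the mutex name works as an indicator of compromise: it is present on any machine where one instance is already running.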
00:03
Another cool trick we can do with
00:03
IDA is we can say, hey,
00:03
I know this is that ERROR_ALREADY_EXISTS enum,
00:03
so we can actually specify that here,
00:03
use standard symbolic constant or the M shortcut.
00:03
It's going to bring us a list of
00:03
all of the enumerations that it knows of.
00:03
Here it knows that
00:03
ERROR_ALREADY_EXISTS
00:03
matches this value, and then click "Okay".
00:03
It makes it a little easier to read.
00:03
GetLastError, ERROR_ALREADY_EXISTS.
00:03
May as well figure out what this is.
00:03
It's written to here,
00:03
and it's written to here.
00:03
That's very interesting.
00:03
Let's check this one out.
00:03
That's the location we already are at, that location.
00:03
CreateMutexA, it stores the value here,
00:03
and the result of CreateMutexA.
00:03
>> I search for returns.
00:03
If the function succeeds,
00:03
the return value is a handle to
00:03
the newly created mutex object.
00:03
If the function fails, the return value is NULL.
00:03
I'm going to say Mutex Handle.
00:03
That's a local variable.
00:03
I want to find out what this
00:03
is; apparently it's called a lot.
00:03
We should really label that function.
00:03
I'm going to figure out what this is,
00:03
the EAX was modified by GetProcAddress here,
00:03
which had the parameters of CloseHandle,
00:03
changes name, or I guess I should be
00:03
consistent and say function
00:03
CloseHandle, but that's okay.
00:03
I'm going to name this because I see it quite frequently.
00:03
FunctionTable builder.
00:03
No, it's not really a table.
00:03
FunctionResolver.
00:03
We can see who calls it: realMain, that's right.
00:03
It calls it right there as the first thing it calls.
00:03
We can pick apart this program bit by
00:03
bit and really get into exactly what it's doing.
00:03
We can totally understand all of its capabilities.
00:03
But before diving in,
00:03
you should really ask yourself,
00:03
what am I after?
00:03
Am I after indicators of compromise,
00:03
am I after network traffic,
00:03
am I after authorship or sophistication?
00:03
Because it's easy to get
00:03
lost in the details because you have
00:03
a very detailed view of everything.
00:03
You can easily just go down rabbit holes.
00:03
You can go off into the weeds,
00:03
you can take whatever analogy you want.
00:03
But early on I said,
00:03
this is very detail-oriented
00:03
and every once in
00:03
a while you need to pop your head up and say,
00:03
''Okay, what am I after again?''
00:03
Because this malware was found on the machines.
00:03
What are its capabilities?
00:03
I can go for that.
00:03
A good way to determine its capabilities,
00:03
after you've already gotten a look
00:03
at what it's doing to defend itself,
00:03
would be to look at the function calls it makes.
00:03
We can see the hostname, we can do some.
00:03
We can say, "Okay,
00:03
it's doing something with the network."
00:03
It can delete files,
00:03
it can read files,
00:03
it can create files.
00:03
It can modify the registry.
00:03
We already knew this from our dynamic analysis.
00:03
We saw that it modified the registry.
00:03
It opened a hole in the firewall for itself.
00:03
We saw that it
00:03
could take certain commands from the IRC server.
00:03
We can see VirtualFree. VirtualAlloc
00:03
is important because it can create memory which
00:03
it could insert code into, and then use
00:03
the VirtualProtect function to execute
00:03
that code and search directories.
00:03
It's using some Zw functions.
00:03
Zw functions are usually reserved for lower-level libraries.
00:03
They should never be called by
00:03
higher-level user land programs
00:03
unless you really know what you're doing
00:03
and even then you probably shouldn't do it.
00:03
This guy is using this function.
00:03
I would be curious to see how he's using that.
00:03
Because it's probably an anti-debugging method.
00:03
Because you can call
00:03
this function and you will get back information
00:03
about your process or
00:03
your thread or any other processes you want.
00:03
You can determine from the kernel's
00:03
response if that thread, process,
00:03
or whatever is being debugged.
00:03
I'd be willing to look into that
00:03
>> if I wanted to debug it.
00:03
>> Say OpenAddress Registry stuff.
00:03
It can also mess with services.
00:03
If I didn't know any better,
00:03
I would be like, okay,
00:03
it's probably installing via a
00:03
service and if I wanted to go
00:03
down that rabbit hole and
00:03
determine the service name it's using,
00:03
I could do that with these functions very quickly.
00:03
I could just say,
00:03
"Oh, where is this thing being used?"
00:03
It's only being used in one function
00:03
or it's being used here.
00:03
I can look for any strings around there.
00:03
I see none. I go for another one.
00:03
But if I wanted to follow that rabbit hole, I could.
00:03
Now here's some more interesting strings.
00:03
It's interesting because there's a slash, slash,
00:03
and escape slash in there. This is interesting.
00:03
P colon slash slash and what looks to
00:03
be a file name extension. This looks interesting.
00:03
These look like encrypted strings, basic obfuscation.
00:03
It looks like it's pushing these.
00:03
Moving a value into a dword; eax is modified by this.
00:03
This is delete registry handle here.
00:03
It's GetProcAddress.
00:03
This was the last of these,
00:03
but it calls this after is received.
00:03
The parameters are the pointers
00:03
to these strings and it's called
00:03
multiple times and each time there's
00:03
an obfuscated string and a different location.
00:03
I'm willing to bet the string is input.
00:03
This function is decryption or deobfuscation,
00:03
and this is output.
00:03
Let's take a quick look into that.
00:03
I'm going to say decryptString,
00:03
takes in two arguments,
00:03
two pointers is what it looks like.
00:03
Stores one in ESI and then stores this one
00:03
in EDI. It moves a byte into AL,
00:03
which is the lower eight bits of
00:03
the 32-bit EAX register, if I'm not mistaken.
00:03
This is useful if you just want a character.
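A quick illustration of the AL and EAX relationship in Python, using a made-up register value (the number is an assumption chosen so the low byte is a printable character):

```python
eax = 0x12345641          # hypothetical 32-bit register contents
al = eax & 0xFF           # AL is the lowest 8 bits of EAX
print(hex(al), chr(al))   # the low byte, and the character it encodes

# A NUL terminator leaves AL at zero, which is what a zero test detects.
end_of_string = (0x12345600 & 0xFF) == 0
print(end_of_string)
```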
00:03
Test AL means it tests to see if this is zero,
00:03
so test al, al asks: is the string blank?
00:03
If the zero flag is set,
00:03
meaning AL was zero,
00:03
it will skip to the end, so it's a test to
00:03
see if there were any characters in that string.
00:03
If that string had a character in it,
00:03
then it would begin executing this loop.
00:03
Here it does the push, increments
00:03
ESI, the source index.
00:03
It calls this function.
00:03
It moves AL into
00:03
EDI and then moves it into ESI, increments EDI,
00:03
pops the value off the stack into ECX,
00:03
and tests it to see if it's zero.
00:03
If this is not zero,
00:03
the jump is taken.
00:03
This is taken.
00:03
It means conversely, if it is zero, it falls through.
00:03
We can hit space and say,
00:03
if the value is zero,
00:03
then it just falls through to
00:03
the next instruction, which is down here.
00:03
[NOISE] You can see
00:03
this function has something
00:03
to do with this decryption.
00:03
We can pick at it really quickly and say,
00:03
"Okay, there's an alphabet there,
00:03
and there's an alphabet there."
00:03
I'm just going to guess just by looking at it.
00:03
It's not really encryption or
00:03
obfuscation; it's really more of an encoding.
00:03
It does a simple swap.
00:03
You can double-click it and dive into it.
00:03
There's the basic function prologue,
00:03
makes space for local variables.
00:03
There is a push, push, push.
00:03
This is interesting and
00:03
it would take a few minutes to figure it out.
00:03
It wouldn't be too hard and
00:03
we could even write our own version.
00:03
We can decrypt the strings ourselves.
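A hand-written decoder along those lines might look like the sketch below. Everything specific in it is an assumption: the two alphabets are a made-up reversed-alphabet swap, not the tables in this sample, and the encoded bytes are a placeholder. What it borrows from the walkthrough is the shape of the routine: read one byte at a time, stop at the zero terminator, pass each character through a swap function, and write it to the output.

```python
PLAIN = "abcdefghijklmnopqrstuvwxyz"
CODED = "zyxwvutsrqponmlkjihgfedcba"  # hypothetical swap table, not the sample's


def swap_char(ch):
    """Swap one character; anything outside the table passes through unchanged."""
    idx = CODED.find(ch)
    return PLAIN[idx] if idx != -1 else ch


def decode_string(src):
    """Copy src to a new string one byte at a time, stopping at the NUL byte."""
    out = []
    for byte in src:
        if byte == 0:  # same role as the zero test on AL in the loop
            break
        out.append(swap_char(chr(byte)))
    return "".join(out)


# Placeholder data: punctuation survives, which is why fragments such as
# "://" can still be spotted in the encoded form.
encoded = b"sggk://vcznkov.rmezorw/z.vcv\x00ignored"
decoded = decode_string(encoded)
print(decoded)
```

With the real tables substituted for PLAIN and CODED, the same loop could be pointed at each encoded string pulled from the binary.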
00:03
However, in the next video,
00:03
I'm going to show you the easy way of doing it.