Intro to Win64 Assembly and Process Dumping

February 4, 2016

=============================================================================
===                                                                       ===
=            Intro to Win64 Assembly and Process Memory Dumping             =
===                                                                       ===
=============================================================================
current_user
20160202
for Cybrary.it|0P3N

I recently checked out the “Intro to Malware Analysis and Reverse
Engineering” course by Sean Pierce. Inspired by his contribution, and taking
a rest from my current activities, I decided to share something with you
too.

What I noticed is that Sean references rather outdated tools in his
videos. Windows XP? Really? OllyDbg… well, it’s a pretty good debugger, I
can’t argue. But its development is so slow that I’m afraid my grandchildren
will turn gray before they see an x64 version go live. There were times long
ago when SoftICE ruled the world. Yeah, it was (and surely still is) the
undisputed God of all debuggers. Times change, however. All becomes history.
ImportREC, the import table reconstructor, has also become overgrown with
moss and cannot work in the 64-bit world. What are the alternatives?
CHimpREC, Scylla. But they don’t always work as expected either. What I’m
trying to say is that there may (and definitely will) come a time when your
handy tool can no longer fulfill its purpose; when it just fails, leaving
you barehanded. What will you do when your hammer breaks? Will you wait for
someone to fix the hammer for you, or will you forge a new, better one
yourself? Just remember that a hacker is not someone who has mastered the
art of using tools, but someone who is able to build the tools themselves.

For reverse engineering (and malware analysis), knowledge of assembly
language is vital. You should have at least a basic understanding of it to
follow what’s written below. Get the Intel 64 and IA-32 Architectures
Software Developer Manuals[1] and the AMD64 Architecture Programmer’s
Manuals[2]. Read ’em. Learn ’em.

Meanwhile, I’ll give you a short introduction to Win64 assembly and
process memory dumping.

Intro to assembly for Win64
---------------------------

Programming in assembly for Windows is very simple. The operating system
provides you with an Application Programming Interface (API): a set of
functions scattered across dynamic-link libraries (DLLs) such as
Kernel32.dll, User32.dll, etc. You just link the required libraries to your
application and call the functions they provide.

There are a lot of different assemblers in the world, but I recommend
flat assembler[3]. It is a very fast and flexible assembler with extremely
powerful macroinstruction support. Just try it. You’ll love it, I’m sure.

It goes without saying that you should get familiar with the PE file
format. Get the Microsoft PE and COFF Specification[4]. There’s also a great
document about portable executables by Bernd Luevelsmeyer[5], the one I
myself studied and referenced while learning the subject.

Win64 application source code template for flat assembler
---------------------------------------------------------

The Win64 application source code template replicates the standard portable
executable structure:

    format pe64
    entry start

    section '.text' code readable executable
    start:
        ; {here goes executable code}

    section '.data' data readable writeable
        ; {here goes data}

    section '.idata' import data readable writeable
        ; {here goes import table}

The first directive, `format pe64`, tells fasm to produce a PE32+
executable image. It can be followed by an additional `console` or `gui`
keyword to explicitly specify the Windows subsystem: character-mode or GUI
respectively.

The `entry {label}` directive defines the address of the entry point.

The `section` directive, followed by a name and flags, defines a new PE
section. For example, `section '.text' code readable executable` from the
template above adds a new section named “.text” with the flags
IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE.
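
To see what that combination of flags looks like as a number, here is a small
sketch in Python (used here only as a calculator, not as part of the fasm
toolchain). The bit values come from the PE and COFF Specification; the
keyword-to-bit mapping and the function name are my own assumptions about how
the fasm keywords used in this article translate.

```python
# Section flag bits as listed in the Microsoft PE and COFF Specification.
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_CNT_INITIALIZED_DATA = 0x00000040
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# Assumed mapping of the fasm section keywords used in this article.
FASM_FLAG_BITS = {
    "code": IMAGE_SCN_CNT_CODE,
    "data": IMAGE_SCN_CNT_INITIALIZED_DATA,
    "executable": IMAGE_SCN_MEM_EXECUTE,
    "readable": IMAGE_SCN_MEM_READ,
    "writeable": IMAGE_SCN_MEM_WRITE,
}


def section_characteristics(flags):
    """OR together the bits for a space-separated list of fasm keywords."""
    value = 0
    for keyword in flags.split():
        value |= FASM_FLAG_BITS[keyword]
    return value


print(hex(section_characteristics("code readable executable")))  # 0x60000020
```

A PE info tool should show the same value in the Characteristics field of the
“.text” section header.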

Any number of sections may be defined as needed. If you need an export
table, you can define an export section for it:

    section '.edata' export data readable

If you need a section for resources, you can add it, too:

    section '.rsrc' resource data readable

And so on, and so forth. You get the idea. Please refer to the flat
assembler programmer’s manual (it comes with fasm) for more details.

Save the above template as “stub.asm”, or whatever name you prefer, and
compile:

> fasm stub.asm

A 512-byte .exe file, empty except for the PE header, will be produced.
Feel free to investigate it with a PE info tool.
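
If you have no PE info tool at hand, a few lines of Python are enough for a
first look. This is only a minimal sketch: the function name and the returned
dictionary are made up for illustration, and it reads just three fields whose
positions are fixed by the PE and COFF Specification.

```python
import struct


def pe_summary(data):
    """Return a few header fields of a PE image given as bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF header follows the signature: Machine, NumberOfSections, ...
    machine, section_count = struct.unpack_from("<HH", data, pe_offset + 4)
    # The optional header starts after the 20-byte COFF header.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "machine": machine,            # 0x8664 means x64
        "sections": section_count,
        "pe32_plus": magic == 0x20B,   # 0x20B marks a PE32+ optional header
    }


# Usage (assumes stub.exe is in the current directory):
#   with open("stub.exe", "rb") as f:
#       print(pe_summary(f.read()))
```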

fastcall - Windows x64 calling convention
-----------------------------------------

The first four arguments with a size of 8, 16, 32, or 64 bits are passed
(in order, left to right) in registers RCX, RDX, R8, and R9. Arguments five
and higher are passed on the stack. All arguments are right-justified in
registers.

The caller is responsible for allocating “shadow” space on the stack for
the callee’s parameters, and must always allocate enough space for the four
register parameters, even if the callee doesn’t have that many. This means
that you always need to reserve 32 bytes on the stack before calling an API
function, even if it has fewer than four parameters. Keep in mind that for
functions with more than four parameters, the shadow space must be reserved
*after* parameters five and above have been pushed to the stack.

The stack pointer and malloc or alloca memory must be aligned to 16
bytes. Because the `call` that enters a function pushes an 8-byte return
address, RSP is 8 bytes off a 16-byte boundary at function entry and has to
be adjusted before the function makes calls of its own.

Results are returned in the RAX register.

Those are the basic rules. A more detailed description is available on
MSDN[6].
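
The arithmetic behind these rules is easy to get wrong, so here is a small
Python sketch of it. It assumes a function that adjusts the stack once with a
single `sub rsp, N` at entry (where RSP is 8 bytes off a 16-byte boundary)
and then stores arguments five and above with `mov` instead of `push`. The
function name is hypothetical.

```python
def stack_reservation(arg_count):
    """Smallest N for `sub rsp, N` that keeps calls 16-byte aligned.

    arg_count is the largest number of arguments passed to any callee.
    Assumes RSP % 16 == 8 at function entry and no pushes before the sub.
    """
    stack_args = max(0, arg_count - 4)   # arguments 5+ live on the stack
    needed = 32 + 8 * stack_args         # shadow space + stack arguments
    if needed % 16 != 8:                 # N must be 8 modulo 16
        needed += 8
    return needed


print(stack_reservation(4))  # 40
```

For a callee with up to four arguments this gives 40 bytes: 32 bytes of
shadow space plus 8 bytes of alignment.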

My first Win64 app in assembly
———————————–

Enough theory, let’s code! The first program will simply show a message
box (WinAPI function MessageBox) and exit (WinAPI function ExitProcess).
MSDN WinAPI Reference gives the following syntax for MessageBox:

    int WINAPI MessageBox(
        _In_opt_ HWND    hWnd,
        _In_opt_ LPCTSTR lpText,
        _In_opt_ LPCTSTR lpCaption,
        _In_     UINT    uType
    );

Pay attention to the requirements:

    DLL: User32.dll
    Unicode and ANSI names: MessageBoxW (Unicode) and MessageBoxA (ANSI)

“Unicode and ANSI names” gives the actual names of the function as
defined in the DLL export table. It means that a function named MessageBox
doesn’t actually exist. It’s an alias for either MessageBoxA or MessageBoxW.
You may have already guessed that the “A” suffix in the function name
stands for “ANSI” and “W” stands for “Wide”.

You should always consider using the Unicode variants of Windows API
functions when coding in 2016 and later, unless you absolutely must fall back
to the ANSI functions for some strange reason. Throughout the text I’ll
always refer to the Unicode variants, even when not explicitly specifying the
W suffix.

As you can see from the syntax, the function accepts four parameters.
According to the fastcall calling convention described above, they are
passed in the RCX, RDX, R8, and R9 registers:

- hWnd is optional and can be NULL (0); it goes to RCX;
- lpText is a pointer to a zero-terminated message text string; it goes to
  RDX;
- lpCaption is a pointer to a zero-terminated message box caption string;
  it goes to R8;
- uType defines the type of the message box and the buttons it contains;
  it goes to R9.

First, let’s add the message and caption to display in the message box.
This is initialized data and should be stored in the corresponding PE
section:

    section '.data' data readable writeable
        szText du 'Hello, Cybrary.it!', 0
        szCaption du 'My 1st Win64 App', 0
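
To make it clear what ends up in the file, here is a Python sketch of the
bytes that a `du` definition with a trailing `, 0` produces for a plain ASCII
string. The helper name is made up; it only models the “one 16-bit word per
character, high byte zero” behavior, which for ASCII text is the same as
zero-terminated UTF-16LE.

```python
def du(text, terminator=True):
    """Bytes for an ASCII string stored as 16-bit little-endian words."""
    words = bytearray()
    for byte in text.encode("ascii"):
        words += bytes([byte, 0])      # low byte = character, high byte = 0
    if terminator:
        words += b"\x00\x00"           # the trailing `, 0` word
    return bytes(words)


print(du("Hi"))  # b'H\x00i\x00\x00\x00'
```

This is exactly the kind of string MessageBoxW expects in lpText and
lpCaption.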

To call the function, add the following code to a section that contains
executable code:

    section '.text' code readable executable
    start:
        sub rsp, 8          ; align stack to 16-byte boundary.
                            ; App will crash if stack is not aligned.
        sub rsp, 32         ; reserve 32 bytes shadow space for parameters

        mov r9, 0           ; uType = MB_OK
        mov r8, szCaption   ; save pointer to caption text to R8
        lea rdx, [szText]   ; other method of saving a pointer
        xor rcx, rcx        ; the same as 'mov rcx, 0', but smaller code
        call [MessageBoxW]  ; call the function

The flat assembler package has a set of macros to make life easier.
Including those macros in the source code simplifies, among a variety of
other things, calling WinAPI functions, so that the code to call MessageBox
looks as follows:

    invoke MessageBox, NULL, szText, szCaption, MB_OK

Pretty high-level, right?

The next step is to add references to the external functions to the
Import Table.

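    section '.idata' import data readable writeable
    ;
    ; Import Directory Table
    ; (see Microsoft PE and COFF Specification, section 5.4.1)
    ; Entry fields: Import Lookup Table RVA, Time/Date Stamp, Forwarder
    ; Chain, Name RVA, Import Address Table RVA. The table ends with an
    ; all-zero entry.
    ;
        dd 0, 0, 0, rva dll_user32, rva imports_user32
        dd 0, 0, 0, 0, 0

    ;
    ; User32 Import Lookup Table
    ; (see Microsoft PE and COFF Specification, section 5.4.2)
    ; It is referenced through the Import Address Table field above, so the
    ; loader overwrites each entry with the address of the function.
    ;
    imports_user32:
        MessageBoxW dq rva user32_fn_MessageBoxW
        dq 0

    ;
    ; List of linked DLLs
    ;
    dll_user32:
        db 'User32.dll', 0

    ;
    ; User32 Hint/Name Table
    ; (see Microsoft PE and COFF Specification, section 5.4.3)
    ;
    user32_fn_MessageBoxW:
        dw 0
        db 'MessageBoxW', 0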
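
The same records can be produced with Python’s struct module, which makes
the field sizes explicit. This is a sketch under assumptions: the helper
names are hypothetical and the RVA values in the usage note are made-up
numbers, not addresses from a real image. The padding byte follows the
specification’s rule that Hint/Name entries start on an even boundary; the
hand-written table above does not pad.

```python
import struct


def hint_name_entry(name, hint=0):
    """Hint/Name Table entry: 16-bit hint, ASCII name, zero terminator."""
    entry = struct.pack("<H", hint) + name.encode("ascii") + b"\x00"
    if len(entry) % 2:
        entry += b"\x00"   # pad so that the next entry is evenly aligned
    return entry


def import_directory_entry(lookup_rva, name_rva, address_rva):
    """20-byte Import Directory Table entry (five 32-bit fields)."""
    # Time/Date Stamp and Forwarder Chain are left as zero here.
    return struct.pack("<5I", lookup_rva, 0, 0, name_rva, address_rva)


print(hint_name_entry("MessageBoxW"))  # b'\x00\x00MessageBoxW\x00'
```

For example, `import_directory_entry(0, 0x3040, 0x3028)` mirrors a
`dd 0, 0, 0, rva ..., rva ...` line with hypothetical RVAs 0x3040 and 0x3028.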

The flat assembler package has macros to simplify Import Table
construction, too, so that importing functions from User32.dll and
Kernel32.dll looks like this:

    library kernel32, 'KERNEL32.DLL',\
            user32, 'USER32.DLL'

    include 'api\kernel32.inc'
    include 'api\user32.inc'

What is left is to call the ExitProcess function from Kernel32.dll and
add a reference to it to the Import Table.

Below is the full source code of the application.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ code start ~~~~~
format pe64 gui
entry start

section '.text' code readable executable
start:
    sub rsp, 40             ; 8 bytes of alignment + 32 bytes shadow space
    xor r9, r9              ; uType = MB_OK
    mov r8, szCaption
    mov rdx, szText
    xor rcx, rcx            ; hWnd = NULL
    call [MessageBoxW]

    xor rcx, rcx            ; uExitCode = 0
    call [ExitProcess]

section '.data' data readable writeable
    szText du 'Hello, Cybrary.it!', 0
    szCaption du 'My 1st Win64 App', 0

section '.idata' import data readable writeable
    dd 0, 0, 0, rva dll_kernel32, rva imports_kernel32
    dd 0, 0, 0, rva dll_user32, rva imports_user32
    dd 0, 0, 0, 0, 0

imports_kernel32:
    ExitProcess dq rva kernel32_fn_ExitProcess
    dq 0

imports_user32:
    MessageBoxW dq rva user32_fn_MessageBoxW
    dq 0

dll_kernel32:
    db 'Kernel32.dll', 0
dll_user32:
    db 'User32.dll', 0

kernel32_fn_ExitProcess:
    dw 0
    db 'ExitProcess', 0

user32_fn_MessageBoxW:
    dw 0
    db 'MessageBoxW', 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ code end ~~~~~

You can type the above code in a text editor (don’t copy-paste code from
examples or you won’t remember it in 5 minutes), save it, and compile with
the following command:

> fasm {filename}

Great! Now you know how to code in assembly! But do you know what happens
when you run a program in Windows?

Process initialization
----------------------

I assume you already understand what a process is. In simple words, it
is a running program. It consists of a virtual address space, allocated to
it by the operating system; an executable image, mapped into that address
space; one or more execution threads, the units to which the operating
system allocates processor time; and a bunch of data structures and handles
to various system resources. Each process has a unique identifier: the
process ID, or PID.

A new process can be created by calling one of the process-creation
functions: CreateProcess, CreateProcessAsUser, CreateProcessWithTokenW, or
CreateProcessWithLogonW. Creating a Windows process consists of several
stages carried out in three parts of the operating system: the Windows
client-side library, the Windows subsystem process, and the Windows
executive. The basic process creation flow is shown in Figure 1.

    Windows client-side library
    (kernel32.dll or advapi32.dll)
    +-----------------------+
    | Convert and validate  |
    | creation flags and    |
    | parameters            |
    +-----------+-----------+
                |
                v
    +-----------+-----------+
    | Open .exe and create  |
    | section object        |
    |                       |
    +-----------+-----------+
                |
                v
    +-----------+-----------+
    |                       |
    | Create process object |
    |                       |
    +-----------+-----------+
                |
                v
    +-----------+-----------+
    |                       |
    | Create thread object  |
    |                       |
    +-----------+-----------+
                |                       Windows Subsystem
                v                       (Csrss.exe)
    +-----------+-----------+           +-----------------------+
    | Perform specific      |           | Set up new process    |
    | Windows Subsystem     +---------->| and thread            |
    | process initialization|           |                       |
    +-----------------------+           +-----------+-----------+
                                                    |
                +-----------------------------------+
                |
                |                       Windows Executive
                v                       (Ntdll.dll)
    +-----------+-----------+           +-----------------------+
    | Start initial         |           | Finalize process      |
    | thread execution      +---------->| initialization        |
    |                       |           |                       |
    +-----------+-----------+           +-----------+-----------+
                |                                   |
                v                                   v
    +-----------+-----------+           +-----------+-----------+
    |                       |           | Jump to Entry Point   |
    | Return to caller      |           | to start execution    |
    |                       |           |                       |
    +-----------------------+           +-----------------------+

Figure 1 – Process creation flow

Note that when a process-creation function returns to the caller (the
left part of the diagram), the new process may not be fully initialized yet
(the right part of the diagram). Fortunately, there’s an API function to
help you wait until the process has finished its initialization:
WaitForInputIdle.

I highly recommend that you get a copy of the Windows Internals[7] book
and study it for details on how Windows works.

Dumping process memory
----------------------

To dump a process means to take a snapshot of its address space at a
given time. Dumps can later be used for offline code analysis and debugging.
There are numerous tools created for this purpose. Some of them are rather
sophisticated, allowing you to take snapshots at time intervals or on
triggered events, or to rebuild PE images. Creating a memory dump is
actually a basic (if not trivial) task. But it’s always good to know how
things work, especially the basics. With this knowledge you will be able to
move on to something more advanced.

To dump the memory of a process you need to get its PID, then gain
access to its address space, and finally copy that address space and write
it to a file.

To get the process ID you could create the process from your own
application. In that case, the function that created the process returns its
PID. Note that before dumping you need to wait until the newly created
process is fully initialized. The WaitForInputIdle API function can help you
with that.

However, you will most probably want to dump an already running process.
There are a couple of techniques for obtaining the ID of an active process.
Below I describe one of them.

There’s a tool help library[8] in Windows. As MSDN states, the functions
provided by the tool help library make it easier for you to obtain
information about currently executing applications. These functions are
designed to streamline the creation of tools, specifically debuggers.

Given that you know the name of the executable file from which the
process originates (the process name), the algorithm for retrieving its PID
could be something like the one in Figure 2 below:

    Start
      |
      v
    CreateToolhelp32Snapshot
      |
      v
    INVALID_HANDLE_VALUE?
      +-- YES --> Error getting system snapshot --> Finish
      +-- NO ---> Process32First
                    +-- FALSE --> (A)
                    +-- TRUE ---> (B)

    (B) PROCESSENTRY32.szExeFile matches required process name?
      +-- YES --> Get PROCESSENTRY32.th32ProcessID --> (C)
      +-- NO ---> Process32Next
                    +-- TRUE ---> (B)
                    +-- FALSE --> (A)

    (A) GetLastError returned ERROR_NO_MORE_FILES?
      +-- YES --> Process not found --> (C)
      +-- NO ---> Error getting process info --> (C)

    (C) Close snapshot handle with CloseHandle --> Finish

Figure 2 – Find Process ID Algorithm

First, you need to create a system snapshot of all running processes
using the API function CreateToolhelp32Snapshot with dwFlags =
TH32CS_SNAPPROCESS and th32ProcessID = 0. After that, you iterate through
the snapshot with the Process32First/Process32Next functions until one of
them returns FALSE. When that happens, GetLastError returning
ERROR_NO_MORE_FILES indicates that there are no more processes left in the
snapshot; any other error code means the call itself failed. Before the
first call, the dwSize member of the PROCESSENTRY32 structure must be set to
the size of the structure. On success, the Process32First/Process32Next
functions fill the structure with information about a process from the
snapshot, including the process ID you’re looking for.
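
The control flow of this search can be sketched in a few lines of Python. The
sketch does not call the Windows API: `first`, `next_` and `get_last_error`
are hypothetical stand-ins for Process32First, Process32Next and
GetLastError, passed in as plain callables so that the logic can be tried
anywhere. The return values are my own convention, too.

```python
ERROR_NO_MORE_FILES = 18  # Win32 error code checked after the loop ends


def find_pid(wanted, first, next_, get_last_error):
    """Walk a process snapshot and classify the outcome.

    first() and next_() return an (exe_name, pid) tuple on success and
    None where the real functions would return FALSE.
    """
    entry = first()
    while entry is not None:
        exe_name, pid = entry
        if exe_name == wanted:           # exact comparison
            return "found", pid
        entry = next_()
    if get_last_error() == ERROR_NO_MORE_FILES:
        return "not_found", None         # snapshot exhausted normally
    return "error", None                 # the iteration itself failed
```

The three outcomes correspond to the three ways out of the loop: process
found, process not found, and error getting process info.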

Once you have the ID of the process, you can get access to its address
space by calling the OpenProcess API function with the PROCESS_VM_READ
access flag, and then read portions of its memory by calling
ReadProcessMemory. However, before you can read memory, you have to know
the location (starting address) to read from. The tool help library comes
to your aid again. All the required information is stored in the module
information of a process. You need to create a snapshot of the process using
the good old CreateToolhelp32Snapshot function, this time passing dwFlags =
TH32CS_SNAPMODULE | TH32CS_SNAPMODULE32 and th32ProcessID = {PID}. After
that, you iterate through the snapshot with the Module32First/Module32Next
functions until one of them returns FALSE. When that happens, GetLastError
returning ERROR_NO_MORE_FILES indicates that there are no more modules left
in the snapshot. On success, the Module32First/Module32Next functions fill
the MODULEENTRY32 structure with information about a module belonging to
the process (the executable image or one of its loaded DLLs), including the
base address and size of the module you’re looking for.
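
When you address structure members by raw offset from assembly, it pays to
double-check where they are. Below is a Python sketch that computes member
offsets under ordinary natural-alignment rules. The field lists are my
transcription of the 64-bit PROCESSENTRY32W and MODULEENTRY32W declarations
(pointer-sized members are 8 bytes, WCHAR arrays are 2-byte aligned), so
treat them as an assumption to verify against the SDK headers.

```python
def layout(fields):
    """Return ({name: offset}, total_size) for (name, size, align) fields."""
    offsets, offset, struct_align = {}, 0, 1
    for name, size, align in fields:
        offset = -(-offset // align) * align      # round up to alignment
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    return offsets, -(-offset // struct_align) * struct_align


PROCESSENTRY32W = [
    ("dwSize", 4, 4), ("cntUsage", 4, 4), ("th32ProcessID", 4, 4),
    ("th32DefaultHeapID", 8, 8), ("th32ModuleID", 4, 4),
    ("cntThreads", 4, 4), ("th32ParentProcessID", 4, 4),
    ("pcPriClassBase", 4, 4), ("dwFlags", 4, 4),
    ("szExeFile", 260 * 2, 2),
]
MODULEENTRY32W = [
    ("dwSize", 4, 4), ("th32ModuleID", 4, 4), ("th32ProcessID", 4, 4),
    ("GlblcntUsage", 4, 4), ("ProccntUsage", 4, 4),
    ("modBaseAddr", 8, 8), ("modBaseSize", 4, 4), ("hModule", 8, 8),
    ("szModule", 256 * 2, 2), ("szExePath", 260 * 2, 2),
]

pe_offsets, pe_size = layout(PROCESSENTRY32W)
md_offsets, md_size = layout(MODULEENTRY32W)
print(pe_offsets["th32ProcessID"], pe_offsets["szExeFile"], pe_size)  # 8 44 568
print(md_offsets["modBaseAddr"], md_offsets["modBaseSize"], md_size)  # 24 32 1080
```

Note that modBaseSize is a 32-bit value followed by four bytes of padding.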

Having the module base address (MODULEENTRY32.modBaseAddr) and size
(MODULEENTRY32.modBaseSize), you can allocate a block of memory large enough
to hold the full module dump with some memory allocation function (for
example, HeapAlloc) and then copy the module’s address space into it with
the help of the ReadProcessMemory function.

Once that is done, you simply write the memory block to a file.
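
Here is a Python sketch of the copy step. Note that it is a variation, not
the approach described above: instead of one big read it copies the module
page by page and leaves zeros where a page cannot be read, so a single
unreadable page does not spoil the whole dump. `read_memory` is a
hypothetical stand-in for ReadProcessMemory, and the 4096-byte chunk size is
an assumption.

```python
def dump_module(read_memory, base, size, chunk=0x1000):
    """Copy `size` bytes starting at `base` into a zero-filled buffer.

    read_memory(address, length) must return exactly `length` bytes,
    or None if that range could not be read.
    """
    image = bytearray(size)                      # zero-filled buffer
    for offset in range(0, size, chunk):
        length = min(chunk, size - offset)
        data = read_memory(base + offset, length)
        if data is not None:
            image[offset:offset + length] = data
    return bytes(image)


# Usage (reader and module values are placeholders):
#   image = dump_module(reader, mod_base_addr, mod_base_size)
#   with open("process_name.dump", "wb") as f:
#       f.write(image)
```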

My first memory dumper for Win64
--------------------------------

Now that you know how to code in asm for Windows and how to dump process
memory, you are able to create your own cool process dumper. However, some
practice won’t hurt, so below I’m sharing the source code of dumpp.exe, a
simple quick-and-dirty process memory dumper I created as an addition to
this tutorial for you to investigate and get more familiar with Win64
assembly. It’s a console application which takes one parameter: the name of
the process to dump. The dump is saved as a {process_name}.dump file in the
current directory.

I tried my best to comment the source code for people who are not very
familiar with assembly. If you’re one of them, you should go through it
first, because everybody knows that the best way to learn a programming
language is to read sources. And so as not to be too boring, I spiced it up
with some assembly tricks, like self-modifying code, returning from a
procedure that was never called, multiple exit locations from a procedure,
and tips on flat assembler syntax.

Save the source as ‘dumpp.asm’, or whatever name you like, then compile and
execute as “dumpp.exe process_name”.

To debug the program you may want to try x64dbg (http://x64dbg.com/).
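
The self-modifying code in the listing patches short jumps at runtime, so it
helps to know how such a jump is encoded: one opcode byte followed by a
signed 8-bit displacement counted from the end of the two-byte instruction.
The Python sketch below computes those two bytes. The helper name and the
addresses are made up for illustration.

```python
JZ_SHORT = 0x74    # opcode of `jz rel8`
JMP_SHORT = 0xEB   # opcode of `jmp rel8`


def short_jump(opcode, at, target):
    """Encode a two-byte short jump located at `at` that goes to `target`."""
    displacement = target - (at + 2)   # relative to the next instruction
    if not -128 <= displacement <= 127:
        raise ValueError("target out of rel8 range")
    return bytes([opcode, displacement & 0xFF])


print(short_jump(JMP_SHORT, 0x100, 0x100).hex())  # ebfe
```

Read as a little-endian 16-bit word, the opcode sits in the low byte and the
displacement in the high byte, which is why the listing builds the patch
value as a displacement shifted left by 8 plus 0xeb.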

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ code start ~~~~~
format pe64 console ; Create Win64 Console application
entry _entry ; Original Entry Point

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Define PROCESSENTRY32W structure used by Windows Toolhelp functions
;
struc PROCESSENTRY32W
{
.dwSize rd 1
.cntUsage rd 1
.th32ProcessID rd 1
rd 1 ; padding: the ULONG_PTR below is 8-byte aligned
.th32DefaultHeapID rq 1
.th32ModuleID rd 1
.cntThreads rd 1
.th32ParentProcessID rd 1
.pcPriClassBase rd 1
.dwFlags rd 1
.szExeFile rw 260
rw 2
}

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Define MODULEENTRY32W structure used by Windows Toolhelp functions
;
struc MODULEENTRY32W
{
.dwSize rd 1
.th32ModuleID rd 1
.th32ProcessID rd 1
.GlblcntUsage rd 1
.ProccntUsage rd 1
rd 1
.modBaseAddr rq 1
.modBaseSize rq 1 ; DWORD modBaseSize followed by 4 bytes of padding
.hModule rq 1
.szModule rw 256
.szExePath rw 260
}

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;
; Define PE32+ section containing executable code. ;
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;
section '.text' code readable writeable executable

quit:
mov rcx, r12 ; never forget to close handles and
call [CloseHandle] ; free system resources which are not
call [FreeConsole] ; needed any more
xor rcx, rcx
call [ExitProcess]

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Start program execution from here
;
_entry:
;
; Align stack to DOUBLE QUADWORD (16 bytes) and reserve 32 bytes
; of shadow space for callee parameters, as required by fastcall.
; Note: app will crash if stack is not dqword aligned.
;
sub rsp, 40

mov rbp, vars ; see notes in '.data' section for its purpose
call [AllocConsole]
;
; Get StdOut handle
;
mov ecx, -11 ; STD_OUTPUT_HANDLE
call [GetStdHandle]
mov [hStdOut], rax
;
; get command line string and convert it to array of argument strings
;
call [GetCommandLineW]
mov [lpCmdLine], rax
lea rdx, [numArgs]
mov rcx, rax
call [CommandLineToArgvW]
mov [lpArgList], rax
;
; Check if number of command line arguments passed is correct.
; Exit with error if not.
;
cmp [numArgs], dword 2
jnz error.invalid_args
;
; Take a snapshot of all system processes.
; If the function fails with INVALID_HANDLE_VALUE error code,
; then display the appropriate error message and quit.
;
xor edx, edx
mov ecx, 2 ; TH32CS_SNAPPROCESS
call [CreateToolhelp32Snapshot]
cmp rax, -1 ; INVALID_HANDLE_VALUE
jz error.create_system_snapshot
mov r12, rax ; R12 will keep snapshot handle
; registers R12-R15, RBP are not destroyed
; by WinAPI calls, so it’s convenient to use
; them to store frequently used variables.
; gives less code size and faster access to
; values compared to when stored in memory

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; The following routine iterates through the snapshot of system processes
; to retrieve information about each process with the help of Process32First
; and Process32Next Windows API functions, then checks process name to see
; if the required process is found
;
; BONUS: This routine contains a very simple example of self-modifying code.
; Check how it works using a debugger.
;
get_process_info_loop:
mov rbx, Process32FirstW
lea r13, [pcEntry] ; R13 will point to PROCESSENTRY32W
mov [r13], dword sizeof.pcEntry
;
; FASM SYNTAX NOTE:
; In flat assembler syntax, the label whose name begins with dot is treated
; as local label, and its name is attached to the name of last global label
; (with name beginning with anything but dot) to make the full name of this
; label. So you can use the short name (beginning with dot) of this label
; anywhere before the next global label is defined, and in the other places
; you have to use the full name.
; .cont and all labels starting with dot below are local for a code block
; between two global labels: get_process_info_loop and error. Within this
; block they can be addressed with their short names, i.e. jmp .cont
; To access these labels from other parts of code use full name.
; See, for example, `jnz error.get_process_info` instruction below.
;
.cont:
mov rdx, r13
mov rcx, r12
call qword [rbx]
test al, al
db 0x74 ;<- SMC part 1. The two bytes are `jz error.get_process_info`
.smc1: ; originally, but changed to `jz .finish` at runtime
db error.get_process_info - $ - 1
;
; check if process name matches the name passed on command line
;
lea rdx, [r13 + 44] ; rdx = pointer to pcEntry.szExeFile
mov rcx, [lpArgList]
mov rcx, [rcx + 8] ; rcx = pointer to the first cmd line argument
call [lstrcmpW]
test eax, eax ; strings equal?
jz stage2
.smc2: ; <- SMC part 2.
or al, al ; junk command, just to reserve 2 bytes.
; changed to `jmp .cont` at runtime
add rbx, 8 ; <- what is this for? you tell
mov byte [.smc1], .finish - .smc1 - 1 ; modify SMC part 1
mov word [.smc2], (.cont - .smc2 - 2) shl 8 + 0xeb ; modify SMC part 2
jmp .cont
.finish:
call [GetLastError]
cmp eax, 18 ; ERROR_NO_MORE_FILES
jz error.process_not_found

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Error handling routine.
; Performs preparations for displaying appropriate error messages.
;
error:
.get_process_info:
mov rdx, szErrGetProcessInfo
mov r8d, szErrGetProcessInfo.size
jmp .show

.invalid_args: ; if you don’t understand what’s going
mov r8, [lpArgList] ; on here, use debugger to find it out.
mov r8, [r8] ; HINT: run the app without command line
mov rdx, szErrInvalidArgs ; arguments to get here.
@@: ;<—————– anonymous label
lea rcx, [tmpbuf]
push rcx
call [wsprintfW]
pop rdx
mov r8d, eax
jmp .show

.process_not_found:
mov r8, [lpArgList]
mov r8, [r8 + 8]
mov rdx, szErrProcessNotFound
jmp @b ;<- jump to the nearest preceding anonymous label (above)
; use `jmp @f` to jump to the nearest following (below)

.create_system_snapshot:
mov rdx, szErrCreateSystemSnapshot
mov r8d, szErrCreateSystemSnapshot.size
jmp .show

.create_module_snapshot:
mov rdx, szErrCreateModuleSnapshot
mov r8d, szErrCreateModuleSnapshot.size
jmp .show

.get_module_info:
mov rdx, szErrGetModuleInfo
mov r8d, szErrGetModuleInfo.size
jmp .show

.open_process:
mov rdx, szErrOpenProcess
mov r8d, szErrOpenProcess.size
jmp .show

.allocate_heap:
mov rdx, szErrHeapAlloc
mov r8d, szErrHeapAlloc.size
jmp .show

.read_process_memory:
mov rdx, szErrReadProcessMemory
mov r8d, szErrReadProcessMemory.size
jmp .show

.create_file:
mov rdx, szErrorCreateFile
mov r8d, szErrorCreateFile.size

.show:
;
; Call showMessage procedure to display error message, then quit.
;
push qword quit

; ^^^^^^^
; Normally the above should have been `call showMessage` instruction,
; followed by `jmp quit`, but due to the code structure and design,
; showMessage procedure starts directly after the above part of code, so
; there’s no need in calling it. The return address can simply be pushed
; directly on the stack, which reduces code size and speeds up execution.
; Ain't asm cool? :)
;
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Uses WriteConsoleW to display a message.
; IN: RDX – pointer to Unicode string to display
; R8D – size of the string
;
showMessage:
lea r9, [numCharsWritten]
mov rcx, [hStdOut]
push rbp
mov rbp, rsp
push qword 0
sub rsp, 32
call [WriteConsoleW]
mov rsp, rbp
pop rbp
ret

stage2:
;
; open process object with PROCESS_VM_READ access
;
mov r8d, [r13 + 8] ; [pcEntry.th32ProcessID]
xor rdx, rdx
mov ecx, 16 ; PROCESS_VM_READ
call [OpenProcess]
test rax, rax
jz error.open_process
mov [hProcess], rax
;
; close previous (system processes) snapshot, not needed any more
;
mov rcx, r12
call [CloseHandle]
;
; open snapshot of desired process and include all its modules in it
;
mov edx, [r13 + 8] ; [pcEntry.th32ProcessID]
mov ecx, 0x00000018 ; TH32CS_SNAPMODULE | TH32CS_SNAPMODULE32
call [CreateToolhelp32Snapshot]
cmp rax, -1 ; INVALID_HANDLE_VALUE?
jz error.create_module_snapshot
mov r12, rax ; R12 will keep snapshot handle
;
; loop through the list of modules in the snapshot to find executable.
; quit with error message if not found or any other error occurred
;
mov rbx, Module32FirstW
@@:
lea rdx, [mdEntry]
mov [rdx], dword sizeof.mdEntry
mov rcx, r12
call qword [rbx]
test al, al
jz .error
lea rdx, [mdEntry.szModule]
mov rcx, [lpArgList]
mov rcx, [rcx + 8]
call [lstrcmpW]
test eax, eax
jz stage3
mov rbx, Module32NextW
jmp @b
.error:
mov rcx, [hProcess]
call [CloseHandle]
jmp error.get_module_info

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; The following routine can exit to different locations based on which error
; occurred. Investigate how it works.
;
stage3:
;
; allocate a heap of MODULEENTRY32.modBaseSize bytes
;
mov r15d, dword [mdEntry.modBaseSize] ; modBaseSize is a DWORD; writing
                                      ; R15D zero-extends it into R15
mov rdx, r15
xor r8, r8
xor ecx, ecx
call [HeapCreate]
test rax, rax
jnz @f
push qword error.allocate_heap
jmp .ret
@@:
mov [hHeap], rax
mov r8, r15
mov edx, 8 ; HEAP_ZERO_MEMORY
mov rcx, rax
call [HeapAlloc]
test rax, rax
jnz .read_process_memory
push qword error.allocate_heap
jmp .ret2

.read_process_memory:
mov [lpHeap], rax
;
; read (dump) process memory to allocated heap
;
mov r8, rax
mov r9, r15
mov rdx, [mdEntry.modBaseAddr]
mov rcx, [hProcess]
push rbp
mov rbp, rsp
lea rax, [numCharsWritten]
push rax
sub rsp, 32
call [ReadProcessMemory]
mov rsp, rbp
pop rbp
test al, al
jnz .save_dump
push qword error.read_process_memory
jmp .ret2

.save_dump:
;
; save memory dump to file in the current directory.
; overwrite if such file exists
;
lea r8, [mdEntry.szModule]
lea rdx, [szDumpFileName]
lea rcx, [tmpbuf]
call [wsprintfW]
xor r9, r9
mov r8d, 1 ; FILE_SHARE_READ
mov edx, 0xc0000000 ; GENERIC_READ | GENERIC_WRITE
lea rcx, [tmpbuf]
push rbp
mov rbp, rsp
push r9 ; hTemplateFile = NULL
push qword 128 ; FILE_ATTRIBUTE_NORMAL
push qword 2 ; CREATE_ALWAYS
sub rsp, 32
call [CreateFileW]
mov rsp, rbp
pop rbp
cmp rax, -1 ; CreateFileW returns INVALID_HANDLE_VALUE (-1) on failure
jnz @f
push qword error.create_file
jmp .ret2
@@:
mov [hFile], rax
lea r9, [numCharsWritten]
mov r8, r15
mov rdx, [lpHeap]
mov rcx, rax
push rbp
mov rbp, rsp
push qword 0
sub rsp, 32
call [WriteFile]
mov rsp, rbp
pop rbp
test al, al
jnz @f
call [GetLastError]
push qword error.create_file
jmp .ret3
@@:
push qword quit
.ret3: ; based on progress before error occurred
mov rcx, [hFile] ; this or that handles should or should not
call [CloseHandle] ; be released
.ret2:
mov rcx, [hHeap]
call [HeapDestroy]
.ret:
mov rcx, [hProcess]
call [CloseHandle]
ret

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;
; Define PE32+ section containing (un)initialized data. ;
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;
section '.data' data readable writeable

szDumpFileName du '%s.dump',0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Error messages
;
; FASM SYNTAX NOTE:
; `du` directive accepts the quoted string values of any length, which will
; be converted into chain of words with zeroed high byte. This way Unicode
; strings are defined.
;
szErrInvalidArgs:
du 'Intro to Win64 Assembly and Process Dumping Practice Application'
du 13, 10, 'See https://www.cybrary.it/0p3n/intro-to-win64-assembly-'
du 'and-process-dumping', 13, 10, 'Usage: %s process_name', 0

szErrCreateSystemSnapshot:
du 'Failed to create a snapshot of system processes', 0
.size = ($ - szErrCreateSystemSnapshot) / 2 ; string size in characters,
                                            ; including the terminating zero

szErrCreateModuleSnapshot:
du 'Failed to create a snapshot of process modules', 0
.size = ($ - szErrCreateModuleSnapshot) / 2

szErrGetProcessInfo:
du 'Failed to get process information from the snapshot', 0
.size = ($ - szErrGetProcessInfo) / 2

szErrGetModuleInfo:
du 'Failed to get module information from process snapshot', 0
.size = ($ - szErrGetModuleInfo) / 2

szErrProcessNotFound:
du 'No running processes with name "%s" found', 0

szErrOpenProcess:
du 'Failed to get access to the process', 0
.size = ($ - szErrOpenProcess) / 2

szErrHeapAlloc:
du 'Failed to allocate memory for process dumping', 0
.size = ($ - szErrHeapAlloc) / 2

szErrReadProcessMemory:
du 'Failed to read memory of the process', 0
.size = ($ - szErrReadProcessMemory) / 2

szErrorCreateFile:
du 'Failed to create or write to dump file', 0
.size = ($ - szErrorCreateFile) / 2

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Uninitialized global variables
;
; FASM SYNTAX NOTE:
; The `virtual` directive defines virtual data at a specified address. This
; data is not included in the output file, but labels defined there can be
; used in other parts of the source. `virtual at rbp` tells fasm that all
; labels inside the virtual data space are relative to the value of the
; RBP register. For example, `mov rcx, [hStdOut]` is assembled as
; `mov rcx, [rbp+0]`, `lea rdx, [lpCmdLine]` is assembled as
; `lea rdx, [rbp+40]` (lpCmdLine is the sixth quadword in the block below),
; and so on. Note the `mov rbp, vars` instruction at the beginning of the
; executable code. It initializes the RBP register with the base address
; of the uninitialized data. Using register-based addressing produces
; smaller code: avg. 4 bytes per `mov reg, [rbp+num]` instruction instead
; of 7-8 bytes for direct addressing `mov reg, mem`.
; `sizeof.vars = $ - $$` defines the size of the virtual data. This amount
; of data is then reserved with the `vars rb sizeof.vars` expression, which
; fasm requires to properly compute the PE section virtual size during
; executable image creation.
;
virtual at rbp
hStdOut rq 1
hProcess rq 1
hHeap rq 1
lpHeap rq 1
hFile rq 1
lpCmdLine rq 1
lpArgList rq 1
numArgs rd 1
numCharsWritten rd 2
pcEntry PROCESSENTRY32W
sizeof.pcEntry = $ - pcEntry
mdEntry MODULEENTRY32W
sizeof.mdEntry = $ - mdEntry
tmpbuf rb 512
sizeof.vars = $ - $$
end virtual

align 16
vars rb sizeof.vars

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;
; Define PE32+ import section ;
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~;
section '.idata' import data readable writeable

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
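; Import Directory Table
; (see Microsoft PE and COFF Specification, section 5.4.1)
;
; Each entry is five dwords: Import Lookup Table RVA, time/date stamp,
; forwarder chain, DLL name RVA, Import Address Table RVA. The lookup
; table RVA is left zero here, so the tables below are referenced through
; the Import Address Table field only, and the loader overwrites their
; entries with the resolved function addresses. The all-zero entry
; terminates the table.
;
dd 0, 0, 0, rva dll_kernel32, rva imports_kernel32
dd 0, 0, 0, rva dll_shell32, rva imports_shell32
dd 0, 0, 0, rva dll_user32, rva imports_user32
dd 0, 0, 0, 0, 0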

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Kernel32 Import Lookup Table
; (see Microsoft PE and COFF Specification, section 5.4.2)
;
imports_kernel32:
AllocConsole dq rva kernel32_fn_AllocConsole
CloseHandle dq rva kernel32_fn_CloseHandle
CreateFileW dq rva kernel32_fn_CreateFileW
CreateToolhelp32Snapshot dq rva kernel32_fn_CreateToolhelp32Snapshot
ExitProcess dq rva kernel32_fn_ExitProcess
FreeConsole dq rva kernel32_fn_FreeConsole
GetCommandLineW dq rva kernel32_fn_GetCommandLineW
GetLastError dq rva kernel32_fn_GetLastError
GetStdHandle dq rva kernel32_fn_GetStdHandle
HeapAlloc dq rva kernel32_fn_HeapAlloc
HeapCreate dq rva kernel32_fn_HeapCreate
HeapDestroy dq rva kernel32_fn_HeapDestroy
lstrcmpW dq rva kernel32_fn_lstrcmpW
Module32FirstW dq rva kernel32_fn_Module32FirstW
Module32NextW dq rva kernel32_fn_Module32NextW
OpenProcess dq rva kernel32_fn_OpenProcess
Process32FirstW dq rva kernel32_fn_Process32FirstW
Process32NextW dq rva kernel32_fn_Process32NextW
ReadProcessMemory dq rva kernel32_fn_ReadProcessMemory
WriteConsoleW dq rva kernel32_fn_WriteConsoleW
WriteFile dq rva kernel32_fn_WriteFile
dq 0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Shell32 Import Lookup Table
; (see Microsoft PE and COFF Specification, section 5.4.2)
;
imports_shell32:
CommandLineToArgvW dq rva shell32_fn_CommandLineToArgvW
dq 0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; User32 Import Lookup Table
; (see Microsoft PE and COFF Specification, section 5.4.2)
;
imports_user32:
wsprintfW dq rva user32_fn_wsprintfW
dq 0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; List of linked DLLs
;
dll_kernel32:
db 'Kernel32.dll', 0
dll_shell32:
db 'Shell32.dll', 0
dll_user32:
db 'User32.dll', 0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Kernel32 Hint/Name Table
; (see Microsoft PE and COFF Specification, section 5.4.3)
;
kernel32_fn_AllocConsole:
dw 0
db 'AllocConsole', 0
kernel32_fn_CloseHandle:
dw 0
db 'CloseHandle', 0
kernel32_fn_CreateFileW:
dw 0
db 'CreateFileW', 0
kernel32_fn_CreateToolhelp32Snapshot:
dw 0
db 'CreateToolhelp32Snapshot', 0
kernel32_fn_ExitProcess:
dw 0
db 'ExitProcess', 0
kernel32_fn_FreeConsole:
dw 0
db 'FreeConsole', 0
kernel32_fn_GetCommandLineW:
dw 0
db 'GetCommandLineW', 0
kernel32_fn_GetLastError:
dw 0
db 'GetLastError', 0
kernel32_fn_GetStdHandle:
dw 0
db 'GetStdHandle', 0
kernel32_fn_HeapAlloc:
dw 0
db 'HeapAlloc', 0
kernel32_fn_HeapCreate:
dw 0
db 'HeapCreate', 0
kernel32_fn_HeapDestroy:
dw 0
db 'HeapDestroy', 0
kernel32_fn_lstrcmpW:
dw 0
db 'lstrcmpW', 0
kernel32_fn_Module32FirstW:
dw 0
db 'Module32FirstW', 0
kernel32_fn_Module32NextW:
dw 0
db 'Module32NextW', 0
kernel32_fn_OpenProcess:
dw 0
db 'OpenProcess', 0
kernel32_fn_Process32FirstW:
dw 0
db 'Process32FirstW', 0
kernel32_fn_Process32NextW:
dw 0
db 'Process32NextW', 0
kernel32_fn_ReadProcessMemory:
dw 0
db 'ReadProcessMemory', 0
kernel32_fn_WriteConsoleW:
dw 0
db 'WriteConsoleW', 0
kernel32_fn_WriteFile:
dw 0
db 'WriteFile', 0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; Shell32 Hint/Name Table
; (see Microsoft PE and COFF Specification, section 5.4.3)
;
shell32_fn_CommandLineToArgvW:
dw 0
db 'CommandLineToArgvW', 0

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; User32 Hint/Name Table
; (see Microsoft PE and COFF Specification, section 5.4.3)
;
user32_fn_wsprintfW:
dw 0
db 'wsprintfW', 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ code end ~~~~~
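
The arithmetic behind the data layout in the listing can be double-checked
outside of fasm. The Python sketch below is only an illustration and not part
of the program. It assumes the label order of the `virtual at rbp` block
exactly as it appears in the listing and recomputes the RBP-relative offsets
of the leading labels, the value that `($ - label) / 2` yields for a `du`
string, and the 20 bytes that one `dd 0, 0, 0, rva name, rva table` line
emits. The helper names (`offsets`, `size_in_chars`, `import_directory_entry`)
and the RVA values 0x3000 and 0x3050 are made up for the example; the real
RVAs are computed by fasm.

```python
import struct

# Leading labels of the `virtual at rbp` block, in source order:
# (label, element size in bytes, element count). rq = 8 bytes, rd = 4 bytes.
VARS = [
    ("hStdOut", 8, 1),
    ("hProcess", 8, 1),
    ("hHeap", 8, 1),
    ("lpHeap", 8, 1),
    ("hFile", 8, 1),
    ("lpCmdLine", 8, 1),
    ("lpArgList", 8, 1),
    ("numArgs", 4, 1),
    ("numCharsWritten", 4, 2),
]


def offsets(fields):
    """Return {label: offset from RBP}, advancing like fasm's `$` does."""
    result, pos = {}, 0
    for name, size, count in fields:
        result[name] = pos
        pos += size * count
    return result


def size_in_chars(text):
    """Value of `($ - label) / 2` after `du 'text', 0`.

    `du` emits 16-bit units, so dividing the byte count by two gives the
    number of UTF-16 units, the terminating null included.
    """
    return len((text + "\0").encode("utf-16-le")) // 2


def import_directory_entry(name_rva, table_rva):
    """Pack one entry as `dd 0, 0, 0, rva name, rva table` writes it."""
    return struct.pack("<5I", 0, 0, 0, name_rva, table_rva)


if __name__ == "__main__":
    for label, off in offsets(VARS).items():
        print(f"{label:16} [rbp+{off}]")
    print(size_in_chars("Failed to get access to the process"))
    print(import_directory_entry(0x3000, 0x3050).hex())
```

Running the script prints one `[rbp+N]` line per label, which makes it easy
to compare against the operands a debugger shows for the assembled program.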

References
----------

1. Intel 64 and IA-32 Architectures Software Developer Manuals
   http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
2. AMD64 Architecture Programmer's Manuals
   http://developer.amd.com/resources/documentation-articles/developer-guides-manuals/
3. flat assembler
   http://flatassembler.net
4. Microsoft PE and COFF Specification
   https://msdn.microsoft.com/en-us/windows/hardware/gg463119.aspx
5. The PE File Format by Bernd Luevelsmeyer
   http://www.pelib.com/resources/luevel.txt
6. Overview of x64 Calling Conventions
   https://msdn.microsoft.com/en-us/library/ms235286.aspx
7. Windows Internals
   https://technet.microsoft.com/en-us/sysinternals/bb963901.aspx
8. MSDN: Tool Help Library
   https://msdn.microsoft.com/en-us/library/windows/desktop/ms686837(v=vs.85).aspx

current_user > _ EOF

3 Comments
  1. Thanks @CURRENT_USER, this was impossible to read until I saw your comment. Also, thanks for the article. If you’re reading this and can’t understand why this language looks like some moron was both drunk and high while trying to format the basics of this coding language, I highly recommend copying this article, going to http://hastebin.com and pasting the entire article. Your eyes will thank you. Thanks for the knowledge C_U.

  2. thanx for ur work

  3. Obviously the text editor sucks at preserving text formatting. Those who are interested may want to check out http://hastebin.com/raw/reqicutine for a readable version.
