Reverse Engineering for Beginners: Basic Programming Concepts
By: Dra4hos7
April 5, 2019
In this article, we will look under the hood of software. Newcomers to reverse engineering will get a general idea of the software analysis process, the general principles of how code is built, and how to read assembly code.
Note: The program code for this article was compiled with Microsoft Visual Studio 2015, so some details may look different in newer versions. IDA Pro is used as the disassembler.
Initializing variables
Variables are one of the main components of programming. They are divided into several types. Note that in C++ a string is not a primitive type, but it is important to understand how it looks in native code.
Let's look at the assembly code:
Variable initialization
Here you can see how IDA shows the allocation of space for variables. First, space is allocated for each variable, and then it is initialized.
Variable initialization
Once the space is allocated, the value we want to assign to the variable is placed in it. The initialization of most variables is shown in the picture above, but how the string is initialized is shown below.
Initializing a string variable in C++
Initializing a string requires a call to a built-in library function.
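For reference, here is a minimal sketch of the kind of C++ source that could produce listings like the ones above. The variable names follow those used later in the article (intvar, stringvar, charvar); the values, the Vars struct, and the function name initialize_variables are assumptions for illustration, since the article's original source for this step is not shown.

```cpp
#include <string>

// Hypothetical container so the initialized values can be inspected.
struct Vars {
    int intvar;
    char charvar;
    std::string stringvar;
};

// Hypothetical example: one variable of each kind, with assumed values.
Vars initialize_variables() {
    int intvar = 10;                  // primitive: usually a single store into a stack slot
    char charvar = 'A';               // primitive: a one-byte store
    std::string stringvar = "Hello";  // not primitive: initialized by a constructor call
    return Vars{intvar, charvar, stringvar};
}
```

The primitives need no function call at all, while the string is set up by library code, which matches the difference visible in the disassembly.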
Standard output function
Note: Here we will talk about how variables are pushed onto the stack and then used as parameters for the output function. The concept of a function with parameters will be discussed later.
Now let's look at the machine code. First, a string literal:
String literal output
As you can see, the string literal is first pushed onto the stack as a parameter for the call to printf().
Now let's look at the output of one of the variables:
Variable output
As you can see, the variable intvar is first placed in the EAX register, which in turn is pushed onto the stack along with the string literal %i, which denotes integer output. These values are then taken from the stack and used as parameters when calling printf().
Mathematical operations
Now we will talk about the following mathematical operations:
Addition.
Subtraction.
Multiplication.
Division.
Bitwise conjunction (AND).
Bitwise disjunction (OR).
Bitwise exclusive OR.
Bitwise negation.
Bit shift to the right.
Bit shift to the left.
void mathfunctions() {
    // math operations
    int A = 10;
    int B = 15;
    int add = A + B;
    int sub = A - B;
    int mult = A * B;
    int div = A / B;
    // and, or, xor, not are reserved alternative tokens in standard C++,
    // so these names carry a trailing underscore
    int and_ = A & B;
    int or_ = A | B;
    int xor_ = A ^ B;
    int not_ = ~A;
    int rshift = A >> B;
    int lshift = A << B;
}
Let's translate each operation into assembly code. First, the variable A is assigned the value 0A in hexadecimal, or 10 in decimal. The variable B is assigned 0F, which is 15 in decimal.
Variable initialization
For addition, the add instruction is used:
Addition
For subtraction, the sub instruction is used:
Subtraction
For multiplication, the imul instruction is used:
Multiplication
For division, the idiv instruction is used. The cdq instruction is executed first: it sign-extends EAX into the EDX:EAX register pair, doubling the size of the dividend so that idiv can operate on it and leave the quotient in EAX.
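The cdq and idiv pair can be modeled in C++ as a rough sketch. The struct DivResult and the function cdq_idiv are hypothetical names introduced here for illustration; the sketch only mirrors the arithmetic and does not reproduce the CPU fault that real hardware raises on overflow or division by zero.

```cpp
#include <cstdint>

// Hypothetical result type: idiv leaves the quotient in EAX and the remainder in EDX.
struct DivResult {
    std::int32_t quotient;
    std::int32_t remainder;
};

// Models cdq followed by idiv for 32-bit signed operands.
// The caller must not pass divisor == 0 or (INT32_MIN, -1); the real
// instruction raises a divide error in those cases.
DivResult cdq_idiv(std::int32_t eax, std::int32_t divisor) {
    // cdq: sign-extend EAX into the 64-bit pair EDX:EAX
    std::int64_t edx_eax = static_cast<std::int64_t>(eax);
    // idiv: signed division of EDX:EAX, truncating toward zero
    return DivResult{static_cast<std::int32_t>(edx_eax / divisor),
                     static_cast<std::int32_t>(edx_eax % divisor)};
}
```

With the article's values, cdq_idiv(10, 15) gives a quotient of 0 and a remainder of 10, which is why int div = A / B ends up as zero.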
The newfunc() function simply displays the message “Hello! I'm a new function!”:
void newfunc() {
    // new function without parameters
    printf("Hello! I'm a new function!");
}
newfunc() function
This function uses the retn instruction, but only to return to the previous location (so that the program can continue its work after the function ends). Let's look at the function newfuncret(), which generates a random integer using the C++ function rand() and then returns it.
int newfuncret() {
    // new function that returns something
    int A = rand();
    return A;
}
newfuncret() function
First, space is allocated for the variable A. Then rand() is called, and its result is placed in the EAX register. The value of EAX is then stored in the space allocated for A, effectively assigning the result of rand() to A. Finally, A is placed in the EAX register so that the function can use it as its return value. Now that we have seen how functions without parameters are called and what happens when a function returns a value, let's talk about calling a function with parameters.
Such a function is called as follows:
funcparams (intvar, stringvar, charvar);
Function call with parameters
Strings in C++ require a call to the basic_string function, but the concept of calling a function with parameters does not depend on the data type. First, the variable is placed in a register, then pushed from there onto the stack, and then the function is called.
Let's look at the function code:
void funcparams(int iparam, string sparam, char cparam) {
    // function with parameters
    printf("%i\n", iparam);
    printf("%s\n", sparam.c_str());  // %s expects a C string, so pass c_str()
    printf("%c\n", cparam);
}
funcparams() function
This function takes an integer, a string, and a character and prints them with printf(). As you can see, the variables are first set up at the beginning of the function and then pushed onto the stack as parameters for the calls to printf(). Very simple.
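One detail worth illustrating: the %s specifier expects a pointer to a C string, so a std::string should be passed through c_str() rather than as an object. The sketch below is a hypothetical variant of funcparams() (the name format_params and the fixed buffer size are assumptions) that formats into a buffer with snprintf instead of printing, so the result can be checked.

```cpp
#include <cstdio>
#include <string>

// Hypothetical variant of funcparams(): formats the three parameters into a
// string instead of printing them. Output longer than the buffer is truncated.
std::string format_params(int iparam, const std::string& sparam, char cparam) {
    char buffer[128];
    // The string is passed as sparam.c_str() because %s needs a const char*.
    std::snprintf(buffer, sizeof buffer, "%i\n%s\n%c\n",
                  iparam, sparam.c_str(), cparam);
    return std::string(buffer);
}
```

For example, format_params(7, "hi", 'x') yields the three values on separate lines, the same text the three printf() calls would write.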
Loops
Now that we have studied function calls, output, variables, and math, we turn to controlling the order of code execution (flow control). First, we examine the for loop:
void forloop(int max) {
    // normal for loop
    for (int i = 0; i < max; ++i) {
        printf("%i\n", i);
    }
}
Graph overview of the for loop
Before breaking the assembly code into smaller parts, let's look at the overall picture. As you can see, when the for loop starts, it has two options:
it can go to the block on the right (green arrow) and return to the main program;
it can go to the block on the left (red arrow) and go back to the beginning of the for loop.
The for loop in detail
First, the variables i and max are compared to check whether i has reached its maximum value. If i is less than max, the subroutine follows the red arrow (down and to the left), prints i, increments i by 1, and returns to the beginning of the loop. If i is greater than or equal to max, the subroutine follows the green arrow, that is, it exits the for loop and returns to the main program.
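The compare-and-jump structure described above can be written out in C++ with explicit labels. This is a simplified sketch, not the compiler's exact block layout; the function name forloop_lowered is hypothetical, and the values are collected in a vector instead of being printed so that the result can be checked.

```cpp
#include <vector>

// Hypothetical, hand-lowered version of forloop(): same control flow,
// expressed with labels and jumps the way the disassembly graph shows it.
std::vector<int> forloop_lowered(int max) {
    std::vector<int> printed;
    int i = 0;                      // loop initialization
loop_check:
    if (i >= max) goto loop_exit;   // compare i with max; exit path (green arrow)
    printed.push_back(i);           // loop body (the printf call in the article)
    ++i;                            // increment i by 1
    goto loop_check;                // back to the comparison (red arrow path)
loop_exit:
    return printed;
}
```

Calling forloop_lowered(3) collects 0, 1, 2, and a max of 0 or less skips the body entirely because the comparison runs before the first iteration.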
Now let's take a look at the while loop:
void whileloop() {
    // while loop
    int A = 0;
    while (A < 10) {
        A = 0 + (rand() % (int)(20 - 0 + 1));
    }
    printf("I'm out!");
}
While loop
In this loop, a random number from 0 to 20 is generated. If the number is greater than or equal to 10, the loop exits and the program prints “I'm out!”; otherwise, the loop continues.
In machine code, the variable A is first initialized to zero; then the loop begins and A is compared with the hexadecimal number 0A, which is 10 in decimal. If A is less than 10, a new random number is generated and written to A, and the comparison is repeated. If A is greater than or equal to 10, the loop exits and control returns to the main program.
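To make the exit condition easy to check, the sketch below replaces rand() with a list of pre-chosen values. The function name whileloop_iterations and the draws parameter are assumptions introduced for illustration; the loop condition and the modulo expression are taken from the code above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, deterministic variant of whileloop(): the values rand() would
// return are passed in, and the number of values consumed is returned.
int whileloop_iterations(const std::vector<int>& draws) {
    int A = 0;
    std::size_t used = 0;
    // The loop also stops when the supplied values run out.
    while (A < 10 && used < draws.size()) {
        A = 0 + (draws[used] % (20 - 0 + 1));  // maps a non-negative draw to 0..20
        ++used;
    }
    return static_cast<int>(used);
}
```

With the draws 3, 25, 12 the loop runs three times: 3 and 25 % 21 = 4 are both below 10, and 12 ends the loop. A first draw of exactly 10 ends it immediately, which shows that the boundary value counts as an exit.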
Conditional operator
Now let's talk about conditional statements. First, let's look at the code:
void ifstatement() {
    // conditional statements
    int A = 0 + (rand() % (int)(20 - 0 + 1));
    if (A < 15) {
        if (A < 10) {
            if (A < 5) {
                printf("less than 5");
            } else {
                printf("less than 10, greater than 5");
            }
        } else {
            printf("less than 15, greater than 10");
        }
    } else {
        printf("greater than 15");
    }
}
This function generates a random number from 0 to 20 and stores it in the variable A. If A is greater than or equal to 15, the program prints "greater than 15". If A is less than 15 but at least 10, it prints “less than 15, greater than 10”. If A is less than 10 but at least 5, it prints “less than 10, greater than 5”. If A is less than 5, it prints "less than 5".
Let's look at the assembly graph:
Assembly graph for conditional operator
The graph is structured much like the source code, because a conditional statement is simple: "If X, then Y, otherwise Z". Look at the first pair of arrows at the top: the branch is preceded by a comparison of A with 0F, which is 15 in decimal. If A is greater than or equal to 15, the subroutine prints “greater than 15” and returns to the main program. Otherwise, A is compared with 0A (10 in decimal). This continues until the program prints something and returns.
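The chain of comparisons can be checked with a small sketch that returns the message instead of printing it. The function name classify and its parameter are hypothetical; the branch structure and the messages are taken from ifstatement().

```cpp
#include <string>

// Hypothetical, testable variant of ifstatement(): takes the value as a
// parameter and returns the message that would be printed.
std::string classify(int A) {
    if (A < 15) {
        if (A < 10) {
            if (A < 5) {
                return "less than 5";
            }
            return "less than 10, greater than 5";
        }
        return "less than 15, greater than 10";
    }
    return "greater than 15";
}
```

Boundary values take the "otherwise" path of each comparison: classify(15) returns "greater than 15" and classify(10) returns "less than 15, greater than 10", even though the wording of the messages suggests a strict inequality.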
Switch statement
The switch statement is very similar to the conditional statement, except that in a switch statement one variable or expression is compared with several “cases” (possible matches). Let's look at the code:
void switchcase() {
    // switch statement
    int A = 0 + (rand() % (int)(10 - 0 + 1));
    switch (A) {
        case 0: printf("0"); break;
        case 1: printf("1"); break;
        case 2: printf("2"); break;
        case 3: printf("3"); break;
        case 4: printf("4"); break;
        case 5: printf("5"); break;
        case 6: printf("6"); break;
        case 7: printf("7"); break;
        case 8: printf("8"); break;
        case 9: printf("9"); break;
        case 10: printf("10"); break;
    }
}
In this function, the variable A gets a random value from 0 to 10. A is then compared with several cases using switch. If the value of A matches one of the cases, the corresponding number is printed, and then execution leaves the switch statement and returns to the main program.
Unlike the conditional statement, the switch statement does not follow the rule “If X, then Y, otherwise Z”. Instead, the program compares the input value with the existing cases and executes only the case that matches the input value. Let's consider the first two blocks in more detail.
The first two blocks of the switch statement
First, a random number is generated and written to A. Then the program sets up the switch statement by copying A into the temporary variable var_D0 and checking whether it matches at least one of the cases. If var_D0 matches none of them, so that the default path is needed, the program follows the green arrow to the final return section of the subroutine. Otherwise, the program jumps to the matching case.
If var_D0 (A) is equal to 5, the code jumps to the section shown above, prints “5”, and then goes to the return section.
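Compilers commonly lower a dense switch like this into a range check followed by an indexed jump through a table. The sketch below imitates that shape with a lookup table; whether the listing in the article uses exactly this form is an assumption, and the function name switchcase_lowered is hypothetical. It returns the text instead of printing it, and an empty string stands for the default path.

```cpp
#include <string>

// Hypothetical model of a jump-table switch: one range check, then an
// indexed lookup instead of eleven separate comparisons.
std::string switchcase_lowered(int A) {
    static const char* const table[] = {"0", "1", "2", "3", "4", "5",
                                        "6", "7", "8", "9", "10"};
    // Temporary copy of A, named after the variable in the disassembly.
    unsigned var_D0 = static_cast<unsigned>(A);
    // Unsigned comparison: negative values wrap to large numbers, so a single
    // check rejects everything outside 0..10 (the default path).
    if (var_D0 > 10u) {
        return "";
    }
    return table[var_D0];  // jump to the matching case
}
```

The single unsigned comparison is the reason one branch is enough to decide between "some case matches" and "go straight to the return section".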
User input
In this section, we will look at user input using a C++ input stream.
In this function, we simply read a string into the variable sentence using the C++ cin stream and then print it using printf().
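The source of this function is not shown in the article, so the following is only a sketch of what such code might look like. The function names read_sentence and read_sentence_from are hypothetical; the stream is passed as a parameter so that the same code works with std::cin and with an in-memory stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical reader: extracts one token from the stream into sentence.
// Called as read_sentence(std::cin) it would read from the keyboard.
std::string read_sentence(std::istream& in) {
    std::string sentence;
    in >> sentence;  // operator>> skips leading whitespace and stops at the next whitespace
    return sentence;
}

// Hypothetical helper: feeds fixed text through the same code path.
std::string read_sentence_from(const std::string& input) {
    std::istringstream in(input);
    return read_sentence(in);
}
```

Note that operator>> reads only up to the first whitespace character, so for the input "hello world" the variable holds just "hello"; reading a whole line would need std::getline instead.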
Let's break it down in machine code. First, cin:
cin (C++)
First, the string variable sentence is initialized; then cin is called and the entered data is written to sentence.
The C++ cin call in more detail
First, the program places the sentence variable in EAX and then pushes EAX onto the stack, where it serves as a parameter for the cin stream; then the stream operator >> is called. Its result is placed in ECX, which is then pushed onto the stack for printf():
We have covered only the basic principles of how software works at a low level. Without these fundamentals, it is impossible to understand how software works and, consequently, to analyze it.