
Reverse Engineering for Beginners: Basic Programming Concepts


By: Dra4hos7

April 5, 2019

In this article we look under the hood of software. Newcomers to reverse engineering will get a general idea of the software research process, the general principles of how code is built, and how to read assembly code.

Note: The code for this article was compiled with Microsoft Visual Studio 2015, so some details may differ in newer versions. IDA Pro is used as the disassembler.

Initializing variables

Variables are one of the main building blocks of programming. They come in several types; here are some of them:

  • string;
  • integer;
  • boolean;
  • character;
  • double-precision floating-point number;
  • single-precision floating-point number;
  • array of characters.

Standard variables:

string stringvar = "Hello World";
int intvar = 100;
bool boolvar = false;
char charvar = 'B';
double doublevar = 3.1415;
float floatvar = 3.14159265;
char carray[] = {'a', 'b', 'c', 'd', 'e'};

Note: in C++, a string is not a primitive type, but it is important to understand how it looks in native code.

Let's look at the assembly code:

Variable initialization

Here you can see how IDA shows the allocation of space for variables. First, space is allocated for each variable, and then it is initialized.

Variable initialization

Once the space is allocated, the value we want to assign to the variable is placed in it. The initialization of most variables is shown in the picture above, but how the string is initialized is shown below.

Initializing a string variable in C++

Initializing the string requires a call to a library function (a basic_string routine) rather than a simple store.
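The difference between the two kinds of initialization can be checked at compile time. The sketch below is only an illustration: the helper name is_scalar_or_scalar_array is hypothetical, and it uses the standard type traits as a rough proxy for "can be initialized with plain stores" versus "needs a constructor call".

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: true for scalar types (int, bool, char, double, float)
// and arrays of them, which a compiler can typically initialize by writing
// constants straight into the reserved stack space.
template <typename T>
constexpr bool is_scalar_or_scalar_array() {
    return std::is_scalar<typename std::remove_all_extents<T>::type>::value;
}

// std::string is a class type, so creating one goes through a constructor,
// which shows up in the disassembly as a function call.
static_assert(is_scalar_or_scalar_array<int>(), "int is a scalar");
static_assert(is_scalar_or_scalar_array<char[5]>(), "char array is an array of scalars");
static_assert(!is_scalar_or_scalar_array<std::string>(), "std::string is a class type");
```

This is why the listing for stringvar contains a call while the other variables are initialized inline.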

Standard output function

Note: Here we discuss how variables are pushed onto the stack and then used as parameters for the output function. Functions with parameters are covered later.

For output we use printf() rather than cout.

Standard output:

printf("Hello String Literal");
printf("%s", stringvar.c_str());
printf("%i", intvar);
printf("%c", charvar);
printf("%f", doublevar);
printf("%f", floatvar);
printf("%c", carray[3]);

Now look at the machine code. First, the string literal:

String literal output

As you can see, the string literal is first pushed onto the stack as a parameter for the printf() call.

Now let's look at the output of one of the variables:

Variable output

As you can see, the variable intvar is first placed in the EAX register, which in turn is pushed onto the stack along with the string literal %i, which denotes integer output. These values are then taken from the stack and used as parameters when printf() is called.
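The order of the pushes can be modelled with a toy function. This is not real machine code, only a sketch that assumes the 32-bit cdecl convention used for variadic functions such as printf(), where arguments are pushed from right to left; the name cdecl_push_order is hypothetical.

```cpp
#include <string>
#include <vector>

// Toy model: given the arguments of a call in source order, return the order
// in which a cdecl caller pushes them onto the stack (right to left).
std::vector<std::string> cdecl_push_order(const std::vector<std::string>& args) {
    return std::vector<std::string>(args.rbegin(), args.rend());
}
```

For printf("%i", intvar) the model gives intvar first and the format string last, so the format string ends up on top of the stack when the call instruction executes.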

Mathematical operations

Now we will talk about the following mathematical operations:

  1. Addition.
  2. Subtraction.
  3. Multiplication.
  4. Division.
  5. Bitwise conjunction (AND).
  6. Bitwise disjunction (OR).
  7. Bitwise exclusive OR.
  8. Bitwise negation.
  9. Bit shift to the right.
  10. Bit shift to the left.
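void mathfunctions() {
    // math operations
    int A = 10;
    int B = 15;
    int add = A + B;
    int sub = A - B;
    int mult = A * B;
    int div = A / B;
    // and, or, xor and not are reserved alternative tokens in standard C++,
    // so the bitwise results get a bit_ prefix
    int bit_and = A & B;
    int bit_or = A | B;
    int bit_xor = A ^ B;
    int bit_not = ~A;
    int rshift = A >> B;
    int lshift = A << B;
}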
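When reading the disassembly it helps to know which values to expect in the registers. The sketch below evaluates the same ten operations; the MathResults struct and the compute function are hypothetical names introduced only for this illustration.

```cpp
// Hypothetical helper: bundles the results of the ten operations.
struct MathResults {
    int add, sub, mult, div, bit_and, bit_or, bit_xor, bit_not, rshift, lshift;
};

// Evaluates every operation from the listing for the given A and B.
// B must be in the range 0..31 for the shifts to be well defined.
constexpr MathResults compute(int A, int B) {
    return MathResults{A + B, A - B, A * B, A / B,
                       A & B, A | B, A ^ B, ~A,
                       A >> B, A << B};
}
```

For A = 10 and B = 15 this gives 25, -5, 150, 0, 10, 15, 5, -11, 0 and 327680 respectively.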

Let's see how each operation translates into assembly code.

First, the variable A is assigned the value 0A in hexadecimal, or 10 in decimal. The variable B is assigned 0F, which is 15 in decimal.

Variable initialization

Addition uses the add instruction:

Addition

Subtraction uses the sub instruction:

Subtraction

Multiplication uses the imul instruction:

Multiplication

Division uses the idiv instruction. The cdq instruction is executed first: it sign-extends EAX into the EDX:EAX register pair, doubling the width of the dividend as idiv requires. The quotient is returned in EAX.

Division
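The interplay of cdq and idiv can be modelled in a few lines. This is a simplified sketch, not an emulator: the Regs struct and the function names are hypothetical, and the model ignores the fault that a real idiv raises on division by zero or on overflow.

```cpp
#include <cstdint>

// Hypothetical model of the two registers involved in 32-bit signed division.
struct Regs {
    std::int32_t eax;
    std::int32_t edx;
};

// cdq: copy the sign bit of EAX into every bit of EDX.
inline Regs cdq(std::int32_t eax) {
    return Regs{eax, eax < 0 ? -1 : 0};
}

// idiv: divide the 64-bit value EDX:EAX by a 32-bit divisor.
// The quotient goes to EAX and the remainder to EDX.
inline Regs idiv(Regs r, std::int32_t divisor) {
    // Assemble EDX:EAX through unsigned types to avoid shifting a negative value.
    std::uint64_t bits =
        (static_cast<std::uint64_t>(static_cast<std::uint32_t>(r.edx)) << 32) |
        static_cast<std::uint32_t>(r.eax);
    std::int64_t dividend = static_cast<std::int64_t>(bits);
    return Regs{static_cast<std::int32_t>(dividend / divisor),
                static_cast<std::int32_t>(dividend % divisor)};
}
```

With A = 10 and B = 15, idiv(cdq(10), 15) leaves 0 in EAX (the quotient) and 10 in EDX (the remainder).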

Bitwise conjunction uses the and instruction:

Bitwise conjunction

Bitwise disjunction uses the or instruction:

Bitwise Disjunction

Bitwise exclusive OR uses the xor instruction:

Bitwise exclusive OR

Bitwise negation uses the not instruction:

Bitwise negation

A bit shift to the right uses the sar instruction (an arithmetic shift, because the operands are signed):

Bit shift to the right

A bit shift to the left uses the shl instruction:

Bit shift left
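The right shift is compiled to sar rather than shr because the variables are signed. The following sketch contrasts the two; the function names simply mirror the instruction mnemonics and are otherwise hypothetical, and the shift count is assumed to be in the range 0..31.

```cpp
#include <cstdint>

// sar: arithmetic shift right. The sign bit is replicated, so negative
// values stay negative. Written without relying on how the compiler
// shifts negative numbers.
inline std::int32_t sar(std::int32_t value, unsigned count) {
    return value < 0 ? ~(~value >> count) : value >> count;
}

// shr: logical shift right. Zeros are shifted in from the left.
inline std::uint32_t shr(std::uint32_t value, unsigned count) {
    return value >> count;
}
```

For non-negative values such as A = 10 both instructions give the same result; they differ only when the top bit is set.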

Calling functions

We will look at three kinds of functions:

  1. A function that does not return a value (void).
  2. A function that returns an integer.
  3. Function with parameters.

Function call:

newfunc();
newfuncret();
funcparams(intvar, stringvar, charvar);

First, let's see how newfunc() and newfuncret() are called; both take no parameters.

Calling functions without parameters

The function newfunc() simply prints the message “Hello! I'm a new function!”:

void newfunc() {
    // new function without parameters
    printf("Hello! I'm a new function!");
}

newfunc() function

This function uses the retn instruction, but only to return to the previous location (so that the program can continue after the function ends). Now let's look at newfuncret(), which generates a random integer with the rand() function and then returns it.

int newfuncret() {
    // new function that returns something
    int A = rand();
    return A;
}

newfuncret() function

First, space is allocated for the variable A. Then rand() is called, and its result is placed in the EAX register. The value of EAX is then stored in the space allocated for A, effectively assigning the result of rand() to the variable A. Finally, A is loaded back into EAX so that the function can return it as its return value.

Now that we have seen how functions without parameters are called and what happens when a function returns a value, let's turn to calling a function with parameters.

Such a function is called as follows:

funcparams(intvar, stringvar, charvar);
Function call with parameters

Strings in C++ require a call to a basic_string function, but the mechanics of calling a function with parameters do not depend on the data type. First the variable is placed in a register, then pushed from there onto the stack, and then the function is called.
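The extra call appears because a class object passed by value has to be copied into the parameter. The sketch below makes that copy visible with a counter; the Tracked type and the function names are hypothetical stand-ins for std::string and funcparams().

```cpp
// Hypothetical stand-in for a class type such as std::string:
// it counts how many times its copy constructor runs.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

void by_value(Tracked) {}             // parameter is a fresh copy of the argument
void by_reference(const Tracked&) {}  // parameter refers to the caller's object

// Returns the number of copy-constructor calls caused by one by-value call.
int copies_for_by_value_call() {
    Tracked::copies = 0;
    Tracked t;
    by_value(t);
    return Tracked::copies;
}

// Returns the number of copy-constructor calls caused by one by-reference call.
int copies_for_by_reference_call() {
    Tracked::copies = 0;
    Tracked t;
    by_reference(t);
    return Tracked::copies;
}
```

Passing by value costs one constructor call, which is what the basic_string call in the listing corresponds to; passing by const reference would avoid it.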

Let's look at the function code:

void funcparams(int iparam, string sparam, char cparam) {
    // function with parameters
    printf("%i\n", iparam);
    printf("%s\n", sparam.c_str());
    printf("%c\n", cparam);
}

funcparams() function

This function takes an integer, a string and a character, and prints them with printf(). As you can see, the variables are first set up at the beginning of the function and then pushed onto the stack as parameters for the printf() calls. Very simple.

Loops

Now that we have covered function calls, output, variables, and arithmetic, we turn to controlling the order of code execution (flow control). First we examine the for loop:

void forloop(int max) {
    // normal for loop
    for (int i = 0; i < max; ++i) {
        printf("%i\n", i);
    }
}

Graph overview of the for loop

Before breaking the assembly code into smaller parts, let's look at the overall picture. As you can see, when the for loop starts, it has two options:

  • it can go to the block on the right (green arrow) and return to the main program;
  • it can go to the block on the left (red arrow) and go back to the beginning of the for loop.

The for loop in detail

First, the variables i and max are compared to check whether i has reached its maximum value. If i is less than max, the subroutine follows the red arrow (down and to the left), prints i, increments i by 1, and returns to the beginning of the loop. If i is greater than or equal to max, the subroutine follows the green arrow, that is, it exits the for loop and returns to the main program.
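The same control flow can be written in C++ with explicit jumps, which is close to what the graph shows. This is only a sketch: the label names are hypothetical, the instructions in the comments are typical rather than copied from the listing, and the values are collected in a vector instead of being printed so the result is easy to check.

```cpp
#include <vector>

// The for loop from forloop(), rewritten as a compare plus conditional jump.
std::vector<int> for_loop_as_jumps(int max) {
    std::vector<int> printed;
    int i = 0;                       // initialize the counter

loop_check:
    if (i >= max) goto loop_exit;    // cmp i, max; jump if greater or equal (green arrow)
    printed.push_back(i);            // loop body (red arrow); the original calls printf
    ++i;                             // increment the counter
    goto loop_check;                 // jump back to the comparison

loop_exit:
    return printed;
}
```

Calling for_loop_as_jumps(3) visits the body for i = 0, 1 and 2, then takes the exit branch.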

Now let's take a look at the while loop:

void whileloop() {
    // while loop
    int A = 0;
    while (A < 10) {
        A = 0 + (rand() % (int)(20 - 0 + 1));
    }
    printf("I'm out!");
}

While loop

In this loop, a random number from 0 to 20 is generated. If the number is greater than or equal to 10, the loop exits and the program prints “I'm out!”; otherwise the loop continues.

In machine code, the variable A is first initialized to zero. Then the loop begins and A is compared with the hexadecimal number 0A, which is 10 in decimal. If A is less than 10, a new random number is generated and written to A, and the comparison is repeated. If A is greater than or equal to 10, the program exits the loop and returns to the main program.
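Because rand() makes the loop hard to follow step by step, here is a deterministic sketch of the same exit condition. The function name and the idea of feeding the "random" values from a list are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <vector>

// Runs the while loop with a fixed list of values in place of rand() and
// returns how many iterations were executed before the loop exited.
int iterations_until_exit(const std::vector<int>& draws) {
    int A = 0;
    std::size_t next = 0;
    int iterations = 0;
    while (A < 10 && next < draws.size()) {  // same test as cmp A, 0Ah
        A = draws[next++];                   // stands in for the rand() call
        ++iterations;
    }
    return iterations;
}
```

With the values 3, 9, 10, 4 the loop runs three times: 3 and 9 keep it going, 10 ends it, and 4 is never read.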

Conditional statements

Now let's talk about conditional statements. First, let's look at the code:

void ifstatement() {
    // conditional statements
    int A = 0 + (rand() % (int)(20 - 0 + 1));
    if (A < 15) {
        if (A < 10) {
            if (A < 5) {
                printf("less than 5");
            } else {
                printf("less than 10, greater than 5");
            }
        } else {
            printf("less than 15, greater than 10");
        }
    } else {
        printf("greater than 15");
    }
}

This function generates a random number from 0 to 20 and stores it in the variable A. If A is greater than or equal to 15, the program prints "greater than 15". If A is less than 15 but at least 10, it prints “less than 15, greater than 10”. If A is less than 10 but at least 5, it prints “less than 10, greater than 5”. If A is less than 5, it prints "less than 5".

Let's look at the assembly graph:

Assembly graph for conditional operator

The graph is structured much like the source code, because a conditional statement is simple: "If X, then Y, otherwise Z". Looking at the first pair of arrows at the top, the branch is preceded by a comparison of A with 0F, which is 15 in decimal. If A is greater than or equal to 15, the subroutine prints “greater than 15” and returns to the main program. Otherwise, A is compared with 0A (10 in decimal). This continues until the program prints something and returns.
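The chain of comparisons in the graph can be flattened into the following sketch, which returns the message instead of printing it so that the boundary values are easy to check. The function name classify is hypothetical, and the comparison with 5 in the last step is inferred from the source code rather than from the listing.

```cpp
#include <string>

// Same decisions as ifstatement(), in the order the graph performs them.
std::string classify(int A) {
    if (A >= 15) return "greater than 15";                // compare with 0Fh
    if (A >= 10) return "less than 15, greater than 10";  // compare with 0Ah
    if (A >= 5)  return "less than 10, greater than 5";   // compare with 5
    return "less than 5";
}
```

Note that the boundary values 15, 10 and 5 fall into the upper branch each time, even though the messages say "greater than".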

The switch statement

The switch statement is very similar to the conditional statement, except that a single variable or expression is compared with several “cases” (possible matching values). Let's see the code:

void switchcase() {
    // switch statement
    int A = 0 + (rand() % (int)(10 - 0 + 1));
    switch (A) {
        case 0: printf("0"); break;
        case 1: printf("1"); break;
        case 2: printf("2"); break;
        case 3: printf("3"); break;
        case 4: printf("4"); break;
        case 5: printf("5"); break;
        case 6: printf("6"); break;
        case 7: printf("7"); break;
        case 8: printf("8"); break;
        case 9: printf("9"); break;
        case 10: printf("10"); break;
    }
}

In this function, the variable A gets a random value from 0 to 10. A is then compared with several cases using switch. If the value of A matches one of the cases, the corresponding number is printed, after which execution leaves the switch statement and returns to the main program.

Unlike the conditional statement, the switch statement does not follow the rule “If X, then Y, otherwise Z”. Instead, the program compares the input value with the existing cases and executes only the case that matches it. Consider the first two blocks in more detail.

The first two blocks of the switch statement

First, a random number is generated and written to A. The program then sets up the switch statement by copying A into the temporary variable var_D0, and checks whether it matches at least one of the cases. If it does not, so that the default path is needed, the program follows the green arrow to the final return section of the subroutine. Otherwise, the program jumps to the matching case.

If var_D0 (A) is equal to 5, the code goes to the section shown above, prints “5” and then goes to the return section.
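Compilers often lower a switch with dense case values into a range check followed by an indexed jump through a table. Whether the listing above uses exactly this scheme is an assumption, and the function name below is hypothetical; the sketch only shows the general idea, returning the text instead of printing it.

```cpp
#include <string>

// Range check plus table lookup, standing in for a jump table.
std::string switch_via_jump_table(int A) {
    static const char* const table[] = {
        "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10"
    };
    // One unsigned comparison rejects negative values and values above 10.
    unsigned index = static_cast<unsigned>(A);
    if (index > 10u) {
        return "";            // no case matches: go straight to the return block
    }
    return table[index];      // the indexed jump selects the matching case
}
```

This explains why the graph shows a single check before the cases instead of eleven separate comparisons.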

User input

In this section, we look at user input using the cin stream from C++. First, look at the code:

void userinput() {
    // keyboard input
    string sentence;
    cin >> sentence;
    printf("%s", sentence.c_str());
}

In this function, we simply read a string into the variable sentence using the C++ cin stream and then print it using printf().

Let's examine it in machine code. First, the cin call:

cin (C++)

First, the string variable sentence is initialized, then cin is called and the entered data is written to sentence.

The C++ cin call in more detail

First, the program places the sentence variable in EAX, then pushes EAX onto the stack, where it serves as a parameter for the cin stream; then the stream operator >> is called. Its result is placed in ECX, which is then pushed onto the stack for the printf() call.
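One detail of operator >> that matters when testing such a program is that it reads only a single whitespace-delimited word, despite the variable being called sentence. The sketch below shows this with a string stream in place of the keyboard; the function names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Reads one word the same way userinput() does, but from any input stream.
std::string read_word(std::istream& in) {
    std::string sentence;
    in >> sentence;   // skips leading whitespace, stops at the next whitespace
    return sentence;
}

// Convenience wrapper that feeds a fixed text to read_word().
std::string read_word_from_text(const std::string& text) {
    std::istringstream in(text);
    return read_word(in);
}
```

For the input "hello world" only "hello" ends up in the variable; reading a whole line would require std::getline instead.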

We have covered only the basic principles of how software works at a low level. Without these fundamentals, it is impossible to understand how software works and, accordingly, to research it.
