Arrays

Course
Time: 13 hours 15 minutes
Difficulty: Beginner
CEU/CPE: 14
Video Transcription
00:01
Hello. This is Dr Miller, and this is Episode 10.1 of Assembly.
00:08
Today, we're gonna talk about pointers and arrays
00:12
Pointers.
00:14
So, pointers: what are they? Well, they point to a location in memory, and this is RAM, the RAM that you would install in your computer.
00:22
And then you can get the address of them by using one of two operations.
00:29
So we can either use the MOV assembly mnemonic, which will copy the address, or we can use LEA, which will do a calculation but won't actually load anything from memory.
00:40
So here are a couple examples,
00:42
so we can have the
00:44
array text, and we can move that into EAX, and that will load the address.
00:50
Or, similarly, we can say, get the data at that address. So the data at text will be loaded into EAX, and this will move a dword.
01:00
So here's another example. We can copy that address into EAX, we can add two to it, which is gonna be called our offset, and then we can get the data at that location.
01:11
So those are a couple of examples.
01:14
And then if we want to use LEA, or load effective address,
01:17
we can basically say, load me the address of the variable text, right? So it does the equivalent of that MOV.
01:26
And then here we can just copy the data in if we want, or we could even use EAX inside the brackets.
01:32
And then here we can use LEA to do this calculation. So it'll take the address of text, add the offset in EBX to that, and then store that address in EBX. And so that's one of the benefits of LEA: we can do that calculation in one instruction.
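Editor's sketch: the slides are not part of this transcript, so the following C fragment is only an illustration of the three ideas just described (taking an address, computing base plus offset without touching memory, and loading the data at the computed address). The byte array `text` and the function names are hypothetical stand-ins, and the NASM-style instructions in the comments are rough equivalents rather than the lecture's exact code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the `text` label used in the lecture. */
static const uint8_t text[] = { 'p', 'o', 'i', 'n', 't', 0 };

/* Roughly `mov eax, text` or `lea eax, [text]`: yields the address only. */
const uint8_t *address_of_text(void) {
    return text;
}

/* Roughly `lea ebx, [text + ebx]`: computes base + offset, reads no memory. */
const uint8_t *effective_address(size_t offset) {
    assert(offset < sizeof text);   /* stay inside the array */
    return text + offset;
}

/* Roughly `mov al, [text + ebx]`: dereferences the computed address. */
uint8_t load_byte(size_t offset) {
    return *effective_address(offset);
}
```

Calling `effective_address(2)` only performs arithmetic on the address, while `load_byte(2)` actually reads the byte stored there.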
01:48
Arrays.
01:49
So what are arrays? An array is a region of contiguous memory, in the sense that it is all allocated at once, and
01:59
it is contiguous in that there are no breaks in between its elements,
02:04
and this ignores things like memory segmentation and pages.
02:07
And then an array is defined by the element size, or how big each element is, and then how long it is. So when you define an array, you're gonna have to tell it how big each element is and how many elements are going to be in that array.
02:21
And then if you're going to actually use the array, you need the base address, or the address of the beginning of the array, and the offset that you're looking at. And then you need to know the size of each element inside that array.
02:34
So here are some examples of defining arrays. We can either use the data section where we actually give them values.
02:39
So here's an array of dwords, an array of words, and an array of bytes.
02:45
Or we can use the BSS section. And so this is an array of dwords, but we don't give them initial values. We just say that I'd like five dwords,
02:53
or I'd like five words, or I'd like five bytes.
02:59
And then inside of C, we have a couple of different ways of actually allocating arrays.
03:04
And this is just a little extra info. You can either allocate them on the stack or you can use malloc, and those are the two typical ways that we define arrays inside of C.
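Editor's sketch: a minimal C illustration of the two allocation styles mentioned here, a stack array and a `malloc`-allocated array. The function names and the values 1 through 5 are assumptions chosen for illustration, not code from the lecture.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Array on the stack: its storage exists only while this function runs. */
uint32_t sum_stack_array(void) {
    uint32_t values[5] = { 1, 2, 3, 4, 5 };
    uint32_t total = 0;
    for (size_t i = 0; i < 5; i++) {
        total += values[i];
    }
    return total;
}

/* Array on the heap: malloc returns the base address of the block.
 * The caller owns the block and must release it with free(). */
uint32_t *make_heap_array(size_t count) {
    assert(count > 0);
    uint32_t *values = malloc(count * sizeof *values);
    if (values == NULL) {
        return NULL;                /* allocation failed */
    }
    for (size_t i = 0; i < count; i++) {
        values[i] = (uint32_t)(i + 1);
    }
    return values;
}
```

In both cases the size passed to the allocator is element count times element size, the same two quantities that define an array in the assembly data and BSS sections.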
03:15
And then C uses a bracket-style notation to do the offset of the index. But we don't have this available to us in assembly, so we have to actually do it manually.
03:28
So what is the offset? The offset represents how much we're going to add to the beginning of the array. So how far into the array are we going to go?
03:36
And that uses those pointers? And so you take a pointer and then you add the offset in order to
03:42
get the data at a particular location.
03:45
And then you have to do the calculation of how big each element is in the array. And so the standard sizes, as we've seen, are bytes, words, and dwords.
03:54
And so, for a character array or a string, right, each index step is going to be one byte. But if you have a bunch of dwords, then our index is gonna be two,
04:05
or sorry, if we have a bunch of words, that's gonna be two. And if we have a bunch of dwords, that's gonna be four.
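Editor's sketch: the element sizes above (1 byte, 2 bytes for a word, 4 bytes for a dword) determine the byte offset of each index. The C fragment below spells out the arithmetic that C's bracket notation performs automatically. The function names are hypothetical, and `uint8_t`, `uint16_t`, and `uint32_t` are used here as stand-ins for byte, word, and dword.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` when each element is `element_size` bytes. */
size_t byte_offset(size_t index, size_t element_size) {
    return index * element_size;
}

/* Manual equivalent of array[index] for a dword array:
 * start at the base address, add index * 4 bytes, then load a dword. */
uint32_t dword_at(const uint32_t *array, size_t count, size_t index) {
    assert(index < count);
    const uint8_t *base = (const uint8_t *)array;
    const uint8_t *addr = base + byte_offset(index, sizeof(uint32_t));
    return *(const uint32_t *)addr;
}
```

For index 3, the byte offset is 3 in a byte array, 6 in a word array, and 12 in a dword array.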
04:13
So here's an example. We have several arrays that we defined earlier,
04:16
and then we're going to read in an integer, and then we're going to move into EAX the value at array plus our offset.
04:26
And so
04:28
array1 is a bunch of dwords, and we're gonna take the beginning of the array and add that offset. And then we're just gonna print what the integer is. So hopefully we get 1, 2, 3, 4, 5.
04:39
But we see when we run that code that when we type zero, we get one, which is what we expected. But when we type one,
04:46
we get 33554432,
04:49
or for two, we get 131072.
04:55
So why is this?
04:56
Well, if we look back at the beginning here, right, we said this is an array of dwords, but we are only adding one each time through. But a dword is four bytes. So let's look at how this is stored.
05:08
So in memory, this is how dwords are laid out. Each of these is a byte, and I've written them in hex. And so we have the number one, and that is at the base address. And then the next three bytes
05:21
are going to be filled with zeros. Then the number two, and these are filled with zeros, and the number three, and these are filled with zeros.
05:29
So our base address is right here. But we're adding one, which is still part of our first array element.
05:36
And so
05:39
if we look at reading a dword, right, that's gonna be four bytes starting at offset zero.
05:46
And so we read these four bytes, and this is our number. And then we convert that from hex into decimal.
05:51
But if we load offset one, that's the data starting right here, and we're loading this set of bytes.
05:58
Well, we can see we have the number 0x02000000 in hex, and if we convert that to base 10, we get 33554432.
06:10
And so the problem is that we're not adding the right offset. Let's look at one more example. So if we do offset two, which we also printed, we get the number 0x00020000 in hex.
06:23
And if we take that and we convert that into decimal we get this number here, so we start reading at this location and we go this way, and that is the number that we read.
06:31
And so because we're not adding the right offset
06:34
because we're not multiplying by the size of each element in the array, we're getting the wrong values.
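Editor's sketch: the wrong values can be reproduced in C by writing the array out byte by byte and reading four bytes from an unscaled offset. The array name `array1_bytes` and the function name are hypothetical. The bytes are combined in little-endian order explicitly, which is how a 32-bit x86 load interprets them, so the result does not depend on the machine running the C code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The dwords 1, 2, 3, 4, 5 as they sit in memory on x86 (little-endian). */
static const uint8_t array1_bytes[20] = {
    0x01, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x00, 0x00,
    0x04, 0x00, 0x00, 0x00,
    0x05, 0x00, 0x00, 0x00,
};

/* Reads four bytes starting at `byte_offset` and combines them
 * little-endian, like a dword load from [array1 + byte_offset]. */
uint32_t read_dword_at_byte_offset(size_t byte_offset) {
    assert(byte_offset + 4 <= sizeof array1_bytes);
    const uint8_t *p = array1_bytes + byte_offset;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Byte offset 0 gives 1, byte offset 1 gives 0x02000000 (33554432), and byte offset 2 gives 0x00020000 (131072), matching the output described in the lecture.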
06:43
So to fix that here, when we read an integer, we multiply our offset by four, or the size of each dword, and then get the data at array plus whatever that offset is, and then we can print that.
06:56
And when we run that, if we enter offset zero, we get one. If we enter one, we get two; two, we get three; three, we get four; and four, we get five,
07:04
which, as we look here, those are the values that are in our array, right? So it's reading each one of those properly,
07:11
and so you've got to know the size of each element and then multiply by that so that you can
07:16
get the correct data given the size of each element in the array.
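Editor's sketch: the fix, scaling the index by the element size before adding it to the base, can be illustrated in C over the same byte layout. `DWORD_SIZE`, `array1_bytes`, and `read_array1` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DWORD_SIZE 4u

/* The dwords 1, 2, 3, 4, 5 as they sit in memory on x86 (little-endian). */
static const uint8_t array1_bytes[20] = {
    0x01, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x00, 0x00,
    0x04, 0x00, 0x00, 0x00,
    0x05, 0x00, 0x00, 0x00,
};

/* Returns element `index` by first scaling the index to a byte offset. */
uint32_t read_array1(size_t index) {
    size_t byte_offset = index * DWORD_SIZE;   /* the fix: index * 4 */
    assert(byte_offset + DWORD_SIZE <= sizeof array1_bytes);
    const uint8_t *p = array1_bytes + byte_offset;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With the scaling in place, indexes 0 through 4 return 1 through 5.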
07:20
And so what you should do is maybe try
07:24
taking this code and using array2 and figuring out how to
07:29
go through that array.
07:33
So strings.
07:38
So a string is an array of characters
07:41
where characters are defined by the ASCII standard, and that's what those values are. And then each element is a byte. ASCII actually uses seven bits, but we fill it out to eight so that it matches the size that our computers are able to read and store.
07:57
And then at the end of the array, we have a zero or null byte, hex 00, and this tells us that we're done with the string,
08:05
and that is the standard that is used by nearly all programming languages in order to store the fact that we're at the end of a string of characters.
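Editor's sketch: because the zero byte marks the end, the length of such a string can be found by walking the array until that byte appears. The function below is a hypothetical hand-written version of that idea in C, for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Counts bytes up to, but not including, the terminating zero byte. */
size_t string_length(const char *s) {
    assert(s != NULL);
    size_t length = 0;
    while (s[length] != '\0') {
        length++;
    }
    return length;
}
```

The terminator itself is not counted, so an array holding 'H', 'i', 10, 0 has length 3.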
08:16
And so here we have defined an example. So we have string, and it says "This is a test", and then it has the 10, which is a newline, and then a zero, which terminates our string.
08:28
And we're actually using a couple of functions. So this code is gonna figure out how long this string is and then go ahead and print it without using print_string.
08:37
So a couple of functions that we're gonna use: strlen and putchar. strlen will figure out the length of the string; putchar will print a single character to the console.
08:48
So here, in order to use strlen, we have to figure out the address of our string. So we load that address into EAX, push that onto the stack, and then call strlen, and then EAX will have the length of our string.
09:01
And then we can add four to correct the stack. And then we're going to write a little loop here. And so to initialize the loop, we're going to use ECX, because we're gonna loop, right? We're gonna loop to print.
09:11
So we copy that length into ECX so we can do our loop properly.
09:16
EDX is gonna be our index, the index or offset that we're going to use within the array.
09:22
And then EBX is going to have the address of the string.
09:28
And so here, when we copy the data in, we're gonna copy into EAX the data at
09:33
EBX plus EDX, so EBX is our base and EDX is our offset.
09:37
And then all this code right here is to save the registers. We save these three registers because putchar will clobber them.
09:46
We then push on the argument that we want. So we want to print the character in EAX, so we push that onto the stack and call putchar. Now that single character gets printed to the screen,
09:56
and then we have to correct the stack. So we pushed it, so we might as well pop it.
10:01
And then here we are restoring our registers,
10:05
and then each time through the loop, we're gonna increment EDX so EDX will step through the array.
10:11
So it'll go from index 0, 1, 2, 3, 4. At the same time, ECX is being decremented by our loop, right? We're going back up to the top each time because we know how many times we need to print. So we're going to go through and print each one of these characters, and we don't have to print the zero, and so we just go all the way through to the end.
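Editor's sketch: the loop just described can be mirrored in C with the same two library functions, `strlen` and `putchar`. The variable names and the mapping of each variable to a register in the comments are assumptions added for illustration; the lecture's actual assembly listing is not part of this transcript.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Prints s one character at a time and returns how many were printed. */
size_t print_with_putchar(const char *s) {
    assert(s != NULL);
    const char *base = s;           /* role of EBX: base address of the string */
    size_t remaining = strlen(s);   /* role of ECX: iterations left */
    size_t index = 0;               /* role of EDX: offset into the array */
    while (remaining > 0) {
        putchar((unsigned char)base[index]);   /* load base + index, print it */
        index++;                               /* step to the next character */
        remaining--;                           /* count down toward zero */
    }
    return index;
}
```

Because the count comes from `strlen`, the terminating zero byte is never printed.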
10:33
So today we talked about pointers and arrays and a very specific type of array called a string.
10:39
And in the future, we're gonna have some examples of arrays, and we're gonna actually look at some more in-depth string operations that we can have.
10:48
So here's our quiz.
10:50
What is a pointer?
10:54
It points to a location in memory.
10:56
And then what info do you need to access an array?
11:01
Well, you need the base address, or pointer. You need the offset. And you need to know how big each one of the elements is.
11:09
So if you have questions, you can email me, millermj at unk.edu, and you can find me on Twitter at Milhouse 30.
Assembly

This course will provide background and information related to programming in assembly. Assembly is the lowest level programming language which is useful in reverse engineering and malware analysis.
