Strings in C

13 hours 15 minutes
Video Transcription
Hello. This is Dr Miller, and this is Episode 14.15 of Assembly.
Today we're gonna talk about strings in C and in assembly. We're gonna talk about how they can be on the stack, in the data/BSS section, and then on the heap.
So, strings.
C strings allow us to store an array of characters, and C allows us to store them in several different ways. So, for example, they could be within a function, they could be defined statically, or they could be allocated using malloc.
One of the restrictions on strings is that the printable characters use only the lowest seven bits, so they're also typically very easy to find.
And then we have a zero at the end, which is the null terminator.
So strings on the stack.
So when you define a string inside of a function using the brackets, that is going to put it on the stack. And then we have two different types: one where we give the size, and the other where we allow the compiler to figure out how big it is.
So when we look at these inside of our disassembler, we're going to see that they get converted directly into numbers, and we can see that they end up getting put onto the stack, using either a push or a mov. And so we can see a bunch of numbers that get moved into those areas, and those actually represent the characters.
Now, the Compiler Explorer doesn't try to decipher these, and so that makes it a little bit difficult to see what they're doing.
But if we take a side-by-side look and use an actual reverse engineering tool, we can right-click on these and turn them from numbers into the actual character values. So you can see "hello" here and then "hello there, class", and so we can see them in a disassembler in the way that they are being represented. This is kind of just showing us the raw data that's inside of there.
So, strings in the data section.
If you declare something as static, or outside of main, then that's going to automatically put those strings into the data section.
And so when we look at it, we can see this is again another disassembler. This is IDA, and we can see y and z inside of here. It's showing us that this is "Dr Miller". It's also showing me the string "%s", which is inside of that function.
And so if we try to look at those inside of just a regular disassembler, we're just going to see actual addresses that are on the stack, and so it might be hard to understand exactly what's at those addresses unless you print them off.
And again, we can jump to the disassembler and see our strings in the .data section. So there's "Dr Miller". There's y, which had, I believe, an 'a', which is 61 in hex. And so we will actually see those located in the data section and not put onto the stack.
And then we have an uninitialized buffer. We can see that we give it a size, but we don't put any data in it, and so this is an uninitialized buffer.
So again we throw that in there, and we're going to see that we get the offset of buffer as where the data gets moved to. And so then if we go and look at the offset of the buffer, we're going to see that it's in the BSS. It's uninitialized, so the compiler doesn't know what's there, and so the disassembler doesn't know what's there. And so it puts question marks to say, "I don't really know what's here." What actually ends up there depends on the loader in the OS.
And then in a disassembler, you can compact that: you can right-click on it and say, "By the way, this is an array," and then it turns that array size into hex. And then it shows us that it's a dup, so it's the same thing, a question mark, over and over again for all of that data.
And then we have strings using malloc.
So here, we're making the function call to malloc. It's allocating 100 characters for that, and then we're going to see some strcats on top of that. And again, this is to point out how the compiler will actually take functions and use them.
And so here we can see the call to malloc, and then we can see that we copy the result into var_14, and then var_14, or EBP minus E, ends up being used in a lot of different places. Again, I went through the disassembler and I corrected all of these, and so we actually end up seeing that this call to strcat actually ends up being a repne (repeat while not equal). And so it sets up the string instructions with ESI and EDI, and then copies the data into those locations. And so sometimes you might put in a function, but the disassembly will actually show something completely different.
And so we see that for all of these, where it's copying the data from one location into another, and so that can make it difficult to understand exactly what's going on if you're not used to reading that type of code.
So today we talked about some of the different ways that we can have strings in C and how those manifest themselves in assembly. We looked at an example on the stack, we looked at one that was in the data or the BSS section, and then we looked at one on the heap using malloc.
In the future, we're gonna look at some other data types and see how they manifest themselves in C, and then the corresponding versions in assembly.
If you have questions, you can email me at millermj at unk.edu, and you can find me on Twitter at Milhouse 30.