Programming for Wannabes. Part I

0x00pf · September 26, 2016, 6:20pm

If you are reading this is because you want to be a hacker. Therefore, you are automatically a wannabe. It does not sound that cool, but it is actually pretty cool. It means that you still have a lot of exciting things to discover!!!

First things first. This is going to be a joint course on C and assembly programming. Those are the only two languages a real hacker has to know. Yes. Really. The only two languages that will let you do the real hacking. You can do some stuff with Python or PowerShell… but that is Skid stuff… You know what the S on skid stands for… don’t you?. It stands for Script

Easy, this was just a joke to catch your attention. Now seriously. Learning scripting languages is a powerful tool and will save you a lot of time in many cases. Mastering Python is indeed a great skill and it is something you should invest time on. Analogously, C and assembly are the only way forward for some topics: reverse engineering, kernel level rootkits, and some system level programming can only be done with those low level languages. So, if you really want to be a hacker… man, there is no way around, you gotta have to learn CASM (C + ASM :).

Two notes before we start:

I’m not a hacker myself. Not even a wannabe. Really I do not have any interest on becoming a hacker. This basically means that you should take this course as a starting point. When you get done with it… then… your journey will really start.
No. HTML is not a programming language. If you think that… you really need to go through this course.

The course will be organized as follows:

Some introduction to what a computer is. These are low level languages and you need to get to know low level stuff
That is it for now

How have you end up here?

That is a good question. In all honestly there are zillions of C/ASM programming tutorials and courses out there. I know, they look useless. They are not what you are looking for. That’s fine… and that is why you are reading this right now.

To understand why all those tutorials haven’t help you, we need to learn a little bit about the different layers beneath the tools we want to build… Yes, layers. There are lots of them.

So, let’s start this trip bottom-up. At the very low level you have what is usually know as the Bare Metal. That is roughly the computer processor and the memory. We may consider the BIOS to be also at this level and some other elements usually grouped under the generic name of chipset.

On top of the Bare Metal we find the Operating System. The operating system is in charge of managing the HW. There is a lot to do there and each hardware device has its own peculiarities (that’s why we have all those drivers). The operating system gives us standard services to access the disk, the memory, the USB devices… You can indeed do all this at the Bare Metal layer but… then you will end up writing an operating system…

The next layer we found is the System Libraries layer. These are normal fragments of code that are used for many different programs. We group those pieces of code in a file and we call it a library. Then we can use that code from our programs. There are literally, thousands of libraries available for any OS out there.

The top level of this stack of layers are the Programs (this also includes Application Level Libraries). These are the things you want to write yourselves when you decide to learn how to program a computer. You write these programs in some programming language. C and ASM are two of those programming languages… There are hundreds of them, and you should learn quite some and choose the right one for the task you want to accomplish.

Going Top Down

OK, fair enough. Now that we are at the top level, let’s look down for a while. You have learn a specific language.

Let’s say you have learn C using one of those on-line tutorials. The original C language has 32 keywords. The syntax is pretty simple and easy to learn. Yes, really, is a pretty simple language. However, knowing the C keywords and syntax does not let you do much.

You need to learn something more. At least, one thing called the Standard C library. Without this, you cannot even print anything on the screen, read any used input or deal with files. And this happens with nearly all programming languages out there. You need a minimal library to be able to interact with your operating system and then, let your operating system deal with the Bare Metal.

Therefore, learning the syntax of a language, and the basics of its standard libraries only brings you half way (in the best case) to where you were aiming to be.

Learning C and ASM to become a hacker requires you to learn not just the language keywords, syntax, idioms, (put whatever nowadays buzzword in here)… You also need to learn how to use the system libraries, how does your operating system works and also how does your computer works.

That is why there is a huge difference between a C programming course that teaches you how to code C and a hacking programming course that teaches you how to do stuff. Usually, after you have learnt the syntax, you have to go into specific courses tutorials: Network Programming, Kernel Programming, 3D graphics programming, GUI programming,… Each one of those has their own peculiarities and requires quite some time to manage them. In a sense we will be covering here what is generally known as System Programming. This cover quite some of the topics usually related to hacking.

Hope that at this point you have understood, that hacking/system programming is not about learning a programming language. That is the simplest part. You have to learn a lot of other things… summing up. You have to learn what is going on every time you type a line of code.

So, it is time to learn the basics about a computer. C and ASM are low level programming languages. We will see soon how true this is. This means, that, to really master those programming languages you need to know, at least, the basics of how does a computer works.

The Simplest Computer in The World

Despite of how people things technology has advance in the last decades, the reality is that computers haven’t change much in nearly 50 years. They are faster, they are smaller, they consume less power… but they still work the same way. Indeed, this is a simplification, but at the level at which we are going to work this is actually the case.

For illustration purposes, let’s introduce the SCTW-2000. This is a fictional computer that will let us introduce different concepts in a generic way. It could easily be one of the popular home microcomputers from the 80s. We will match the SCTW-2000 with real examples, but we will avoid using real computers, at least at the beginning and for the sake of simplicity.

The SCTW-2000, as many other computers out there is basically composed of two main parts: CPU and memory

Let’s dive into each of these elements to get the big picture.

The Memory

There are a lot of different types of memories: RAM, ROM, SRAM, DRAM, SDRAM, PROM, FLASH, Serial FLASH, … From a SW point of view, unless we go really low level (something we are not going to do in this course), we do not care about the type of memory our computer have. That is important for the HW guys but not for us. At least not now. What is important for us is how does this element fit within the overall computer architecture.

For the time being, let’s imagine the computer’s memory as a huge bunch of drawers, one on top of the other. Inside each drawer we can store one number. An 8 bits number, or in other words, a number between 0 and 255 (that is 2^8 values… 8 bits). Finally, let’s number the drawers, starting for the one on the bottom (number 0), and giving consecutive numbers to the drawers on top. Something like this

+--------+
+ 8 bits | Drawer 3
+--------+
| 8 bits | Drawer 2
+--------+
| 8 bits | Drawer 1
+--------+
| 8 bits | Drawer 0
+--------+

The higher the memory in our computer, the taller the drawer pile will be. For instance, if our computer have 64KB of RAM… yes it sounds ridiculous nowadays, it is just an example. So a computer with 64 KB of memory will have 65536 drawers one on top of the other.

Each drawer is known as a memory address, and the number inside the drawer is the content of that memory address.

This is it about the memory for now. Let’s look to the CPU

The CPU

The CPU is the other main component of the computer. It is composed of the following parts:

A set of registers
A Arithmetic-Logic Unit (ALU)
Some control logic.

Registers

Let’s start with the registers. You can see the CPU registers as a small piece of memory that is inside the CPU and therefore it is super fast, and can hold numbers bigger than the 8 bits stored in each of our fictional memory addresses. On the other hand, this is a very small memory, something in the range of 4 to 32 positions (drawers). Compare to the 65536 drawers for our ridiculous small 64KB memory… Each of this positions are known as a register. Registers can be numbered starting from 0 like the memory or they can be named.

For instance the Z-80 processor, named their registers as: AF, BC, DE, HL, IX, IY. The Intel processors traditionally named their registers as: AX, BX, CX, DX, SI, DI,… Motorola processor used more systematic naming schema for their registers: A0-A7 (address registers) D0-D7 (data registers) for its 68K.

As the number of available registers grows, it is usual to follow the Motorola schema and name the registers with a consecutive number. This way, X86-64 (64 bits intel/amd processors) name their registers (the general purpose ones) r0-r15 (lower register have names compatible with the i386 processor). Same with MIPS and ARM processor.

Any way. Most of the processors used to build computers have a set of registers. A very fast and very small memory area inside the processor itself.

The ALU

The second element of the processor is the Arithmetic-Logic Unit. And yes, it basically performs arithmetic and logic operations. You can add, substract, multiply, and, or, xor values using this processor element… and that is what a computer actually does most of the time.

The ALU performs this operations either using the values stored in the registers or values taken from the memory, and stores the results, again, either in a register or a memory address. The available options actually depends on the processor. Old processor only could perform operation on registers and some only on certain registers. Nowadays, this is no longer the case.

This is really it for the ALU

The Control Unit

The last part of the processor we are interested on is the Cotrol Unit. In all honesty, in this simplified computer model, the control unit is anything else inside the processor except the registers and the ALU (branch prediction unit, cache management, bus signaling, pipeline management, memory management unit…). We won’t go in details about most of those circuitries in this course, however, there are a couple of basic functions that we have to be aware of.

One of these things is controlling the CPU external interface. In other words, the state of all those little pins that go out of the CPU chip. Using these pins, the processor can talk to the memory to read and write values in those drawers and can also interface to different types of hardware…

So, when the ALU needs some data from the memory to perform some operation, it asks the control unit to activate the right pins on the processor to command the memory chip to read or write a given memory position. The memory chip will get the value to write from some CPU pines or put the value to read in some other pins (or multiplex these two pin sets, that is using the same pins for reading/writing and even addressing the memory). OK, it does not really work like this, but otherwise we will have a very, very long introductory chapter

The control unit is also in charge of reading the program from memory, parsing it and executing the machine code…

Machine Code

We will not talk much about machine code… we will be talking about ASM, but, you at least need to know what this machine code is. For that, we need to understand how the CPU runs a program and how does a program looks like.

The first thing we have to introduce is the Program Counter (PC) or Instruction Pointer (IP). The name depends on the processor family you are working with. This is a special register in the CPU that indicates which memory address in the main memory contains the next instruction to run. Whenever the processor executes an instruction, the IP is increased by one (actually it may be more than one, if the current instruction takes several bytes… more on this in a sec), unless the instruction is a branch/jump. In that case, a new value is assigned to the IP so the next instruction will be taken from a new position.

But, you may be wondering… what are those instructions?. OK, no panic. Let’s go back for a second to our fictional SCTW-2000. Now that we know a lot about processors, I can tell you about the specs for the awesome SCTW-2000:

It has 2 registers named EBX and EBP (you will see later why we have chosen these names), plus an instruction pointer named RIP
The ALU can perform the following operation: ADD two registers
It has an instruction MOV to assign values to registers.
After reset (or on power on) all registers are set to 0 (including RIP)

Now, to fully specify the SCTW-2000 we have to define the instruction it can run. The numbers the control unit will be able to understand… that is the machine code. For the SCTW-2000 it is something like this:

OPCODE     | MNEMONIC    | Description
-----------+-------------+-----------------------------------------------
0xbb XX    | MOV EBX, XX | Copies the value XX in register R0
0xbd XX    | MOV EBP, XX | Copies the value XX in register R0
0x01 XX YY | ADD XX, YY  | Adds the values of register XX and register YY
           |             | ands stores the result on register XX
0x90       | NOP         | No Operation. Does nothing
0x00       | HALT        | Stops the processor

As you can see we have used three columns for this table. The first column contains only numbers (1, 2 or 3 numbers depending on the instruction). Yes, instruction may take, and actually take more than one single memory address. This is the machine code, the sequence of numbers stored in memory that the CPU can understand. We have chose some arbitrary opcodes (not so arbitrary but for the time being, it does not really matter which OPCODES we’ve chosen).

Remembering those number is hard for humans. Sure, we can manage for the SCTW-2000, but a real processor may easily have hundreds of those OPCODEs. For that reason, we use something a bit easier to remember for us (that’s why this is named mnemonics - from the greek “memory related”, yes a bit of a free translation)… That is actually the assembly language that you are trying to learn. The ASM language has to be converted into machine code… that is the task of the program know as assembler.

Our first program

To illustrate all this, let’s write a simple program that adds two numbers: 10 and 20. The assembly code for the SCTW-2000 and for such a stupid program will be:

MOV EBX, 10
MOV EBP, 20
ADD EBX, EBP
HALT

Now let’s do the work of the assembler by hand using the opcode/mnemonic table above:

ASM             MACHINE CODE
MOV EBX, 10  -> 0xbb 0x0a
MOV EBP, 20  -> 0xbd 0x14
ADD EBX, EBP -> 0x01 0x00 0x01
HALT         -> 0xff

Now we can put the program in our memory:

ADDR-07 | 0xff |
ADDR-06 | 0x01 |
ADDR-05 | 0x00 |
ADDR-04 | 0x01 |
ADDR-03 | 0x14 | 
ADDR-02 | 0xbd |
ADDR-01 | 0x0a |
ADDR-00 | 0xbb | <= RIP

Our program requires 8 bytes of memory. We copy the program starting at position 0, and after resetting the SCTW-2000 (sure, we are using SDRAM for the SCTW-2000), the RIP will point to address 0 and will start reading the machine code and executing the instructions.

Not all processor starts execution at address 0. You have to check the datasheet for the processor to figure out the boot process and conditions

C as a Low-Level Programing Language

Before finishing this first part, we have to quickly introduce the other language we are going to work with in this course and we also have to justify why we shall consider C a low level programming language.

For doing that, we are going to try to produce exactly the same code we have generated by hand in the previous section, using a C program. Instead of the SCTW-2000, we are going to use an Intel 64bits… You will see how similar these two processors are

Let’s write this program in a text file. Call it ph1.c

int f1 (void)
{
  register  int a = 10;
  register  int b = 20;

  a += b;
  return a;
}
int main (void)
{
  int a;
  a = f1();
}

The program looks a bit weird. The reason is that we are not working at the bare metal level. We have an operating system and a file format to honour, so the compiler has to create quite some more stuff than just the machine code for our code. Moreover, we have to use everything we put in the code, otherwise, the compiler will realise that, and remove the code detected as dead (the one that will never be used)… which is bad for our didactic purposes.

We will progressively go into all those details. Right now, it is not really needed to completely understand what is going on. You just need to realize a couple of things.

First, concentrate on the function f1, at the beginning of the program. No need to fully understand the syntax, just pay attention to the following:

The register keyword in C, tell the compiler that we want to use a register. By default it will try to use memory for variables (actually the stack, but we haven’t talked about it yet), but, as you can see we can force the compiler to produce machine code that uses the registers… as we did with our assembler version.
Then you can see the equivalent to our assembler program. We store 10 in one variable (that will be hold in a register), 20 in another, and then we add both number and store it in the first variable. This is roughly the ASM code we wrote before!!!

Let’s compile the program and take a look to the machine code:

$ gcc -fomit-frame-pointer -o ph1 ph1.c

The -fomit-frame-pointer is just to remove some code generated by the compiler that is not relevant right now. And what we’ve got out of this compilation is:

$ objdump -M intel -d ph1 | grep -A 20 '<f1>'
00000000004004b4 <f1>:
  4004b4:	55                   	push   rbp
  4004b5:	53                   	push   rbx
  4004b6:	bb 0a 00 00 00       	mov    ebx,0xa
  4004bb:	bd 14 00 00 00       	mov    ebp,0x14
  4004c0:	01 eb                	add    ebx,ebp
  4004c2:	89 d8                	mov    eax,ebx
  4004c4:	5b                   	pop    rbx
  4004c5:	5d                   	pop    rbp
  4004c6:	c3                   	ret

I had kindly asked objdump to produce intel assembly because it’s closer to the one we used for our awesome SCTW-2000 (that is the -M intel flag).

Let’s look at 0x4004b6 to 0x4004c0:

  4004b6:	bb 0a 00 00 00       	mov    ebx,0xa
  4004bb:	bd 14 00 00 00       	mov    ebp,0x14
  4004c0:	01 eb                	add    ebx,ebp

So, this looks pretty similar to the machine code, we generated for our fictional SCTW-2000 processor. The main differences:

gcc is generating 32 bits values for our constants. Even if we declare our variables as char, a 32 bits value will be generated… Later in the series we will learn why.
The add instruction is a bit different. Intel machine code encodes the registers in the opcode to save space, that is why the machine code for the add ebx,ebp is just 0x01 0xeb, instead of our 0x01, 0x00, 0x01. Check the intel manual for full details

Well, we have managed to write some C code that almost exactly matches the machine code we want the computer to run… this is why C is considered a low level programming language :).

For the Lulz

Just for fun. Let’s see how the ASM for ARM will look like.

First let’s compile the program for ARM:

$ arm-linux-gnueabi-gcc -fomit-frame-pointer -o ph1-arm ph1.c

And now, let’s look at the code:

$ arm-linux-gnueabi-objdump -M intel -d ph1-arm | grep -A 20 '<f1>'
Unrecognised disassembler option: intel
0000840c <f1>:
    840c:	e92d0030 	push	{r4, r5}
    8410:	e3a0400a 	mov	r4, #10
    8414:	e3a05014 	mov	r5, #20
    8418:	e0844005 	add	r4, r4, r5
    841c:	e1a03004 	mov	r3, r4
    8420:	e1a00003 	mov	r0, r3
    8424:	e8bd0030 	pop	{r4, r5}
    8428:	e12fff1e 	bx	lr

OK, the OP codes are completely different, but the assembly is pretty accurate. You can see that, for ARM, the add instruction accepts 3 parameters instead of just 2. Other than that… it is pretty similar to our SCTW-2000 ASM and also to the Intel 64bits ASM.

And for MIPS we get:

$ mips-linux-gcc -fomit-frame-pointer -o ph1-mips ph1-1.c
$ mips-linux-objdump -d ph1-mips | grep -A 20 '<f1>'
00400720 <f1>:
  400720:	2402000a 	li	v0,10
  400724:	24030014 	li	v1,20
  400728:	00431021 	addu	v0,v0,v1
  40072c:	03e00008 	jr	ra
  400730:	00000000 	nop

This is a bit different but still… change li to mov and addu to add… and there you go.

Conclusions

Well, this concludes the first part of this course. We have quickly seen the main components of a computer, and the relation between the machine code, the assembler and the C programming language. We have also seen that knowing the language syntax is just a small part of what you need to know to actually write useful code. Finally, we have checked why C is considered a low level programming language and we’ve also got a glimpse of how the language can be used to control what is actually executed by the processor.

Please, let me know in the comments if you have found this useful, if it is comprehensive, if you are missing something, if you got bored to dead… and click the heart icon if you are interested in more installments for this series (I know some people read the posts but never comment on them)… I won’t bother you guys in case you are not interested on this kind of stuff.

Appendix. Installing required tools

In case you need help to get the tools we used in this tutorial, here are some pointers

For intel you can just install build-essential

apt-get install build-essential

This should install all the tools you need, including the compiler. For non debian based distros you need to look for the packages for gcc and the so-called binutils.

Forar ARM things are also easy

 apt-get install gcc-arm-linux-gnueabihf
 apt-get install gcc-arm-linux-gnueabi

The first one is for Hardware Floating-Point and should work for any recent ARM. Just go for that. You may also need to install the binutils package for ARM

apt-get install binutils-arm-linux-gnueabi

For MIPS you won’t find a toolchain in your distro repository, so you have to get one from somewhere else or compile your own. Check this page to chose one. I cannot remember which one I used for this post.

https://www.linux-mips.org/wiki/Toolchains

Just uncompress the toolchain anywhere and set your PATH to the directory containing all the binaries… That’s it

anon79434934 · September 26, 2016, 6:27pm

Ahh, finally: tutorials aimed to the beginner. Excellent work @0x00pf.

-Phoenix750

0x00pf · September 26, 2016, 6:38pm

Thanks… Glad to hear you like it

oaktree · September 26, 2016, 6:58pm

Perhaps you meant 64 K Bytes??

0x00pf · September 26, 2016, 7:13pm

:)… that is what I meant… actually some microcontroller have a few bytes of RAM… but that is indeed a mistake. Thanks for pointing it out. I’ll fix it

Valentine · September 26, 2016, 7:56pm

I’ve been trying to learn ASM on my own but it’s hard to learn for a noob. C is a bit easier to understand and learn but at the same time it’s still hard. I am glad that someone is starting to teach the beginners. Cheers.

_py · September 27, 2016, 4:25pm

Great post @0x00pf. Really resourceful for the newcomers.

oaktree · September 27, 2016, 10:35pm

Just a guess, but is it because of memory pagination?

0x00pf · September 27, 2016, 10:49pm

Nop… but quite close. We will come to it later

_py · September 27, 2016, 10:52pm

@oaktree It’s to make CPU’s life easier while reading the values and it also has to do with the arch you are working on. Google “memory alignment in C” or something close to that and check out this interesting paper http://www.catb.org/esr/structure-packing/.

oaktree · September 27, 2016, 10:53pm

Ah, @0x00pf and @_py: that’s sort of what I meant.

EDIT: @_py that was a wonderful article you linked. Anyone reading through this discourse should definitely spend time reading that tutorial as a supplement or to entertain curiosity.

Valentine · September 28, 2016, 12:19pm

In C, can’t you insert ASM with declaring _ asm _ or does it have to be declared with register? Something like that.

_py · September 28, 2016, 12:54pm

@Valentine yes, google “Inline Assembly” for further info.

Valentine · September 28, 2016, 4:13pm

Just wanted to make sure.

SmartOne · September 30, 2016, 4:36pm

Excellent tutorial! @0x00pf Can’t wait for part II!

pry0cc · October 1, 2016, 10:07am

Really awesome tutorial. A real must have for any aspiring programmer/hacker. I am a fan of how you’ve linked it all together, a lot of content is just C, or just ASM or just memory, but they all link, and I think you did a really good job of linking it all together.

FFY00 · October 12, 2016, 8:46pm

This is actually better than the book I’m reading on ASM. Now I understand most of the things that I was missing. Thanks for writing this, it’s really really helpful

nopSled · October 14, 2016, 6:17pm

Really enjoyed this writeup, the way you teach is quite effective!

pry0cc · October 14, 2016, 6:52pm

Nice username xD really fitting in with the 0x00sec theme

boinclement · March 4, 2017, 3:04pm

Thank you, really nice tuto
What would you think to add a few lines on how to install the required tools ?

Regards,