Those of you who follow me on Twitter may have heard about STAN. It is my pet project for learning about reverse engineering. It was born as an experiment with the Capstone disassembly framework, and it evolved into something usable for simple projects.
If you want to use it for your own practice, take a look at the code, or even extend it for your own needs… just grab the code here:
STAN is in an early alpha phase, so you may expect crashes and some misbehaviour. In general it should work fine for simple challenges, but… as I said, it is just a pet project. So far it only works with GNU/Linux ELF binaries for x86 and ARM (32-bit only for ARM).
A Simple Crackme for Testing
So, instead of writing a boring tutorial on the different options, I will just solve a very simple challenge.
Here it is:
cat << EOM | base64 -d | gunzip > /tmp/specimen
Just run it and you will get the challenge in /tmp/specimen.
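The one-liner above decodes a base64 payload and gunzips the result. If you prefer to do the same step outside the shell, here is a minimal Python sketch of that pipeline. The function names and the demo payload are mine (the demo bytes are a stand-in, not the real challenge).

```python
import base64
import gzip


def unpack_specimen(b64_text: str) -> bytes:
    """Rough equivalent of `base64 -d | gunzip`: decode base64, then gunzip."""
    compressed = base64.b64decode(b64_text)
    return gzip.decompress(compressed)


def pack_specimen(raw: bytes) -> str:
    """Inverse step (gzip, then base64); only used here to build a demo payload."""
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


if __name__ == "__main__":
    # Stand-in payload: NOT the real challenge, just bytes starting with the ELF magic.
    demo = b"\x7fELF" + bytes(12)
    text = pack_specimen(demo)
    recovered = unpack_specimen(text)
    assert recovered == demo
    print(len(recovered), "bytes recovered")
```

To use it on the real challenge you would pass the base64 text of the here-document to unpack_specimen and write the returned bytes to /tmp/specimen.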
Basic STAN Operations
So, let’s launch STAN, passing our challenge as a parameter:
$ stan /tmp/specimen
STAN is a sTAtic aNalyser. v 0.1
(c) pico
+ Opening file '/tmp/specimen'
+ Loaded '/tmp/specimen' 4808 bytes
+ ELF Machine ID: 62
+ Arch: 1 Mode:2 Type: 1
+ Processing Core...
Starting analysis
+ Processing 4 sections/segments
+ Processing section [0] '.text'
* Analysing 342 instructions
+ Processing section [1] '.rodata'
+ Processing section [2] '.eh_frame'
+ Processing section [3] '.data'
CASE: 'corpse'
CORE: 0x696140
+ Dumming Core
- File : /tmp/specimen
- Size : 4808
- Type : ELF64
- Valid : VALID
- Architecture : X86
- Mode : 64bits
[00] text_00 Addr:0x400000 Offset:0x0000 Size:0x06f0 (1776)
[01] data_01 Addr:0x601000 Offset:0x1000 Size:0x000c (12)
[00] .text 0x03 Addr:0x400144 Offset:0x0144 Size:0x0433 ( 1075) [text_00+0x0144]
[01] .rodata 0x06 Addr:0x400577 Offset:0x0577 Size:0x003f ( 63) [text_00+0x0577]
[02] .eh_frame 0x06 Addr:0x4005b8 Offset:0x05b8 Size:0x0138 ( 312) [text_00+0x05b8]
[03] .data 0x06 Addr:0x601000 Offset:0x1000 Size:0x000c ( 12) [data_01+0x0000]
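The "ELF Machine ID: 62" and "ELF64" lines come straight from the ELF header. As a rough illustration of where those values live (this is a sketch of the file format, not STAN's actual code), the first 20 bytes of the file are enough. The helper names identify_elf and fake_header are hypothetical, and the demo header is synthetic.

```python
import struct

ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
# Only the machine IDs relevant to this write-up (values from the ELF specification).
ELF_MACHINES = {3: "X86 32bits", 40: "ARM", 62: "X86 64bits"}


def identify_elf(header: bytes) -> dict:
    """Report class and machine of an ELF file from (at least) its first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = header[4]                        # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little endian
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": ELF_CLASSES.get(elf_class, "unknown"),
        "machine_id": e_machine,
        "machine": ELF_MACHINES.get(e_machine, "unknown"),
    }


def fake_header(elf_class: int, machine: int) -> bytes:
    """Build a synthetic little-endian header for the demo (not a real binary)."""
    e_ident = b"\x7fELF" + bytes([elf_class, 1, 1]) + bytes(9)
    return e_ident + struct.pack("<HH", 2, machine)  # e_type 2 = executable


if __name__ == "__main__":
    print(identify_elf(fake_header(2, 62)))
```

On the real file you would feed it the first 20 bytes of /tmp/specimen instead of the synthetic header.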
STAN’s output shows a bunch of data related to the ELF file… Let’s look directly at the code:
STAN] > dis.section .text
(the whole program is dumped)
The dis family of commands is used for disassembling code. dis.section disassembles a whole section. Usually you will use dis.function, which disassembles just one function. Anyway, in this case, dumping the whole section will allow us to quickly find the main function.
Oops. I haven’t mentioned it yet, but the challenge is a stripped binary… so there are no symbols in it. STAN can deal with symbols and show them to you… but the stripped binary is simply smaller, which makes it easier to include as text in this write-up.
So, if you quickly browse the asm code, you will see the string Password: towards the beginning of the dump. That is our main function. Alternatively, STAN creates a special symbol named __entry_point to reference the program entry point. Starting from there and following the different function calls, you will eventually arrive at the same point.
The main function
Let’s take a look at the main function while we discover more STAN commands.
STAN] > dis.function .text
+ Function '.text'@0x400144 found at section '.text'(1075,342)
400144: 48 81 ec 08 04 00 00 sub rsp, 0x408
40014b: ba 0b 00 00 00 mov edx, 0xb
400150: be 8a 05 40 00 mov esi, 0x40058a # 40058a(.rodata+13) : 'Password: '
400155: bf 01 00 00 00 mov edi, 1
40015a: 31 c0 xor eax, eax
40015c: e8 b0 00 00 00 call <func_400211> # <func_400211> 400211(.text+0xcd)
400161: 48 89 e6 mov rsi, rsp
400164: ba 00 04 00 00 mov edx, 0x400
400169: 31 ff xor edi, edi
40016b: 31 c0 xor eax, eax
40016d: e8 98 00 00 00 call <func_40020a> # <func_40020a> 40020a(.text+0xc6)
400172: ff c8 dec eax
400174: 48 89 e7 mov rdi, rsp
400177: 48 98 cdqe
400179: c6 04 04 00 mov byte ptr [rsp + rax], 0
40017d: e8 29 00 00 00 call <func_4001ab> # <func_4001ab> 4001ab(.text+0x67)
400182: 31 c0 xor eax, eax
400184: 48 81 c4 08 04 00 00 add rsp, 0x408
40018b: c3 ret
+ Stopped after founding symbol '__entry_point' (18 instructions)
As you can see, STAN already shows us strings if they are referenced from the program (this is not always the case), and it also creates dummy names for every function it finds. You can see three function calls in the main program.
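The dummy names are derived from the call targets, and those targets can be recomputed by hand from the raw bytes in the listing: a call with opcode e8 carries a signed 32-bit displacement relative to the address of the next instruction. A small sketch of that arithmetic follows (my own helper names, not STAN's code; TEXT_BASE is the .text address from the section table).

```python
TEXT_BASE = 0x400144  # start of .text, taken from the section table printed by STAN


def call_target(addr: int, raw: bytes) -> int:
    """Resolve the destination of a 5-byte `call rel32` (opcode 0xE8)."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    rel = int.from_bytes(raw[1:], "little", signed=True)
    # The displacement is relative to the address of the NEXT instruction.
    return addr + len(raw) + rel


def describe(addr: int, raw: bytes) -> str:
    """Format the target the way the listing annotates it."""
    target = call_target(addr, raw)
    return f"func_{target:x} {target:x}(.text+0x{target - TEXT_BASE:x})"


if __name__ == "__main__":
    # The three calls in main, bytes copied from the listing.
    calls = [
        (0x40015C, "e8 b0 00 00 00"),
        (0x40016D, "e8 98 00 00 00"),
        (0x40017D, "e8 29 00 00 00"),
    ]
    for addr, hexbytes in calls:
        print(describe(addr, bytes.fromhex(hexbytes)))
```

For the first call this gives 0x40015c + 5 + 0xb0 = 0x400211, which matches the func_400211 annotation in the dump.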
If you explore the first two further, func_400211 and func_40020a, you will find out that the first one is a write and the second one is a read. To figure this out using STAN you will have to dump the whole section. The opcode analysis module is still pretty basic and it cannot figure out complex structures like the ones you will find in there.
Alternatively, use your intuition. If you have already run the program, you know it shows a message and then asks for some input… Let’s go down that path and use the renaming functions provided by STAN:
STAN] > func.rename func_400211 maybe_write
+ Found function func_400211
+ Found Symbol func_400211
STAN] > func.rename func_40020a maybe_read
+ Found function func_40020a
+ Found Symbol func_40020a
Note: All those debug messages will eventually disappear.
If we now dump the code again, it will look like this:
STAN] > dis.function .text
+ Function '.text'@0x400144 found at section '.text'(1075,342)
400144: 48 81 ec 08 04 00 00 sub rsp, 0x408
40014b: ba 0b 00 00 00 mov edx, 0xb
400150: be 8a 05 40 00 mov esi, 0x40058a # 40058a(.rodata+13) : 'Password: '
400155: bf 01 00 00 00 mov edi, 1
40015a: 31 c0 xor eax, eax
40015c: e8 b0 00 00 00 call <maybe_write> # <maybe_write> 400211(.text+0xcd)
400161: 48 89 e6 mov rsi, rsp
400164: ba 00 04 00 00 mov edx, 0x400
400169: 31 ff xor edi, edi
40016b: 31 c0 xor eax, eax
40016d: e8 98 00 00 00 call <maybe_read> # <maybe_read> 40020a(.text+0xc6)
400172: ff c8 dec eax
400174: 48 89 e7 mov rdi, rsp
400177: 48 98 cdqe
400179: c6 04 04 00 mov byte ptr [rsp + rax], 0
40017d: e8 29 00 00 00 call <func_4001ab> # <func_4001ab> 4001ab(.text+0x67)
400182: 31 c0 xor eax, eax
400184: 48 81 c4 08 04 00 00 add rsp, 0x408
40018b: c3 ret
+ Stopped after founding symbol '__entry_point' (18 instructions)
Getting there
So, we have figured out the write and the read, and there is only one more function left, the mysterious func_4001ab. So, we had better figure out what we are passing as a parameter to the function.
You are probably better than me, but I cannot really remember the calling convention for 64-bit functions… so I added a help.abi command to STAN to remind me of the order:
STAN] > help.abi
Current Core is: Linux X86 64bits
-> func (RDI, RSI, RDX, RCX) -> RAX
Good, so, let’s look at the code above to figure out what we may expect to receive in that mysterious function… OK, let’s rename it before continuing.
STAN] > func.rename func_4001ab mystery
So, we can see that RDI is initialised with RSP (the top of the stack). And what may be at the top of the stack? Let’s check the previous function, the read:
RDI = 0 (xor edi,edi)
RSI = RSP (mov rsi,rsp)
RDX = 0x400
So, it actually looks like a read call, and the second parameter, the buffer to store the data, is set to the top of the stack. Then, with RSP unchanged, we call the mystery function, passing as a parameter whatever we have read from the user.
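The bookkeeping done by hand above can be written down as a tiny lookup: note the register contents just before a call and read them back in ABI order. This is only a sketch; the register states are copied from the listing by hand, and call_arguments is a hypothetical helper, not a STAN command.

```python
# Integer/pointer argument registers of the System V AMD64 ABI, in order.
# STAN's help.abi output lists the first four; R8 and R9 complete the set.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def call_arguments(reg_state: dict, nargs: int) -> list:
    """Return the first `nargs` arguments as seen by the callee ('?' if unknown)."""
    return [reg_state.get(reg, "?") for reg in ARG_REGS[:nargs]]


# State just before `call maybe_read`. Writing a 32-bit register such as EDI or
# EDX zero-extends into the full 64-bit register, so they are recorded as RDI/RDX.
before_read = {"rdi": 0, "rsi": "rsp", "rdx": 0x400}

# State just before `call mystery`: only RDI is set explicitly (mov rdi, rsp).
before_mystery = {"rdi": "rsp"}

if __name__ == "__main__":
    print("read   :", call_arguments(before_read, 3))
    print("mystery:", call_arguments(before_mystery, 1))
```

Reading the first result as read(fd, buf, count) gives fd 0 (stdin), a buffer at RSP and a count of 0x400 bytes.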
OK. It is time for a break, so let’s note down what we have found out before we leave, so we do not have to start from scratch when we come back:
STAN] > func.rename func_4001ab mystery
+ Found function func_4001ab
+ Found Symbol func_4001ab
STAN] > comment.add 40016d read (0 = stdin, RSP, 0x400)
+ Adding comment 'read (0 = stdin, RSP, 0x400)' at 0x40016d
STAN] > comment.add 40017d mystery (RSP) -> we just pass in the data read from the user
+ Adding comment 'mystery (RSP) -> we just pass in the data read from the user' at 0x40017d
Yes, we can add comments to specific addresses. And then the code will look like this:
STAN] > dis.function .text
+ Function '.text'@0x400144 found at section '.text'(1075,342)
400144: 48 81 ec 08 04 00 00 sub rsp, 0x408
40014b: ba 0b 00 00 00 mov edx, 0xb
400150: be 8a 05 40 00 mov esi, 0x40058a # 40058a(.rodata+13) : 'Password: '
400155: bf 01 00 00 00 mov edi, 1
40015a: 31 c0 xor eax, eax
40015c: e8 b0 00 00 00 call <maybe_write> # <maybe_write> 400211(.text+0xcd)
400161: 48 89 e6 mov rsi, rsp
400164: ba 00 04 00 00 mov edx, 0x400
400169: 31 ff xor edi, edi
40016b: 31 c0 xor eax, eax
40016d: e8 98 00 00 00 call <maybe_read> # <maybe_read> 40020a(.text+0xc6)
; read (0 = stdin, RSP, 0x400)
400172: ff c8 dec eax
400174: 48 89 e7 mov rdi, rsp
400177: 48 98 cdqe
400179: c6 04 04 00 mov byte ptr [rsp + rax], 0
40017d: e8 29 00 00 00 call <mystery> # <mystery> 4001ab(.text+0x67)
; mystery (RSP) -> we just pass in the data read from the user
400182: 31 c0 xor eax, eax
400184: 48 81 c4 08 04 00 00 add rsp, 0x408
40018b: c3 ret
+ Stopped after founding symbol '__entry_point' (18 instructions)
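One way to picture what renaming and commenting do is as two address-keyed tables laid over the raw disassembly. The sketch below illustrates that idea only; STAN's internals may well be organised differently, and every name in it is hypothetical.

```python
symbols = {}   # target address -> user-chosen function name
comments = {}  # instruction address -> free-text comment


def func_rename(addr: int, new_name: str) -> None:
    """Remember a user-chosen name for the function at `addr`."""
    symbols[addr] = new_name


def comment_add(addr: int, text: str) -> None:
    """Attach a comment to the instruction at `addr`."""
    comments[addr] = text


def render_call(addr: int, target: int) -> list:
    """Render a call line, using the renamed symbol and any attached comment."""
    name = symbols.get(target, f"func_{target:x}")  # fall back to the dummy name
    lines = [f"{addr:x}: call <{name}>"]
    if addr in comments:
        lines.append(f"    ; {comments[addr]}")
    return lines


if __name__ == "__main__":
    func_rename(0x40020A, "maybe_read")
    comment_add(0x40016D, "read (0 = stdin, RSP, 0x400)")
    for line in render_call(0x40016D, 0x40020A):
        print(line)
```

Because both tables are keyed by address, they can be stored separately from the binary and re-applied after it is loaded again.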
Time to save our work and take a break.
Just type:
Finishing the challenge
I hope you have had a nice break. I did. Now we can launch STAN again, but this time we are going to open the file from the STAN command line:
$ stan
STAN is a sTAtic aNalyser. v 0.1
(c) pico
STAN] core.load /tmp/specimen
+ Cleanning up core
+ Deleting Segments....
+ Deleting Sections....
+ Deleting Symbols....
+ Opening file '/tmp/specimen'
+ Loaded '/tmp/specimen' 4808 bytes
+ ELF Machine ID: 62
+ Arch: 1 Mode:2 Type: 1
+ Processing Core...
Starting analysis
+ Processing 4 sections/segments
+ Processing section [0] '.text'
* Analysing 342 instructions
+ Processing section [1] '.rodata'
+ Processing section [2] '.eh_frame'
+ Processing section [3] '.data'
STAN] case.load /tmp/specimen.srep
-> SYMBOL: 'mystery' '4001ab'
+ Found function mystery
+ Found Symbol mystery
-> FUNCTION: 'mystery' '0x4001ab'
-> COMMENT: 'read (0 = stdin, RSP, 0x400)' '0x40016d'
-> COMMENT: 'mystery (RSP) -> we just pass in the data read from the user' '0x40017d'
As you can imagine, core.load allows you to load a binary from the disk (you can use TAB completion to navigate the file system); then you use case.load to load your previously saved state. For the time being, STAN saves the state as a file with the same name as the binary under analysis but with the extension .srep.
Now we can go and reverse the mystery.
Unveiling the mystery
Let’s disassemble mystery:
STAN] > dis.function mystery
+ Function 'mystery'@0x4001ab found at section '.text'(1075,342)
4001ab: 51 push rcx
4001ac: be 77 05 40 00 mov esi, 0x400577 # <.rodata> 400577(.rodata+0) : '0x00sec'
4001b1: e8 98 02 00 00 call <func_40044e> # <func_40044e> 40044e(.text+0x30a)
4001b6: 85 c0 test eax, eax
4001b8: 75 11 jne <l0> # 4001cb(.text+0x87)
4001ba: ba 05 00 00 00 mov edx, 5
4001bf: be 7f 05 40 00 mov esi, 0x40057f # 40057f(.rodata+8) : 'Good\n'
4001c4: bf 01 00 00 00 mov edi, 1
4001c9: eb 11 jmp <l1> # 4001dc(.text+0x98)
4001cb: ba 04 00 00 00 mov edx, 4
4001d0: be 85 05 40 00 mov esi, 0x400585 # 400585(.rodata+e) : 'Bad\n'
4001d5: bf 01 00 00 00 mov edi, 1
4001da: 31 c0 xor eax, eax
4001dc: e8 30 00 00 00 call <maybe_write> # <maybe_write> 400211(.text+0xcd)
4001e1: 83 c8 ff or eax, 0xffffffff
4001e4: 5a pop rdx
4001e5: c3 ret
Here we find two labels: l0 and l1. We can also see a call to what we believe is a write after setting the strings for the right and wrong password. Everything should be obvious now, but let’s rename the labels for the LuLz:
STAN] > label.rename l0 BadBoy
+ Found label l0
- DEBUG: Symbol 'l0' not found
STAN] > label.rename l1 print_and_exit
+ Found label l1
- DEBUG: Symbol 'l1' not found
STAN] > dis.function mystery
+ Function 'mystery'@0x4001ab found at section '.text'(1075,342)
4001ab: 51 push rcx
4001ac: be 77 05 40 00 mov esi, 0x400577 # <.rodata> 400577(.rodata+0) : '0x00sec'
4001b1: e8 98 02 00 00 call <func_40044e> # <func_40044e> 40044e(.text+0x30a)
4001b6: 85 c0 test eax, eax
4001b8: 75 11 jne <BadBoy> # 4001cb(.text+0x87)
4001ba: ba 05 00 00 00 mov edx, 5
4001bf: be 7f 05 40 00 mov esi, 0x40057f # 40057f(.rodata+8) : 'Good\n'
4001c4: bf 01 00 00 00 mov edi, 1
4001c9: eb 11 jmp <print_and_exit> # 4001dc(.text+0x98)
4001cb: ba 04 00 00 00 mov edx, 4
4001d0: be 85 05 40 00 mov esi, 0x400585 # 400585(.rodata+e) : 'Bad\n'
4001d5: bf 01 00 00 00 mov edi, 1
4001da: 31 c0 xor eax, eax
4001dc: e8 30 00 00 00 call <maybe_write> # <maybe_write> 400211(.text+0xcd)
4001e1: 83 c8 ff or eax, 0xffffffff
4001e4: 5a pop rdx
4001e5: c3 ret
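The two labels in mystery sit at the destinations of short jumps, which encode a signed 8-bit displacement relative to the next instruction. As a sanity check, the targets can be recomputed from the bytes in the listing. This is a sketch with a helper name of my own, not STAN's implementation.

```python
def short_jump_target(addr: int, raw: bytes) -> int:
    """Destination of a 2-byte short jump: opcode plus signed 8-bit displacement."""
    if len(raw) != 2:
        raise ValueError("short jumps are exactly two bytes long")
    opcode, disp = raw[0], raw[1]
    # 0xEB is `jmp rel8`; 0x70..0x7F are the conditional forms (0x75 is `jne`).
    if not (opcode == 0xEB or 0x70 <= opcode <= 0x7F):
        raise ValueError("not a short jump")
    rel = disp - 0x100 if disp >= 0x80 else disp  # sign-extend the displacement
    return addr + 2 + rel


if __name__ == "__main__":
    # Bytes copied from the mystery listing.
    print(hex(short_jump_target(0x4001B8, bytes.fromhex("75 11"))))  # jne
    print(hex(short_jump_target(0x4001C9, bytes.fromhex("eb 11"))))  # jmp
```

The jne at 0x4001b8 resolves to 0x4001cb and the jmp at 0x4001c9 to 0x4001dc, matching the addresses shown next to the two labels.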
OK… so can you figure out the password for this crackme?
In case you wonder, func_40044e is actually strcmp. You can dive deeper into the code to figure this out… dis.function func_40044e … this is a simple one if you want to try.
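Putting the pieces together, the control flow of main and mystery can be paraphrased in a few lines. This is my reading of the two listings, not decompiler output, and the secret is passed in as a parameter so the sketch does not spoil the exercise (the value used in the demo is a placeholder, not the real password).

```python
def mystery(candidate: bytes, secret: bytes) -> str:
    """strcmp returns 0 on equality; `jne BadBoy` takes the bad path otherwise."""
    return "Good\n" if candidate == secret else "Bad\n"


def main_logic(raw_input: bytes, secret: bytes) -> str:
    """Paraphrase of main: read up to 0x400 bytes, terminate the string, check it."""
    buf = raw_input[:0x400]  # read(0, rsp, 0x400)
    n = len(buf)
    # dec eax / cdqe / mov byte ptr [rsp + rax], 0: the LAST byte read (normally
    # the trailing newline) is overwritten with a NUL terminator.
    # Simplification: an empty read is treated as an empty string here.
    line = buf[: n - 1] if n > 0 else b""
    # A C string ends at its first NUL byte.
    line = line.split(b"\x00", 1)[0]
    return mystery(line, secret)


if __name__ == "__main__":
    placeholder = b"hunter2"  # NOT the real password, just a demo value
    print(main_logic(b"hunter2\n", placeholder), end="")
    print(main_logic(b"nope\n", placeholder), end="")
```

One detail falls out of this reading: since the last byte is always replaced, input that arrives without a trailing newline loses its final character before the comparison.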
Other commands you may find interesting
Just for completeness, these are a few more commands that may also be useful if you want to play with STAN:
comment.del addr: Deletes a previous comment… sometimes we make mistakes.
mem.dump x addr count: Dumps the content of address addr as hex bytes. You can change x to dump words (pointers). For example: STAN] > mem.dump x 0x400577 10
One more command allows you to tell STAN that you believe there is a function at some address. As I said, the analysis module is pretty poor, so you may spot obvious functions that STAN missed.
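For readers who have never used a memory dump command, this is roughly the kind of view a hex dump gives. The layout below is a generic one of my own choosing and is not STAN's actual mem.dump output format. The demo bytes are the two message strings from the mystery listing, reconstructed from the addresses shown there and assuming each one is NUL-terminated.

```python
def hex_dump(data: bytes, base_addr: int, count: int, width: int = 16) -> list:
    """Format the first `count` bytes of `data` as address, hex bytes and ASCII."""
    lines = []
    window = data[:count]
    for off in range(0, len(window), width):
        chunk = window[off:off + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(width * 3 - 1)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base_addr + off:x}: {hex_part}  {ascii_part}")
    return lines


if __name__ == "__main__":
    # 'Good\n' at 0x40057f and 'Bad\n' at 0x400585, as annotated in the listing.
    rodata_slice = b"Good\n\x00Bad\n\x00"
    for line in hex_dump(rodata_slice, 0x40057F, 10):
        print(line)
```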
And, I haven’t said it yet… but STAN is colourful. This is how our functions look after all our hard work.
As a final note, I have found this little tool very useful. It is not as powerful as radare2 or Binary Ninja, but it is very easy to use. It helps a bit but still forces you to do some of the work… which is a good thing when you are starting out and still learning the basics.
There is still a lot to do and, as I said, it is kind of alpha software, so use it at your own risk.
hack fun