Late Easter Challenge Solution with STAN :)

asm
challenge
javascript
challengesolution
reverseengineering

(pico) #1

It has been a while since I wrote the Late Easter challenge and now that STAN is around, this is a good excuse to present some of the latest functions I have add to my little tool :wink:

So, in this write-up I’m going to solve the stripped version of the challenge. Grab the binary, grab STAN (https://github.com/0x00pf/STAN) and let’s start!!! :slight_smile:

Getting started

After loading the binary with STAN we see it is a stripped static binary (this is new information shown by the latest version of STAN).

+ Dumming Core
  - File         : ./test/eo-strip
  - Size         : 15374
  - Entry Point  : 4001ef
  - Type         : ELF64
  - Valid        : VALID
  - Architecture : X86
  - Mode         : 64bits(2)
  - Info         : Static Stripped
[00] text_00 Addr:0x400000 Offset:0x0000 Size:0x2718 (10008)
[01] data_01 Addr:0x603000 Offset:0x3000 Size:0x00e0 (224)
.................................................
[00]           .text 0x03 Addr:0x400144 Offset:0x0144 Size:0x1d6b (   7531) [text_00+0x0144]
[01]         .rodata 0x06 Addr:0x401eb0 Offset:0x1eb0 Size:0x0170 (    368) [text_00+0x1eb0]
[02]       .eh_frame 0x06 Addr:0x402020 Offset:0x2020 Size:0x06f8 (   1784) [text_00+0x2020]
[03]           .data 0x06 Addr:0x603000 Offset:0x3000 Size:0x00e0 (    224) [data_01+0x0000]

So, let’s take a look to the entry point (you can now use TAB completion for symbols :slight_smile: ). STAN names the entry point as __entry_point.

STAN] > dis.function __entry_point
+ Function '__entry_point'@0x4001ef found at section '.text'(7531 bytes)
+ Disassembling function [email protected]
  * Analysing 3440 instructions at (0x4001ef)

                         __entry_point:
4001ef:   5f                      	pop	rdi
4001f0:   48 89 e6                	mov	rsi, rsp
4001f3:   57                      	push	rdi
4001f4:   48 8d 54 fe 08          	lea	rdx, qword ptr [rsi + rdi*8 + 8]
4001f9:   48 89 15 e0 2e 20 00    	mov	qword ptr [rip + 0x202ee0], rdx		# ! 0x6030e0
400200:   e8 ee 00 00 00          	call	<func_4002f3>			#  <func_4002f3> 4002f3(.text+0x1af)
400205:   48 89 c7                	mov	rdi, rax
400208:   e8 2a 15 00 00          	call	<func_401737>			#  <func_401737> 401737(.text+0x15f3)
40020d:   f4                      	hlt
+ Stopped after finding symbol 'func_40020e' (8 instructions)

Normally, you should start disassembling the two functions calls in the __entry_point, but I can tell you that this is how the _start function looks like for a dietlibc binary. I can also tell you that dietlibc usually puts the main function at the beginning of the .text segment.

Alternatively, as this is a small binary, you can just dump the whole .text section using the command dis.section .text and look for the messages printed in the screen. You will quickly find the main function. I’m working on more options to easily find the part of the program you may be interested on.

Looking at main

So, let’s rename our .text symbol to main and take a look to the function.

STAN] > func.rename .text main
 + Found function .text
 + Found Symbol .text
STAN] > dis.function main
+ Function 'main'@0x400144 found at section '.text'(7531 bytes)
+ Disassembling function [email protected]
  * Analysing 3398 instructions at (0x400144)

                                  main:
400144:   55                      	push	rbp
400145:   31 c0                   	xor	eax, eax
400147:   b9 04 00 00 00          	mov	ecx, 4
40014c:   53                      	push	rbx
40014d:   48 81 ec 18 04 00 00    	sub	rsp, 0x418
400154:   48 89 e7                	mov	rdi, rsp
400157:   48 8d 6c 24 10          	lea	rbp, qword ptr [rsp + 0x10]
40015c:   f3 ab                   	rep stosd	dword ptr [rdi], eax
40015e:   bf b0 1e 40 00          	mov	edi, 0x401eb0		#  <.rodata> 401eb0(.rodata+0)  : '0x00sec Easter Challenge'
400163:   e8 4d 05 00 00          	call	<func_4006b5>			#  <func_4006b5> 4006b5(.text+0x571)
400168:   bf c9 1e 40 00          	mov	edi, 0x401ec9		#  401ec9(.rodata+19)  : 'Enter Password: '
40016d:   31 c0                   	xor	eax, eax
40016f:   e8 82 04 00 00          	call	<func_4005f6>			#  <func_4005f6> 4005f6(.text+0x4b2)
400174:   48 8b 15 a5 2e 20 00    	mov	rdx, qword ptr [rip + 0x202ea5]		#  603020(.data+20)  : '00`'
40017b:   be 00 04 00 00          	mov	esi, 0x400
400180:   48 89 ef                	mov	rdi, rbp
400183:   e8 fb 03 00 00          	call	<func_400583>			#  <func_400583> 400583(.text+0x43f)
400188:   31 c0                   	xor	eax, eax
40018a:   48 83 c9 ff             	or	rcx, 0xffffffffffffffff
40018e:   48 89 ef                	mov	rdi, rbp
400191:   f2 ae                   	repne scasb	al, byte ptr [rdi]
400193:   ba 10 00 00 00          	mov	edx, 0x10
400198:   48 89 ee                	mov	rsi, rbp
40019b:   48 89 e7                	mov	rdi, rsp
40019e:   48 f7 d1                	not	rcx
4001a1:   c6 44 0c 0e 00          	mov	byte ptr [rsp + rcx + 0xe], 0
4001a6:   e8 46 03 00 00          	call	<func_4004f1>			#  <func_4004f1> 4004f1(.text+0x3ad)
4001ab:   48 89 e7                	mov	rdi, rsp
4001ae:   e8 a9 00 00 00          	call	<func_40025c>			#  <func_40025c> 40025c(.text+0x118)
4001b3:   85 c0                   	test	eax, eax
4001b5:   b9 08 00 00 00          	mov	ecx, 8
4001ba:   74 13                   	je	<l0>			# 4001cf(.text+0x8b)
4001bc:   ba 3a 00 00 00          	mov	edx, 0x3a		# ':'
4001c1:   be 10 30 60 00          	mov	esi, 0x603010		#  603010(.data+10)  : 0x1df32e0
4001c6:   48 8b 3d 3b 2e 20 00    	mov	rdi, qword ptr [rip + 0x202e3b]		#  603008(.data+8)  : 0x1df32d8
4001cd:   eb 11                   	jmp	<l1>			# 4001e0(.text+0x9c)
                                    l0:
4001cf:   48 8b 3d 2a 2e 20 00    	mov	rdi, qword ptr [rip + 0x202e2a]		#  <.data> 603000(.data+0)  : 0x1df32d0
4001d6:   ba 19 00 00 00          	mov	edx, 0x19
4001db:   be 10 30 60 00          	mov	esi, 0x603010		#  603010(.data+10)  : 0x1df32e0
                                    l1:
4001e0:   e8 29 00 00 00          	call	<func_40020e>			#  <func_40020e> 40020e(.text+0xca)
4001e5:   48 81 c4 18 04 00 00    	add	rsp, 0x418
4001ec:   5b                      	pop	rbx
4001ed:   5d                      	pop	rbp
4001ee:   c3                      	ret
+ Stopped after finding symbol '__entry_point' (43 instructions)

Let’s start making some assumptions to narrow down the interesting part of the program. The first function call receives as first parameter (the value of register RDI) a welcome string. Then the next function call does the same to print a prompt for the password. Both functions are different… probably one is printf and the other is puts… actually we do not care for the time being. Let’s rename them to maybe_printf and maybe_printf1.

Following the same reasoning, the third function will probably be a fgets as the first parameter is a local variable in the stack (actually is the bottom of the stack RSP). After renaming then the initial part of the program will look like this:

main:
400144:   55                      	push	rbp
400145:   31 c0                   	xor	eax, eax
400147:   b9 04 00 00 00          	mov	ecx, 4
40014c:   53                      	push	rbx
40014d:   48 81 ec 18 04 00 00    	sub	rsp, 0x418
400154:   48 89 e7                	mov	rdi, rsp
400157:   48 8d 6c 24 10          	lea	rbp, qword ptr [rsp + 0x10]
40015c:   f3 ab                   	rep stosd	dword ptr [rdi], eax
40015e:   bf b0 1e 40 00          	mov	edi, 0x401eb0		#  <.rodata> 401eb0(.rodata+0)  : '0x00sec Easter Challenge'
400163:   e8 4d 05 00 00          	call	<maybe_printf>			#  <maybe_printf> 4006b5(.text+0x571)
400168:   bf c9 1e 40 00          	mov	edi, 0x401ec9		#  401ec9(.rodata+19)  : 'Enter Password: '
40016d:   31 c0                   	xor	eax, eax
40016f:   e8 82 04 00 00          	call	<maybe_printf1>			#  <maybe_printf1> 4005f6(.text+0x4b2)
400174:   48 8b 15 a5 2e 20 00    	mov	rdx, qword ptr [rip + 0x202ea5]		#  603020(.data+20)  : '00`'
40017b:   be 00 04 00 00          	mov	esi, 0x400
400180:   48 89 ef                	mov	rdi, rbp
400183:   e8 fb 03 00 00          	call	<maybe_fgets>			#  <maybe_fgets> 400583(.text+0x43f)

This looks very reasonable, so, for the time being, we will continue. We will find out later if we have to come back and look into those function in more detail, but I can tell you right now, that this is not the case.

Going on

Just after, the function that we have renamed as maybe_fgets, there is a block of code just before the next function call. Well, that is actually an strlen, but for the shake of our own enlightenment let’s rip it off.

400188:   31 c0                   	xor	eax, eax
40018a:   48 83 c9 ff             	or	rcx, 0xffffffffffffffff
40018e:   48 89 ef                	mov	rdi, rbp
400191:   f2 ae                   	repne scasb	al, byte ptr [rdi]

First we set EAX to 0. As you know, this is the value that indicates the end of the a string in C. In other words, the value we have to look for to calculate the length of the string. Then we set RCX to -1. We will discuss this in a sec. Finally we set RDI to RBP. If you double check again the maybe_fgets call just before, you will see that the first parameter is actually RDP… so we are effectively setting RDI to the memory address where the user input has been stored.

Then, the next instruction is the one that does all the magic. Let’s look at it in detail.

repne scasb	al, byte ptr [rdi]

The repne is a instruction modifier for strings operations, and scasb is one of those string operations supported by Intel processors. When the repne modifier is used, the next instruction (scansb in this case) will be repeated while the content of RDI and the value of AL are different (NE stands for NonEqual).

As you can figure out by now, there are different repXX modifiers depending on the condition we are interested on.

But, repXX modifier does a few more things. Every time the instruction is repeated, two things happens at the same time. The RCX register is decreased and the RDI register is increased. There are other string operations that also makes use of RSI and in addition to the two operation we have just mentioned, RSI is also increased.

Note: RDI stands for Destination Index and RSI stand for Source Index. Intel processor traditionally have assigned special powers to certain registers :slight_smile:

In this specific case we will be scanning (SCAn String Byte) the memory pointed by RDI until the flag Z is set (this is the ne non-equal condition on rep). This may happen because RCX becomes 0 (and this is the reason why we start with the largest possible value for RCX, i.e. -1) or because we’ve found a byte with the same value than AL (that is 0 in this case).

Whenever the condition is satisfied, we have to convert the value stored on RCX, as we haven been counting negative numbers. What we need to do is inverted again with a not operation to get the absolute value of out counter, and subtract 1 in order to obtain the actual length of the string. I’m not going to explain this, but you can take a look to this (https://en.wikipedia.org/wiki/Two’s_complement) to get better understand why.

So, if we consider the last mov, what all this code actually does is:

buffer[strlen(buffer) - 1] = 0

OK, this may not be that obvious, so take a look to this line, the one actually writing the 0 at the end of the string:

mov	byte ptr [rsp + rcx + 0xe], 0

If you look at the beginning of our main function you will find this line:

lea	rbp, qword ptr [rsp + 0x10]

This means that RSP and RBP are 0x10 bytes away. If we index against RSP instead of RBP, the we add RCX (that actually contains strlen(string) + 1) and we decrement this by 1 we can conclude that:

  RBP        + strlen    - 1 = 
(RSP + 0x10) + (RCX - 1) - 1 = RSP +RCX + 0xe

Now that we fully understand this piece of code, we find a new function to tackle. You should be able to deal with it by yourself… I can tell you it is a strncpy :slight_smile:

Checking the Password

Finally we get to the function that checks the password: func_40025c. Let’s rename it to check_password, and let’s take a look:

func_40025c:
40025c:   31 c0                   	xor	eax, eax
40025e:   48 83 c9 ff             	or	rcx, 0xffffffffffffffff
400262:   48 89 fa                	mov	rdx, rdi
400265:   f2 ae                   	repne scasb	al, byte ptr [rdi]
400267:   31 c0                   	xor	eax, eax
400269:   48 83 f9 f6             	cmp	rcx, 0xf6
40026d:   75 27                   	jne	<l4>			# 400296(.text+0x152)
40026f:   31 c0                   	xor	eax, eax
                                l6:
400271:   0f b6 0c 02             	movzx	ecx, byte ptr [rdx + rax]
400275:   0f b6 b0 10 30 60 00    	movzx	esi, byte ptr [rax + 0x603010]		#  603010(.data+10)  : 0x6a42e0
40027c:   83 e9 30                	sub	ecx, 0x30		# '0'
40027f:   39 ce                   	cmp	esi, ecx
400281:   75 10                   	jne	<l5>			# 400293(.text+0x14f)
400283:   48 ff c0                	inc	rax
400286:   48 83 f8 08             	cmp	rax, 8
40028a:   75 e5                   	jne	<l6>			# 400271(.text+0x12d)
40028c:   b8 01 00 00 00          	mov	eax, 1
400291:   eb 03                   	jmp	<l4>			# 400296(.text+0x152)
                                l5:
400293:   31 c0                   	xor	eax, eax
400295:   c3                      	ret
                                l4:
400296:   c3                      	ret
+ Stopped after finding symbol 'func_400297' (20 instructions)

Not bad… uhm?. OK, you should now recognize the code at the beginning of the function… Don’t we?. No?. Fine, just go back to the previous section and read it again.

The only difference here is that the RCX is not fully converted to the string length… the comparison is done against the lower byte of the negative number… The compiler just saves a few instructions there (a not and a sub or dec :).

Now that we know the length is right, we can go into a typical loop. I think you can really understand this code by now, so I will just add some brief comment at each line:

    xor     eax, eax                         ; EAX = 0 This is our loop counter
loop:
    movzx	ecx, byte ptr [rdx + rax]        ; ECX = [RDX + RAX] = user_input[rax]
    movzx	esi, byte ptr [rax + 0x603010]	 ; ESI = [603010 + RAX] = some_data[rax]
    sub	ecx, 0x30		         ; ECX= ECX = '0' (ASCII 2 NUMBER)
    cmp	esi, ecx                         ; if ECX != ESI
    jne	<l5>			         ;   return 0
    inc	rax				 ; RAX++
    cmp	rax, 8				 ; if RAX != 8)
    jne	<l6>			         ;   continue
    mov	eax, 1				 ; else
    jmp	<l4>				 ; return 1

What translated into C will lead to something like this

for (RAX = 0; RAX < 8; RAX++)
  if ((user_input[RAX] - '0') != password[RAX]) return 0;

return 1;

This was really straightforward… wasn’t it?.. Our password is actually at 0x603010… So let’s take a look with STAN:

STAN] > mem.dump x 603010 8
+ Dumping 8 items from segment '.data'
0x603010 : 08 05 02 01 03 02 07 09                         |........

There you go, the password is : 85213279… Let’s try:

0x00sec Easter Challenge
Enter Password: 85213279
Well done. But... this is not the bean you have to grind!

What?.. what does this mean?

Finishing

The idea was that the message above will be a hint to keep looking for something more. There are a couple more hints on the comments in the original challenge to lead you into the path of looking farther.

So, let’s take a closest look to the file… strings is our best friend:

$ strings eo
(...)
f\'F9/U|O
G4NA
META-INF/
META-INF/MANIFEST.MFPK
e3/E3.classPK
e3/S.classPK

OK… we see some strings in there that suggest it may be some Java code hidden. So, it looks like we have to grind some coffee beans (I’m afraid the hint wasn’t that good after all ). Alternatively you could also have run binwalk and get a result like this:

$ binwalk eo

DECIMAL   	HEX       	DESCRIPTION
-------------------------------------------------------------------------------------------------------
0         	0x0       	ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV)
12824     	0x3218    	LZMA compressed data, properties: 0x24, dictionary size: 16777216 bytes, uncompressed size: 33554432 bytes
12952     	0x3298    	LZMA compressed data, properties: 0x36, dictionary size: 16777216 bytes, uncompressed size: 50331648 bytes
13080     	0x3318    	LZMA compressed data, properties: 0x41, dictionary size: 16777216 bytes, uncompressed size: 805306368 bytes
13208     	0x3398    	Zip archive data, at least v2.0 to extract, name: "META-INF/"
13269     	0x33D5    	Zip archive data, at least v2.0 to extract, name: "META-INF/MANIFEST.MF"
13425     	0x3471    	Zip archive data, at least v2.0 to extract, name: "e3/E3.class"
14450     	0x3872    	Zip archive data, at least v2.0 to extract, name: "e3/S.class"
15374     	0x3C0E    	End of Zip archive

So… it looks like there is a jar file inside the binary, this is a ZIP file containing a MANIFEST and some Java class files… Jar files are pretty cool because they can be anywhere and Java is able to find them and run them… Let’s try:

$ java -jar eo
Password:

WoW… another password to crack!!!.. We are so lucky :slight_smile:

#This is a lot easier
Yes, there is another challenge inside the challenge, but this time is Java. Let’s grab a decompiler and hope the bytecode has not been obfuscated ;).

I just quickly google it and found JD-GUI. There are probably better options but I do not do much Java reversing so I cannot really propose a better candidate. This one worked just fine. Let’s download it and give it a try.

$ curl http://jd.benow.ca/jd-gui/downloads/jd-gui-0.3.5.linux.i686.tar.gz | tar xz
$ ./jd-gui

OK… jd-gui cannot directly open the binary, so we better extract it. We can use the flag -e (for extract) with binwalk (you will get a bit of crap in the folder) or just unzip the binary. You can also be c00l and actually extract the Jar/ZIP file. Let’s see how to do this just for the LuLz

Zip files starts with the characters PK, so we have to look for that characters. We will use xxd because what we need is the offset inside the file where the zip file starts.

$ xxd eo | grep PK | head -1
0003390: 0000 0000 0000 0000 504b 0304 1400 0800  ........PK......

So the offset is 3390 + 8 in hexadecimal. Converting it to decimal we get the value 13208… Now just dd it

$ dd if=eo of=eo.zip bs=1 skip=13208 count=1M

I just used a count of 1Megabyte to dump everything after the beginning of the ZIP mark (because the file is smaller than that). You may try to calculate the end of the file… and provide a specific count. I leave this as an exercise to you to sharp your spec reading skills :slight_smile: . In this case the ZIP file has been attached to the end of the file so just giving a big count value works fine.

After that you can open the java code in your preferred decompiler and you will get a clean Java source code.

Now it is up to you to interpret the code… it should be pretty straightforward. You will need to do some operations with the numbers you will find in the Java source… but that should be easy for you… it is just Java code…

Well, this was it… a easter egg challenge in a challenge :)… and whenever you crack that easter egg… well, you could cook a pie or an omelette.

Conclusions

Let’s finish this paper with some conclusions.

  • We have learned a bit about string operations in assembler. Probably many of you already mastered this but I believe some newbies may have learned something
  • We have also learned that following a systematic approach may save us some time. Check your binary before jumping into the assembly. Use strings, binwalk, pay attention to the sections and segments reported by readelf, look for strange permissions or non-matching sizes,…
  • We have also learned that STAN rocks!

In case you want to take a look to the binary using STAN, you can use the following STAN case file. Remember just open the binary with STAN and then load the case with case.load filename.

S:main:0x400144
S:some_print_stuff:0x40020e
S:check_password:0x40025c
S:strncmp:0x4004f1
S:maybe_fgets:0x400583
S:maybe_printf1:0x4005f6
S:maybe_printf:0x4006b5
L:BadBoy:0x4001cf
L:GoodBoy:0x4001e0
L:loop:0x400271
L:return_0:0x400293
L:DONE:0x400296
F:main:0x400144
F:some_print_stuff:0x40020e
F:check_password:0x40025c
F:strncmp:0x4004f1
F:maybe_fgets:0x400583
F:maybe_printf1:0x4005f6
F:maybe_printf:0x4006b5
C:0x40019e:RCX = strlen (user_input) + 1
C:0x4001a1:user_input[strlen(user_input) - 1] = 0 -> chomp
C:0x40026d:Exits if length of strings is different 8 (-1 0xfff... -8 = 0xfff...f6)
C:0x400275:mem.dump x 603010 8 -> 08 05 02 01 03 02 07 09
C:0x40027c:Convert ASCII to int for numbers
C:0x400286:Loop from 0 to 8 using RAX as counter
C:0x40028c:Return 1 on success... we have completed the 8 iterations

Hack Fun


#2

You need to stop being so good man! If you manage to add some sort of symbolic execution features in STAN, it’d be pure orgasm.

By the way, my apologies for not posting my write up for your challenge but I got caught up with exams.


#3

Great write up for your own challenge, promoting your own tool again.
Smart move :stuck_out_tongue: .

Seriously though, really solid article which is easy to follow.
Helped me and most likely will help all the new people who want to dive into RE as well :slight_smile:


(pico) #4

@_py Thanks mate… no worries I know you are busy and as @ricksanchez it is all about promotion :slight_smile:… I have been looking at Unicorn Engine… :wink:

@ricksanchez thanks mate. Glad to hear you liked it


(Not a N00b, but still learning) #5

Meant to be “Dumping Core”? :slight_smile:


(pico) #6

Thanks @SmartOne I will fix the typo in the next commit