Continuing the discussion from Challenge: Brute and Smart:
Did you solved the Brute and Smart challenge (Challenge: Brute and Smart)? In case you did and you wonder if your solution was right or you didn’t solve it and you wonder how to do it… here is my write up.
The program asked for a 6 digit long PIN. Therefore, there are 1 million possible keys. Furthermore, the program implements a 10 seconds delay to prevent a Brute Force attack. If you try a direct brute force attack, it may take something like 1000000 * 10 = 10000000 secs … roughly 115 days. So we have to do better than that to get the secret message.
Reverse engineering
Let’s take a look to the program to see what can we do. The program does not have the symbols stripped so we get a quite clean disassembling. Let’s go with radare2 and take a look to the beginning of the main function
$ r2 sample
[0x00400270]> aaaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze len bytes of instructions for references (aar)
[x] Analyze function calls (aac)
[0x004001a3 ESIL infinite loop detectedences (aae)
0x004001ac ESIL infinite loop detected
[x] Emulate code to find computed references (aae)
[>] Scanning -r-x 0x40017c - 0x4003d9 skip
[>] Scanning mr-x 0x400000 - 0x400518 done
Analyzed 0 functions based on preludes
[x] Finding function by preludes (aap)
[x] Analyze consecutive function (aat)
[x] Analyze value pointers (aav)
[x] Constructing a function name for fcn.* and sym.func.* functions (aan)
[0x00400270]> pdf @main
;-- main:
;-- section_end..note.gnu.build_id:
;-- section..text:
;-- section_end.NOTE:
/ (fcn) sym.main 244
| ; CALL XREF from 0x00400281 (sym.main)
| 0x0040017c 55 push rbp ; [2] va=0x0040017c pa=0x0000017c sz=605 vsz=605 rwx=--r-x .text
| 0x0040017d 31c0 xor eax, eax
| 0x0040017f b900010000 mov ecx, 0x100
| 0x00400184 ba17000000 mov edx, 0x17
| 0x00400189 53 push rbx
| 0x0040018a 4881ec080800. sub rsp, 0x808
| 0x00400191 488b35580e20. mov rsi, qword [rip + 0x200e58] ; [0x600ff0:8]=0x40040e "k\W.]CLI...CP\Q^QRT\ABVX\\.RZZ.Z]_ARVMA;" LEA obj.msg1 ; obj.ms
g1
| 0x00400198 4889e7 mov rdi, rsp
| 0x0040019b 488d9c240004. lea rbx, [rsp + 0x400] ; 0x400 ; "co" @ 0x400
| 0x004001a3 f3ab rep stosd dword [rdi], eax
| 0x004001a5 4889df mov rdi, rbx
| 0x004001a8 66b90001 mov cx, 0x100
| 0x004001ac f3ab rep stosd dword [rdi], eax
| 0x004001ae 4889df mov rdi, rbx
| 0x004001b1 b10a mov cl, 0xa
| 0x004001b3 f3a5 rep movsd dword [rdi], dword ptr [rsi]
| 0x004001b5 bed9034000 mov esi, str.Enter_PIN__6_digits__: ; "Enter PIN (6 digits) : " @ 0x4003d9
| 0x004001ba bf01000000 mov edi, 1
| 0x004001bf e840010000 call sym.write
| 0x004001c4 31ff xor edi, edi
| 0x004001c6 ba00040000 mov edx, 0x400 ; "co" @ 0x400
| 0x004001cb 31c0 xor eax, eax
| 0x004001cd 4889e6 mov rsi, rsp
| 0x004001d0 e827010000 call sym.__libc_read
(....)
We can see how the program uses write
and read
instead of printf
or fgets
in order to show the message asking for the password and to read the PIN from the user. If you cannot see it directly, try to strace
the program and that should become pretty obvious. Nothing special so far… those rep stosd
are actually inlined memset
and memcpy
calls. You will see them in the source code later.
I’m not going to explain in details what all the code do. The relevant part is that, the encoded string is copied in the stack. at the beginning of the function (0x004001ac
to 0x004001b3
). Then there is an in-lined strlen
and a check of the string length. If the length is OK, the message Decoding...
is shown.
This is the inlined strlen
check and Decoding...
message.
| 0x004001d5 4883caff or rdx, 0xffffffffffffffff
| 0x004001d9 31c0 xor eax, eax
| 0x004001db 4889e7 mov rdi, rsp
| 0x004001de 4889d1 mov rcx, rdx
| 0x004001e1 f2ae repne scasb al, byte [rdi]
| 0x004001e3 4889e7 mov rdi, rsp
| 0x004001e6 48f7d1 not rcx
| 0x004001e9 c6440cfe00 mov byte [rsp + rcx - 2], 0
| 0x004001ee 4889d1 mov rcx, rdx
| 0x004001f1 f2ae repne scasb al, byte [rdi]
| 0x004001f3 4883f9f8 cmp rcx, -8
| ,=< 0x004001f7 7556 jne 0x40024f
| | 0x004001f9 ba0c000000 mov edx, 0xc
| | 0x004001fe bef1034000 mov esi, str.Decoding..._n ; "Decoding...." @ 0x4003f1
| | 0x00400203 bf01000000 mov edi, 1
| | 0x00400208 e8f7000000 call sym.write
Otherwise, if the length is not OK, the program exists using the following code (see the jne 0x4002af
):
| |```-> 0x0040024f bf01000000 mov edi, 1
| | 0x00400254 ba0a000000 mov edx, 0xa
| | 0x00400259 be03044000 mov esi, str.Wrong_PIN_n ; "Wrong PIN." @ 0x400403
| | 0x0040025e 31c0 xor eax, eax
| | 0x00400260 e89f000000 call sym.write
| | 0x00400265 bf01000000 mov edi, 1
| | ; JMP XREF from 0x0040024d (sym.main)
| `----> 0x0040026a e869000000 call fcn.004002d8
\ 0x0040026f 90 nop
The code shows the message Wrong PIN
and exits with exit code 1… Yes, take a look to fcn.004002d8
) and you will find the exit
system call.
So far so good. The program returns 1 if the length of the key is not right (6 chars in this case). Still nothing useful to attack the program. Let’s keep analysing the main function
.
| | 0x0040020d ba28000000 mov edx, 0x28 ; '('
| | 0x00400212 4889e6 mov rsi, rsp
| | 0x00400215 4889df mov rdi, rbx
| | 0x00400218 e873000000 call sym.mp1
| | 0x0040021d 3b05c50d2000 cmp eax, dword [rip + 0x200dc5] ; [0x600fe8:4]=0x31b20 LEA obj.cm ; " .." @ 0x600fe8
| ,==< 0x00400223 752a jne 0x40024f
| || 0x00400225 befe034000 mov esi, str.heco ; "heco" @ 0x4003fe
| || 0x0040022a 4889df mov rdi, rbx
| || 0x0040022d e80e010000 call sym.strstr
| || 0x00400232 4885c0 test rax, rax
| ,===< 0x00400235 7418 je 0x40024f
| ||| 0x00400237 bf02000000 mov edi, 2
| ||| 0x0040023c ba28000000 mov edx, 0x28 ; '('
| ||| 0x00400241 4889de mov rsi, rbx
| ||| 0x00400244 31c0 xor eax, eax
| ||| 0x00400246 e8b9000000 call sym.write
| ||| 0x0040024b 31ff xor edi, edi
| ,====< 0x0040024d eb1b jmp 0x40026a
(...)
Now, things become more interesting. The program calls a function named mp1
and checks the value returned by that function with the content of 0x600fe98 (a local variable with value 0x31b20
)… If the function returns the wrong value the program jumps to the exit code, showing the message Wrong PIN
and exiting, again, with code 1. Then it calls the strstr
function. This function looks for a substring within a given string. In this case, it is looking for the sub-string at address 0x400340
(heco
) within the string passed as first parameter to the mp1
function.
Again, if the string is not found, the program will end, exiting with exit code 1. Otherwise it will print the current buffer (the one passed to mp1
but in this case using stderr (file descriptor 2) instead of stdout). Then it will set EAX
to 0 and jump to the call to exit
(just check the previous assembler fragment).
So far we have found something useful. The program will return 0 when the key is correct and 1 otherwise. This is a clean condition to know when we have found the right password. A lot easier that parsing the output of the program and trying to figure out if there is some meaningful content on it.
The mp1
function
Other than a clean condition to stop our brute force attack, we haven’t found much yet. It is time to check the mp1
function.
[0x00400270]> pdf @sym.mp1
/ (fcn) sym.mp1 71
| ; CALL XREF from 0x00400218 (sym.mp1)
| 0x00400290 53 push rbx
| 0x00400291 4189d1 mov r9d, edx
| 0x00400294 31c9 xor ecx, ecx
| 0x00400296 31db xor ebx, ebx
| 0x00400298 41b806000000 mov r8d, 6
| ,=< 0x0040029e eb20 jmp 0x4002c0
| .--> 0x004002a0 99 cdq
| || 0x004002a1 41f7f8 idiv r8d
| || 0x004002a4 4863d2 movsxd rdx, edx
| || 0x004002a7 4801f2 add rdx, rsi
| || 0x004002aa 8a02 mov al, byte [rdx]
| || 0x004002ac 32040f xor al, byte [rdi + rcx]
| || 0x004002af 88040f mov byte [rdi + rcx], al
| || 0x004002b2 0fb612 movzx edx, byte [rdx]
| || 0x004002b5 0fb6c0 movzx eax, al
| || 0x004002b8 48ffc1 inc rcx
| || 0x004002bb 0fafc2 imul eax, edx
| || 0x004002be 01c3 add ebx, eax
| || ; JMP XREF from 0x0040029e (sym.mp1)
| |`-> 0x004002c0 4439c9 cmp ecx, r9d
| | 0x004002c3 89c8 mov eax, ecx
| `==< 0x004002c5 7cd9 jl 0x4002a0
| 0x004002c7 bf0a000000 mov edi, 0xa
| 0x004002cc 31c0 xor eax, eax
| 0x004002ce e841000000 call sym.sleep
| 0x004002d3 89d8 mov eax, ebx
| 0x004002d5 5b pop rbx
\ 0x004002d6 c3 ret
I will not explain this line by line, but this is a XOR encoding function using a key of 6 characters. The result is stored in the original buffer. The key itself is a parameter to the function. It is actually the string entered by the user so… the program does not store the key anywhere in the code. It uses the user provided key to decode the encrypted message. Furthermore, this code calculates a checksum of the decoded strings as it is decoded. That is actually the value the function returns and that is compared in the main function with the constant 0x31b20
. This is not really relevant and I’ll tell you later why it is there.
However, the end of the function is interesting. Let’s take a look again:
| 0x004002c7 bf0a000000 mov edi, 0xa
| 0x004002cc 31c0 xor eax, eax
| 0x004002ce e841000000 call sym.sleep
This is a call to Sleep
passing as parameter the value 0x0a
(10 secs). So, these are the 10 seconds we have to wait to be able to try another key… Sometimes, this timing can be checked somewhere else in the code and cannot be easily removed. In this case, we can see that there is nothing in the program taking into account these 10 secs delay, therefore, it looks like a way to prevent a brute force attack. The first thing to do is to remove that delay, because we will have to brute force attack the program anyway… In this case, there is no other way around.
After removing the delay, we can try the brute force attack and see how it well it will work
Note: Just look for the hex values above in your preferred Hex editor and change the sleep call (e841000000
) to NOPs (code 0x90
)
A Brute Force attack
So, we have to write a program (actually a script will be easier in this case) to try all the possible PIN codes. We know that the program will return 1 whenever we input the wrong code, so we can easily stop our attack when we are done.
I wrote my script in Perl (of course), and it looks like this:
#!/usr/bin/perl
$start = 0;
$end = 999999;
$start = $ARGV[0] if $#ARGV >= 0;
$end = $start + $ARGV[1] if $#ARGV == 1;
for ($i = $start; $i < $end; $i++)
{
$key = sprintf "%06d", $i;
$cmd = "echo $key | ./sample-no-sleep > /dev/null";
system ($cmd);
print "$key " . int(($i *100)/ 999999) . "%\n" if ($i && $i % 5000 == 0);
if ($? == 0)
{
printf "Key is $key\n";
last;
}
}
print "Last Key was: $key\n";
When executed without parameters, it will try all possible PIN codes. If we provide one parameter, it will try PIN codes starting with that value. This way we can stop the program and restart it later if needed. Finally, if we provide a second parameter then, the program will just check that number of keys. This allows us to launch the script multiple times and even run it in several computers to get the key broken faster.
I think the code is pretty straight forward. The only thing you need to know is that the variable $?
conveniently contains the exit code of the last executed program.
Running this script without parameters (checking all the keys in one go), took around 30 minutes for my computer to find the key. That is pretty reasonable, but… we can do a lot better.
This solution does not require you to completely reverse the encryption algorithm. We are using the program itself to do the decoding. If you manage to completely reverse the algorithm the brute force script will run a lot faster as we do not need to run a complete process. Kudos to @dtm for following that approach.
Alternatively, you can get the assembly for the mp1 function and use it in your program (we already know how to add asm function to our C code) without reversing it.
That strstr
function should be handy
Sure it is. Do you remember the hint provided for this challenge? No worries, here it is again:
By now, you should have figure out that the program has encrypted the target string using XOR encryption and that the key is not stored in the code at all. You need to find the key by brute forcing the password. Here it goes the hint:
If P is the plain text, C is the crypted text and K is the key, then:
P xor K = C
and
P xor C = K
Therefore, if you have the plain text (or part of it) then you can know the key (or part of it).
So, that strstr
function tries to find some characters in the decoded message. In other words, it tries to find some specific sequence of characters in the plain text message. Radare2 already show us the string heco
that is, a part of the plain text message.
Yes, I know. This strstr
thingy requires some explanation. I will provide it later when I show you the source code at the end of this write up.
Anyway, so we have the crypted message, we have a 6 characters long key and we have 4 plain characters. This means that, if we xor together the crypted message with the string we are looking for, we will get 4 digits of our key. Unfortunately, we do not know where those characters are within the whole string, and we do not know either which index of the key will affect each one.
This sounds a bit confusing, so, let’s do a diagram.
1111
01234567890123
CCCCCCCCCCCCCC This is our crypted strings
KKKKKKKKKKKKKK This is our key repeated module 6
01234501234501
heco
heco
heco
heco
heco
heco
heco
The substring can be in any of those positions. Note that, as the key is 6 digits, the string is actually being rotated (because the index 6 becomes 0 and index 7 becomes 1 and so on). With this information, we can write a simple program to try to decode the crypted string, using keys including the heco
. So I wrote this small program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define KEY_LEN 6
#define MSG_LEN 40
unsigned char *msg1 = "\x6b\x5c\x57\x59\x41\x43\x48\x03\x1d\x1e\x41\x5f\x5d\x5a\x5d\x55\x50\x5b\x5d\x4a\x41\x52\x5a\x53\x5d\x17\x51\x5e\x58\x18\x5b\x56\x5c\x45\x50\x59\x4c\x4a\x32\x31";
int find_num (unsigned char *s, int l)
{
int i;
for (i = 0; i < l; i++)
{
if (isdigit(s[i])) printf ("%c", s[i]);
else printf (" ");
}
printf ("\n");
}
int mp (unsigned char *s, unsigned char *k, int l)
{
char c;
int i;
for (i = 0; i < l; i++)
{
s[i] ^= k[(i % KEY_LEN)];
}
return 0;
}
int
main ()
{
unsigned char buffer[1024];
unsigned char msg[1024];
int c;
memset (buffer, 0, 1024);
memset (msg, 0, 1024);
memcpy (msg, msg1, MSG_LEN);
memcpy (buffer, "heco\0\0", 6);
mp (msg, buffer, MSG_LEN);
printf ("%s\n", msg);
find_num (msg, MSG_LEN);
mp (msg, buffer, MSG_LEN);
memcpy (buffer, "\0heco\0", 6);
mp (msg, buffer, MSG_LEN);
printf ("%s\n", msg);
find_num (msg, MSG_LEN);
mp (msg, buffer, MSG_LEN);
memcpy (buffer, "\0\0heco", 6);
mp (msg, buffer, MSG_LEN);
printf ("%s\n", msg);
find_num (msg, MSG_LEN);
mp (msg, buffer, MSG_LEN);
memcpy (buffer, "o\0\0hec", 6);
mp (msg, buffer, MSG_LEN);
printf ("%s\n", msg);
find_num (msg, MSG_LEN);
mp (msg, buffer, MSG_LEN);
memcpy (buffer, "co\0\0he", 6);
mp (msg, buffer, MSG_LEN);
printf ("%s\n", msg);
find_num (msg, MSG_LEN);
mp (msg, buffer, MSG_LEN);
memcpy (buffer, "eco\0\0h", 6);
mp (msg, buffer, MSG_LEN);
printf ("%s\n", msg);
find_num (msg, MSG_LEN);
mp (msg, buffer, MSG_LEN);
}
Yes, it is just 6 possibilities before they repeat so copy&paste worked faster that a loop . The program takes the crypted string (extracted from the binary using any tool you like) and encodes it with the heco
string that is shifted 1 character to the right 6 times to cover all possible positions of the substring. Then the function find_num
is called. This function just prints the numbers in the decrypted string. Finally we call again the crypt function to restore the crypt message for the next round.
The hint we have is that if we XOR together the crypted and plain text we get the key. In this case, the PIN is a numeric code, so we are interested only in numbers. Actually we are only interested in sequences of 4 consecutive numbers. Those are the ones that produces the heco
strings we are looking for. So, let’s see what does this program tell us.
$ make test
$ ./test
946AC f~qA_5?>:P[5/"=ZS5r21X33?*PY$/Q^
946 5 5 5 21 33
k42:.CHkx}._]286?[]"$15S]4=7[>9&?YL"WR
42 286 15 4 7 9
k\?<",Hu{"0]Z5034]J)79<]9;;w[V4 36LJZT
0 5034 79 9 4 36
\W1$ 'v$<2Z]=582JA:?02Q6={4V\-5:#J2Y
1 2 582 02 6 4 5 2
3WY)&+l):>5]U8>>%AR26>xQ^0}89\E8</%21
3 5 8 26 0 89 8 21
?8YA+-`rA7892UP38).RZ;8t>^Xp>53EP1))]1
8 7892 38 8 53 1 1
Nice, we see two 4 digits sequences: 5034
and 7892
. We know that the key will have those 4 digits plus 2 additional ones that we do not know. We do not know either the position within the 6 digit pin of these 4 digits, but we know that the are consecutive… Let’s brute force again:
Smarter Brute Forcing
With the information we had got, I wrote a new script to brute force the program. Here it is:
#!/usr/bin/perl
$val = $ARGV[0] if $#ARGV >= 0;
for ($i = 0; $i < 100; $i++)
{
$key = $val . sprintf "%02d", $i;
for ($j =0; $j < 6; $j++)
{
$key1 = substr ($key, $j) . substr ($key, 0, $j);
$cmd = "echo $key1 | ./c01 > /dev/null";
system ($cmd);
if ($? == 0)
{
printf "Key is $key\n";
exit;
}
}
}
print "Last Key was: $key\n";
The script receives as parameter a 4 digit number. Then, it completes the number with the two missing digits (from 00 to 99) and try the 6 PIN codes that result of shifting the just generated 6 digits code. This requires in total: 100 * 6 = 600 PIN checks.
Let’s try the script with the 4 digit candidates we found before:
$ ./crack1.pl 5034
Last Key was: 503499
$
Bad luck with the first one. Let’s try the next one:
$ ./crack1.pl 7892
CENSORED MESSAGE.
Key is CENSORED
Bingo!!!
Again, you can implement the encryption algorithm in the script and make it run a lot faster. However, for this solution it just takes a few seconds to find the key.
The source Code
To finish with this paper, this is the source code of the challenge:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define KEY_LEN 6
#define MSG_LEN 40
unsigned char *msg1 = "\x6b\x5c\x57\x11\x5d\x43\x4c\x49\x08\x1e\x1a\x43\x50\x5c\x51\x5e\x51\x52\x54\x5c\x41\x42\x56\x58\x5c\x5c\x1c\x52\x5a\x5a\x17\x5a\x5d\x5f\x41\x52\x56\x4d\x41\x3b";
int cm = 203552;
int mp1 (unsigned char *s, unsigned char *k, int l)
{
char c;
int i, sum = 0;
for (i = 0; i < l; i++)
{
s[i] ^= k[(i % KEY_LEN)];
sum += s[i] * k[(i % KEY_LEN)];
}
sleep (10);
return sum;
}
int
main ()
{
unsigned char buffer[1024];
unsigned char msg[1024];
int c;
memset (buffer, 0, 1024);
memset (msg, 0, 1024);
memcpy (msg, msg1, MSG_LEN);
write (1, "Enter PIN (6 digits) : ", 23);
read (0, buffer, 1024);
buffer[strlen(buffer) - 1] = 0;
if (strlen (buffer) != 6)
goto wrong;
write (1, "Decoding...\n", 12);
c = mp1 (msg, buffer, MSG_LEN);
if (c == cm && strstr (msg, "heco"))
{
write (2, msg, MSG_LEN);
exit (0);
}
wrong:
write (1, "Wrong PIN\n", 10);
exit (1);
}
After all your reverse engineering efforts, there are not many surprises I guess. There is only one thing to explain. I wanted the program to do not store the password at all so you will have to brute force it. The problem is that I needed the program to be able to detect if the password was right or not. In a normal case I will probably use a hash of the plain text (MD5, SHA1), but that will bring a lot of complexity to the code and make the challenge even more difficult.
So I just calculated a very basic checksum, but it didn’t worked very well, and that is why I added the strstr
check. So, when the user enters a PIN, the message is decoded with that PIN and a checksum generated. There are many PINs that produces that checksum but do not decode the message right so I added a second check to ensure that a specific sequence of characters shall appear in the plain message. Probably the checksum can be removed now… consider it noise
I just used 4 characters so you could only extract a partial password (as we have seen above) but you get a chance to heavily improve a brute force attack.
#Conclusions
Well, I’m not sure if doing the analysis described here takes longer that a direct Brute force attack but I believe it is a lot more interesting. In this challenge I try to point out the following ideas:
- Sometimes, reverse engineering is not enough… Or, in other words. Sometimes brute force is the only solution.
- Blindly brute forcing is not very effective. With a little bit of extra work you can do it a lot better
- Perl is cool