RE guide for beginners: Methodology and tools

linux
reverseengineering
tutorial
methodology

#1

Hi fellas,

A few days ago, I decided to start my adventure in reverse engineering. I was quickly overwhelmed by a flood of information and opcodes that confused me a lot, even with solid knowledge of assembly and programming.

Reverse engineering can seem complex at first glance. However, with a good methodology and toolkit, everything starts to make more sense.

This article aims to guide you, based on my own experience, through your first steps in this strange universe.

Methodology

So, here we are: you downloaded your first binary and now… what to do? RE requires two types of analysis, static and dynamic. Static analysis gives you an overview and a better understanding of what is going on within the binary, whereas dynamic analysis lets you follow, step by step, the changes that occur in each register, which system calls are used, etc.

The following methodology is pretty basic. We start with static analysis to spot odd pieces of code, which are then examined in depth through dynamic analysis. Pretty simple, right? But which tools can you use?

Static analysis

I must admit that I didn’t take the time to assess the different tools available on the internet. I jumped straight to Binary Ninja because of its low cost ($99) relative to the features it provides.

Binary Ninja is dedicated to static analysis and provides an awesome GUI, which is priceless when you have to deal with such an amount of information!

As you can see in the image above, Binary Ninja displays the code as a control-flow graph, making it easier to understand how the blocks interact with each other. Moreover, you can easily switch views via the select menu at the bottom right. Lastly, the left side lists every function, each accessible with a simple mouse click.

In addition, this software lets you:

  • Place comments within the code
  • Patch the binary through assembly or C code
  • Use an API to develop your own plugins and speed up the analysis process
  • Use the many plugins available from their GitHub
  • Use other features that I haven’t tried yet ^^

Note: A demo version is available for free and should be enough for beginners.

Dynamic analysis

Dynamic analysis can be done with various tools, e.g. gdb, radare2, etc. In my experience, radare2 is far from user-friendly. Without the cheat sheet, I wasn’t able to remember the shortcuts, which made me waste a lot of time! gdb, however, does the job pretty well. Moreover, the gdb user experience can be improved with peda (Python Exploit Development Assistance for GDB), which enhances the gdb display by colourising and showing disassembly, registers and memory information during debugging.

Here is the enhanced CLI:

Example

To show you how to apply this methodology, I chose to walk through how I reversed the third phase of the bomb lab, developed by Carnegie Mellon University, which @_py makes available on his CTF platform skidophrenia.

Here is the phase 3 entry point :

Assumption: the solution seems to have three components, two integers and one character.

Let’s break at address 0x08048bbf to inspect the register state.

Input tried: 1 2 3
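For readers who prefer to script this step, here is a minimal sketch that generates a GDB command file for the breakpoint above. The helper name `build_gdb_script` is hypothetical; the generated lines only use standard GDB commands (`break *ADDR`, a `commands` … `end` block, `info registers`, `run`).

```python
def build_gdb_script(breakpoint_addr: int, register: str = "eax") -> str:
    """Return a GDB command script that breaks at an address and prints a register."""
    lines = [
        f"break *{breakpoint_addr:#010x}",  # breakpoint on a raw code address
        "commands",                         # run the following each time it is hit
        f"  info registers {register}",
        "end",
        "run",                              # start the program under the debugger
    ]
    return "\n".join(lines) + "\n"


# Address taken from the article; the register to inspect is EAX.
script = build_gdb_script(0x08048BBF)
print(script)
```

Saved to a file (say `phase3.gdb`, a name chosen here for illustration), the script could be loaded with `gdb -q -x phase3.gdb ./bomb`, assuming the binary is called `bomb`.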

OK, it seems that the EAX register holds the return value of sscanf, i.e. the number of input items it successfully parsed. This confirms our previous assumption: at least three values are necessary to pass to the next block.
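To make the role of that count concrete, here is a rough Python model of what sscanf reports. It assumes the format string is `"%d %c %d"` (integer, character, integer), which is a guess based on the password order and is not shown in the article; the function name `count_matched` is made up for this sketch, and edge cases such as empty input are not modelled faithfully.

```python
import re


def count_matched(line: str) -> int:
    """Rough model of sscanf(line, "%d %c %d", ...): count converted items."""
    # ASSUMPTION: conversion order is integer, character, integer.
    patterns = [r"\s*([+-]?\d+)", r"\s*(\S)", r"\s*([+-]?\d+)"]
    matched = 0
    rest = line
    for pattern in patterns:
        m = re.match(pattern, rest)
        if m is None:
            break  # like sscanf, stop at the first conversion that fails
        matched += 1
        rest = rest[m.end():]
    return matched


print(count_matched("1 2 3"))  # three items parsed
print(count_matched("1 2"))    # only two items parsed
```

With the input `1 2 3` the model reports three parsed items, which matches the value observed in EAX; a shorter input such as `1 2` yields fewer and would send execution to the explosion path.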

Here are the next blocks :

Explanations

  1. Check if the first integer is above 0x7. If so, the bomb explodes (block not shown in the picture).
  2. Jump to the case corresponding to our first argument.
  3. Set the BL register to 0x6b and compare the third argument to 0xfb. If the values are equal, we jump to the next block; otherwise the bomb explodes.
  4. Check if the second argument is equal to BL, which was set previously. If not, the bomb explodes.

Consequently, we can assume that the password should be:

  • 3: selecting case 3
  • k: the ASCII character for 0x6b
  • 251: the decimal value of 0xfb
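The checks for this case can be sketched in Python as a sanity check on the constants. This is only a model of case 3 as described in the explanations, not the bomb's real code: the function name `phase3_case3` is made up, the other cases are not modelled, and the integer constant is written as decimal 251 (which is 0xfb in hex).

```python
def phase3_case3(first: int, letter: str, number: int) -> bool:
    """Model of the case-3 checks only; True means the bomb is defused."""
    if first > 0x7:        # step 1: case index out of range, bomb explodes
        return False
    if first != 3:         # steps 2: other cases exist but are not modelled here
        return False
    bl = 0x6B              # step 3: BL is loaded with 0x6b
    if number != 251:      # step 3: integer argument compared to a constant
        return False
    return ord(letter) == bl  # step 4: the character must equal BL


# Decode the two constants used above.
print(chr(0x6B), hex(251))
print(phase3_case3(3, "k", 251))
```

Decoding the constants this way before trying a password is a cheap habit: `chr()` and `hex()` catch conversion slips between hex, decimal and ASCII.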

Let’s try it !

Challenge completed! As you can see, this challenge didn’t need much dynamic analysis, but this is quite rare. I chose this exercise to show you the importance of taking your time with static analysis, because it can easily represent 70% of the work. So, scrupulously analyse each piece of code to reach your goal!

Conclusion

As demonstrated, reverse engineering is accessible to everyone. However, a decent knowledge of assembly, memory management and programming is essential. It will help you quickly identify where to focus your investigation in order to patch or bypass the security measures in place. Moreover, patience and dedication are qualities that will help you develop your RE skills.

In addition, to help with reversing 32-bit and 64-bit Linux binaries, I created a git repository containing a Docker image that embeds a few tools needed for such challenges.

I hope that you enjoyed your reading.

Best,
Nitrax

Note: Special thanks to @_py, who guided me through the unknown territory that is reverse engineering.


RE guide for beginners : bypassing SIGTRAP
#2

@Nitrax

Great introduction to that huge chapter of reversing.

On a side note you mentioned PEDA.
Maybe you wanna check out GEF.

As a reference from their wiki:

###But why not PEDA?
Yes! Why not?! PEDA is a fantastic tool to do the same, but only works for x86-32 or x86-64x whereas GEF supports all the architecture supported by GDB (currently x86, ARM, AARCH64, MIPS, PowerPC, SPARC) but is designed to integrate new architectures very easily as well!

Otherwise I’ll totally agree with you on the radare2 thing.
I mean awesome work from the development team to make such a powerful tool, but when you’re not using it on a daily basis you always get lost in it.
I seldom use it and always have to fire up the documentation and the cheatsheet on my second monitor…

//Edit:
@_py is bninja worth it? I took a brief look at it and it seemed rather incomplete?

Also from their FAQ:

###Will the price ever change?
Yes, the current pricing is introductory and expected to change some time following the release.
Additionally, some future features (eg, the decompiler) will likely either be a separate purchase or may result in the base price going up.

So how much will the end product be and how does it compare against IDA Pro?


#3

Never heard of it ! I will give it a try :slight_smile: ! Cheers mate.


#4

Sure thing. I currently have it installed too and wanna play around with it in the next days when I finished my current ongoing series here :smiley:


#5

IDA Pro and bninja are totally different. bninja is dedicated to static analysis, whereas IDA provides both. According to their website, bninja will provide dynamic analysis once their static engine is entirely done and, apparently, the development process is far from finished.

No, not at this time. We definitely plan for a debugging interface in the long term but right now we are focused on our static analysis efforts.

Of course, that hasn’t stopped others from integrating Binary Ninja into a dynamic analysis workflow. For example, check out Snare’s Binjatron plugin that connects Binary Ninja to Voltron which itself is a front-end for a variety of debuggers (GDB, LLDB, WinDbg and VDB at last count).


#6

@ricksanchez: I’ll express my opinion from a Linux user’s perspective.

Binary Ninja is totally worth it for Static Analysis. It helped me step up my RE game way too much/fast. Imo it’s one of the best tools you can start with if you are serious about RE since it’s not even that pricey (atm). 90 bucks is nothing compared to IDA which can range from 500-5k.

That being said, IDA is still the industry standard for Reverse Engineering and probably always will be (bninja hasn’t been out for more than 1-2 years). I mean, I recently started playing with it as well (didn’t purchase it of course, if you know what I mean) and the community for it is gigantic. There are endless plugins (you can build your own for bninja as well) made for it that can literally help you reverse almost anything without even being too skilled.

Having talked to some peeps who have talked to bninja’s devs, it seems like Binary Ninja won’t integrate any sort of Dynamic Analysis instrumentation like IDA does. It’s focused on Static Analysis and more importantly on the new upcoming trend called Symbolic Execution (you should definitely google it, it’s a massive area).

I’d highly recommend to grab a personal license while it stays in the range of 90 bucks and just mess with it. There’s an extensive list on their website on what bninja is capable of doing and trust me it’s a ton.


#7

in the current state they may be and that was my question but I’m sure BNinja aims to be an entry level reversing platform for <500$ to be a competitor to IDA, since named licenses there start at 500€.

And that’s the point. When will that be? And what functions will they offer then :stuck_out_tongue: ?

@_py okay you replied one second before me here.
Seems like a valid explanation. I might grab such a license. Was playing with the thought for a while but was unsure about it because I have no one with hands-on experience around me.

I read about symbolic execution already and it seems massive. It definitely will be helpful and change the game.

Oh and if I buy a license is it a named one for Linux and Windows or do I need multiple license for different OSs?


#8

The license is cross-platform. I have it installed on both Windows and Linux machines.

If you have any questions regarding its usage once you purchase it, feel free to hit me up on IRC. I can give you a quick rundown of its internals.

Another reason as to why bninja is a great starting point is the fact that there is even a book explaining IDA’s features. I think that says a lot about its learning curve.


#9

It should deserve an entire post :wink:


#10

okay thanks mate. I’ll hit you up once it’s up and running!

@Nitrax after @_py gives me the basic rundown I’d post an introductory post about it, if he’s not posting first :stuck_out_tongue:


#11

Feel free to do so, I’m planning to make some bninja-related write-ups as well.


#12

hmm sounds like you are referring to a five finger discount :grimacing:


#13

reads

doesn’t see perl

:frowning:


#14

“RE guide for beginners”

Deobfuscation isn’t beginner friendly, if you know what I mean :smirk:


#15

[spoiler]That is a very nice guide, but I was quite surprised by your choice of tools. I have nothing personal against bninja, but why radare? You have olly and immunity, which are both very powerful tools; the latter is open-source and has a ton of friendly plugins too.

Anyway, thanks for the effort, it was pleasant reading.[/spoiler]

Nevermind, was stupid not to notice that the author used ELF binary as the subject.


#16

I don’t believe either of them support ELF binaries. If there’s a plugin for that, I’d love to know about it. There is EDB which is the Linux counterpart and it’s not bad.


#17

You are absolutely right, my mistake for not reading carefully. Was foolish not taking notice that @Nitrax was RE’ing an ELF binary.

@Nitrax, ignore my previous comment.


(Art Vandelaiy) #18

Hey this is a great article. Reverse engineering is very new to me, and this was a great introduction.

Something off topic (and not in anyway a critique of your expertise), you might want to avoid putting space characters before exclamation marks. “Foo bar ! Foo bar.” should actually look like “Foo bar! Foo bar.” in typed English.


(Jordan) #19

The plan for pricing is described at the bottom of:

TL;DR – personal will be $149 after the introductory period ends, but it will be announced well in advance.

As for Static versus Dynamic, we’ll get there. As mentioned already, there are plugins that implement it already but it’s not as smooth because it can’t be natively integrated into the UI very well (that will change when the 1.2 release has a better mechanism for arbitrary GUI plugins)

And development will /always/ be ongoing. There’s tons of research we’re doing to improve things like the current state of the art in linear sweep, function similarity matching, etc. Emulation on the IL, lots of new interesting things underway. Doesn’t mean it’s not usable now, just that we’re dreaming big. :slight_smile:

Edited to add: For anyone with questions about Binary Ninja, we’ve got a public slack (hit the link above and look for the slack logo at the very bottom) that has a number of channels for different types of questions. Even if you’re brand new to RE, there’s a #ninjas-in-training channel just for folks to ask any question they like about RE.


#20

@psifertex thank you for taking your time and answering in depth to my questions!
I guess I missed the pricing section when snooping around on the website.

Hope to see some awesome ongoing development from you!