Assembly/Malware Analysis - Platform, Architecture, Resources

Hey guys,

I’m planning on learning Malware Analysis and Reverse Engineering, but was wondering if someone could help clarify a few questions that I have regarding Assembly.

1. The primary architecture I should focus on is x86, and possibly ARM down the road, right? x86 because that’s what most PCs run on, and ARM for mobile devices such as Android, once I have x86 covered?

2. After some research, the following seem to be the most recommended learning materials for Assembly:

The Art of Assembly Language
Programming from the Ground Up
Assembly Language Step-by-Step: Programming with Linux
http://pacman128.github.io/pcasm/

However, some of these materials use NASM, while others use MASM (Microsoft Assembler) or AS (GNU Assembler).

Furthermore, some use Linux while others use Windows as their platform. And when it comes to Assembly, the assembler and the platform you use make a difference, right? The syntax differs between assemblers (AT&T vs. Intel), and depending on the OS, the opcodes and system calls also differ, since the Assembly code will often interact with the system you’re running on, correct?

So I’m kind of confused as to which book I should use at the moment, since they use different assemblers and OS platforms. Or does it not make much of a difference?

While I’m using Linux as my main OS, does it make sense to learn Assembly on Linux if Malware Analysis largely focuses on malware targeting the Windows operating system?

3. This is more of a general question on Malware Analysis than on Assembly, but if I plan on getting more involved in Malware Analysis, what platform is more suitable for this type of work, Linux or Windows? Are there essential tools of the trade that are only available for Windows? Or does it not really matter, since there are great static analysis tools available for both, and others could easily be installed in a VM for dynamic analysis if need be?


Hi!

I just thought I’d give my two cents on the topic. I started learning malware analysis a few months ago as well, and it is not an easy thing to do!

Assembly language is very much architecture specific (x86 vs ARM) and OS specific due to calling conventions and system calls. I personally learned on x86 Windows & Linux first. Bear in mind that a background in C programming will help a lot.
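
To make the calling-convention point a bit more concrete, here is a rough Python sketch (Python is only used for illustration, and the function name `assign_int_args` is made up). It shows which registers carry the first integer arguments of a call under the two common x86-64 conventions, System V AMD64 (Linux) and Microsoft x64 (Windows). Floating-point arguments, shadow space and stack alignment are deliberately left out.

```python
# Integer/pointer argument registers, in order, for two x86-64 calling conventions.
INT_ARG_REGISTERS = {
    "sysv_amd64": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],  # Linux and other Unix-likes
    "ms_x64": ["rcx", "rdx", "r8", "r9"],                    # 64-bit Windows
}


def assign_int_args(convention, args):
    """Map integer arguments to registers; anything left over goes on the stack.

    Hypothetical helper for illustration only, not part of any real tool.
    """
    regs = INT_ARG_REGISTERS[convention]
    in_regs = dict(zip(regs, args))
    on_stack = list(args[len(regs):])
    return in_regs, on_stack


if __name__ == "__main__":
    call_args = [10, 20, 30, 40, 50]
    for conv in INT_ARG_REGISTERS:
        print(conv, assign_int_args(conv, call_args))
```

The same five-argument call ends up entirely in registers on Linux, while on Windows the fifth argument is passed on the stack, which is the kind of difference you have to keep in mind when reading disassembly from either platform.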

I found the (free) courses from OpenSecurityTraining the easiest to learn from (http://opensecuritytraining.info/IntroX86.html), although please note that they focus on teaching you how to read assembly. While following the course I wrote simple programs using NASM (writing a function in C and then in Assembly, and comparing your result with the compiler’s output, is an amazing exercise).

Concerning Malware Analysis itself, there are some amazing links on this forum; the search bar will help :slight_smile: Windows will definitely be needed if you intend to analyze Windows samples, and there are a number of tools for Windows (such as IDA) which make your life a bit easier. Quick note: virtual machines or a completely isolated machine are a necessity for malware analysis.

Please be cautious with malware analysis: there are tricks authors can use to break some tools, trick analysts into infecting their machines, etc. This depends on the piece of malware, of course.

Good luck with your learning! It’s definitely a fun and challenging field!

Hi,

You should have searched through the site before asking this question. @dtm recently posted a “guide” on how to get into Malware Analysis (use the search bar) for the Windows platform.

We also have a bunch of learning resources for Assembly (use the search bar).

You seem to be sort of rushing. You are asking which platform is more “suitable” for Malware Analysis while you don’t know the basics/fundamentals yet (from my understanding at least), which will make you get bored of Malware Analysis/Reverse Engineering quite soon.

Forget Malware Analysis for now or the platform for it.

  • Master the memory hierarchy/organization - Though each OS has different internals, the basic idea always stays the same (google the term).

  • Master the binary internals - What do I mean by that? Every executable is more than just instructions. Their format is quite complex and even though each OS has different format implementation, the core concepts always stay the same. Learn where and why each bit will be stored during compilation.

  • Get in touch with assembly and read loads of it even if you don’t understand what it does. Write C programs, compile them, reverse them, optimize them.

  • Get in the habit of being frustrated for hours/days/weeks/months.

  • Get in the habit of reading endless papers and watching conference talks.
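
As a tiny first step towards the "binary internals" point above, here is a sketch that identifies an executable's container format from its first bytes. ELF files start with `\x7fELF` and Windows executables start with an `MZ` DOS header. The function names are made up for this example, and real tools check far more than these few bytes.

```python
def identify_format(data):
    """Guess the executable format from the leading magic bytes."""
    if data.startswith(b"\x7fELF"):
        # Byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit ELF.
        ei_class = data[4] if len(data) > 4 else 0
        bits = {1: "32-bit", 2: "64-bit"}.get(ei_class, "unknown class")
        return "ELF (" + bits + ")"
    if data.startswith(b"MZ"):
        # Only a candidate: the PE signature still has to be checked.
        return "MZ/PE candidate"
    return "unknown"


def identify_file(path):
    """Read the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_format(handle.read(16))


if __name__ == "__main__":
    # In-memory samples so the demo does not depend on any file being present.
    print(identify_format(b"\x7fELF\x02\x01\x01" + b"\x00" * 9))
    print(identify_format(b"MZ\x90\x00"))
```

Calling `identify_file` on a compiled C program of your own is a quick way to start poking at the format before opening it in a hex editor.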

These are my two cents on how you should approach this field. I’m not a pro, but imo this sort of mindset will help you along the way. Work smarter, not harder. Once you master the basics, no matter how advanced the subject is, it’ll make sense.

You’ve got a lot to google, best of luck.


Especially these will be present during the whole learning phase and even afterwards. It’s just a big part of reversing/analysis.

But all in all, I totally agree with you, @_py.
The most important thing is and always will be mastering the basics!


I started the thread with the intention of only asking about the basics (the fundamentals), the absolute starting point and requirement for malware analysis: Assembly.

I have looked at dtm’s thread and, like you have suggested, my plan was to basically not bother with anything else and just focus on learning Assembly first. In all honesty, that was the purpose of this thread: just asking about learning assembly for Windows vs. for Linux, the various assemblers, etc.

But while making the thread, I guess my eagerness got the best of me and I sort of just mindlessly filled the page with other questions that aren’t so important until I’ve actually learned Assembly. My apologies.

I’m actually learning C++ at the moment and plan on reading K&R’s book on C afterwards. I just figured I would also start learning Assembly in conjunction with C++ since it should provide a deeper understanding of how everything works.

Don’t ever apologize for wanting to learn. :wink: It’s a good thing which will make you advance and make you step on a whole other level compared to others who don’t bother.

You didn’t commit a crime, I get your point. When I wanted to get into the low-level field I was exactly like you. But then I realized how essential the basics are.

The reason I responded wasn’t to shit on you, but to give you an overview of my personal experience and how I tackled this field, since I could relate to your confusion.

Malware Analysis/RE/Exploit dev and anything low-level related is pure chaos in the beginning.

I’m actually learning C++ at the moment and plan on reading K&R’s book on C afterwards. I just figured I would also start learning Assembly in conjunction with C++ since it should provide a deeper understanding of how everything works.

That’s actually a great idea and exactly what I did. Combine both C/C++ and assembly. That way you will learn faster and combine both worlds (high level and low level) in your agenda. It’ll give you a much better visual on how the internals work and you’ll appreciate computers much much more.

Just be patient, it’s one of the most advanced areas in InfoSec but one of the most rewarding as well.

P.S Let’s not forget game hacking, right @dtm?

Just going to mention that there is malware written in C#/VB.NET, so it might be interesting to learn how to read those languages and also MSIL.

What it really comes down to is that it doesn’t really matter which one of these you learn as a beginner, but know that GAS and the AT&T syntax are rarely used on Windows. IMHO, NASM is the best way to go because it’s popular on both Windows and Linux and it uses the Intel syntax, which is what Windows tools use as well. You can tell that reverse engineering isn’t taken seriously on Linux because it defaults to the AT&T syntax lal. What you should do is pick one flavour of assembly and stick with it (I suggest NASM, if that wasn’t clear already) and learn the underlying computer architecture from books, i.e. how the CPU works, how the data buses send and receive data to and from RAM, data alignment in memory, etc.

Learning the basics of assembly on either platform is fine. There are slight differences between Linux and Windows; for example, on 32-bit Windows the fs register is reserved for pointing to the Thread Information Block (TIB).

There is a lot of information on the Windows platform and to be efficient in analysing malware, you need to be able to quickly identify and understand the API calls or whatever the piece of malware is doing to interact with the environment and what effects it has, e.g. registry writes, crypto functions, disk IO operations, process accesses, code injection, networking functions, WMI interactions, etc. In the event of more advanced malware such as packers/crypters, it’s important to know the PE file format in-depth so that you can unpack it manually or with the help of tools.
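
To give an idea of what "knowing the PE file format" means at the very first level, here is a minimal Python sketch that follows the `e_lfanew` field at offset 0x3C of the DOS header to the `PE\0\0` signature and reads the 20-byte COFF file header behind it. The function names and the fake image are made up for this example, and it stops long before the optional header, section table and imports that you actually need for unpacking.

```python
import struct

# Two common values of the COFF Machine field.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x86-64"}


def parse_pe_headers(data):
    """Return a few COFF header fields from a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF file header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    machine, sections, timestamp, _, _, opt_size, flags = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
        "characteristics": flags,
    }


def build_fake_image():
    """Build a header-only stand-in for a PE file (not a runnable executable)."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0xF0, 0x0022)
    return bytes(dos) + b"PE\x00\x00" + coff


if __name__ == "__main__":
    print(parse_pe_headers(build_fake_image()))
```

Packers often leave odd values in exactly these fields (section count, timestamps, characteristics), which is one reason tools like PEview are worth learning to read by hand.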

There are VMs built specifically for RE, such as REMnux, which ship with many tools, including brute-forcing scripts to break simple encryption such as XOR and tools that perform strings analysis while deobfuscating things like base64’d strings (FLOSS is one). There are also disassemblers and debuggers such as radare2 and Hopper, but they have an incredibly high skill floor.
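
For a feel of what such scripts do, here is a minimal Python sketch of a single-byte XOR brute force plus a base64 check. This is not how FLOSS or the REMnux scripts are implemented; the function names, the sample URL and the `http` marker are all assumptions made up for the example.

```python
import base64
import binascii


def xor_bytes(data, key):
    """XOR every byte of `data` with a single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_single_byte_xor(data, marker):
    """Try all 256 keys; keep those whose output contains the known marker."""
    hits = []
    for key in range(256):
        candidate = xor_bytes(data, key)
        if marker in candidate:
            hits.append((key, candidate))
    return hits


def try_base64(text):
    """Return the decoded bytes if `text` is valid base64, else None."""
    try:
        return base64.b64decode(text, validate=True)
    except binascii.Error:
        return None


if __name__ == "__main__":
    # Pretend this blob was pulled out of a sample; the key 0x5A is arbitrary.
    secret = xor_bytes(b"http://example.invalid/payload", 0x5A)
    for key, plain in brute_force_single_byte_xor(secret, b"http"):
        print(hex(key), plain)

    encoded = base64.b64encode(b"cmd.exe /c whoami").decode("ascii")
    print(encoded, "->", try_base64(encoded))
```

The brute force only works because we guess a marker that should appear in the plaintext; with real samples you would try several (`http`, `MZ`, `.exe`, and so on).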

IMHO, Windows is the platform which provides the most tools for the job. You’ll also need a Windows machine to do dynamic analysis anyway, so you might as well get it. Just some of the tools that I use are: Resource Hacker, Exeinfo PE (or PEiD), pestudio, Process Hacker, Process Monitor, Process Explorer, IDA Pro, OllyDbg, Regshot, ImpREC, PEview, Wireshark, Dependency Walker.

Just a note on disassemblers and debuggers: I’ve noticed that on Linux these tools are pretty restricted when it comes to displaying multiple panels of important information (correct me if I’m wrong, Linux peeps), such as the disassembly, the stack frame, a hex dump window, and a memory view. On Windows, tools like IDA Pro and OllyDbg have many windows which can show all of this information at once, so it’s incredibly convenient.

So yeah, I’ve given you my view on this, the choice is yours.

P.S.

Awwh @_py , you remembered! How sweet of you! :relieved:


IDA Pro works just fine on Linux. I use it from time to time, though Binary Ninja is my bae. Although Binary Ninja is a static analysis tool, it includes all kinds of info (hexdumps, strings, section names, etc.) except for memory (RAM) views (i.e. stack, heap), ofc, since that’s a dynamic analysis feature.

But yeah, I can agree with @dtm on the fact that Windows has a plethora of tools to get the job done.


Hi @FormosaTBM and welcome!

Right. You may also consider MIPS, as it is one of the most widespread platforms for network devices. Once you have mastered one of the platforms, moving to a new one is easy because the basics (as @_py said) are the same.

/self promo mode="on"
Take a look at the programming for wanabes series to get an idea of what x86, ARM and MIPS assembly look like, and you will see how similar they are (Books and Sources for Beginners)
/self promo mode="off" :slight_smile:

The opcodes do not change for a given processor. The mnemonics and syntax depend on the assembler you choose (NASM, GAS). The system calls change with the processor and the OS. Here I fully concur with @dtm: NASM is probably the best one to start with, as you can easily use it on both platforms. But, depending on what your goals are, you may easily end up using both.
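
A small table may help here, written as a Python sketch just so it can be printed. The first part lists the same 32-bit x86 machine code next to its Intel-syntax (NASM) and AT&T-syntax (GAS) spelling: the bytes are identical, only the text differs. The second part shows how the Linux system call numbers for `write` and `exit` differ between i386 and x86_64. The names `SAMPLES_X86_32`, `LINUX_SYSCALLS` and `render_rows` are made up for this example.

```python
# (machine code, Intel syntax as NASM writes it, AT&T syntax as GAS writes it)
SAMPLES_X86_32 = [
    (bytes.fromhex("55"), "push ebp", "pushl %ebp"),
    (bytes.fromhex("89e5"), "mov ebp, esp", "movl %esp, %ebp"),
    (bytes.fromhex("b801000000"), "mov eax, 1", "movl $1, %eax"),
    (bytes.fromhex("31c0"), "xor eax, eax", "xorl %eax, %eax"),
    (bytes.fromhex("c3"), "ret", "ret"),
]

# Linux system call numbers for the same two calls on two architectures.
LINUX_SYSCALLS = {
    "i386": {"write": 4, "exit": 1},
    "x86_64": {"write": 1, "exit": 60},
}


def render_rows(samples):
    """Format each sample as: hex bytes, Intel spelling, AT&T spelling."""
    rows = []
    for code, intel, att in samples:
        hex_bytes = " ".join(f"{byte:02x}" for byte in code)
        rows.append(f"{hex_bytes:<16}{intel:<16}{att}")
    return rows


if __name__ == "__main__":
    for row in render_rows(SAMPLES_X86_32):
        print(row)
    for arch, numbers in LINUX_SYSCALLS.items():
        print(arch, numbers)
```

Note how the operand order flips between the two syntaxes (destination first in Intel, source first in AT&T) while the encoded bytes stay the same.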

I would say no if that is the case. However, lately you can find quite a bit of Linux malware. It does not target desktop computers (as in the case of Windows) but smartphones (Android), routers, and IoT devices in general. All of those are, at the assembly level, Linux boxes.

And, to be fair, it is true that you can reverse those on Windows. Check the tools for both systems and choose the ones that suit you best. The Windows tools are probably more user-friendly, according to what others have said in the thread.

I would recommend starting with plain C, especially if you want to do it in parallel with assembly. C++ has a completely different ABI that will just make everything harder. Once you know C, move to C++ if you want, but in general it is a bad idea to do it the other way around (that’s in my experience).
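
One visible piece of that ABI difference is name mangling. As a rough illustration, here is a Python sketch that mimics how the Itanium C++ ABI (used by GCC and Clang on Linux) mangles a plain free function with builtin parameter types. In C the symbol would simply be the function name. The helper `mangle_free_function` is made up for this example and ignores namespaces, classes, templates, pointers and everything else that makes real C++ symbols hard to read; MSVC uses a different scheme altogether.

```python
# Itanium C++ ABI codes for a few builtin types.
ITANIUM_TYPE_CODES = {
    "void": "v",
    "bool": "b",
    "char": "c",
    "int": "i",
    "long": "l",
    "float": "f",
    "double": "d",
}


def mangle_free_function(name, param_types):
    """Mangle a non-member, non-template function with builtin parameters."""
    if not param_types:
        # An empty parameter list is encoded as a single 'v' (void).
        param_types = ["void"]
    encoded = "".join(ITANIUM_TYPE_CODES[t] for t in param_types)
    # _Z prefix, then the length-prefixed name, then the parameter codes.
    return "_Z" + str(len(name)) + name + encoded


if __name__ == "__main__":
    print("C symbol:  ", "add")
    print("C++ symbol:", mangle_free_function("add", ["int", "int"]))
    print("C++ symbol:", mangle_free_function("scale", ["double", "int"]))
```

So `int add(int, int)` shows up as `add` in a C binary but as `_Z3addii` in a C++ one, and that is before classes, vtables and exceptions enter the picture.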

Finally, take all those comments as indications. Overall, what works fine for some person may not work at all for others.


I’d like to reiterate that learning plain old C first is probably going to make your life a lot easier…

I use C and C++ (more the latter) and I can tell you that they are not the same. C++ adds so much more that reversing it is likely less beginner-friendly…


thanks guys,
really appreciate all the thoughtful comments.
I will take the advice and learn C first instead of C++! :slight_smile:

