Abnormal .data section size leads to AV detection

bytecode77 · September 5, 2021, 11:14am

Background: I’m currently working on a packer. I’m very familiar with programming in C and assembly, and I’ve invested a lot of time in understanding AV and detection mechanisms.

During my tests on AV detections, I’ve been hung up on detections due to a large .data section.

These are my test results with a non-malicious executable that just displays a “Hello World” dialog. I’m using AntiScan[.]me to scan:

Compiled with MSVC:

Just “Hello World” => 2/30
Adding 100KB repeating bytes into the data section => 3/30

The results tell me that a big .data section does not necessarily trigger AV detection.

Compiled with FASM:

A simple “Hello World” => 2/30
2 KB repeating bytes in .data section => 4/30
10 KB repeating bytes in .data section => 5/30
100 KB => 13/30 (mostly Gen:Trojan.Heur.FU.bu0@aSB4pHf)

My assumption is: FASM is used more often than C++ for malware. Therefore, static signature detection (AI) is trained to detect the combined traits of a FASM-compiled executable that also has a big .data section.

My attempts to figure out where static detection kicks in:

Make sure that the checksum is valid (it was)
Change a lot of the IMAGE_OPTIONAL_HEADER properties to match those of the C++ executable
Copy contents of the .text .data and .rdata section of the C++ executable over to the FASM code
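A header comparison like the one in the second step can be scripted instead of done by hand. The following is a minimal sketch, assuming two well-formed PE files and using only Python’s standard library. The names `FIELDS`, `optional_header_fields` and `diff_headers` are made up for illustration, and the field list is limited to IMAGE_OPTIONAL_HEADER fields that sit at the same offset in PE32 and PE32+ images.

```python
import struct

# (field name, offset inside the optional header, struct format).
# Only fields whose offsets are identical in PE32 and PE32+ are listed.
FIELDS = [
    ("Magic", 0, "<H"),
    ("MajorLinkerVersion", 2, "<B"),
    ("MinorLinkerVersion", 3, "<B"),
    ("SizeOfCode", 4, "<I"),
    ("SizeOfInitializedData", 8, "<I"),
    ("SizeOfUninitializedData", 12, "<I"),
    ("AddressOfEntryPoint", 16, "<I"),
    ("BaseOfCode", 20, "<I"),
    ("SectionAlignment", 32, "<I"),
    ("FileAlignment", 36, "<I"),
    ("MajorOperatingSystemVersion", 40, "<H"),
    ("MajorSubsystemVersion", 48, "<H"),
    ("SizeOfImage", 56, "<I"),
    ("SizeOfHeaders", 60, "<I"),
    ("CheckSum", 64, "<I"),
    ("Subsystem", 68, "<H"),
    ("DllCharacteristics", 70, "<H"),
]


def optional_header_fields(data: bytes) -> dict:
    """Read the selected optional header fields from a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = pe_offset + 4 + 20  # skip the signature and the 20-byte COFF header
    return {
        name: struct.unpack_from(fmt, data, opt + offset)[0]
        for name, offset, fmt in FIELDS
    }


def diff_headers(a: bytes, b: bytes) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    fields_a = optional_header_fields(a)
    fields_b = optional_header_fields(b)
    return {
        name: (fields_a[name], fields_b[name])
        for name in fields_a
        if fields_a[name] != fields_b[name]
    }
```

Calling `diff_headers` with the raw bytes of the MSVC build and the FASM build lists only the fields that still differ, which narrows down what is left to test one change at a time.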

So, from what I can observe, this is a static detection caused by a big .data section, and it is prevalent in FASM executables but not in MSVC executables.

My question would be: Is there a preferred methodology to isolate detection indicators, and what causes the large number of false positives in FASM executables?

c0z · September 6, 2021, 1:58pm

Here’s your stackoverflow post: assembly - FASM executables & AV false positives - Stack Overflow
Here’s the same question asked on FASM board without your hypothesis about the .DATA section: flat assembler - PE formatter modification that removes fake virus detection
Here’s another thread talking about the LACK of information being used by AV’s with whatever the detection matrix is: flat assembler - Hello world FASM program detected as virus. Why?

To answer your question, you should set up a couple of VMs, install the endpoint detection tool, then use Frida and API Monitor to do dynamic analysis by hooking the specific engine’s functions that deal with detection. Then you can see when it detects, and you can trace back why it triggers.

https://book.hacktricks.xyz/mobile-apps-pentesting/android-app-pentesting/frida-tutorial

bytecode77 · September 7, 2021, 11:37am

Frida is probably worth a shot. Thanks for the suggestion!

Though, until now I have managed to decrease detection to the point where only the big .data section raises a generic flag. That’s why I haven’t hooked up profilers or debuggers to AV yet, because it’s a considerably long process compared to some quick “change and scan” iterations.

And yes, I’m aware that AVs use strange assumptions about what is suspicious and what isn’t…

In the meantime, I have figured out that an entry point that’s too far from the beginning of the .text section raises this flag. I would never have thought of that. The reason why the entry point is not at the beginning of .text is that the second stage of my packer is located at the very beginning of the executable to have the same image base when it’s decrypted directly in the section itself. So, I will have to write the second stage in a way that it does not even use absolute addresses.
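How far the entry point sits from the start of its section can be read straight from the headers. The following is a minimal sketch, assuming a well-formed PE file and using only Python’s standard library. The function names `parse_sections` and `entry_point_offset` are made up for illustration, and the distance at which any particular engine starts flagging is not known from this thread.

```python
import struct


def parse_sections(data: bytes):
    """Return (entry_point_rva, [(name, virtual_address, virtual_size, raw_size), ...])."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20
    # AddressOfEntryPoint is at offset 16 in both PE32 and PE32+.
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    table = opt + opt_size  # section table follows the optional header
    sections = []
    for i in range(num_sections):
        name, vsize, vaddr, raw_size = struct.unpack_from(
            "<8sIII", data, table + 40 * i
        )
        label = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((label, vaddr, vsize, raw_size))
    return entry_rva, sections


def entry_point_offset(data: bytes):
    """Return (section_name, offset of the entry point from that section's start)."""
    entry_rva, sections = parse_sections(data)
    for name, vaddr, vsize, raw_size in sections:
        span = max(vsize, raw_size)
        if vaddr <= entry_rva < vaddr + span:
            return name, entry_rva - vaddr
    # Entry point lies outside every section; report the raw RVA instead.
    return None, entry_rva
```

For example, `entry_point_offset(open("packed.exe", "rb").read())` (file name hypothetical) returns the section that contains the entry point and the offset into it, so the value can be compared between builds that are flagged and builds that are not.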

But, yeah… I will get familiar with Frida, because I want to see directly what AVs do, rather than working everything out by trial and error.

c0z · September 9, 2021, 5:12am

If you’re going to go through with testing Frida when you have time, try integrating this into your application workflow.

“Fermion is an electron application that wraps frida-node and monaco-editor. It offers a fully integrated environment to prototype, test and refine Frida scripts through a single UI. With the integration of Monaco come all the features you would expect from Visual Studio Code: Linting, IntelliSense, keybindings, etc. In addition, Fermion has a TypeScript language definition for the Frida API so it is easy to write Frida scripts.”- Fermion

On the note of the .DATA section triggering detection: the EntryPoint in the .TEXT section being too far from the expected default probably trips the flag because the EDR is not necessarily looking for that exact heuristic, but for something close to it, such as hooked functions with immediate redirection of execution in injected processes. This is a hypothesis that I should test when I have time.

bytecode77 · September 9, 2021, 3:09pm

If anything is automatable, I’m on it instantly. Though, I’m not sure how to cover this in my workflow, because I’ll have to get familiar with Frida first.

Thank you all for the advice about how to analyze AV detections! I really appreciate it, because it’ll help in the future, not just in this particular situation.
