capa v4: casting a wider .NET
We are excited to announce version 4.0 of capa with support for analyzing .NET executables. This open-source tool automatically identifies capabilities in programs using an extensible rule set. The tool supports both malware triage and deep dive reverse engineering. If you have not heard of capa before, or need a refresher, check out our first blog post. You can download capa v4.0 standalone binaries from the project’s release page and check out the source code on GitHub.
capa 4.0 adds major new features that extend its ability to analyze and reason about programs. This blog post covers the following improvements included in capa 4.0:
A new analysis backend that supports .NET executables, allowing you to analyze malware families such as Dark Crystal RAT and JUNKMAIL/DoubleZero.
A new analysis scope that restricts rule evaluation to individual instructions, enabling rule authors to inspect the specific mnemonic and operand combinations used throughout programs.
Clarification of the rule set release process, including major version tagging of rules, so that you can easily update even if you are running an outdated version of capa.
A collection of breaking changes that enable capa to both run faster and represent more types of results.
capa v4.0 is the first version to support analyzing .NET executables. With this new feature, we updated 49 existing rules and added 22 new rules to capture capabilities in .NET files. Read on to understand how we extended capa with .NET support.
Adding .NET support to capa provided the FLARE team an opportunity to contribute to open-source .NET analysis projects. We merged new features and updates to dnfile, a Python library that parses metadata found in .NET executables. We also released dncil, a Python library that disassembles Common Intermediate Language (CIL) instructions. dncil parses managed method headers, CIL instructions, and exception handlers, exposing the data through an object-oriented API to help you quickly build CIL analysis tools using the library.
.NET Feature Extraction
.NET is a platform for building managed applications executed by the Common Language Runtime (CLR). These applications can be written in high-level languages including C#, VB.NET, and F# and are compiled to low-level CIL instructions. These CIL instructions are included in the .NET file alongside metadata used by the CLR to execute them. Figure 1 shows an example “Hello World” method compiled from C# to CIL.
capa parses features from the metadata and CIL instructions stored in .NET executables. For example, capa uses dnfile to parse the .NET MethodDef metadata table that describes all the managed methods defined in a .NET file. Each table entry includes the offset of the method’s body containing its CIL instructions. capa then uses dncil to disassemble each method body and extracts features from the CIL instructions. Figure 2 shows a breakdown of the features capa extracts from our example “Hello World” method.
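To make the idea of a method body more concrete, here is a minimal sketch of reading the header that precedes a method's CIL instructions. It uses only the Python standard library and follows the tiny and fat header layouts defined in ECMA-335; it is not how dnfile or dncil implement parsing, and the names `MethodHeader` and `parse_method_header` as well as the sample bytes are assumptions made for illustration.

```python
import struct
from typing import NamedTuple


class MethodHeader(NamedTuple):
    header_size: int  # bytes occupied by the header itself
    code_size: int    # bytes of CIL instructions following the header
    max_stack: int    # maximum evaluation stack depth


def parse_method_header(buf: bytes) -> MethodHeader:
    """Parse a tiny or fat managed method header (layout per ECMA-335)."""
    fmt = buf[0] & 0x3
    if fmt == 0x2:
        # Tiny format: a single byte whose upper six bits hold the code size.
        # Tiny methods always have an implied max stack of 8.
        return MethodHeader(header_size=1, code_size=buf[0] >> 2, max_stack=8)
    if fmt == 0x3:
        # Fat format: flags/size (u16), max stack (u16), code size (u32),
        # local variable signature token (u32), all little-endian.
        flags_and_size, max_stack, code_size, _locals = struct.unpack_from("<HHII", buf, 0)
        header_size = (flags_and_size >> 12) * 4  # size is stored in 4-byte units
        return MethodHeader(header_size, code_size, max_stack)
    raise ValueError("unrecognized method header format")


# Illustrative bytes only: a tiny header (0x36) announcing 13 bytes of CIL,
# followed by nop, ldstr <token>, call <token>, nop, ret.
sample_body = bytes([0x36, 0x00, 0x72, 0x01, 0x00, 0x00, 0x70,
                     0x28, 0x01, 0x00, 0x00, 0x0A, 0x00, 0x2A])
header = parse_method_header(sample_body)
cil = sample_body[header.header_size:header.header_size + header.code_size]
print(header, cil.hex())
```

A disassembler would then walk the `cil` bytes opcode by opcode; that step is what a library such as dncil provides and is intentionally left out here.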
capa addresses .NET features using their method token and instruction offset, e.g., token(0x6000001)+0x6. This helps reverse engineers to quickly navigate and inspect interesting methods using .NET analysis tools like dnSpy.
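The address format above can be unpacked with a few lines of Python. A .NET metadata token stores the metadata table number in its top byte and the row index in the lower three bytes, so 0x6000001 refers to row 1 of the MethodDef table (table 0x06). The helper names below are hypothetical and are not part of capa's API; the sketch only illustrates the encoding.

```python
# A handful of metadata table numbers from ECMA-335 (not exhaustive).
METADATA_TABLES = {
    0x01: "TypeRef",
    0x02: "TypeDef",
    0x04: "Field",
    0x06: "MethodDef",
    0x0A: "MemberRef",
}


def split_token(token: int) -> tuple:
    """Split a metadata token into (table number, row index)."""
    return token >> 24, token & 0x00FFFFFF


def format_address(token: int, offset: int) -> str:
    """Render a token and instruction offset in the style shown in the text."""
    return f"token({token:#x})+{offset:#x}"


table, row = split_token(0x6000001)
print(METADATA_TABLES[table], row)       # table name and row index
print(format_address(0x6000001, 0x6))    # token(0x6000001)+0x6
```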
Figure 3 shows decompiled C# code from the .NET malware Mandiant tracks as DOTWRAP. This code hides the console window and reads data from a file named
Figure 4 shows the features capa extracts from the above example code (this output can be obtained via the scripts/show-features.py helper script). The avid reader may recognize the
As shown in Figure 5, capa also extracts the two API calls
user32.ShowWindow. These are native Windows API functions called from the backdoor’s managed code using a technology called Platform Invoke (P/Invoke). The .NET ImplMap metadata table describes the native functions that can be called from managed code using P/Invoke. Each table entry maps a managed method (MemberForwarded) to a native function. The native function can be executed by calling its MemberForwarded method and P/Invoke handles the details.
capa reads the ImplMap table to chain MemberForwarded methods to their native functions. This enables detecting native capabilities implemented in managed code. So, here we can rely on an existing rule to detect window hiding via native Windows functions. Figure 6 shows the capa match for our example code.
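The chaining step can be pictured as a simple lookup table. The sketch below models ImplMap rows as plain Python objects rather than parsing a real file; the `ImplMapRow` class, the helper names, and the token values are all hypothetical, and the way the module name is normalized is an assumption rather than capa's actual behavior.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ImplMapRow:
    member_forwarded: int  # token of the managed method that forwards the call
    import_name: str       # name of the native function
    import_scope: str      # name of the native module


def build_native_imports(rows: Iterable[ImplMapRow]) -> Dict[int, str]:
    """Map each MemberForwarded token to a 'module.function' API name."""
    table = {}
    for row in rows:
        module = row.import_scope.lower()
        if module.endswith(".dll"):
            module = module[: -len(".dll")]
        table[row.member_forwarded] = f"{module}.{row.import_name}"
    return table


def resolve_call(target_token: int, native_imports: Dict[int, str]) -> Optional[str]:
    """Return the native API behind a managed call target, if there is one."""
    return native_imports.get(target_token)


# Hypothetical row: a managed method that forwards to ShowWindow in user32.dll.
imports = build_native_imports([ImplMapRow(0x06000005, "ShowWindow", "user32.dll")])
print(resolve_call(0x06000005, imports))  # user32.ShowWindow
print(resolve_call(0x06000001, imports))  # None: an ordinary managed method
```

With a table like this, a call instruction whose operand is the forwarding method's token can be reported as a call to the native API, which is what lets existing native rules match on managed code.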
In version 4 capa extracts and analyzes the following types of features from .NET files:
We have also added two new characteristics to detect .NET executables containing both managed and unmanaged (native) code and calls from managed code to native code, respectively:
- mixed mode
Future .NET Work
As we write more .NET-specific rules and perform more research, we expect to expand and enhance .NET feature support in future versions. If you encounter missing features or have ideas for good additions, please open a discussion in our GitHub repository.
We have added a new scope that restricts rule evaluation to individual instructions. With this scope, rule authors can match specific combinations of instruction mnemonics and operands. capa’s updated rule syntax also includes a new operand feature that specifies offset values for operands. This enables rule authors to specify the flow of data from a source or to a destination, such as moving data from a structure or comparing against a constant.
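The benefit of a narrower scope can be shown with a toy model. The sketch below represents each instruction as a set of features and compares matching against a whole basic block with matching against single instructions. It is a simplified illustration, not capa's rule engine; the feature strings (such as "operand[1].number") and the sample instructions are assumptions made for this example.

```python
from typing import FrozenSet, List, Tuple

Feature = Tuple[str, object]

# A hypothetical basic block: one feature set per instruction.
basic_block: List[FrozenSet[Feature]] = [
    frozenset({("mnemonic", "shr"), ("operand[1].number", 0x2)}),
    frozenset({("mnemonic", "and"), ("operand[1].number", 0xF)}),
    frozenset({("mnemonic", "shr"), ("operand[1].number", 0xF)}),
]

# What the rule author wants: a shr whose source operand is 0xF.
required: FrozenSet[Feature] = frozenset(
    {("mnemonic", "shr"), ("operand[1].number", 0xF)}
)


def match_basic_block(required, instructions) -> bool:
    """Block scope: features from all instructions are pooled together."""
    pooled = frozenset().union(*instructions)
    return required <= pooled


def match_instructions(required, instructions) -> List[int]:
    """Instruction scope: all features must occur on the same instruction."""
    return [i for i, feats in enumerate(instructions) if required <= feats]


# The first two instructions alone satisfy the pooled check even though
# neither one is "shr ..., 0xF"; only the third is a true match.
print(match_basic_block(required, basic_block[:2]))   # True (imprecise)
print(match_instructions(required, basic_block[:2]))  # []
print(match_instructions(required, basic_block))      # [2]
```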
Figure 7 shows an example of using the new instruction scope and operand feature to more reliably detect Adler32 checksum calculation. Note how it’s much clearer that the shr instruction must use the number 0xF as its source operand. This enables capa to match capabilities more accurately and makes rules easier to understand for human readers. With these changes we’ve removed the previously supported /x64 flavors of
Rule Release Process and Other Changes
When we introduce new functionality and breaking changes to capa, rules may become incompatible with a certain release or our current development branch. To address this, we’ve added documentation that helps users identify the correct rules branch for their capa version. In short, users must use the rules branch that matches the major version of capa they are running. That is, use v3 rules for the v3 release of capa and v4 rules for the v4 release of capa.
capa now requires Python 3.7 or newer. If you build on top of capa, also be aware that we’ve updated the freeze format used to store extracted features and the JSON results document used to store and exchange capa results. Moreover, the internal representation of addresses changed so that the tool can now express additional context, e.g., .NET tokens and offsets.
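For scripts that fetch rules automatically, the version-matching guidance above could be encoded as a small helper. This is a sketch under the assumption that rules branches are named after the capa major version (v3, v4, and so on), as the paragraph above suggests; the function name is hypothetical and is not part of capa.

```python
def rules_branch_for(capa_version: str) -> str:
    """Return the rules branch name matching a capa version string."""
    major = capa_version.strip().lstrip("v").split(".")[0]
    if not major.isdigit():
        raise ValueError(f"cannot determine major version from {capa_version!r}")
    return f"v{major}"


print(rules_branch_for("4.0.1"))   # v4
print(rules_branch_for("v3.2.0"))  # v3
```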
We look forward to seeing how the new capa functionality further supports the community, and we encourage you to contribute. Feedback, ideas, and pull requests are all welcome. Just open an issue or check out the contributing document to get started.
Rules are the foundation of capa’s identification algorithm. If you have any rule ideas, please open an issue or, even better, submit a pull request to capa-rules. This way, everyone can benefit from the collective knowledge of our malware analysis community.
The newest improvements add .NET executable analysis support to capa and make its rules even more expressive. The 4.0 capa release also includes bug fixes, new features, improvements to the freeze and JSON results serialization formats, and more than 60 new and updated rules. See the capa changelog for all update details.