
capa v4: casting a wider .NET

Willi Ballenthin, Moritz Raabe, Mike Hunhoff, Anushka Virgaonkar
Aug 10, 2022
Last updated: Aug 10, 2023

We are excited to announce version 4.0 of capa with support for analyzing .NET executables. This open-source tool automatically identifies capabilities in programs using an extensible rule set. The tool supports both malware triage and deep dive reverse engineering. If you have not heard of capa before, or need a refresher, check out our first blog post. You can download capa v4.0 standalone binaries from the project’s release page and check out the source code on GitHub.

capa 4.0 adds major new features that extend its ability to analyze and reason about programs. This blog post covers the following improvements included in capa 4.0:

  • A new analysis backend that supports .NET executables, allowing you to analyze malware families such as Dark Crystal RAT and JUNKMAIL/DoubleZero. 

  • A new analysis scope that restricts rule evaluation to individual instructions, enabling rule authors to inspect the specific mnemonic and operand combinations used throughout programs. 

  • Clarification of the rule set release process, including major version tagging of rules, so that you can easily update even if you are running an outdated version of capa. 

  • A collection of breaking changes that enable capa to both run faster and represent more types of results. 

.NET Support 

capa v4.0 is the first version to support analyzing .NET executables. With this new feature, we updated 49 existing rules and added 22 new rules to capture capabilities in .NET files. Read on to understand how we extended capa with .NET support.  

Open-Source Contributions 

Adding .NET support to capa provided the FLARE team an opportunity to contribute to open-source .NET analysis projects. We merged new features and updates to dnfile, a Python library that parses metadata found in .NET executables. We also released dncil, a Python library that disassembles Common Intermediate Language (CIL) instructions. dncil parses managed method headers, CIL instructions, and exception handlers, exposing the data through an object-oriented API to help you quickly build CIL analysis tools using the library. 

.NET Feature Extraction 

.NET is a platform for building managed applications executed by the Common Language Runtime (CLR). These applications can be written in high-level languages, including C#, VB.NET, and F#, and are compiled to low-level CIL instructions. These CIL instructions are included in the .NET file alongside metadata used by the CLR to execute them. Figure 1 shows an example “Hello World” method compiled from C# to CIL.

Figure 1: Hello World in C# and CIL

capa parses features from the metadata and CIL instructions stored in .NET executables. For example, capa uses dnfile to parse the .NET MethodDef metadata table that describes all the managed methods defined in a .NET file. Each table entry includes the offset of the method’s body containing its CIL instructions. capa then uses dncil to disassemble each method body and extracts features from the CIL instructions. Figure 2 shows a breakdown of the features capa extracts from our example “Hello World” method.
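To make the disassembly step concrete, here is a minimal sketch of decoding CIL instructions from a method body. It is not dncil's API: it handles only four single-byte opcodes (nop, ldstr, call, ret), skips the method header, and the HELLO_WORLD bytes and the token values inside them are hypothetical.

```python
import struct

# Only the four single-byte opcodes needed for this sketch. Real CIL has
# many more opcodes, including two-byte ones prefixed with 0xFE.
OPCODES = {
    0x00: ("nop", 0),
    0x28: ("call", 4),   # operand: 4-byte metadata token of the callee
    0x2A: ("ret", 0),
    0x72: ("ldstr", 4),  # operand: 4-byte user-string token
}


def disassemble(code: bytes):
    """Yield (offset, mnemonic, operand) for each instruction in `code`."""
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        if opcode not in OPCODES:
            raise ValueError(f"unsupported opcode 0x{opcode:02X} at offset {offset:#x}")
        mnemonic, operand_size = OPCODES[opcode]
        operand = None
        if operand_size == 4:
            # Operands are stored little-endian right after the opcode byte.
            (operand,) = struct.unpack_from("<I", code, offset + 1)
        yield offset, mnemonic, operand
        offset += 1 + operand_size


# Hypothetical instruction bytes for a "Hello World" method (header omitted):
# nop; ldstr <string token>; call <method token>; nop; ret
HELLO_WORLD = bytes.fromhex("00 72 01000070 28 0100000A 00 2A")

for offset, mnemonic, operand in disassemble(HELLO_WORLD):
    text = "" if operand is None else f" {operand:#010x}"
    print(f"{offset:#06x}  {mnemonic}{text}")
```

Each decoded instruction carries its offset within the method body, which is the piece of information a feature extractor needs to report where a string or API reference was found.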

Figure 2: .NET features extracted from Hello World program

capa addresses .NET features using their method token and instruction offset, e.g., token(0x6000001)+0x6. This helps reverse engineers quickly navigate to and inspect interesting methods using .NET analysis tools like dnSpy.
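As a rough illustration of this address format, the sketch below splits a metadata token into its table index (high byte) and row ID (low 24 bits), following the ECMA-335 token layout, and renders an address the way the example above is written. The helper names decode_token and format_address are made up for this post and are not part of capa.

```python
from typing import NamedTuple

# A few ECMA-335 metadata table indices; the full list is much longer.
TABLE_NAMES = {
    0x01: "TypeRef",
    0x02: "TypeDef",
    0x06: "MethodDef",
    0x0A: "MemberRef",
}


class Token(NamedTuple):
    table: int  # metadata table index (high byte of the token)
    rid: int    # 1-based row ID within that table (low 24 bits)


def decode_token(value: int) -> Token:
    """Split a 32-bit metadata token into table index and row ID."""
    return Token(table=(value >> 24) & 0xFF, rid=value & 0xFFFFFF)


def format_address(token: int, offset: int) -> str:
    """Render a method token plus instruction offset as text."""
    return f"token({token:#x})+{offset:#x}"


token = decode_token(0x6000001)
print(TABLE_NAMES[token.table], token.rid)
print(format_address(0x6000001, 0x6))
```

Read this way, token(0x6000001)+0x6 points at the instruction at offset 0x6 inside the method described by the first row of the MethodDef table.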

Figure 3 shows decompiled C# code from the .NET malware Mandiant tracks as DOTWRAP. This code hides the console window and reads data from a file named config.txt.

Figure 3: Decompiled DOTWRAP .NET malware sample

Figure 4 shows the features capa extracts from the above example code (this output can be obtained via the scripts/show-features.py helper script). The avid reader may recognize the namespace, class, api, string, and number features.

Figure 4: Extracted .NET features for the DOTWRAP malware sample

With a simple addition of the System.IO.File::ReadAllLines API call to an existing rule, capa now detects the read file on Windows capability in the DOTWRAP malware sample (see Figure 5).

Figure 5: read file on Windows rule match in the DOTWRAP malware sample

As shown in Figure 5, capa also extracts the two API calls kernel32.GetConsoleWindow and user32.ShowWindow. These are native Windows API functions called from the backdoor’s managed code using a technology called Platform Invoke (P/Invoke). The .NET ImplMap metadata table describes the native functions that can be called from managed code using P/Invoke. Each table entry maps a managed method (MemberForwarded) to a native function. Managed code executes the native function by calling its MemberForwarded method, and P/Invoke handles the details.

capa reads the ImplMap table to chain MemberForwarded methods to their native functions. This enables detecting native capabilities implemented in managed code. So, here we can rely on an existing rule to detect window hiding via native Windows functions. Figure 6 shows the capa match for our example code.
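The chaining idea can be sketched with a simplified, in-memory stand-in for the ImplMap table. This is not capa's or dnfile's code: the ImplMapRow type, the token values, and the helper names are hypothetical, and in a real file the import scope is an index into the ModuleRef table rather than a string.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class ImplMapRow(NamedTuple):
    member_forwarded: int  # token of the managed method that forwards the call
    import_name: str       # name of the native function
    import_scope: str      # name of the native module (simplified to a string)


# Hypothetical rows resembling what the DOTWRAP example would need.
IMPL_MAP = [
    ImplMapRow(0x06000002, "GetConsoleWindow", "kernel32.dll"),
    ImplMapRow(0x06000003, "ShowWindow", "user32.dll"),
]


def build_native_lookup(rows: Iterable[ImplMapRow]) -> Dict[int, str]:
    """Map each MemberForwarded token to a 'module.function' API name."""
    lookup = {}
    for row in rows:
        module = row.import_scope.lower()
        if module.endswith(".dll"):
            module = module[: -len(".dll")]
        lookup[row.member_forwarded] = f"{module}.{row.import_name}"
    return lookup


def resolve_call(target_token: int, lookup: Dict[int, str]) -> Optional[str]:
    """Return the native API behind a managed call target, if there is one."""
    return lookup.get(target_token)


lookup = build_native_lookup(IMPL_MAP)
print(resolve_call(0x06000003, lookup))
```

With a lookup like this, a call instruction whose target is a forwarding method can be reported under the native API name, which lets existing rules written for native code match.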

Figure 6: hide graphical window rule match in the DOTWRAP sample

In version 4, capa extracts and analyzes the following types of features from .NET files:

  • namespace e.g., System.IO
  • class e.g., System.IO.File
  • api and import e.g., System.IO.File::Delete
  • function-name e.g., HelloWorld::Main
  • number
  • string
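The namespace, class, and api features in the list above are related by name. The following sketch shows one way to derive all three from a fully qualified method name; it is a simplification (it ignores nested and generic types), and features_for_method is a hypothetical helper rather than capa's extractor.

```python
from typing import List, Tuple


def features_for_method(name: str) -> List[Tuple[str, str]]:
    """Split 'Namespace.Class::Method' into namespace, class, and api features."""
    class_name, separator, method = name.partition("::")
    if not separator or not class_name or not method:
        raise ValueError(f"expected 'Namespace.Class::Method', got {name!r}")

    features = []
    # Everything before the last dot of the class name is the namespace.
    namespace, _, _ = class_name.rpartition(".")
    if namespace:
        features.append(("namespace", namespace))
    features.append(("class", class_name))
    features.append(("api", name))
    return features


for kind, value in features_for_method("System.IO.File::ReadAllLines"):
    print(f"{kind}: {value}")
```

Emitting all three levels lets a rule match broadly (any use of System.IO) or narrowly (one specific method).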

We have also added two new characteristics to detect .NET executables containing both managed and unmanaged (native) code and calls from managed code to native code, respectively:

  • mixed mode
  • unmanaged call

Future .NET Work

As we write more .NET-specific rules and perform more research, we expect to expand and enhance the .NET feature support in future versions. If you encounter missing features or have ideas for good additions, please open a discussion in our GitHub repository.

Instruction Scope

We have added a new scope that restricts rule evaluation to individual instructions. With this scope, rule authors can match specific combinations of instruction mnemonics and operands. capa’s updated rule syntax also includes a new operand feature that specifies number and offset values for operands. This enables rule authors to specify the flow of data from a source or to a destination, such as moving data from a structure or comparing against a constant.

Figure 7 shows an example of using the new instruction scope and operand feature to more reliably detect Adler32 checksum calculation. Note how it’s much clearer that the shr instruction must use the number 0xF as its source operand. This enables capa to match capabilities more accurately and makes rules easier to understand for human readers. With these changes, we’ve removed the previously supported /x32 and /x64 flavors of number and operand features.
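To illustrate why a narrower scope helps, here is a toy model (not capa's matching engine) that treats each instruction as a set of features. The sample instructions are invented, and the feature names only loosely mirror capa's rule syntax. When features are pooled across a whole basic block, a shr and an unrelated 0xF can satisfy the rule together; at instruction scope both must appear on the same instruction.

```python
from typing import List, Set, Tuple

Feature = Tuple[str, object]

# Hypothetical basic block: the shr uses 0x10, and the 0xF belongs to an `and`.
BLOCK: List[Set[Feature]] = [
    {("mnemonic", "shr"), ("operand[1].number", 0x10)},
    {("mnemonic", "and"), ("operand[1].number", 0xF)},
]

# What the rule wants: a shr whose source operand is the number 0xF.
WANTED: Set[Feature] = {("mnemonic", "shr"), ("operand[1].number", 0xF)}


def match_basic_block(block: List[Set[Feature]], wanted: Set[Feature]) -> bool:
    """Pool the features of every instruction, then test the rule once."""
    merged = set().union(*block)
    return wanted <= merged


def match_instruction(block: List[Set[Feature]], wanted: Set[Feature]) -> List[int]:
    """Test the rule against each instruction and return matching indices."""
    return [index for index, features in enumerate(block) if wanted <= features]


print(match_basic_block(BLOCK, WANTED))   # pooled features match by coincidence
print(match_instruction(BLOCK, WANTED))   # no single instruction satisfies the rule
```

In this model the block-level check reports a match that no single instruction supports, while the instruction-level check correctly finds none.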

Figure 7: Old rule features (left) vs. new instruction scope and operand feature (right)

Rule Release Process and Other Changes

When we introduce new functionality and breaking changes to capa, rules may become incompatible with a particular release or with our current development branch. To address this, we’ve added documentation that helps users identify the correct rules branch for their capa version. In short, use the rules branch that matches the major version of capa you are running: v3 rules for capa v3 and v4 rules for capa v4.
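The version-matching convention above can be expressed as a small helper. This is only a sketch: rules_branch is a hypothetical function, and the "vN" branch naming is an assumption taken from the "v3 rules" and "v4 rules" wording in this post, so check the capa-rules repository for the actual branch or tag names.

```python
def rules_branch(capa_version: str) -> str:
    """Return the rules branch name assumed to match a capa version string."""
    # Accept both "4.0.1" and "v4.0.1" and keep only the major component.
    major = capa_version.strip().lstrip("v").split(".", 1)[0]
    if not major.isdigit():
        raise ValueError(f"unrecognized capa version: {capa_version!r}")
    return f"v{major}"


print(rules_branch("4.0.1"))
print(rules_branch("v3.2.0"))
```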

capa now requires Python 3.7 or newer. If you build on top of capa, also be aware that we’ve updated both the freeze format used to store extracted features and the JSON results document used to store and exchange capa results. Moreover, the internal representation of addresses has changed so that the tool can now express additional context, e.g., .NET tokens and offsets.

Contributing

We look forward to seeing how the new capa functionality further supports the community, and we encourage you to contribute. Feedback, ideas, and pull requests are all welcome. Just open an issue or check out the contributing document to get started.

Rules are the foundation of capa’s identification algorithm. If you have any rule ideas, please open an issue or, even better, submit a pull request to capa-rules. This way, everyone can benefit from the collective knowledge of our malware analysis community.

Conclusion

The newest improvements add .NET executable analysis support to capa and make its rules even more expressive. The 4.0 capa release also includes bug fixes, new features, improvements to the freeze and JSON results serialization formats, and more than 60 new and updated rules. See the capa changelog for all update details.