Azure Run Command for Dummies
In Mandiant’s recent blog post, we detailed suspected Russian intrusion activity that targeted managed services providers (MSP) to gain access to their customers’ cloud environments. Other companies, such as Microsoft, have observed similarly targeted activity against customers of several cloud and managed service providers. One notable technique from these intrusions is the use of Azure Run Commands to move laterally from managed hypervisors to the MSP customers’ underlying virtual machines.
This latest blog post comes as a supplementary annex to highlight Azure Run Commands and provide guidance for mitigations, hunting, and detection mainly from the perspective of the virtual machines at risk from this type of activity. As other blog posts have focused on Azure logs and permissions and what can be learned from these sources, here we will focus more on what can be learned from the evidence sources inside the virtual machines themselves.
This method of pivoting from Azure to the underlying virtual machines (VMs) is significant for a couple of reasons:
- An Azure account compromise could result in granting full access to underlying Virtual Machines, even when those systems are segmented on the virtual layer.
- Running commands through Azure means there is an easy route for elevated privilege execution, without worrying about connectivity and authorization at the virtual layer.
In most cases, the final goal of the attacker is related to the virtual level and not the hypervisor. For example, an attacker is likely interested in stealing information residing inside the virtual systems or encrypting the data and individual systems themselves. While a hypervisor level compromise may allow attackers to circumvent many host-level detection and response mechanisms, there are still opportunities to identify attackers as they pursue their mission objectives.
Azure Run Commands
The Azure Run Command feature enables administrators to run commands on Azure Windows or Linux virtual machines by leveraging the virtual machine agent. The agent is installed by default on Windows and Linux virtual machines deployed from an Azure Marketplace image but can be disabled.
Azure Run Commands are well documented on the Microsoft website at the following locations:
- Run scripts in your Windows VM by using action Run Commands
- Run scripts in your Linux VM by using action Run Commands
Azure administrators can run them using the Azure Portal user interface, the API, PowerShell, or the Azure command line interface; each of which will be demonstrated as follows. Each operating system has distinct command types that support arbitrary script execution on virtual machines.
- On Windows, the RunPowerShellScript command executes PowerShell on the virtual machine as the SYSTEM user.
- On Linux, the RunShellScript command executes a shell script on the virtual machine as the root user.
These command types must be supplied via the “command-id” parameter when using the Run Command feature.
Via Azure Portal
Figure 1 shows two commands ‘w’ and ‘whoami’ sent to our Linux virtual machine using the Azure Portal web console.
Via Azure CLI
Administrators can execute a PowerShell script by name using the following Azure CLI command.
Azure CLI also accepts individual commands, such as in the following example using the Linux VM “RunShellScript” module.
Via the REST API
The Azure REST API’s runCommand endpoint accepts HTTP POST requests such as the following, with required parameters, defined in the POST request body.
Via PowerShell Cmdlet
The Azure PowerShell cmdlet also includes a command for Azure Run Commands. Note that the proper Azure Context needs to be provided either by running the Connect-AzAccount command or using the -Context parameter.
Invoke-AzVMRunCommand returns a PSRunCommandResult object with the script results in the “Message” field. In Figure 2, the results contain “nt authority\system”, because the script executed a single whoami command.
Limiting Run Command Access
As documented by Microsoft, executing Run Commands requires the Microsoft.Compute/virtualMachines/runCommand/ permission. The Virtual Machine Contributor role and higher levels have this permission. The Virtual Machine Contributor role has the ability to manage the virtual machines but are not allowed to access the virtual machine, storage account, and virtual network it resides in. Performing regular Azure permissions audits or limiting the number of users with this permission would reduce the attack surface of accounts that can execute Run Commands.
An organization can perform an audit of the Virtual Machine Contributor role or other high-level roles with the following PowerShell command.
Since we are mainly concerned with the permission that allow for Azure RunCommand usage, the following script queries all built-in and custom roles created within Azure that contain the specific RunCommand permission.
After an organization has identified all roles that have the ability to execute Run Commands on Virtual Machines, then it can use the previous PowerShell script to obtain the individual users or groups that have said permission. Mandiant recommends having regular entitlement reviews of this and other high-risk permissions with Azure.
Create a Custom Just-in-Time (JIT) Administrative Role for Run Command Permissions
Another method to mitigate the risk and reducing the attack surface of this potential attack is by creating a Just-in-Time (JIT) administrative role that contains the Run Command permissions needed to perform legitimate run command-like actions on a virtual machine. By creating this custom role, we remove the need for a user to have persistent Run Command capabilities on Virtual Machines.
The following sample PowerShell script creates a custom JIT Run Command role for a prescribed subscription and resource group (Note: Replace the subscriptionid and resourcegroupname with the desired organization’s subscription and resource group).
After the custom JIT role has been made, Mandiant recommends assigning the JIT Role to a newly created Azure AD Group, and then managing the access to the newly created group using Azure Privileged Identity Management (PIM).
Removing the VM agent
The most basic task that can be performed to mitigate this functionality is to remove the agent from the underlying VM, however, this may have serious consequences to how the virtual machine is administrated.
Also, it is important to note that by doing so, you are not removing the attacker’s access to the hypervisor level, and they may have other means to steal your data or compromise hosts.
By removing the agent, you are also removing the main means that you can leverage for detecting this feature being used on your virtual machines. For example, if an attacker runs a command from the Azure CLI, you can detect this with proper detections in place. But if the attacker creates a snapshot of your system, this will be undetected. Therefore, it is critical to perform risk analysis to determine how to best proceed. Removing the agent entirely would eliminate a potential vector for the adversary to abuse but would also push the attacker away from your line of sight.
The references at the following locations provide commands to remove the Windows and Linux agents:
Malicious Run Command Usage in the Wild
Mandiant has directly observed commands such as the following while conducting incident response investigations. These commands were executed in several different scripts and showcase an almost full chain of activity from reconnaissance to malicious code execution and credential harvesting.
As we can see from the aforementioned examples, while Run Command usage may be relatively novel, the commands being executed represent known attacker techniques and methodologies. This means traditional detection logic for suspicious commands will still be valid, but these mechanisms can be augmented with the context that they were executed using the Run Command feature.
Detection and Hunting
Forensic Artifacts and File Paths
Detection and Hunting on the affected virtual machines should use a combination of forensic artifacts and logs and differ between Windows and Linux virtual machines.
Windows Virtual Machines
On Windows, the use of the RunPowerShellScript functionality will create related PowerShell logs depending on your logging policy. As such, any typical rules written for malicious activity leveraging these logs or process will provide alerts as normal. For additional resources on generic PowerShell logging, see the Mandiant post, “Greater Visibility Through PowerShell Logging”.
However, in our testing of a default Azure setup, the Azure Activity Log showed that Run Command functionality was used but did not log the contents of the actual script being pushed to the virtual machine for execution. This means that by default logs will show that a Run Command action was performed but not the commands executed within the script.
On Windows virtual machines, the PowerShell scripts are downloaded to the following directory, where the <version number> varies and the <job number> is incremented starting at 0:
Results for Run Command jobs are stored in a status file, with the following path/naming convention:
Figure 5 shows the contents of a status file for a successful script execution with the various script outputs logged in JSON format. These have valuable additional forensic information such as the execution timestamp.
Linux Virtual Machines
On Linux, Azure Run Command creates a directory per job in: /var/lib/waagent/run-command/download/<job number>. Each job directory contains three files: script.sh, stderr, and stdout.
The file script.sh contains the command executed on the host; stderr contains the standard error; and stdout contains the result.
These output files have less metadata than their Windows counterpart, but additional details are recorded in the following location: /var/log/azure/run-command/handler.log. This log includes basic metadata such as the date/time of run command executions, but it does not include the script contents or results.
Commands that failed will also be logged in this file, such as the excerpt shown in Figure 9.
Identifying Process Anomalies
Windows Virtual Machines
RunPowerShellScript activity creates a very distinct process tree under the WindowsAzureGuestAgent.exe process. Figure 10 shows the processes created after execution of whoami using RunPowerShellScript.
With the parent/child relationship detection, engineers can create rules to detect commands, which may not be suspicious on their own, but would rarely be run through the aforementioned process tree. For example, whoami.exe may be common and benign in most cases, but suspicious if launched as a child of powershell -ExecutionPolicy Unrestricted -File script.ps1, which would indicate potential reconnaissance. Commands with commonly malicious usage, such as ntdsutil.exe, should be of high concern if seen as a child process of WindowsAzureGuestAgent.exe.
If any usage of RunPowerShellScript is suspicious in your organization, you can monitor any sub-processes of WindowsAzureGuestAgent and whitelist commands as needed for legitimate management activity.
In addition to process anomalies, you can monitor file creation/modification events in the directories used by Azure Run Commands. If your monitoring solution logs the full file content for these events, it will show the actual commands being executed which can also be reviewed for malicious commands or anomalous usage.
Linux Virtual Machines
On Linux, Azure run commands can be detected by looking for command lines such as the following:
Furthermore, parent-child relationships can be used to hunt for suspicious processes that have a parent process command line following this syntax:
Similar to Windows, file monitoring solutions can be used to trigger alerts on file creations in the /var/lib/waagent/run-command/download directory, and match suspicious content if this is feasible.
Azure-Level Detection and Hunting
In the Azure Portal, the Activity Logs for each VM will capture Azure Run Command executions. Azure Activity Logs can be found in numerous locations, depending on the environment, including but not limited to an Azure Sentinel instance, Microsoft Defender Advanced Hunting, a Log Analytics Workspace, or on the VM Azure Portal page itself.
The following is an example alert for a Run Command action. This log shows that the “runCommand” for the virtual machine named testvm (1) was actioned and successfully executed via the user@contoso[.]com (2) account from IP address 127.0.0[.]1 (3).
Unfortunately, Activity Logs do not include the command that was executed which would be required for further investigation. Regardless, these logs can be helpful to create baselines of expected Run Command usage in an environment to find anomalous and suspicious activity.
The goal of this blog post was to focus especially on the virtual machine side of this activity, and Microsoft has already provided excellent guidance on hunting for suspicious Azure Run Commands in your tenant.
As shown in our recent post, hypervisor level compromises afford threat actors with many opportunities for rapidly and stealthily exploiting hosted systems and applications. In this blog post, we discussed how one such method, Azure Run Command, is being leveraged by attackers and how defenders can hunt for, detect, and mitigate malicious usage of it.
Hypervisor compromises yield a high level of access to hosted systems and data, which are commonly an attacker’s primary objective in an intrusion. This evolution of intrusion methodology requires a thorough understanding of built-in cloud asset management functionality that can be abused by attackers. More importantly, this activity reinforces the need for comprehensive visibility and a vigilance for anomalous activity within environments.
Abusing the hypervisor to virtual machine interaction via Azure Run Commands can lead to full lifecycle adversary interaction including all stages of ATT&CK. However, the ATT&CK Tactics and Techniques observed include but are not limited to:
Initial Access External Remote Services (T1133)
Initial Access Valid Accounts (T1078)
Execution Command and Scripting Interpreter (T1059)
Persistence External Remote Services (T1133)
Persistence Valid Accounts (T1078)
Privilege Escalation Valid Accounts (T1078)
Defense Evasion (TA0005)
The authors would like to thank Alyssa Rahman and Matthew Dunwoody for their valuable feedback and technical review.