To complicate reverse engineering and detection, malware will sometimes implement a custom virtual machine. This differs from the kind of virtual machine you might use to analyze malware, which creates an entire virtual computer and operating system. Instead, the malware creates a virtual CPU within its own process, which enables it to execute a custom machine code or scripting language.
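To make the idea of a virtual CPU concrete, here is a minimal sketch of a stack-based interpreter. The opcodes, their byte values, and the names `run_vm` and `PROGRAM` are invented for illustration and are not taken from any real malware family or protector. The point is only the overall shape: a loop that fetches a byte, decodes it, and performs the matching operation.

```python
# Hypothetical opcodes for an invented instruction set (illustration only).
OP_PUSH = 0x01  # push the next byte onto the stack
OP_ADD = 0x02   # pop two values, push their sum (wrapped to 8 bits)
OP_XOR = 0x03   # pop two values, push their XOR
OP_HALT = 0xFF  # stop and return the top of the stack


def run_vm(bytecode: bytes) -> int:
    """Interpret the bytecode with a fetch-decode-execute loop."""
    stack = []
    pc = 0  # virtual program counter: an index into the bytecode
    while pc < len(bytecode):
        opcode = bytecode[pc]  # fetch
        pc += 1
        if opcode == OP_PUSH:  # decode and execute
            stack.append(bytecode[pc])
            pc += 1
        elif opcode == OP_ADD:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & 0xFF)
        elif opcode == OP_XOR:
            b, a = stack.pop(), stack.pop()
            stack.append(a ^ b)
        elif opcode == OP_HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc - 1}")
    raise ValueError("program ended without a HALT instruction")


# Computes (0x10 + 0x20) ^ 0x0F in the invented instruction set.
PROGRAM = bytes([OP_PUSH, 0x10, OP_PUSH, 0x20, OP_ADD, OP_PUSH, 0x0F, OP_XOR, OP_HALT])
result = run_vm(PROGRAM)
```

To a native disassembler, `PROGRAM` is just a blob of data. The logic it encodes only becomes visible once you understand what the interpreter loop does with each byte.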

CPUs only understand a single language: machine code, which is often referred to as native code. An x86 CPU only understands x86 machine code, and an x86_64 CPU only understands x86_64 machine code. Every other language must be either compiled or translated to native code. For example, C++ compiles to assembly language, which is then assembled to native code. Assembly language is a 1:1 human-readable representation of machine code. This means every assembly instruction maps to a single machine instruction.

Programming languages like C# and Java are not compiled to native code. Instead, they’re compiled to custom machine code (commonly called bytecode) created by the language developers. These languages require a virtual machine to be installed (.NET Framework for C# and JVM for Java), which translates this custom machine code to native code.
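Python's reference implementation, CPython, follows the same model, which makes it a convenient way to look at this kind of custom machine code without installing anything extra. The sketch below uses the standard library's `dis` module to show the bytecode behind a small function. The function `add` is just an example, and the exact instruction names vary between Python versions.

```python
import dis


def add(a, b):
    return a + b


# The raw bytes the Python virtual machine executes for this function.
raw = add.__code__.co_code

# The same bytes decoded into instruction names by the dis module.
names = [ins.opname for ins in dis.get_instructions(add)]

print(raw.hex())
print(names)
```

The raw bytes mean nothing to an x86 or x86_64 CPU. Only the Python virtual machine knows how to interpret them.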

You may be wondering what the purpose of this is. The answer is code portability. Native code varies from CPU to CPU. ARM, MIPS, x86, and PowerPC are all different architectures, which run different machine code (instruction sets). Furthermore, native executable file formats differ from operating system to operating system. You can’t run ELF files on Windows, or EXEs on a Mac.

If a developer wanted to support multiple operating systems and CPU architectures, they’d have to compile their code for every distinct platform and architecture. With virtualized languages, the language developer can instead write a virtual machine for every platform and architecture. Since the virtual machine uses the same custom language regardless of platform, the same application will work on many different platforms. This enables developers to focus on maintaining a single application, and just let the virtual machine handle the rest.

Since virtual machines allow developers to create and run code written in custom programming languages, malware developers sometimes use this to their advantage. If malware implements its own custom code, security analysts need to figure out how it works before they can even start reverse engineering the software. While there are plenty of tools for working with assembly languages, C#, .NET, Python, and so on, a brand-new language may require analysts to create brand-new tools.
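As a rough illustration of what such a brand-new tool might look like, here is a sketch of a disassembler for an invented bytecode format. The instruction table is hypothetical. In a real analysis, the opcodes and operand sizes would have to be recovered by reverse engineering the malware's interpreter first.

```python
# Hypothetical instruction table: opcode -> (mnemonic, number of operand bytes).
INSTRUCTIONS = {
    0x01: ("PUSH", 1),
    0x02: ("ADD", 0),
    0x03: ("XOR", 0),
    0xFF: ("HALT", 0),
}


def disassemble(bytecode: bytes) -> list:
    """Turn raw custom bytecode into human-readable lines."""
    lines = []
    offset = 0
    while offset < len(bytecode):
        opcode = bytecode[offset]
        # Unknown bytes are emitted as raw data so the listing stays complete.
        mnemonic, operand_len = INSTRUCTIONS.get(opcode, (f"DB {opcode:#04x}", 0))
        operands = bytecode[offset + 1 : offset + 1 + operand_len]
        text = " ".join([mnemonic] + [f"{b:#04x}" for b in operands])
        lines.append(f"{offset:04x}: {text}")
        offset += 1 + operand_len
    return lines


for line in disassemble(bytes([0x01, 0x10, 0x02, 0xFF])):
    print(line)
```

Once the instruction table is known, the tool itself is simple. The hard part is recovering that table from the interpreter.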

Typically, malware developers will use commercial tools such as VMProtect. Because these tools are also used by legitimate applications, security companies can’t simply write rules to detect the virtual machine itself. While some malware does implement its own custom VMs, it’s not very common. However, since tools like VMProtect are extremely complex to reverse, these challenges are built using much simpler custom virtual machines.

VM 1

Lab Type:
Static Analysis
Languages:
x86_64
Platform:
Windows 64-bit
Difficulty:
An introduction to working with regular text strings in portable executables. Great for beginners who've never done reverse engineering before.