

An Introduction to CPU Virtualization

 

This article is currently an experimental machine translation and may contain errors. If anything is unclear, please refer to the original Chinese version. I am continuously working to improve the translation.

In day-to-day development we routinely use VPS instances from the major cloud providers (Alibaba Cloud, GCP, and so on) and virtual machines on our own computers (VMware, PVE, and so on).

As application-level developers, we can enjoy the benefits of virtualization without understanding how it is implemented underneath. The core technology, CPU virtualization, usually gets no more than a passing mention.

Recently, out of curiosity, I looked up more resources and have compiled some information on CPU virtualization here.

Overview

Before hardware virtualization support enters the picture, x86 already offers four privilege levels, Ring 0 to Ring 3. The OS kernel runs in Ring 0, while user-space programs run in Ring 3. The virtual memory mechanism gives each process an isolated address space.
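As a small illustration of the ring model: on x86 the current privilege level is held in the low two bits of the CS segment selector. The sketch below only extracts those bits. The two selector constants are the values 64-bit Linux uses and are included purely as an example, not as something the article depends on.

```python
def privilege_level(selector: int) -> int:
    """Return the privilege level stored in the low two bits of a segment selector."""
    return selector & 0b11


# Example selectors (the values used by 64-bit Linux, assumed here for illustration).
KERNEL_CS = 0x10
USER_CS = 0x33

print("kernel code runs in ring", privilege_level(KERNEL_CS))
print("user code runs in ring", privilege_level(USER_CS))
```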

In this setup the CPU has no mechanism for isolating several kernels from one another, so running multiple independent operating systems on the same machine is difficult. QEMU can emulate a full system entirely in user mode by dynamically recompiling guest code, but at a significant performance cost. (I once even built a purely interpreted "lite edition" of QEMU myself.)
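To give a feel for where the overhead of pure software emulation comes from, here is a toy interpreter. The instruction set (`load`, `add`, `jnz`, `halt`) is entirely made up for this sketch and has nothing to do with QEMU's actual design; the point is only that every guest instruction costs a full fetch, decode, and dispatch cycle in host code.

```python
def run(program, max_steps=1000):
    """Interpret (opcode, operand) tuples for a made-up accumulator machine.

    Returns the final accumulator value and the number of guest
    instructions that were interpreted.
    """
    acc = 0
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]      # fetch and decode
        steps += 1
        if op == "load":             # acc = immediate
            acc = args[0]
        elif op == "add":            # acc += immediate
            acc += args[0]
        elif op == "jnz":            # jump to args[0] if acc != 0
            if acc != 0:
                pc = args[0]
                continue
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return acc, steps


# A guest program that counts down from 3 to 0.
countdown = [("load", 3), ("add", -1), ("jnz", 1), ("halt",)]
print(run(countdown))
```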

To address this, Intel and AMD each introduced their own hardware virtualization extensions: VT-x and SVM (marketed as AMD-V), respectively. The rest of this article focuses on Intel's VT-x.
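On Linux, the CPU feature flags listed in /proc/cpuinfo include `vmx` when VT-x is available and `svm` when AMD's SVM is available. A minimal sketch of checking for them follows; the function name and the sample text are my own, and the parsing assumes the usual `flags : ...` line format.

```python
from typing import Optional


def hw_virt_flavor(cpuinfo_text: str) -> Optional[str]:
    """Return which hardware virtualization extension the flags line advertises."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD SVM"
            return None
    return None


# Hypothetical excerpt of /proc/cpuinfo, shortened for the example.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse vmx sse2\n"
print(hw_virt_flavor(sample))
```

On a real Linux machine the same function could be fed the contents of /proc/cpuinfo. Note that the flag can be missing even on capable hardware if virtualization is disabled in the firmware settings.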

In VT-x, Intel introduced a series of instructions to support hardware virtualization.

Intel Mnemonic    Description
INVEPT            Invalidate Translations Derived from EPT
INVVPID           Invalidate Translations Based on VPID
VMCALL            Call to VM Monitor
VMCLEAR           Clear Virtual-Machine Control Structure
VMFUNC            Invoke VM function
VMLAUNCH          Launch Virtual Machine
VMRESUME          Resume Virtual Machine
VMPTRLD           Load Pointer to Virtual-Machine Control Structure
VMPTRST           Store Pointer to Virtual-Machine Control Structure
VMREAD            Read Field from Virtual-Machine Control Structure
VMWRITE           Write Field to Virtual-Machine Control Structure
VMXOFF            Leave VMX Operation
VMXON             Enter VMX Operation

[Figure: Interaction between VMM and Guest]

VT-x introduces two additional CPU modes:

  1. VMX Root Operation Mode: Entered via the VMXON instruction; the host software running here acts as the hypervisor (VMM).
  2. VMX Non-Root Operation Mode: The mode in which the Guest runs. It is entered from VMX Root mode via VMLAUNCH or VMRESUME. Access to sensitive resources is restricted in this mode: any such access triggers a VM exit, returning control to VMX Root.

The VMX lifecycle is as follows:

  • Software enters VMX operation mode by executing the VMXON instruction.
  • The VMM can enter a Guest VM via VM entries (a logical processor runs only one guest at a time). The VMM uses VMLAUNCH (for the first entry) and VMRESUME (for later entries, after a VM exit) to initiate a VM entry, and regains control through VM exits.
  • Upon a VM exit, control is transferred to a VMM-specified entry point. After handling the reason for the exit, the VMM resumes the VM via another VM entry.
  • When the VMM wants to stop and exit VMX operation mode, it executes the VMXOFF instruction.
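The lifecycle above can be modeled as a small state machine. This is a simplified sketch and not real hypervisor code: the class and method names are my own, it tracks a single VMCS per logical CPU, and it ignores VMPTRLD and every other piece of setup. The one hardware detail it does encode is the launch state: VMLAUNCH expects a VMCS in the "clear" state, VMRESUME expects one that has already been launched.

```python
class VmxError(Exception):
    """Raised when an instruction is used in the wrong state."""


class LogicalCpu:
    """Toy model of one logical processor and one VMCS."""

    def __init__(self):
        self.vmx_on = False     # inside VMX operation?
        self.in_guest = False   # currently in VMX non-root?
        self.launched = False   # launch state of the (single) VMCS

    def _require_root(self):
        if not self.vmx_on or self.in_guest:
            raise VmxError("requires VMX root operation")

    def vmxon(self):
        if self.vmx_on:
            raise VmxError("already in VMX operation")
        self.vmx_on = True

    def vmclear(self):
        self._require_root()
        self.launched = False

    def vmlaunch(self):
        self._require_root()
        if self.launched:
            raise VmxError("VMLAUNCH needs a VMCS in the clear state")
        self.launched = True
        self.in_guest = True    # VM entry

    def vmresume(self):
        self._require_root()
        if not self.launched:
            raise VmxError("VMRESUME needs a launched VMCS")
        self.in_guest = True    # VM entry

    def vm_exit(self, reason):
        if not self.in_guest:
            raise VmxError("no guest is running")
        self.in_guest = False   # control returns to the VMM
        return reason

    def vmxoff(self):
        self._require_root()
        self.vmx_on = False


cpu = LogicalCpu()
cpu.vmxon()
cpu.vmlaunch()                  # first entry
cpu.vm_exit("guest did something sensitive")
cpu.vmresume()                  # later entries
cpu.vm_exit("guest is done")
cpu.vmxoff()
print("back outside VMX operation:", not cpu.vmx_on)
```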

[Figure: VMX Root and Non-Root Modes]

By analogy, VMX Root plays the role of Ring 0 and a Guest running in VMX Non-Root plays the role of Ring 3, although each of the two modes still has its own Ring 0 to Ring 3. When the Guest accesses a privileged resource (such as a privileged register), the access causes a VM exit and is handed to the VM-exit handler in VMX Root. The VMM in VMX Root can therefore schedule and manage multiple Guests.
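A VM-exit handler is essentially a dispatcher keyed on the exit reason. Below is a minimal sketch of that shape. The numeric exit reasons are the basic exit reasons defined by Intel for CPUID, HLT, and VMCALL; everything else (the guest register dictionary, the handler names, the fake CPUID result) is invented for illustration.

```python
# Basic exit reasons as defined by Intel (bits 15:0 of the exit reason field).
EXIT_REASON_CPUID = 10
EXIT_REASON_HLT = 12
EXIT_REASON_VMCALL = 18


def handle_cpuid(guest):
    # Pretend to emulate CPUID: return a made-up value and skip the instruction.
    guest["rax"] = 0
    guest["rip"] += 2           # CPUID is encoded as two bytes (0F A2)
    return True                 # resume the guest


def handle_hlt(guest):
    return False                # in this sketch, HLT simply stops the guest


HANDLERS = {
    EXIT_REASON_CPUID: handle_cpuid,
    EXIT_REASON_HLT: handle_hlt,
}


def handle_vm_exit(exit_reason, guest):
    """Dispatch one VM exit; return True if the guest should be resumed."""
    handler = HANDLERS.get(exit_reason & 0xFFFF)
    if handler is None:
        raise NotImplementedError(f"unhandled exit reason {exit_reason & 0xFFFF}")
    return handler(guest)


guest_regs = {"rip": 0x1000, "rax": 0xDEAD}
print(handle_vm_exit(EXIT_REASON_CPUID, guest_regs), guest_regs)
print(handle_vm_exit(EXIT_REASON_HLT, guest_regs))
```

A real handler would read the exit reason and the instruction length from the VMCS instead of hard-coding them.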

VMCS

The VMCS (Virtual Machine Control Structure) is a data structure that stores the state of both the Guest and the Host. It mainly holds register state (saved and restored on VM entries and VM exits) and control fields that govern how the VM behaves.

Software is not supposed to read or write the VMCS with ordinary memory accesses; its fields must be accessed with the VMREAD and VMWRITE instructions.
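VMREAD and VMWRITE identify a field by a numeric encoding rather than by a memory offset. The encoding packs the field's width, its type (control, read-only data, guest state, or host state), and an index into a few bits. The sketch below decodes that layout for some well-known fields; the helper names are my own, and the field constants are the encodings from Intel's documentation.

```python
from typing import NamedTuple

WIDTHS = ("16-bit", "64-bit", "32-bit", "natural-width")              # bits 14:13
KINDS = ("control", "read-only data", "guest state", "host state")   # bits 11:10


class VmcsField(NamedTuple):
    width: str
    kind: str
    index: int          # bits 9:1
    high_access: bool   # bit 0: access the high 32 bits of a 64-bit field


def decode_vmcs_field(encoding: int) -> VmcsField:
    return VmcsField(
        width=WIDTHS[(encoding >> 13) & 0b11],
        kind=KINDS[(encoding >> 10) & 0b11],
        index=(encoding >> 1) & 0x1FF,
        high_access=bool(encoding & 1),
    )


FIELDS = {
    "GUEST_RIP": 0x681E,
    "HOST_RIP": 0x6C16,
    "VM_EXIT_REASON": 0x4402,
    "EPT_POINTER": 0x201A,
}

for name, encoding in FIELDS.items():
    print(f"{name:15} {encoding:#06x} {decode_vmcs_field(encoding)}")
```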

[Figure: VMCS]

EPT

EPT (Extended Page Table) works like an ordinary page table: it provides address translation.

In VMX Non-Root mode, the CPU uses the EPT mappings to translate guest-physical addresses into actual host-physical addresses.

In typical virtual machines, each Guest’s physical address space is independently mapped to different regions of the host’s physical memory, allowing them to run isolated and independently.
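The translation can be modeled in a few lines. This sketch assumes a four-level walk over 4 KiB pages (9 index bits per level plus a 12-bit offset) and uses nested dictionaries in place of real table pages; the function names are my own. It shows two guests using the same guest-physical address while landing in different host-physical memory.

```python
PAGE_SHIFT = 12   # 4 KiB pages
INDEX_BITS = 9    # 512 entries per table
LEVELS = 4


def split_gpa(gpa):
    """Split a guest-physical address into table indices (top level first) and a page offset."""
    offset = gpa & ((1 << PAGE_SHIFT) - 1)
    indices = [
        (gpa >> (PAGE_SHIFT + INDEX_BITS * level)) & ((1 << INDEX_BITS) - 1)
        for level in reversed(range(LEVELS))
    ]
    return indices, offset


def map_page(ept_root, gpa, hpa):
    """Map the page containing gpa to the host page containing hpa."""
    indices, _ = split_gpa(gpa)
    node = ept_root
    for idx in indices[:-1]:
        node = node.setdefault(idx, {})
    node[indices[-1]] = hpa >> PAGE_SHIFT   # the leaf stores the host frame number


def translate(ept_root, gpa):
    """Walk the model tables; raise LookupError where hardware would raise an EPT violation."""
    indices, offset = split_gpa(gpa)
    node = ept_root
    for idx in indices:
        if idx not in node:
            raise LookupError(f"guest-physical address {gpa:#x} is not mapped")
        node = node[idx]
    return (node << PAGE_SHIFT) | offset


guest_a, guest_b = {}, {}
map_page(guest_a, 0x1000, 0x200000)
map_page(guest_b, 0x1000, 0x300000)
print(hex(translate(guest_a, 0x1234)))
print(hex(translate(guest_b, 0x1234)))
```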

[Figure: EPT Translation]

Like regular page tables on the host, EPT also supports access control (Read, Write, Execute). This capability is very useful in hypervisor-assisted debugging: once the hypervisor removes R/W/X permissions from specific memory regions of a running program, any access to those regions triggers a VM exit (an EPT violation). The hypervisor can then trace the behavior of the target program and intervene as needed.
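The tracing idea can be sketched as follows. In a real EPT entry the read, write, and execute permissions are the three low bits; the rest of this model (a dictionary of per-page permissions, an exception standing in for the VM exit, and a log list standing in for the debugger) is a simplification of my own.

```python
EPT_READ, EPT_WRITE, EPT_EXEC = 0b001, 0b010, 0b100
PAGE_SHIFT = 12


class EptViolation(Exception):
    """Stands in for the VM exit caused by a disallowed access."""

    def __init__(self, gpa, access):
        super().__init__(f"EPT violation at {gpa:#x} (access bits {access:03b})")
        self.gpa = gpa
        self.access = access


def guest_access(page_perms, gpa, access):
    """Check an access against the permissions of the page containing gpa."""
    allowed = page_perms.get(gpa >> PAGE_SHIFT, 0)
    if access & ~allowed:
        raise EptViolation(gpa, access)


def traced_access(page_perms, gpa, access, log):
    """Debugger-style handling: record the violation instead of failing."""
    try:
        guest_access(page_perms, gpa, access)
    except EptViolation as violation:
        log.append((violation.gpa, violation.access))


perms = {
    0x1: EPT_READ | EPT_WRITE | EPT_EXEC,   # ordinary page
    0x2: 0,                                 # page under observation: all permissions removed
}
trace = []
traced_access(perms, 0x1010, EPT_WRITE, trace)   # allowed, nothing logged
traced_access(perms, 0x2010, EPT_WRITE, trace)   # logged
traced_access(perms, 0x2FF0, EPT_READ, trace)    # logged
print([(hex(gpa), access) for gpa, access in trace])
```

A real debugger would also have to let the faulting access complete afterwards, for example by temporarily restoring the permission while the guest executes that one instruction.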

More Practical Applications

Hardware virtualization was initially used primarily to run multiple operating systems on a single bare-metal machine. In recent years, it has also found applications in software debugging, binary security, and other domains:

  • Common virtualization platforms (e.g., VMware, VirtualBox, Hyper-V, PVE, ESXi)
  • Hypervisor-assisted debuggers like HyperDbg
  • Operating systems and antivirus software leveraging Virtualization-Based Security (e.g., sandboxes)

Further Reading

CTF Wiki’s introduction to virtualization: https://ctf-wiki.org/pwn/virtualization/basic-knowledge/cpu-virtualization/#reference

Writing your own Hypervisor: https://rayanfam.com/topics/hypervisor-from-scratch-part-1

This article is licensed under the CC BY-NC-SA 4.0 license.

Author: lyc8503, Article link: https://blog.lyc8503.net/en/post/hypervisor-explore/