kernel_protect: Add VM kernel protection module #43
Draft
SaurabhQC178914 wants to merge 1 commit into
Introduce the kernel_protect module which enforces stage-2 memory access-right restrictions on a protected secondary VM's kernel assets.

The feature consists of three hypercalls:

- addrspace_protect_vm_assets (hvc 0x6094): Called by the Resource Manager to associate four memory extents (RO, RW, RX, RWX) with a VM's address space. Allocates a kp_mem_info structure (kp_extent_info) as the sole sentinel for whether KP is active on an addrspace; there is no separate kp_enabled flag.
- kernel_protect_arm_el1_kernel_asset_info / _compact (SMCCC 0xC606005F / 0xC6060061): Called by the protected VM's EL1 kernel to report the layout of its memory regions (TEXT, RODATA, DATA, BSS, FIXMAP_TEXT_POKE0). The compact variant accepts a single kernel_asset_info structure (132 bytes, packed) covering all regions in one call; the non-compact variant takes one region per call. Both paths use nospec_range_check() on the guest-supplied region_type before using it as an array index to prevent Spectre v1 gadgets. The version field (KP_MAGIC_VERSION = 0xF1000) acts as an idempotency guard preventing the call from being made twice.
- kernel_protect_enable_vm_protection (SMCCC 0xC6060060): Called exclusively by the HLOS VM (PVM) to trigger integrity verification of a target VM. The caller passes the target VM's addrspace CapID; the hypervisor looks it up via cspace_lookup_addrspace_any() and sends KP_CHECK_INTEGRITY RM-RPC messages to the Resource Manager. Both HLOS querying another VM and any VM performing a self-check are permitted; all other combinations are rejected with ERROR_DENIED.

Write faults to protected regions are routed through vdevice handlers (one each for RX, RO, RW). Faults within the FIXMAP_TEXT_POKE0 exclusion range return VCPU_TRAP_RESULT_EMULATED as a NOP (PC advanced, no instruction decoding) to allow the Linux kernel's live-patch text_poke path to proceed. Faults outside the exclusion range return VCPU_TRAP_RESULT_FAULT.
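The write-fault decision described above can be illustrated with a minimal sketch. This is not the code from the patch: the types `range_t` and `trap_result_t` and the functions `range_contains()` and `classify_write_fault()` are hypothetical stand-ins, and the sketch assumes a faulting access counts as inside the exclusion range only when the whole access fits within it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hypervisor's trap result codes. */
typedef enum {
	TRAP_RESULT_EMULATED, /* access dropped as a NOP; caller advances PC */
	TRAP_RESULT_FAULT     /* fault is reported back to the guest */
} trap_result_t;

/* Hypothetical description of the FIXMAP_TEXT_POKE0 exclusion range. */
typedef struct {
	uint64_t base;
	uint64_t size;
} range_t;

/* True only if [addr, addr + len) lies entirely inside the range.
 * Written with subtraction so that addr + len can never overflow. */
static bool
range_contains(const range_t *r, uint64_t addr, uint64_t len)
{
	bool ok = false;

	if ((len != 0U) && (r->size != 0U) && (addr >= r->base)) {
		uint64_t offset = addr - r->base;

		ok = (offset < r->size) && (len <= (r->size - offset));
	}

	return ok;
}

/* Writes inside the exclusion range are emulated as a NOP;
 * every other write to a protected region is a fault. */
static trap_result_t
classify_write_fault(const range_t *exclusion, uint64_t ipa, uint64_t len)
{
	return range_contains(exclusion, ipa, len) ? TRAP_RESULT_EMULATED
						   : TRAP_RESULT_FAULT;
}
```

Requiring the full access to fit inside the range is a conservative assumption of this sketch: an access that straddles the range boundary is treated as a fault.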
MISRA 2012 compliance:

- All assert(vcpu_is_vcpu()) calls replaced with proper error returns to prevent guest-triggerable DoS.
- nospec_range_check() used for all guest-supplied array indices.
- memset_s() used with return-value check for zero-initialisation.
- EXCLUDE_PREEMPT_DISABLED and REQUIRE_READ annotations moved from function definitions to declarations in .ev and .h files.
- Commented-out code blocks removed (DIR 4.4).
- kp_segment_bounds_valid() refactored to single exit point (Rule 15.5).

Signed-off-by: Saurabh Saxena <saursaxe@qti.qualcomm.com>
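As an illustration of the compliance points listed above, the following sketch validates a guest-supplied region index and segment bounds, returns an error instead of asserting, and keeps a single exit point. All names here (`region_type_t`, `result_t`, `asset_table_t`, `asset_table_set()`) are hypothetical and not taken from the patch. The plain comparison is only an architectural bounds check; it does not provide the speculation barrier that nospec_range_check() is described as adding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region identifiers, named after the regions in the
 * description above. */
typedef enum {
	REGION_TEXT = 0,
	REGION_RODATA,
	REGION_DATA,
	REGION_BSS,
	REGION_FIXMAP_TEXT_POKE0,
	REGION_COUNT
} region_type_t;

/* Hypothetical result codes. */
typedef enum {
	RESULT_OK = 0,
	RESULT_ARGUMENT_INVALID
} result_t;

typedef struct {
	uint64_t base;
	uint64_t size;
} region_t;

typedef struct {
	region_t regions[REGION_COUNT];
} asset_table_t;

/* Record one region reported by the guest. Invalid input produces an
 * error return rather than an assertion, and the function has exactly
 * one return statement. */
static result_t
asset_table_set(asset_table_t *table, uint64_t region_type, uint64_t base,
		uint64_t size)
{
	result_t ret = RESULT_ARGUMENT_INVALID;

	if ((table != NULL) && (region_type < (uint64_t)REGION_COUNT) &&
	    (size != 0U) && (base <= (UINT64_MAX - size))) {
		/* Index is in range and base + size cannot wrap. */
		table->regions[region_type].base = base;
		table->regions[region_type].size = size;
		ret = RESULT_OK;
	}

	return ret;
}
```

Initialising the result to the error value and assigning success in one place is one common way to satisfy a single-exit rule without nested early returns.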