CyberSnakeH/SmileHook

SmileHook

Inline function hooking for Linux on x86-64, in the spirit of MinHook.


SmileHook redirects a function by overwriting its prologue with a jump to a detour, while keeping the original reachable through a trampoline. It is an in-process library: it patches code running in the current process. It is written in safe-by-default Rust, with every unsafe block justified. Injecting code into another process is a separate concern, outside the scope of this library.

  • 14-byte absolute jump (jmp [rip+0] + address) — no ±2 GiB range limit.
  • Faithful trampoline: stolen instructions are relocated with iced-x86, including rewriting relative branches that no longer reach their target.
  • Thread-safe patching: every enable/disable freezes the other threads and relocates any caught instruction pointer and stack return address, so a concurrent caller never observes a half-written prologue.
  • Atomic batch apply: flip many hooks under a single freeze.
  • Hooks by address or by symbol name (via dlsym).
  • Refuses unsafe targets: a function whose prologue is shorter than the jump is rejected instead of corrupting the next function.
  • C / C++ ABI: link the static library and hook from a native payload.

Status: experimental, x86-64 Linux only. The engine is well tested (see Robustness), but treat it as a 0.x library.

Quick start

use std::sync::atomic::{AtomicPtr, Ordering};

// Holds the trampoline so the detour can call the original.
static ORIGINAL: AtomicPtr<u8> = AtomicPtr::new(std::ptr::null_mut());

extern "C" fn add(a: i32, b: i32) -> i32 { a + b }

extern "C" fn add_detour(a: i32, b: i32) -> i32 {
    let original: extern "C" fn(i32, i32) -> i32 =
        unsafe { std::mem::transmute(ORIGINAL.load(Ordering::SeqCst)) };
    original(a, b) + 100
}

fn main() -> smilehook::Result<()> {
    let registry = smilehook::global();
    let target = add as *const () as *mut u8;

    // SAFETY: `add_detour` is ABI-compatible with `add`.
    let trampoline = unsafe { registry.create(target, add_detour as *const () as *const u8)? };
    ORIGINAL.store(trampoline as *mut u8, Ordering::SeqCst);

    unsafe { registry.enable(target)? };   // calls to `add` now route through the detour
    assert_eq!(add(1, 2), 103);
    unsafe { registry.disable(target)? };  // restore the original
    assert_eq!(add(1, 2), 3);
    Ok(())
}

A detour must be ABI-compatible with its target. The original is always reached through the trampoline returned by create, never by calling the target again (that would re-enter the detour).

Two layers

  • Hook — a single RAII hook. Disabled and freed automatically on drop. Use it for scoped, one-off hooks.
  • Registry — a thread-safe collection of hooks keyed by target address. It rejects duplicates and supports bulk and batched operations. A process-wide instance is available through smilehook::global().

How it works

  1. Decode: iced-x86 decodes the target's prologue so that only whole instructions are stolen (at least 14 bytes).
  2. Relocate — those instructions are re-encoded into an executable page (mmap), followed by a jump back to target + N. Calling the trampoline is equivalent to calling the original.
  3. Patch — the prologue is overwritten with a 14-byte absolute indirect jump: FF 25 00 00 00 00 (jmp qword ptr [rip + 0]) followed by the 8-byte detour address. Unlike an E9 rel32 jump, this form has no range limit.
  4. Freeze — every enable, disable, and drop happens under a stop-the-world freeze (see Safety and concurrency), so concurrent callers never execute a partially patched prologue.
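
The byte layout in step 3 can be sketched in a few lines. This is a minimal illustration of the encoding only, not SmileHook's internal code; `encode_abs_jump` and `JUMP_LEN` are hypothetical names chosen for the example.

```rust
/// Size of the patch: a 6-byte `jmp qword ptr [rip + 0]` plus an 8-byte address.
const JUMP_LEN: usize = 14;

/// Builds the absolute indirect jump described in step 3.
/// Hypothetical helper, not part of SmileHook's public API.
fn encode_abs_jump(detour: u64) -> [u8; JUMP_LEN] {
    let mut bytes = [0u8; JUMP_LEN];
    // FF 25 disp32 is `jmp qword ptr [rip + disp32]`. With disp32 = 0 the
    // pointer is read from the 8 bytes that immediately follow the instruction.
    bytes[..6].copy_from_slice(&[0xFF, 0x25, 0x00, 0x00, 0x00, 0x00]);
    // x86-64 is little-endian, so the address is stored low byte first.
    bytes[6..].copy_from_slice(&detour.to_le_bytes());
    bytes
}
```

Because the destination is a full 64-bit pointer stored next to the instruction, the detour can live anywhere in the address space, which is what removes the ±2 GiB limit of a rel32 jump.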

iced-x86's BlockEncoder handles the subtle part: if a stolen relative call/jmp can no longer reach its target from the trampoline, it is rewritten as an indirect jump with the absolute address stored out of line. The jump back is handed to the encoder as a block instruction (not appended by hand) so the layout stays correct. Because the encoder does not report a usable offset for an instruction it rewrites this way, SmileHook re-derives each relocated offset by decoding the encoded trampoline, and records a boundary map (original offset → trampoline offset) used to fix up any thread frozen mid-prologue.
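
To make the boundary map concrete, here is a simplified model of how such a map could translate an address inside the stolen prologue into the matching trampoline address. The type, the function name, and the lookup strategy are assumptions for illustration; the README does not describe SmileHook's actual data structure.

```rust
/// One entry per stolen instruction:
/// (offset in the original prologue, offset in the trampoline).
type BoundaryMap = Vec<(usize, usize)>;

/// Hypothetical helper: if `addr` sits on an instruction boundary inside the
/// stolen prologue, return the equivalent address inside the trampoline.
/// Returns `None` when the address is outside the patched bytes or not on a
/// recorded boundary.
fn relocate_addr(
    addr: usize,
    target: usize,
    trampoline: usize,
    stolen_len: usize,
    map: &BoundaryMap,
) -> Option<usize> {
    // Addresses below the target are unaffected.
    let offset = addr.checked_sub(target)?;
    if offset >= stolen_len {
        return None; // past the bytes that get overwritten
    }
    map.iter()
        .find(|&&(orig, _)| orig == offset)
        .map(|&(_, tramp)| trampoline + tramp)
}
```

The trampoline offsets need not match the original ones: an instruction that was rewritten into a longer form shifts everything after it, which is why a plain `addr - target + trampoline` translation would not be enough.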

Hooking by symbol name

symbol_address resolves an exported function through dlsym(RTLD_DEFAULT, …), and Registry::create_by_symbol hooks it in one step — handy for intercepting library functions such as glXSwapBuffers or vkQueuePresentKHR:

let trampoline = unsafe {
    smilehook::global().create_by_symbol("strlen", my_strlen as *const () as *const u8)?
};

Atomic batch apply

enable_all / disable_all freeze the process once per hook. To flip many hooks together and pay for one stop-the-world instead of N, queue the changes and apply them in a single freeze (the analogue of MinHook's MH_ApplyQueued):

let reg = smilehook::global();
reg.queue_enable(target_a)?;
reg.queue_disable(target_b)?;          // mixed enables and disables are fine
unsafe { reg.apply_queued()? };        // all take effect under one freeze

queue_* only update bookkeeping (so they are safe); the live patch happens in apply_queued. Hooks already in their requested state are skipped.
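
The split between bookkeeping and patching can be modelled with plain data structures. The sketch below is a toy stand-in, not SmileHook's `Registry`: `ToyRegistry` and its methods are hypothetical, it patches nothing, and it only shows how queued requests could be recorded and how hooks already in the requested state get skipped.

```rust
use std::collections::HashMap;

/// Toy model of the queueing logic only; no code is patched here.
#[derive(Default)]
struct ToyRegistry {
    enabled: HashMap<usize, bool>, // target address -> current state
    queued: HashMap<usize, bool>,  // target address -> requested state
}

impl ToyRegistry {
    /// Registers a hook in the disabled state; returns false on a duplicate target.
    fn create(&mut self, target: usize) -> bool {
        if self.enabled.contains_key(&target) {
            return false;
        }
        self.enabled.insert(target, false);
        true
    }

    /// Records the requested state. Pure bookkeeping, so nothing unsafe happens.
    fn queue(&mut self, target: usize, want_enabled: bool) -> bool {
        if !self.enabled.contains_key(&target) {
            return false; // unknown target
        }
        self.queued.insert(target, want_enabled);
        true
    }

    /// Applies every queued change and returns how many hooks actually flipped.
    /// In a real engine this loop is the part that would run under one freeze.
    fn apply_queued(&mut self) -> usize {
        let mut flipped = 0;
        for (target, want) in self.queued.drain() {
            if let Some(state) = self.enabled.get_mut(&target) {
                if *state != want {
                    *state = want;
                    flipped += 1;
                }
            }
        }
        flipped
    }
}
```

Keeping the queue as a map from target to requested state also means that queueing the same target twice simply keeps the latest request.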

C / C++ API

SmileHook also builds as a static library exposing a C ABI that mirrors MinHook's, so a native payload (for example a C++ overlay LD_PRELOADed into a target) can install hooks without writing any Rust. The crate's [lib] type includes staticlib, so a release build emits target/release/libsmilehook.a; the matching declarations are in include/smilehook.h, and a runnable example is in examples/c_api.c.

#include "smilehook.h"

typedef int (*add_fn)(int, int);
static add_fn original;

int add_detour(int a, int b) { return original(a, b) + 100; }

void install(void *add) {
    sh_create_hook(add, (const void *)add_detour, (void **)&original);
    sh_enable_hook(add);   /* calls to `add` now route through the detour */
}

cargo build --release      # -> target/release/libsmilehook.a
cc -O0 -Iinclude examples/c_api.c target/release/libsmilehook.a \
   -lpthread -ldl -lm -o c_api && ./c_api

Every sh_* call returns a status code (SH_OK, or a negative SH_ERR_*); sh_strerror turns one into a static message. The entry points catch any internal Rust panic and return SH_ERR_OTHER rather than unwinding across the boundary, so a fault in the hooking engine cannot abort the host process.
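
The panic-containment pattern described above can be sketched with `std::panic::catch_unwind`. This is an illustration of the general technique, not the contents of ffi.rs: `guard_ffi` and `sh_demo_call` are hypothetical names, and the numeric values of the status codes are assumptions (the README only says that errors are negative).

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

const SH_OK: i32 = 0;         // assumed value
const SH_ERR_OTHER: i32 = -1; // assumed value; only "negative" is documented

/// Runs `body` and converts a panic into a status code, so that no unwind
/// ever crosses the C boundary.
fn guard_ffi<F: FnOnce() -> i32>(body: F) -> i32 {
    catch_unwind(AssertUnwindSafe(body)).unwrap_or(SH_ERR_OTHER)
}

/// Hypothetical entry point that fails on demand, standing in for a real
/// `sh_*` function.
extern "C" fn sh_demo_call(should_fail: bool) -> i32 {
    guard_ffi(|| {
        if should_fail {
            panic!("simulated engine fault");
        }
        SH_OK
    })
}
```

Note that `catch_unwind` only helps when the crate is built with the default unwinding panic strategy; with `panic = "abort"` a panic still terminates the process.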

Architecture

src/
├── lib.rs       crate root, docs, public re-exports
├── error.rs     Error + Result
├── memory.rs    ExecBuffer (executable mmap) and patch_code (mprotect)
├── arch/
│   ├── mod.rs     ISA dispatch (x86-64 only; compile error otherwise)
│   └── x86_64.rs  prologue decoding/relocation, jump generation, boundary map
├── freeze.rs    stop-the-world thread suspension during a patch
├── symbol.rs    symbol_address — resolve a function by name via dlsym
├── ffi.rs       C ABI bridge (sh_* functions) for non-Rust payloads
├── hook.rs      Hook — a single RAII hook
└── registry.rs  Registry — thread-safe multi-hook registry + global()

include/
└── smilehook.h  C/C++ header matching the ffi.rs bridge

ISA-specific logic (decode, relocate, jump emission) is isolated in arch::x86_64; the rest of the crate is architecture-agnostic.

Safety and concurrency

Installing a hook patches live, executable code, so the detour must be ABI-compatible with the target.

The 14-byte write is not atomic, and Linux has no in-process "suspend thread" primitive, so SmileHook makes patching safe against concurrent execution with a stop-the-world freeze. During every enable, disable, and drop the patching thread:

  1. enumerates the other threads via /proc/self/task and sends each a real-time signal (tgkill with SIGRTMIN+4);
  2. waits until all of them have parked inside an async-signal-safe handler (threads that exit before parking are dropped from the wait set);
  3. relocates any thread caught in the affected bytes — both its saved instruction pointer and return addresses on its stack that point into the stolen prologue (a call in the prologue would otherwise return onto patched bytes) — using the relocation boundary map;
  4. applies the patch and releases the threads.
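
Step 1 relies on a standard Linux facility: each thread of a process appears as a numeric directory under /proc/self/task. A minimal sketch of that enumeration using only the standard library follows; `thread_ids` is a hypothetical helper, and the signalling, parking, and relocation steps are deliberately left out.

```rust
use std::fs;
use std::io;

/// Lists the kernel thread IDs of the current process by reading
/// /proc/self/task (Linux only). Hypothetical helper for illustration.
fn thread_ids() -> io::Result<Vec<u32>> {
    let mut tids = Vec::new();
    for entry in fs::read_dir("/proc/self/task")? {
        let name = entry?.file_name();
        // Every directory name is a decimal thread ID; skip anything else.
        if let Some(tid) = name.to_str().and_then(|s| s.parse::<u32>().ok()) {
            tids.push(tid);
        }
    }
    Ok(tids)
}
```

A freeze implementation would exclude the calling thread from this list and has to tolerate threads that start or exit between the directory scan and the signal, which is why step 2 drops exited threads from the wait set.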

A thread that blocks the freeze signal makes the operation fail with Error::FreezeTimeout rather than hang. Registry operations are additionally serialized by a mutex.

One obligation remains the caller's: do not remove or drop a hook while its detour may still be executing, since that frees the trampoline it calls into.

Robustness

Two test suites exercise the relocation engine over many prologues, not just a handful:

  • Corpus soundness runs the relocator over ~130 real glibc/pthread functions — vectorized string/memory ops, stdio, math, threading — and asserts every relocation is structurally and semantically sound (well-formed boundary map, no sentinel offsets, plain instructions copied byte-for-byte, a valid jump back) or is cleanly refused. It never patches the live functions, so it is safe to run over arbitrary code.
  • Differential fuzzing hooks synthetic functions of varied prologue shapes (leaf, early call, large frame, branches, SSE, many locals) with thousands of random inputs each, checking that a transparent detour reproduces the original exactly and a transforming detour composes correctly.

Functions whose optimized prologue is shorter than the 14-byte jump (e.g. the endbr64; mov; syscall; ret syscall wrappers) are refused with Error::PrologueTooShort rather than allowed to overwrite the adjacent function.
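
The refusal rule can be illustrated with instruction lengths alone. The sketch below is a simplification under stated assumptions: it takes the decoded instruction lengths of a whole function as input, whereas SmileHook decodes real bytes with iced-x86, and the README does not say how the engine detects the end of a function. `bytes_to_steal` and `StealError` are hypothetical names.

```rust
/// Length of the absolute jump that must fit into the prologue.
const JUMP_LEN: usize = 14;

#[derive(Debug, PartialEq)]
enum StealError {
    /// The function ended before enough whole instructions were collected.
    PrologueTooShort { available: usize },
}

/// Given the lengths of a function's instructions in order, returns how many
/// bytes to steal: whole instructions only, and at least `JUMP_LEN` bytes.
fn bytes_to_steal(instr_lens: &[usize]) -> Result<usize, StealError> {
    let mut total = 0;
    for &len in instr_lens {
        total += len;
        if total >= JUMP_LEN {
            return Ok(total);
        }
    }
    Err(StealError::PrologueTooShort { available: total })
}
```

For a wrapper shaped like `endbr64; mov eax, imm32; syscall; ret` the lengths are 4, 5, 2, and 1 bytes, 12 in total, so the function is refused. A prologue with lengths 1, 3, 4, and 7 yields 15 stolen bytes, one more than the jump needs, because the last instruction cannot be split.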

Building and testing

No injector is required — the tests hook functions inside the test process.

cargo build --release      # also emits target/release/libsmilehook.a
cargo test                 # hook, registry, symbol, ffi, thread-safety, fuzz
cargo test --release       # the same, with optimized codegen
cargo run --example basic  # multi-hook registry demo
cargo run --example diag   # disassemble a target prologue and its trampoline

tests/thread_safety.rs is a stress test: worker threads hammer a target in a tight loop while the main thread toggles the hook hundreds of times (singly and in atomic batches). Every observed return value must be either the original or the hooked one — never a torn result.

Status

  • x86-64 inline hook (14-byte absolute jump)
  • Trampoline with full instruction relocation and a boundary map for instruction-pointer / return-address fix-ups
  • RAII Hook: enable / disable / restore on drop
  • Thread-safe multi-hook Registry + a process-wide global()
  • Thread-safe patching: stop-the-world freeze with IP and stack return-address relocation
  • Atomic batch apply: queue_enable / queue_disable (+ _all) and apply_queued
  • Hooking by symbol name via dlsym
  • Refuses to hook a function shorter than the jump (PrologueTooShort)
  • Panic-safe C / C++ ABI bridge (staticlib + include/smilehook.h)
  • Relocation validated against a real-libc corpus plus differential fuzzing

Roadmap

  • Hooking by /proc/self/maps module + offset lookup
  • 32-bit x86 support
  • AArch64 support (the engine is ISA-specific)

Platform notes

  • x86-64 only for now; building on another architecture fails with a clear compile error.
  • Under SELinux in Enforcing mode, the executable trampoline mmap and the .text mprotect need execmem / execmod, which an unconfined_t process has by default. A confined process would need a policy adjustment.

License

Released under the MIT License.
