Inline function hooking for Linux on x86-64, in the spirit of MinHook.
SmileHook redirects a function by overwriting its prologue with a jump to a
detour, while keeping the original reachable through a trampoline. It is an
in-process library — it patches code running in the current process — and is
written in safe-by-default Rust with every unsafe block justified. Injecting
code into another process is a separate concern (see the roadmap).
- 14-byte absolute jump (jmp [rip+0] + address): no ±2 GiB range limit.
- Faithful trampoline: stolen instructions are relocated with iced-x86, including rewriting relative branches that no longer reach their target.
- Thread-safe patching: every enable/disable freezes the other threads and relocates any caught instruction pointer and stack return address, so a concurrent caller never observes a half-written prologue.
- Atomic batch apply: flip many hooks under a single freeze.
- Hooks by address or by symbol name (via dlsym).
- Refuses unsafe targets: a function whose prologue is shorter than the jump is rejected instead of corrupting the next function.
- C / C++ ABI: link the static library and hook from a native payload.
Status: experimental, x86-64 Linux only. The engine is well tested (see Robustness), but treat it as a 0.x library.
use std::sync::atomic::{AtomicPtr, Ordering};

// Holds the trampoline so the detour can call the original.
static ORIGINAL: AtomicPtr<u8> = AtomicPtr::new(std::ptr::null_mut());

extern "C" fn add(a: i32, b: i32) -> i32 { a + b }

extern "C" fn add_detour(a: i32, b: i32) -> i32 {
    let original: extern "C" fn(i32, i32) -> i32 =
        unsafe { std::mem::transmute(ORIGINAL.load(Ordering::SeqCst)) };
    original(a, b) + 100
}

fn main() -> smilehook::Result<()> {
    let registry = smilehook::global();
    let target = add as *const () as *mut u8;
    // SAFETY: `add_detour` is ABI-compatible with `add`.
    let trampoline = unsafe { registry.create(target, add_detour as *const () as *const u8)? };
    ORIGINAL.store(trampoline as *mut u8, Ordering::SeqCst);
    unsafe { registry.enable(target)? }; // calls to `add` now route through the detour
    assert_eq!(add(1, 2), 103);
    unsafe { registry.disable(target)? }; // restore the original
    assert_eq!(add(1, 2), 3);
    Ok(())
}

A detour must be ABI-compatible with its target. The original is always
reached through the trampoline returned by create, never by calling the target
again (that would re-enter the detour).
- Hook: a single RAII hook. Disabled and freed automatically on drop. Use it for scoped, one-off hooks.
- Registry: a thread-safe collection of hooks keyed by target address. It rejects duplicates and supports bulk and batched operations. A process-wide instance is available through smilehook::global().
- Decode: iced-x86 decodes the target's prologue so that only whole instructions are stolen (at least 14 bytes).
- Relocate: those instructions are re-encoded into an executable page (mmap), followed by a jump back to target + N, where N is the number of stolen bytes. Calling the trampoline is equivalent to calling the original.
- Patch: the prologue is overwritten with a 14-byte absolute indirect jump: FF 25 00 00 00 00 (jmp qword ptr [rip + 0]) followed by the 8-byte detour address. Unlike an E9 rel32 jump, this form has no range limit.
- Freeze: every enable, disable, and drop happens under a stop-the-world freeze (see Safety and concurrency), so concurrent callers never execute a partially patched prologue.
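The byte layout from the Patch step can be sketched in a few lines. This is an illustration only: the helper name encode_abs_jump and the constant ABS_JMP_LEN are hypothetical and are not taken from the crate.

```rust
/// Patch length: a 6-byte `jmp qword ptr [rip + 0]` plus an 8-byte address.
const ABS_JMP_LEN: usize = 14;

/// Builds the 14 bytes that would be written over the prologue.
/// Hypothetical helper for illustration; not SmileHook's actual function.
fn encode_abs_jump(detour: u64) -> [u8; ABS_JMP_LEN] {
    let mut bytes = [0u8; ABS_JMP_LEN];
    // FF 25 00 00 00 00 is `jmp qword ptr [rip + 0]`. The displacement is
    // zero, so the destination is loaded from the 8 bytes that follow.
    bytes[..6].copy_from_slice(&[0xFF, 0x25, 0x00, 0x00, 0x00, 0x00]);
    // x86-64 is little-endian: the address is stored low byte first.
    bytes[6..].copy_from_slice(&detour.to_le_bytes());
    bytes
}
```

Because the destination is a full 64-bit value stored next to the instruction, the detour can live anywhere in the address space, which is what removes the ±2 GiB limit of a rel32 jump.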
iced-x86's BlockEncoder handles the subtle part: if a stolen relative
call/jmp can no longer reach its target from the trampoline, it is rewritten
as an indirect jump with the absolute address stored out of line. The jump back
is handed to the encoder as a block instruction (not appended by hand) so the
layout stays correct. Because the encoder does not report a usable offset for an
instruction it rewrites this way, SmileHook re-derives each relocated offset by
decoding the encoded trampoline, and records a boundary map (original offset
→ trampoline offset) used to fix up any thread frozen mid-prologue.
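To make the boundary map concrete, here is a minimal sketch of how such a map could translate a frozen thread's address. The type BoundaryMap, the function relocate_ip, and the offsets used with it are assumptions for illustration, not the crate's internal representation.

```rust
/// One entry per stolen instruction:
/// (offset in the original prologue, offset in the trampoline).
/// Hypothetical representation for illustration.
type BoundaryMap = Vec<(usize, usize)>;

/// If `ip` sits on an instruction boundary inside the stolen bytes, returns
/// the address of that instruction's relocated copy in the trampoline.
fn relocate_ip(
    target: usize,
    stolen_len: usize,
    trampoline: usize,
    map: &BoundaryMap,
    ip: usize,
) -> Option<usize> {
    // Addresses outside [target, target + stolen_len) are not affected.
    let offset = ip.checked_sub(target)?;
    if offset >= stolen_len {
        return None;
    }
    map.iter()
        .find(|&&(orig, _)| orig == offset)
        .map(|&(_, reloc)| trampoline + reloc)
}
```

In this sketch an instruction that grew during relocation simply shifts the trampoline offsets of the instructions after it, which is why the two columns of the map can diverge.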
symbol_address resolves an exported function through dlsym(RTLD_DEFAULT, …),
and Registry::create_by_symbol hooks it in one step — handy for intercepting
library functions such as glXSwapBuffers or vkQueuePresentKHR:
let trampoline = unsafe {
    smilehook::global().create_by_symbol("strlen", my_strlen as *const () as *const u8)?
};

enable_all / disable_all freeze the process once per hook. To flip many hooks
together and pay for one stop-the-world instead of N, queue the changes and
apply them in a single freeze (the analogue of MinHook's MH_ApplyQueued):
let reg = smilehook::global();
reg.queue_enable(target_a)?;
reg.queue_disable(target_b)?; // mixed enables and disables are fine
unsafe { reg.apply_queued()? }; // all take effect under one freeze

queue_* only update bookkeeping (so they are safe); the live patch happens in
apply_queued. Hooks already in their requested state are skipped.
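The queue-then-apply pattern can be modelled with a toy registry that tracks state but patches nothing. Everything here (ToyRegistry, its fields, and the rule that an empty batch skips the freeze) is an assumption made for the sketch, not SmileHook's implementation.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the registry: it tracks state only and patches nothing.
#[derive(Default)]
struct ToyRegistry {
    /// target address -> (currently enabled, queued desired state)
    hooks: BTreeMap<usize, (bool, Option<bool>)>,
    /// How many simulated stop-the-world freezes have happened.
    freezes: usize,
}

impl ToyRegistry {
    fn create(&mut self, target: usize) {
        self.hooks.insert(target, (false, None));
    }

    /// Bookkeeping only: records the desired state.
    /// Returns false for an unknown target.
    fn queue(&mut self, target: usize, enable: bool) -> bool {
        match self.hooks.get_mut(&target) {
            Some(entry) => {
                entry.1 = Some(enable);
                true
            }
            None => false,
        }
    }

    /// Applies every queued change and returns how many hooks flipped.
    fn apply_queued(&mut self) -> usize {
        let mut flipped = 0;
        for (enabled, queued) in self.hooks.values_mut() {
            // take() clears the queue entry whether or not a flip is needed.
            if let Some(want) = queued.take() {
                if want != *enabled {
                    *enabled = want; // a real engine would patch here
                    flipped += 1;
                }
            }
        }
        // Assumption of this sketch: one freeze covers the whole batch, and
        // a batch with nothing to flip does not freeze at all.
        if flipped > 0 {
            self.freezes += 1;
        }
        flipped
    }
}
```

Queueing a disable for a hook that is already disabled leaves it untouched, which mirrors the "already in their requested state are skipped" rule described above.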
SmileHook also builds as a static library exposing a C ABI that mirrors
MinHook's, so a native payload (for example a C++ overlay LD_PRELOADed into a
target) can install hooks without writing any Rust. The crate's [lib] type
includes staticlib, so a release build emits target/release/libsmilehook.a;
the matching declarations are in include/smilehook.h,
and a runnable example is in examples/c_api.c.
#include "smilehook.h"

typedef int (*add_fn)(int, int);
static add_fn original;

int add_detour(int a, int b) { return original(a, b) + 100; }

void install(void *add) {
    sh_create_hook(add, (const void *)add_detour, (void **)&original);
    sh_enable_hook(add); /* calls to `add` now route through the detour */
}

cargo build --release    # -> target/release/libsmilehook.a
cc -O0 -Iinclude examples/c_api.c target/release/libsmilehook.a \
    -lpthread -ldl -lm -o c_api && ./c_api

Every sh_* call returns a status code (SH_OK, or a negative SH_ERR_*);
sh_strerror turns one into a static message. The entry points catch any
internal Rust panic and return SH_ERR_OTHER rather than unwinding across the
boundary, so a panic in the hooking engine cannot abort the host process.
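The panic-to-status conversion described above is commonly written with std::panic::catch_unwind. The sketch below shows the general shape only: the numeric values of SH_OK and SH_ERR_OTHER are assumptions (the real ones are defined in include/smilehook.h), and guard and demo_entry are hypothetical names.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

// Assumed values for this sketch; the real codes live in include/smilehook.h.
const SH_OK: i32 = 0;
const SH_ERR_OTHER: i32 = -1;

/// Runs `body` and converts both errors and panics into a status code.
fn guard<F: FnOnce() -> Result<(), i32>>(body: F) -> i32 {
    match catch_unwind(AssertUnwindSafe(body)) {
        Ok(Ok(())) => SH_OK,
        Ok(Err(code)) => code,
        // The panic payload is dropped; the caller only sees a status code.
        Err(_) => SH_ERR_OTHER,
    }
}

/// Hypothetical entry point shaped like an sh_* function.
extern "C" fn demo_entry(should_panic: bool) -> i32 {
    guard(|| {
        if should_panic {
            panic!("simulated engine bug");
        }
        Ok(())
    })
}
```

Note that catch_unwind only helps when the crate is built with the default unwinding panic strategy; with panic = "abort" the process terminates before the handler runs.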
src/
├── lib.rs crate root, docs, public re-exports
├── error.rs Error + Result
├── memory.rs ExecBuffer (executable mmap) and patch_code (mprotect)
├── arch/
│ ├── mod.rs ISA dispatch (x86-64 only; compile error otherwise)
│ └── x86_64.rs prologue decoding/relocation, jump generation, boundary map
├── freeze.rs stop-the-world thread suspension during a patch
├── symbol.rs symbol_address — resolve a function by name via dlsym
├── ffi.rs C ABI bridge (sh_* functions) for non-Rust payloads
├── hook.rs Hook — a single RAII hook
└── registry.rs Registry — thread-safe multi-hook registry + global()
include/
└── smilehook.h C/C++ header matching the ffi.rs bridge
ISA-specific logic (decode, relocate, jump emission) is isolated in
arch::x86_64; the rest of the crate is architecture-agnostic.
Installing a hook patches live, executable code, so the detour must be ABI-compatible with the target.
The 14-byte write is not atomic, and Linux has no in-process "suspend thread" primitive, so SmileHook makes patching safe against concurrent execution with a stop-the-world freeze. During every enable, disable, and drop the patching thread:
- enumerates the other threads via /proc/self/task and sends each a real-time signal (tgkill with SIGRTMIN+4);
- waits until all of them have parked inside an async-signal-safe handler (threads that exit before parking are dropped from the wait set);
- relocates any thread caught in the affected bytes, using the relocation boundary map: both its saved instruction pointer and any return addresses on its stack that point into the stolen prologue (a call in the prologue would otherwise return onto patched bytes);
- applies the patch and releases the threads.
A thread that blocks the freeze signal makes the operation fail with
Error::FreezeTimeout rather than hang. Registry operations are additionally
serialized by a mutex.
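The first step of the freeze, enumerating threads through /proc/self/task, needs only the standard library. This is a sketch of that one step under the assumption of a Linux procfs. The function name list_thread_ids is hypothetical, and the signalling and parking steps are not shown.

```rust
use std::fs;
use std::io;

/// Lists the kernel thread IDs of the current process (Linux only).
/// Hypothetical helper; it returns an error where /proc is unavailable.
fn list_thread_ids() -> io::Result<Vec<i32>> {
    let mut tids = Vec::new();
    for entry in fs::read_dir("/proc/self/task")? {
        let name = entry?.file_name();
        // Each directory name under /proc/self/task is a decimal thread ID.
        if let Some(tid) = name.to_str().and_then(|s| s.parse::<i32>().ok()) {
            tids.push(tid);
        }
    }
    tids.sort_unstable();
    Ok(tids)
}
```

A real freeze would skip the calling thread's own ID and must tolerate threads that appear or exit between the directory scan and the signal, which is why exited threads are dropped from the wait set.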
One obligation remains the caller's: do not remove or drop a hook while its detour may still be executing, since that frees the trampoline it calls into.
Two test suites exercise the relocation engine over many prologues, not just a handful:
- Corpus soundness runs the relocator over ~130 real glibc/pthread functions — vectorized string/memory ops, stdio, math, threading — and asserts every relocation is structurally and semantically sound (well-formed boundary map, no sentinel offsets, plain instructions copied byte-for-byte, a valid jump back) or is cleanly refused. It never patches the live functions, so it is safe to run over arbitrary code.
- Differential fuzzing hooks synthetic functions of varied prologue shapes (leaf, early call, large frame, branches, SSE, many locals) with thousands of random inputs each, checking that a transparent detour reproduces the original exactly and a transforming detour composes correctly.
Functions whose optimized prologue is shorter than the 14-byte jump (e.g. the
endbr64; mov; syscall; ret syscall wrappers) are refused with
Error::PrologueTooShort rather than allowed to overwrite the adjacent function.
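The refusal rule amounts to summing whole-instruction lengths until the jump fits. The sketch below takes pre-decoded instruction lengths as input rather than decoding real code. The names bytes_to_steal and StealError are hypothetical, and it assumes the length list stops at the end of the function.

```rust
/// Size of the absolute jump that must fit in the prologue.
const JUMP_LEN: usize = 14;

/// Hypothetical error type; the crate reports Error::PrologueTooShort.
#[derive(Debug, PartialEq)]
enum StealError {
    PrologueTooShort { available: usize },
}

/// Returns how many bytes to steal so that only whole instructions are taken.
/// `instr_lens` holds the decoded instruction lengths up to the function end.
fn bytes_to_steal(instr_lens: &[usize]) -> Result<usize, StealError> {
    let mut stolen = 0;
    for &len in instr_lens {
        stolen += len;
        if stolen >= JUMP_LEN {
            return Ok(stolen);
        }
    }
    // The function ended before 14 bytes were available: patching would
    // spill into whatever follows it, so refuse.
    Err(StealError::PrologueTooShort { available: stolen })
}
```

For a wrapper encoded as endbr64 (4 bytes), mov eax, imm32 (5), syscall (2), ret (1), the total is 12 bytes, so bytes_to_steal(&[4, 5, 2, 1]) is refused.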
No injector is required — the tests hook functions inside the test process.
cargo build --release # also emits target/release/libsmilehook.a
cargo test # hook, registry, symbol, ffi, thread-safety, fuzz
cargo test --release # the same, with optimized codegen
cargo run --example basic # multi-hook registry demo
cargo run --example diag     # disassemble a target prologue and its trampoline

tests/thread_safety.rs is a stress test: worker threads hammer a target in a
tight loop while the main thread toggles the hook hundreds of times (singly and
in atomic batches). Every observed return value must be either the original or
the hooked one — never a torn result.
- x86-64 inline hook (14-byte absolute jump)
- Trampoline with full instruction relocation and a boundary map for instruction-pointer / return-address fix-ups
- RAII Hook: enable / disable / restore on drop
- Thread-safe multi-hook Registry + a process-wide global()
- Thread-safe patching: stop-the-world freeze with IP and stack return-address relocation
- Atomic batch apply: queue_enable / queue_disable (+ _all) and apply_queued
- Hooking by symbol name via dlsym
- Refuses to hook a function shorter than the jump (PrologueTooShort)
- Panic-safe C / C++ ABI bridge (staticlib + include/smilehook.h)
- Relocation proven over a real-libc corpus plus differential fuzzing
- Hooking by /proc/self/maps module + offset lookup
- 32-bit x86 support
- AArch64 support (the engine is ISA-specific)

- x86-64 only for now; building on another architecture fails with a clear compile error.
- Under SELinux in Enforcing mode, the executable trampoline mmap and the .text mprotect need execmem / execmod, which an unconfined_t process has by default. A confined process would need a policy adjustment.
Released under the MIT License.