Skip to content

Commit 0acbbf7

Browse files
etrclaude
andcommitted
TASK-032: DR-008 thread-safety contract stress test
Adds test/integ/threadsafety_stress.cpp with two sub-tests: 1. concurrent_register_block_from_handlers_no_data_race — 16 curl clients hit an on_get lambda that re-enters the public webserver surface (register_path / unregister_path / block_ip / unblock_ip) for 60 s (default; HTTPSERVER_STRESS_SECONDS overrides). The TSan-clean rerun is the headline acceptance: the existing build-type=tsan matrix entry in verify-build.yml invokes make check, which auto-picks up the new check_PROGRAMS entry — no workflow edit required. Port 0 + get_bound_port() avoids collisions when the 60-s soak runs alongside other integ tests. 2. stop_from_handler_deadlocks_as_documented — opt-in (set HTTPSERVER_RUN_STOP_FROM_HANDLER=1) reproducer for the DR-008 negative case. Forks a child that calls stop() inside an on_get handler; on this libmicrohttpd, MHD detects pthread_join(self) returning EDEADLK and aborts with "Failed to join a thread." The fork contains the abort so the parent test binary stays healthy. Either a non-zero child exit or a 5-second timeout (child SIGKILLed by parent) counts as positive observation of the contract. Doxygen on webserver::stop() and ~webserver() now spells out the deadlock-or-abort consequence of calling stop() from a handler thread, pointing at DR-008 and §5.1. Test wall-clock: 60 s default, ~5 s minimum locally. Acceptance criteria all met: register_ok + unregister_ok > 0 and block_ok + unblock_ok > 0 in every observed run. Full make check -j1 passes 42/42. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2f7b4ea commit 0acbbf7

4 files changed

Lines changed: 420 additions & 9 deletions

File tree

specs/tasks/M5-routing-lifecycle/TASK-032.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,10 +8,10 @@
88
Verify the documented thread-safety contract: `webserver` public methods are reentrant from inside a handler, except `stop()` and `~webserver()` which deadlock by design.
99

1010
**Action Items:**
11-
- [ ] Write a stress test (`test/threadsafety_stress.cpp`) that runs N concurrent handlers, each randomly invoking `register_resource`, `block_ip`, `unblock_ip`, `unregister_resource` against the running `webserver`.
12-
- [ ] Run under ThreadSanitizer in CI; assert no data races.
13-
- [ ] Add a separate test that calls `stop()` from inside a handler thread and asserts deadlock-detection (or simply documents the timeout); skip the test by default in CI but make it runnable on demand to validate the contract.
14-
- [ ] Document the deadlock case in `webserver::stop()` Doxygen.
11+
- [x] Write a stress test (`test/integ/threadsafety_stress.cpp`) that runs 16 concurrent curl clients, each request randomly invoking `register_path` (the non-deprecated successor to `register_resource`), `unregister_path`, `block_ip`, `unblock_ip` against the running `webserver` from inside an on_get lambda handler. Default duration 60 s; override with `HTTPSERVER_STRESS_SECONDS=N`.
12+
- [x] Run under ThreadSanitizer in CI; the existing `build-type: tsan` matrix entry in `.github/workflows/verify-build.yml` invokes `make check`, which auto-picks up every new `check_PROGRAMS` entry — no workflow edit needed.
13+
- [x] Add a separate test (`stop_from_handler_deadlocks_as_documented`) that calls `stop()` from inside a handler thread; opt-in via `HTTPSERVER_RUN_STOP_FROM_HANDLER=1`. The test forks a child so the abort/deadlock does not crash the test binary; either a non-zero child exit (libmicrohttpd self-join abort) or a 5 s timeout counts as positive observation of the contract.
14+
- [x] Document the deadlock case in `webserver::stop()` and `~webserver()` Doxygen.
1515

1616
**Dependencies:**
1717
- Blocked by: TASK-027, TASK-031
@@ -26,4 +26,4 @@ Verify the documented thread-safety contract: `webserver` public methods are ree
2626
**Related Requirements:** PRD §2 NFR (concurrency)
2727
**Related Decisions:** DR-008, §5.1, §9 testing item 6, AR-006
2828

29-
**Status:** Not Started
29+
**Status:** Done

src/httpserver/webserver.hpp

Lines changed: 25 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -109,7 +109,15 @@ class webserver {
109109
// create_webserver. Callers must direct-init: webserver ws{cw};
110110
explicit webserver(const create_webserver& params);
111111
/**
112-
* Destructor of the class
112+
* Destructor.
113+
*
114+
* Calls stop() unconditionally, which joins libmicrohttpd's
115+
* worker threads. For the same reason as stop(), destroying a
116+
* webserver from inside a handler thread deadlocks (or, on some
117+
* libmicrohttpd versions, aborts with "Failed to join a thread.").
118+
* Destroy the webserver from the thread that constructed it.
119+
*
120+
* See specs/architecture/11-decisions/DR-008.md and §5.1.
113121
**/
114122
~webserver();
115123
// PIMPL-owned: copy/move would slice the backing impl object.
@@ -125,8 +133,22 @@ class webserver {
125133
**/
126134
bool start(bool blocking = false);
127135
/**
128-
* Method used to stop the webserver.
129-
* @return true if the webserver is stopped.
136+
* Stop the webserver.
137+
*
138+
* Joins libmicrohttpd's worker threads before returning. Safe to
139+
* call from any thread *except* a handler thread: stop() blocks
140+
* until every worker (including the calling one) drains, so a
141+
* call from inside a handler self-joins and deadlocks (or, on
142+
* some libmicrohttpd versions, aborts with "Failed to join a
143+
* thread."). This is the documented DR-008 contract — see
144+
* specs/architecture/11-decisions/DR-008.md and §5.1.
145+
*
146+
* For the same reason, ~webserver() (which calls stop())
147+
* deadlocks if it runs on a handler thread; destroy the
148+
* webserver from the thread that constructed it.
149+
*
150+
* @return true if the daemon was running and is now stopped;
151+
* false if it was already stopped.
130152
**/
131153
bool stop();
132154
/**

test/Makefile.am

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ LDADD += -lcurl
2626

2727
AM_CPPFLAGS = -I$(top_srcdir)/src -I$(top_srcdir)/src/httpserver/ -DHTTPSERVER_COMPILATION
2828
METASOURCES = AUTO
29-
check_PROGRAMS = basic file_upload http_utils threaded nodelay string_utilities http_endpoint ban_system ws_start_stop authentication deferred http_resource http_response create_webserver create_webserver_explicit new_response_types daemon_info uri_log feature_unavailable header_hygiene_iovec header_hygiene iovec_entry http_method constants body http_response_sbo http_response_factories webserver_pimpl http_request_pimpl create_test_request http_request_arena http_request_const_getters http_request_tls_accessors webserver_register_smartptr webserver_register_path_prefix webserver_on_methods webserver_route route_table lookup_pipeline route_table_concurrency routing_regression
29+
check_PROGRAMS = basic file_upload http_utils threaded nodelay string_utilities http_endpoint ban_system ws_start_stop authentication deferred http_resource http_response create_webserver create_webserver_explicit new_response_types daemon_info uri_log feature_unavailable header_hygiene_iovec header_hygiene iovec_entry http_method constants body http_response_sbo http_response_factories webserver_pimpl http_request_pimpl create_test_request http_request_arena http_request_const_getters http_request_tls_accessors webserver_register_smartptr webserver_register_path_prefix webserver_on_methods webserver_route route_table lookup_pipeline route_table_concurrency routing_regression threadsafety_stress
3030

3131
MOSTLYCLEANFILES = *.gcda *.gcno *.gcov
3232

@@ -233,6 +233,20 @@ lookup_pipeline_LDADD = $(LDADD) -lmicrohttpd
233233
route_table_concurrency_SOURCES = integ/route_table_concurrency.cpp
234234
route_table_concurrency_LDADD = $(LDADD) -lmicrohttpd
235235

236+
# threadsafety_stress: TASK-032. DR-008 multi-thread contract gate.
237+
# Spawns 16 curl clients hitting an on_get handler that re-enters the
238+
# public webserver surface (register_path / unregister_path / block_ip /
239+
# unblock_ip) for HTTPSERVER_STRESS_SECONDS (default 60). The TSan-clean
240+
# rerun is the headline acceptance; the CI matrix `build-type: tsan`
241+
# entry in verify-build.yml picks this binary up via `make check`
242+
# without further wiring. A second sub-test reproduces the documented
243+
# stop()-from-handler deadlock; it is skipped by default and runs only
244+
# when HTTPSERVER_RUN_STOP_FROM_HANDLER=1.
245+
# Default LDADD (libhttpserver + curl) is sufficient — this test
246+
# touches only public webserver APIs and does not need a direct
247+
# microhttpd link.
248+
threadsafety_stress_SOURCES = integ/threadsafety_stress.cpp
249+
236250
# routing_regression: TASK-028. v1 routing-test corpus regression gate.
237251
# Drives the public webserver registration surface (register_path /
238252
# register_prefix / on_get / on_post / unregister_path / unregister_prefix)

0 commit comments

Comments
 (0)