stream_sender: sendto ENOMEM in a tight loop on ESP32-S3 (v0.8.1-esp32) — 0 UDP frames ever leave the node
Summary
A fresh ESP32-S3 (8 MB, no peripherals attached) flashed with the v0.8.1-esp32 release assets enters a permanent sendto ENOMEM loop on the second CSI callback. The aggregator's source state never advances past esp32:offline. While diagnosing this I also found what looks like a separate phantom LD2410 detection on a floating UART — I'm filing both here since they were observed in the same run; happy to split into two issues if preferred.
Environment
NVS wiped before each test (esptool erase-region 0x9000 0x6000) and the node re-provisioned via the bundled provision.py.
What works
Wi-Fi associates, DHCP returns an IP, host↔ESP32 ICMP is healthy.
CSI capture itself is alive: callbacks fire at expected cadence with valid len and rssi fields.
The aggregator's UDP receiver works: injecting a hand-built UDP packet at 127.0.0.1:5005 from the host immediately promotes its source from simulated to esp32. So the failure is strictly on the node's egress path.
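The injection check on the aggregator side can be reproduced with a few lines of Python. This is a minimal sketch: the helper name send_test_datagram and the placeholder payload are assumptions, and the aggregator presumably only promotes its source when the payload matches its CSI frame format, which is not reproduced here.

```python
import socket


def send_test_datagram(host: str, port: int, payload: bytes) -> int:
    """Send one UDP datagram and return the number of bytes handed to the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Placeholder bytes only; substitute a frame in the aggregator's real format.
    sent = send_test_datagram("127.0.0.1", 5005, b"\x00" * 16)
    print(f"sent {sent} bytes")
```

Because the same receiver accepts host-originated datagrams, a node that never produces traffic on the wire points at the sender side rather than at the aggregator.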
What fails
Annotated boot log (timestamps in ms since reset; nothing else is talking on this LAN):
I (4124) main: Got IP: 192.168.0.19
I (4124) stream_sender: UDP sender initialized: 192.168.0.25:5005
I (4144) csi_collector: WiFi modem sleep disabled (WIFI_PS_NONE) for CSI capture
I (4154) wifi:ic_enable_sniffer
I (4154) csi_collector: Promiscuous mode enabled (MGMT-only, RuView#396)
I (4164) csi_collector: self-ping started -> 192.168.0.1 @50Hz (CSI OFDM source, fix #521/#954)
I (4184) ESPNOW: espnow [version: 2.0] init
I (4194) edge_proc: Initializing edge processing (tier=2, top_k=8, vital_interval=1000ms, ...)
I (4294) mmwave: Probing UART1 (TX=17, RX=18) for mmWave sensor...
I (4304) mmwave: Probing at 115200 baud (MR60BHA2)...
I (4494) csi_collector: CSI cb #1: len=128 rssi=-25 ch=10 ← first send succeeds (no log = OK)
I (5364) mmwave: Probing at 256000 baud (LD2410)...
I (5544) csi_collector: CSI cb #2: len=128 rssi=-25 ch=10
W (5544) stream_sender: sendto ENOMEM — backing off for 100 ms ← first failure on CSI cb #2
W (5544) csi_collector: sendto failed (fail #1)
I (5564) mmwave: Detected LD2410 at 256000 baud (caps=0x000c) ← (see "bug #2" below)
I (5564) mmwave: mmWave UART task started (type=LD2410)
W (5564) stream_sender: sendto suppressed (ENOMEM backoff, 1 dropped)
... (steady-state — every send either ENOMEMs or is suppressed)
The aggregator stays at {"source":"esp32:offline"} indefinitely; zero CSI frames reach it over the network even though L2/L3 is healthy.
Bug #1 — primary: permanent sendto ENOMEM from CSI cb #2 onward
The first stream_sender_send (on CSI cb #1) appears to succeed (no failure log). The very next one fails with ENOMEM and never recovers: every subsequent attempt either returns ENOMEM or is suppressed by the 100 ms backoff. The backoff appears to be shorter than the time the underlying pbuf/mbox pressure needs to clear, so the node stays stuck.
The 1050 ms gap between cb #1 and cb #2 is occupied by:
the 50 Hz self-ping to the gateway (csi_collector: self-ping started ... @50Hz) — that's ~52 UDP datagrams enqueued back-to-back into LWIP;
the MR60BHA2 UART probe at 115200 baud for ~1060 ms;
It looks like LWIP pbufs / WiFi dynamic TX buffers / UDP send mbox saturate during that 1 s and never drain. sdkconfig.defaults already mentions a sibling fix for an earlier ENOMEM (note above CONFIG_LWIP_UDP_RECVMBOX_SIZE=32 / CONFIG_LWIP_TCPIP_RECVMBOX_SIZE=64 / CONFIG_ESP_WIFI_DYNAMIC_TX_BUFFER_NUM=64), but on S3 those values don't appear sufficient — possibly because S3 + Sniffer + 50 Hz self-ping + ESPNOW competes harder for buffers than the C6 target the 0.6.7 build was verified against.
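The gap and ping-count figures can be re-derived from the two callback lines in the boot log. The sketch below assumes the ESP-IDF log line shape shown in the log; the helper names are made up for illustration.

```python
import re

# Timestamps are the millisecond values in parentheses in the ESP-IDF log lines.
LOG = """\
I (4494) csi_collector: CSI cb #1: len=128 rssi=-25 ch=10
I (5544) csi_collector: CSI cb #2: len=128 rssi=-25 ch=10
"""

CB_RE = re.compile(r"^[IWE] \((\d+)\) csi_collector: CSI cb #(\d+):", re.MULTILINE)


def callback_times(log: str) -> dict[int, int]:
    """Map CSI callback number -> timestamp in ms since reset."""
    return {int(n): int(ts) for ts, n in CB_RE.findall(log)}


def pings_in_gap(gap_ms: int, rate_hz: int) -> int:
    """Whole number of self-pings issued during gap_ms at rate_hz."""
    return gap_ms * rate_hz // 1000


times = callback_times(LOG)
gap_ms = times[2] - times[1]
print(gap_ms, pings_in_gap(gap_ms, 50))
```

With the timestamps above this gives a 1050 ms gap and 52 pings at 50 Hz, matching the estimate in the text.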
Possible fixes worth considering:
Drop the self-ping cadence (50 Hz → 10 Hz?) when the LD2410/mmWave or ESPNOW tasks are also TX-active.
Raise CONFIG_ESP_WIFI_DYNAMIC_TX_BUFFER_NUM / CONFIG_LWIP_TCPIP_RECVMBOX_SIZE further in the S3-specific sdkconfig overlays.
When stream_sender has been in ENOMEM backoff for >N consecutive cycles, exponentially extend the backoff (the current fixed 100 ms is too short) and emit a single warning instead of one per attempt.
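To make the third suggestion concrete, here is a host-side model of the backoff policy in Python. It is not firmware code: the class name, the grace parameter (standing in for N), and the 3200 ms cap are all assumptions chosen for illustration.

```python
class EnomemBackoff:
    """Host-side model of an exponential send backoff (all names hypothetical)."""

    def __init__(self, base_ms: int = 100, max_ms: int = 3200, grace: int = 3) -> None:
        self.base_ms = base_ms
        self.max_ms = max_ms
        self.grace = grace     # failures tolerated at the base delay ("N")
        self.failures = 0      # consecutive ENOMEM results
        self.retry_at_ms = 0   # sends before this time are suppressed

    def may_send(self, now_ms: int) -> bool:
        """True when the current backoff window has expired."""
        return now_ms >= self.retry_at_ms

    def on_result(self, now_ms: int, enomem: bool) -> bool:
        """Record a send result; return True when a warning should be logged."""
        if not enomem:
            self.failures = 0
            self.retry_at_ms = 0
            return False
        self.failures += 1
        # Keep the base delay for the first `grace` failures, then double.
        exponent = min(max(0, self.failures - self.grace), 16)
        delay = min(self.base_ms << exponent, self.max_ms)
        self.retry_at_ms = now_ms + delay
        return self.failures == 1  # warn once per failure streak
```

With base_ms=100, grace=2 and max_ms=800, consecutive failures yield delays of 100, 100, 200, 400, 800, 800 ms, and only the first failure of a streak asks for a warning. Whether a longer delay actually lets the buffers drain on the S3 would still need to be confirmed on hardware.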
Bug #2 — secondary: phantom LD2410 detection on a floating UART
With no mmWave sensor wired to UART1 (TX=17, RX=18), the firmware still concludes Detected LD2410 at 256000 baud (caps=0x000c) and spawns the LD2410 reader task. The v0.8.1-esp32 release notes specifically called out a fix for "false MR60BHA2 detection → ENOMEM by requiring validated sensor headers instead of accepting bare byte patterns" — the LD2410 path looks like it still accepts loose patterns and so trips on floating-pin noise at 256000 baud.
This isn't the trigger of Bug #1 (the timing rules it out — first ENOMEM at 5544 ms, LD2410 declared at 5564 ms), but the resulting mmWave UART task adds steady load to a system that's already in a fragile buffer state.
Suggested fix: gate mmwave: Detected LD2410 on a validated frame header (length + checksum + magic), matching what was done for MR60BHA2 in v0.8.1.
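A rough Python model of such a gate is sketched below. It is based on the publicly documented LD2410 report-frame layout as I understand it (F4 F3 F2 F1 header, 2-byte little-endian length, payload, F8 F7 F6 F5 tail); to my knowledge that protocol carries no checksum, so the inner 0xAA / 0x55 0x00 markers are checked instead. The function name and the MAX_PAYLOAD bound are assumptions, and the firmware's actual parser is not shown in this issue.

```python
LD2410_HEADER = bytes.fromhex("f4f3f2f1")
LD2410_TAIL = bytes.fromhex("f8f7f6f5")
MAX_PAYLOAD = 64  # assumed sanity bound, not taken from the firmware


def is_valid_ld2410_report(buf: bytes) -> bool:
    """Return True only if buf is exactly one well-formed report frame."""
    if len(buf) < 10 or buf[:4] != LD2410_HEADER:
        return False
    length = int.from_bytes(buf[4:6], "little")
    if not 4 <= length <= MAX_PAYLOAD or len(buf) != 6 + length + 4:
        return False
    payload = buf[6:6 + length]
    # Payload is expected to be: type byte (0x01 or 0x02), 0xAA, data, 0x55, 0x00.
    if payload[0] not in (0x01, 0x02) or payload[1] != 0xAA:
        return False
    if payload[-2:] != b"\x55\x00":
        return False
    return buf[6 + length:] == LD2410_TAIL
```

Random bytes from a floating RX pin are very unlikely to satisfy header, length, inner markers, and tail at the same time, which is the property the MR60BHA2 fix is described as relying on.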
What I tried
release_bins/s3-adr110/ in-tree bins — same ENOMEM loop.
release_bins/s3-fair-adr110/ in-tree bins — same.
Fresh download of v0.8.1-esp32 release assets — same.
esptool erase-region 0x9000 0x6000 to wipe NVS, then provision.py --reset --edge-tier 2 --target-ip <host> --target-port 5005 — same.
Confirmed Wi-Fi credentials, IP, gateway, and aggregator IP/port are correct (ping host↔ESP32 OK).
Confirmed the aggregator's UDP receiver works by sending a synthetic CSI packet from the host — source promoted to esp32 immediately, then back to esp32:offline after the synthetic stream stops.
Repro
Flash a bare ESP32-S3 (8 MB, no mmWave sensor connected) with the v0.8.1-esp32 release assets at 0x0 / 0x8000 / 0xf000 / 0x20000.
Watch ESP32 serial: the first stream_sender: sendto ENOMEM — backing off for 100 ms appears on CSI cb #2 and never goes away. A phantom mmwave: Detected LD2410 ... appears in the same window.
Watch GET /api/v1/status on the aggregator — stays esp32:offline indefinitely.
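The status check in the last step can be scripted. The sketch below assumes the endpoint returns JSON with a "source" field, as in the {"source":"esp32:offline"} value quoted earlier; the function names are hypothetical, and the aggregator's HTTP host and port are not stated in this issue, so the URL has to be supplied by the caller.

```python
import json
import time
import urllib.request


def source_from_status(body: bytes) -> str:
    """Extract the "source" field from a status response body."""
    return str(json.loads(body)["source"])


def wait_for_source(url: str, wanted: str, timeout_s: float = 30.0,
                    interval_s: float = 1.0) -> bool:
    """Poll url until the source equals wanted or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if source_from_status(resp.read()) == wanted:
                    return True
        except (OSError, ValueError, KeyError):
            pass  # aggregator unreachable or unexpected body; retry
        time.sleep(interval_s)
    return False
```

Calling wait_for_source with the aggregator's /api/v1/status URL and "esp32" would return False on the failing node and True once frames arrive.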
Release assets flashed: bootloader.bin, partition-table.bin, ota_data_initial.bin, esp32-csi-node-s3-8mb.bin at 0x0 / 0x8000 / 0xf000 / 0x20000.
Aggregator: ruvnet/wifi-densepose:latest in Docker, UDP 0.0.0.0:5005 mapped to host.
Provisioning: python3 provision.py --port <port> --chip esp32s3 --ssid <SSID> --password <pw> --target-ip <host> --target-port 5005 --edge-tier 2 --reset, with the aggregator at <host>:5005.
Also active in the gap between cb #1 and cb #2: the c6_espnow tx loop.
Happy to test patches on this board if useful.