nrf52: fix LittleFS-in-loop() stack overflow via larger loop_task stack #41

Open
disq wants to merge 1 commit into weebl2000:dev_plus from disq:fix/nrf52-loop-stack-size

@disq disq commented May 31, 2026

Fix nRF52 stack overflow on LittleFS file opens from loop()

Problem

nRF52 repeaters crash / go silent after admin or session activity (reproducible on dev_plus with MESH_PACKET_LOGGING + MESH_DEBUG on a RAK3401). It's a stack overflow on the framework's Arduino loop_task:

  • The Adafruit_nRF52_Arduino framework runs loop() on loop_task with a 4KB stack (LOOP_STACK_SZ = 256*4).
  • the_mesh.loop() → ClientACL::saveSessionKeys → Adafruit_LittleFS::open() puts large lfs_dir/lfs_info structs on the stack; measured peak ~4.7KB overruns the 4KB chunk and corrupts the heap object directly below it (observed: the RadioLib Module*), causing a hardfault/lockup.
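
The overrun described above is plain arithmetic, sketched here in C. The helper name and the 4800-byte figure (a round stand-in for the measured ~4.7KB peak) are illustrative assumptions and do not come from the firmware.

```c
#include <stddef.h>

/* FreeRTOS stack depths are given in words; on the 32-bit Cortex-M4 core of
 * the nRF52 one stack word is 4 bytes. */
#define STACK_WORD_BYTES 4u

/* Hypothetical helper: how many bytes a given peak stack usage overruns a
 * stack that is depth_words words deep. Returns 0 when the peak fits. */
static size_t stack_overrun_bytes(size_t peak_bytes, size_t depth_words)
{
    size_t capacity = depth_words * STACK_WORD_BYTES;
    return peak_bytes > capacity ? peak_bytes - capacity : 0;
}
```

Under these assumptions, `stack_overrun_bytes(4800, 256 * 4)` reports a 704-byte overrun of the default 1024-word stack, while `stack_overrun_bytes(4800, 2048)` reports none.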

The LittleFS v1.6 → v1.7.2 upgrade is the trigger: it enlarged the file-open path's stack footprint enough to cross the 4KB limit. That's why this reproduces on dev_plus (LittleFS 1.7.2) but not on dev (still 1.6).

This root cause is general to every nRF52 build that touches the filesystem from loop() — repeater, room server, and BLE/USB companion alike — not repeater-specific.

Fix

Give loop_task a bigger stack via a build flag, -D LOOP_STACK_SZ=2048 (in words, i.e. 8KB) set in [nrf52_base], covering all nRF52 examples at once with no per-example code.

Dependency / pin

Interim-pinned to disq/Adafruit_nRF52_Arduino@f019297 (= the current meshcore-patches 724e00a + the #ifndef guard). Once weebl2000/Adafruit_nRF52_Arduino#1 is merged and tagged, repin [nrf52_base] back to weebl2000/... at the new tag.
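
For readers unfamiliar with the pattern, the #ifndef guard presumably looks like the usual C idiom below. This is a sketch of the likely shape, not a copy of the framework patch; `loop_stack_bytes` and `stack_word_t` are hypothetical names added only to show the words-to-bytes conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Guarded default: a -D LOOP_STACK_SZ=... on the compiler command line takes
 * precedence; without it the original 256*4 = 1024 words applies. */
#ifndef LOOP_STACK_SZ
#define LOOP_STACK_SZ (256 * 4)
#endif

/* Stand-in for the FreeRTOS stack word type, which is 32 bits on Cortex-M4. */
typedef uint32_t stack_word_t;

/* Hypothetical helper: loop_task stack size in bytes. */
static size_t loop_stack_bytes(void)
{
    return (size_t)LOOP_STACK_SZ * sizeof(stack_word_t);
}
```

Compiled without any -D, `loop_stack_bytes()` yields 4096; compiled with -D LOOP_STACK_SZ=2048 it yields 8192. With an unconditional #define, the same -D would instead produce a macro redefinition warning and the in-source value would win.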

Validation

Built RAK_3401_repeater (the exact MESH_PACKET_LOGGING+MESH_DEBUG crash config) and Xiao_nrf52_companion_radio_ble against the pinned guarded framework: both build cleanly with no -Wmacro-redefined warning, confirming the 8KB override is honored across nRF52 examples including BLE companion. With the default (no -D), the framework stack is unchanged at 4KB.

On-hardware: flash a RAK3401/RAK4631 repeater and drive the previously-crashing workload (admin sync, neighbour discover, repeated LoRa status requests) for several minutes — expect no LoRa-silence / lockup. Optional gdb: walk the 0xa5a5a5a5 canary from loop_task's pxStack — expect the high-water mark well under 8KB.
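
The canary walk can be sketched in C as follows. It relies on FreeRTOS pre-filling a task's stack with 0xa5 bytes and on the Cortex-M stack growing downward, so the untouched words sit at the low end starting at pxStack. This is a host-side illustration over a plain array, not a gdb script; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Four 0xa5 fill bytes read back as one 32-bit word. */
#define STACK_FILL_WORD 0xa5a5a5a5u

/* Count the untouched words at the low end of the stack buffer, i.e. the
 * words that still hold the fill pattern. stack_low plays the role of
 * pxStack. */
static size_t unused_stack_words(const uint32_t *stack_low, size_t depth_words)
{
    size_t n = 0;
    while (n < depth_words && stack_low[n] == STACK_FILL_WORD) {
        n++;
    }
    return n;
}

/* Peak usage in bytes: everything above the last intact canary word. */
static size_t peak_stack_bytes(const uint32_t *stack_low, size_t depth_words)
{
    size_t used = depth_words - unused_stack_words(stack_low, depth_words);
    return used * sizeof(uint32_t);
}
```

On target, FreeRTOS's uxTaskGetStackHighWaterMark() reports the same minimum free space in words when that API is enabled in the FreeRTOS configuration; whether it is enabled in this framework build is not confirmed here.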

Supersedes

This replaces the per-example FreeRTOS worker-task approach (PR #40) with a single framework-level fix that covers all nRF52 builds.

Pin the framework to disq/Adafruit_nRF52_Arduino@f019297 (= weebl2000 meshcore-patches
724e00a plus an #ifndef guard around LOOP_STACK_SZ; PR weebl2000/Adafruit_nRF52_Arduino#1)
and set -D LOOP_STACK_SZ=2048 (words = 8KB) in [nrf52_base].

The framework default loop_task stack (256*4 = 1024 words = 4KB) is too small for
LittleFS file opens from loop() (e.g. ClientACL::saveSessionKeys after the LittleFS
v1.7.2 upgrade), which overflow it (~4.7KB peak) and corrupt the adjacent heap object,
causing a hardfault. The unconditional #define previously made -DLOOP_STACK_SZ
impossible to override; the guarded framework lets this flag take effect across every
nRF52 build, including BLE companion.

Repin to weebl2000/Adafruit_nRF52_Arduino once meshcore-dev#1 is merged and tagged.

Validated: RAK_3401_repeater and Xiao_nrf52_companion_radio_ble build clean with no
-Wmacro-redefined, confirming the 8KB override is honoured.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 31, 2026 10:01