
Add core dump support to Docker Compose services#35

Open
RiteshKtiet wants to merge 4 commits into osdldbt:main from RiteshKtiet:add-core-dump-support

Conversation

@RiteshKtiet

Hey @markwkm, with reference to issue #34, I have implemented a minimal solution for core dump support:

Changes:

  • Add ulimits: core: -1 to broker, market, and driver services
  • Add ./cores:/cores volume mount to persist core dumps on host
  • Add ulimit -c unlimited in startup scripts
  • Update .gitignore and .dockerignore to exclude cores/
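
As a sketch, the first two changes might look like the following in `compose.yaml` (the `broker` service is one of the PR's services; the same block would apply to `market` and `driver`):

```yaml
services:
  broker:
    ulimits:
      core: -1            # -1 = no limit on core file size
    volumes:
      - ./cores:/cores    # persist dumps from the container to the host
```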

If you are using WSL, you'll need to configure the host:

sudo sysctl -w kernel.core_pattern="$(pwd)/cores/core.%e.%p.%t"

Another option is using privileged: true, but I suspect that would be less secure.

Test:

docker compose exec broker bash -c 'cat > /tmp/crash.c << "EOF"
#include <stdio.h>
int main() { int *p = NULL; *p = 42; return 0; }
EOF
gcc -g /tmp/crash.c -o /tmp/crash && /tmp/crash'

ls -lh cores/  # Should show core dump file

I hope this serves the purpose. Should I add documentation if this approach is acceptable?

- Enable unlimited core dumps via ulimits in broker, market, and driver services
- Add volume mount for ./cores directory to persist core dumps on host
- Add ulimit -c unlimited in service startup scripts
- Update .gitignore to exclude cores/ directory
- Update .dockerignore to exclude cores/ directory from build context
- Add ulimit configuration to base image bashrc

This allows automatic core dump generation when services crash.

Host configuration may be required on systems with systemd-coredump
or other crash handlers. Will add it in the docs later on.
@markwkm
Contributor

markwkm commented Mar 14, 2026

I hope this serves the purpose. Should I add documentation if this approach is acceptable?

Two items to work on. You should add documentation; no need to ask if it's acceptable. Also, I don't think this takes into account my comment in #34 about the host and guest OS matching.

@Riteshk1314

Hey @markwkm, firstly sorry for the delay. I have made a few changes which I feel will address all the issues.

When a binary crashes inside a container, core dumps are automatically captured and persisted to ./cores/ on the host. Both container-side and local (host-side) debugging are supported, even when the host and container run different distros.

Here are the changes I have made in the Docker files:

Key Changes

| Change | Why |
| --- | --- |
| `gdb` + `libc6-dbg` added to base image | Provides a debugger and glibc debug symbols inside the container. |
| `-DCMAKE_BUILD_TYPE=RelWithDebInfo` on all services | Embeds debug symbols so GDB can show function names, source files, and line numbers instead of raw addresses. |
| `ulimits: core: -1` on broker, market, load, driver | Docker defaults to `ulimit -c 0` (discard core dumps). This removes that limit. |
| `./cores:/cores` volume mount | Persists core dumps to the host so they survive container restarts. |
| `cd /cores && ulimit -c unlimited` before each `exec` | Sets the working directory so `core_pattern=core.%e.%p` writes dumps into the mounted volume, and opts the shell into generating dumps. |
| Binary + shared library copy into `./cores/sysroot/` | Enables local debugging by providing the exact binary and container libraries to GDB on the host. |

The database service is intentionally excluded because PostgreSQL manages its own crash handling via docker-entrypoint.sh.
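
As a minimal sketch (not the PR's exact script), the per-service startup sequence described above looks like this; `SERVICE_BIN` is a placeholder I introduced so the sketch runs anywhere, whereas the PR's services exec binaries like `/opt/egen/bin/BrokerageHouseMain`:

```shell
#!/bin/sh
# Sketch of the startup sequence: change into the dump directory,
# lift the core-size limit for this shell, then exec the service.
SERVICE_BIN="${SERVICE_BIN:-/bin/true}"  # stand-in for the real service binary
mkdir -p /tmp/cores && cd /tmp/cores     # the real services cd into /cores
ulimit -c unlimited                      # remove the soft core-size limit
ulimit -c                                # shows the new limit ("unlimited" if the hard limit allows)
exec "$SERVICE_BIN"                      # hand the process over to the service
```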

How the Host/Guest OS Mismatch Is Solved

A core dump is just a memory snapshot. To read it, GDB needs the exact binary and the exact shared libraries that were loaded when the crash happened. If the host runs a different distro, say Ubuntu 24.04, while the container runs Debian Bookworm, the library versions differ (libc, libssl, libpq, etc.). Loading the wrong libraries produces garbage backtraces like:

#0  two_way_long_needle () at str-two-way.h:438
#1  init () at fmtmsg.c:266
Backtrace stopped: corrupt stack?

We solve this by:

  1. Copying the binary to ./cores/ at service startup.
  2. Copying every shared library (found via ldd) into ./cores/sysroot/, preserving the container's full directory layout:
    ./cores/sysroot/
    ├── lib/x86_64-linux-gnu/
    │   ├── libc.so.6          (container's version)
    │   ├── libstdc++.so.6
    │   └── ...
    ├── usr/lib/x86_64-linux-gnu/
    │   └── libpq.so.5
    └── opt/egen/bin/
        └── BrokerageHouseMain
    
  3. Using set sysroot when running GDB locally. This tells GDB to prefix all library paths with the sysroot directory, so it loads ./cores/sysroot/lib/x86_64-linux-gnu/libc.so.6 instead of the host's /lib/x86_64-linux-gnu/libc.so.6.
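
Steps 1 and 2 could be sketched as a small script like the one below; `/bin/ls` stands in for the real service binary so the sketch is self-contained, and it assumes a standard `ldd` whose output contains absolute library paths:

```shell
# Sketch of the sysroot copy (steps 1-2 above). BIN is a stand-in;
# the PR copies binaries like /opt/egen/bin/BrokerageHouseMain.
BIN=/bin/ls
DEST=./cores/sysroot

mkdir -p "$DEST"
cp "$BIN" ./cores/

# ldd lines look like: "libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)"
# Keep every absolute path, then copy it preserving the directory layout.
ldd "$BIN" | awk '{for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i}' |
while read -r lib; do
  mkdir -p "$DEST$(dirname "$lib")"
  cp "$lib" "$DEST$lib"
done
```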

Thus, a clean backtrace regardless of host distro:

#0  ?? () from ./sysroot/lib/x86_64-linux-gnu/libc.so.6
#1  raise () from ./sysroot/lib/x86_64-linux-gnu/libc.so.6
#2  abort () from ./sysroot/lib/x86_64-linux-gnu/libc.so.6
#3  main (argc=9, argv=0x7ffc9618c628) at BrokerageHouseMain.cpp:122

Usage

1. Host prerequisite (one-time)

sudo sysctl -w kernel.core_pattern=core.%e.%p
# Ubuntu only — disable Apport if it intercepts core dumps:
sudo systemctl disable apport && sudo systemctl stop apport
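
Note that sysctl -w only lasts until reboot. To persist the pattern (assuming a distro that reads /etc/sysctl.d/; the filename below is illustrative), the same setting can live in a config file:

```
# /etc/sysctl.d/60-core-pattern.conf (illustrative filename)
kernel.core_pattern = core.%e.%p
```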

2. Start services

docker compose up -d
docker compose run load           # first time only
docker compose run driver -d 120  # run workload

3. After a crash

ls ./cores/
# BrokerageHouseMain  core.BrokerageHouseM.483  sysroot/

4a. Debug inside the container (easier)

docker compose exec broker \
    gdb /opt/egen/bin/BrokerageHouseMain /cores/core.BrokerageHou.<pid>

4b. Debug locally on the host

sudo chmod 644 ./cores/core.BrokerageHou.<pid>
cd cores/
gdb -ex "set sysroot ./sysroot" \
    ./BrokerageHouseMain ./core.BrokerageHou.<pid>

5. Inside GDB

bt full                 # full backtrace with variable values
info threads            # list all threads
thread apply all bt     # backtrace for every thread

I hope this meets all your expectations (although I have a feeling this is a slightly complex solution). Let me know if I should make any changes, and I'll do them ASAP!

@markwkm
Contributor

markwkm commented Mar 22, 2026

Yeah, the complexity bothers me a little, because it doesn't look natural. i.e. Docker really isn't meant to be used like that. Yet I say that as someone not particularly Docker savvy.

I still wonder, though, if there is a way to fall through to coredumpctl to output or save any cores. While I like the sound of that idea, I don't know whether it's realistic to expect it to be doable.

@Riteshk1314

Thanks for the feedback! I have tried to simplify the approach a little.

On the coredumpctl question: I tested it, and here is how it works.
Since kernel.core_pattern is a kernel-level setting shared across all containers, crashes inside containers are captured automatically by the host's systemd-coredump. No special host setup needed.

Debugging workflow

When a container crashes, coredumpctl list on the host shows the crash with the correct executable name and the Docker control group confirming it came from the container.

coredumpctl info <PID> gives the signal and a partial backtrace, but only a partial one, because the binary lives inside the container, so the host cannot resolve debug symbols to get full function names and line numbers.

To get the full backtrace, extract the core with
sudo coredumpctl dump <PID> -o /tmp/core.file and debug inside the container where the binary and libs match the core exactly:

  1. List crashes: coredumpctl list
  2. Inspect signal and partial backtrace: coredumpctl info <PID>
  3. Extract the core: sudo coredumpctl dump <PID> -o /tmp/core.file
  4. Debug inside the container:
docker compose run --no-deps --rm \
  -v /tmp/core.file:/tmp/core.file \
  broker \
  gdb /opt/egen/bin/BrokerageHouseMain /tmp/core.file

Basically: “spin up a fresh broker container, bring the core file into it, and launch GDB where everything matches.”

This still uses Docker in a slightly “creative” way. Let me know if this bothers you, and I'll refactor the PR in that case.
Full workflow also documented as comments at the top of compose.yaml.

@markwkm
Contributor

markwkm commented Mar 24, 2026

It would be additionally helpful if you commented on your experiences about using the various methods you are proposing.

@Riteshk1314

Hey @markwkm, here's what I experienced testing the different approaches:

Sysroot copy (earlier iteration)

This worked when host and container distros didn't match: I copied the binary + shared libs (found via ldd) into ./cores/sysroot/, then used set sysroot in GDB on the host. I got clean backtraces, but honestly it felt hacky. You have to redo the copy every time you rebuild the image, and having a partial rootfs sitting next to your core files isn't great. Fine for a one-off debug session, but not something I'd want baked into a default compose setup.

coredumpctl + docker compose run (current approach)

This was a better experience. On my Ubuntu 22.04 setup, crashes inside containers just showed up in coredumpctl list automatically, since kernel.core_pattern is kernel-level. I did coredumpctl dump to extract the core, mounted it into the container, and GDB gave clean traces right away; no symbol-mismatch headaches, since the container has the exact binary and libs that produced the core.

One thing to note: on Ubuntu desktop, Apport likes to hijack core_pattern, so I had to disable that first. I noted it in the compose.yaml comments.

On WSL

This was the messiest of the three. WSL2 doesn't run systemd by default, so coredumpctl just isn't there. I had to manually set kernel.core_pattern to a writable path, and even then the behavior felt flaky compared to a native Linux host: cores sometimes didn't land where expected until I got the path and permissions just right. It works once you get past the setup, but it was definitely the least smooth experience.

Overall, method 2 was better; it piggybacks on what the host already does.
I'll be happy to work on any changes you suggest.
