fix: proper fd lifecycle management to eliminate fdsan crash #877

Open
tardyp wants to merge 1 commit into COVESA:master from tardyp:android_fix


tardyp commented May 26, 2026

We hit an fdsan crash while integrating dlt-daemon into an Android Automotive system.
We are not deep experts in the inner workings of dlt-daemon, but I think the AI-generated analysis is sound.
We could not reproduce the fdsan crash after applying the patch.

Root cause: DLT_CONNECTION_GATEWAY connections share a receiver pointer that is aliased into DltGatewayConnection.client.receiver. When dlt_connection_destroy() closed the fd, the gateway's client.sock still held the stale fd number. If the kernel reused that fd number (e.g., for a FILE*), subsequent gateway send() calls would trigger an Android fdsan abort.
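
To illustrate the fd-number reuse described above, here is a minimal sketch, not dlt-daemon code. `struct cached_sock` and `stale_fd_is_reused()` are hypothetical names standing in for the gateway's cached `client.sock`. It relies on the POSIX rule that `open()` returns the lowest-numbered unused descriptor, and assumes a single-threaded process where `/dev/null` exists.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a structure that caches a socket number. */
struct cached_sock {
    int sock;
};

/*
 * Returns 1 if the cached descriptor number ends up referring to an
 * unrelated file after the original was closed elsewhere, 0 if it does
 * not, and -1 on setup failure.
 */
int stale_fd_is_reused(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;

    /* Second holder of the same number; it is never told about the close. */
    struct cached_sock cache = { .sock = fd };

    /* The "destroy" path closes the descriptor without updating the cache. */
    close(fd);

    /* An unrelated open() now receives the lowest free number. */
    int other = open("/dev/null", O_RDONLY);
    if (other < 0)
        return -1;

    int reused = (other == cache.sock);
    close(other);
    return reused;
}
```

In this sketch any later I/O through `cache.sock` would act on the unrelated file, which is the kind of ownership confusion fdsan is designed to catch.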

Changes:

  • dlt_connection_destroy: Do NOT close fd for GATEWAY connections (they don't own it). Only detach the receiver pointer.
  • dlt_gateway_close_connection: New function that properly closes client.sock AND invalidates client.receiver.fd at all disconnect points.
  • Deferred destruction: Connections are now marked PENDING_DESTROY instead of being freed immediately. A sweep phase after the event loop iteration safely destroys them, preventing use-after-free when callbacks trigger their own connection's removal.
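
The ownership rule and the deferred-destruction sweep listed above could look roughly like the following. This is a simplified sketch under assumptions, not the code in this PR: `struct conn`, `owns_fd`, `conn_add()`, `conn_mark_for_destroy()` and `conn_sweep()` are hypothetical names, and only the PENDING_DESTROY state comes from the description.

```c
#include <stdlib.h>
#include <unistd.h>

enum conn_status { CONN_ACTIVE, CONN_PENDING_DESTROY };

/* Hypothetical connection record kept in a singly linked list. */
struct conn {
    int fd;
    int owns_fd;              /* 0 for connections that only borrow the fd */
    enum conn_status status;
    struct conn *next;
};

/* Close the descriptor once and invalidate the stored number. */
void conn_close_fd(struct conn *c)
{
    if (c->fd >= 0) {
        close(c->fd);
        c->fd = -1;
    }
}

/* Allocate a connection and push it onto the front of the list. */
struct conn *conn_add(struct conn **head, int fd, int owns_fd)
{
    struct conn *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->fd = fd;
    c->owns_fd = owns_fd;
    c->status = CONN_ACTIVE;
    c->next = *head;
    *head = c;
    return c;
}

/* Safe to call from inside a callback: nothing is freed here. */
void conn_mark_for_destroy(struct conn *c)
{
    c->status = CONN_PENDING_DESTROY;
}

/*
 * Run after an event loop iteration. Unlinks and frees every marked
 * connection, closing the fd only when the connection owns it.
 * Returns the number of connections destroyed.
 */
int conn_sweep(struct conn **head)
{
    int destroyed = 0;
    struct conn **pp = head;

    while (*pp != NULL) {
        struct conn *c = *pp;
        if (c->status == CONN_PENDING_DESTROY) {
            *pp = c->next;
            if (c->owns_fd)
                conn_close_fd(c);
            free(c);
            destroyed++;
        } else {
            pp = &c->next;
        }
    }
    return destroyed;
}
```

Because marking only flips a flag, a callback that removes its own connection keeps a valid pointer until the sweep runs, and a borrowed fd is left for its real owner to close.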

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Pierre Tardy <pierre.tardy@renault.com>