[rdma] occasional crash in butil::IOBuf::clear() #3252

@live4thee

Describe the bug

Spotted from log, as shown below:

PC: @                0x0 (unknown)
*** SIGSEGV (@0x4) received by PID 7 (TID 0x7f45f57bd6c0) from PID 4; stack trace: ***
    0x561a2f021526 google::(anonymous namespace)::FailureSignalHandler()
    0x7f45f8254050 (unknown)
    0x561a2ef11e9f butil::IOBuf::clear()
    0x561a2ed92a2e brpc::rdma::RdmaEndpoint::HandleCompletion()
    0x561a2ed92f33 brpc::rdma::RdmaEndpoint::PollCq()
    0x561a2ed56290 brpc::Socket::ProcessEvent()
    0x561a2eecffcf bthread::TaskGroup::task_runner()
    0x561a2eeea901 bthread_make_fcontext

In theory, the ref-counting in Socket::Address() should protect against a race on _sq_sent between WaitAndReset() and RdmaEndpoint::PollCq(), but the crash still happens occasionally. @yanglimingcn @chenBright, any clue?
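
To make the concern concrete, here is a toy model of it. This is not brpc code: `ToyEndpoint`, its members, and `LateCompletionIsSkipped()` are hypothetical stand-ins, and whether this is what actually happens in brpc is not established. The sketch only illustrates that holding a reference keeps an object alive but does not by itself stop its internal state from being reset before a late completion is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for an RDMA endpoint. Hypothetical, NOT brpc's RdmaEndpoint.
struct ToyEndpoint {
    std::vector<std::string> sbuf;  // stand-in for a ring of send buffers
    std::size_t sq_sent = 0;        // index of the next buffer to release

    explicit ToyEndpoint(std::size_t n) : sbuf(n, "payload") {}

    // Hypothetical completion handler: release the oldest in-flight buffer.
    // Returns false instead of touching memory when the ring has been reset.
    bool HandleCompletion() {
        if (sbuf.empty()) return false;  // state was torn down earlier
        sbuf[sq_sent % sbuf.size()].clear();
        ++sq_sent;
        return true;
    }

    // Hypothetical reset: drops the buffers while the object stays alive.
    void Reset() {
        sbuf.clear();
        sq_sent = 0;
    }
};

// Returns true if a completion that arrives after Reset() is detected and
// skipped, even though the reference count kept the object alive throughout.
bool LateCompletionIsSkipped() {
    auto ep = std::make_shared<ToyEndpoint>(4);
    std::shared_ptr<ToyEndpoint> poller_ref = ep;  // the "poller" holds a ref
    bool first = poller_ref->HandleCompletion();   // normal completion
    ep->Reset();                                   // owner resets the state
    bool late = poller_ref->HandleCompletion();    // object alive, state gone
    return first && !late && poller_ref.use_count() == 2;
}
```

In this single-threaded toy the guard in `HandleCompletion()` is enough. With real concurrency, the check and the buffer access would also need to be serialised against the reset, which the reference count alone does not provide.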

To Reproduce

unknown

Expected behavior

No crash.

Versions
OS: Debian 12
Compiler: g++ (Debian 14.2.0-19) 14.2.0
brpc: 1.14
protobuf:

Additional context/screenshots

n/a
