[Bug] BE SIGSEGV and wrong result with nested lambda over array_agg result in array_map/array_count #63824

@LubyRuffy

Search before asking

  • I had searched in the issues and found no similar issues.

Version

Affected: doris:be-4.0.0 through doris:be-4.1.1

Tested with: apache/doris:fe-4.1.1 / apache/doris:be-4.1.1

Runtime version reported by FE/BE:

doris-4.1.1-rc01-b10073ad9ca

Official source tag/commit matched from Docker BE log:

4.1.1
b10073ad9ca17cd5685c4dd3b3ef650f256376d0

Test environment:

Host architecture: arm64
Docker client: 28.5.2
Docker server: 29.5.2
FE image: apache/doris:fe-4.1.1, linux/arm64, sha256:318ab41551d884ded601366193d6115ffdf6471e78e28c572c3dfcfa99d2255e
BE image: apache/doris:be-4.1.1, linux/arm64, sha256:4905607a194641fb47284b616836766aca283ae85575491c799351a41deec60d

What's Wrong?

A minimal query using nested lambda expressions crashes Doris BE with SIGSEGV.

The query does not require any physical table or inserted data. It only builds a one-element array with array_agg('a'), then evaluates array_map(x -> array_count(y -> y = x, ids), ids).

Minimal crashing query:

WITH base AS (
  SELECT array_agg('a') AS ids
)
SELECT array_map(
  x -> array_count(y -> y = x, ids),
  ids
) AS result
FROM base;

Expected result:

[1]

Actual client error:

ERROR 1105 (HY000) at line 1: RpcException, msg: send fragments failed. io.grpc.StatusRuntimeException: UNAVAILABLE: io exception, host: 172.30.82.3

The BE process exits. docker ps -a shows:

doris-minrepro-be    Exited (0) 9 seconds ago    apache/doris:be-4.1.1

BE log shows the failing query and SIGSEGV:

*** Query id: f944e19b5ee14d86-bfd49529a73f9a6a ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1779956843 (unix time) try "date -d @1779956843" if you are using GNU date ***
*** Current BE git commitID: b10073ad9ca ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 762 (TID 1215 OR 0xfffde7f865c0) from PID 0; stack trace: ***

Relevant stack frames:

doris::signal::(anonymous namespace)::FailureSignalHandler
doris::is_column_const
doris::PreparedFunctionImpl::default_implementation_for_constant_arguments
doris::VectorizedFnCall::_do_execute
doris::VLambdaFunctionExpr::execute_column
doris::ArrayMapFunction::execute
doris::VectorizedFnCall::_do_execute
doris::VLambdaFunctionExpr::execute_column
doris::ArrayMapFunction::execute
doris::PipelineTask::execute

This is not only a crash. Nested lambda binding also appears semantically wrong even when the query does not crash:

SELECT array_map(x -> array_count(y -> y = x, ['a']), ['b']) AS should_be_zero;
SELECT array_map(x -> array_count(y -> y = x, ['a']), ['a']) AS should_be_one;

Expected:

should_be_zero
[0]
should_be_one
[1]

Actual on 4.1.1:

should_be_zero
[1]
should_be_one
[1]

So the root issue seems to be nested lambda scoping/capture, and the array_agg version turns the same scoping bug into an invalid column access and BE crash.

What You Expected?

The minimal crashing query should return [1] and should never crash BE.

Nested lambdas should resolve variables according to lexical lambda scope. For example:

SELECT array_map(x -> array_count(y -> y = x, ['a']), ['b']);

should return [0], because the outer lambda variable x is 'b', while the inner array contains only 'a'.
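
To make the expected semantics concrete, here is a minimal reference model in plain C++ of what array_map(x -> array_count(y -> y = x, inner), outer) should compute under ordinary lexical scoping. It is only a sketch: nested_count is a hypothetical helper name and has nothing to do with the Doris implementation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Reference model of
//   array_map(x -> array_count(y -> y = x, inner), outer)
// with lexical scoping: the inner lambda captures the outer variable x.
// nested_count is a hypothetical helper, not a Doris function.
std::vector<long> nested_count(const std::vector<std::string>& outer,
                               const std::vector<std::string>& inner) {
    std::vector<long> result;
    result.reserve(outer.size());
    for (const std::string& x : outer) {
        // y is bound once per inner element; x comes from the enclosing scope.
        long n = static_cast<long>(
                std::count_if(inner.begin(), inner.end(),
                              [&x](const std::string& y) { return y == x; }));
        result.push_back(n);
    }
    return result;
}
```

With outer = ['b'] and inner = ['a'] this model yields [0]; with both set to ['a'] it yields [1], which is the behaviour the queries above are expected to show.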

BE should also fail safely with a query error if an expression binding is invalid. A SQL expression should not be able to terminate the BE process.

How to Reproduce?

Run a minimal FE/BE Docker cluster:

docker rm -f doris-minrepro-fe doris-minrepro-be >/dev/null 2>&1 || true
docker network rm doris-minrepro >/dev/null 2>&1 || true

docker network create --subnet 172.30.82.0/24 doris-minrepro

docker run -d \
  --name doris-minrepro-fe \
  --network doris-minrepro \
  --ip 172.30.82.2 \
  -p 39030:9030 \
  -p 38030:8030 \
  -e FE_SERVERS=fe1:172.30.82.2:9010 \
  -e FE_ID=1 \
  apache/doris:fe-4.1.1

for i in $(seq 1 90); do
  if docker exec doris-minrepro-fe mysql -uroot -h127.0.0.1 -P9030 -e 'SHOW FRONTENDS'; then
    break
  fi
  sleep 2
done

docker run -d \
  --name doris-minrepro-be \
  --network doris-minrepro \
  --ip 172.30.82.3 \
  -p 38040:8040 \
  -e FE_MASTER_IP=172.30.82.2 \
  -e BE_IP=172.30.82.3 \
  -e BE_PORT=9050 \
  apache/doris:be-4.1.1

for i in $(seq 1 120); do
  docker exec doris-minrepro-fe mysql -uroot -h127.0.0.1 -P9030 -e 'SHOW BACKENDS' > /tmp/doris-minrepro-backends.out 2>&1 || true
  if grep -q 'true' /tmp/doris-minrepro-backends.out; then
    cat /tmp/doris-minrepro-backends.out
    break
  fi
  sleep 2
done

Run normal control queries first:

docker exec -i doris-minrepro-fe mysql -uroot -h127.0.0.1 -P9030 <<'SQL'
WITH base AS (SELECT array_agg('a') AS ids)
SELECT ids FROM base;

WITH base AS (SELECT array_agg('a') AS ids)
SELECT array_count(y -> y = 'a', ids) AS result FROM base;

WITH base AS (SELECT array_agg('a') AS ids)
SELECT array_map(x -> x, ids) AS result FROM base;

SELECT array_map(x -> array_count(y -> y = x, ['a']), ['a']) AS expected_result;
SQL

Expected and observed output:

ids
["a"]
result
1
result
["a"]
expected_result
[1]

Run the crashing query:

docker exec -i doris-minrepro-fe mysql -uroot -h127.0.0.1 -P9030 <<'SQL'
WITH base AS (
  SELECT array_agg('a') AS ids
)
SELECT array_map(
  x -> array_count(y -> y = x, ids),
  ids
) AS result
FROM base;
SQL

Observed:

ERROR 1105 (HY000) at line 1: RpcException, msg: send fragments failed. io.grpc.StatusRuntimeException: UNAVAILABLE: io exception, host: 172.30.82.3

Collect evidence:

docker ps -a --filter name=doris-minrepro-be --format '{{.Names}}\t{{.Status}}\t{{.Image}}'

docker logs doris-minrepro-be 2>&1 | \
  grep -En 'Query id|SIGSEGV|FailureSignalHandler|is_column_const|VectorizedFnCall|VLambdaFunctionExpr|ArrayMapFunction|PipelineTask'

rm -rf /tmp/doris-minrepro-log
docker cp doris-minrepro-be:/opt/apache-doris/be/log /tmp/doris-minrepro-log

Optional physical-table reproduction with one row:

CREATE DATABASE IF NOT EXISTS repro;
USE repro;

DROP TABLE IF EXISTS t_array_lambda;

CREATE TABLE t_array_lambda (
  es_id VARCHAR(8),
  fid VARCHAR(8)
)
DISTRIBUTED BY HASH(es_id) BUCKETS 1
PROPERTIES (
  "replication_num" = "1"
);

INSERT INTO t_array_lambda VALUES ('g', 'a');

WITH base AS (
  SELECT es_id, array_agg(fid) AS ids
  FROM t_array_lambda
  GROUP BY es_id
)
SELECT array_map(
  x -> array_count(y -> y = x, ids),
  ids
) AS result
FROM base;

Anything Else?

Source-level investigation

The Docker BE log reports commit b10073ad9ca, which matches the official 4.1.1 tag target:

git ls-remote https://github.com/apache/doris.git | grep 'refs/tags/4.1.1'
# 73552015e7587a04f857cb0257fc6e178958d389 refs/tags/4.1.1
# b10073ad9ca17cd5685c4dd3b3ef650f256376d0 refs/tags/4.1.1^{}

From source inspection, the likely root cause is in ArrayMapFunction's handling of VColumnRef gaps for nested lambdas.

ArrayMapFunction::execute collects slot refs from the lambda body, computes gap, then recursively applies that gap to all VColumnRef nodes under the body:

LambdaArgs args_info;
// collect used slot ref in lambda function body
std::vector<int>& output_slot_ref_indexs = args_info.output_slot_ref_indexs;
_collect_slot_ref_column_id(children[0], output_slot_ref_indexs);
int gap = 0;
if (!output_slot_ref_indexs.empty()) {
    auto max_id =
            std::max_element(output_slot_ref_indexs.begin(), output_slot_ref_indexs.end());
    gap = *max_id + 1;
    _set_column_ref_column_id(children[0], gap);
}

The recursive setter does not appear to stop at nested lambda boundaries:

void _set_column_ref_column_id(VExprSPtr expr, int gap) const {
    for (const auto& child : expr->children()) {
        if (child->is_column_ref()) {
            auto* ref = static_cast<VColumnRef*>(child.get());
            ref->set_gap(gap);
        } else {
            _set_column_ref_column_id(child, gap);
        }
    }
}

VColumnRef::set_gap only sets the gap once, so an inner lambda variable can inherit the outer lambda gap and cannot be corrected by the inner ArrayMapFunction execution:

void set_gap(int gap) {
    if (_gap == 0) {
        _gap = gap;
    }
}
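
The effect of the set-once guard can be illustrated with a small standalone model. This is a sketch under assumptions: ToyColumnRef and resolve_after_nested_binding are hypothetical names, the gap values are arbitrary, and the model only mirrors the guard quoted above, not the real VColumnRef class.

```cpp
#include <cstddef>

// Toy stand-in for the set-once behaviour quoted above.
// ToyColumnRef is a hypothetical type, not the Doris VColumnRef class.
struct ToyColumnRef {
    int column_id = 0;
    int gap = 0;

    void set_gap(int g) {
        if (gap == 0) {  // the first non-zero gap wins, later calls are ignored
            gap = g;
        }
    }

    // Block position this reference would read from.
    std::size_t position() const {
        return static_cast<std::size_t>(column_id + gap);
    }
};

// Simulates the outer lambda stamping its gap on a node first, followed by
// the inner lambda trying to apply its own gap to the same node.
std::size_t resolve_after_nested_binding(int column_id, int outer_gap, int inner_gap) {
    ToyColumnRef ref;
    ref.column_id = column_id;
    ref.set_gap(outer_gap);  // outer traversal reaches the inner variable
    ref.set_gap(inner_gap);  // inner binding is silently dropped if outer_gap != 0
    return ref.position();
}
```

In this model, once the outer traversal has written a non-zero gap, the inner binding can no longer move the reference to the position it actually needs.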

At execution time, VColumnRef reads the column by column_id + gap:

Status execute_column(VExprContext* context, const Block* block, Selector* selector,
                      size_t count, ColumnPtr& result_column) const override {
    DCHECK(_open_finished || block == nullptr);
    auto origin_column = block->get_by_position(_column_id + _gap).column;
    result_column = filter_column_with_selector(origin_column, selector, count);
    return Status::OK();
}
DataTypePtr execute_type(const Block* block) const override {
    DCHECK(_open_finished || block == nullptr);
    return block->get_by_position(_column_id + _gap).type;
}

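The const overload of Block::get_by_position, which is the one reached through the const Block* above, has no bounds check at all. The non-const overload only has a DCHECK, which is normally compiled out of release builds: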

ColumnWithTypeAndName& get_by_position(size_t position) {
    DCHECK(data.size() > position)
            << ", data.size()=" << data.size() << ", position=" << position;
    return data[position];
}
const ColumnWithTypeAndName& get_by_position(size_t position) const { return data[position]; }
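
For comparison, a bounds-checked lookup could look roughly like the sketch below. ToyBlock, checked_get and resolve_column are hypothetical names used only for illustration, and the bool return stands in for whatever error type the real code would use.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of a block: just a list of column names.
struct ToyBlock {
    std::vector<std::string> columns;

    // Checked lookup: returns false instead of reading out of range.
    bool checked_get(std::size_t position, std::string* out) const {
        if (position >= columns.size()) {
            return false;
        }
        *out = columns[position];
        return true;
    }
};

// Resolves column_id + gap against the block and refuses invalid positions,
// so a bad binding becomes a reportable error rather than undefined behaviour.
bool resolve_column(const ToyBlock& block, int column_id, int gap, std::string* out) {
    if (column_id < 0 || gap < 0) {
        return false;
    }
    std::size_t position =
            static_cast<std::size_t>(column_id) + static_cast<std::size_t>(gap);
    return block.checked_get(position, out);
}
```

A guard like this would only turn the crash into a query error. It would not correct a reference that points at a valid but wrong column.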

The wrong/out-of-range ColumnPtr is then dereferenced in the constant-argument path:

  • PreparedFunctionImpl::default_implementation_for_constant_arguments:
    Status PreparedFunctionImpl::default_implementation_for_constant_arguments(
            FunctionContext* context, Block& block, const ColumnNumbers& args, uint32_t result,
            size_t input_rows_count, bool* executed) const {
        *executed = false;
        ColumnNumbers args_expect_const = get_arguments_that_are_always_constant();
        // Check that these arguments are really constant.
        for (auto arg_num : args_expect_const) {
            if (arg_num < args.size() &&
                !is_column_const(*block.get_by_position(args[arg_num]).column)) {
                return Status::InvalidArgument("Argument at index {} for function {} must be constant",
                                               arg_num, get_name());
            }
        }
        if (args.empty() || !use_default_implementation_for_constants() ||
            !VectorizedUtils::all_arguments_are_constant(block, args)) {
            return Status::OK();
        }
  • VectorizedUtils::all_arguments_are_constant:
    static bool all_arguments_are_constant(const Block& block, const ColumnNumbers& args) {
        for (const auto& arg : args) {
            if (!is_column_const(*block.get_by_position(arg).column)) {
                return false;
            }
        }
        return true;
    }
  • is_column_const:
    bool is_column_const(const IColumn& column) {
        return is_column<ColumnConst>(column);
    }

This matches the observed crash stack.

Suggested fix direction

I think the fix should be in lambda variable scoping, not just in the crash site.

  1. Make _set_column_ref_column_id and _collect_slot_ref_column_id scope-aware.

    • When traversing the current lambda body, do not recurse into nested LAMBDA_FUNCTION_EXPR / nested lambda bodies as if they belonged to the same lambda scope.
    • Inner lambda parameters should be assigned/resolved by the inner lambda execution context only.
  2. Avoid storing mutable execution-specific gap state directly on shared VColumnRef nodes when nested lambda expressions can reuse the same expression tree.

    • Passing the gap through an execution context, or cloning/rebinding the relevant lambda body per scope, would be safer than mutating VColumnRef::_gap globally.
  3. Add a runtime guard in VColumnRef::execute_column or use safe_get_by_position before dereferencing.

    • This would turn the current process crash into a query error.
    • However, this is only a safety net. It would not fix the wrong-result case shown above.
  4. Add regression tests for both cases:

-- Should not crash; should return [1]
WITH base AS (
  SELECT array_agg('a') AS ids
)
SELECT array_map(
  x -> array_count(y -> y = x, ids),
  ids
) AS result
FROM base;

-- Should return [0], not [1]
SELECT array_map(x -> array_count(y -> y = x, ['a']), ['b']) AS should_be_zero;
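
As a rough illustration of point 1, a scope-aware traversal could be sketched as follows. Everything here is hypothetical: Kind, Node, make_node and set_gap_in_scope are toy names rather than Doris types, and the sketch only shows the boundary check. Outer variables captured inside a nested lambda body (such as x in y = x) would still need their own handling, for example via the per-scope binding suggested in point 2.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical miniature expression tree; none of these are Doris types.
enum class Kind { Call, ColumnRef, Lambda };

struct Node {
    Kind kind = Kind::Call;
    int gap = 0;
    std::vector<std::shared_ptr<Node>> children;
};

std::shared_ptr<Node> make_node(Kind kind,
                                std::vector<std::shared_ptr<Node>> children = {}) {
    auto node = std::make_shared<Node>();
    node->kind = kind;
    node->children = std::move(children);
    return node;
}

// Applies gap to column refs that belong to the current lambda scope only.
void set_gap_in_scope(const std::shared_ptr<Node>& expr, int gap) {
    for (const auto& child : expr->children) {
        if (child->kind == Kind::ColumnRef) {
            child->gap = gap;
        } else if (child->kind == Kind::Lambda) {
            continue;  // nested lambda: left for its own execution to bind
        } else {
            set_gap_in_scope(child, gap);
        }
    }
}
```

The only difference from the recursive setter quoted earlier is the early stop at a nested lambda node, which keeps the outer gap from being stamped onto inner lambda parameters.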

Production context

The production query that first exposed the bug was an account-behavior aggregation using this pattern:

WITH base AS (
  SELECT
    es_id,
    ARRAY_AGG(fid) AS ids
  FROM query_log
  WHERE created_at >= '2026-05-18 15:26:00'
    AND created_at <  '2026-05-18 15:36:00'
  GROUP BY es_id
)
SELECT
  COUNT(*) AS groups_count,
  SUM(ARRAY_SIZE(array_map(
    x -> array_count(y -> y = x, ids),
    array_distinct(ids)
  ))) AS bucket_count
FROM base;

The minimal reproduction above removes the production table, timestamp filter, grouping cardinality, and data volume from the equation. A single array_agg('a') is enough to reproduce the BE crash.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
