
ErrorResponse > 30000 bytes is misparsed as a pre-protocol plain-text error #1324

Description

@RaduBerinde

Summary

Since v1.11.0 (PR #1249, commit 0ad30496), recvMessage treats any ErrorResponse whose payload length exceeds MaxErrlen (30000) as a pre-protocol plain-text error. A legitimate v3 ErrorResponse larger than 30 KB is therefore never parsed into its fields — the caller gets a garbled string instead of the real error.

Affected code

conn.go (v1.12.3, ~line 1088):

// libpq checks "if ErrorResponse && (msgLength < 8 || msgLength > MAX_ERRLEN)" ...
if t == proto.ErrorResponse && (n < 4 || n > proto.MaxErrlen) {
    msg, _ := cn.buf.ReadString('\x00')
    return 0, fmt.Errorf("pq: server error: %s%s", string(x[1:]), strings.TrimSuffix(msg, "\x00"))
}

MaxErrlen = 30_000 (internal/proto/proto.go).

Why it's wrong

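The guard from PR #1249 is correct for its intended case: a backend that can't fork sends the plain text "Ecould not fork…" before the protocol handshake, so the leading E is read as an ErrorResponse type byte and the next four characters as a bogus length.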

But libpq only treats msgLength > 30000 as a pre-protocol error during connection establishment (fe-connect.c, PQconnectPoll, case CONNECTION_AWAITING_RESPONSE), where the comment notes it's coping with a pre-3.0-protocol server. In steady-state message parsing (fe-protocol3.c, pqParseInput3) there is also a 30000 sanity check, but ErrorResponse is explicitly exempt via VALID_LONG_MESSAGE_TYPE (alongside DataRow, CopyData, NoticeResponse, etc.):

#define VALID_LONG_MESSAGE_TYPE(id) \
    ((id) == PqMsg_CopyData || \
     (id) == PqMsg_DataRow || \
     (id) == PqMsg_ErrorResponse || \
     (id) == PqMsg_FunctionCallResponse || \
     (id) == PqMsg_NoticeResponse || \
     (id) == PqMsg_NotificationResponse || \
     (id) == PqMsg_RowDescription)

...

if (msgLength > 30000 && !VALID_LONG_MESSAGE_TYPE(id))
{
    handleSyncLoss(conn, id, msgLength);
    return;
}

So libpq deliberately allows large error messages once the protocol is established; lib/pq instead applies the startup-only cap to every ErrorResponse, mangling valid errors > 30 KB.
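
For comparison, the libpq steady-state check quoted above could be expressed in Go roughly as follows. This is only an illustrative sketch: validLongMessageType, syncLost and maxSaneLength are hypothetical names that do not exist in lib/pq, and the byte values are the standard v3 protocol message type codes.

```go
package main

import "fmt"

// maxSaneLength mirrors the hard-coded 30000 in libpq's pqParseInput3.
// The name is hypothetical; lib/pq calls its constant proto.MaxErrlen.
const maxSaneLength = 30000

// validLongMessageType is a hypothetical Go port of libpq's
// VALID_LONG_MESSAGE_TYPE macro: the backend message types that are
// allowed to exceed the sanity limit.
func validLongMessageType(id byte) bool {
	switch id {
	case 'd', // CopyData
		'D', // DataRow
		'E', // ErrorResponse
		'V', // FunctionCallResponse
		'N', // NoticeResponse
		'A', // NotificationResponse
		'T': // RowDescription
		return true
	}
	return false
}

// syncLost reports whether libpq would call handleSyncLoss for a message
// of the given type and length once the protocol is established.
func syncLost(id byte, msgLength int) bool {
	return msgLength > maxSaneLength && !validLongMessageType(id)
}

func main() {
	fmt.Println(syncLost('E', 40000)) // false: a large ErrorResponse is accepted
	fmt.Println(syncLost('Z', 40000)) // true: ReadyForQuery is not a long message type
}
```

The point of the comparison is that the type byte alone decides whether the 30000 limit applies; ErrorResponse is on the exempt list.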

The returned string is also nonsense: string(x[1:]) is the 4 raw big-endian length bytes, so the error reads like pq: server error: <binary length><start of body> (e.g. pq: server error: �NSERROR, where �N are length bytes and SERROR is the start of the Severity field).

Reproduction

Against any PostgreSQL server (no special setup):

db, _ := sql.Open("postgres", connStr)
// Raise an error whose message field alone is ~40 KB.
_, err := db.Exec(`DO $$ BEGIN RAISE EXCEPTION '%', repeat('x', 40000); END $$;`)
fmt.Println(err)
// v1.10.9:  pq: xxxxxxxx... (the real message)
// v1.11.0+: pq: server error: <garbage>   (fields never parsed)

We hit this in CockroachDB, whose schema-changer errors embed graphviz URLs and routinely exceed ~50 KB.

Suggested fix

Scope the > MaxErrlen heuristic to the connection-startup path only (mirroring libpq), or drop the upper bound in steady-state recvMessage and rely on the length prefix. The lower-bound check (< 8 in libpq, < 4 in lib/pq) and the OOM protection for the pre-handshake case can be preserved.
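
One possible shape for the first option is sketched below. It is not a patch against lib/pq: isPlainTextError and its startup flag are hypothetical, and n is assumed to have the same meaning as the n in the guard quoted under "Affected code".

```go
package main

import "fmt"

const (
	errorResponse = 'E'    // v3 ErrorResponse type byte
	maxErrlen     = 30_000 // same value as proto.MaxErrlen
)

// isPlainTextError reports whether a message header should be handled as a
// pre-protocol plain-text error instead of being parsed into fields.
// startup is a hypothetical flag that is true only while the connection is
// still waiting for the first response to the startup packet.
func isPlainTextError(t byte, n int, startup bool) bool {
	if t != errorResponse {
		return false
	}
	if n < 4 {
		// Lower bound kept in every state: a length this small cannot
		// describe a real ErrorResponse.
		return true
	}
	// Upper bound applied only during startup, mirroring libpq.
	return startup && n > maxErrlen
}

func main() {
	fmt.Println(isPlainTextError('E', 40000, true))  // true: startup heuristic still applies
	fmt.Println(isPlainTextError('E', 40000, false)) // false: parsed as a normal ErrorResponse
}
```

With this split, the OOM protection for garbage lengths before the handshake is unchanged, while a large ErrorResponse on an established connection falls through to the regular field parser.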

Affected versions: v1.11.0 – v1.12.3 (latest).
