Skip to content

Performance: redundant LOWER() wrapping on encrypted attribute searches prevents index usage #14

@sl1depengwyn

Description

@sl1depengwyn

Problem

When PII encryption is enabled, Keycloak's JpaUserProvider applies SQL LOWER() to attribute value columns during searches. This is redundant for encrypted data — the plugin stores SHA-1 hashes that are already lowercase hex — but it prevents Postgres from using a standard B-tree index, causing full table scans.

Why LOWER() is redundant on encrypted values

LogicUtils.hash() produces the stored values:

public static String hash(String raw) {
    return raw != null ? DigestUtils.sha1Hex(raw.trim().toLowerCase()) : null;
}

DigestUtils.sha1Hex() always returns lowercase hex (40 chars, e.g. 6ea7bdc669b8926a75fe165989270ed025ac94dd). LOWER() on these values is a no-op.

Where LOWER() is applied (Keycloak core)

Keycloak's JpaUserProvider.predicates() (keycloak/keycloak@26.3.3) wraps searches with builder.lower():

  • firstName/lastName: builder.equal(builder.lower(root.get(key)), value.toLowerCase())
  • Custom attributes: builder.equal(builder.lower(attributesJoin.get("value")), value.toLowerCase())

How the plugin triggers it

EncryptedUserProvider.searchForUserStream() hashes the search values, then delegates to the parent:

attribute.setValue(LogicUtils.hash(attribute.getValue()));
// ...
return super.searchForUserStream(realm, attributes, firstResult, maxResults);

The parent applies LOWER() to both the column and the search value — both already lowercase — generating SQL like:

WHERE LOWER(ua.value) = LOWER('6ea7bdc669b8...')

Postgres cannot use a plain B-tree index on (name, value) for LOWER(value) expressions. It needs a functional index on (name, LOWER(value)).

Impact

On a table with ~212K users, attribute queries took ~2.6s without a functional index (full sequential scan). With a functional index on LOWER(value), they drop to ~0.2ms.

Suggested Fix

For the hash-based search path, implement a direct JPQL query with exact match (= :value) instead of delegating to super.searchForUserStream(). Since hashes are normalized to lowercase, LOWER() is unnecessary and the existing B-tree index on user_attribute(name, value) can serve the query directly.

Current Workaround

CREATE INDEX CONCURRENTLY idx_user_attribute_name_lower_value
ON user_attribute (name, lower(value));

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions