S3 Select passthrough #15

@ServerSideHannes

Summary

Support S3 Select API requests, allowing clients to query CSV/JSON/Parquet objects without downloading the full object.

Problem

S3 Select (SelectObjectContent) lets clients run SQL-like queries on objects stored in S3. Currently, s3proxy does not handle this API: requests either fail or are forwarded as-is. Forwarding cannot work for encrypted objects, because the upstream would run the query against ciphertext.

Proposal

  • Intercept SelectObjectContent requests
  • Decrypt the object (or stream-decrypt for multipart)
  • Run the S3 Select query against the plaintext
  • Return results in the expected S3 Select response format
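To make the first and last steps concrete, here is a minimal sketch, not the s3proxy implementation. It assumes a SelectObjectContent call arrives as a `POST` with `select` and `select-type=2` in the query string, and that results are framed as AWS event-stream messages (prelude, prelude CRC32, headers, payload, message CRC32). The function names are hypothetical, and the framing details should be checked against the AWS documentation. A real response also carries `Stats` and optionally `Progress`/`Cont` events, which are omitted here.

```python
import struct
import zlib
from urllib.parse import parse_qs


def is_select_request(method: str, query_string: str) -> bool:
    """Heuristic check for SelectObjectContent: POST /key?select&select-type=2."""
    params = parse_qs(query_string, keep_blank_values=True)
    return (
        method.upper() == "POST"
        and "select" in params
        and params.get("select-type") == ["2"]
    )


def encode_header(name: str, value: str) -> bytes:
    """Encode one string-typed header (value type 7 is assumed to mean 'string')."""
    name_b = name.encode("utf-8")
    value_b = value.encode("utf-8")
    return (
        struct.pack("!B", len(name_b)) + name_b
        + struct.pack("!BH", 7, len(value_b)) + value_b
    )


def encode_message(headers: dict[str, str], payload: bytes = b"") -> bytes:
    """Frame headers and payload as a single event-stream message."""
    header_bytes = b"".join(encode_header(k, v) for k, v in headers.items())
    # 12-byte prelude (two lengths + prelude CRC) and a 4-byte trailing CRC.
    total_len = 12 + len(header_bytes) + len(payload) + 4
    prelude = struct.pack("!II", total_len, len(header_bytes))
    framed = prelude + struct.pack("!I", zlib.crc32(prelude)) + header_bytes + payload
    return framed + struct.pack("!I", zlib.crc32(framed))


def records_event(chunk: bytes) -> bytes:
    """Wrap a chunk of query output in a Records event."""
    return encode_message(
        {
            ":message-type": "event",
            ":event-type": "Records",
            ":content-type": "application/octet-stream",
        },
        chunk,
    )


def end_event() -> bytes:
    """Signal the end of the result stream."""
    return encode_message({":message-type": "event", ":event-type": "End"})
```

A handler would emit one or more `records_event(...)` messages followed by `end_event()`, which lets results be streamed rather than buffered.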

Considerations

  • May require a local query engine for CSV/JSON parsing (e.g., DuckDB or pandas)
  • Parquet support adds complexity
  • For large objects, need to balance memory usage with query performance
  • Could initially support only CSV/JSON and add Parquet later
  • Alternative: proxy to a local MinIO instance with the decrypted data
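As a rough illustration of the local-query-engine option for CSV, the sketch below loads the decrypted bytes into an in-memory SQLite table named `S3Object` and runs the client's expression against it. SQLite from the Python standard library is used here only as a stand-in for DuckDB or pandas, and `select_csv` is a hypothetical helper. The S3 Select SQL dialect is not identical to SQLite's, and this version holds the whole object in memory, so it does not address the large-object concern.

```python
import csv
import io
import sqlite3


def select_csv(plaintext: bytes, sql: str) -> bytes:
    """Run `sql` against decrypted CSV bytes (first row = header); return CSV output."""
    reader = csv.reader(io.StringIO(plaintext.decode("utf-8")))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    try:
        # Quote column names so arbitrary header text is a valid identifier.
        cols = ", ".join('"' + c.replace('"', '""') + '"' for c in header)
        conn.execute(f"CREATE TABLE S3Object ({cols})")
        placeholders = ", ".join("?" for _ in header)
        # All values are inserted as text; queries must CAST for numeric comparisons.
        conn.executemany(f"INSERT INTO S3Object VALUES ({placeholders})", reader)
        out = io.StringIO()
        csv.writer(out, lineterminator="\n").writerows(conn.execute(sql))
        return out.getvalue().encode("utf-8")
    finally:
        conn.close()


if __name__ == "__main__":
    data = b"name,age\nalice,34\nbob,27\n"
    query = "SELECT s.name FROM S3Object s WHERE CAST(s.age AS INTEGER) > 30"
    print(select_csv(data, query))
```

Naming the table `S3Object` lets typical S3 Select expressions of the form `SELECT ... FROM S3Object s` run unchanged for simple cases. Anything beyond that would need dialect translation or a different engine.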
