Add API so that Repository write functions only compute hash once #2516

@jonmeow

Description

Summary 💡

write_blob looks like it computes the hash twice. Could a variant of write_buf/write_stream be added that avoids the second hash computation?

To detail, on Repository, I believe write_blob will:

  1. Call compute_hash to produce a hash.
  2. Call exists with the hash for a possible early exit.
  3. Call write_buf, which calls write_stream, which again computes a hash.

The hashes computed by compute_hash (in write_blob) and write_stream should be identical, implying write_stream could avoid a hash computation and instead rely on the earlier value.
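
To make the double computation concrete, here is a minimal, self-contained sketch of the flow described above. It is a toy model, not the real gix code: `ToyStore`, `write_blob_twice`, `write_blob_once`, and `write_buf_with_id` are hypothetical names, and `DefaultHasher` stands in for the real object hash. The counter only illustrates how many times the payload gets hashed on each path.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Toy stand-in for an object store. NOT the real gix `Repository`.
pub struct ToyStore {
    objects: HashMap<u64, Vec<u8>>,
    /// Number of times a payload has been hashed so far.
    pub hash_calls: usize,
}

impl ToyStore {
    pub fn new() -> Self {
        ToyStore { objects: HashMap::new(), hash_calls: 0 }
    }

    /// Stand-in for `compute_hash`; `DefaultHasher` replaces the real object hash.
    fn compute_hash(&mut self, data: &[u8]) -> u64 {
        self.hash_calls += 1;
        let mut hasher = DefaultHasher::new();
        hasher.write(data);
        hasher.finish()
    }

    /// Stand-in for `exists`.
    fn exists(&self, id: u64) -> bool {
        self.objects.contains_key(&id)
    }

    /// Stand-in for `write_buf`/`write_stream`: hashes the data itself.
    fn write_buf(&mut self, data: &[u8]) -> u64 {
        let id = self.compute_hash(data);
        self.objects.insert(id, data.to_vec());
        id
    }

    /// Hypothetical variant: trusts an id the caller has already computed.
    fn write_buf_with_id(&mut self, id: u64, data: &[u8]) -> u64 {
        self.objects.insert(id, data.to_vec());
        id
    }

    /// Models the flow as described: hash, early-exit check, then hash again.
    pub fn write_blob_twice(&mut self, data: &[u8]) -> u64 {
        let id = self.compute_hash(data);
        if self.exists(id) {
            return id;
        }
        self.write_buf(data)
    }

    /// Models the proposal: hash once and pass the id down to the write.
    pub fn write_blob_once(&mut self, data: &[u8]) -> u64 {
        let id = self.compute_hash(data);
        if self.exists(id) {
            return id;
        }
        self.write_buf_with_id(id, data)
    }
}
```

In this model, writing a new blob through `write_blob_twice` increments `hash_calls` by two, while `write_blob_once` increments it by one and returns the same id. A real API would also have to decide whether a caller-supplied hash is trusted or verified, which this sketch does not address.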

Motivation 🔦

I was looking at this while examining the write_blob implementation as part of digging into performance for jj-vcs/jj#9304, and I'd probably use it in the related code. I think it'd be an incremental performance improvement. If you're supportive of this idea, I'd be happy to contribute a PR for it.

Labels: enhancement (New feature or request), help wanted (Extra attention is needed)