_isdir() does not put entry in the dircache #702

Description

@aabmass

Thanks for the awesome project! I'm running into an issue where gcsfs makes many unexpected GET requests when I upload to a directory within a GCS bucket using put(). I believe the issue can be reproduced by calling GCSFileSystem.put("somefile.txt", "gs://bucket/subdir/somefile.txt"), where the destination is inside a subdirectory of the bucket. The network logs look like this:

2025-09-23 04:40:25,909 - gcsfs - DEBUG - _call -- GET: b/{}/o/{}, ('my-fake-bucket', 'v2/9bc1e04f-7b17-4da6-a32f-3d460e1e275c_outputs.json'), None
2025-09-23 04:40:26,067 - gcsfs - DEBUG - _call -- GET: b/{}/o, ('my-fake-bucket',), None
2025-09-23 04:40:26,213 - gcsfs - DEBUG - _call -- GET: b/{}/o/{}, ('my-fake-bucket', 'v2/9bc1e04f-7b17-4da6-a32f-3d460e1e275c_outputs.json'), None
2025-09-23 04:40:26,431 - gcsfs - DEBUG - _call -- POST: https://storage.googleapis.com/upload/storage/v1/b/my-fake-bucket/o, (), {'Content-Type': 'multipart/related; boundary="==0=="'}
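
For anyone trying to capture similar output, here is a minimal sketch that raises the "gcsfs" logger (the logger name shown in the lines above) to DEBUG using only the standard library. The format string is an assumption chosen to resemble the log lines above, not necessarily the exact configuration used to produce them.

```python
import logging

# Assumed format string, chosen to resemble the log lines quoted in this issue.
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(funcName)s -- %(message)s"

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(LOG_FORMAT))

# "gcsfs" is the logger name that appears in the log output above.
gcsfs_logger = logging.getLogger("gcsfs")
gcsfs_logger.setLevel(logging.DEBUG)
gcsfs_logger.addHandler(handler)
```

With this in place before the upload, each request logged by the library at DEBUG level is printed to stderr.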

The call chain is _put() -> _isdir() -> _info(), and _info() calls GCSFileSystem._ls_from_cache() several times, each time with a cache miss. I did some debugging, and dircache stays empty even though I'm doing this in a loop. My actual code uses simplecache and looks something like this:

import json
import fsspec

for name in files:
    with fsspec.open(f"simplecache::gs://my-fake-bucket/v2/{name}", "w+") as f:
        json.dump(some_obj, f)

(This calls GCSFileSystem.put() under the hood.)
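
To illustrate why an unpopulated listing cache matters in a loop like this, here is a toy model in plain Python. It is only a sketch under stated assumptions: ToyCachedFS, its exists() method, the populate_cache flag, and the object names are all hypothetical, and this is not gcsfs's actual implementation. It only shows the general effect of storing (or not storing) a parent directory listing after a cache miss.

```python
class ToyCachedFS:
    """Hypothetical stand-in for a filesystem with a directory-listing cache."""

    def __init__(self, objects, populate_cache):
        self.objects = set(objects)           # fake bucket contents
        self.populate_cache = populate_cache  # store listings after a miss?
        self.dircache = {}                    # parent directory -> set of object paths
        self.backend_calls = 0                # counts simulated list requests

    def _list_backend(self, parent):
        # Simulates one network request that lists everything under `parent`.
        self.backend_calls += 1
        prefix = parent + "/"
        return {o for o in self.objects if o.startswith(prefix)}

    def exists(self, path):
        parent = path.rsplit("/", 1)[0]
        if parent in self.dircache:
            # Cache hit: answered without touching the backend.
            return path in self.dircache[parent]
        listing = self._list_backend(parent)
        if self.populate_cache:
            self.dircache[parent] = listing
        return path in listing


def count_backend_calls(populate_cache, n_files=5):
    """Check n_files paths in the same directory and count backend requests."""
    fs = ToyCachedFS({"bucket/v2/existing.json"}, populate_cache)
    for i in range(n_files):
        fs.exists(f"bucket/v2/file{i}.json")
    return fs.backend_calls


print(count_backend_calls(populate_cache=False))  # 5: one backend call per file
print(count_backend_calls(populate_cache=True))   # 1: only the first check misses
```

A real cache would also need to be invalidated or updated after each upload, which this sketch deliberately leaves out.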
