restrict cache storage directory permissions to owner#2048
Conversation
|
gentle ping |
|
Hmm, I am not sure what the default should be here. The user, after all, has complete control over where the data is written, so if they want to share privileged data with other users, we shouldn't hinder that. I suggest that this can be made into an argument for the caching filesystems, but the default should remain what it is now (i.e., None, system default). |
Default of None keeps the system default (umask), preserving current behaviour; passing an octal mode such as 0o700 restricts the cache storage directory to the owner.
|
Fair point - if someone deliberately points the cache at a shared location we shouldn't override that. I've reworked it the way you suggested: there's now a It's threaded through the three places that create the storage dir ( |
|
(gcsfs tests fixed in #2061 ) |
While looking at how
filecache/blockcache/simplecachepersist downloaded data, I noticed an inconsistency in the permissions they leave behind. The cache metadata file ends up0o600because it goes throughatomic_write, which creates its temporary viamkstemp, but the cache directory and the cached data files inside it do not.CachingFileSystem.__init__,_mkcacheandSimpleCacheFileSystem.__init__each create the storage directory with a bareos.makedirs(..., exist_ok=True), so under a normal0o022umask the directory comes out0o755and the cached files0o644. Whencache_storagepoints at a persistent location (the documented way to keep a cache between runs), any other local user can traverse the directory and read another user's cached content, which for the http/s3/gcs backends is data that was fetched with that user's credentials.The change sets
mode=0o700on the threeos.makedirscalls that create the cache storage directory, so the directory fsspec owns becomes owner-only and the data files within inherit that protection whichever backend wrote them. Keeping it at the directory-creation point rather than chmod-ing each cached file individually puts the guard in one place and covers all three cache implementations and every write path through them; directories fsspec did not create are left as they are. I have added a parametrised regression test over filecache/simplecache/blockcache that fails on the current tree (the directory is0o755) and passes once the mode is applied.