Don't rebuild HTTP session with no connections#1020
Conversation
|
@Koncopd , latest aiobotocore doesn't like that; but I'm not sure why we are getting an unstarted session the second time around. |
|
Given that there isn't much cost to "rebuilding" the http_sessions without any actual live sessions, and that it only happens once, I think it's not worth trying to prob aiobotocore in the right direction here. What do you think, @Koncopd ? |
|
As I said, my problem is that rebuilding aiobotocore session (not http sessions) erases the credentials I pass. And if I provide a session, it should not happen that I just lose it on set_session. |
|
Here I mean Line 643 in 731e125 |
|
Btw the failing test (test_session_close) looks strange to me, it runs |
Yes, that is exactly the idea. If I remember, the variable is there to allow a breakpoint in the function, where we know which invocation we are on. |
|
Thank you, i will try to check the problem later. |
|
Hi — came at this from a perf regression we hit downstream (one of our test suites went from 49s to 173s on s3fs 2026.4.0). I put together a small repro that walks through the states the cached-session check can end up in, in case it's useful: https://github.com/lukedyer-uipath/s3fs-pr1002-repro I worked through the diagnosis and built the linked repro with help from Claude Code, so any of the claims below are worth a sanity check before acting on — flagging that up front. I might be misreading some of this, so please correct me, but it looks like there are a couple of states beyond the empty-dict case from #1019 that this PR's
Closed inner sessions with I don't have a confident view on the right shape of the fix — For what it's worth, we've sidestepped this downstream by replacing aiomoto with |
|
Iterating with the bot has produced this suggested code (rather verbose, of course), for discussion. The shape of it feels ok and, more importantly, the tests poke some of the conditions. This also folds in the anti-race asyncio lock. |
|
This fixes the performance regression issue I was having. Thanks for taking a look into it! A suggestion of an extra test to cover the niche performance issue I found 👇def test_set_session_closed_sessions_rebuilds_once(s3):
"""Regression test for the #1019 perf regression: a populated-but-all-closed
_sessions dict must rebuild the client exactly once, then reuse it. The
rebuilt client has an empty _sessions dict, which must not count as
"all closed" (vacuous all([]) == True) and force a refresh on every call.
"""
import aiobotocore.session as aio_session
from unittest import mock
s3.pipe(f"{test_bucket_name}/dir/afile", b"small")
create_client_calls = 0
original_create_client = aio_session.AioSession._create_client
async def counting_create_client(self, *args, **kwargs):
nonlocal create_client_calls
create_client_calls += 1
return await original_create_client(self, *args, **kwargs)
async def run():
fs = S3FileSystem(
anon=False,
asynchronous=True,
client_kwargs={"endpoint_url": endpoint_uri},
skip_instance_cache=True,
)
await fs._ls(f"{test_bucket_name}/dir") # populates _sessions
sessions = fs._s3._endpoint.http_session._sessions
assert sessions, "expected a populated _sessions dict after an op"
for sess in sessions.values():
await sess.close()
baseline = create_client_calls # ignore the warmup build above
iterations = 10
for _ in range(iterations):
await fs.set_session()
rebuilds = create_client_calls - baseline
assert rebuilds == 1, (
f"set_session rebuilt the client {rebuilds} times across {iterations} "
f"calls; expected 1. >1 means the empty-dict case forces a refresh on "
f"every call (#1019 regression)."
)
await fs.set_session(refresh=True) # clean up
with mock.patch.object(
aio_session.AioSession, "_create_client", counting_create_client
):
asyncio.run(run()) |
Fixes #1019 Fixes #1025
Closes #1028