Reduce Microsoft Graph API calls during large parallel applies using a shared list cache#101
Conversation
Reduces GET /collection calls during large parallel applies by sharing a single cached response across concurrent resource operations.
Replaces individual GET /resource/{id} calls with shared collection fetches. Removes WaitForDeletion and WaitForUpdate after PATCH/PUT.
@microsoft-github-policy-service agree company="Die Schweizerische Post AG"
Would this now call e.g. "GET /users" and potentially load thousands of users or more, depending on the size of the directory, even if only a few users are managed by Terraform? Fetching entire collections without the possibility to set filters doesn't seem like a good solution to me.
Yes, I see your concerns and appreciate the thorough review. The fundamental problem is the hard limit of 1500 requests per hour imposed by the Graph API. The current main branch implementation makes a minimum of 5 API calls per resource per apply, because WaitForUpdate and WaitForDeletion both use ContinuousTargetOccurence: 3 in the consistency polling logic. For example, creating 200 resources already costs 200 × POST /collection = 200 calls before any polling begins. I can confirm that in practice I was unable to apply a set of 200 resources using the main branch version, especially when several resources were already deployed. I see your point regarding filtering; that would be a worthwhile improvement as a next step to address the large-tenant case.
Problem
When running terraform apply with many resources (e.g. 200+ group members or detection rules), the provider issues one GET /resource/{id} request per resource during Create, Update, Read, and the post-create consistency polling loop. This quickly exhausts Microsoft Graph API rate limits and results in long apply times or throttling errors.
Solution
Instead of fetching each resource individually, the provider now calls GET /{collection} once and finds the item by ID in the response. A short-lived (10 s) in-memory cache ensures that concurrent resource operations share a single collection fetch per TTL window rather than each firing their own request.
To prevent a thundering-herd when many resources miss the cache simultaneously, an in-flight deduplication mechanism ensures only one goroutine issues the GET /{collection} request — all others wait on that result.
Changes
MSGraphClient gains a cachedList method (cache + in-flight dedup), ReadFromList (single fetch, find by ID), and ReadFromListWithWait (polls until item appears, used after Create).
All write operations (Create, Update, Delete, Action) invalidate the relevant cache entry immediately after success so the next read always sees fresh data.
Create uses ReadFromListWithWait to confirm visibility without per-resource polling loops.
Update uses a plain ReadFromList for the final state read — no polling needed since the item already exists.
Read uses ReadFromList instead of a direct GET /resource/{id}.
Delete no longer calls WaitForDeletion — a successful 204 No Content response is sufficient.
Update no longer calls WaitForUpdate — the cache invalidation + single collection fetch is enough.
Testing
Validated with a real-tenant apply of 700+ resources. API call volume dropped significantly; deployments of that size had not been possible before.
Remarks
This approach significantly reduces Graph API calls in large deployments. I'm happy to adapt the implementation if maintainers prefer a different caching strategy.