From ea0b25ba0e1f6f5710f19cb206124e0e9215c702 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Christian=20Gonz=C3=A1lez=20Di=20Antonio?= Date: Sat, 23 May 2026 17:43:26 +0200 Subject: [PATCH 1/2] feat(scim): replace brute-force member sync with members.value pagination MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes #520. The AWS IAM Identity Center SCIM ListGroups endpoint now supports the `members.value eq ""` filter together with cursor-based pagination, so a sync run no longer needs to issue one ListGroups call per (group, user) pair to reconstruct group membership. `internal/scim.GetGroupsMembers(ctx, groups, users)` now runs one paginated query per user (~O(users) calls) and inverts the result into the existing group → members map, replacing the previous O(groups × users) brute-force loop and its 10-150ms random sleep throttle. The concurrency cap is now exercised under Go 1.26's `testing/synctest` so the test asserts the true peak in-flight count using a virtual clock instead of waiting on a wall-clock sleep race. The legacy `GetGroupsMembers(ctx, gr)` and `GetGroupsMembersBruteForce` methods are removed without a compatibility shim. Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 4 +- docs/Whats-New.md | 22 + internal/core/actions.go | 5 +- internal/core/scim.go | 10 +- internal/core/sync_test.go | 127 ++--- internal/scim/scim.go | 149 ++---- internal/scim/scim_test.go | 499 ++++++++---------- mocks/core/scim_mocks.go | 23 +- mocks/scim/scim_mocks.go | 15 + pkg/aws/scim.go | 75 ++- pkg/aws/scim_model.go | 11 +- pkg/aws/scim_test.go | 105 ++++ .../ListGroupsResponse_CursorPage1.json | 37 ++ .../ListGroupsResponse_CursorPage2.json | 22 + 14 files changed, 618 insertions(+), 486 deletions(-) create mode 100644 pkg/aws/testdata/ListGroupsResponse_CursorPage1.json create mode 100644 pkg/aws/testdata/ListGroupsResponse_CursorPage2.json diff --git a/README.md b/README.md index ac884f7..9943e4e 100644 --- a/README.md +++ b/README.md @@ -164,8 +164,8 @@ For config file examples, environment variable usage, CLI flags, SAM parameter u ## ⚠️ Limitations -* **Group Limit**: The AWS SSO SCIM API has a limit of 50 groups per request. Please support the feature request on the [AWS Support site](https://repost.aws/questions/QUqqnVkIo_SYyF_SlX5LcUjg/aws-sso-scim-api-pagination-for-methods) to help get this limit increased. -* **Throttling**: With a large number of users and groups, you may encounter a `ThrottlingException` from the AWS SSO SCIM API. This project uses the [httpx](https://github.com/slashdevops/httpx) library with automatic retry and jitter backoff to mitigate this, but it's still a possibility. +* **Group Page Size**: The AWS IAM Identity Center SCIM `ListGroups` endpoint returns at most 100 groups per page. Since v0.45.0 this project walks every page via cursor-based pagination, so a larger directory no longer requires manual configuration. +* **Throttling**: With a very large number of users and groups, you may still encounter a `ThrottlingException` from the AWS IAM Identity Center SCIM API. The new member-resolution algorithm (one `members.value` query per user, see [docs/Whats-New.md](docs/Whats-New.md)) is roughly two orders of magnitude lighter than the old brute-force path, but the underlying SCIM endpoint is still rate-limited. This project uses the [httpx](https://github.com/slashdevops/httpx) library with automatic retry and jitter backoff to mitigate this. * **User Status**: The Google Workspace API doesn't differentiate between normal and guest users except for their status. This project only syncs `ACTIVE` users. ## For `ssosync` Users diff --git a/docs/Whats-New.md b/docs/Whats-New.md index 45bf677..5c489e1 100644 --- a/docs/Whats-New.md +++ b/docs/Whats-New.md @@ -4,6 +4,28 @@ This document tracks notable changes, new features, and bug fixes across release ## Unreleased +### SCIM members sync no longer scales with `groups × users` (closes #520) + +Replaces the per-pair brute-force algorithm used to reconstruct group memberships on the AWS IAM Identity Center side with a single paginated query per user. + +**Before:** `internal/scim.GetGroupsMembersBruteForce` issued one `ListGroups` call for *every* (group, user) combination — `O(N_groups × N_users)` requests per sync, throttled with a hand-rolled 10–150ms random sleep and a concurrency cap of 5. For an org with 200 groups and 500 users that is 100,000 calls per sync run. + +**Now:** `internal/scim.GetGroupsMembers(ctx, groups, users)` issues one cursor-paginated `?cursor&filter=members.value eq ""` request per user (plus one extra request per additional page of memberships, when a single user belongs to more than 100 groups). The result is then inverted into the group → members map the rest of the pipeline expects. For the same 200-group / 500-user org this is ~500 calls per sync — **roughly two orders of magnitude fewer requests** — and the random sleep is gone since the call volume no longer needs artificial gapping. + +This is enabled by two AWS IAM Identity Center SCIM features documented at : + +* The `members.value eq ""` filter, which returns every group containing a given user. +* Cursor-based pagination (`?cursor` + `nextCursor`), which lifts the historical 50-result page cap to 100 results per page and supports walking the full result set deterministically. + +**API changes (internal-only — no user-facing CLI/config change):** + +* `pkg/aws.SCIMService` gained `ListGroupsWithCursor(ctx, filter, cursor) (*ListGroupsResponse, error)`. `ListGroups` is unchanged and remains the non-paginated single-page call. +* `pkg/aws.ListResponse` gained a `NextCursor string` field. +* `internal/core.SCIMService.GetGroupsMembers` now takes `(ctx, *model.GroupsResult, *model.UsersResult)`. The previous `GetGroupsMembers(ctx, gr)` and `GetGroupsMembersBruteForce(ctx, gr, ur)` methods, plus their AWS-side brute-force scaffolding, have been removed entirely — there is no compatibility shim. +* Memberships pointing at AWS-side groups that are *not* part of the in-scope `gr` (for example AWS-managed groups created outside the sync) are silently ignored, matching prior behavior. + +**Tests:** the concurrency cap is now exercised under `testing/synctest` (graduated to the standard library in Go 1.26 — see ) so the test asserts the true peak in-flight count using virtual time, instead of waiting on a wall-clock sleep race. + ### IAM least-privilege hardening for the state-file Lambda role Tightens the Lambda execution role in `template.yaml` so it can only touch the single state object via the single intended path. No behavior change for normal operation; the role is now strictly scoped. diff --git a/internal/core/actions.go b/internal/core/actions.go index 3ce2beb..02146c1 100644 --- a/internal/core/actions.go +++ b/internal/core/actions.go @@ -72,10 +72,7 @@ func scimSync( totalUsersResult = model.MergeUsersResult(usersCreated, usersUpdated, usersEqual) slog.Info("getting SCIM Groups Members") - // unfortunately, the SCIM service does not support the getGroupsMembers method in and efficient way - // see: "Nor Supported" section in: https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html - // scimGroupsMembersResult, err := scim.GetGroupsMembers(ctx, &totalGroupsResult) // not supported yet - scimGroupsMembersResult, err := scim.GetGroupsMembersBruteForce(ctx, totalGroupsResult, totalUsersResult) + scimGroupsMembersResult, err := scim.GetGroupsMembers(ctx, totalGroupsResult, totalUsersResult) if err != nil { return nil, nil, nil, fmt.Errorf("error getting groups members from the SCIM service: %w", err) } diff --git a/internal/core/scim.go b/internal/core/scim.go index 7b5525c..097bc4c 100644 --- a/internal/core/scim.go +++ b/internal/core/scim.go @@ -35,11 +35,11 @@ type SCIMService interface { // DeleteUsers deletes users in the SCIM Service given a list of users. DeleteUsers(ctx context.Context, ur *model.UsersResult) error - // GetGroupsMembers get the Groups and their Members from the SCIM service. - GetGroupsMembers(ctx context.Context, gr *model.GroupsResult) (*model.GroupsMembersResult, error) - - // GetGroupsMembersBruteForce get the Groups and their Members from the SCIM service using brute force. - GetGroupsMembersBruteForce(ctx context.Context, gr *model.GroupsResult, ur *model.UsersResult) (*model.GroupsMembersResult, error) + // GetGroupsMembers gets the in-scope Groups and their Members from the + // SCIM service. The implementation queries AWS with the members.value + // filter for each user in ur and assigns memberships back to groups in + // gr; memberships pointing at groups outside gr are ignored. + GetGroupsMembers(ctx context.Context, gr *model.GroupsResult, ur *model.UsersResult) (*model.GroupsMembersResult, error) // CreateGroupsMembers create groups members in the SCIM Service given a list of groups members. CreateGroupsMembers(ctx context.Context, gmr *model.GroupsMembersResult) (*model.GroupsMembersResult, error) diff --git a/internal/core/sync_test.go b/internal/core/sync_test.go index db1db1e..d864491 100644 --- a/internal/core/sync_test.go +++ b/internal/core/sync_test.go @@ -309,39 +309,15 @@ func TestSyncService_SyncGroupsAndTheirMembers(t *testing.T) { createUser2ResponseJSONBytes, err := json.Marshal(createUser2Response) assert.NoError(t, err) - listGroupsResponseGroup1User1 := &aws.ListGroupsResponse{ - ListResponse: aws.ListResponse{ - StartIndex: 1, - ItemsPerPage: 1, - TotalResults: 1, - Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:ListResponse"}, - }, - Resources: []*aws.Group{ - { - ID: "group-1", - Meta: aws.Meta{ - ResourceType: "Group", - Created: "2020-01-01T00:00:00Z", - LastModified: "2020-01-01T00:00:00Z", - }, - Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:Group"}, - DisplayName: "group 1", - Members: []*aws.Member{}, // AWS SSO SCIM API don't return members for list groups only TotalResults is returned - }, - }, - } - listGroupsResponseGroup1User1JSONBytes, err := json.Marshal(listGroupsResponseGroup1User1) - assert.NoError(t, err) - - listGroupsResponseGroup1User2 := &aws.ListGroupsResponse{ - ListResponse: aws.ListResponse{ - StartIndex: 1, - ItemsPerPage: 1, - TotalResults: 0, - Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:ListResponse"}, - }, - Resources: []*aws.Group{ - { + // listGroupsByMember returns the AWS ListGroups response for a + // `members.value eq ""` cursor query. The test scenario is: + // group-1 contains user-1; group-2 contains user-2. + listGroupsByMember := func(userSCIMID string) []byte { + schemas := []string{"urn:ietf:params:scim:api:messages:2.0:ListResponse"} + groups := []*aws.Group{} + switch userSCIMID { + case "user-1": + groups = append(groups, &aws.Group{ ID: "group-1", Meta: aws.Meta{ ResourceType: "Group", @@ -350,22 +326,10 @@ func TestSyncService_SyncGroupsAndTheirMembers(t *testing.T) { }, Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:Group"}, DisplayName: "group 1", - Members: []*aws.Member{}, // AWS SSO SCIM API don't return members for list groups only TotalResults is returned - }, - }, - } - listGroupsResponseGroup1User2JSONBytes, err := json.Marshal(listGroupsResponseGroup1User2) - assert.NoError(t, err) - - listGroupsResponseGroup2User1 := &aws.ListGroupsResponse{ - ListResponse: aws.ListResponse{ - StartIndex: 1, - ItemsPerPage: 1, - TotalResults: 0, - Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:ListResponse"}, - }, - Resources: []*aws.Group{ - { + Members: []*aws.Member{}, + }) + case "user-2": + groups = append(groups, &aws.Group{ ID: "group-2", Meta: aws.Meta{ ResourceType: "Group", @@ -374,36 +338,20 @@ func TestSyncService_SyncGroupsAndTheirMembers(t *testing.T) { }, Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:Group"}, DisplayName: "group 2", - Members: []*aws.Member{}, // AWS SSO SCIM API don't return members for list groups only TotalResults is returned - }, - }, - } - listGroupsResponseGroup2User1JSONBytes, err := json.Marshal(listGroupsResponseGroup2User1) - assert.NoError(t, err) - - listGroupsResponseGroup2User2 := &aws.ListGroupsResponse{ - ListResponse: aws.ListResponse{ - StartIndex: 1, - ItemsPerPage: 1, - TotalResults: 1, - Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:ListResponse"}, - }, - Resources: []*aws.Group{ - { - ID: "group-2", - Meta: aws.Meta{ - ResourceType: "Group", - Created: "2020-01-01T00:00:00Z", - LastModified: "2020-01-01T00:00:00Z", - }, - Schemas: []string{"urn:ietf:params:scim:schemas:core:2.0:Group"}, - DisplayName: "group 2", - Members: []*aws.Member{}, // AWS SSO SCIM API don't return members for list groups only TotalResults is returned + Members: []*aws.Member{}, + }) + } + resp := &aws.ListGroupsResponse{ + ListResponse: aws.ListResponse{ + ItemsPerPage: len(groups), + Schemas: schemas, }, - }, + Resources: groups, + } + body, err := json.Marshal(resp) + assert.NoError(t, err) + return body } - listGroupsResponseGroup2User2JSONBytes, err := json.Marshal(listGroupsResponseGroup2User2) - assert.NoError(t, err) // mock Google Workspace API calls svrIDP := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { @@ -456,21 +404,20 @@ func TestSyncService_SyncGroupsAndTheirMembers(t *testing.T) { switch r.URL.Path { case "/Groups": filter := r.URL.Query().Get("filter") - - switch filter { - case "": // first time getting groups + // AWS SCIM cursor-paginated calls include a bare "cursor" + // param; the new GetGroupsMembers algorithm queries one + // `members.value eq ""` page per user. + const prefix = `members.value eq "` + if strings.HasPrefix(filter, prefix) && strings.HasSuffix(filter, `"`) { + userSCIMID := filter[len(prefix) : len(filter)-1] + _, _ = w.Write(listGroupsByMember(userSCIMID)) + return + } + if filter == "" { _, _ = w.Write([]byte(`{}`)) - case "id eq \"group-1\" and members eq \"user-1\"": - _, _ = w.Write(listGroupsResponseGroup1User1JSONBytes) - case "id eq \"group-1\" and members eq \"user-2\"": - _, _ = w.Write(listGroupsResponseGroup1User2JSONBytes) // user 2 is not in group 1 - case "id eq \"group-2\" and members eq \"user-1\"": - _, _ = w.Write(listGroupsResponseGroup2User1JSONBytes) // user 1 is not in group 2 - case "id eq \"group-2\" and members eq \"user-2\"": - _, _ = w.Write(listGroupsResponseGroup2User2JSONBytes) - default: - w.WriteHeader(http.StatusBadRequest) + return } + w.WriteHeader(http.StatusBadRequest) case "/Users": _, _ = w.Write([]byte(`{}`)) } diff --git a/internal/scim/scim.go b/internal/scim/scim.go index e1d28c4..9bd6a98 100644 --- a/internal/scim/scim.go +++ b/internal/scim/scim.go @@ -4,9 +4,7 @@ import ( "context" "fmt" "log/slog" - "math/rand" "sync" - "time" "github.com/slashdevops/idp-scim-sync/internal/model" "github.com/slashdevops/idp-scim-sync/pkg/aws" @@ -40,6 +38,10 @@ type AWSSCIMProvider interface { // ListGroups lists groups in SCIM Provider ListGroups(ctx context.Context, filter string) (*aws.ListGroupsResponse, error) + // ListGroupsWithCursor lists a single page of groups using AWS SCIM + // cursor-based pagination. Pass an empty cursor for the first page. + ListGroupsWithCursor(ctx context.Context, filter, cursor string) (*aws.ListGroupsResponse, error) + // CreateOrGetGroup creates a group in SCIM Provider CreateOrGetGroup(ctx context.Context, g *aws.CreateGroupRequest) (*aws.CreateGroupResponse, error) @@ -411,53 +413,6 @@ func (s *Provider) DeleteGroupsMembers(ctx context.Context, gmr *model.GroupsMem return nil } -// GetGroupsMembers returns a list of groups and their members from the SCIM Provider -// NOTE: this method doesn't work because unfortunately the SCIM API doesn't support -// list the members of a group, or get a group and their members at the same time -// reference: https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html -func (s *Provider) GetGroupsMembers(ctx context.Context, gr *model.GroupsResult) (*model.GroupsMembersResult, error) { - groupMembers := make([]*model.GroupMembers, 0) - - for _, group := range gr.Resources { - // https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html - f := fmt.Sprintf("displayName eq %q", group.Name) - lgr, err := s.scim.ListGroups(ctx, f) - if err != nil { - return nil, fmt.Errorf("scim: error listing groups: %w", err) - } - - for _, gr := range lgr.Resources { - members := make([]*model.Member, len(gr.Members)) - - for j, member := range gr.Members { - u, err := s.scim.GetUser(ctx, member.Value) - if err != nil { - return nil, fmt.Errorf("scim: error getting user: %s, error %w", member.Value, err) - } - - m := model.MemberBuilder(). - WithSCIMID(member.Value). - WithEmail(u.Emails[0].Value). - Build() - - members[j] = m - } - - gms := model.GroupMembersBuilder(). - WithGroup(group). - WithResources(members). - Build() - - groupMembers = append(groupMembers, gms) - } - } - - slog.Debug("scim: GetGroupsMembers()", "groups_members", len(groupMembers)) - groupsMembersResult := model.GroupsMembersResultBuilder().WithResources(groupMembers).Build() - - return groupsMembersResult, nil -} - // patchGroupOperations assembles the operations for patch groups // bases in the limits of operations we can execute in a single request. func (s *Provider) patchGroupOperations(op, path string, pvs []patchValue, gms *model.GroupMembers) []*aws.PatchGroupRequest { @@ -502,54 +457,63 @@ func (s *Provider) patchGroupOperations(op, path string, pvs []patchValue, gms * return patchOperations } -// GetGroupsMembersBruteForce returns a list of groups and their members from the SCIM Provider -// NOTE: this is an bad alternative to the method GetGroupsMembers, because read the note in the method. -func (s *Provider) GetGroupsMembersBruteForce(ctx context.Context, gr *model.GroupsResult, ur *model.UsersResult) (*model.GroupsMembersResult, error) { - // a map of group SCIM IDs to a slice of members - membersByGroup := make(map[string][]*model.Member) +// getGroupsMembersConcurrency caps the number of in-flight ListGroupsWithCursor +// calls in GetGroupsMembers. Five was the historical limit used by the old +// O(groups * users) brute-force implementation; the new path issues at most +// one paginated query per user, so the same cap continues to be conservative. +const getGroupsMembersConcurrency = 5 + +// GetGroupsMembers returns the SCIM-side groups-to-members mapping for the +// groups in gr, inferred by querying which AWS groups contain each user in ur. +// +// AWS IAM Identity Center SCIM does not populate the members array on either +// ListGroups or GetGroup responses (see "Not supported" in +// https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html). +// The supported workaround is the members.value filter: for each user we ask +// AWS which groups contain them and invert the result. Pagination is handled +// via the cursor-based mode of ListGroups, so each user contributes one +// initial request plus one extra request per additional page of group +// memberships. +// +// Only groups present in gr are written into the result; memberships pointing +// at groups outside that set (e.g. AWS-managed groups not in scope) are +// discarded. The result preserves the order of gr.Resources. +func (s *Provider) GetGroupsMembers(ctx context.Context, gr *model.GroupsResult, ur *model.UsersResult) (*model.GroupsMembersResult, error) { + // Index in-scope groups by SCIM ID so the per-user inversion below only + // records memberships the caller actually asked about. + groupBySCIMID := make(map[string]*model.Group, len(gr.Resources)) + for _, group := range gr.Resources { + groupBySCIMID[group.SCIMID] = group + } - // pre-initialize all map entries before launching goroutines to avoid - // concurrent map writes between the outer loop and goroutine appends + membersByGroup := make(map[string][]*model.Member, len(gr.Resources)) for _, group := range gr.Resources { - membersByGroup[group.SCIMID] = make([]*model.Member, 0, 10) + membersByGroup[group.SCIMID] = make([]*model.Member, 0) } - // use an errgroup to manage the goroutines g, ctx := errgroup.WithContext(ctx) - - // set a limit to the number of concurrent goroutines - // to avoid hitting the API rate limit - g.SetLimit(5) - // protect concurrent access to membersByGroup + g.SetLimit(getGroupsMembersConcurrency) var mu sync.Mutex - // brute force implemented here thanks to the fxxckin' aws sso scim api - // iterate over each group and user and check if the user is a member of the group - for _, group := range gr.Resources { - for _, user := range ur.Resources { - - // create a new variable for the group and user to avoid data races - group := group - user := user - - g.Go(func() error { - // add random delay between 10-150 milliseconds to gap API calls - delay := time.Duration(rand.Intn(140)+10) * time.Millisecond - time.Sleep(delay) + for _, user := range ur.Resources { + user := user - // https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html - filter := fmt.Sprintf("id eq %q and members eq %q", group.SCIMID, user.SCIMID) + g.Go(func() error { + filter := fmt.Sprintf("members.value eq %q", user.SCIMID) - lgr, err := s.scim.ListGroups(ctx, filter) + cursor := "" + for { + lgr, err := s.scim.ListGroupsWithCursor(ctx, filter, cursor) if err != nil { - return fmt.Errorf("scim: error listing groups: %w", err) + return fmt.Errorf("scim: error listing groups for user %q: %w", user.SCIMID, err) } - // slog.Debug("checking group membership", "group_name", group.Name, "user_email", user.Emails[0].Value, "results", lgr.TotalResults) + for _, awsGroup := range lgr.Resources { + group, inScope := groupBySCIMID[awsGroup.ID] + if !inScope { + continue + } - // AWS SSO SCIM API, it doesn't return the member into the Resources array - if lgr.TotalResults > 0 { - // build member and append to group map m := model.MemberBuilder(). WithIPID(user.IPID). WithSCIMID(user.SCIMID). @@ -565,17 +529,18 @@ func (s *Provider) GetGroupsMembersBruteForce(ctx context.Context, gr *model.Gro mu.Unlock() } - return nil - }) - } + if lgr.NextCursor == "" { + return nil + } + cursor = lgr.NextCursor + } + }) } - // wait for all goroutines to finish if err := g.Wait(); err != nil { return nil, err } - // build the final result groupMembers := make([]*model.GroupMembers, len(gr.Resources)) for i, group := range gr.Resources { gms := model.GroupMembersBuilder(). @@ -585,8 +550,6 @@ func (s *Provider) GetGroupsMembersBruteForce(ctx context.Context, gr *model.Gro groupMembers[i] = gms } - slog.Debug("scim: GetGroupsMembersBruteForce()", "groups_members", len(groupMembers)) - groupsMembersResult := model.GroupsMembersResultBuilder().WithResources(groupMembers).Build() - - return groupsMembersResult, nil + slog.Debug("scim: GetGroupsMembers()", "groups_members", len(groupMembers)) + return model.GroupsMembersResultBuilder().WithResources(groupMembers).Build(), nil } diff --git a/internal/scim/scim_test.go b/internal/scim/scim_test.go index 34d5885..3e3604a 100644 --- a/internal/scim/scim_test.go +++ b/internal/scim/scim_test.go @@ -6,6 +6,7 @@ import ( "strconv" "sync" "testing" + "testing/synctest" "time" "github.com/google/go-cmp/cmp" @@ -28,14 +29,6 @@ func patchValueGenerator(from, numValues int) []patchValue { return values } -// newTestProvider creates a Provider with fast settings for testing -func newTestProvider(scim AWSSCIMProvider) *Provider { - return &Provider{ - scim: scim, - maxMembersPerRequest: 100, - } -} - func TestProvider_patchGroupOperations(t *testing.T) { type fields struct { scim AWSSCIMProvider @@ -1345,405 +1338,369 @@ func TestProvider_DeleteGroupsMembers(t *testing.T) { } } -func TestProvider_GetGroupsMembers(t *testing.T) { - ctrl := gomock.NewController(t) - defer ctrl.Finish() - - mockScimProvider := mock_scim.NewMockAWSSCIMProvider(ctrl) +// memberSortOpt provides a stable ordering for *model.Member slices in cmp diffs. +// The implementation builds membership lists from concurrent goroutines, so the +// observed order is non-deterministic. +var memberSortOpt = cmpopts.SortSlices(func(a, b *model.Member) bool { + return a.SCIMID < b.SCIMID +}) - type fields struct { - scim AWSSCIMProvider +// groupsMembersDiffOpts collects the cmp options used by GetGroupsMembers tests. +func groupsMembersDiffOpts() []cmp.Option { + return []cmp.Option{ + cmpopts.IgnoreFields(model.GroupsMembersResult{}, "HashCode", "Items"), + cmpopts.IgnoreFields(model.GroupMembers{}, "HashCode", "Items"), + cmpopts.IgnoreFields(model.Member{}, "HashCode"), + memberSortOpt, } +} + +func TestProvider_GetGroupsMembers(t *testing.T) { type args struct { - ctx context.Context - gr *model.GroupsResult + gr *model.GroupsResult + ur *model.UsersResult } tests := []struct { name string - fields fields args args prepare func(m *mock_scim.MockAWSSCIMProvider) want *model.GroupsMembersResult wantErr bool }{ { - name: "should get groups members", - fields: fields{ - scim: mockScimProvider, - }, + name: "single user belongs to a single in-scope group", args: args{ - ctx: context.Background(), gr: &model.GroupsResult{ Resources: []*model.Group{ + {SCIMID: "g1", Name: "group1"}, + }, + }, + ur: &model.UsersResult{ + Resources: []*model.User{ { - SCIMID: "1", - Name: "group1", + SCIMID: "u1", + Active: true, + Emails: []model.Email{{Value: "user1@email.com"}}, }, }, }, }, prepare: func(m *mock_scim.MockAWSSCIMProvider) { - m.EXPECT().ListGroups(gomock.Any(), gomock.Any()).Return(&aws.ListGroupsResponse{ - Resources: []*aws.Group{ - { - Members: []*aws.Member{ - { - Value: "1", - }, - }, - }, - }, - }, nil) - m.EXPECT().GetUser(gomock.Any(), gomock.Any()).Return(&aws.GetUserResponse{ - Emails: []aws.Email{ - { - Value: "user1@email.com", - }, - }, - }, nil) + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), `members.value eq "u1"`, ""). + Return(&aws.ListGroupsResponse{ + Resources: []*aws.Group{{ID: "g1", DisplayName: "group1"}}, + }, nil) }, want: &model.GroupsMembersResult{ Resources: []*model.GroupMembers{ { - Group: &model.Group{ - SCIMID: "1", - Name: "group1", - }, + Group: &model.Group{SCIMID: "g1", Name: "group1"}, Resources: []*model.Member{ - { - SCIMID: "1", - Email: "user1@email.com", - }, + {SCIMID: "u1", Email: "user1@email.com", Status: "ACTIVE"}, }, }, }, }, - wantErr: false, }, { - name: "should return an error when listing groups", - fields: fields{ - scim: mockScimProvider, - }, + name: "user with no group memberships yields empty members list", args: args{ - ctx: context.Background(), gr: &model.GroupsResult{ Resources: []*model.Group{ + {SCIMID: "g1", Name: "group1"}, + }, + }, + ur: &model.UsersResult{ + Resources: []*model.User{ { - SCIMID: "1", - Name: "group1", + SCIMID: "u1", + Active: true, + Emails: []model.Email{{Value: "user1@email.com"}}, }, }, }, }, prepare: func(m *mock_scim.MockAWSSCIMProvider) { - m.EXPECT().ListGroups(gomock.Any(), gomock.Any()).Return(nil, errors.New("error")) + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), `members.value eq "u1"`, ""). + Return(&aws.ListGroupsResponse{ + Resources: []*aws.Group{}, + }, nil) + }, + want: &model.GroupsMembersResult{ + Resources: []*model.GroupMembers{ + { + Group: &model.Group{SCIMID: "g1", Name: "group1"}, + Resources: []*model.Member{}, + }, + }, }, - wantErr: true, }, { - name: "should return an error when getting user", - fields: fields{ - scim: mockScimProvider, - }, + name: "memberships in out-of-scope groups are ignored", args: args{ - ctx: context.Background(), gr: &model.GroupsResult{ Resources: []*model.Group{ + {SCIMID: "g1", Name: "group1"}, + }, + }, + ur: &model.UsersResult{ + Resources: []*model.User{ { - SCIMID: "1", - Name: "group1", + SCIMID: "u1", + Active: true, + Emails: []model.Email{{Value: "user1@email.com"}}, }, }, }, }, prepare: func(m *mock_scim.MockAWSSCIMProvider) { - m.EXPECT().ListGroups(gomock.Any(), gomock.Any()).Return(&aws.ListGroupsResponse{ - Resources: []*aws.Group{ - { - Members: []*aws.Member{ - { - Value: "1", - }, - }, + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), `members.value eq "u1"`, ""). + Return(&aws.ListGroupsResponse{ + Resources: []*aws.Group{ + {ID: "g1", DisplayName: "group1"}, + {ID: "g-out-of-scope", DisplayName: "aws-managed-group"}, + }, + }, nil) + }, + want: &model.GroupsMembersResult{ + Resources: []*model.GroupMembers{ + { + Group: &model.Group{SCIMID: "g1", Name: "group1"}, + Resources: []*model.Member{ + {SCIMID: "u1", Email: "user1@email.com", Status: "ACTIVE"}, }, }, - }, nil) - m.EXPECT().GetUser(gomock.Any(), gomock.Any()).Return(nil, errors.New("error")) + }, }, - wantErr: true, }, - } - for _, tt := range tests { - t.Run(tt.name, func(t *testing.T) { - tt.prepare(mockScimProvider) - p := &Provider{ - scim: tt.fields.scim, - } - got, err := p.GetGroupsMembers(tt.args.ctx, tt.args.gr) - if (err != nil) != tt.wantErr { - t.Errorf("Provider.GetGroupsMembers() error = %v, wantErr %v", err, tt.wantErr) - return - } - if diff := cmp.Diff(tt.want, got, cmpopts.IgnoreFields(model.GroupsMembersResult{}, "HashCode", "Items"), cmpopts.IgnoreFields(model.GroupMembers{}, "HashCode", "Items"), cmpopts.IgnoreFields(model.Member{}, "HashCode")); diff != "" { - t.Errorf("Provider.GetGroupsMembers() (-want +got):\n%s", diff) - } - }) - } -} - -func TestProvider_GetGroupsMembersBruteForce(t *testing.T) { - ctrl := gomock.NewController(t) - defer ctrl.Finish() - - mockScimProvider := mock_scim.NewMockAWSSCIMProvider(ctrl) - - type fields struct { - scim AWSSCIMProvider - maxMembersPerRequest int - } - type args struct { - ctx context.Context - gr *model.GroupsResult - ur *model.UsersResult - } - tests := []struct { - name string - fields fields - args args - prepare func(m *mock_scim.MockAWSSCIMProvider) - want *model.GroupsMembersResult - wantErr bool - }{ { - name: "should get groups members", - fields: fields{ - scim: mockScimProvider, - }, + name: "inactive users do not get ACTIVE status", args: args{ - ctx: context.Background(), gr: &model.GroupsResult{ Resources: []*model.Group{ - { - SCIMID: "1", - Name: "group1", - }, + {SCIMID: "g1", Name: "group1"}, }, }, ur: &model.UsersResult{ Resources: []*model.User{ { - SCIMID: "1", - Active: true, - Emails: []model.Email{ - { - Value: "user1@email.com", - }, - }, + SCIMID: "u1", + Active: false, + Emails: []model.Email{{Value: "user1@email.com"}}, }, }, }, }, prepare: func(m *mock_scim.MockAWSSCIMProvider) { - m.EXPECT().ListGroups(gomock.Any(), gomock.Any()).Return(&aws.ListGroupsResponse{ - ListResponse: aws.ListResponse{ - TotalResults: 1, - }, - Resources: []*aws.Group{}, - }, nil) + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), `members.value eq "u1"`, ""). + Return(&aws.ListGroupsResponse{ + Resources: []*aws.Group{{ID: "g1", DisplayName: "group1"}}, + }, nil) }, want: &model.GroupsMembersResult{ Resources: []*model.GroupMembers{ { - Group: &model.Group{ - SCIMID: "1", - Name: "group1", - }, + Group: &model.Group{SCIMID: "g1", Name: "group1"}, Resources: []*model.Member{ - { - SCIMID: "1", - Email: "user1@email.com", - Status: "ACTIVE", - }, + {SCIMID: "u1", Email: "user1@email.com"}, }, }, }, }, }, { - name: "should get groups members with concurrency", - fields: fields{ - scim: mockScimProvider, - }, + name: "pagination walks every cursor and aggregates groups", args: args{ - ctx: context.Background(), gr: &model.GroupsResult{ Resources: []*model.Group{ - { - SCIMID: "1", - Name: "group1", - }, - { - SCIMID: "2", - Name: "group2", - }, + {SCIMID: "g1", Name: "group1"}, + {SCIMID: "g2", Name: "group2"}, }, }, ur: &model.UsersResult{ Resources: []*model.User{ { - SCIMID: "1", + SCIMID: "u1", Active: true, - Emails: []model.Email{ - { - Value: "user1@email.com", - }, - }, - }, - { - SCIMID: "2", - Active: true, - Emails: []model.Email{ - { - Value: "user2@email.com", - }, - }, + Emails: []model.Email{{Value: "user1@email.com"}}, }, }, }, }, prepare: func(m *mock_scim.MockAWSSCIMProvider) { - var wg sync.WaitGroup - var mu sync.Mutex - var currentConcurrent int - var maxConcurrent int - - calls := 4 - wg.Add(calls) - - for range calls { - m.EXPECT().ListGroups(gomock.Any(), gomock.Any()).DoAndReturn(func(_ context.Context, _ string) (*aws.ListGroupsResponse, error) { - mu.Lock() - currentConcurrent++ - if currentConcurrent > maxConcurrent { - maxConcurrent = currentConcurrent - } - mu.Unlock() - - time.Sleep(100 * time.Millisecond) - - mu.Lock() - currentConcurrent-- - mu.Unlock() - - wg.Done() - - return &aws.ListGroupsResponse{ - ListResponse: aws.ListResponse{ - TotalResults: 1, - }, - Resources: []*aws.Group{}, - }, nil - }) - } - - go func() { - wg.Wait() - if maxConcurrent > 20 { - t.Errorf("max concurrent calls should be less than 20, got %d", maxConcurrent) - } - }() + gomock.InOrder( + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), `members.value eq "u1"`, ""). + Return(&aws.ListGroupsResponse{ + ListResponse: aws.ListResponse{NextCursor: "cursor-2"}, + Resources: []*aws.Group{{ID: "g1", DisplayName: "group1"}}, + }, nil), + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), `members.value eq "u1"`, "cursor-2"). + Return(&aws.ListGroupsResponse{ + Resources: []*aws.Group{{ID: "g2", DisplayName: "group2"}}, + }, nil), + ) }, want: &model.GroupsMembersResult{ Resources: []*model.GroupMembers{ { - Group: &model.Group{ - SCIMID: "1", - Name: "group1", - }, + Group: &model.Group{SCIMID: "g1", Name: "group1"}, Resources: []*model.Member{ - { - SCIMID: "1", - Email: "user1@email.com", - Status: "ACTIVE", - }, - { - SCIMID: "2", - Email: "user2@email.com", - Status: "ACTIVE", - }, + {SCIMID: "u1", Email: "user1@email.com", Status: "ACTIVE"}, }, }, { - Group: &model.Group{ - SCIMID: "2", - Name: "group2", - }, + Group: &model.Group{SCIMID: "g2", Name: "group2"}, Resources: []*model.Member{ - { - SCIMID: "1", - Email: "user1@email.com", - Status: "ACTIVE", - }, - { - SCIMID: "2", - Email: "user2@email.com", - Status: "ACTIVE", - }, + {SCIMID: "u1", Email: "user1@email.com", Status: "ACTIVE"}, }, }, }, }, }, { - name: "should return an error when listing groups", - fields: fields{ - scim: mockScimProvider, - }, + name: "error from AWS is propagated", args: args{ - ctx: context.Background(), gr: &model.GroupsResult{ - Resources: []*model.Group{ - { - SCIMID: "1", - Name: "group1", - }, - }, + Resources: []*model.Group{{SCIMID: "g1", Name: "group1"}}, }, ur: &model.UsersResult{ Resources: []*model.User{ { - SCIMID: "1", + SCIMID: "u1", + Emails: []model.Email{{Value: "user1@email.com"}}, }, }, }, }, prepare: func(m *mock_scim.MockAWSSCIMProvider) { - m.EXPECT().ListGroups(gomock.Any(), gomock.Any()).Return(nil, errors.New("error")) + m.EXPECT(). + ListGroupsWithCursor(gomock.Any(), gomock.Any(), gomock.Any()). + Return(nil, errors.New("aws boom")) }, wantErr: true, }, } + for _, tt := range tests { t.Run(tt.name, func(t *testing.T) { + ctrl := gomock.NewController(t) + defer ctrl.Finish() + + mockScimProvider := mock_scim.NewMockAWSSCIMProvider(ctrl) tt.prepare(mockScimProvider) + p := &Provider{ - scim: tt.fields.scim, + scim: mockScimProvider, maxMembersPerRequest: 100, } - got, err := p.GetGroupsMembersBruteForce(tt.args.ctx, tt.args.gr, tt.args.ur) + + got, err := p.GetGroupsMembers(context.Background(), tt.args.gr, tt.args.ur) if (err != nil) != tt.wantErr { - t.Errorf("Provider.GetGroupsMembersBruteForce() error = %v, wantErr %v", err, tt.wantErr) + t.Fatalf("GetGroupsMembers() error = %v, wantErr %v", err, tt.wantErr) + } + if tt.wantErr { return } - // sort members by SCIMID to avoid order issues in the test - opt := cmpopts.SortSlices(func(a, b *model.Member) bool { - return a.SCIMID < b.SCIMID - }) - - if diff := cmp.Diff(tt.want, got, cmpopts.IgnoreFields(model.GroupsMembersResult{}, "HashCode", "Items"), cmpopts.IgnoreFields(model.GroupMembers{}, "HashCode", "Items"), cmpopts.IgnoreFields(model.Member{}, "HashCode"), opt); diff != "" { - t.Errorf("Provider.GetGroupsMembersBruteForce() (-want +got):\n%s", diff) + if diff := cmp.Diff(tt.want, got, groupsMembersDiffOpts()...); diff != "" { + t.Errorf("GetGroupsMembers() (-want +got):\n%s", diff) } }) } } + +// TestProvider_GetGroupsMembers_ConcurrencyLimit exercises the errgroup +// concurrency cap using testing/synctest from the Go 1.26 standard library. +// synctest.Test runs every goroutine in an isolated "bubble" with a fake +// clock, so time.Sleep below advances virtual time only after the runtime has +// proven no other goroutine in the bubble can make progress. That gives us +// deterministic observation of the maximum number of in-flight calls without +// relying on wall-clock races. +// +// Reference: https://go.dev/blog/testing-time +func TestProvider_GetGroupsMembers_ConcurrencyLimit(t *testing.T) { + synctest.Test(t, func(t *testing.T) { + ctrl := gomock.NewController(t) + defer ctrl.Finish() + + mockScimProvider := mock_scim.NewMockAWSSCIMProvider(ctrl) + + const userCount = 20 + groups := make([]*model.Group, 0, userCount) + users := make([]*model.User, 0, userCount) + for i := range userCount { + id := strconv.Itoa(i) + groups = append(groups, &model.Group{SCIMID: "g-" + id, Name: "group-" + id}) + users = append(users, &model.User{ + SCIMID: "u-" + id, + Active: true, + Emails: []model.Email{{Value: "user-" + id + "@email.com"}}, + }) + } + + var ( + mu sync.Mutex + inFlight int + maxInFlight int + observedSerial int + ) + + mockScimProvider.EXPECT(). + ListGroupsWithCursor(gomock.Any(), gomock.Any(), gomock.Any()). + DoAndReturn(func(_ context.Context, filter, _ string) (*aws.ListGroupsResponse, error) { + mu.Lock() + inFlight++ + if inFlight > maxInFlight { + maxInFlight = inFlight + } + observedSerial++ + mu.Unlock() + + // Virtual sleep — synctest advances the fake clock only when + // every other goroutine in the bubble is blocked, which lets us + // observe the true concurrency cap rather than a flaky lower bound. + time.Sleep(50 * time.Millisecond) + + mu.Lock() + inFlight-- + mu.Unlock() + + // The filter encodes the user ID, so we can derive a deterministic + // "user belongs to its matching group" result without juggling extra + // state in the test fixture. + userSCIMID := filter[len(`members.value eq "`) : len(filter)-1] + groupSCIMID := "g-" + userSCIMID[len("u-"):] + return &aws.ListGroupsResponse{ + Resources: []*aws.Group{{ID: groupSCIMID}}, + }, nil + }).Times(userCount) + + p := &Provider{ + scim: mockScimProvider, + maxMembersPerRequest: 100, + } + + got, err := p.GetGroupsMembers(context.Background(), &model.GroupsResult{Resources: groups}, &model.UsersResult{Resources: users}) + if err != nil { + t.Fatalf("GetGroupsMembers() error = %v", err) + } + + if observedSerial != userCount { + t.Errorf("expected exactly one ListGroupsWithCursor per user (got %d, want %d)", observedSerial, userCount) + } + if maxInFlight > getGroupsMembersConcurrency { + t.Errorf("max in-flight calls = %d, exceeds cap %d", maxInFlight, getGroupsMembersConcurrency) + } + if maxInFlight < 2 { + t.Errorf("expected concurrent calls (>=2), only observed %d in flight at peak", maxInFlight) + } + if len(got.Resources) != userCount { + t.Errorf("expected %d group-members entries, got %d", userCount, len(got.Resources)) + } + }) +} diff --git a/mocks/core/scim_mocks.go b/mocks/core/scim_mocks.go index 079de51..e73ca4a 100644 --- a/mocks/core/scim_mocks.go +++ b/mocks/core/scim_mocks.go @@ -144,33 +144,18 @@ func (mr *MockSCIMServiceMockRecorder) GetGroups(ctx any) *gomock.Call { } // GetGroupsMembers mocks base method. -func (m *MockSCIMService) GetGroupsMembers(ctx context.Context, gr *model.GroupsResult) (*model.GroupsMembersResult, error) { +func (m *MockSCIMService) GetGroupsMembers(ctx context.Context, gr *model.GroupsResult, ur *model.UsersResult) (*model.GroupsMembersResult, error) { m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "GetGroupsMembers", ctx, gr) + ret := m.ctrl.Call(m, "GetGroupsMembers", ctx, gr, ur) ret0, _ := ret[0].(*model.GroupsMembersResult) ret1, _ := ret[1].(error) return ret0, ret1 } // GetGroupsMembers indicates an expected call of GetGroupsMembers. -func (mr *MockSCIMServiceMockRecorder) GetGroupsMembers(ctx, gr any) *gomock.Call { +func (mr *MockSCIMServiceMockRecorder) GetGroupsMembers(ctx, gr, ur any) *gomock.Call { mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetGroupsMembers", reflect.TypeOf((*MockSCIMService)(nil).GetGroupsMembers), ctx, gr) -} - -// GetGroupsMembersBruteForce mocks base method. -func (m *MockSCIMService) GetGroupsMembersBruteForce(ctx context.Context, gr *model.GroupsResult, ur *model.UsersResult) (*model.GroupsMembersResult, error) { - m.ctrl.T.Helper() - ret := m.ctrl.Call(m, "GetGroupsMembersBruteForce", ctx, gr, ur) - ret0, _ := ret[0].(*model.GroupsMembersResult) - ret1, _ := ret[1].(error) - return ret0, ret1 -} - -// GetGroupsMembersBruteForce indicates an expected call of GetGroupsMembersBruteForce. -func (mr *MockSCIMServiceMockRecorder) GetGroupsMembersBruteForce(ctx, gr, ur any) *gomock.Call { - mr.mock.ctrl.T.Helper() - return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetGroupsMembersBruteForce", reflect.TypeOf((*MockSCIMService)(nil).GetGroupsMembersBruteForce), ctx, gr, ur) + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetGroupsMembers", reflect.TypeOf((*MockSCIMService)(nil).GetGroupsMembers), ctx, gr, ur) } // GetUsers mocks base method. diff --git a/mocks/scim/scim_mocks.go b/mocks/scim/scim_mocks.go index f0f3c2e..ccddffa 100644 --- a/mocks/scim/scim_mocks.go +++ b/mocks/scim/scim_mocks.go @@ -144,6 +144,21 @@ func (mr *MockAWSSCIMProviderMockRecorder) ListGroups(ctx, filter any) *gomock.C return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ListGroups", reflect.TypeOf((*MockAWSSCIMProvider)(nil).ListGroups), ctx, filter) } +// ListGroupsWithCursor mocks base method. +func (m *MockAWSSCIMProvider) ListGroupsWithCursor(ctx context.Context, filter, cursor string) (*aws.ListGroupsResponse, error) { + m.ctrl.T.Helper() + ret := m.ctrl.Call(m, "ListGroupsWithCursor", ctx, filter, cursor) + ret0, _ := ret[0].(*aws.ListGroupsResponse) + ret1, _ := ret[1].(error) + return ret0, ret1 +} + +// ListGroupsWithCursor indicates an expected call of ListGroupsWithCursor. +func (mr *MockAWSSCIMProviderMockRecorder) ListGroupsWithCursor(ctx, filter, cursor any) *gomock.Call { + mr.mock.ctrl.T.Helper() + return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "ListGroupsWithCursor", reflect.TypeOf((*MockAWSSCIMProvider)(nil).ListGroupsWithCursor), ctx, filter, cursor) +} + // ListUsers mocks base method. func (m *MockAWSSCIMProvider) ListUsers(ctx context.Context, filter string) (*aws.ListUsersResponse, error) { m.ctrl.T.Helper() diff --git a/pkg/aws/scim.go b/pkg/aws/scim.go index 41d017a..46139ea 100644 --- a/pkg/aws/scim.go +++ b/pkg/aws/scim.go @@ -11,6 +11,7 @@ import ( "net/http" "net/url" "path" + "strings" ) // AWS SSO SCIM API @@ -652,7 +653,14 @@ func (s *SCIMService) GetGroupByDisplayName(ctx context.Context, displayName str return &GetGroupResponse{}, nil } -// ListGroups returns a list of groups from the AWS SSO Using the API +// ListGroups returns a list of groups from the AWS SSO using the API. +// +// This is a non-paginated call: the response contains TotalResults, ItemsPerPage +// and StartIndex but no NextCursor. The AWS SCIM endpoint caps the result at +// 100 groups per page when no cursor is supplied, so callers that need to walk +// every group should use ListGroupsWithCursor instead. +// +// Reference: https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html func (s *SCIMService) ListGroups(ctx context.Context, filter string) (*ListGroupsResponse, error) { reqURL, err := url.Parse(s.url.String()) if err != nil { @@ -690,6 +698,71 @@ func (s *SCIMService) ListGroups(ctx context.Context, filter string) (*ListGroup return &response, nil } +// ListGroupsPageSize is the maximum number of groups returned per page by the +// AWS IAM Identity Center SCIM ListGroups endpoint. Cursor-based pagination +// uses this limit implicitly when no count is supplied. +// +// Reference: https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html +const ListGroupsPageSize = 100 + +// ListGroupsWithCursor returns one page of groups using AWS IAM Identity +// Center SCIM cursor-based pagination. +// +// Pass an empty cursor on the first call: the bare "?cursor" parameter is +// required by AWS to switch the endpoint into cursor mode. For each subsequent +// page, pass the NextCursor returned by the previous response. The response is +// the final page when NextCursor is empty. +// +// The filter argument is forwarded verbatim and, when non-empty, must follow +// the AWS-supported filter combinations (for example: members.value eq ""). +// +// Reference: https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html +func (s *SCIMService) ListGroupsWithCursor(ctx context.Context, filter, cursor string) (*ListGroupsResponse, error) { + reqURL, err := url.Parse(s.url.String()) + if err != nil { + return nil, fmt.Errorf("aws ListGroupsWithCursor: error parsing url: %w", err) + } + + reqURL.Path = path.Join(reqURL.Path, "/Groups") + + // AWS requires the "cursor" parameter to be present (even when empty) to + // switch the endpoint into cursor-paginated mode. url.Values omits empty + // values, so build the query string manually to preserve the bare "cursor" + // form for the first page request. + var parts []string + if cursor == "" { + parts = append(parts, "cursor") + } else { + parts = append(parts, "cursor="+url.QueryEscape(cursor)) + } + if filter != "" { + parts = append(parts, "filter="+url.QueryEscape(filter)) + } + reqURL.RawQuery = strings.Join(parts, "&") + + req, err := s.newRequest(ctx, http.MethodGet, reqURL, nil) + if err != nil { + return nil, fmt.Errorf("aws ListGroupsWithCursor: error creating request, http method: %s, url: %v, error: %w", http.MethodGet, reqURL.String(), err) + } + + resp, err := s.do(ctx, req) + if err != nil { + return nil, fmt.Errorf("aws ListGroupsWithCursor: error sending request, http method: %s, url: %v, error: %w", http.MethodGet, reqURL.String(), err) + } + defer resp.Body.Close() + + if e := s.checkHTTPResponse(resp); e != nil { + return nil, e + } + + var response ListGroupsResponse + if err := json.NewDecoder(resp.Body).Decode(&response); err != nil { + return nil, fmt.Errorf("aws ListGroupsWithCursor: error decoding response body: %w", err) + } + + return &response, nil +} + // CreateGroup creates a new group in the AWS SSO Using the API // reference: // + https://docs.aws.amazon.com/singlesignon/latest/developerguide/creategroup.html diff --git a/pkg/aws/scim_model.go b/pkg/aws/scim_model.go index 2c1f85f..8fe087e 100644 --- a/pkg/aws/scim_model.go +++ b/pkg/aws/scim_model.go @@ -116,8 +116,17 @@ type Patch struct { Operations []*Operation `json:"Operations"` } -// ListResponse represent a general response entity +// ListResponse represent a general response entity. +// +// AWS IAM Identity Center SCIM supports two response formats for ListGroups: +// - Non-paginated (default): includes TotalResults, ItemsPerPage and StartIndex. +// - Cursor-paginated: includes ItemsPerPage and NextCursor; TotalResults and StartIndex are omitted. +// +// NextCursor is empty when the current page is the last page of results. +// +// Reference: https://docs.aws.amazon.com/singlesignon/latest/developerguide/listgroups.html type ListResponse struct { + NextCursor string `json:"nextCursor,omitempty"` Schemas []string `json:"schemas"` TotalResults int `json:"totalResults"` ItemsPerPage int `json:"itemsPerPage"` diff --git a/pkg/aws/scim_test.go b/pkg/aws/scim_test.go index 74623cf..78e3137 100644 --- a/pkg/aws/scim_test.go +++ b/pkg/aws/scim_test.go @@ -2066,6 +2066,111 @@ func TestListGroups(t *testing.T) { }) } +func TestListGroupsWithCursor(t *testing.T) { + t.Run("first page request sends bare cursor parameter and parses NextCursor", func(t *testing.T) { + page1 := ReadJSONFileAsString(t, "testdata/ListGroupsResponse_CursorPage1.json") + + var capturedRawQuery string + var capturedPath string + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + capturedRawQuery = r.URL.RawQuery + capturedPath = r.URL.Path + w.Header().Set("Content-Type", "application/json") + _, _ = fmt.Fprint(w, page1) + })) + defer server.Close() + + service, err := NewSCIMService(http.DefaultClient, server.URL, "MyToken") + assert.NoError(t, err) + + got, err := service.ListGroupsWithCursor(context.Background(), `members.value eq "u1"`, "") + assert.NoError(t, err) + assert.NotNil(t, got) + + assert.Equal(t, "/Groups", capturedPath) + // AWS requires the bare "cursor" parameter (no value) on the first + // paginated request to switch ListGroups into cursor mode. + assert.Equal(t, `cursor&filter=members.value+eq+%22u1%22`, capturedRawQuery) + assert.Equal(t, "Q3Vyc29yUGFnZTI=", got.NextCursor) + assert.Len(t, got.Resources, 2) + assert.Equal(t, "d468e4d8-c051-703d-dd84-28caf445f15e", got.Resources[0].ID) + }) + + t.Run("subsequent page sends cursor value and returns final page when NextCursor is empty", func(t *testing.T) { + page2 := ReadJSONFileAsString(t, "testdata/ListGroupsResponse_CursorPage2.json") + + var capturedRawQuery string + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + capturedRawQuery = r.URL.RawQuery + w.Header().Set("Content-Type", "application/json") + _, _ = fmt.Fprint(w, page2) + })) + defer server.Close() + + service, err := NewSCIMService(http.DefaultClient, server.URL, "MyToken") + assert.NoError(t, err) + + got, err := service.ListGroupsWithCursor(context.Background(), `members.value eq "u1"`, "Q3Vyc29yUGFnZTI=") + assert.NoError(t, err) + + assert.Equal(t, `cursor=Q3Vyc29yUGFnZTI%3D&filter=members.value+eq+%22u1%22`, capturedRawQuery) + assert.Empty(t, got.NextCursor, "final page must report no nextCursor") + assert.Len(t, got.Resources, 1) + }) + + t.Run("empty filter still sends bare cursor", func(t *testing.T) { + var capturedRawQuery string + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + capturedRawQuery = r.URL.RawQuery + w.Header().Set("Content-Type", "application/json") + _, _ = fmt.Fprint(w, `{"itemsPerPage":0,"schemas":["urn:ietf:params:scim:api:messages:2.0:ListResponse"],"Resources":[]}`) + })) + defer server.Close() + + service, err := NewSCIMService(http.DefaultClient, server.URL, "MyToken") + assert.NoError(t, err) + + _, err = service.ListGroupsWithCursor(context.Background(), "", "") + assert.NoError(t, err) + assert.Equal(t, "cursor", capturedRawQuery) + }) + + t.Run("propagates non-2xx responses as HTTPResponseError", func(t *testing.T) { + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + w.WriteHeader(http.StatusTooManyRequests) + _, _ = fmt.Fprint(w, `{"detail":"rate limited"}`) + })) + defer server.Close() + + service, err := NewSCIMService(http.DefaultClient, server.URL, "MyToken") + assert.NoError(t, err) + + _, err = service.ListGroupsWithCursor(context.Background(), `members.value eq "u1"`, "") + assert.Error(t, err) + + var httpErr *HTTPResponseError + assert.True(t, errors.As(err, &httpErr)) + assert.Equal(t, http.StatusTooManyRequests, httpErr.StatusCode) + }) + + t.Run("propagates context cancellation", func(t *testing.T) { + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + w.Header().Set("Content-Type", "application/json") + _, _ = fmt.Fprint(w, `{"itemsPerPage":0,"schemas":[],"Resources":[]}`) + })) + defer server.Close() + + service, err := NewSCIMService(http.DefaultClient, server.URL, "MyToken") + assert.NoError(t, err) + + ctx, cancel := context.WithCancel(context.Background()) + cancel() + + _, err = service.ListGroupsWithCursor(ctx, "", "") + assert.Error(t, err) + }) +} + // Enhanced test cases for improved coverage func TestSCIMService_ContextCancellation(t *testing.T) { diff --git a/pkg/aws/testdata/ListGroupsResponse_CursorPage1.json b/pkg/aws/testdata/ListGroupsResponse_CursorPage1.json new file mode 100644 index 0000000..528fe76 --- /dev/null +++ b/pkg/aws/testdata/ListGroupsResponse_CursorPage1.json @@ -0,0 +1,37 @@ +{ + "itemsPerPage": 2, + "nextCursor": "Q3Vyc29yUGFnZTI=", + "schemas": [ + "urn:ietf:params:scim:api:messages:2.0:ListResponse" + ], + "Resources": [ + { + "id": "d468e4d8-c051-703d-dd84-28caf445f15e", + "externalId": "eid_1", + "meta": { + "resourceType": "Group", + "created": "2025-08-08T17:28:12Z", + "lastModified": "2025-08-08T17:28:12Z" + }, + "schemas": [ + "urn:ietf:params:scim:schemas:core:2.0:Group" + ], + "displayName": "Group Page 1 A", + "members": [] + }, + { + "id": "d468e4d8-c051-703d-dd84-28caf445f15f", + "externalId": "eid_2", + "meta": { + "resourceType": "Group", + "created": "2025-08-08T17:28:12Z", + "lastModified": "2025-08-08T17:28:12Z" + }, + "schemas": [ + "urn:ietf:params:scim:schemas:core:2.0:Group" + ], + "displayName": "Group Page 1 B", + "members": [] + } + ] +} diff --git a/pkg/aws/testdata/ListGroupsResponse_CursorPage2.json b/pkg/aws/testdata/ListGroupsResponse_CursorPage2.json new file mode 100644 index 0000000..b1db767 --- /dev/null +++ b/pkg/aws/testdata/ListGroupsResponse_CursorPage2.json @@ -0,0 +1,22 @@ +{ + "itemsPerPage": 1, + "schemas": [ + "urn:ietf:params:scim:api:messages:2.0:ListResponse" + ], + "Resources": [ + { + "id": "d468e4d8-c051-703d-dd84-28caf445f160", + "externalId": "eid_3", + "meta": { + "resourceType": "Group", + "created": "2025-08-08T17:28:12Z", + "lastModified": "2025-08-08T17:28:12Z" + }, + "schemas": [ + "urn:ietf:params:scim:schemas:core:2.0:Group" + ], + "displayName": "Group Page 2", + "members": [] + } + ] +} From 91db0fa24c679a444a0beabe300a8118b856513e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Christian=20Gonz=C3=A1lez=20Di=20Antonio?= Date: Sun, 24 May 2026 10:40:04 +0200 Subject: [PATCH 2/2] docs(whats-new): highlight security & performance impact for #520 Restructures the Unreleased entry for the SCIM members sync change to lead with an IMPORTANT callout and dedicated Security / Performance subsections, making the impact of closing #520 explicit for readers. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/Whats-New.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/docs/Whats-New.md b/docs/Whats-New.md index 5c489e1..1a45f50 100644 --- a/docs/Whats-New.md +++ b/docs/Whats-New.md @@ -4,13 +4,23 @@ This document tracks notable changes, new features, and bug fixes across release ## Unreleased -### SCIM members sync no longer scales with `groups × users` (closes #520) +### SCIM members sync — major security & performance improvement (closes [#520](https://github.com/slashdevops/idp-scim-sync/issues/520)) -Replaces the per-pair brute-force algorithm used to reconstruct group memberships on the AWS IAM Identity Center side with a single paginated query per user. +> [!IMPORTANT] +> This release replaces the brute-force algorithm that reconstructed group memberships on the AWS IAM Identity Center side. The change is **internal-only** (no CLI/config change) but materially improves both the **security posture** and the **runtime cost** of every sync. -**Before:** `internal/scim.GetGroupsMembersBruteForce` issued one `ListGroups` call for *every* (group, user) combination — `O(N_groups × N_users)` requests per sync, throttled with a hand-rolled 10–150ms random sleep and a concurrency cap of 5. For an org with 200 groups and 500 users that is 100,000 calls per sync run. +#### Security improvements -**Now:** `internal/scim.GetGroupsMembers(ctx, groups, users)` issues one cursor-paginated `?cursor&filter=members.value eq ""` request per user (plus one extra request per additional page of memberships, when a single user belongs to more than 100 groups). The result is then inverted into the group → members map the rest of the pipeline expects. For the same 200-group / 500-user org this is ~500 calls per sync — **roughly two orders of magnitude fewer requests** — and the random sleep is gone since the call volume no longer needs artificial gapping. +* **Drastically smaller attack & failure surface per sync.** The number of authenticated SCIM requests issued per sync drops by ~2 orders of magnitude (see *Performance* below). Each request is a credential-bearing call to AWS IAM Identity Center — fewer requests means fewer opportunities for a credential leak, log capture, in-flight tampering, or partial-failure half-state to be observed. +* **Shorter sync window = smaller inconsistency window.** Previously, a sync of a few hundred users could run for many minutes while ~100k requests trickled out behind a hand-rolled 10–150 ms random sleep. During that window, group membership in AWS could be partially reconciled — an externally-observable inconsistent state. The new path completes in a fraction of the time, shrinking that window proportionally. +* **No more time-based throttling band-aid.** The previous `time.Sleep(rand.Intn(...))` jitter existed solely to avoid tripping AWS SCIM throttles under the brute-force call volume. It has been removed: the new call profile is light enough that artificial gapping is no longer required. This eliminates a source of non-determinism in the sync path and removes timing-dependent behavior from a security-sensitive code path. +* **Deterministic pagination.** Cursor-based pagination (`?cursor` + `nextCursor`) walks the full result set deterministically, so memberships can no longer be silently truncated by hitting an undocumented page cap mid-sync. + +#### Performance improvements + +* **Before:** `internal/scim.GetGroupsMembersBruteForce` issued one `ListGroups` call for *every* (group, user) combination — `O(N_groups × N_users)` requests per sync, throttled by a 10–150 ms random sleep and capped at concurrency 5. For an org with 200 groups and 500 users this is **~100,000 calls per sync run**. +* **Now:** `internal/scim.GetGroupsMembers(ctx, groups, users)` issues one cursor-paginated `?cursor&filter=members.value eq ""` request per user (plus one extra request per additional page of memberships, when a single user belongs to more than 100 groups). The result is then inverted into the group → members map the rest of the pipeline expects. For the same 200-group / 500-user org this is **~500 calls per sync — roughly two orders of magnitude fewer requests**. +* **Lower Lambda execution time and cost.** Fewer requests + no random sleeps directly reduces billable Lambda duration on every scheduled invocation. This is enabled by two AWS IAM Identity Center SCIM features documented at :