Proposal: Custom Project Roles #276

Open
maxgraustenzel-create wants to merge 3 commits into goharbor:main from maxgraustenzel-create:patch-1

Conversation

@maxgraustenzel-create

Summary

Proposal to add custom project roles to Harbor, enabling system administrators to create roles with flexible permission combinations.

Related Issues

  • #18124 (main issue)
  • #18143, #21306, #12062, #8632, #1486 (related)

Implementation

Discussion

This proposal is ready for community review and discussion at the next community meeting.

Community Meeting

If desired, I can present this proposal and demonstrate the role functionality at the next community meeting for feedback and discussion.

/kind proposal

Signed-off-by: Max <max.grau.stenzel@gmail.com>
correction

Signed-off-by: Max <max.grau.stenzel@gmail.com>
Comment thread proposals/new/Custom-Project-Roles.md
- **Minimal schema changes:** Only extend `role` table with metadata columns (`is_builtin`, `description`, `modified`, `created_by`, `modified_by`, timestamps)
- **Discriminator pattern:** `role_permission.role_type` distinguishes 'project-role' (users/groups) from 'robotaccount' (direct permissions)
- **System admin only:** Only system administrators can create/modify custom roles (project admins assign roles, existing workflow unchanged)
- **Built-in role protection:** Built-in roles can be modified but not deleted; modifications are tracked and reversible

Contributor

I don't think the built-in roles should be modifiable. Allowing modification of the standard Project Admin, Maintainer, Developer, Guest, and Limited Guest roles introduces significant security risks. If a system administrator changes projectAdmin or developer, it immediately alters the baseline security assumptions of all existing projects.

We should keep all existing definitions as they are and only introduce custom roles.

Author

Yes, as discussed in the community meeting I fully accept your point. I will change this.

- **Minimal schema changes:** Only extend `role` table with metadata columns (`is_builtin`, `description`, `modified`, `created_by`, `modified_by`, timestamps)
- **Discriminator pattern:** `role_permission.role_type` distinguishes 'project-role' (users/groups) from 'robotaccount' (direct permissions)
- **System admin only:** Only system administrators can create/modify custom roles (project admins assign roles, existing workflow unchanged)
- **Built-in role protection:** Built-in roles can be modified but not deleted; modifications are tracked and reversible

Contributor

Built-in roles must remain completely immutable to serve as a secure baseline.

Author

OK with me.


✅ **Migration strategy:**
- Zero-downtime migration
- Built-in role permissions migrated from `rbac_role.go` to `role_permission` table

Contributor

Migrating rbac_role.go's static policies into role_permission is cleaner, but we must protect against:

  • High DB query volume during login/CLI handshakes.
  • Missing composite indexes on role_permission over role_type and the associated keys.

2. **API:** Add role CRUD endpoints for system administrators
3. **UI:** Add role management interface in System Administration section
4. **Security:** Implement privilege escalation prevention and audit logging
5. **Caching:** Load permissions at login (session-scoped cache)

Contributor

Sorry for the back and forth.

The proposal suggests session-scoped caching at login, with updates applying on the next login. While this is simple, it poses a severe security risk in enterprise, high-availability, or multi-replica Harbor setups:

  • If a custom role's permission is revoked (e.g., a "Security Auditor" role has access to artifacts revoked), the user can continue executing actions because their session-scoped memory cache on their specific pod is stale.
  • Logouts or token expirations are not immediate enough for compliance-critical revocations.

Author

As I understand it, the same caching mechanism is used for role assignment itself. The case where somebody has a critical role revoked while logged in is probably more likely, and it is handled in the same way.

Contributor

Yeah, I get where you're coming from about role assignments having some cache lag today, but there’s a big difference here.

Right now, standard roles are hardcoded as compile-time constants in rbac_role.go. Since they’re locked in Go memory, they can never change while the server is running. But by moving these to the database for custom roles, we're introducing a fully dynamic state. We can’t really apply the same static caching assumptions to dynamic DB records, especially when a single role change can instantly impact thousands of users across multiple projects.

Since we want to ship this safely in v2.16.0, we should make sure changes to custom roles propagate across all Harbor instances without a massive lag. Instead of building a super complex sync or tracking lock system, maybe we can go with a lightweight compromise:

  • Whenever an admin edits or deletes a custom role, we write a simple version tag or timestamp in Redis (like custom_roles:last_update).
  • When evaluating permissions during a session, the evaluator does a quick check against that Redis timestamp. If the local cache is out of date, it lazily reloads the role definitions.
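
A minimal sketch of that check, under stated assumptions: `VersionStore`, `RoleLoader`, `RoleCache`, and `memStore` are hypothetical names invented for illustration, not Harbor types, and the in-memory store only stands in for whatever shared key (such as custom_roles:last_update) would hold the version tag.

```go
package main

import "sync"

// VersionStore is a hypothetical stand-in for the shared store that holds
// the "last update" tag for custom roles.
type VersionStore interface {
	CurrentVersion() (int64, error)
}

// RoleLoader is a hypothetical function that reloads role definitions
// (role name -> permission strings) from the database.
type RoleLoader func() (map[string][]string, error)

// RoleCache keeps a local copy of the role definitions together with the
// version tag that was current when the copy was loaded.
type RoleCache struct {
	mu      sync.Mutex
	store   VersionStore
	load    RoleLoader
	version int64
	loaded  bool
	roles   map[string][]string
}

// NewRoleCache returns an empty cache; the first lookup triggers a load.
func NewRoleCache(store VersionStore, load RoleLoader) *RoleCache {
	return &RoleCache{store: store, load: load}
}

// Permissions returns the permissions of a role. It compares the shared
// version tag with the local one and lazily reloads when they differ.
func (c *RoleCache) Permissions(role string) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	current, err := c.store.CurrentVersion()
	if err != nil {
		// Fail closed: return an error instead of a possibly stale answer.
		return nil, err
	}
	if !c.loaded || current != c.version {
		roles, err := c.load()
		if err != nil {
			return nil, err
		}
		c.roles, c.version, c.loaded = roles, current, true
	}
	return c.roles[role], nil
}

// memStore is an in-memory VersionStore used only for illustration.
type memStore struct{ v int64 }

func (m *memStore) CurrentVersion() (int64, error) { return m.v, nil }
```

In this sketch every permission check pays one version read, while the more expensive reload of role definitions happens only after an admin edit bumps the tag.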

How does that sound to you?

Member

I would not use Redis here. This adds another dimension of complexity. A KISS approach would be: if a custom role changes, we invalidate all sessions, forcing users to log in again. A custom role change is something that will happen once or twice during the entire lifecycle of a custom role. Once a role is defined, it is unlikely to ever change again. People (admins) who are capable of changing roles will not do it, since they are disconnected from the user base and cannot predict the consequences for users.

Contributor

Maybe not invalidate all sessions, but only the sessions associated with the updated role.
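
As a rough illustration of targeted invalidation, the sketch below tracks which role IDs each session was built with and drops only the sessions that reference a changed role. `SessionIndex` and its methods are hypothetical, kept in memory for brevity, and not safe for concurrent use; how Harbor actually stores sessions is not assumed here.

```go
package main

import "sort"

// SessionIndex is a hypothetical index from session ID to the set of role
// IDs that were evaluated when the session was created.
type SessionIndex struct {
	rolesBySession map[string]map[int64]struct{}
}

// NewSessionIndex returns an empty index.
func NewSessionIndex() *SessionIndex {
	return &SessionIndex{rolesBySession: make(map[string]map[int64]struct{})}
}

// Track records that a session depends on the given role IDs.
func (s *SessionIndex) Track(sessionID string, roleIDs ...int64) {
	set, ok := s.rolesBySession[sessionID]
	if !ok {
		set = make(map[int64]struct{})
		s.rolesBySession[sessionID] = set
	}
	for _, id := range roleIDs {
		set[id] = struct{}{}
	}
}

// InvalidateByRole removes every session that depends on roleID and
// returns the removed session IDs in sorted order.
func (s *SessionIndex) InvalidateByRole(roleID int64) []string {
	var dropped []string
	for sessionID, roles := range s.rolesBySession {
		if _, ok := roles[roleID]; ok {
			dropped = append(dropped, sessionID)
			delete(s.rolesBySession, sessionID) // deleting during range is safe in Go
		}
	}
	sort.Strings(dropped)
	return dropped
}

// Active reports whether a session is still tracked.
func (s *SessionIndex) Active(sessionID string) bool {
	_, ok := s.rolesBySession[sessionID]
	return ok
}
```

Sessions that never touched the edited role stay valid, so an edit to a rarely used custom role would not force every user to log in again.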

Author

Sorry for the confusion earlier — let me restate my point more clearly.

In current Harbor, when you assign, remove, or change a logged-in user's (or group's) project role, the change is not reflected immediately. It's evaluated at the next login and stored in the session scope — exactly the same mechanism the role feature uses.

I'd argue this existing behavior is actually more security-critical than custom role changes, because role assignment is a routine, day-to-day operation for Admins and ProjectAdmins. For example: if a user who is no longer trusted is removed from a project, that removal is also not reflected until their session refreshes. So the stale-cache window already exists today for the most common revocation case.

Given that, applying the same session mechanism to custom role modifications should be sufficient and consistent with current behavior — it introduces no new class of risk beyond what Harbor already accepts for role assignment.
If a mechanism to invalidate session information (enabling live propagation of administrative changes) is introduced in the future, the role feature can easily be updated to support it. Ideally that mechanism would cover both role assignment and role modification together, since they share the same underlying caching.

Member

@maxgraustenzel-create Hi, could you please clarify the Harbor scenario you mentioned earlier? I just tested a specific use case with two users: admin and test.
First, admin granted test permissions for Project A and Project B. Once test logged in, both projects were visible. Then, admin revoked these permissions and removed test from both projects. As a result, test immediately lost access to Projects A and B without needing to re-log in.

Contributor

@maxgraustenzel-create any updates?

**Owner:** Max Graustenzel
**Timeline:** Completed

### Phase 4: Security Validation (🔄 In Progress - 30%)

Contributor

Have we considered the following?

  • If a custom role includes ResourceRobot:ActionCreate, how do we prevent a user with this role from creating a robot account with permissions that exceed the user's custom role scope?
  • A user with role-management permissions within a project must not be allowed to assign custom roles that contain permissions higher than their own assigned role.

Author

I have implemented mechanisms that prevent privilege escalation (on frontend and backend).

  • Users can only create robots whose permissions are a subset of their own. Where necessary, I implemented a mapping between robot and role permissions.
  • Users can only assign roles whose permissions are a subset of their own.
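
The rule in both bullets amounts to a subset check. The sketch below shows one way to express it; `Permission`, `Exceeding`, and `CanGrant` are hypothetical names for illustration and do not reproduce the PR's implementation or the robot-to-role permission mapping mentioned above.

```go
package main

// Permission is a hypothetical (resource, action) pair; Harbor's real RBAC
// types are not reproduced here.
type Permission struct {
	Resource string
	Action   string
}

// Exceeding returns the requested permissions that are not present in held.
// An empty result means the request stays within the caller's own scope.
func Exceeding(held, requested []Permission) []Permission {
	have := make(map[Permission]struct{}, len(held))
	for _, p := range held {
		have[p] = struct{}{}
	}
	var extra []Permission
	for _, p := range requested {
		if _, ok := have[p]; !ok {
			extra = append(extra, p)
		}
	}
	return extra
}

// CanGrant reports whether a caller holding held may create a robot account
// or assign a role that carries the requested permissions.
func CanGrant(held, requested []Permission) bool {
	return len(Exceeding(held, requested)) == 0
}
```

Returning the exceeding permissions, rather than only a boolean, would let the API name exactly which permissions caused a rejection.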


**Performance Impact:**

- **Login:** +50-100ms (one-time permission load)

Contributor

We need to run a proper stress test with a sustained load of 500 requests, not just a single benchmark. We need accurate cost metrics, and our commercial deployments handle massive loads. Also, can you grab the CPU/Memory trends for the DB pod? I'm worried the current resource limits won't be enough after we bump Harbor to v2.16.

This data is essential for my evaluation of the solution.

Author

Could you please provide more information about the required stress test? Is there an existing scenario for evaluating login performance that I should reuse?

Contributor

We can use https://github.com/goharbor/perf for the stress test. cc @chlins, can you help?

Member

@maxgraustenzel-create Yes, you can use https://github.com/goharbor/perf to benchmark the Harbor APIs. Please refer to the README for usage, and feel free to reach out to me if you have any questions.

@bupd self-requested a review June 4, 2026 02:03
7 participants