core: implement new authorize SPI in PolarisAuthorizerImpl and OpaPolarisAuthorizer#3999
core: implement new authorize SPI in PolarisAuthorizerImpl and OpaPolarisAuthorizer#3999sungwy wants to merge 5 commits intoapache:mainfrom
PolarisAuthorizerImpl and OpaPolarisAuthorizer#3999Conversation
| import org.apache.polaris.core.persistence.ResolvedPolarisEntity; | ||
|
|
||
| /** Lexical fully qualified resource path split into leaf and ordered parent segments. */ | ||
| public record FullyQualifiedPath(PathSegment leaf, List<PathSegment> parents) { |
There was a problem hiding this comment.
Note to reviewer: this class was needed to absorb the logic required in converting a PolarisSecurable to a fully qualified path representation that included the correct parents and their respective PolarisEntityType that could be constructed and inserted into the OPA request payload.
I'm on the fence as to whether we'd want this to be the representation in AuthorizationRequest instead of PolarisSecurable. Leaning towards keeping them separate for now.
There was a problem hiding this comment.
But this class is in polaris-core, which tends to indicate that it's not only for the OPA authorizer. Having both models in polaris-core seems a bit overkill.
This model is imho richer than PolarisSecurable. I wonder if you couldn't merge both models together, e.g.:
@PolarisImmutable
public interface PolarisSecurable {
/** Returns the path segments that identify the securable, in hierarchical order. */
@Nonnull
List<PathSegment> getPathSegments();
default PathSegment getLeafPathSegment() {
return getPathSegments().getLast();
}
@Value.Check
default void validate() {
Preconditions.checkState(!getPathSegments().isEmpty(), "path segments must not be empty");
}
}There was a problem hiding this comment.
Having both models in polaris-core seems a bit overkill.
I thought about this as well, I'm currently leaning towards keeping it in polaris-core because new PolarisAuthorizer implementations like the Ranger Authorizer may also need this.
This model is imho richer than PolarisSecurable. I wonder if you couldn't merge both models together, e.g.:
I'm open to this idea. In order to correctly construct the parent though, we'd need to pass the referenceCatalogName to an arg referenceCatalogName arg here like
default PathSegment getParents(@Nullable String referenceCatalogName) {
...
}
Would that make sense?
There was a problem hiding this comment.
I tend to think PolarisSecurable must know its catalog name anyway. Permission checks in Polaris are always performed within the context of a catalog 🤔 WDYT?
There was a problem hiding this comment.
Yeah that's a good point.
I'm starting to think that the shape of FullyQualifiedPath is more in line with what we'd want to have in place of PolarisSecurable in the AuthorizationRequest, instead of having it represented as a PolarisEntityType and a List of String as its name.
Having clear PathSegment with ordered parents and reference catalog is a better representation of the resource... What do you folks think about just replacing PolarisSecurable with FullyQualifiedPath's shape instead?
There was a problem hiding this comment.
What do you folks think about just replacing PolarisSecurable with FullyQualifiedPath's shape instead?
Works for me 👍
| // Omit ROOT from fully qualified paths. It is an internal RBAC-only ancestor | ||
| if (resolvedParent.getEntity().getType() == PolarisEntityType.ROOT) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Note to reviewer: important behavior change to remove ROOT semantics from OPA request payload, since this is an RBAC-only semantic
There was a problem hiding this comment.
Ack, but we are introducing a specific behavior in polaris-core. Shouldn't we instead keep the ROOT entity here, and let specific authorizers filter it out? I feel like the ROOT case deserves some more thinking. It's also troubling that the current code sometimes prepends it, and sometimes not.
There was a problem hiding this comment.
I fully agree.
public static FullyQualifiedPath of(PolarisResolvedPathWrapper wrapper)
This is only a temporary constructor introduced for compatibility, and will be removed from FullyQualifiedPath in the future. Which means ROOT semantics will no longer be in this module.
There was a problem hiding this comment.
Note to reviewer: large refactoring here to move PolarisPrivilege semantics out of PolarisAuthorizableOperation, and into a completely separate module specific to RBAC instead. See RbacOperationSemantics.java
There was a problem hiding this comment.
+1 to this sub-section of the current PR. Does it depend on other changes a lot? I think it might be preferable to do this part in a dedicated PR. This way, OPA changes will hopefully be isolated from the internal RBAC changes in that commit (or at least less intertwined) WDYT?
There was a problem hiding this comment.
Honestly, I had that thought as well. I'm happy to do it in a dedicated PR...
There was a problem hiding this comment.
sweet, let's merge that first :)
| TargetType targetType) { | ||
| boolean prependRootContainer = | ||
| switch (targetType) { | ||
| case TARGET -> semantics.targetScope() == PathEvaluationScope.ROOT; |
There was a problem hiding this comment.
I'm a bit confused here. Every operation uses PathEvaluationScope.ROOT except the policy attach/detach operations which use CATALOG.
Is this correct for catalog-scoped operations like LIST_NAMESPACES, LIST_TABLES, ...
These operations semantically happen within a catalog context. If the existing authorizeOrThrow callers already prepend the root container themselves, then ROOT make sense for backward compatibility. Maybe worth to add a comment here to clarify.
There was a problem hiding this comment.
Hi @jbonofre - thanks for the early feedback on the PR :)
I was confused initially as well, so your confusion isn't unfounded. the ROOT scope here refers to whether ROOT container should be prepended to the resolved path, and hence whether the ROOT container resource should be included for privilege evaluation.
For example, PolarisAuthorizableOperation.LIST_NAMESPACES prepends root container:
PolarisAuthorizableOperation.GET_APPLICABLE_POLICIES_ON_CATALOG sets prependRootContainer to false.
I think the name PathEvaluationScope may be misleading. I'll think of a better name for this enum
| import org.apache.polaris.core.persistence.ResolvedPolarisEntity; | ||
|
|
||
| /** Lexical fully qualified resource path split into leaf and ordered parent segments. */ | ||
| public record FullyQualifiedPath(PathSegment leaf, List<PathSegment> parents) { |
There was a problem hiding this comment.
But this class is in polaris-core, which tends to indicate that it's not only for the OPA authorizer. Having both models in polaris-core seems a bit overkill.
This model is imho richer than PolarisSecurable. I wonder if you couldn't merge both models together, e.g.:
@PolarisImmutable
public interface PolarisSecurable {
/** Returns the path segments that identify the securable, in hierarchical order. */
@Nonnull
List<PathSegment> getPathSegments();
default PathSegment getLeafPathSegment() {
return getPathSegments().getLast();
}
@Value.Check
default void validate() {
Preconditions.checkState(!getPathSegments().isEmpty(), "path segments must not be empty");
}
}| // Omit ROOT from fully qualified paths. It is an internal RBAC-only ancestor | ||
| if (resolvedParent.getEntity().getType() == PolarisEntityType.ROOT) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Ack, but we are introducing a specific behavior in polaris-core. Shouldn't we instead keep the ROOT entity here, and let specific authorizers filter it out? I feel like the ROOT case deserves some more thinking. It's also troubling that the current code sometimes prepends it, and sometimes not.
| @Nullable | ||
| @Value.Default | ||
| default String getReferenceCatalogName() { | ||
| return null; | ||
| } |
There was a problem hiding this comment.
This should be enough:
| @Nullable | |
| @Value.Default | |
| default String getReferenceCatalogName() { | |
| return null; | |
| } | |
| @Nullable | |
| String getReferenceCatalogName(); |
But how about using an Optional here?
| ResolvedPathRooting targetScope, | ||
| ResolvedPathRooting secondaryScope) { | ||
|
|
||
| private RbacOperationSemantics( |
There was a problem hiding this comment.
Not sure we need this constructor, why don't you update RBAC_SEMANTICS_BY_OPERATION using EnumSet as below:
RBAC_SEMANTICS_BY_OPERATION.put(
PolarisAuthorizableOperation.LIST_NAMESPACES,
new RbacOperationSemantics(EnumSet.of(NAMESPACE_LIST), null, ResolvedPathRooting.ROOT, null));There was a problem hiding this comment.
It was more for convenience / beautification. Do you think using EnumSet directly is better? I'm okay either way
| EnumSet<PolarisPrivilege> targetPrivileges, | ||
| EnumSet<PolarisPrivilege> secondaryPrivileges, | ||
| ResolvedPathRooting targetScope, | ||
| ResolvedPathRooting secondaryScope) { |
There was a problem hiding this comment.
secondaryScope is either null (90% of cases), or identical to targetScope. Can we use one single ResolvedPathRooting maybe?
There was a problem hiding this comment.
Yeah that's a great insight, but I also didn't want to make assumptions so I went with a safer approach of keeping them separate (feels more logical to do so)
On second thought, maybe we keep them as the same for now, since this is an internally used SPI, maybe it's okay to change it to decouple them in the future if we need to
| import org.apache.polaris.core.persistence.ResolvedPolarisEntity; | ||
|
|
||
| /** Lexical fully qualified resource path split into leaf and ordered parent segments. */ | ||
| public record FullyQualifiedPath(PathSegment leaf, List<PathSegment> parents) { |
There was a problem hiding this comment.
nit: swap arguments maybe? In JRE's Path parents go first 🤔
There was a problem hiding this comment.
Good point, but parents could be an empty list though. Feels counterintuitive to have an empty list as the first argument as well
There was a problem hiding this comment.
How about we only have one List<PathSegment> in this record and add convenience methods for creating child paths based on an existing parent path?
There was a problem hiding this comment.
That sounds great actually
There was a problem hiding this comment.
Would a linked-list-like approach be more appropriate? E.g.
public record PathSegment(PolarisEntityType entityType, String name, @Nullable PathSegment parent) {
public PathSegment(PolarisEntityType entityType, String name) {
this(entityType, name, null);
}
public String fullyQualifiedName() {
return parent == null ? name : parent.fullyQualifiedName() + "/" + name;
}
}Then we might not need FullyQualifiedPath at all (but path traversals may be less efficient).
There was a problem hiding this comment.
We can probably change recursion to iteration if it becomes critical.
There was a problem hiding this comment.
I'm fine with either approach.
Minor comment: if we go with linked lists, I think it might be preferable to name the object something like EntityPath rather than PathSegment because it contains full path data down to the root.
|
Hi @sungwy , is this PR ready for review? |
|
Hi @flyrain - not yet, I will need to update the branch with some recent changes. I've been at KubeCon this week, I'll get back to it soon :) |
# Conflicts: # polaris-core/src/main/java/org/apache/polaris/core/auth/PathSegment.java # polaris-core/src/main/java/org/apache/polaris/core/auth/PolarisAuthorizableOperation.java # polaris-core/src/main/java/org/apache/polaris/core/auth/PolarisAuthorizerImpl.java # polaris-core/src/main/java/org/apache/polaris/core/auth/PolarisSecurable.java # polaris-core/src/main/java/org/apache/polaris/core/auth/RbacOperationSemantics.java # polaris-core/src/test/java/org/apache/polaris/core/auth/AuthorizationRequestTest.java # polaris-core/src/test/java/org/apache/polaris/core/auth/RbacOperationSemanticsTest.java
This PR implements phases 2 and 3 of the authorization refactor tracking issue. (#3779)
It introduces request-based authorization with
AuthorizationRequest, in bothPolarisAuthorizerImplandOpaPolarisAuthorizer. The new SPI methodauthorizeis still not called by Polaris endpoint handlers. That will be done in phase 4 and 5 of the authorization refactoring effort. Hence, no behavior changes are expected in existing endpoint handling, except forROOThandling inOpaPolarisAuthorizernoted below.In order to decouple RBAC semantics from Polaris's core, we move RBAC operation semantics like Privileges and newly introduced Scope semantic into
RbacOperationSemantics. The PR also addsFullyQualifiedPathas a lexical path with parent representation thatOpaPolarisAuthorizercan use to construct the OPA request input payload.Breaking change
OPA input payloads no longer include
ROOTin resource parent chains. This is intentional, sinceROOTis an internal Polaris RBAC semantic and should not be part of the external OPA request contractChecklist
CHANGELOG.md(if needed)site/content/in-dev/unreleased(if needed)