Skip to content

Comments

Added ScanServer property for allowed tables#6146

Open
dlmarion wants to merge 4 commits intoapache:2.1from
dlmarion:6123-allowed-sserver-tables
Open

Added ScanServer property for allowed tables#6146
dlmarion wants to merge 4 commits intoapache:2.1from
dlmarion:6123-allowed-sserver-tables

Conversation

@dlmarion
Copy link
Contributor

Added property that allows user to configure which tables are allowed to be scanned by clients at the ScanServer group level. Property defaults to allowing all non-accumulo namespace tables.

Closes #6123

Added property that allows user to configure which tables
are allowed to be scanned by clients at the ScanServer
group level. Property defaults to allowing all non-accumulo
namespace tables.

Closes apache#6123
@dlmarion dlmarion added this to the 2.1.5 milestone Feb 23, 2026
@dlmarion dlmarion self-assigned this Feb 23, 2026
@dlmarion
Copy link
Contributor Author

I only perform the allowed table check in startScan/startMultiScan and did not add the checks into the continue methods. Doing so would cause an in-progress scan to fail on the property change.

@Experimental
SSERV_SCAN_ALLOWED_TABLES("sserver.scan.allowed.tables.group.", null, PropertyType.PREFIX,
"A regular expression that determines which tables are allowed to be scanned for"
+ " servers in the specified group. The property name should end with the scan server"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4.0, with the changes in #5749 would not need to group name in the property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct. There are probably other changes when merging in 4.0 as well.


// Visible for testing
protected boolean isAllowed(TableId tid) {
boolean result = allowedTables.containsKey(tid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to differentiate between not existing in the map and a table id is known to not be allowed. Seems like if a table is not allowed it will constantly regenrate the entire cache for all tables. Could make the value in the map a boolean, then can know all three states of allowed, not allowed, unknown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I implemented this in 881b810. I had to change the IT to wait for the server to notice the property change.

@dlmarion dlmarion linked an issue Feb 23, 2026 that may be closed by this pull request
// is a race condition in table creation and
// scan
updateAllowedTables(true);
result = allowedTables.get(tid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its possible the table id still may not exist in the map after the update call because that only works on known tables. A client could pass a table id to the scan server that no longer exists. Could do the following to protect against this and avoid a NPE when trying to convert null to a boolean primitave.

Suggested change
result = allowedTables.get(tid);
result = allowedTables.getOrDefault(tid, false);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approached this differently in f3ca130, want to get your thoughts on it.

throw new TException(
"Only the system user can perform eventual consistency scans on the root and metadata tables");
if (!isAllowed(extent.tableId())) {
throw new TException("Scan of table " + extent.tableId() + " disallowed by property: "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a preexisting problem. Throwing TException will probably result in a TappException on the client side. Need to throw a thrift exception defined on the RPC to get anything meaningful on the client side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 02b55d1

LOG.error(
"Property {} contains an invalid regular expression. Property value: {}. Using default pattern: {}",
propName, allowedTableRegex, DEFAULT_SCAN_ALLOWED_PATTERN);
allowedTablePattern = Pattern.compile(DEFAULT_SCAN_ALLOWED_PATTERN);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could fail all scans here (put false for all tables in the map) instead of falling back to the default. Falling back to the default may hide that the system is not working as configured/intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 02b55d1

TableId tid = e.getValue();
Matcher m = p.matcher(tname);
if (m.matches()) {
LOG.debug("Table {} can now be scanned via this ScanServer", tname);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the log every second for every table? If so should probably be trace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My plan is to adjust the logging before I merge. Leaving it as debug for now in case of testing issues.

scanner.setRange(new Range());
scanner.setConsistencyLevel(ConsistencyLevel.EVENTUAL);
scanner.setExecutionHints(Map.of("scan_type", "use_group1"));
Wait.waitFor(() -> Iterables.size(scanner) == 100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to check for failure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 02b55d1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add scan server property for allowed tables to scan

2 participants