Skip to content

issue-2639: [Filestore] Read and write compressed ShardId and ShardNodeName to NodeRefs#5545

Open
ruslan-gar wants to merge 3 commits intomainfrom
users/ruslan-gar/shard_id_compression
Open

issue-2639: [Filestore] Read and write compressed ShardId and ShardNodeName to NodeRefs#5545
ruslan-gar wants to merge 3 commits intomainfrom
users/ruslan-gar/shard_id_compression

Conversation

@ruslan-gar
Copy link
Copy Markdown
Collaborator

@ruslan-gar ruslan-gar commented Mar 20, 2026

Notes

ShardId and ShardNodeName can be read and written to NodeRefs table in a compressed format.
SjhardId now takes 3 bytes: 1 - compression type, 2 bytes for shardNo.
SharNodeName takes 16 bytes: it's just a GUID in the form of four 4-byte integers.
Both fields are written to the same columns as before.
In order to get the same ShardId as before, a filesystem id is passed the functions that read NodeRefs

How ShardId/ShardNodeName are written and read is controlled by:

enum EShardIdCompressionMode
{
// Only uncompressed ShartdId/ShardNodeName are read and written.
SICM_NO_COMPRESSION = 0;
// Both uncompressed and compressed forms are read.
SICM_READ = 1;
// SICM_READ + new ShartdId/ShardNodeName are compressed.
SICM_READ_WRITE = 2;
// SICM_READ_WRITE_COMPRESSED + conversion from uncompress to compressed
// form is started when TIndexTabletActor tablet starts.
SICM_READ_WRITE_CONVERT = 3;
}

SICM_READ_WRITE_CONVERT is not implemented in this PR.

Issue

#2639

@ruslan-gar ruslan-gar added filestore Add this label to run only cloud/filestore build and tests on PR large-tests Launch large tests for PR labels Mar 20, 2026
@github-actions
Copy link
Copy Markdown
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo target: cloud/filestore/ (test time: 7250s): all tests PASSED for commit 1dd8e56.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
3551 3551 0 0 0 0 0

@ruslan-gar ruslan-gar requested a review from debnatkh March 20, 2026 19:03
1);
1,
Config->GetShardIdCompressionMode(),
GetFileSystemId());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that this is the same as the Main filesystem? In cases of sharded directories, a shard with the name filesystem_s12 could hold an external reference to filesystem_s34

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep you need GetMainFileSystemId

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also pls add a test with directory sharding

#define lengthof(arr) sizeof(arr) / sizeof(arr[0])

template<typename TNumType>
void numberFromByteStream(TNumType& num, const char*& byteStream)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower case naming convention

#define lengthof(arr) sizeof(arr) / sizeof(arr[0])

template<typename TNumType>
void numberFromByteStream(TNumType& num, const char*& byteStream)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider const char*& -> const char*

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, let us return num in the return value?

}
}

inline bool IsEncoded(const TIndexTabletDatabase::TNodeRef& nodeRef)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be a TNodeRef Method?

inline bool IsEncoded(const TIndexTabletDatabase::TNodeRef& nodeRef)
{
return !nodeRef.ShardId.empty() &&
nodeRef.ShardId[0] == ShardIdAsBinaryStream;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us ensure human-readability here with isalnum or something like that?

And add a compile-time assert that ShardIdAsBinaryStream is not readable (in case we are going to add new formats

public:

TMaxTracker(size_t slots = 15)
TMaxTracker(size_t slots = 15)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?


config.SetMultiTabletForwardingEnabled(true);
config.SetStrictFileSystemSizeEnforcementEnabled(true);
config.SetShardBalancerPolicy(NProto::SBP_ROUND_ROBIN);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is ROUND_ROBIN by default

config.SetAutomaticShardCreationEnabled(true);
config.SetShardAllocationUnit(shardAllocationUnit);

const TString fsId = "computefilesystem-e0tdfty89a5fjdpg6k";
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let us use something more neutral


for (int writeModeNum =
NProto::EShardIdCompressionMode::SICM_NO_COMPRESSION;
writeModeNum <=
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth to introduce ::MAX value to be able to properly iterate?

NProto::TStorageConfig newStorageConfig;
auto setMode = [&](const NProto::EShardIdCompressionMode mode) {
newStorageConfig.SetShardIdCompressionMode(mode);
ChangeStorageConfig(fsId, newStorageConfig, service, true);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are going to move this test to the one for directory sharding, do not forget to apply it to all shards

But probably it is easier to change the runtime storage config and restart the tablet

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure it will be applied to the restarted tablets, though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

filestore Add this label to run only cloud/filestore build and tests on PR large-tests Launch large tests for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants