Wire NativeObjectStore through Store to DataFormatPlugin, IndexingEngine and ReaderManagers#7

Open
rayshrey wants to merge 1 commit into nishchay21:warm-composite-support from rayshrey:shard-level-object-store

Conversation

@rayshrey

DO NOT MERGE

Created PR only to get reviews on the NativeStore wiring.

PR Description

Summary

Wire a shard-scoped NativeStoreHandle (Rust ObjectStore pointer) through the storage and data format layers so that both the write path (IndexingEngine) and read path (ReaderManagers) can access it.
Phase 1 (this PR): plumbing only; the ObjectStore is not used yet.
Phase 2 (future): use the ObjectStore on the native side for reads and writes.

Motivation

The parquet data format plugin currently uses direct file I/O for reads and writes. To support warm indices (where files may not be present locally), we need an ObjectStore abstraction
that can route I/O to local or remote storage. This PR creates the wiring so that a future Phase 2 can plug in the actual ObjectStore implementation via FFM/Rust.

Flow

                                                                                                                                                                                              
                           ┌─────────────────────────────┐
                           │  RepositoriesService        │
                           │  (has Repository instances) │
                           └──────────┬──────────────────┘
                                      │ repo.getNativeStore()
                                      ▼
  ┌──────────────────────────────────────────────────────────────┐
  │  IndexService.createShard()                                  │
  │                                                              │
  │  NativeStoreRepository repoStore = resolveNativeStoreRepo()  │
  │  store = storeFactory.newStore(..., repoStore)               │
  │                        │                                     │
  └────────────────────────┼─────────────────────────────────────┘
                           │  Store carries NativeStoreRepository
                           ▼
  ┌──────────────────────────────────────────────────────────────┐
  │  DataFormatAwareEngine (per shard)                           │
  │                                                              │
  │  ┌─ registry.createNativeStore(format, store) ───────────┐   │
  │  │  └─► DataFormatPlugin.createNativeStore(store)        │   │
  │  │      └─► returns NativeStoreHandle (shard-scoped)     │   │
  │  └───────────────────────────────────────────────────────┘   │
  │                         │                                    │
  │            ┌────────────┴────────────┐                       │
  │            ▼                         ▼                       │
  │  ┌───────────────────┐     ┌─────────────────────┐           │
  │  │ IndexingEngine    │     │ ReaderManagers      │           │
  │  │ (write path)      │     │ (read path)         │           │
  │  │                   │     │                     │           │
  │  │ IndexingEngine    │     │ ReaderManagerConfig │           │
  │  │ Config has        │     │ has                 │           │
  │  │ NativeStoreHandle │     │ NativeStoreHandle   │           │
  │  └───────────────────┘     └─────────────────────┘           │
  │                                                              │
  │  closeNoLock(): shardNativeStore.close()                     │
  └──────────────────────────────────────────────────────────────┘
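
The ownership rule implied by the flow (one handle per shard, borrowed by both paths, closed once by the engine) can be sketched in Java as below. This is a minimal illustration, not the PR's code: every class is a simplified stand-in nested in a hypothetical `HandleLifecycleSketch`, and the constructor shapes and the idempotent `close()` are assumptions.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongConsumer;

/** Sketch only: all types are simplified, hypothetical stand-ins for the PR's classes. */
final class HandleLifecycleSketch {

    /** A native pointer paired with the callback that frees it. */
    static final class NativeStoreHandle implements AutoCloseable {
        private final long pointer;
        private final LongConsumer destroyer;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        NativeStoreHandle(long pointer, LongConsumer destroyer) {
            this.pointer = pointer;
            this.destroyer = destroyer;
        }

        long pointer() {
            if (closed.get()) {
                throw new IllegalStateException("native store already closed");
            }
            return pointer;
        }

        @Override
        public void close() {
            // Free the native object at most once, even if close() is called again.
            if (closed.compareAndSet(false, true)) {
                destroyer.accept(pointer);
            }
        }
    }

    /** Write-path config: borrows the handle and never closes it. */
    static final class IndexingEngineConfig {
        final NativeStoreHandle nativeStore;

        IndexingEngineConfig(NativeStoreHandle nativeStore) {
            this.nativeStore = nativeStore;
        }
    }

    /** Read-path config: borrows the same handle. */
    static final class ReaderManagerConfig {
        final NativeStoreHandle nativeStore;

        ReaderManagerConfig(NativeStoreHandle nativeStore) {
            this.nativeStore = nativeStore;
        }
    }

    /** Plays the role of DataFormatAwareEngine: the sole owner of the shard's handle. */
    static final class ShardEngine implements AutoCloseable {
        private final NativeStoreHandle shardNativeStore;
        final IndexingEngineConfig writeConfig;
        final ReaderManagerConfig readConfig;

        ShardEngine(NativeStoreHandle shardNativeStore) {
            this.shardNativeStore = shardNativeStore;
            this.writeConfig = new IndexingEngineConfig(shardNativeStore);
            this.readConfig = new ReaderManagerConfig(shardNativeStore);
        }

        @Override
        public void close() {
            // Mirrors closeNoLock(): the engine, not the configs, releases the handle.
            shardNativeStore.close();
        }
    }

    public static void main(String[] args) {
        NativeStoreHandle handle =
            new NativeStoreHandle(0xCAFEL, ptr -> System.out.println("destroyed " + ptr));
        try (ShardEngine engine = new ShardEngine(handle)) {
            System.out.println(engine.writeConfig.nativeStore == engine.readConfig.nativeStore);
        }
    }
}
```

Keeping ownership in one place means neither the write path nor the read path has to decide when the native object is freed.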

DataFormatPlugin.createNativeStore(Store) — what it will do

Called once per shard at engine creation time. Creates a shard-scoped ObjectStore (Rust side) and returns a NativeStoreHandle pointer to it. Both the write path and read path share this handle.

  DataFormatPlugin.createNativeStore(Store store)                                                                                                                                             
  │                                                                                                                                                                                           
  ├── store.getNativeStoreRepository()                                                                                                                                                        
  │   │                                                                                                                                                                                       
  │   ├── isLive() = true (remote-backed index)                                                                                                                                               
  │   │   │                                                                                                                                                                                   
  │   │   │  The NativeStoreRepository holds a repo-level ObjectStore                                                                                                                         
  │   │   │  (e.g., Arc<AmazonS3> for the entire S3 bucket).                                                                                                                                  
  │   │   │                                                                                                                                                                                   
  │   │   │  We need to scope it to this shard's prefix so the shard                                                                                                                          
  │   │   │  can only access its own files:                                                                                                                                                   
  │   │   │                                                                                                                                                                                   
  │   │   ├── Build shard prefix: "indices/{indexUUID}/{shardId}/parquet/"                                                                                                                    
  │   │   │                                                                                                                                                                                   
  │   │   ├── FFM call: RustBridge.createScopedStore(repoPtr, shardPrefix)                                                                                                                    
  │   │   │   └── Rust: PrefixStore::new(Arc::clone(repo_store), prefix)                                                                                                                      
  │   │   │       └── Returns Arc<dyn ObjectStore> scoped to shard path                                                                                                                       
  │   │   │                                                                                                                                                                                   
  │   │   └── return new NativeStoreHandle(scopedPtr, RustBridge::destroyStore)                                                                                                               
  │   │                                                                                                                                                                                       
  │   └── isLive() = false (local-only index, no remote repo)                                                                                                                                 
  │       │                                                                                                                                                                                   
  │       ├── FFM call: RustBridge.createLocalStore(shardPath.getDataPath())                                                                                                                  
  │       │   └── Rust: LocalFileSystem::new_with_prefix(shard_data_path)                                                                                                                     
  │       │       └── Returns Arc<dyn ObjectStore> rooted at shard path                                                                                                                       
  │       │                                                                                                                                                                                   
  │       └── return new NativeStoreHandle(localPtr, RustBridge::destroyStore)                                                                                                                
  │                                                                                                                                                                                           
  │  The returned NativeStoreHandle is:                                                                                                                                                       
  │  - Passed to IndexingEngine via IndexingEngineConfig (writes)                                                                                                                             
  │  - Passed to ReaderManagers via ReaderManagerConfig (reads)                                                                                                                               
  │  - Closed by DataFormatAwareEngine.closeNoLock() on shard close                                                                                                                           
  │    └── Rust: drops the Arc<dyn ObjectStore> (PrefixStore or LocalFileSystem)                                                                                                              
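
The branching above could look roughly like the following Java sketch. It is an illustration under assumptions: `RustBridge` is modeled as a plain interface rather than real FFM downcalls, `RecordingBridge` is a hypothetical fake that only logs calls, and the parameter lists of `createNativeStore` and `NativeStoreRepository` are simplified guesses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

/** Sketch only: hypothetical stand-ins; no real FFM or Rust calls are made. */
final class CreateNativeStoreSketch {

    /** Assumed shape of the bridge; the real one would cross into Rust via FFM. */
    interface RustBridge {
        long createScopedStore(long repoPtr, String shardPrefix);
        long createLocalStore(String shardDataPath);
        void destroyStore(long ptr);
    }

    /** Fake bridge that records calls instead of touching native memory. */
    static final class RecordingBridge implements RustBridge {
        final List<String> calls = new ArrayList<>();

        @Override
        public long createScopedStore(long repoPtr, String shardPrefix) {
            calls.add("scoped:" + repoPtr + ":" + shardPrefix);
            return 11L;
        }

        @Override
        public long createLocalStore(String shardDataPath) {
            calls.add("local:" + shardDataPath);
            return 22L;
        }

        @Override
        public void destroyStore(long ptr) {
            calls.add("destroy:" + ptr);
        }
    }

    /** Repo-level store reference; isLive() is true only for remote-backed indices. */
    static final class NativeStoreRepository {
        final boolean live;
        final long pointer;

        NativeStoreRepository(boolean live, long pointer) {
            this.live = live;
            this.pointer = pointer;
        }

        boolean isLive() {
            return live;
        }
    }

    static final class NativeStoreHandle implements AutoCloseable {
        final long pointer;
        private final LongConsumer destroyer;

        NativeStoreHandle(long pointer, LongConsumer destroyer) {
            this.pointer = pointer;
            this.destroyer = destroyer;
        }

        @Override
        public void close() {
            destroyer.accept(pointer);
        }
    }

    /** Builds the shard prefix in the form shown in the flow above. */
    static String shardPrefix(String indexUuid, int shardId) {
        return "indices/" + indexUuid + "/" + shardId + "/parquet/";
    }

    static NativeStoreHandle createNativeStore(RustBridge bridge, NativeStoreRepository repo,
                                               String indexUuid, int shardId, String shardDataPath) {
        long ptr = repo.isLive()
            ? bridge.createScopedStore(repo.pointer, shardPrefix(indexUuid, shardId)) // remote-backed
            : bridge.createLocalStore(shardDataPath);                                 // local-only
        return new NativeStoreHandle(ptr, bridge::destroyStore);
    }

    public static void main(String[] args) {
        RecordingBridge bridge = new RecordingBridge();
        try (NativeStoreHandle h = createNativeStore(
                bridge, new NativeStoreRepository(true, 5L), "uuid", 0, "/data/0")) {
            System.out.println("scoped pointer: " + h.pointer);
        }
        System.out.println(bridge.calls);
    }
}
```

Both branches return the same handle type, so the caller does not need to know which backend sits behind the pointer.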

Description

[Describe what this change achieves]

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…ine and ReaderManagers

Phase 1: Plumbing only — creates the wiring for a shard-scoped NativeStoreHandle
to flow from the repository layer through to the data format write and read paths.
No actual ObjectStore usage yet (createNativeStore returns EMPTY).

Changes:
- Store: add NativeStoreRepository field (repo-level, borrowed from Repository)
- StoreFactory: the 6- and 7-arg newStore overloads become default methods; the 8-arg overload taking NativeStoreRepository is abstract
- IndexService.createShard: resolve NativeStoreRepository from RepositoriesService
- DataFormatPlugin: add createNativeStore(Store) default method
- DataFormatRegistry: add createNativeStore(format, store), update getReaderManagers
- DataFormatAwareEngine: orchestrate creation, pass handle to engine and readers, close on shutdown
- IndexingEngineConfig/ReaderManagerConfig: carry NativeStoreHandle (shard-scoped)
- ParquetDataFormatPlugin: override createNativeStore with Phase 2 TODOs
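
The StoreFactory change in the list above follows a common overload pattern: older, shorter signatures become default methods that delegate to the one new abstract method. A minimal Java sketch, with arity cut down to 1/2/3 arguments for readability; the parameter types and the `NativeStoreRepository.EMPTY` fallback are assumptions, not the PR's actual signatures.

```java
/** Sketch only: reduced-arity stand-in for the 6/7/8-arg newStore overloads. */
final class StoreFactorySketch {

    static final class NativeStoreRepository {
        /** Assumed sentinel for "no remote repository". */
        static final NativeStoreRepository EMPTY = new NativeStoreRepository("none");
        final String name;

        NativeStoreRepository(String name) {
            this.name = name;
        }
    }

    static final class Store {
        final String shardPath;
        final String directoryFactory;
        final NativeStoreRepository nativeStoreRepository;

        Store(String shardPath, String directoryFactory, NativeStoreRepository nativeStoreRepository) {
            this.shardPath = shardPath;
            this.directoryFactory = directoryFactory;
            this.nativeStoreRepository = nativeStoreRepository;
        }
    }

    interface StoreFactory {
        /** Oldest overload: delegates with a placeholder directory factory. */
        default Store newStore(String shardPath) {
            return newStore(shardPath, "default");
        }

        /** Middle overload: existing callers get an EMPTY repository. */
        default Store newStore(String shardPath, String directoryFactory) {
            return newStore(shardPath, directoryFactory, NativeStoreRepository.EMPTY);
        }

        /** The only method implementers must provide. */
        Store newStore(String shardPath, String directoryFactory, NativeStoreRepository repo);
    }

    public static void main(String[] args) {
        StoreFactory factory = Store::new; // one abstract method, so a constructor reference works
        Store legacy = factory.newStore("/data/0");
        System.out.println(legacy.nativeStoreRepository == NativeStoreRepository.EMPTY);
    }
}
```

Existing call sites keep compiling, while each factory implementation only has to handle the widest signature.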
Comment on lines +217 to +219
// Plugin creates shard-scoped native store from the Store's repository reference
this.shardNativeStore = registry.createNativeStore(format, config().getStore());


I believe createNativeStore should return a NativeStoreRepository instead of a NativeStoreHandle. The handle should be a lower-level primitive that interacts with the FFM layer.

Owner


IMO, consider passing the shard-scoped ObjectStore handle directly instead of the repository-level NativeStoreRepository. The plugin shouldn't need to know about repositories; it just needs an ObjectStore to read and write parquet files.

Think of how Lucene works with FilterDirectory rather than RemoteDirectory. @Bukhtawar, your thoughts?


OnClose onClose,
ShardPath shardPath,
IndexStorePlugin.DirectoryFactory directoryFactory,
NativeStoreRepository nativeStoreRepository
Owner


Store is a shard-level storage abstraction: it holds the Directory for Lucene. For native formats, it should hold a NativeDirectory (a shard-scoped object store) rather than a NativeStoreRepository (a repo-level handle). The repository is a cluster-level concept that shouldn't leak into the shard store directly.

Owner


Suggested pattern: mirror how Directory comes from DirectoryFactory.

  • NativeDirectoryFactory resolves the native pointer, creates the native object store, and returns a NativeDirectory.
  • Store holds the NativeDirectory and doesn't know about repositories.
  • Later, the underlying pointer can be swapped (for a different backend) without changing Store.
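
One way to read the suggested pattern is the Java sketch below. `NativeDirectory` and `NativeDirectoryFactory` are names taken from the comment, but their methods, the `PointerDirectory` helper, and the `Store` constructor are hypothetical; nothing here reflects an existing API.

```java
import java.util.function.LongConsumer;

/** Sketch only: hypothetical shapes for the reviewer's NativeDirectory suggestion. */
final class NativeDirectorySketch {

    /** Shard-scoped native object store, analogous to a Lucene Directory. */
    interface NativeDirectory extends AutoCloseable {
        long pointer();

        @Override
        void close();
    }

    /** Resolves the native pointer and builds the directory; the backend choice lives here. */
    interface NativeDirectoryFactory {
        NativeDirectory newNativeDirectory(String shardPath);
    }

    /** Simple directory backed by a pointer and a destructor callback. */
    static final class PointerDirectory implements NativeDirectory {
        private final long pointer;
        private final LongConsumer destroyer;
        private boolean closed;

        PointerDirectory(long pointer, LongConsumer destroyer) {
            this.pointer = pointer;
            this.destroyer = destroyer;
        }

        @Override
        public long pointer() {
            return pointer;
        }

        @Override
        public synchronized void close() {
            if (!closed) {
                closed = true;
                destroyer.accept(pointer);
            }
        }
    }

    /** Store sees only NativeDirectory; it has no reference to any repository type. */
    static final class Store implements AutoCloseable {
        private final NativeDirectory nativeDirectory;

        Store(String shardPath, NativeDirectoryFactory factory) {
            this.nativeDirectory = factory.newNativeDirectory(shardPath);
        }

        NativeDirectory nativeDirectory() {
            return nativeDirectory;
        }

        @Override
        public void close() {
            nativeDirectory.close();
        }
    }

    public static void main(String[] args) {
        // Two backends, same Store code: only the factory differs.
        NativeDirectoryFactory local = path -> new PointerDirectory(1L, p -> {});
        NativeDirectoryFactory remote = path -> new PointerDirectory(2L, p -> {});
        try (Store a = new Store("/data/0", local); Store b = new Store("/data/0", remote)) {
            System.out.println(a.nativeDirectory().pointer() + " " + b.nativeDirectory().pointer());
        }
    }
}
```

Swapping the backend then means supplying a different factory, which is the property the comment asks for.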

nishchay21 force-pushed the warm-composite-support branch 8 times, most recently from 9f3e9a5 to f78e480 (April 20, 2026, 16:47)