Conversation
I think this is about done. This was a massive task, but the outcome is great: the database is now smaller and faster for both reading and writing (in practice, some operations on some tables may be slower). This will probably take some time to review. The old docs have all been removed, as I didn't want to spend time updating them while everything was still changing; it's the same with the lower-level tests, though the higher-level tests are still present. The format of the code is still roughly the same.
Otherwise, this will error in debug builds.
| #[inline]
| pub fn remove_tx(tx_hash: &TxHash, tables: &mut impl TablesMut) -> DbResult<(TxId, Transaction)> {
|     //------------------------------------------------------ Transaction data
|     let tx_id = tables.tx_ids_mut().take(tx_hash)?;
main used to have `.take()`, which deletes the tx_hash; I don't see `remove_tx_from_dynamic_tables` deleting the tx_hash from `tx_ids`, only key images and outputs.
Edit: it looks like this impacts `add_alt_transaction_blob` in `src/ops/alt_block/tx.rs` as well: it sees the tx hash exists and skips adding the blob.
Oops, I forgot to add the explicit removal when I moved to a write batch instead of an atomic tx; it should be fixed now.
| monero_oxide::io::VarInt::write(&block_txs.len(), &mut block)
|     .expect("The number of txs per block will not exceed u64::MAX");
| let cumulative_rct_outs = tapes
|     .read_entry(&db.block_infos, block_height as u64 - 1)?
nit, to match L215
| .read_entry(&db.block_infos, block_height as u64 - 1)?
| .read_entry(&db.block_infos, block_height.saturating_sub(1) as u64)?
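The suggestion is more than a style point: in debug builds a bare `- 1` on an unsigned zero panics, which ties back to the "this will error in debug builds" note above. A minimal self-contained sketch (not the PR's code) of the behavioral difference:

```rust
// Sketch: why `saturating_sub` is preferred over bare `- 1` here.
// In debug builds, `0u64 - 1` panics with "attempt to subtract with overflow";
// `saturating_sub` clamps at zero instead of panicking.
fn prev_height(block_height: u64) -> u64 {
    block_height.saturating_sub(1)
}

fn main() {
    assert_eq!(prev_height(0), 0); // clamps instead of panicking
    assert_eq!(prev_height(10), 9);
}
```

Note that in release builds the bare subtraction silently wraps to `u64::MAX`, which would read a nonsense tape entry rather than panic.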
| let top_block_height = tapes
|     .fixed_sized_tape_len(&db.block_infos)
|     .expect("Required tape not open")
|     - 1;
style nit: could port the same defensive check from chain_height
It's unnecessary since the genesis block always gets added, but good for consistency.
| let top_block_height = tapes
|     .fixed_sized_tape_len(&db.block_infos)
|     .expect("Required tape not open")
|     - 1;
| let chain_height = tapes
|     .fixed_sized_tape_len(&db.block_infos)
|     .expect("Required tape not open");
| if chain_height == 0 {
|     return Err(BlockchainError::NotFound);
| }
| let top_block_height = chain_height - 1;
| block_height(db, &tx_ro, &first_known_block_hash)?.ok_or(BlockchainError::NotFound)?;
| let chain_height = crate::ops::blockchain::chain_height(table_block_heights)?;
| let chain_height = crate::ops::blockchain::chain_height(db, &tapes)?;
same here
| let chain_height = crate::ops::blockchain::chain_height(db, &tapes)?;
| if chain_height == 0 {
|     return Err(BlockchainError::NotFound);
| }
| let height = u64_to_usize(
|     tapes
|         .fixed_sized_tape_len(&db.block_infos)
|         .expect("Require tape not found")
| table_block_heights.len().map(|height| height as usize)
| Ok(tapes
|     .fixed_sized_tape_len(&db.block_infos)
|     .expect("Required tape must exists") as usize)
here too
| .expect("Required tape must exists") as usize)
| .expect("Required tape must exist") as usize)
| tapes.commit(Persistence::Buffer)?;
|
| let mut pre_rct_numb_outputs_cache = db.pre_rct_numb_outputs_cache.lock().unwrap();
|
| let mut tx_rw = db.fjall.batch().durability(Some(PersistMode::Buffer));
|
| for block in blocks {
|     crate::ops::block::add_block_to_dynamic_tables(
|         db,
|         &block.block,
|         &block.block_hash,
|         block.txs.iter().map(|tx| Cow::Borrowed(&tx.tx)),
|         &mut numb_transactions,
|         &mut tx_rw,
|         &mut pre_rct_numb_outputs_cache,
|     )?;
| }
|
| tx_rw.commit()?;
note: with graceful shutdown, all errors after the tapes commit but before fjall successfully commits should be distinguished, since we are out of sync after that point. `BlockchainError::DatabaseOutOfSync`?
handler.rs panics on error now, so this isn't an issue right now, but graceful shutdown changes this.
idea: a recovery function that replays only the out-of-sync blocks from the tapes into fjall when we get that kind of error, instead of a full chain nuke on desync.
Yeah, I was going to do that originally; however, a full fjall nuke is much easier to handle and is not too slow. You can test it by just deleting the fjall DB but keeping the tapes.
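The replay idea from the comment above can be sketched with toy in-memory stand-ins for the two stores (the real `Tapes`/fjall types and APIs differ; all names here are illustrative). The tapes are the append-only source of truth, and the index is assumed to hold a prefix of them:

```rust
use std::collections::HashMap;

// Toy stand-ins: `Tapes` is the append-only source of truth,
// `Index` plays the role of the fjall keyspaces.
struct Tapes {
    blocks: Vec<[u8; 32]>, // block hash per height
}
struct Index {
    heights: HashMap<[u8; 32], usize>, // hash -> height
}

/// Replay blocks present in the tapes but missing from the index.
/// Assumes the index always holds a prefix of the tapes (a commit-order
/// invariant the real DB would have to guarantee). Returns how many
/// blocks were replayed.
fn resync(tapes: &Tapes, index: &mut Index) -> usize {
    let synced = index.heights.len();
    for (height, hash) in tapes.blocks.iter().enumerate().skip(synced) {
        index.heights.insert(*hash, height);
    }
    tapes.blocks.len() - synced
}

fn main() {
    let tapes = Tapes { blocks: vec![[1; 32], [2; 32], [3; 32]] };
    // Index is one block behind the tapes after a crash mid-commit.
    let mut index = Index { heights: HashMap::from([([1; 32], 0)]) };
    assert_eq!(resync(&tapes, &mut index), 2);
    assert_eq!(index.heights[&[3; 32]], 2);
}
```

This only replays the out-of-sync tail, which is why it would be cheaper than the full fjall rebuild, at the cost of the extra invariant bookkeeping the author mentions.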
| /// - The blockchain has 0 blocks => this returns `Err(BlockchainError::KeyNotFound)`
| /// - The blockchain has 1 block (height 0) => this returns `Ok(0)`
| /// - The blockchain has 2 blocks (height 1) => this returns `Ok(1)`
| ///
| /// Note that in cases where no blocks have been written to the
| /// database yet, an error is returned: `Err(RuntimeError::KeyNotFound)`.
| /// database yet, an error is returned: `Err(BlockchainError::KeyNotFound)`.
nit: I don't think you'll care to change this, but in the storage/ diffs there are many `u32`/`u64`/`usize` `as` casts that could use `cuprate_helper::cast`, which is already in scope.
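For context, the helper pattern being suggested looks roughly like this (an illustrative sketch, not the actual `cuprate_helper::cast` source): named conversion functions behind a compile-time width guard, instead of bare `as` casts that silently truncate on narrower targets.

```rust
// Sketch of the cast-helper pattern (illustrative, not the real
// `cuprate_helper::cast` code): a compile-time guard makes the
// `as` casts provably lossless on the supported targets.
const _: () = assert!(usize::BITS >= 64, "helpers assume a 64-bit target");

fn u64_to_usize(u: u64) -> usize {
    u as usize // lossless given the guard above
}

fn usize_to_u64(u: usize) -> u64 {
    u as u64
}

fn main() {
    let height: u64 = 1_000_000;
    assert_eq!(usize_to_u64(u64_to_usize(height)), height);
}
```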
| // Split the blocks at the point the pruning stripe changes.
| let start_height = blocks[0].height;
| let first_block_pruning_seed = cuprate_pruning::DecompressedPruningSeed::new(
|     cuprate_pruning::get_block_pruning_stripe(start_height, usize::MAX, 3).unwrap(),
nit:
rg "usize::MAX, 3"
storage/blockchain/src/ops/tx.rs
259: cuprate_pruning::get_block_pruning_stripe(tx_info.height, usize::MAX, 3).unwrap();
storage/blockchain/src/ops/block.rs
111: cuprate_pruning::get_block_pruning_stripe(start_height, usize::MAX, 3).unwrap(),
143: cuprate_pruning::get_block_pruning_stripe(blocks[0].height, usize::MAX, 3).unwrap();
405: let stripe = cuprate_pruning::get_block_pruning_stripe(block_height, usize::MAX, 3).unwrap();
// cuprate_pruning
pub fn get_default_block_pruning_stripe(block_height: usize) -> u32 {
    get_block_pruning_stripe(
        block_height,
        usize::MAX,
        CRYPTONOTE_PRUNING_LOG_STRIPES,
    )
    .unwrap()
}
| /// Adds a [`VerifiedTransactionInformation`] from an alt-block
| /// if it is not already in the DB.
| /// Adds a [`VerifiedTransactionInformation`] from an alt-block if it is not already in the DB.
nit: there are probably a lot of inaccurate docs at this point; this function replaces the entry unconditionally.
| let cumulative_rct_outs = tapes
|     .read_entry(&db.block_infos, block_height as u64 - 1)?
Call stack:
`cuprated::blockchain::manager::pop_blocks` → `BlockchainWriteRequest::PopBlocks` → `cuprate_blockchain::service::write::pop_blocks` → `cuprate_blockchain::ops::block::pop_blocks`
AFAICT input that can pop the genesis is allowed? I think this will underflow.
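One way to close this off would be a guard on the pop count before the loop ever runs, so the `block_height - 1` read can never underflow. A hypothetical sketch (names illustrative, not the PR's API):

```rust
// Hypothetical guard: reject a pop count that would remove the genesis
// block, so the `block_height as u64 - 1` read can never underflow.
fn check_pop_count(chain_height: u64, numb_blocks: u64) -> Result<(), &'static str> {
    // `chain_height` blocks exist (heights 0..chain_height);
    // always keep at least the genesis at height 0.
    if numb_blocks >= chain_height {
        return Err("cannot pop the genesis block");
    }
    Ok(())
}

fn main() {
    assert!(check_pop_count(3, 2).is_ok()); // pops heights 2 and 1
    assert!(check_pop_count(3, 3).is_err()); // would pop the genesis
    assert!(check_pop_count(1, 1).is_err());
}
```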
| /// **THIS IS NOT ATOMIC**
| ///
| pub fn update_alt_chain_info(
|     db: &BlockchainDatabase,
Shouldn't this be given a `tx_rw: &mut fjall::OwnedWriteBatch`? It is only called from `add_alt_block`, which has one open already.
| fn pop_blocks(db: &BlockchainDatabase, numb_blocks: usize) -> ResponseResult {
|     let mut tapes = db.linear_tapes.truncate();
|     let mut tx_rw = db.fjall.batch();
|
|     // flush all the current alt blocks as they may reference blocks to be popped.
|     crate::ops::alt_block::flush_alt_blocks(db)?;
|
|     // generate a `ChainId` for the popped blocks.
|     let old_main_chain_id = ChainId(rand::random());
|
|     // pop the blocks
|     for _ in 0..numb_blocks {
|         crate::ops::block::pop_block(db, Some(old_main_chain_id), &mut tx_rw, &mut tapes)?;
|     }
|
|     tx_rw.commit()?;
|     tapes.commit(Persistence::SyncAll)?;
|     Ok(BlockchainResponse::PopBlocks(old_main_chain_id))
Shouldn't `tx_rw` be passed to `flush_alt_blocks`?
| pub(crate) linear_tapes: Tapes,
| pub(crate) fjall: fjall::Database,
|
| pub(crate) block_heights: fjall::Keyspace,
Optional, but perhaps the (K, V) types and their encoding should be documented, since the typed tables are gone; even monerod's DB is easier to understand at a glance now 😭
| pub(crate) block_heights: fjall::Keyspace,
| /// K = `[u8; 32]` (block hash), V = [`usize`], encoding = little endian.
| pub(crate) block_heights: fjall::Keyspace,
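A round-trip sketch of the little-endian value encoding the suggested doc comment describes (helper names are ours, not the PR's; the real code may encode heights differently):

```rust
// Sketch of the documented (K, V) encoding: K = [u8; 32] block hash,
// V = height stored as little-endian bytes. Names are illustrative.
fn encode_height(height: usize) -> [u8; 8] {
    (height as u64).to_le_bytes()
}

fn decode_height(bytes: [u8; 8]) -> usize {
    u64::from_le_bytes(bytes) as usize
}

fn main() {
    let _block_hash: [u8; 32] = [0xAB; 32]; // K
    let value = encode_height(123_456); // V: 123_456 = 0x1_E240
    assert_eq!(value[0], 0x40); // least-significant byte first
    assert_eq!(decode_height(value), 123_456);
}
```

Documenting the byte order in the struct, as suggested, is what makes raw keyspace dumps readable without chasing the encode/decode call sites.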
| .build()
| cuprate_blockchain::config::Config {
|     blob_dir: path_with_network(&self.fs.fast_data_directory, self.network),
|     index_dir: path_with_network(&self.fs.slow_data_directory.clone(), self.network),
nit:
| index_dir: path_with_network(&self.fs.slow_data_directory.clone(), self.network),
| index_dir: path_with_network(&self.fs.slow_data_directory, self.network),
`cache_sizes` below could be `Copy`, but meh.
| /// | macOS | "/Users/Alice/Library/Application Support/Cuprate/" |
| /// | Linux | "/home/alice/.local/share/cuprate/" |
| pub data_directory: PathBuf,
| pub fast_data_directory: PathBuf,
Is there a reason to split this into two? They're both the same path.
| db.tx_infos.insert(tx_hash, bytemuck::bytes_of(&tx_info))?;
|
| if !db.tx_blobs.contains_key(tx_hash)? {
|     db.tx_infos.remove(tx_hash)?;
Shouldn't this go through an `OwnedWriteBatch`?
I still don't have a good mental model of the atomicity between fjall/tapes: there are separate transactions everywhere, sometimes writing directly into a keyspace, sometimes in different commit orders. It doesn't seem like we can have a deterministic safe sync mode (as much as …)
The DB is now two systems that are independent for ACID purposes. The tapes DB will do one update atomically, and so will fjall, but the two updates combined are not atomic. This does mean they can get out of sync. On start-up we solve this by using the tapes DB as the source of truth: if fjall is out of sync, we rebuild it from the tapes (which doesn't take that long for me). For some requests where we need a fjall and a tapes TX to be in sync, we should check whether they are and, if not, just open another set of txs until they are in sync (this is not done yet). The only exception is alt blocks, for which fjall is not atomic. This is OK as we clear alt blocks on startup anyway, and requests should be built to handle only some alt block data being present for a given block. This all means Cuprated should be able to have a safe sync mode.
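The "open another set of txs until they are in sync" idea can be sketched as a bounded retry loop over fresh snapshots (all names here are illustrative, not the real API):

```rust
// Sketch of the consistent-read retry described above: snapshot both
// stores, compare their committed heights, and only serve the read when
// they agree. In the real DB each attempt would reopen fresh snapshots.
fn read_consistent<T>(
    tapes_height: impl Fn() -> u64,
    fjall_height: impl Fn() -> u64,
    read: impl Fn() -> T,
) -> Option<T> {
    for _ in 0..8 {
        // bounded retries: a permanently desynced pair should surface
        // as an error instead of spinning forever
        if tapes_height() == fjall_height() {
            return Some(read());
        }
    }
    None
}

fn main() {
    // In sync: the read is served.
    assert_eq!(read_consistent(|| 5, || 5, || "block"), Some("block"));
    // Out of sync: the caller gets None and can fall back to resync.
    assert_eq!(read_consistent(|| 5, || 4, || "block"), None);
}
```

The bound is what keeps this compatible with the "tapes as source of truth" recovery path: after exhausting retries, a caller would trigger the rebuild rather than block.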