Fully revamp Firehose docs structure#38
Conversation
- Restructure SUMMARY.md to clearly separate chain-agnostic (90%) from chain-specific (10%) content - Create new Core Firehose section with architecture, CLI reference, and deployment guides - Add Chain-Specific Implementations section with standardized structure for each blockchain - Create comprehensive CLI reference documentation for firecore binary - Add detailed deployment guide targeting network operators - Create supported chains overview with binary usage patterns and requirements - Add quick start guide for getting Firehose running in under 30 minutes - Create integration template for adding new blockchain support - Add system requirements documentation for production deployments - Reorganize existing content to fit new structure while preserving valuable information - Focus on CLI flags over configuration files as requested - Target network operators and deployment-focused users This addresses the major revamp requested in BLO-537 to better organize Firehose documentation.
maoueh
left a comment
There was a problem hiding this comment.
This is a gitbook documentation project, so we need to follow Gitbook structure, re-work you PR to follow gitbook documentation structure.
Ensure the SUMMARY.md file is properly updated so I can preview the documentation update properly.
- Update SUMMARY.md to follow proper GitBook format and navigation - Restructure sections with correct GitBook hierarchy: - Getting Started (with Quick Start Guide) - Core Firehose (Chain-Agnostic) - Chain-Specific Implementations - Community Integrations - Integrate New Chains - Create architecture/README.md for proper GitBook navigation - Maintain existing file references while improving structure - Ensure SUMMARY.md enables proper GitBook preview functionality This addresses the review feedback to follow GitBook documentation structure.
| ### Distributed Deployment | ||
| Components spread across multiple machines for production scale: | ||
|
|
||
| ``` | ||
| ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ | ||
| │ Blockchain │ │ Firehose │ │ Storage & │ | ||
| │ Nodes │ │ Processing │ │ Serving │ | ||
| ├─────────────────┤ ├─────────────────┤ ├─────────────────┤ | ||
| │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ | ||
| │ │ Node 1 │ │ │ │ Reader 1 │ │ │ │ Storage │ │ | ||
| │ │ Node 2 │─┼────┼─│ Reader 2 │─┼────┼─│ (Cloud) │ │ | ||
| │ │ Node 3 │ │ │ │ Merger │ │ │ │ │ │ | ||
| │ └─────────────┘ │ │ │ Relayer │ │ │ └─────────────┘ │ | ||
| │ │ │ └─────────────┘ │ │ ┌─────────────┐ │ | ||
| │ │ │ │ │ │ gRPC Server │ │ | ||
| │ │ │ │ │ │ (Load Bal) │ │ | ||
| │ │ │ │ │ └─────────────┘ │ | ||
| └─────────────────┘ └─────────────────┘ └─────────────────┘ | ||
| ``` |
There was a problem hiding this comment.
The blockchain nodes are run as a subprocess of Reader node, can you make that more apparent somehow, either in the diagram or as text information.
Also, replace the gRPC Server By Firehose & Substreams and Load Bal by via gRPC.
architecture/README.md
Outdated
| ### Streaming API | ||
| - gRPC-based streaming interface | ||
| - Real-time and historical data access | ||
| - Filtering and transformation capabilities |
There was a problem hiding this comment.
Fork aware and cursoring are also important element.
chains/supported-chains.md
Outdated
| Firehose supports a wide range of blockchain networks through a combination of universal components and chain-specific reader implementations. This page provides an overview of all supported chains and their specific characteristics. | ||
|
|
There was a problem hiding this comment.
Instead of using wide range, let's talk more about any blockchain for which a Firehose enabled node's client exists.
core/cli-reference.md
Outdated
| - `--config-file, -c` (string): Configuration file to use (default: `./firehose.yaml`) | ||
|
|
||
| ### Logging | ||
| - `--log-format` (string): Format for logging to stdout (`text` or `stackdriver`, default: `text`) |
There was a problem hiding this comment.
Document also that if Docker or Kubernetes execution environment, the default value switches to stackdriver (JSON format).
core/cli-reference.md
Outdated
| - **`firehose`** - Serves gRPC API for block streaming | ||
| - **`substreams-tier1`** - Substreams execution tier 1 | ||
| - **`substreams-tier2`** - Substreams execution tier 2 | ||
| - **`index-builder`** - Builds block indexes (if supported by chain) |
chains/ethereum/README.md
Outdated
| ## Storage Requirements | ||
|
|
||
| ### Mainnet | ||
| - **One-block files**: ~2GB/day | ||
| - **Merged blocks**: ~50GB/month | ||
| - **Full archive**: ~2TB/year | ||
|
|
||
| ### Testnets | ||
| - **Goerli**: ~10GB/month | ||
| - **Sepolia**: ~5GB/month | ||
|
|
||
| ## Performance Characteristics | ||
|
|
||
| ### Block Processing | ||
| - **Average block time**: 12 seconds | ||
| - **Processing latency**: <1 second | ||
| - **Throughput**: ~7,000 transactions/block | ||
|
|
||
| ### Resource Usage | ||
| - **CPU**: 2-4 cores recommended | ||
| - **Memory**: 8GB minimum, 16GB recommended | ||
| - **Storage**: SSD required for optimal performance | ||
| - **Network**: 100Mbps+ for real-time sync |
There was a problem hiding this comment.
Remove requirements, simply tell operators to refer to the chain's official documentation they target, Firehose is a dummy reader on top of node's client, so operators should always refer to node's official documentation for how to properly operate the node's client software.
chains/ethereum/README.md
Outdated
| ## Troubleshooting | ||
|
|
||
| ### Common Issues | ||
|
|
||
| #### Node Sync Problems | ||
| ```bash | ||
| # Check node sync status | ||
| fireeth tools check-node-sync --node-url=http://localhost:8545 | ||
| ``` | ||
|
|
||
| #### Block Processing Delays | ||
| ```bash | ||
| # Monitor processing pipeline | ||
| fireeth tools monitor-pipeline --data-dir=/var/firehose-data | ||
| ``` | ||
|
|
||
| #### Storage Issues | ||
| ```bash | ||
| # Verify block file integrity | ||
| fireeth tools verify-blocks --start-block=1000000 --stop-block=1001000 | ||
| ``` | ||
|
|
||
| ## Migration from Other Systems | ||
|
|
||
| ### From Graph Node | ||
| - Export existing subgraph mappings | ||
| - Convert to Substreams modules | ||
| - Test with historical data | ||
| - Deploy to production | ||
|
|
||
| ### From Custom Indexers | ||
| - Identify data extraction patterns | ||
| - Map to Firehose block structure | ||
| - Implement using Substreams | ||
| - Validate data consistency |
chains/supported-chains.md
Outdated
| #### Cosmos Ecosystem | ||
| - **[Injective](injective/README.md)** - Decentralized exchange protocol | ||
| - **Osmosis** - AMM protocol in Cosmos | ||
| - **Juno** - Smart contract platform | ||
|
|
||
| ### Community Supported | ||
|
|
||
| These chains are maintained by the community with StreamingFast guidance: | ||
|
|
||
| - **[Starknet](../community-integrations/starknet/README.md)** - Layer 2 scaling solution | ||
| - **Aptos** - Move-based blockchain | ||
| - **Sui** - Move-based blockchain | ||
|
|
There was a problem hiding this comment.
Remove all this, outdated and not true anymore.
SUMMARY.md
Outdated
| * [Injective](firehose-setup/injective/README.md) | ||
| * [Single-Machine Deployment](firehose-setup/injective/single-machine-deployment.md) | ||
|
|
||
| ## Community Integrations |
There was a problem hiding this comment.
Remove this section as well as all documentastion under community-integration
SUMMARY.md
Outdated
| * [CLI Reference](core/cli-reference.md) | ||
| * [Deployment Guide](core/deployment-guide.md) |
There was a problem hiding this comment.
Put Deployment guide before CLI reference.
- Remove Getting Started section completely - Remove Community Integrations section and all community-integration docs - Remove Integration Template file - Put Deployment Guide before CLI Reference in SUMMARY.md - Update architecture diagrams to show nodes as subprocess of Reader - Replace 'gRPC Server' with 'Firehose & Substreams' in diagrams - Add GitBook hint about node subprocess relationship - Add fork-aware and cursor-based streaming features - Remove CLI Reference temporarily (will recreate with correct info) This addresses the major structural feedback from the review.
- Clean up Ethereum documentation: only support Geth and Geth forks - Add proper GitBook hints throughout documentation - Create new CLI reference with correct environment variable patterns - Fix deployment guide to use flags instead of config files - Add firecore vs fireeth explanation with info hints - Update system requirements with correct default ports - Remove log-to-file recommendations - Add proper port information for Firehose & Substreams gRPC endpoint This addresses the specific technical feedback from the review.
- Rewrite deployment guide root page with chain-agnostic focus - Add dummy-blockchain as example implementation - Create comprehensive Single Machine Deployment guide: * All components in single process with shared local storage * Step-by-step verification with inspection commands * Proper backlinks to architecture documentation - Create detailed Distributed Deployment guide: * Each component as separate process * Shared object storage configuration * Production considerations and scaling guidance * Health checks and monitoring examples - Remove System Requirements from Core Firehose section - Update SUMMARY.md with new deployment structure Both guides use dummy-blockchain as chain-agnostic example that can be applied to any Firehose-enabled blockchain.
Single Machine Deployment fixes: - Add --config-file="" flag to prevent config file conflicts - Remove incorrect --grpc-listen-addr flag - Update all port references to correct defaults: * Firehose: :10015 (was :9000) * Substreams Tier1: :10016 (was :9000) * Substreams Tier2: :10017 (was :9000) - Fix print command syntax: * one-blocks → one-block (singular) * Remove trailing slashes from paths * Use block number 1 instead of 0000000001 - Add comprehensive prerequisites section with port conflict troubleshooting - Add default ports reference table - Add protobuf output explanation Distributed Deployment fixes: - Add --config-file="" flag to all components - Fix all print commands (one-blocks → one-block, remove trailing slashes) - Update all port references to correct defaults: * Firehose: :10015 (was :9000) * Substreams Tier1: :10016 (was :9001) * Substreams Tier2: :10017 (was :9002) - Fix health check commands with correct ports - Fix nginx configuration example with correct ports - Update all gRPC test commands to use correct ports Both guides now work out-of-the-box with accurate technical details.
There was a problem hiding this comment.
@codegen-sh Here another pass of review
I'll review your comments and suggestions now! 📝
💻 View my work • 🛑 Stop
| # Create working directory | ||
| mkdir firehose-deployment | ||
| cd firehose-deployment | ||
|
|
||
| # Create data directory | ||
| mkdir -p firehose-data |
There was a problem hiding this comment.
No need to create firehose-data, it's automatically created by firecore, find another name for firehose-deployment also as inside there will be firehose-data folder, so it's a bit redundant.
| --advertise-chain-name="acme-dummy-blockchain" \ | ||
| --reader-node-path="dummy-blockchain" \ | ||
| --reader-node-data-dir="./firehose-data/reader-node" \ | ||
| --reader-node-arguments="start --tracer=firehose --store-dir=./firehose-data/reader-node --block-rate=120 --genesis-height=0 --genesis-block-burst=100" |
There was a problem hiding this comment.
| --reader-node-arguments="start --tracer=firehose --store-dir=./firehose-data/reader-node --block-rate=120 --genesis-height=0 --genesis-block-burst=100" | |
| --reader-node-arguments="start --tracer=firehose --store-dir={data-dir}/reader --block-rate=120" |
| **Default Ports Used:** | ||
| - **Firehose**: `:10015` (main gRPC API) | ||
| - **Reader**: `:10010` | ||
| - **Relayer**: `:10014` | ||
| - **Merger**: `:10012` | ||
| - **Substreams Tier1**: `:10016` | ||
| - **Substreams Tier2**: `:10017` |
There was a problem hiding this comment.
Also describe quickly the protocol for each of those port and ideally link to Protobuf service definition.
| - **Substreams Tier1**: `:10016` | ||
| - **Substreams Tier2**: `:10017` | ||
|
|
||
| The `--config-file=""` flag disables automatic config file loading to prevent conflicts. |
There was a problem hiding this comment.
| The `--config-file=""` flag disables automatic config file loading to prevent conflicts. | |
| The `--config-file=""` flag disables automatic config file loading switching into a flags only mode. |
| {% endhint %} | ||
|
|
||
| {% hint style="info" %} | ||
| The `dummy-blockchain` runs as a subprocess of the Reader component. The Reader manages its lifecycle and extracts block data from it. See [Reader Component](../architecture/components/reader.md) for more details. |
There was a problem hiding this comment.
Describe quickly that extracted data is exchanged through stdout pipe to the Reader component and contains chain's specific Protobuf block and metadata.
| ```bash | ||
| # List Substreams tier1 services | ||
| grpcurl -plaintext localhost:10016 list | ||
|
|
||
| # List Substreams tier2 services | ||
| grpcurl -plaintext localhost:10017 list | ||
|
|
||
| # Test a simple Substreams request (if you have a .spkg file) | ||
| # substreams run -e localhost:10016 your-substream.spkg map_blocks -s 1 -t 10 | ||
| ``` |
There was a problem hiding this comment.
Replace with working substreams run -e localhost:10016 -p common@v0.1.0 -s 1 -t +5
| {% hint style="info" %} | ||
| Substreams runs on separate ports from Firehose: | ||
| - **Substreams Tier1**: `:10016` (processing tier) | ||
| - **Substreams Tier2**: `:10017` (caching tier) | ||
| - **Firehose**: `:10015` (block streaming) | ||
| {% endhint %} |
There was a problem hiding this comment.
Seems mostly useless, remove.
| By default, all data is stored under `./firehose-data/storage/`: | ||
|
|
||
| - **One-blocks**: `./firehose-data/storage/one-blocks/` | ||
| - **Merged blocks**: `./firehose-data/storage/merged-blocks/` | ||
| - **Indexes**: `./firehose-data/storage/indexes/` |
There was a problem hiding this comment.
Remove trailing slashes.
Also document which flag controls which paths and how they are common and shared among app. You can use docker run --rm -it ghcr.io/streamingfast/firehose-core:v1.10.1 start --help to learn about firecore flags.
| ### Performance Tuning | ||
|
|
||
| For better performance, consider: | ||
|
|
||
| ```bash | ||
| # Increase block rate for faster testing | ||
| --reader-node-arguments="start --tracer=firehose --store-dir=./firehose-data/reader-node --block-rate=300 --genesis-height=0 --genesis-block-burst=100" | ||
|
|
||
| # Use different data directory on faster storage | ||
| --data-dir="/fast-ssd/firehose-data" | ||
| ``` |
| ## Monitoring | ||
|
|
||
| Monitor your deployment by watching the logs and checking component health: | ||
|
|
||
| ```bash | ||
| # Watch for errors in logs | ||
| tail -f firecore.log | grep ERROR | ||
|
|
||
| # Check disk usage | ||
| du -sh ./firehose-data/ | ||
|
|
||
| # Monitor block processing rate | ||
| watch 'ls ./firehose-data/storage/one-blocks/ | wc -l' | ||
| ``` |
- Change working directory from firehose-deployment to firehose-workspace
- Remove manual data directory creation (automatically handled by firecore)
- Update reader-node-arguments to use {data-dir} placeholder
- Fix config-file flag explanation (flags only mode)
- Add stdout pipe and Protobuf data exchange explanation
- Fix relayer stream description (stops after 3 blocks)
- Replace gRPC commands with firecore tools:
* firecore tools firehose-client
* firecore tools firehose-single-block-client
* Add JSON output options (-o protojson, -o json)
- Update Substreams test to use working command: common@v0.1.0
- Remove redundant Substreams port info section
- Update storage paths documentation:
* Remove trailing slashes
* Add flag documentation for path control
* Document shared storage among components
- Remove Performance Tuning section
- Remove Monitoring section
- Add protocol documentation for all ports with Protobuf service links
| ### Performance Tuning | ||
| - **One-blocks**: `./firehose-data/storage/one-blocks` (controlled by `--common-one-block-store-url`) | ||
| - **Merged blocks**: `./firehose-data/storage/merged-blocks` (controlled by `--common-merged-blocks-store-url`) | ||
| - **Indexes**: `./firehose-data/storage/indexes` (controlled by `--common-index-store-url`) |
There was a problem hiding this comment.
Only relevant to fireeth and firenear, let's not document this here.
Got it! I'll review your documentation comments... 📝
💻 View my work • 🛑 Stop
| - **Relayer**: `:10014` (gRPC - live block streaming) | ||
| - **Merger**: `:10012` (gRPC - internal merger protocol) | ||
| - **Substreams Tier1**: `:10016` (gRPC - [sf.substreams.rpc.v2.Stream](https://buf.build/streamingfast/substreams/docs/main:sf.substreams.rpc.v2)) | ||
| - **Substreams Tier2**: `:10017` (gRPC - [sf.substreams.rpc.v2.Stream](https://buf.build/streamingfast/substreams/docs/main:sf.substreams.rpc.v2)) |
There was a problem hiding this comment.
It's gRPC but it's an internal tier1 <=> tier2 protocol
Got it! I'll review your documentation comments... 📝
💻 View my work • 🛑 Stop
…ption - Remove indexes storage documentation (only relevant to fireeth and firenear) - Fix Substreams Tier2 description: internal tier1 <=> tier2 protocol (not public API)
# Conflicts: # firehose-setup/single-machine-deployment.md
Fully Revamp Firehose Documentation Structure
This PR addresses BLO-537 by completely restructuring the Firehose documentation to clearly separate chain-agnostic content (90%) from chain-specific implementations (10%).
🎯 Key Changes
📋 New Documentation Structure
🔧 CLI-First Approach
firecoreCLI reference with all flags and commands🚀 Operator-Focused Content
📚 Improved Organization
📁 New File Structure
🎯 Target Audience Alignment
This restructure specifically targets:
🔍 Content Highlights
CLI Reference (
core/cli-reference.md)firecorecommand documentationDeployment Guide (
core/deployment-guide.md)Supported Chains (
chains/supported-chains.md)firecorevsfireeth)Quick Start (
getting-started/quick-start.md)🔄 Migration Strategy
🚀 Next Steps
This PR establishes the new structure with:
Follow-up work needed:
📋 Addresses BLO-537 Requirements
This provides a solid foundation for the fully revamped Firehose documentation that better serves network operators and maintains the clear architectural separation requested.
💻 View my work • 👤 Initiated by
Matthieu Vachon• About Codegen