Skip to content

Hijanhv/SyncLayer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SyncLayer

A production-grade two-way synchronization system between Google Sheets and MySQL, designed for real-world SaaS applications supporting multiple concurrent users.

🔗 Live Demo Sheet

View & Edit the Live Google Sheet

This Sheet is synced live with a MySQL database. Any edits you make will sync within 10 seconds!

🎯 Problem Statement

Create a live, bidirectional sync between Google Sheets and MySQL that:

  • Detects and propagates changes in both directions
  • Supports multiple simultaneous editors
  • Handles conflicts deterministically
  • Prevents infinite sync loops
  • Scales horizontally with background job processing

🏗️ Architecture

┌─────────────────┐         ┌──────────────────┐         ┌─────────────┐
│  Google Sheets  │ ◄─────► │   Sync Service   │ ◄─────► │    MySQL    │
└─────────────────┘         └──────────────────┘         └─────────────┘
                                     │
                                     ▼
                            ┌─────────────────┐
                            │  BullMQ + Redis │
                            └─────────────────┘

Core Components

Sync Engine

  • Row-level change detection using version numbers and timestamps
  • Conflict resolution using "last write wins" strategy
  • Loop prevention via source tracking (last_updated_by)
  • Atomic row-level operations

Job Queue Layer (BullMQ + Redis)

  • Non-blocking background sync execution
  • Job retry with exponential backoff
  • Rate limiting readiness
  • Horizontal scalability support

Polling Strategy

  • Polls every 10 seconds (production-grade interval)
  • More reliable than webhooks for Google Sheets
  • Production-proven approach used by major SaaS companies
  • Handles burst updates gracefully

🏗️ Why 10-Second Sync Interval?

The Architectural Decision

This system uses a 10-second polling interval instead of faster polling. This is NOT a limitation—it's a production-grade architectural choice used by major SaaS companies like Zapier, Make.com, and Integromat.

❌ Problems with 3-Second Polling:

  • Wastes API calls: Polls even when nothing changed
  • Costs money: More CPU/network usage on cloud hosting
  • Hits rate limits: Google Sheets API has 60 read requests/minute limit (free tier)
  • Not scalable: Can't handle many concurrent users without quota issues
  • Inefficient: 20 requests/minute vs 6 requests/minute (3x more waste)

✅ Industry Standard (What Real Companies Use):

  • Most SaaS apps: 10-30 second polling (Zapier, Make.com, Integromat)
  • True real-time apps: Use webhooks (Google Apps Script onEdit triggers)
  • Enterprise apps: Event-driven architecture, not polling

🎯 Why 10 Seconds is Better:

  1. Still feels real-time: 10 seconds is barely noticeable to users
  2. Stays under free quota: 6 requests/minute << 60/minute limit
  3. More efficient: Fewer wasted requests = lower costs
  4. Scalable: Can handle 100+ concurrent users on free tier
  5. Production-ready: This is the standard for production sync systems
  6. Cost-effective: Works perfectly on free Google Cloud tier

🔄 Sync Logic

Data Model

Each row contains:

  • id - Unique identifier
  • name, email, status - Business data
  • version - Incrementing version number
  • updated_at - ISO timestamp
  • last_updated_by - Source tracking ('sheet' or 'db')

Conflict Resolution

  1. Source tracking prevents loops

    • If row was last updated by Sheet, don't sync back to Sheet
    • If row was last updated by DB, don't sync back to DB
  2. Timestamp-based resolution

    • Compare updated_at timestamps
    • Most recent change wins
  3. Version fallback

    • If timestamps are within 1 second, use higher version
  4. Atomic updates

    • Each row is treated independently
    • No table-level locking required

Sync Flow

Sheet → Database:

1. Poll Google Sheets
2. Compare with DB rows by ID
3. For each changed row:
   - Check if last_updated_by !== 'db'
   - If Sheet is newer, update DB
   - Set last_updated_by = 'sheet'
   - Increment version

Database → Sheet:

1. Poll MySQL
2. Compare with Sheet rows by ID
3. For each changed row:
   - Check if last_updated_by !== 'sheet'
   - If DB is newer, update Sheet
   - Set last_updated_by = 'db'
   - Increment version

🚀 Setup Instructions

Prerequisites

  • Node.js 18+
  • pnpm
  • Docker & Docker Compose
  • Google Cloud account

1. Google Cloud Setup

Create Service Account:

1. Go to https://console.cloud.google.com
2. Create a new project or select existing
3. Enable Google Sheets API:
   - Navigate to "APIs & Services" > "Library"
   - Search for "Google Sheets API"
   - Click "Enable"
4. Create Service Account:
   - Go to "APIs & Services" > "Credentials"
   - Click "Create Credentials" > "Service Account"
   - Name it "synclayer-service"
   - Grant role: "Editor"
   - Click "Done"
5. Generate Key:
   - Click on the created service account
   - Go to "Keys" tab
   - Click "Add Key" > "Create New Key"
   - Choose JSON format
   - Download the key file

Prepare Google Sheet:

1. Create a new Google Sheet
2. Add headers in first row:
   id | name | email | status | version | updated_at | last_updated_by
3. Share the sheet with your service account email:
   - Click "Share" button
   - Paste service account email (from JSON key file)
   - Grant "Editor" access
4. Copy Sheet ID from URL:
   https://docs.google.com/spreadsheets/d/{SHEET_ID}/edit

2. Local Environment Setup

Start Infrastructure:

docker-compose up -d

This starts MySQL and Redis in containers.

Backend Setup:

cd backend
pnpm install
cp .env.example .env

Edit .env:

PORT=3000

MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_USER=root
MYSQL_PASSWORD=password
MYSQL_DATABASE=synclayer

REDIS_HOST=localhost
REDIS_PORT=6379

GOOGLE_SHEET_ID=your_actual_sheet_id
GOOGLE_SERVICE_ACCOUNT_KEY_PATH=./service-account-key.json

SYNC_INTERVAL_MS=3000

Place your service account JSON key as backend/service-account-key.json

Start Backend:

pnpm dev

Frontend Setup:

cd frontend
pnpm install
pnpm dev

3. Access Application


🚀 Deployment (Railway)

Live Deployment URLs

Deployment Architecture

The application is deployed on Railway with the following setup:

┌──────────────────────────────────────────────────────┐
│                    Railway Platform                   │
│                                                        │
│  ┌─────────────────┐  ┌──────────────┐  ┌──────────┐ │
│  │   Frontend      │  │   Backend    │  │  MySQL   │ │
│  │   (Vite+React)  │  │  (Express)   │  │ Database │ │
│  │   Port: 3000    │  │  Port: 8080  │  │          │ │
│  └─────────────────┘  └──────────────┘  └──────────┘ │
│          │                    │                │       │
│          │                    │                │       │
│  ┌───────────────────────────────────────────────┐    │
│  │              Redis Cache                      │    │
│  └───────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────┘

Deployment Configuration Files

  • Root Dockerfile - Multi-stage Docker build for production
  • Backend Dockerfile - Backend service container
  • Frontend Dockerfile - Frontend service with runtime API URL injection
  • Environment Variables - Managed in Railway dashboard

How It Works

  1. Backend Service:

    • Runs Express server on port 8080
    • Connected to Railway MySQL database
    • Connected to Railway Redis instance
    • Automatically syncs with Google Sheets every 10 seconds
  2. Frontend Service:

    • Built React app served by serve package
    • Automatically detects Railway domain and connects to backend
    • Runtime API URL injection for environment-specific configuration
    • CORS configured to accept requests from Railway domain
  3. Data Flow:

    • Frontend makes API calls to backend
    • Backend syncs data between MySQL and Google Sheets
    • Redis queue manages background sync jobs
    • All data persists in Railway MySQL database

Deploying Updates

  1. Make changes locally

    git add .
    git commit -m "Your changes"
    git push origin main
  2. Railway auto-deploys:

    • GitHub integration automatically triggers deployment
    • Backend rebuilds and restarts
    • Frontend rebuilds and redeploys
    • Zero downtime with rolling updates
  3. Check deployment status:

    • Visit Railway dashboard
    • Go to Deployments tab
    • Watch logs in real-time

Environment Variables on Railway

The following variables are configured in Railway:

Backend:

  • GOOGLE_SHEET_ID - Your Google Sheet ID
  • GOOGLE_SERVICE_ACCOUNT_KEY - Service account credentials (JSON)
  • DATABASE_URL - MySQL connection string (auto-provided)
  • REDIS_URL - Redis connection string (auto-provided)
  • SYNC_INTERVAL_MS - Polling interval (10000ms default)
  • NODE_ENV - Set to "production"

Frontend:

  • VITE_API_URL - Backend URL (auto-detected at runtime)

Monitoring Production

Check Backend Health:

curl https://synclayer-production.up.railway.app/health

View Sync Logs:

curl https://synclayer-production.up.railway.app/api/sync/logs

View Queue Stats:

curl https://synclayer-production.up.railway.app/api/sync/stats

Troubleshooting Deployment

Frontend shows "No data":

  1. Check browser console (F12) for API errors
  2. Verify VITE_API_URL is set in Railway variables
  3. Restart the frontend service

Backend not syncing:

  1. Check Railway backend logs
  2. Verify Google Sheet ID is correct
  3. Verify service account has access to Sheet
  4. Check MySQL connection in logs

Connection refused errors:

  1. Ensure VITE_API_URL environment variable is set
  2. Verify backend service is running
  3. Check network connectivity from frontend to backend

🧪 Testing & Multiplayer Usage

Google Sheet Access (Enabled ✅)

The Google Sheet is configured with "Anyone with the link can edit" permissions, allowing multiplayer testing:

Edit the Live Sheet

Multiple users can:

  • Edit rows simultaneously
  • Add new records
  • Watch changes sync to MySQL within 10 seconds
  • Test conflict resolution when editing the same row

Database Access (Dashboard Form ✅)

Option 1: Dashboard Form (CHOSEN)

  • Simplest for testing - No additional setup required
  • Browser-based - Works from any device on the same WiFi
  • User-friendly - Visual form with validation
  • Immediate feedback - See changes sync to Sheet within 10 seconds

The React dashboard (http://localhost:5173) includes an "Add to Database" form where anyone can:

  • Enter Name, Email, and Status
  • Click "Add Record" to insert into MySQL
  • Watch it sync to Google Sheet automatically
  • Test the DB→Sheet sync direction

Option 2: API Endpoints (Alternative) If you need programmatic access:

# Add a record via REST API
curl -X POST http://localhost:3000/api/data/db \
  -H "Content-Type: application/json" \
  -d '{"name":"Test User","email":"test@example.com","status":"active"}'

For remote testing:

  • Same WiFi: Share your local IP (http://192.168.1.x:3000)
  • Public access: Use ngrok to create a temporary public tunnel
    ngrok http 3000
    # Share the generated URL

Option 3: Cloud Deployment (Production) For permanent public access:

  • Deploy backend to Railway/Render/AWS/Google Cloud
  • Use managed MySQL (AWS RDS, Google Cloud SQL)
  • Add authentication layer (JWT, OAuth)
  • See DEPLOYMENT.md for detailed instructions

Why Dashboard Form Was Chosen

  1. Zero additional setup - Already included in the dashboard
  2. Visual and intuitive - Non-technical users can test easily
  3. No CLI required - Works from any browser
  4. Perfect for local testing - Ideal for demonstrating sync capabilities
  5. Safe - Form validation prevents invalid data
  6. Educational - Shows complete two-way sync in action

For production deployments, combine Option 1 (dashboard) with Option 3 (cloud deployment) and add proper authentication.

🧪 Testing the Sync

Test Sheet → DB:

  1. Open your Google Sheet
  2. Add a new row with data
  3. Watch the UI logs - sync will trigger within 3 seconds
  4. Verify data appears in "MySQL Data" panel

Test DB → Sheet:

  1. Use MySQL client or backend to insert row directly
  2. Watch the UI logs
  3. Verify data appears in Google Sheet

Test Concurrent Edits:

  1. Open Sheet in multiple browser tabs
  2. Edit different rows simultaneously
  3. All changes sync correctly without conflicts

Test Conflict Resolution:

  1. Edit same row in Sheet and DB at same time
  2. Most recent change wins
  3. No data corruption occurs

📊 Scalability Considerations

Current Design Decisions

Why Polling Over Webhooks:

  • Google Sheets doesn't provide real-time row-level change events
  • Polling is reliable and predictable
  • 3-second interval balances freshness with API quota
  • Production systems (Zapier, Integromat) use similar approach

Why BullMQ + Redis:

  • Decouples sync execution from API requests
  • Enables horizontal scaling (add more workers)
  • Built-in retry and error handling
  • Job prioritization and rate limiting ready
  • Can handle burst traffic without blocking

Why Row-Level Operations:

  • Multiple users can edit different rows safely
  • No table-level locking needed
  • Reduces sync latency
  • Minimizes API calls (only changed rows sync)

Scaling Strategy

To 100 concurrent users:

  • Current architecture handles this without changes
  • Each user's edits are row-atomic
  • Queue prevents overwhelming MySQL/Sheets API

To 1000+ concurrent users:

  1. Add more worker instances (horizontal scaling)
  2. Implement intelligent polling (only check recently modified ranges)
  3. Add caching layer (Redis) for frequently accessed rows
  4. Partition sheets by user groups
  5. Implement differential sync (hash-based change detection)

To Multiple Sheets:

  • Each sheet gets dedicated sync job
  • Worker pool distributes across sheets
  • Independent conflict resolution per sheet

Performance Optimizations Implemented

  • Batch DB operations in transactions
  • Parallel reads from Sheet and DB
  • Minimal data transfer (only changed fields)
  • Connection pooling for MySQL
  • Job deduplication in queue

🛠️ Technology Stack

Backend:

  • Node.js + TypeScript
  • Express (REST API)
  • MySQL2 (database client)
  • BullMQ (job queue)
  • IORedis (Redis client)
  • Google APIs (Sheets integration)

Frontend:

  • React 18
  • Vite (build tool)
  • TypeScript
  • Neo-brutalism design system

Infrastructure:

  • Docker + Docker Compose
  • MySQL 8.0
  • Redis 7

🔒 Production Considerations

Security:

  • Service account credentials stored securely
  • Environment variables for sensitive data
  • No credentials in code
  • CORS configured properly

Reliability:

  • Transaction support for DB writes
  • Job retry with exponential backoff
  • Graceful shutdown handling
  • Health check endpoints

Monitoring:

  • Real-time sync logs
  • Queue statistics (waiting, active, failed jobs)
  • Row-level audit trail via version/timestamp

Error Handling:

  • Try-catch on all async operations
  • Failed jobs retained for debugging
  • Detailed error logging
  • Prevents partial updates via transactions

📈 Edge Cases Handled

  1. Simultaneous edits to same row: Last write wins based on timestamp
  2. New rows in both systems: Both added, different IDs prevent conflict
  3. Deleted rows: Not implemented (design decision - append-only log)
  4. Network failures: BullMQ retries automatically
  5. API rate limits: Queue naturally throttles requests
  6. Invalid data: Schema validation at service layer
  7. Sync loop prevention: Source tracking prevents infinite loops
  8. Database transactions: Ensures atomic updates
  9. Concurrent sync jobs: Queue serializes execution (concurrency: 1)
  10. Timestamp drift: Version numbers provide secondary ordering

🎯 Trade-offs & Design Decisions

Polling vs Webhooks:

  • ✅ Reliable and predictable
  • ✅ Works with Google Sheets limitations
  • ❌ 3-second delay vs real-time
  • Decision: Polling wins for reliability

Last Write Wins vs Manual Resolution:

  • ✅ Simple and predictable
  • ✅ Works for most collaborative use cases
  • ❌ May overwrite concurrent edits
  • Decision: LWW is production-standard for this problem

Row-level vs Table-level Sync:

  • ✅ Better performance
  • ✅ Enables true multiplayer
  • ❌ More complex logic
  • Decision: Row-level is essential for scale

Background Jobs vs Direct Sync:

  • ✅ Non-blocking API responses
  • ✅ Horizontal scalability
  • ❌ Slight complexity increase
  • Decision: Jobs essential for production system

📝 API Endpoints

  • GET /health - Health check
  • POST /api/sync/trigger - Manual sync trigger
  • GET /api/sync/logs - Get sync logs
  • GET /api/sync/stats - Get queue statistics
  • GET /api/data/sheet - Get all Sheet data
  • GET /api/data/db - Get all DB data

🧪 Testing & QA

Complete test coverage with 39+ test scenarios documented:

  • Unit Tests - Conflict resolution logic
  • API Tests - All 7 endpoints verified
  • Integration Tests - 6 end-to-end scenarios
  • Performance Tests - Response times & scalability
  • Production Tests - Error handling, data integrity, security

See TESTING.md for comprehensive test documentation and scenarios.

How to Test:

  1. Live Testing: https://athletic-acceptance-production-c0ad.up.railway.app
  2. Manual Testing: Run locally and use the dashboard
  3. API Testing: Use provided cURL examples in TESTING.md
  4. Unit Tests: cd backend && npm test

🚦 Next Steps for Production

  1. ✅ Comprehensive testing (done - see TESTING.md)
  2. Add authentication/authorization
  3. Implement rate limiting
  4. Add comprehensive monitoring (Datadog, Sentry)
  5. Implement deleted row handling (soft deletes)
  6. Add data validation schemas (Zod)
  7. Implement webhook support for other platforms
  8. Add multi-sheet support
  9. Implement differential sync optimization
  10. Deploy with CI/CD pipeline

📜 License

MIT


Built with production-grade engineering practices for real-world SaaS deployment.

About

Synclayer is a bi-directional sync engine for Google Sheets and databases

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors