Powering South Africa's energy procurement intelligence! This AWS Lambda service is the electrical heart of our tender scraping fleet: one of five specialized crawlers, this one harvesting opportunities from Eskom, South Africa's largest utility company. From massive power station projects to infrastructure upgrades, we capture every kilowatt of opportunity!
- Overview
- Lambda Function (lambda_handler.py)
- Data Model (models.py)
- AI Tagging Initialization
- Example Tender Data
- Getting Started
- Deployment
- Troubleshooting
Welcome to the powerhouse of procurement data! This service is your direct pipeline into Eskom's massive tender ecosystem, capturing multi-billion rand infrastructure projects, power generation contracts, and critical maintenance opportunities that keep South Africa's lights on!
What makes it electrifying?
- Energy Sector Focus: Specialized in power generation, transmission, and distribution tenders
- Mega Project Capture: From power station retrofits to grid infrastructure upgrades
- Industrial-Grade Reliability: Built to handle Eskom's complex tender structures and massive data volumes
- AI-Ready Pipeline: Every tender pre-configured for intelligent categorization and enrichment
The electrical brain of our operation! The `lambda_handler` orchestrates the entire data harvesting process with industrial precision:
- Fetch Data: Connects to the Eskom Tender Bulletin API, the official source for all Eskom procurement opportunities across the country.
- Bulletproof Error Handling: Built like a power station! Handles network storms, API blackouts, and response anomalies with enterprise-grade resilience.
- Data Processing: Each tender goes through our industrial-strength parsing engine. We clean dates, validate structures, and check every field.
- Quality Assurance: Our `EskomTender` model runs rigorous validation checks. Bad data gets flagged, logged, and filtered out, so only valid tenders make it through!
- Smart Batching: Valid tenders are grouped into batches of 10 messages (the most a single SQS batch request accepts), which keeps throughput high and request costs low.
- Queue Dispatch: Each batch is sent to the central `AIQueue.fifo` SQS queue with the `MessageGroupId` of `EskomTenderScrape`. This keeps our power sector tenders grouped together and preserves their processing order.
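The batching and dispatch steps could look roughly like the sketch below. It is not the actual handler code: `chunked`, `build_entries`, and `dispatch` are hypothetical names, and the SQS client is passed in so that the logic can be exercised without AWS. It also assumes content-based deduplication is enabled on the FIFO queue; without it, each entry would additionally need a `MessageDeduplicationId`.

```python
import json
from typing import Any, Dict, Iterator, List

BATCH_SIZE = 10  # SQS accepts at most 10 entries per SendMessageBatch call
MESSAGE_GROUP_ID = "EskomTenderScrape"


def chunked(items: List[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_entries(batch: List[Dict[str, Any]], batch_index: int) -> List[Dict[str, str]]:
    """Turn one batch of tender dicts into SendMessageBatch entries."""
    entries = []
    for position, tender in enumerate(batch):
        entries.append({
            "Id": f"tender-{batch_index}-{position}",  # must be unique within a batch
            "MessageBody": json.dumps(tender),
            "MessageGroupId": MESSAGE_GROUP_ID,
        })
    return entries


def dispatch(sqs_client: Any, queue_url: str, tenders: List[Dict[str, Any]]) -> int:
    """Send all tenders in batches and return how many SQS accepted."""
    sent = 0
    for batch_index, batch in enumerate(chunked(tenders, BATCH_SIZE)):
        entries = build_entries(batch, batch_index)
        response = sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        # Entries that SQS rejected are reported under the "Failed" key.
        sent += len(entries) - len(response.get("Failed", []))
    return sent
```

In a deployed function the client would typically come from `boto3.client("sqs")`, with the queue URL read from the `SQS_QUEUE_URL` environment variable.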
Our data architecture is engineered for power and precision!
`TenderBase` is the robust foundation that powers all our tender models! This abstract class defines the core attributes shared by every tender:
Core Attributes:
- `title`: The tender's power rating - what's being procured?
- `description`: Technical specifications and project requirements
- `source`: Always "Eskom" for this industrial-grade scraper
- `published_date`: When this opportunity went live on the grid
- `closing_date`: Submission deadline - when the power window closes!
- `supporting_docs`: Critical technical documents and specifications
- `tags`: Keywords for AI intelligence (starts empty, gets energized by our AI service)

`EskomTender` inherits all the foundational strength from `TenderBase` and adds Eskom's unique high-voltage features:
Eskom-Specific Attributes:
- `tender_number`: Official Eskom reference code (e.g., "MWP2577PS")
- `audience`: Who can bid? (e.g., "All Suppliers", "Pre-qualified Contractors")
- `office_location`: Physical location for tender collection and briefings
- `email`: Direct line to Eskom's procurement powerhouse
- `address`: Full address for site visits and document collection
- `province`: Which province needs the power boost
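To make the inheritance concrete, here is a minimal sketch of how the two classes might be laid out using dataclasses. The field names come from the lists above; everything else (the use of `dataclass`, the validation rules in `__post_init__`, the `to_dict` method and its snake_case keys) is an assumption for illustration, not the contents of `models.py`.

```python
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass
class TenderBase(ABC):
    """Fields shared by every tender source."""
    title: str
    description: str
    source: str
    published_date: datetime
    closing_date: datetime
    supporting_docs: List[Dict[str, str]] = field(default_factory=list)
    # default_factory gives each tender its own list, filled in later by the AI service
    tags: List[str] = field(default_factory=list)

    @abstractmethod
    def to_dict(self) -> Dict[str, Any]:
        """Return a JSON-serializable representation."""


@dataclass
class EskomTender(TenderBase):
    """Eskom-specific fields (defaults are needed after the base defaults)."""
    tender_number: str = ""
    audience: str = ""
    office_location: str = ""
    email: str = ""
    address: str = ""
    province: str = ""

    def __post_init__(self) -> None:
        # Hypothetical validation rules; the real checks may differ.
        if not self.title.strip():
            raise ValueError("title must not be empty")
        if not self.tender_number.strip():
            raise ValueError("tender_number must not be empty")

    def to_dict(self) -> Dict[str, Any]:
        data = asdict(self)
        data["published_date"] = self.published_date.isoformat()
        data["closing_date"] = self.closing_date.isoformat()
        return data
```

Raising in `__post_init__` means an invalid record never becomes an object, so the handler can catch the error, log it, and move on to the next tender.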
We're all about intelligent power distribution! Every tender that flows through our system is prepared for downstream AI enhancement:

    # From models.py - preparing for AI electrification!
    return cls(
        # ... other fields
        tags=[],  # Initialize tags as an empty list, ready for the AI service.
        # ... other fields
    )

This ensures seamless integration with our AI pipeline: every tender object arrives with a clean, empty tags field just waiting to be charged with intelligent categorizations!
Here's what a real Eskom mega-project looks like after our scraper works its magic!

    {
      "title": "The Medupi Power Station Flue Gas Desulphurization (Fgd) Retrofit Engineer, Procure, Construct (Epc) Project For An Estimated Contract Period Of Eight (8) Years.",
      "description": "The Medupi Power Station Flue Gas Desulphurization (Fgd) Retrofit Engineer, Procure, Construct (Epc) Project For An Estimated Contract Period Of Eight (8) Years.",
      "source": "Eskom",
      "publishedDate": "2024-09-09T12:40:55.587000",
      "closingDate": "2026-02-02T10:00:00",
      "supporting_docs": [
        {
          "name": "Eskom Tender Bulletin",
          "url": "https://tenderbulletin.eskom.co.za/webapi/api/Lookup/GetTender?TENDER_ID=90032"
        }
      ],
      "tags": [],
      "tenderNumber": "MWP2577PS",
      "audience": "All Suppliers",
      "officeLocation": "Eskom Megawatt Park, 1 Maxwell Drive Sunninghill.",
      "email": "cyril.ntshonga@eskom.co.za",
      "address": "Eskom Megawatt Park Tender Office Northside (Retail Centre) 1 Maxwell Drive Sunninghill Sandton",
      "province": "National"
    }

What this shows:
- Mega Project: Multi-billion rand power station retrofit over 8 years
- Critical Infrastructure: Flue Gas Desulphurization at Medupi Power Station
- Environmental Impact: Emissions reduction technology for cleaner power
- Complete Documentation: Link to the full entry in the Eskom Tender Bulletin
- Long Lead Time: Published in September 2024 with a closing date in February 2026
- National Scope: Infrastructure project with national significance
Ready to tap into Eskom's power grid of opportunities? Let's energize your setup!
Prerequisites:
- AWS CLI configured with appropriate credentials
- Python 3.9+ with pip
- Access to AWS Lambda and SQS services
- Understanding of power sector terminology

Steps:
- Clone the repository
- Install dependencies: `pip install -r requirements.txt`
- Run tests: `python -m pytest`
- Test locally: Use AWS SAM for local Lambda simulation
This Lambda function supports multiple deployment methods to power up your infrastructure! Choose the approach that best fits your workflow.
The simplest way to deploy! Our GitHub Actions workflow automatically handles the deployment when you create a release branch.
Steps:
- Create a release branch from main:

      git checkout main
      git pull origin main
      git checkout -b release/v1.0.0
      git push origin release/v1.0.0

- Automatic deployment triggers via GitHub Actions workflow
- Verify deployment in AWS Lambda console

Benefits:
- Zero manual configuration
- Consistent deployment process
- Automatic rollback on failure
- Integrated with CI/CD pipeline
Deploy using AWS SAM with the included template.yml for complete infrastructure management.
Prerequisites:

    # Install AWS SAM CLI
    pip install aws-sam-cli
    # Verify installation
    sam --version

Deployment Steps:

    # 1. Build the SAM application
    sam build --template-file template.yml
    # 2. Deploy with guided setup (first time)
    sam deploy --guided --template-file template.yml
    # 3. Subsequent deployments
    sam deploy --template-file template.yml

Advanced SAM Options:

    # Deploy to specific environment
    sam deploy --template-file template.yml --parameter-overrides Environment=production
    # Deploy with custom stack name
    sam deploy --template-file template.yml --stack-name eskom-function-prod
    # Validate template before deployment
    sam validate --template-file template.yml

Benefits:
- Infrastructure as Code
- Complete stack management
- Environment-specific configurations
- Local testing capabilities
Direct deployment using AWS CLI and toolkit for maximum control.
Prerequisites:

    # Install AWS CLI
    pip install awscli
    # Configure AWS credentials
    aws configure
    # Install AWS Lambda deployment tools
    pip install boto3

Deployment Steps:
- Prepare deployment package:

      # Create deployment directory
      mkdir deployment-package
      # Install dependencies
      pip install -r requirements.txt -t deployment-package/
      # Copy source code
      cp *.py deployment-package/
      # Create deployment zip
      cd deployment-package
      zip -r ../eskom-function.zip .
      cd ..

- Deploy new Lambda function:

      # Create new function
      aws lambda create-function \
        --function-name eskom-tender-processor \
        --runtime python3.9 \
        --role arn:aws:iam::YOUR-ACCOUNT:role/lambda-execution-role \
        --handler lambda_handler.lambda_handler \
        --zip-file fileb://eskom-function.zip \
        --timeout 300 \
        --memory-size 512

- Update existing Lambda function:

      # Update function code
      aws lambda update-function-code \
        --function-name eskom-tender-processor \
        --zip-file fileb://eskom-function.zip
      # Update function configuration
      aws lambda update-function-configuration \
        --function-name eskom-tender-processor \
        --timeout 300 \
        --memory-size 512
- Configure environment variables (passed as a JSON document so that the CLI parses the nested map correctly):

      aws lambda update-function-configuration \
        --function-name eskom-tender-processor \
        --environment '{"Variables":{"SQS_QUEUE_URL":"https://sqs.region.amazonaws.com/account/queue-name","API_TIMEOUT":"30","BATCH_SIZE":"10"}}'
- Set up triggers (if needed):

      # Add CloudWatch Events trigger for scheduled execution
      aws events put-rule \
        --name eskom-scraper-schedule \
        --schedule-expression "rate(1 hour)"
      aws lambda add-permission \
        --function-name eskom-tender-processor \
        --statement-id scheduled-execution \
        --action lambda:InvokeFunction \
        --principal events.amazonaws.com \
        --source-arn arn:aws:events:region:account:rule/eskom-scraper-schedule
      aws events put-targets \
        --rule eskom-scraper-schedule \
        --targets Id=1,Arn=arn:aws:lambda:region:account:function:eskom-tender-processor

Benefits:
- Maximum deployment control
- Custom configuration options
- Direct AWS service integration
- Scriptable for automation
Configure these environment variables for optimal performance:

| Variable | Description | Default | Required |
|---|---|---|---|
| `SQS_QUEUE_URL` | Target SQS queue for processed tenders | - | Yes |
| `API_TIMEOUT` | Eskom API request timeout (seconds) | 30 | No |
| `BATCH_SIZE` | Tenders per SQS batch | 10 | No |
| `LOG_LEVEL` | Logging verbosity (INFO, DEBUG, ERROR) | INFO | No |
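Reading these variables inside the function might look something like the sketch below. `Settings` and `load_settings` are hypothetical names, not taken from the codebase, and the sketch assumes that `SQS_QUEUE_URL` is the only variable without a default and should therefore fail fast when missing.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    sqs_queue_url: str
    api_timeout: int = 30   # seconds
    batch_size: int = 10    # tenders per SQS batch
    log_level: str = "INFO"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    queue_url = env.get("SQS_QUEUE_URL", "").strip()
    if not queue_url:
        # Assumption: a missing queue URL should stop the function at startup.
        raise RuntimeError("SQS_QUEUE_URL must be set")
    return Settings(
        sqs_queue_url=queue_url,
        api_timeout=int(env.get("API_TIMEOUT", "30")),
        batch_size=int(env.get("BATCH_SIZE", "10")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Accepting an optional mapping keeps the loader easy to test: pass a plain dict in unit tests and let it fall back to `os.environ` in Lambda.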
Verify your deployment with these validation steps:

    # Test Lambda function locally (SAM)
    sam local invoke -e events/test-event.json --template-file template.yml
    # Test deployed function
    aws lambda invoke \
      --function-name eskom-tender-processor \
      --payload '{}' \
      response.json
    # Check CloudWatch logs
    aws logs describe-log-groups --log-group-name-prefix /aws/lambda/eskom-tender-processor

Common Deployment Issues
Issue: Permission denied during deployment
Solution: Ensure your AWS credentials have Lambda and IAM permissions
Issue: Package too large for Lambda
Solution: Use Lambda layers for large dependencies or optimize package size
Issue: Environment variables not updating
Solution: Redeploy with explicit environment variable configuration

Choose the deployment method that best fits your development workflow. The release branch method is recommended for production environments, while AWS CLI deployment offers maximum flexibility for custom setups!
API Connection Failures
Issue: Cannot connect to Eskom Tender Bulletin API.
Solution: Eskom's API can be temperamental during peak hours. Implement retry logic with exponential backoff. The power grid needs patience!
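One way to add that retry logic is sketched below. `retry_with_backoff` is a hypothetical helper rather than something the service ships with, and the set of retryable exceptions is an assumption; adjust it to whatever your HTTP client raises.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, doubling the wait after each retryable failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            # Jitter spreads retries out so parallel scrapers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")
```

Keep the total of all delays well below the Lambda timeout, otherwise the function may be killed while it is still waiting to retry.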
Large Tender Processing
Issue: Lambda timeouts on massive infrastructure projects.
Solution: Eskom deals in mega-projects! Increase Lambda timeout and memory allocation. Some power station retrofits have extensive documentation!
Data Validation on Technical Specs
Issue: Complex engineering tenders failing validation.
Solution: Eskom tenders often contain technical jargon and specifications. Update validation rules to handle power sector terminology and measurements!
SQS Quota Overruns
Issue: Too many large tenders hitting SQS limits.
Solution: Eskom runs massive procurement cycles. Implement intelligent batching that accounts for tender size as well as message count, so a batch of large tenders stays within SQS limits!
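A size-aware batcher could be sketched as follows. `size_aware_batches` is a hypothetical helper, and the 256 KiB default for the combined batch payload is an assumption: SQS quotas have changed over time, so check the current limits for your account and pass `max_bytes` accordingly.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_BATCH_MESSAGES = 10       # SendMessageBatch accepts at most 10 entries
MAX_BATCH_BYTES = 256 * 1024  # assumed payload cap; verify against current SQS quotas


def size_aware_batches(
    tenders: Iterable[Dict[str, Any]],
    max_messages: int = MAX_BATCH_MESSAGES,
    max_bytes: int = MAX_BATCH_BYTES,
) -> Iterator[List[str]]:
    """Group serialized tenders so each batch respects both count and byte limits."""
    batch: List[str] = []
    batch_bytes = 0
    for tender in tenders:
        body = json.dumps(tender)
        body_bytes = len(body.encode("utf-8"))
        if body_bytes > max_bytes:
            raise ValueError("single tender exceeds the batch byte limit")
        # Start a new batch when adding this tender would break either limit.
        if batch and (len(batch) >= max_messages or batch_bytes + body_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(body)
        batch_bytes += body_bytes
    if batch:
        yield batch
```

Each yielded list holds ready-to-send message bodies, so small tenders still travel ten at a time while a run of very large ones is split into smaller batches.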
Built with love, bread, and code by Bread Corporation