A serverless AWS Lambda function that downloads Instagram videos and stores them in S3. The project uses Supabase for metadata management and AWS SQS for message queuing.
- Downloads Instagram videos at specified resolutions
- Stores videos in AWS S3
- Uses Supabase for metadata management
- Implements AWS SQS for message queuing
- Includes CloudWatch metrics for monitoring
- Handles retries and error cases
- Supports batch processing of videos
- AWS Account with appropriate permissions
- Supabase account and project
- Python 3.10 or higher
- AWS CLI configured with appropriate credentials
Create a .env file with the following variables:
SUPABASE_URL=your_supabase_url
SUPABASE_KEY=your_supabase_key
S3_BUCKET=your_s3_bucket_name
MAX_RETRIES=3
DOWNLOAD_TIMEOUT=300
CHUNK_SIZE=1048576
- Clone the repository:
git clone <repository-url>
cd InstagramVideoDownloader-lambda
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- lambda_function.py: Main Lambda function code
- trigger_sqs_batches.py: Script to trigger SQS message batches
- scripts/pack_lambda.sh: Script to package Lambda function
- requirements.txt: Python dependencies
- .env: Environment variables (not committed to git)
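The environment variables listed for the .env file could be read into a typed settings object along the following lines. This is a minimal sketch, not the project's actual code: the `Settings` class and `load_settings` function are hypothetical names, the defaults mirror the sample values shown earlier, and the units (seconds for `DOWNLOAD_TIMEOUT`, bytes for `CHUNK_SIZE`) are assumptions. How `lambda_function.py` itself loads its configuration is not stated in this README.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the .env example."""
    supabase_url: str
    supabase_key: str
    s3_bucket: str
    max_retries: int = 3
    download_timeout: int = 300  # assumed to be seconds
    chunk_size: int = 1048576    # assumed to be bytes (1 MiB)


def load_settings(env=None):
    """Build Settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    required = ("SUPABASE_URL", "SUPABASE_KEY", "S3_BUCKET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of failing mid-download.
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return Settings(
        supabase_url=env["SUPABASE_URL"],
        supabase_key=env["SUPABASE_KEY"],
        s3_bucket=env["S3_BUCKET"],
        max_retries=int(env.get("MAX_RETRIES", 3)),
        download_timeout=int(env.get("DOWNLOAD_TIMEOUT", 300)),
        chunk_size=int(env.get("CHUNK_SIZE", 1048576)),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to test locally.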
- Package the Lambda function:
./scripts/pack_lambda.sh
- Upload lambda_function.zip to AWS Lambda
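Once deployed with an SQS trigger, the function receives events whose `Records` list carries one queue message per entry, with the message text in `body`. The sketch below shows one way such an event might be unpacked and validated. It is illustrative only: the function names and the `video_id` field are hypothetical, since the message format used by this project is not documented here.

```python
import json


def parse_records(event):
    """Split an SQS-triggered event into valid payloads and invalid message IDs."""
    valid, invalid_ids = [], []
    for record in event.get("Records", []):
        try:
            payload = json.loads(record["body"])
            # "video_id" is an assumed field name, used here for illustration.
            if not isinstance(payload, dict) or "video_id" not in payload:
                raise ValueError("missing video_id")
            valid.append(payload)
        except (KeyError, TypeError, ValueError):
            # Malformed JSON raises json.JSONDecodeError, a ValueError subclass.
            invalid_ids.append(record.get("messageId"))
    return valid, invalid_ids


def handle_event(event, context=None):
    """Hypothetical entry point: validate input, then process each payload."""
    valid, invalid_ids = parse_records(event)
    for payload in valid:
        pass  # download and upload steps would go here
    return {"processed": len(valid), "invalid": len(invalid_ids)}
```

Separating parsing from processing makes it straightforward to count rejected messages, for example under a metric such as InvalidInput.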
Run the batch processor to send videos to the queue:
python trigger_sqs_batches.py
The following metrics are available in CloudWatch under the namespace 'InstagramCollector':
- ProcessingStarted: Number of videos processing started
- ProcessingSuccess: Number of successfully processed videos
- ProcessingFailed: Number of failed video processing attempts
- DownloadTime: Time taken to download videos
- UploadTime: Time taken to upload videos to S3
- TotalProcessingTime: Total processing time per video
- SupabaseVideoFound: Number of videos found in Supabase
- SupabaseVideoNotFound: Number of videos not found in Supabase
- SupabaseError: Number of Supabase errors
- DownloadTimeout: Number of download timeouts
- DownloadError: Number of download errors
- S3UploadError: Number of S3 upload errors
- InvalidInput: Number of invalid input messages
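Metrics like these could be emitted with the CloudWatch `put_metric_data` call. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the `Seconds` unit for timing metrics is a guess (the README does not say which unit is used), and the chunk size of 20 is simply a conservative choice. The `cloudwatch` argument is expected to be a client such as the one returned by `boto3.client("cloudwatch")`.

```python
import time
from contextlib import contextmanager

NAMESPACE = "InstagramCollector"  # namespace named in this README


def count_metric(name, value=1):
    """Build a counter datum, e.g. ProcessingStarted or DownloadError."""
    return {"MetricName": name, "Value": float(value), "Unit": "Count"}


def duration_metric(name, seconds):
    """Build a timing datum; the Seconds unit is an assumption."""
    return {"MetricName": name, "Value": float(seconds), "Unit": "Seconds"}


@contextmanager
def timed(name, sink):
    """Measure the wrapped block and append a timing datum to sink."""
    start = time.perf_counter()
    try:
        yield
    finally:
        sink.append(duration_metric(name, time.perf_counter() - start))


def publish(cloudwatch, metrics, chunk_size=20):
    """Send the collected datums in small chunks via put_metric_data."""
    for i in range(0, len(metrics), chunk_size):
        cloudwatch.put_metric_data(
            Namespace=NAMESPACE,
            MetricData=metrics[i:i + chunk_size],
        )
```

Collecting datums in a list and publishing once per invocation keeps the number of CloudWatch API calls low.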
The system implements comprehensive error handling:
- Retries for failed downloads
- Error logging to CloudWatch
- Metrics tracking for various failure scenarios
- Graceful cleanup of temporary files
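The retry and cleanup behavior listed above might look roughly like the sketch below. It assumes a caller-supplied `fetch(path)` callable that writes the video to `path`; the function name, the exponential backoff, and the `.mp4` suffix are illustrative choices rather than details taken from the project.

```python
import os
import tempfile
import time


def download_with_retries(fetch, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(path) up to max_retries times, deleting the temp file on failure."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        fd, path = tempfile.mkstemp(suffix=".mp4")
        os.close(fd)  # fetch reopens the file by path
        try:
            fetch(path)
            return path  # caller removes the file after uploading it
        except Exception as exc:
            last_error = exc
            # Remove the partial download so failed attempts leave nothing behind.
            if os.path.exists(path):
                os.remove(path)
            if attempt < max_retries:
                # Assumed backoff policy: 1s, 2s, 4s, ...
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"download failed after {max_retries} attempts") from last_error
```

Here `max_retries` would typically come from the MAX_RETRIES variable, and the final exception is where a failure metric and a log entry could be recorded.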
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
For issues and feature requests, please create an issue in the repository.
Maintained by Oriane XYZ