# PostgreSQL Backup to S3-Compatible Storage Container

This Docker container provides automated PostgreSQL database backups to S3-compatible storage services (MinIO, DigitalOcean Spaces, Backblaze B2, etc.), and is designed to run as a Kubernetes CronJob.

## Features

- Backs up one or more PostgreSQL databases
- Optimized for third-party S3-compatible storage services
- Uses lightweight `rclone` instead of the heavier AWS CLI
- Automatic cleanup of old backups based on a retention policy
- Compression support (gzip)
- Non-root container for security
- Notification support via webhooks
- Comprehensive logging

## Building the Container

```bash
docker build -t your-registry/postgres-backup:latest .
docker push your-registry/postgres-backup:latest
```

## Environment Variables

### Required Variables

- `POSTGRES_HOST`: PostgreSQL server hostname
- `POSTGRES_USER`: PostgreSQL username
- `POSTGRES_PASSWORD`: PostgreSQL password
- `S3_BUCKET`: S3 bucket name for backups
- `S3_ENDPOINT`: S3 endpoint URL (e.g., https://nyc3.digitaloceanspaces.com)
- `S3_ACCESS_KEY_ID`: S3 access key ID
- `S3_SECRET_ACCESS_KEY`: S3 secret access key

### Optional Variables

- `POSTGRES_PORT`: PostgreSQL port (default: 5432)
- `POSTGRES_DB`: Default database for the connection (default: postgres)
- `POSTGRES_DATABASES`: Comma-separated list of databases to back up (default: all databases)
- `S3_PREFIX`: S3 key prefix for backups (default: postgres-backups)
- `S3_REGION`: S3 region (default: us-east-1)
- `BACKUP_RETENTION_DAYS`: Number of days to keep backups (default: 7)
- `HEALTHCHECKS_URL`: Healthchecks.io ping URL for monitoring (optional)

## Running Locally

```bash
docker run --rm \
  -e POSTGRES_HOST=your-postgres-host \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=your-password \
  -e S3_BUCKET=your-backup-bucket \
  -e S3_ENDPOINT=https://nyc3.digitaloceanspaces.com \
  -e S3_ACCESS_KEY_ID=your-access-key \
  -e S3_SECRET_ACCESS_KEY=your-secret-key \
  -e HEALTHCHECKS_URL=https://hc-ping.com/your-uuid \
  your-registry/postgres-backup:latest
```

## Kubernetes Deployment

1. **Create the secret with your credentials:**

   ```bash
   # Edit k8s-secret.yaml with your actual credentials (uses stringData for simplicity)
   kubectl apply -f k8s-secret.yaml
   ```

2. **Deploy the CronJob:**

   ```bash
   # Edit k8s-cronjob.yaml with your settings
   kubectl apply -f k8s-cronjob.yaml
   ```

3. **Monitor the CronJob:**

   ```bash
   # Check CronJob status
   kubectl get cronjobs

   # Check recent jobs
   kubectl get jobs

   # Check logs
   kubectl logs -l job-name=postgres-backup-
   ```

## Monitoring with Healthchecks.io

The container has built-in support for [Healthchecks.io](https://healthchecks.io) monitoring.

### Setup

1. Create a check on healthchecks.io
2. Copy the ping URL (e.g., `https://hc-ping.com/your-uuid-here`)
3. Add it to your Kubernetes secret as `healthchecks-url`

### Webhook Behavior

- **Start**: Pings `/start` when the backup begins
- **Success**: Pings the main URL when all backups complete successfully
- **Failure**: Pings `/fail` with error details when any backup fails

### Example healthchecks.io URL

```
https://hc-ping.com/12345678-1234-1234-1234-123456789012
```

This automatically tracks:

- Job start times
- Success/failure status
- Failure reasons in the check log
- Missed-backup alerts if the job doesn't run

## Backup Structure

Backups are stored in S3 with a simple flat structure:

```
s3://your-bucket/
└── postgres-backups/
    ├── database1_20240130_020000.sql.gz
    ├── database1_20240131_020000.sql.gz
    ├── database2_20240130_020000.sql.gz
    └── database2_20240131_020000.sql.gz
```

All backups are created as gzipped SQL dumps with:

- `--clean` and `--if-exists` flags for safer restores
- `gzip --rsyncable` for efficient incremental transfers
- A human-readable SQL format after decompression

## Security Considerations

- The container runs as a non-root user (UID 1001)
- Uses a read-only root filesystem
- Drops all capabilities
- Secrets are stored in Kubernetes Secrets, not plain environment variables
- Network policies can be applied to restrict access

## Backup Restoration

To restore a backup:

1. Download the backup file from S3:

   ```bash
   rclone copy s3remote:your-bucket/postgres-backups/ ./ --include "database1_20240130_020000.sql.gz"
   ```

2. Decompress and restore:

   ```bash
   gunzip database1_20240130_020000.sql.gz
   psql -h your-postgres-host -U postgres -d database1 < database1_20240130_020000.sql
   ```

## Troubleshooting

### Common Issues

1. **Connection refused**: Check the PostgreSQL host and port
2. **Authentication failed**: Verify the username and password
3. **S3 upload failed**: Check the S3 credentials and bucket permissions
4. **Out of space**: Ensure sufficient disk space in the /backups volume

### Logs

Check the container logs for detailed information:

```bash
kubectl logs -l job-name=postgres-backup- -f
```

## Customization

You can modify the `backup.sh` script to:

- Add custom backup validation
- Implement different notification methods
- Add encryption before upload
- Modify backup naming conventions
- Add database-specific backup options

## License

This project is provided as-is for educational and production use.
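## Appendix: Backup Naming Sketch

The flat naming scheme and dump pipeline described under Backup Structure can be sketched in shell. The actual `backup.sh` is not reproduced in this README, so the variable names and the commented pipeline below are assumptions based on the flags listed there:

```shell
#!/bin/sh
# Sketch: build the object name used in the flat layout
# <database>_<YYYYMMDD>_<HHMMSS>.sql.gz. The database name and
# timestamp are hardcoded for illustration; backup.sh presumably
# derives them from POSTGRES_DATABASES and `date +%Y%m%d_%H%M%S`.
db="database1"
timestamp="20240130_020000"
backup_file="${db}_${timestamp}.sql.gz"

# The per-database pipeline would then look roughly like:
#   pg_dump --clean --if-exists -h "$POSTGRES_HOST" -U "$POSTGRES_USER" "$db" \
#     | gzip --rsyncable > "/backups/$backup_file"
#   rclone copy "/backups/$backup_file" "s3remote:$S3_BUCKET/$S3_PREFIX/"
echo "$backup_file"
```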