# PostgreSQL Backup to S3-Compatible Storage Container

This Docker container provides automated PostgreSQL database backups to S3-compatible storage services (MinIO, DigitalOcean Spaces, Backblaze B2, etc.), designed to run as a Kubernetes CronJob.

## Features

- Backs up one or more PostgreSQL databases
- Optimized for third-party S3-compatible storage services
- Uses lightweight `rclone` instead of the heavier AWS CLI
- Automatic cleanup of old backups based on retention policy
- Compression support (gzip)
- Non-root container for security
- Notification support via webhooks
- Comprehensive logging

## Building the Container

```bash
docker build -t your-registry/postgres-backup:latest .
docker push your-registry/postgres-backup:latest
```

## Environment Variables

### Required Variables

- `POSTGRES_HOST`: PostgreSQL server hostname
- `POSTGRES_USER`: PostgreSQL username
- `POSTGRES_PASSWORD`: PostgreSQL password
- `S3_BUCKET`: S3 bucket name for backups
- `S3_ENDPOINT`: S3 endpoint URL (e.g., `https://nyc3.digitaloceanspaces.com`)
- `S3_ACCESS_KEY_ID`: S3 access key ID
- `S3_SECRET_ACCESS_KEY`: S3 secret access key

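
A missing required variable is easiest to diagnose when the job fails before it touches the database. The following is a minimal sketch of such a check, not the actual logic in `backup.sh`; the function name `missing_required_vars` and the `REQUIRED_VARS` list variable are hypothetical, and only the variable names come from the list above.

```bash
# Sketch only: the real backup.sh may validate its inputs differently.
# Names copied from the "Required Variables" list.
REQUIRED_VARS="POSTGRES_HOST POSTGRES_USER POSTGRES_PASSWORD S3_BUCKET S3_ENDPOINT S3_ACCESS_KEY_ID S3_SECRET_ACCESS_KEY"

# Print the name of every required variable that is unset or empty.
# Returns 0 when nothing is missing, 1 otherwise.
missing_required_vars() {
  _missing=0
  for _name in $REQUIRED_VARS; do
    # Look up the value of the variable whose name is stored in $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "$_name"
      _missing=1
    fi
  done
  unset _value   # do not keep a copy of the last secret around
  return "$_missing"
}
```

A script could call it as `missing_required_vars || exit 1` near the top, so the names of the missing variables appear in the job log.
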
### Optional Variables

- `POSTGRES_PORT`: PostgreSQL port (default: 5432)
- `POSTGRES_DB`: Default database for connection (default: postgres)
- `POSTGRES_DATABASES`: Comma-separated list of databases to back up (default: all databases)
- `S3_PREFIX`: S3 key prefix for backups (default: postgres-backups)
- `S3_REGION`: S3 region (default: us-east-1)
- `BACKUP_RETENTION_DAYS`: Number of days to keep backups (default: 7)
- `HEALTHCHECKS_URL`: Healthchecks.io ping URL for monitoring

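
To illustrate how these defaults and the retention setting might fit together, here is a hedged sketch. The function names `apply_defaults` and `retention_command` are hypothetical, and the `s3remote` remote name is borrowed from the restore example in this README; whether `backup.sh` cleans up with `rclone delete --min-age` or some other mechanism is an assumption. Note that `--min-age` filters on each object's modification time, not on the timestamp in the file name.

```bash
# Sketch only: fill in the documented defaults for unset or empty variables.
apply_defaults() {
  : "${POSTGRES_PORT:=5432}"
  : "${POSTGRES_DB:=postgres}"
  : "${S3_PREFIX:=postgres-backups}"
  : "${S3_REGION:=us-east-1}"
  : "${BACKUP_RETENTION_DAYS:=7}"
}

# Print (do not run) an rclone command that would delete objects older
# than the retention window. "s3remote" is assumed to be a configured
# rclone remote pointing at S3_ENDPOINT.
retention_command() {
  echo "rclone delete --min-age ${BACKUP_RETENTION_DAYS}d s3remote:${S3_BUCKET}/${S3_PREFIX}"
}
```

Printing the command first makes it possible to review what would be removed; `rclone` also accepts `--dry-run` for the same purpose.
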
## Running Locally

```bash
docker run --rm \
  -e POSTGRES_HOST=your-postgres-host \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=your-password \
  -e S3_BUCKET=your-backup-bucket \
  -e S3_ENDPOINT=https://nyc3.digitaloceanspaces.com \
  -e S3_ACCESS_KEY_ID=your-access-key \
  -e S3_SECRET_ACCESS_KEY=your-secret-key \
  -e HEALTHCHECKS_URL=https://hc-ping.com/your-uuid \
  your-registry/postgres-backup:latest
```

## Kubernetes Deployment

1. **Create the secret with your credentials:**

```bash
# Edit k8s-secret.yaml with your actual credentials (uses stringData for simplicity)
kubectl apply -f k8s-secret.yaml
```

2. **Deploy the CronJob:**

```bash
# Edit k8s-cronjob.yaml with your settings
kubectl apply -f k8s-cronjob.yaml
```

3. **Monitor the CronJob:**

```bash
# Check CronJob status
kubectl get cronjobs

# Check recent jobs
kubectl get jobs

# Check logs
kubectl logs -l job-name=postgres-backup-<timestamp>
```

## Monitoring with Healthchecks.io

The container has built-in support for [Healthchecks.io](https://healthchecks.io) monitoring:

### Setup:

1. Create a check on healthchecks.io
2. Copy the ping URL (e.g., `https://hc-ping.com/your-uuid-here`)
3. Add it to your Kubernetes secret as `healthchecks-url`

### Webhook Behavior:

- **Start**: Pings `/start` when backup begins
- **Success**: Pings the main URL when all backups complete successfully
- **Failure**: Pings `/fail` with error details when any backup fails

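
The three ping types above differ only in the URL suffix. As a rough sketch of how that could be wired up with `curl` (the helper names `hc_ping_url` and `hc_ping` are hypothetical, and the timeout and retry values are illustrative choices rather than what `backup.sh` necessarily uses):

```bash
# Sketch only. Build the ping URL for an event:
# "start", "fail", or "" for the plain success ping.
hc_ping_url() {
  _base=${HEALTHCHECKS_URL%/}   # drop a trailing slash, if any
  if [ -n "${1:-}" ]; then
    echo "$_base/$1"
  else
    echo "$_base"
  fi
}

# Send a ping with an optional message body (for example, the error text
# on failure). Does nothing when HEALTHCHECKS_URL is unset or empty.
hc_ping() {
  [ -n "${HEALTHCHECKS_URL:-}" ] || return 0
  # -f: treat HTTP errors as failures, -sS: quiet but still print errors,
  # -m 10: give up after 10 seconds, --retry 3: retry transient failures
  curl -fsS -m 10 --retry 3 --data-raw "${2:-}" "$(hc_ping_url "${1:-}")" >/dev/null
}
```

With these helpers, a script would call `hc_ping start` before the first dump, `hc_ping ""` after the last upload, and `hc_ping fail "pg_dump failed for database1"` in its error path.
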
### Example healthchecks.io URL:

```
https://hc-ping.com/12345678-1234-1234-1234-123456789012
```

This will automatically track:

- Job start times
- Success/failure status
- Failure reasons in the check log
- Alerts for missed backups if the job doesn't run

## Backup Structure

Backups are stored in S3 with a simple flat structure:

```
s3://your-bucket/
└── postgres-backups/
    ├── database1_20240130_020000.sql.gz
    ├── database1_20240131_020000.sql.gz
    ├── database2_20240130_020000.sql.gz
    └── database2_20240131_020000.sql.gz
```

All backups are created as gzipped SQL dumps with:

- `--clean` and `--if-exists` flags for safer restores
- `gzip --rsyncable` for efficient incremental transfers
- Human-readable SQL format after decompression

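
Putting the naming scheme and the dump options together, a single-database backup step might look roughly like the sketch below. The function names `backup_filename` and `dump_database` are hypothetical, and passing the password through `PGPASSWORD` is an assumption about how the script authenticates; the `pg_dump` and `gzip` flags are the ones listed above.

```bash
# Sketch only. Object name in the layout shown above:
# <database>_<YYYYMMDD>_<HHMMSS>.sql.gz
backup_filename() {
  echo "${1}_${2}.sql.gz"
}

# Dump one database ($1) to a gzipped SQL file ($2).
# Without `set -o pipefail` (bash), a pg_dump failure is masked by gzip's
# exit status, so a real script should enable it or check the result.
dump_database() {
  PGPASSWORD=$POSTGRES_PASSWORD pg_dump \
    -h "$POSTGRES_HOST" -p "${POSTGRES_PORT:-5432}" -U "$POSTGRES_USER" \
    --clean --if-exists "$1" | gzip --rsyncable > "$2"
}
```

For example, `dump_database database1 "/backups/$(backup_filename database1 "$(date +%Y%m%d_%H%M%S)")"` would write a file named like the entries in the tree above.
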
## Security Considerations

- Container runs as a non-root user (UID 1001)
- Uses a read-only root filesystem
- Drops all capabilities
- Credentials are kept in a Kubernetes Secret rather than hard-coded in the image or the CronJob manifest
- Network policies can be applied to restrict access

## Backup Restoration

To restore a backup:

1. Download the backup file from S3:

```bash
# Assumes an rclone remote named "s3remote" is configured for your S3 endpoint
rclone copy s3remote:your-bucket/postgres-backups/ ./ --include "database1_20240130_020000.sql.gz"
```

2. Decompress and restore:

```bash
gunzip database1_20240130_020000.sql.gz
psql -h your-postgres-host -U postgres -d database1 < database1_20240130_020000.sql
```

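
For large dumps it can be convenient to check the archive first and then stream it into `psql` without writing the decompressed file to disk. The following is one possible variant of step 2, not part of the container; the function names `verify_backup` and `restore_backup` are hypothetical, and it assumes `POSTGRES_HOST` and `POSTGRES_USER` are set in your shell.

```bash
# Sketch only. Return 0 if the archive passes gzip's integrity test.
verify_backup() {
  gzip -t "$1" 2>/dev/null
}

# Restore archive $1 into database $2, streaming through gunzip.
restore_backup() {
  if ! verify_backup "$1"; then
    echo "corrupt or truncated archive: $1" >&2
    return 1
  fi
  # ON_ERROR_STOP=1 makes psql abort on the first SQL error instead of
  # continuing with a partially restored database.
  gunzip -c "$1" | psql -h "$POSTGRES_HOST" -U "$POSTGRES_USER" -d "$2" -v ON_ERROR_STOP=1
}
```

Usage would be `restore_backup database1_20240130_020000.sql.gz database1`.
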
## Troubleshooting

### Common Issues

1. **Connection refused**: Check PostgreSQL host and port
2. **Authentication failed**: Verify username and password
3. **S3 upload failed**: Check the S3 endpoint, access keys, and bucket permissions
4. **Out of space**: Ensure sufficient disk space in the `/backups` volume

### Logs

Check container logs for detailed information:

```bash
kubectl logs -l job-name=postgres-backup-<timestamp> -f
```

## Customization

You can modify the `backup.sh` script to:

- Add custom backup validation
- Implement different notification methods
- Add encryption before upload
- Modify backup naming conventions
- Add database-specific backup options

## License

This project is provided as-is for educational and production use.