Backup monitoring is currently available for AWS accounts only. Support for GCP and Azure backups is planned.
Supported Backup Types
| Type | Description | What Upmetr Checks |
|---|---|---|
| RDS Snapshot | Automated and manual RDS database snapshots | Latest snapshot age per RDS instance |
| EBS Snapshot | Point-in-time snapshots of EBS volumes (including DLM-managed) | Latest snapshot age per volume |
| S3 Dump | Database or application dumps stored in S3 buckets | Most recent object matching a configured prefix |
Backup Statuses
Each resource receives one of four statuses after every scan:

| Status | Meaning |
|---|---|
| OK | A recent backup exists within the configured threshold |
| Stale | A backup exists, but it is older than the threshold |
| Missing | No backup was found for the resource |
| Error | The backup check failed (e.g., permission issue or API error) |
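
The four statuses in the table above can be read as a simple decision order. The following is a minimal sketch, not Upmetr's actual implementation; the function name `classify_backup` and its parameters are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def classify_backup(
    latest_backup_at: Optional[datetime],
    threshold_hours: float,
    now: datetime,
    check_failed: bool = False,
) -> str:
    """Return one of the four statuses for a single resource (illustrative)."""
    if check_failed:
        return "Error"    # the check itself could not complete
    if latest_backup_at is None:
        return "Missing"  # no backup was found at all
    age = now - latest_backup_at
    if age > timedelta(hours=threshold_hours):
        return "Stale"    # a backup exists but is older than the threshold
    return "OK"           # a backup exists within the threshold


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(classify_backup(now - timedelta(hours=30), 25, now))
```

In this sketch a backup whose age equals the threshold exactly still counts as OK; whether the real check treats the boundary the same way is an assumption.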
How It Works
- A Celery Beat task (`check_all_backups`) runs once daily at 2:00 AM UTC
- It scans every active AWS account that has backup monitoring enabled
- For each account, it queries:
  - RDS snapshots (automated + manual) via the AWS RDS API
  - EBS snapshots via the EC2 API
  - S3 dump objects (only if S3 buckets are configured on the account)
- Each resource’s latest backup age is compared against its staleness threshold
- Results are upserted into the backup checks table — new resources are added automatically, existing ones are updated
- Status changes trigger or auto-resolve incidents as needed
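
The "latest backup age per resource" step can be sketched as a grouping over snapshot records. This is an illustrative sketch under assumptions, not the code behind `check_all_backups`: the helper names are hypothetical, and the input is assumed to be a list of dictionaries shaped like the snapshot entries the AWS APIs return (RDS snapshot records carry `DBInstanceIdentifier` and `SnapshotCreateTime`; EC2 snapshot records carry `VolumeId` and `StartTime`).

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, Mapping


def latest_per_resource(
    snapshots: Iterable[Mapping], id_key: str, time_key: str
) -> Dict[str, datetime]:
    """Map each resource ID to the creation time of its newest snapshot."""
    latest: Dict[str, datetime] = {}
    for snap in snapshots:
        resource_id = snap.get(id_key)
        created = snap.get(time_key)
        if resource_id is None or created is None:
            continue  # skip records that lack an ID or a timestamp
        if resource_id not in latest or created > latest[resource_id]:
            latest[resource_id] = created
    return latest


def backup_age_hours(latest_backup_at: datetime, now: datetime) -> float:
    """Age of the newest backup in hours, for comparison with the threshold."""
    return (now - latest_backup_at).total_seconds() / 3600.0


if __name__ == "__main__":
    rds = [
        {"DBInstanceIdentifier": "db-a",
         "SnapshotCreateTime": datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)},
    ]
    newest = latest_per_resource(rds, "DBInstanceIdentifier", "SnapshotCreateTime")
    print(backup_age_hours(newest["db-a"], datetime.now(timezone.utc)))
```

For EBS records the same helper would be called with `"VolumeId"` and `"StartTime"`. A resource that is known to exist but has no entry in the result would correspond to the Missing status.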
Enabling Backup Monitoring
Backup monitoring is enabled per cloud account:

- Go to Cloud Accounts and select an AWS account
- Enable Backup Monitoring
- Optionally adjust the staleness threshold (default: 25 hours for RDS, 168 hours for EBS)
- For S3 dumps, configure the target bucket and object prefix
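
To check that a bucket and prefix will match your dump files before enabling the check, the "most recent object matching a configured prefix" rule can be sketched as follows. The function name `latest_dump` and the example keys are hypothetical; the records are assumed to look like the `Contents` entries of an S3 object listing, with `Key` and `LastModified` fields.

```python
from datetime import datetime, timezone
from typing import Iterable, Mapping, Optional


def latest_dump(objects: Iterable[Mapping], prefix: str) -> Optional[Mapping]:
    """Return the most recently modified object whose key starts with prefix."""
    matching = [obj for obj in objects if obj["Key"].startswith(prefix)]
    if not matching:
        return None  # nothing under this prefix: the dump would count as missing
    return max(matching, key=lambda obj: obj["LastModified"])


if __name__ == "__main__":
    listing = [
        {"Key": "dumps/prod/2024-01-01.sql.gz",
         "LastModified": datetime(2024, 1, 1, 1, 0, tzinfo=timezone.utc)},
        {"Key": "dumps/prod/2024-01-02.sql.gz",
         "LastModified": datetime(2024, 1, 2, 1, 0, tzinfo=timezone.utc)},
    ]
    print(latest_dump(listing, "dumps/prod/"))
```

A prefix that is too broad (for example one that also covers log files) would let unrelated objects count as the latest dump, so the prefix should be as specific as the dump job's output path allows.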
Viewing Backups
Navigate to the Backups page from the sidebar. The page shows:

Summary Cards
Four cards at the top provide an at-a-glance overview:

- Total: number of resources being monitored
- OK: resources with a recent backup
- Stale: resources whose backup exceeds the threshold
- Missing: resources with no backup found
Filters
Use the filter bar to narrow the list by:

- Account: filter by a specific AWS account
- Type: RDS Snapshot, EBS Snapshot, or S3 Dump
- Status: OK, Stale, Missing, or Error
Backup Table
The main table displays each monitored resource with:

- Resource: name and AWS resource ID
- Account: which cloud account it belongs to
- Type: the backup type
- Last Backup: how long ago the latest backup was taken
- Status: current health status
Staleness Thresholds
The staleness threshold determines how old a backup can be before Upmetr flags it as Stale. Default values:

| Backup Type | Default Threshold | Rationale |
|---|---|---|
| RDS Snapshot | 25 hours | AWS automated snapshots run daily; 25h allows a 1-hour buffer |
| EBS Snapshot | 168 hours (7 days) | DLM policies typically run weekly |
| S3 Dump | 25 hours | Assumes daily database dump jobs |
For S3 dumps, the threshold can be set with `expected_frequency_hours` per bucket.
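
As a rough sketch of how a threshold might be resolved, the defaults from the table can be combined with an optional override. The type keys, the function name, and the idea that an override (such as an account-level setting or a bucket's `expected_frequency_hours`) simply replaces the default are all assumptions made for illustration.

```python
from typing import Optional

# Defaults taken from the table above; the key names are hypothetical.
DEFAULT_THRESHOLD_HOURS = {
    "rds_snapshot": 25,   # daily automated snapshots plus a 1-hour buffer
    "ebs_snapshot": 168,  # 7 days * 24 hours
    "s3_dump": 25,        # daily dump jobs plus a 1-hour buffer
}


def effective_threshold_hours(
    backup_type: str, override_hours: Optional[float] = None
) -> float:
    """Return the override if one is set, otherwise the default for the type."""
    if override_hours is not None:
        if override_hours <= 0:
            raise ValueError("threshold must be a positive number of hours")
        return override_hours
    return DEFAULT_THRESHOLD_HOURS[backup_type]


if __name__ == "__main__":
    print(effective_threshold_hours("ebs_snapshot"))      # default for EBS
    print(effective_threshold_hours("s3_dump", 12))       # 12-hour override
```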
Alert Integration
Backup monitoring is fully integrated with the incident and notification system:

- Stale backups create a warning-severity incident
- Missing backups create a critical-severity incident
- Incidents include the resource name, account, backup type, age, and threshold
- When a previously stale or missing backup recovers to OK, Upmetr auto-resolves the incident and sends a resolution notification
- All configured notification channels (email, Slack, SMS, webhook) receive backup alerts based on your notification rules
Backup incidents are labeled `BACKUP_STALE` and can be acknowledged or resolved manually like any other incident.
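
The status-to-incident rules above can be summarized as a small transition function. This is a sketch under assumptions, not Upmetr's incident logic: the function name and return shape are hypothetical, and two cases the text does not specify are filled in as assumptions (an Error status neither opens nor resolves an incident, and a change from Stale to Missing opens a critical incident).

```python
from typing import Optional, Tuple

# Severity per status, as described in the list above.
SEVERITY_BY_STATUS = {"Stale": "warning", "Missing": "critical"}


def incident_action(
    previous: Optional[str], current: str
) -> Optional[Tuple[str, Optional[str]]]:
    """Decide what a status change should do to the resource's incident."""
    if previous == current:
        return None  # no status change, nothing to do
    if current in SEVERITY_BY_STATUS:
        return ("open", SEVERITY_BY_STATUS[current])
    if current == "OK" and previous in SEVERITY_BY_STATUS:
        return ("resolve", None)  # recovery auto-resolves the incident
    return None  # assumption: Error and first-time OK trigger nothing


if __name__ == "__main__":
    print(incident_action("OK", "Stale"))
    print(incident_action("Missing", "OK"))
```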
Troubleshooting
No backups showing on the page
- Verify that backup monitoring is enabled on at least one AWS account
- Confirm the account is active (not disabled)
- Check that your plan includes the `cloud_resources` module
- Wait for the next daily scan (2:00 AM UTC) or check logs for errors
All backups showing as Stale
- Your threshold may be too low for your backup schedule. If backups run weekly, set the threshold to at least 168 hours (7 days), not 25
- Verify that AWS automated backups are actually enabled on the resource (RDS > Instance > Backup settings)
- Check if a recent AWS maintenance window or outage delayed snapshots
Missing backups for specific resources
- RDS: Confirm automated backups are enabled (backup retention > 0 days) in the AWS Console
- EBS: Ensure a DLM lifecycle policy or manual snapshot schedule covers the volume
- S3 Dumps: Verify the configured bucket and prefix match where your dump job writes files. Check that the IAM role used by Upmetr has `s3:ListBucket` and `s3:GetObject` permissions on the target bucket
Backup check shows Error status
- This usually indicates an AWS API permission issue. Ensure the IAM policy attached to Upmetr includes `rds:DescribeDBSnapshots`, `ec2:DescribeSnapshots`, and `s3:ListBucket`
- Check the backend logs (`docker-compose logs -f celery-worker`) for detailed error messages
- If the error is transient (e.g., API throttling), the task will retry automatically up to 3 times
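
When reviewing permissions, it can help to generate a reference policy document and compare it with what is attached to the Upmetr role. The sketch below builds one containing the actions named in this section; the function name and the bucket name `example-dump-bucket` are placeholders, and the split into three statements is one reasonable layout rather than a required one.

```python
import json
from typing import Dict, List


def backup_read_policy(bucket_names: List[str]) -> Dict:
    """Build a read-only IAM policy document for the backup checks."""
    statements = [
        {
            "Sid": "DescribeSnapshots",
            "Effect": "Allow",
            "Action": ["rds:DescribeDBSnapshots", "ec2:DescribeSnapshots"],
            "Resource": "*",
        }
    ]
    if bucket_names:
        # Listing is granted on the bucket itself ...
        statements.append({
            "Sid": "ListDumpBuckets",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
        })
        # ... while object reads are granted on the keys inside it.
        statements.append({
            "Sid": "ReadDumpObjects",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": [f"arn:aws:s3:::{name}/*" for name in bucket_names],
        })
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    print(json.dumps(backup_read_policy(["example-dump-bucket"]), indent=2))
```

A common cause of S3 permission errors is granting `s3:ListBucket` on the object ARN (`bucket/*`) instead of the bucket ARN, which is why the two S3 statements use different resources.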

