Upmetr automatically checks the backup status of your cloud resources on a daily schedule. Instead of manually verifying that snapshots exist and are recent, Upmetr scans your AWS accounts and flags any resource whose latest backup is missing or older than the configured threshold.
Backup monitoring is currently available for AWS accounts only. Support for GCP and Azure backups is planned.

Supported Backup Types

Type | Description | What Upmetr Checks
RDS Snapshot | Automated and manual RDS database snapshots | Latest snapshot age per RDS instance
EBS Snapshot | Point-in-time snapshots of EBS volumes (including DLM-managed) | Latest snapshot age per volume
S3 Dump | Database or application dumps stored in S3 buckets | Most recent object matching a configured prefix
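
As an illustration of the S3 Dump check, the sketch below picks the most recent object under a prefix. It is not Upmetr's actual code: the function name `latest_dump_time` and the sample keys are hypothetical, and the input is assumed to be shaped like the `Contents` list of an S3 ListObjectsV2 response (`Key` and `LastModified` fields).

```python
from datetime import datetime, timezone
from typing import Optional


def latest_dump_time(objects: list, prefix: str) -> Optional[datetime]:
    """Return the newest LastModified among objects whose Key starts with prefix.

    Returns None when nothing matches, which would correspond to a Missing status.
    """
    times = [obj["LastModified"] for obj in objects if obj["Key"].startswith(prefix)]
    return max(times) if times else None


# Hypothetical sample data in the shape of a ListObjectsV2 "Contents" list.
contents = [
    {"Key": "dumps/db-2024-05-01.sql.gz",
     "LastModified": datetime(2024, 5, 1, 1, 0, tzinfo=timezone.utc)},
    {"Key": "dumps/db-2024-05-02.sql.gz",
     "LastModified": datetime(2024, 5, 2, 1, 0, tzinfo=timezone.utc)},
    {"Key": "logs/app.log",
     "LastModified": datetime(2024, 5, 3, 1, 0, tzinfo=timezone.utc)},
]

newest = latest_dump_time(contents, "dumps/")
```

Objects outside the prefix (here `logs/app.log`) are ignored even when they are newer, which is why the prefix must match the location your dump job writes to.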

Backup Statuses

Each resource receives one of four statuses after every scan:
Status | Meaning
OK | A recent backup exists within the configured threshold
Stale | A backup exists, but it is older than the threshold
Missing | No backup was found for the resource
Error | The backup check failed (e.g., permission issue or API error)
A Missing status is treated as critical and will immediately create an incident. Stale backups generate a warning-level incident.
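
The four statuses can be expressed as a small decision function. This is a minimal sketch under stated assumptions: `classify_backup` and its parameters are hypothetical names, and the comparison treats a backup as Stale only when its age strictly exceeds the threshold.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def classify_backup(
    latest_backup: Optional[datetime],
    threshold_hours: float,
    now: datetime,
    check_failed: bool = False,
) -> str:
    """Map the latest backup time to one of the four statuses."""
    if check_failed:
        return "Error"      # the check itself failed (permissions, API error)
    if latest_backup is None:
        return "Missing"    # no backup found at all
    if now - latest_backup > timedelta(hours=threshold_hours):
        return "Stale"      # a backup exists but is older than the threshold
    return "OK"


now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
status = classify_backup(now - timedelta(hours=26), threshold_hours=25, now=now)
```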

How It Works

  1. A Celery Beat task (check_all_backups) runs once daily at 2:00 AM UTC
  2. It scans every active AWS account that has backup monitoring enabled
  3. For each account, it queries:
    • RDS snapshots (automated + manual) via the AWS RDS API
    • EBS snapshots via the EC2 API
    • S3 dump objects (only if S3 buckets are configured on the account)
  4. Each resource’s latest backup age is compared against its staleness threshold
  5. Results are upserted into the backup checks table — new resources are added automatically, existing ones are updated
  6. Status changes trigger or auto-resolve incidents as needed
Note: Backup monitoring requires the cloud_resources module to be included in your plan.
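
Steps 3 to 5 can be sketched as follows. This is an illustration, not the real task: `latest_per_resource` and `upsert_checks` are hypothetical helpers, the in-memory dict stands in for the backup checks table, and the snapshot dicts are assumed to carry the field names AWS returns (`DBInstanceIdentifier` / `SnapshotCreateTime` for RDS, `VolumeId` / `StartTime` for EBS).

```python
from datetime import datetime, timezone


def latest_per_resource(resource_ids, snapshots, id_key, time_key):
    """Return {resource_id: newest snapshot time, or None if it has no snapshot}."""
    latest = {rid: None for rid in resource_ids}
    for snap in snapshots:
        rid, taken = snap.get(id_key), snap.get(time_key)
        if rid not in latest or taken is None:
            continue  # unknown resource or entry without a timestamp
        if latest[rid] is None or taken > latest[rid]:
            latest[rid] = taken
    return latest


def upsert_checks(table, account_id, backup_type, latest):
    """Insert new rows and overwrite existing ones, keyed by account, type, resource."""
    for rid, taken in latest.items():
        table[(account_id, backup_type, rid)] = {"last_backup": taken}


# Hypothetical data: db-1 has two snapshots, db-2 has none.
t_old = datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc)
t_new = datetime(2024, 5, 2, 3, 0, tzinfo=timezone.utc)
rds_snapshots = [
    {"DBInstanceIdentifier": "db-1", "SnapshotCreateTime": t_old},
    {"DBInstanceIdentifier": "db-1", "SnapshotCreateTime": t_new},
]

checks_table = {}
latest = latest_per_resource(
    ["db-1", "db-2"], rds_snapshots, "DBInstanceIdentifier", "SnapshotCreateTime"
)
upsert_checks(checks_table, "acct-1", "rds_snapshot", latest)
```

Seeding the result with every known resource ID is what lets a resource with no snapshots surface as Missing instead of silently disappearing from the results.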

Enabling Backup Monitoring

Backup monitoring is enabled per cloud account:
  1. Go to Cloud Accounts and select an AWS account
  2. Enable Backup Monitoring
  3. Optionally adjust the staleness threshold (defaults: 25 hours for RDS snapshots and S3 dumps, 168 hours for EBS snapshots)
  4. For S3 dumps, configure the target bucket and object prefix
Once enabled, Upmetr starts checking that account on the next daily scan.

Viewing Backups

Navigate to the Backups page from the sidebar. The page shows:

Summary Cards

Four cards at the top provide an at-a-glance overview:
  • Total — number of resources being monitored
  • OK — resources with a recent backup
  • Stale — resources whose backup exceeds the threshold
  • Missing — resources with no backup found

Filters

Use the filter bar to narrow the list by:
  • Account — filter by a specific AWS account
  • Type — RDS Snapshot, EBS Snapshot, or S3 Dump
  • Status — OK, Stale, Missing, or Error

Backup Table

The main table displays each monitored resource with:
  • Resource — name and AWS resource ID
  • Account — which cloud account it belongs to
  • Type — the backup type
  • Last Backup — how long ago the latest backup was taken
  • Status — current health status
Results are sorted with the most critical issues first (missing/stale before ok).
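
The ordering can be pictured as a rank per status. In this sketch the ranks for Missing, Stale, and OK follow the description above; the position of Error and the tie-break by resource name are assumptions, and `sort_checks` is a hypothetical name.

```python
# Lower rank sorts first. Error's position is an assumption, not documented behavior.
STATUS_RANK = {"Missing": 0, "Stale": 1, "Error": 2, "OK": 3}


def sort_checks(checks):
    """Sort rows so the most critical statuses come first, then by resource name."""
    return sorted(
        checks,
        key=lambda c: (STATUS_RANK.get(c["status"], len(STATUS_RANK)), c["resource"]),
    )


rows = [
    {"resource": "vol-b", "status": "OK"},
    {"resource": "db-a", "status": "Stale"},
    {"resource": "db-c", "status": "Missing"},
]
ordered = [r["resource"] for r in sort_checks(rows)]
```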

Staleness Thresholds

The staleness threshold determines how old a backup can be before Upmetr flags it as Stale. Default values:
Backup Type | Default Threshold | Rationale
RDS Snapshot | 25 hours | AWS automated snapshots run daily; 25h allows a 1-hour buffer
EBS Snapshot | 168 hours (7 days) | DLM policies typically run weekly
S3 Dump | 25 hours | Assumes daily database dump jobs
You can override the default threshold per account by editing the Backup Alert Threshold setting on the cloud account. S3 dump configurations can also specify a custom expected_frequency_hours per bucket.
Set your threshold slightly above your actual backup frequency. For example, if RDS automated backups run daily, a 25-hour threshold gives a 1-hour grace period before alerting.
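
One way to picture how the defaults and overrides combine is shown below. The precedence used here (per-bucket expected_frequency_hours, then the account-level override, then the default) is an assumption for illustration, as are the function name and the type keys.

```python
# Defaults taken from the table above; the dictionary keys are hypothetical.
DEFAULT_THRESHOLD_HOURS = {"rds_snapshot": 25, "ebs_snapshot": 168, "s3_dump": 25}


def effective_threshold(backup_type, account_override=None, bucket_frequency=None):
    """Resolve the threshold in hours for one check (assumed precedence)."""
    if backup_type == "s3_dump" and bucket_frequency is not None:
        return bucket_frequency          # per-bucket expected_frequency_hours
    if account_override is not None:
        return account_override          # account-level Backup Alert Threshold
    return DEFAULT_THRESHOLD_HOURS[backup_type]


rds_default = effective_threshold("rds_snapshot")
ebs_custom = effective_threshold("ebs_snapshot", account_override=48)
s3_custom = effective_threshold("s3_dump", account_override=48, bucket_frequency=13)
```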

Alert Integration

Backup monitoring is fully integrated with the incident and notification system:
  • Stale backups create a warning-severity incident
  • Missing backups create a critical-severity incident
  • Incidents include the resource name, account, backup type, age, and threshold
  • When a previously stale or missing backup recovers to OK, Upmetr auto-resolves the incident and sends a resolution notification
  • All configured notification channels (email, Slack, SMS, webhook) receive backup alerts based on your notification rules
Incidents appear on the Incidents page with the trigger type BACKUP_STALE and can be acknowledged or resolved manually like any other incident.
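
The status transitions described above can be sketched as a function from the previous and current status to an incident action. This is a simplified model: `incident_action` is a hypothetical name, it assumes Stale and Missing incidents share the BACKUP_STALE trigger type, and how a Stale to Missing change or an Error status is handled is not specified here, so the sketch simply opens a new incident in the first case and does nothing in the second.

```python
from typing import Optional

SEVERITY_BY_STATUS = {"Stale": "warning", "Missing": "critical"}


def incident_action(previous: Optional[str], current: str) -> Optional[dict]:
    """Decide what to do with incidents when a check's status changes."""
    if previous == current:
        return None  # no change, nothing to do
    if current in SEVERITY_BY_STATUS:
        return {
            "action": "open",
            "severity": SEVERITY_BY_STATUS[current],
            "trigger_type": "BACKUP_STALE",  # assumed to cover Missing as well
        }
    if current == "OK" and previous in SEVERITY_BY_STATUS:
        # Recovery: auto-resolve and send a resolution notification.
        return {"action": "resolve", "notify": True}
    return None


opened = incident_action("OK", "Missing")
resolved = incident_action("Stale", "OK")
```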

Troubleshooting

No Backups Appear on the Backups Page

  • Verify that backup monitoring is enabled on at least one AWS account
  • Confirm the account is active (not disabled)
  • Check that your plan includes the cloud_resources module
  • Wait for the next daily scan (2:00 AM UTC) or check logs for errors

Backups Show as Stale

  • Your threshold may be too low for your backup schedule. If backups run weekly, set the threshold to at least 168 hours (7 days), not 25
  • Verify that AWS automated backups are actually enabled on the resource (RDS > Instance > Backup settings)
  • Check if a recent AWS maintenance window or outage delayed snapshots

Backups Show as Missing

  • RDS: Confirm automated backups are enabled (backup retention > 0 days) in the AWS Console
  • EBS: Ensure a DLM lifecycle policy or manual snapshot schedule covers the volume
  • S3 Dumps: Verify the configured bucket and prefix match where your dump job writes files. Check that the IAM role used by Upmetr has s3:ListBucket and s3:GetObject permissions on the target bucket

Backups Show as Error

  • An Error status usually indicates an AWS API permission issue. Ensure the IAM policy attached to Upmetr includes rds:DescribeDBSnapshots, ec2:DescribeSnapshots, and s3:ListBucket
  • Check the backend logs (docker-compose logs -f celery-worker) for detailed error messages
  • If the error is transient (e.g., API throttling), the task will retry automatically up to 3 times
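
The permissions named in this section can be collected into a single IAM policy document. The sketch below builds one as a Python dict and prints it as JSON; the helper name `build_backup_policy`, the statement IDs, and the bucket name `my-db-dumps` are placeholders, and your setup may need additional permissions not listed on this page.

```python
import json


def build_backup_policy(bucket_names):
    """Build a read-only IAM policy covering the actions named in this section."""
    statements = [
        {
            "Sid": "DescribeSnapshots",
            "Effect": "Allow",
            "Action": ["rds:DescribeDBSnapshots", "ec2:DescribeSnapshots"],
            "Resource": "*",
        }
    ]
    if bucket_names:
        # s3:ListBucket applies to the bucket ARN, s3:GetObject to the objects in it.
        statements.append({
            "Sid": "ListDumpBuckets",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
        })
        statements.append({
            "Sid": "ReadDumpObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": [f"arn:aws:s3:::{name}/*" for name in bucket_names],
        })
    return {"Version": "2012-10-17", "Statement": statements}


# "my-db-dumps" is a placeholder bucket name.
policy = build_backup_policy(["my-db-dumps"])
print(json.dumps(policy, indent=2))
```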