Upmetr uses the OpenTelemetry Collector to monitor host-level metrics (CPU, RAM, disk, network) and container metrics (Docker, Kubernetes, ECS). Agents are lightweight (the deployments below cap them at 0.25 CPU cores and 128 MB of memory) and push metrics to Upmetr over OTLP/HTTP, authenticated with a per-agent token.

How It Works

  1. Create an agent in Upmetr to get a unique token
  2. Deploy the OTel Collector on your server with that token
  3. The collector pushes metrics every 60 seconds
  4. Upmetr stores metrics in a TimescaleDB hypertable with 30-day retention
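Agents are stateless: they push metrics and don’t store data locally. If the connection drops, the collector retries failed exports from memory for a limited time (see retry_on_failure in the configuration below) and resumes pushing when connectivity is restored. Data from a longer outage is not backfilled, so expect a gap in the charts.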
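
To give a feel for what happens while the connection is down, here is a minimal Python sketch of the retry timing implied by the retry_on_failure settings used in this guide's collector configuration (5s initial interval, 30s maximum interval, 300s maximum elapsed time). The 1.5 growth multiplier is an assumption about the collector's default, and the real collector also adds random jitter, so treat the numbers as approximate. The function name retry_schedule is hypothetical.

```python
def retry_schedule(initial=5.0, max_interval=30.0, max_elapsed=300.0, multiplier=1.5):
    """Return the waits (in seconds) between export attempts for one batch.

    Assumption: waits grow by `multiplier` each time, are capped at
    `max_interval`, and retrying stops once the next wait would push the
    total past `max_elapsed`. Jitter is deliberately left out.
    """
    waits = []
    elapsed = 0.0
    interval = initial
    while elapsed + interval <= max_elapsed:
        waits.append(interval)
        elapsed += interval
        interval = min(interval * multiplier, max_interval)
    return waits


schedule = retry_schedule()
print(f"{len(schedule)} retries over {sum(schedule)} seconds")
```

Under these assumptions a batch that still cannot be delivered once the schedule is exhausted is dropped, which is why an outage longer than a few minutes shows up as a gap rather than being filled in later.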

Creating an Agent

  1. Go to Settings > Infra Agents
  2. Click Add Agent
  3. Enter a name (e.g., “prod-web-01”)
  4. Copy the generated agent token — you’ll need it for deployment

Deployment Options

Docker

The quickest way to deploy is a single docker run command:
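docker run -d \
  --name upmetr-agent \
  --restart unless-stopped \
  -v "$(pwd)/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:ro" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /proc:/hostfs/proc:ro \
  -v /sys:/hostfs/sys:ro \
  -v /:/hostfs:ro \
  -e OTEL_BACKEND_URL=https://app.upmetr.com \
  -e OTEL_AGENT_TOKEN=your-agent-token \
  -e OTEL_AGENT_ID=your-agent-name \
  --pid=host \
  --memory=128m \
  --cpus=0.25 \
  otel/opentelemetry-collector-contrib:0.145.0 \
  --config=/etc/otelcol-contrib/config.yaml
This command expects the otel-collector-config.yaml from the Docker Compose section below to be in the current directory. Without that mount, the image starts with its built-in default configuration, which does not reference these environment variables or export to Upmetr.
The Docker socket mount (/var/run/docker.sock) is required for container metrics. If you don’t need container monitoring, remove it, and also remove the docker_stats receiver from the configuration, since that receiver needs the socket.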

Docker Compose

Create an otel-collector-config.yaml:
receivers:
  hostmetrics:
    root_path: /hostfs
    collection_interval: 60s
    scrapers:
      cpu:
      memory:
      disk:
      filesystem:   # filesystem usage, needed for disk usage percentage
      network:
      load:
      paging:       # swap usage
  docker_stats:
    endpoint: unix:///var/run/docker.sock
    collection_interval: 60s

processors:
  batch:
    timeout: 10s
    send_batch_size: 1024
  resource:
    attributes:
      - key: agent_id
        value: ${env:OTEL_AGENT_ID}
        action: upsert

exporters:
  otlphttp:
    endpoint: ${env:OTEL_BACKEND_URL}/api/v1/otel
    headers:
      Authorization: "Bearer ${env:OTEL_AGENT_TOKEN}"
    compression: gzip
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, docker_stats]
      processors: [resource, batch]
      exporters: [otlphttp]
Then add the service under the services: key of your docker-compose.yml:
otel-collector:
  image: otel/opentelemetry-collector-contrib:0.145.0
  restart: unless-stopped
  volumes:
    - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /proc:/hostfs/proc:ro
    - /sys:/hostfs/sys:ro
    - /:/hostfs:ro
  environment:
    - OTEL_BACKEND_URL=https://app.upmetr.com
    - OTEL_AGENT_TOKEN=your-agent-token
    - OTEL_AGENT_ID=your-agent-name
  pid: host
  deploy:
    resources:
      limits:
        cpus: "0.25"
        memory: 128M

Collected Metrics

Host Metrics

| Metric | Description |
| --- | --- |
| CPU | Usage per core, idle, iowait, system, user |
| Memory | Used, available, cached, swap |
| Disk | Read/write bytes, IOPS, usage percentage |
| Network | Bytes sent/received, packets, errors |
| Load | 1m, 5m, 15m load averages |

Container Metrics

| Metric | Description |
| --- | --- |
| CPU | Per-container CPU usage |
| Memory | Per-container memory usage and limit |
| Network | Per-container network I/O |
| Block I/O | Per-container disk reads/writes |

CloudWatch Integration

For AWS managed services (RDS, ALB, etc.) that don’t run agents, Upmetr polls CloudWatch metrics every 5 minutes. This creates virtual agent entries in the metrics pipeline — no deployment needed. CloudWatch metrics are enabled automatically when you add an AWS cloud account with CloudWatch permissions.

Agent Health

Upmetr monitors agent health via heartbeats:
  • Agents are expected to report every 60 seconds
  • If no data is received for 5 minutes, the agent is marked as offline
  • An incident is created if the agent remains offline
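
The heartbeat rules above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds (reports every 60 seconds, offline after 5 minutes of silence), not Upmetr's actual implementation. The function names are hypothetical, and treating exactly 5 minutes of silence as offline is an assumption.

```python
from datetime import datetime, timedelta, timezone

REPORT_INTERVAL = timedelta(seconds=60)  # expected push cadence from this guide
OFFLINE_AFTER = timedelta(minutes=5)     # silence threshold from this guide


def is_offline(last_seen, now):
    """True once no data has arrived for OFFLINE_AFTER (boundary is an assumption)."""
    return now - last_seen >= OFFLINE_AFTER


def missed_reports(last_seen, now):
    """Number of whole reporting intervals that have passed without data."""
    return (now - last_seen) // REPORT_INTERVAL


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = now - timedelta(seconds=150)
print(is_offline(last_seen, now), missed_reports(last_seen, now))
```

An agent that has missed one or two reports is therefore still shown as online; only sustained silence flips it to offline.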

Resource Limits

The OTel Collector is designed to be lightweight:
| Resource | Limit |
| --- | --- |
| CPU | 0.25 cores |
| Memory | 128-256 MB |
| Network | ~1-5 KB/min (compressed) |
| Disk | None (stateless) |

Viewing Agent Metrics

To inspect metrics for a specific agent, navigate to Infrastructure and click on an agent card. This opens the agent detail page with real-time charts and gauges.

Time Range Selector

Use the time range selector in the top-right corner to adjust the chart window:
| Option | Window |
| --- | --- |
| 10 Min | Last 10 minutes |
| 30 Min | Last 30 minutes |
| 1 Hour | Last hour (default) |
| 6 Hours | Last 6 hours |
| 24 Hours | Last 24 hours |
| 7 Days | Last 7 days |

Host Metrics Charts (OTel Agents)

For standard OTel agents, the detail page shows:
  • CPU Utilization — Total CPU usage over time (computed as 1 minus idle)
  • Memory Utilization — Used memory percentage
  • Disk Utilization — Filesystem usage for the root mount
  • Network I/O — Bytes sent and received on the primary interface
  • Swap Usage — Paging utilization
Each metric includes a real-time gauge at the top of the page showing the current value, plus a time-series chart below.
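
As an illustration of the "1 minus idle" calculation, the sketch below derives utilization from two cumulative CPU-time samples, which is how the hostmetrics cpu scraper reports CPU time (seconds per state). The function name and the sample values are hypothetical, and how Upmetr computes the chart internally is not specified here.

```python
def cpu_utilization(prev, curr):
    """Utilization over an interval as 1 minus the idle share of CPU time.

    prev, curr: dicts mapping a CPU state name to cumulative seconds,
    taken at the start and end of the interval.
    """
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0  # no time elapsed between samples
    return 1.0 - deltas["idle"] / total


# Two samples taken 60 seconds apart on a single core (made-up values).
prev = {"user": 100.0, "system": 50.0, "iowait": 10.0, "idle": 840.0}
curr = {"user": 112.0, "system": 56.0, "iowait": 12.0, "idle": 880.0}
print(f"{cpu_utilization(prev, curr):.1%}")
```

Note that with this formula time spent in iowait counts as busy, because only the idle state is subtracted.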

Container Metrics

If the agent reports Docker or Kubernetes container data, a dedicated Containers section appears below the host charts. This shows per-container CPU and memory usage, making it easy to identify resource-hungry containers. For Kubernetes agents, additional Node and Pod sections display cluster-level metrics like node CPU/memory utilization and pod resource consumption.
All metrics are stored in a TimescaleDB hypertable with 30-day retention. Data older than 30 days is automatically pruned.

CloudWatch Virtual Agents

When you connect an AWS cloud account with CloudWatch permissions, Upmetr automatically creates a virtual agent for that account. Virtual agents appear in the Infrastructure list alongside real OTel agents — no deployment required.

How It Works

  1. Upmetr polls CloudWatch metrics every 5 minutes via Celery background tasks
  2. Metrics are stored in the same TimescaleDB hypertable as OTel agent data
  3. A virtual agent entry is created so you can browse cloud service metrics the same way you browse host metrics

CloudWatch Metric Sections

Each virtual agent organizes metrics by AWS resource type, with a dedicated color palette per section:
| Section | Metrics |
| --- | --- |
| RDS Databases | CPU utilization, connections, read/write latency, freeable memory, free storage, IOPS, swap, queue depth |
| Application Load Balancers | Request count, response time, 5xx errors, active/rejected connections, healthy/unhealthy hosts |
| CloudFront Distributions | Requests, bytes downloaded, 4xx/5xx error rates |
| WAF Web ACLs | Allowed, blocked, and counted requests |
For detailed information about WAF metrics and security monitoring, see the WAF Monitoring guide.

GCP and Azure Virtual Agents

The same virtual agent pattern applies to other cloud providers:
  • GCP Cloud Monitoring — Metrics for Compute Engine, Cloud SQL, and GKE clusters
  • Azure Monitor — Metrics for Azure VMs, Azure SQL, and AKS clusters
Each provider has its own metric sections with provider-specific charts (e.g., DTU consumption for Azure SQL, node/pod utilization for GKE/AKS).

Agent Types

The Infrastructure page shows agents of different types depending on your connected accounts and deployments:
| Type | Label | Description |
| --- | --- | --- |
| otel | Host Agent | Standard OpenTelemetry Collector deployed on your server. Collects CPU, memory, disk, network, and container metrics. |
| cloudwatch | CloudWatch | Virtual agent created automatically for AWS accounts. Polls RDS, ALB, CloudFront, and WAF metrics. |
| gcp_monitoring | GCP Monitoring | Virtual agent for GCP accounts. Polls Compute Engine, Cloud SQL, and GKE metrics. |
| azure_monitor | Azure Monitor | Virtual agent for Azure accounts. Polls VM, Azure SQL, and AKS metrics. |
| kubernetes | Kubernetes | OTel Collector deployed as a DaemonSet in a Kubernetes cluster. Reports node, pod, and container metrics. |
| ecs | ECS | OTel Collector deployed as an ECS daemon service. Reports task and container metrics. |

Filtering Agents

The Infrastructure list page provides two filter dropdowns to help you find agents quickly:
  • Cloud Account — Filter agents by their associated cloud account. Useful when you have multiple AWS, GCP, or Azure accounts connected and want to focus on one.
  • Agent Type — Filter by type (Host, CloudWatch, GCP Monitoring, Azure Monitor, Kubernetes, ECS). For example, select “CloudWatch” to see only virtual agents polling AWS managed services.
Both filters can be combined. For instance, you can select a specific AWS account and the CloudWatch type to see only the CloudWatch virtual agent for that account.

Troubleshooting

| Issue | Solution |
| --- | --- |
| Agent shows “Offline” | Check the collector is running: docker ps. Verify the token is correct. |
| No metrics appearing | Check the backend URL is reachable. Look at collector logs: docker logs upmetr-agent. |
| High memory usage | Reduce send_batch_size in the processor config. |
| Docker metrics missing | Ensure the Docker socket is mounted: -v /var/run/docker.sock:/var/run/docker.sock:ro. |
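
For the "backend URL is reachable" check, a small Python sketch like the following can help separate network problems from token problems. It relies on the fact that the otlphttp exporter appends /v1/metrics to the configured endpoint, so with this guide's configuration metrics go to <backend>/api/v1/otel/v1/metrics. The function names are hypothetical, and interpreting HTTP 401 or 403 as a rejected token is an assumption about how the Upmetr backend responds.

```python
import urllib.error
import urllib.request


def metrics_url(backend_url):
    """URL the otlphttp exporter posts metrics to, given this guide's endpoint."""
    return backend_url.rstrip("/") + "/api/v1/otel/v1/metrics"


def check_backend(backend_url, token, timeout=5.0):
    """Send an empty authenticated POST and return a short diagnosis string."""
    request = urllib.request.Request(
        metrics_url(backend_url),
        data=b"",
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-protobuf",
        },
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:
        # The server answered, so the network path itself is fine.
        if err.code in (401, 403):  # assumption: auth failures use these codes
            return f"reachable, but the token was rejected (HTTP {err.code})"
        return f"reachable (HTTP {err.code})"
    except OSError as err:
        # Covers DNS failures, refused connections, and timeouts.
        return f"unreachable: {err}"


# Example (run on the host where the collector is deployed):
# print(check_backend("https://app.upmetr.com", "your-agent-token"))
```

If the result is "unreachable", look at DNS, firewalls, or proxies on the host before changing the collector configuration.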