Infrastructure Agents

Upmetr uses the OpenTelemetry Collector to monitor host-level metrics (CPU, RAM, disk, network) and container metrics (Docker, Kubernetes, ECS). Agents are lightweight, consume minimal resources, and push metrics securely to Upmetr via OTLP/HTTP.

How It Works

  1. You create an agent in Upmetr and get a unique token
  2. Deploy the OTel Collector on your server with that token
  3. The collector pushes metrics every 60 seconds
  4. Upmetr stores metrics in a TimescaleDB hypertable with 30-day retention
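
Agents are stateless: they push metrics and don’t store data locally. If the connection drops, the exporter retries each batch for up to roughly 5 minutes (max_elapsed_time: 300s in the collector configuration below) and then discards it, so a longer outage leaves a gap. Metrics resume when connectivity is restored.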

Creating an Agent

  1. Go to Settings > Infra Agents
  2. Click Add Agent
  3. Enter a name (e.g., “prod-web-01”)
  4. Copy the generated agent token — you’ll need it for deployment

Deployment Options

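Docker

The quickest way to deploy is a single docker run command. It mounts the otel-collector-config.yaml shown under Docker Compose below, so save that file in the current directory first; without the mount, the image starts with its own default configuration, which does not use the OTEL_* variables:
docker run -d \
  --name upmetr-agent \
  --restart unless-stopped \
  -v "$(pwd)/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:ro" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \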
  -v /proc:/hostfs/proc:ro \
  -v /sys:/hostfs/sys:ro \
  -v /:/hostfs:ro \
  -e OTEL_BACKEND_URL=https://app.upmetr.com \
  -e OTEL_AGENT_TOKEN=your-agent-token \
  -e OTEL_AGENT_ID=your-agent-name \
  --pid=host \
  --memory=128m \
  --cpus=0.25 \
  otel/opentelemetry-collector-contrib:0.145.0 \
  --config=/etc/otelcol-contrib/config.yaml
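The Docker socket mount (/var/run/docker.sock) is required for container metrics. If you don’t need container monitoring, remove the mount and also remove docker_stats from the receivers in the collector configuration, since that receiver cannot connect without the socket.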
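
Before running the command above, it can help to confirm that the three variables it passes are actually set. The following is a minimal sketch, assuming you export the values in your shell rather than typing them inline; check_agent_env is a hypothetical helper name, not part of Upmetr or the collector.

```shell
#!/bin/sh
# check_agent_env (hypothetical helper): verify that the three variables
# passed to the collector are non-empty before starting it.
check_agent_env() {
  missing=""
  for name in OTEL_BACKEND_URL OTEL_AGENT_TOKEN OTEL_AGENT_ID; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

One way to use it is to chain it in front of the deployment command, for example check_agent_env && docker run ..., passing the values as -e OTEL_AGENT_TOKEN="$OTEL_AGENT_TOKEN" and so on, so the container is not started with an empty token.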

Docker Compose

Create an otel-collector-config.yaml:
receivers:
  hostmetrics:
    root_path: /hostfs
    collection_interval: 60s
    scrapers:
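      cpu:          # CPU time per core and state
      memory:       # memory usage by state
      paging:       # swap usage (needed for the swap figures listed under Host Metrics)
      disk:         # disk I/O bytes and operations
      filesystem:   # disk usage per mount (needed for the usage percentage)
      network:      # bytes, packets, and errors per interface
      load:         # 1m, 5m, 15m load averages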
  docker_stats:
    endpoint: unix:///var/run/docker.sock
    collection_interval: 60s

processors:
  batch:
    timeout: 10s
    send_batch_size: 1024
  resource:
    attributes:
      - key: agent_id
        value: ${env:OTEL_AGENT_ID}
        action: upsert

exporters:
  otlphttp:
    endpoint: ${env:OTEL_BACKEND_URL}/api/v1/otel
    headers:
      Authorization: "Bearer ${env:OTEL_AGENT_TOKEN}"
    compression: gzip
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, docker_stats]
      processors: [resource, batch]
      exporters: [otlphttp]
Then add the collector as a service under the services: key of your docker-compose.yml:
otel-collector:
  image: otel/opentelemetry-collector-contrib:0.145.0
  restart: unless-stopped
  volumes:
    - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /proc:/hostfs/proc:ro
    - /sys:/hostfs/sys:ro
    - /:/hostfs:ro
  environment:
    - OTEL_BACKEND_URL=https://app.upmetr.com
    - OTEL_AGENT_TOKEN=your-agent-token
    - OTEL_AGENT_ID=your-agent-name
  pid: host
  deploy:
    resources:
      limits:
        cpus: "0.25"
        memory: 128M

Collected Metrics

Host Metrics

| Metric | Description |
| --- | --- |
| CPU | Usage per core, idle, iowait, system, user |
| Memory | Used, available, cached, swap |
| Disk | Read/write bytes, IOPS, usage percentage |
| Network | Bytes sent/received, packets, errors |
| Load | 1m, 5m, 15m load averages |

Container Metrics

| Metric | Description |
| --- | --- |
| CPU | Per-container CPU usage |
| Memory | Per-container memory usage and limit |
| Network | Per-container network I/O |
| Block I/O | Per-container disk reads/writes |

CloudWatch Integration

For AWS managed services (RDS, ALB, etc.) where you can’t install an agent, Upmetr polls CloudWatch metrics every 5 minutes. Each polled resource appears as a virtual agent entry in the metrics pipeline, so no deployment is needed. CloudWatch metrics are enabled automatically when you add an AWS cloud account with CloudWatch permissions.

Agent Health

Upmetr monitors agent health via heartbeats:
  • Agents are expected to report every 60 seconds
  • If no data is received for 5 minutes, the agent is marked as offline
  • An incident is created if the agent remains offline
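
The offline rule above can be sketched as a small function. This is an illustration of the stated 5-minute threshold only, not Upmetr's implementation; agent_status is a hypothetical helper, and treating exactly 300 seconds as offline is an assumption.

```shell
#!/bin/sh
# agent_status (hypothetical helper): classify an agent from the time its
# last data point was received, using the 5-minute threshold from the text.
# Arguments: last_seen and now, both Unix timestamps in seconds.
OFFLINE_AFTER_SECONDS=300   # 5 minutes without data

agent_status() {
  last_seen=$1
  now=$2
  age=$((now - last_seen))
  if [ "$age" -ge "$OFFLINE_AFTER_SECONDS" ]; then
    echo "offline"
  else
    echo "online"
  fi
}
```

For example, agent_status "$last_seen" "$(date +%s)" classifies an agent as of now. With a 60-second reporting interval, an agent can miss four consecutive reports and still count as online under this rule.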

Resource Limits

The OTel Collector is designed to be lightweight:
| Resource | Limit |
| --- | --- |
| CPU | 0.25 cores |
| Memory | 128-256 MB |
| Network | ~1-5 KB/min (compressed) |
| Disk | None (stateless) |
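
As a rough back-of-the-envelope check, the figures on this page (one push every 60 seconds, 30-day retention, 1-5 KB/min of traffic) can be turned into totals. The sketch below only does that arithmetic; the function name is hypothetical, and actual volumes depend on how many metric series a host produces.

```shell
#!/bin/sh
# Rough totals derived only from numbers stated on this page.
MINUTES_PER_30_DAYS=$((30 * 24 * 60))   # also the number of 60-second pushes

# upload_kb_30_days (hypothetical helper): total upload in KB over 30 days
# for a given per-minute rate in KB.
upload_kb_30_days() {
  echo $(( $1 * MINUTES_PER_30_DAYS ))
}

echo "data points per series over 30 days: $MINUTES_PER_30_DAYS"
echo "upload at 1 KB/min: $(upload_kb_30_days 1) KB"
echo "upload at 5 KB/min: $(upload_kb_30_days 5) KB"
```

This works out to 43,200 data points per metric series within the retention window, and between 43,200 KB and 216,000 KB (about 43 to 216 MB) of upload per agent over 30 days.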

Troubleshooting

| Issue | Solution |
| --- | --- |
| Agent shows “Offline” | Check that the collector is running: docker ps. Verify the token is correct. |
| No metrics appearing | Check that the backend URL is reachable. Look at the collector logs: docker logs upmetr-agent (or docker compose logs otel-collector for the Compose setup). |
| High memory usage | Reduce send_batch_size in the batch processor config. |
| Docker metrics missing | Ensure the Docker socket is mounted: -v /var/run/docker.sock:/var/run/docker.sock:ro. |
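
To check reachability and the token from the host itself, one option is to send a request to the same URL the collector uses. The sketch below assumes curl is installed; both function names are hypothetical. The otlphttp exporter appends /v1/metrics to its endpoint setting, which is why the path is built that way. How Upmetr responds to an empty OTLP/JSON request is not documented here, so treat the status code only as a hint.

```shell
#!/bin/sh
# otlp_metrics_url (hypothetical helper): build the URL the otlphttp exporter
# posts metrics to, given the value of OTEL_BACKEND_URL.
otlp_metrics_url() {
  base=${1%/}                      # drop one trailing slash, if present
  echo "$base/api/v1/otel/v1/metrics"
}

# probe_backend (hypothetical helper): POST an empty OTLP/JSON payload and
# print only the HTTP status code. Arguments: backend URL, agent token.
probe_backend() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -X POST "$(otlp_metrics_url "$1")" \
    -H "Authorization: Bearer $2" \
    -H 'Content-Type: application/json' \
    --data '{"resourceMetrics":[]}'
}
```

Run it as probe_backend "$OTEL_BACKEND_URL" "$OTEL_AGENT_TOKEN". As a general HTTP rule of thumb, a 401 or 403 points at the token, while a curl error with status 000 points at DNS, firewall, or proxy problems between the host and the backend.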