Infrastructure Agents
Upmetr uses the OpenTelemetry Collector to monitor host-level metrics (CPU, RAM, disk, network) and container metrics (Docker, Kubernetes, ECS). Agents are lightweight, consume minimal resources, and push metrics securely to Upmetr via OTLP/HTTP.
How It Works
- You create an agent in Upmetr and get a unique token
- Deploy the OTel Collector on your server with that token
- The collector pushes metrics every 60 seconds
- Upmetr stores metrics in a TimescaleDB hypertable with 30-day retention
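Agents are stateless: they push metrics and don’t store data locally. If the connection drops, the exporter in the sample configuration retries each batch for up to 5 minutes (max_elapsed_time: 300s); data that cannot be delivered in that window is dropped, and reporting resumes when connectivity is restored.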
Creating an Agent
- Go to Settings > Infra Agents
- Click Add Agent
- Enter a name (e.g., “prod-web-01”)
- Copy the generated agent token — you’ll need it for deployment
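One way to handle the copied token is to keep it in an env file that only your user can read, so it does not have to appear in docker run arguments or in a committed compose file. The sketch below is an illustration, not part of Upmetr's tooling: the function name write_agent_env and the file name upmetr-agent.env are hypothetical, while the three variable names are the ones the collector configuration on this page reads.

```shell
# write_agent_env FILE TOKEN AGENT_ID [BACKEND_URL]
# Hypothetical helper: writes the variables used by the collector config
# into FILE and restricts the file to the current user.
write_agent_env() (
  file=$1 token=$2 agent_id=$3
  backend=${4:-https://app.upmetr.com}
  if [ -z "$token" ] || [ "$token" = "your-agent-token" ]; then
    echo "error: pass the token copied from Upmetr, not the placeholder" >&2
    exit 1
  fi
  umask 077   # new files are created readable by the owner only
  printf 'OTEL_BACKEND_URL=%s\nOTEL_AGENT_TOKEN=%s\nOTEL_AGENT_ID=%s\n' \
    "$backend" "$token" "$agent_id" > "$file"
  chmod 600 "$file"   # also tighten a file that already existed
)

# Example:
#   write_agent_env ./upmetr-agent.env "<token from Upmetr>" prod-web-01
#   docker run -d --env-file ./upmetr-agent.env ...   # instead of the three -e flags
```

The function runs in a subshell, so the umask change and the variables do not leak into your session.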
Deployment Options
- Docker (Linux/macOS)
- Kubernetes
- Amazon ECS
- Windows
Docker (Linux/macOS)
The quickest way to deploy. Save the collector configuration from the Docker Compose section below as otel-collector-config.yaml in the current directory, then run one command:

    docker run -d \
      --name upmetr-agent \
      --restart unless-stopped \
      -v "$(pwd)/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:ro" \
      -v /var/run/docker.sock:/var/run/docker.sock:ro \
      -v /proc:/hostfs/proc:ro \
      -v /sys:/hostfs/sys:ro \
      -v /:/hostfs:ro \
      -e OTEL_BACKEND_URL=https://app.upmetr.com \
      -e OTEL_AGENT_TOKEN=your-agent-token \
      -e OTEL_AGENT_ID=your-agent-name \
      --pid=host \
      --memory=128m \
      --cpus=0.25 \
      otel/opentelemetry-collector-contrib:0.145.0 \
      --config=/etc/otelcol-contrib/config.yaml

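The Docker socket mount (/var/run/docker.sock) is required for container metrics. If you don’t need container monitoring, remove it and also drop docker_stats from the receivers in the collector configuration, since that receiver reads from the socket.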
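As a rough sketch of that choice, the helper below prints the volume flags for either mode. The name agent_mounts and its two mode names are made up for illustration; the mounts themselves are the ones from the command above.

```shell
# agent_mounts [with-containers|host-only]
# Hypothetical helper: prints one -v flag per line. The Docker socket is
# included only when container metrics are wanted.
agent_mounts() (
  mode=${1:-with-containers}
  printf '%s\n' \
    '-v /proc:/hostfs/proc:ro' \
    '-v /sys:/hostfs/sys:ro' \
    '-v /:/hostfs:ro'
  if [ "$mode" = "with-containers" ]; then
    printf '%s\n' '-v /var/run/docker.sock:/var/run/docker.sock:ro'
  fi
)

# Example (left unquoted on purpose so the flags split into words):
#   docker run -d $(agent_mounts host-only) ... otel/opentelemetry-collector-contrib:0.145.0
```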
Docker Compose
Create an otel-collector-config.yaml:

    receivers:
      hostmetrics:
        root_path: /hostfs
        collection_interval: 60s
        scrapers:
          cpu:
          memory:
          disk:
          network:
          load:
      docker_stats:
        endpoint: unix:///var/run/docker.sock
        collection_interval: 60s

    processors:
      batch:
        timeout: 10s
        send_batch_size: 1024
      resource:
        attributes:
          - key: agent_id
            value: ${env:OTEL_AGENT_ID}
            action: upsert

    exporters:
      otlphttp:
        endpoint: ${env:OTEL_BACKEND_URL}/api/v1/otel
        headers:
          Authorization: "Bearer ${env:OTEL_AGENT_TOKEN}"
        compression: gzip
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_interval: 30s
          max_elapsed_time: 300s

    service:
      pipelines:
        metrics:
          receivers: [hostmetrics, docker_stats]
          processors: [resource, batch]
          exporters: [otlphttp]

Then add the service to your docker-compose.yml, under the top-level services key:

    services:
      otel-collector:
        image: otel/opentelemetry-collector-contrib:0.145.0
        restart: unless-stopped
        volumes:
          - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - /proc:/hostfs/proc:ro
          - /sys:/hostfs/sys:ro
          - /:/hostfs:ro
        environment:
          - OTEL_BACKEND_URL=https://app.upmetr.com
          - OTEL_AGENT_TOKEN=your-agent-token
          - OTEL_AGENT_ID=your-agent-name
        pid: host
        deploy:
          resources:
            limits:
              cpus: "0.25"
              memory: 128M

Kubernetes
Deploy as a DaemonSet so every node in your cluster is monitored.
Step 1: Create namespace

    kubectl create namespace upmetr-monitoring

Step 2: Create secret

    kubectl create secret generic upmetr-agent \
      --namespace upmetr-monitoring \
      --from-literal=token=your-agent-token \
      --from-literal=backend-url=https://app.upmetr.com

Step 3: Apply RBAC
The collector needs permissions to read node and pod metrics:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: upmetr-otel-collector
      namespace: upmetr-monitoring
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: upmetr-otel-collector
    rules:
      - apiGroups: [""]
        resources: ["nodes/stats", "nodes/proxy", "pods"]
        verbs: ["get", "list", "watch"]
      - apiGroups: ["apps"]
        resources: ["deployments", "daemonsets", "statefulsets"]
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: upmetr-otel-collector
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: upmetr-otel-collector
    subjects:
      - kind: ServiceAccount
        name: upmetr-otel-collector
        namespace: upmetr-monitoring

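Step 4: Deploy DaemonSet

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: upmetr-otel-collector
      namespace: upmetr-monitoring
    spec:
      selector:
        matchLabels:
          app: upmetr-otel-collector
      template:
        metadata:
          labels:
            app: upmetr-otel-collector
        spec:
          serviceAccountName: upmetr-otel-collector
          tolerations:
            - operator: Exists   # also schedule onto tainted nodes
          containers:
            - name: otel-collector
              image: otel/opentelemetry-collector-contrib:0.145.0
              env:
                # Values come from the secret created in Step 2
                - name: OTEL_AGENT_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: upmetr-agent
                      key: token
                - name: OTEL_BACKEND_URL
                  valueFrom:
                    secretKeyRef:
                      name: upmetr-agent
                      key: backend-url
              resources:
                requests:
                  cpu: 100m
                  memory: 256Mi
                limits:
                  cpu: 250m
                  memory: 256Mi
              volumeMounts:
                - name: hostfs-proc
                  mountPath: /hostfs/proc
                  readOnly: true
                - name: hostfs-sys
                  mountPath: /hostfs/sys
                  readOnly: true
          volumes:
            - name: hostfs-proc
              hostPath:
                path: /proc
            - name: hostfs-sys
              hostPath:
                path: /sys

This manifest does not mount a collector configuration or set OTEL_AGENT_ID. Provide both before applying it, for example with a ConfigMap mounted at /etc/otelcol-contrib/config.yaml.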
Monitored Metrics
The Kubernetes collector gathers:
- Node metrics — CPU, memory, disk, network per node
- Pod metrics — CPU/memory usage per pod via kubelet stats
- Cluster metrics — Deployment replicas, DaemonSet status
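Since the point of the DaemonSet is one collector per node, a quick check after rollout is to compare how many pods the DaemonSet wants with how many are ready. The function below is a minimal sketch with a hypothetical name (daemonset_coverage); the two numbers it takes correspond to the standard DaemonSet status fields desiredNumberScheduled and numberReady.

```shell
# daemonset_coverage DESIRED READY
# Hypothetical helper: reports whether every scheduled node has a ready
# collector pod. Exits non-zero when coverage is incomplete.
daemonset_coverage() (
  desired=$1 ready=$2
  if [ "$desired" -eq 0 ]; then
    echo "no nodes scheduled"
    exit 1
  fi
  if [ "$ready" -lt "$desired" ]; then
    echo "$((desired - ready)) of $desired nodes have no ready collector"
    exit 1
  fi
  echo "all $desired nodes covered"
)

# Example, using the names from the steps above:
#   set -- $(kubectl get daemonset upmetr-otel-collector -n upmetr-monitoring \
#     -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')
#   daemonset_coverage "$1" "$2"
```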
Amazon ECS
Deploy as a daemon service (EC2 launch type) or sidecar (Fargate).
Step 1: Store token in SSM

    aws ssm put-parameter \
      --name "/upmetr/agent-token" \
      --value "your-agent-token" \
      --type SecureString

Step 2: Register task definition

    {
      "family": "upmetr-otel-collector",
      "networkMode": "host",
      "containerDefinitions": [
        {
          "name": "otel-collector",
          "image": "otel/opentelemetry-collector-contrib:0.145.0",
          "essential": true,
          "cpu": 256,
          "memory": 512,
          "secrets": [
            {
              "name": "OTEL_AGENT_TOKEN",
              "valueFrom": "/upmetr/agent-token"
            }
          ],
          "environment": [
            {
              "name": "OTEL_BACKEND_URL",
              "value": "https://app.upmetr.com"
            }
          ],
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": "/ecs/upmetr-otel-collector",
              "awslogs-region": "us-east-1",
              "awslogs-stream-prefix": "otel"
            }
          }
        }
      ]
    }

Because the container definition uses secrets, the task definition also needs an executionRoleArn for a role that is allowed to read the parameter (ssm:GetParameters).
Step 3: Create daemon service

    aws ecs create-service \
      --cluster your-cluster \
      --service-name upmetr-otel-collector \
      --task-definition upmetr-otel-collector \
      --scheduling-strategy DAEMON

ECS Metrics
The ECS collector uses the awsecscontainermetrics receiver:
- Task metrics — CPU, memory per task
- Container metrics — Per-container resource usage
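One detail about the token secret in the task definition: ECS accepts a bare parameter name in valueFrom only when the parameter lives in the same Region as the task. For a parameter in another Region, the full ARN is required. The helper below is a small sketch that builds that ARN; the name ssm_parameter_arn and the account ID in the example are placeholders.

```shell
# ssm_parameter_arn REGION ACCOUNT_ID PARAMETER_NAME
# Hypothetical helper: builds the ARN of a Systems Manager parameter.
ssm_parameter_arn() (
  region=$1 account=$2 name=$3
  # The ARN format already has a "/" after "parameter", so strip the
  # leading "/" from the parameter name to avoid doubling it.
  printf 'arn:aws:ssm:%s:%s:parameter/%s\n' "$region" "$account" "${name#/}"
)

# Example (placeholder account ID):
#   ssm_parameter_arn us-east-1 123456789012 /upmetr/agent-token
```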
Windows
Supports Windows Server 2016+ and Windows 10/11.
PowerShell Install

    # Download the collector
    $version = "0.145.0"
    $url = "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v$version/otelcol-contrib_${version}_windows_amd64.tar.gz"
    Invoke-WebRequest -Uri $url -OutFile otelcol-contrib.tar.gz

    # Extract into the directory the service definition below points at
    New-Item -ItemType Directory -Force -Path C:\upmetr | Out-Null
    tar -xzf otelcol-contrib.tar.gz -C C:\upmetr

    # Install as Windows service (place config.yaml in C:\upmetr first)
    New-Service -Name "UpmetrOtelCollector" `
      -BinaryPathName "C:\upmetr\otelcol-contrib.exe --config C:\upmetr\config.yaml" `
      -StartupType Automatic
    Start-Service UpmetrOtelCollector

Windows uses the same config as Linux, but without root_path in the hostmetrics receiver (Windows doesn’t use /hostfs).
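If you already maintain the Linux configuration, one possible way to derive the Windows variant is to filter out the root_path line on a workstation that has a POSIX shell and copy the result to the Windows host. This is only a sketch: windows_config is a hypothetical name, and the filter removes nothing except root_path.

```shell
# windows_config
# Hypothetical helper: reads a collector config on stdin and writes it to
# stdout without any root_path line.
windows_config() {
  grep -v '^[[:space:]]*root_path:'
}

# Example:
#   windows_config < otel-collector-config.yaml > config.yaml
```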
Collected Metrics
Host Metrics
| Metric | Description |
|---|---|
| CPU | Usage per core, idle, iowait, system, user |
| Memory | Used, available, cached, swap |
| Disk | Read/write bytes, IOPS, usage percentage |
| Network | Bytes sent/received, packets, errors |
| Load | 1m, 5m, 15m load averages |
Container Metrics
| Metric | Description |
|---|---|
| CPU | Per-container CPU usage |
| Memory | Per-container memory usage and limit |
| Network | Per-container network I/O |
| Block I/O | Per-container disk reads/writes |
CloudWatch Integration
For AWS managed services (RDS, ALB, etc.) that don’t run agents, Upmetr polls CloudWatch metrics every 5 minutes. This creates virtual agent entries in the metrics pipeline — no deployment needed.
CloudWatch metrics are enabled automatically when you add an AWS cloud account with CloudWatch permissions.
Agent Health
Upmetr monitors agent health via heartbeats:
- Agents are expected to report every 60 seconds
- If no data is received for 5 minutes, the agent is marked as offline
- An incident is created if the agent remains offline
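The offline rule above amounts to a simple threshold on the time since the last report. The sketch below restates it as a function; it is an illustration of the rule as described here, not Upmetr's server-side implementation, and the name agent_status is hypothetical.

```shell
# agent_status LAST_SEEN_EPOCH [NOW_EPOCH]
# Hypothetical helper: prints "offline" once 5 minutes have passed since
# the last report, otherwise "online". Times are Unix timestamps.
agent_status() (
  last_seen=$1
  now=${2:-$(date +%s)}
  offline_after=300   # 5 minutes, per the rule above
  if [ $((now - last_seen)) -ge "$offline_after" ]; then
    echo offline
  else
    echo online
  fi
)

# Example: an agent last seen 4 missed reports (240 s) ago is still online.
#   agent_status 1700000000 1700000240
```

With a 60-second reporting interval, an agent therefore has to miss about five consecutive reports before it is marked offline.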
Resource Limits
The OTel Collector is designed to be lightweight:
| Resource | Limit |
|---|---|
| CPU | 0.25 cores |
| Memory | 128-256 MB |
| Network | ~1-5 KB/min (compressed) |
| Disk | None (stateless) |
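To put the network figure in perspective, the per-minute rate can be scaled to a month. The arithmetic below is a back-of-the-envelope sketch based only on the ~1-5 KB/min range in the table; monthly_kb is a hypothetical name.

```shell
# monthly_kb KB_PER_MINUTE [DAYS]
# Hypothetical helper: scales a per-minute transfer rate to a number of
# days (default 30) and prints the total in KB.
monthly_kb() (
  kb_per_min=$1
  days=${2:-30}
  echo $((kb_per_min * 60 * 24 * days))
)

# Example:
#   monthly_kb 1    # 43200 KB, about 43 MB
#   monthly_kb 5    # 216000 KB, about 216 MB
```

At these rates a single agent sends roughly 43 to 216 MB per 30 days.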
Troubleshooting
| Issue | Solution |
|---|---|
| Agent shows “Offline” | Check the collector is running: docker ps. Verify the token is correct. |
| No metrics appearing | Check the backend URL is reachable. Look at collector logs: docker logs upmetr-agent. |
| High memory usage | Reduce send_batch_size in the processor config. |
| Docker metrics missing | Ensure the Docker socket is mounted: -v /var/run/docker.sock:/var/run/docker.sock:ro. |
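When checking whether the backend URL is reachable, it helps to test the exact URL the collector posts to. The otlphttp exporter appends /v1/metrics to its configured endpoint, so with the sample configuration on this page the full path is /api/v1/otel/v1/metrics. The helper below is a sketch that builds that URL; upmetr_metrics_url is a hypothetical name, and the status codes Upmetr returns are not documented here.

```shell
# upmetr_metrics_url [BACKEND_URL]
# Hypothetical helper: prints the URL the otlphttp exporter sends metrics
# to, given the backend base URL (defaults to $OTEL_BACKEND_URL, then to
# https://app.upmetr.com). A trailing "/" on the base URL is ignored.
upmetr_metrics_url() (
  base=${1:-${OTEL_BACKEND_URL:-https://app.upmetr.com}}
  printf '%s/api/v1/otel/v1/metrics\n' "${base%/}"
)

# Example: send an empty OTLP/JSON payload and print only the HTTP status.
#   curl -sS -o /dev/null -w '%{http_code}\n' -X POST \
#     -H "Authorization: Bearer $OTEL_AGENT_TOKEN" \
#     -H 'Content-Type: application/json' \
#     -d '{"resourceMetrics":[]}' \
#     "$(upmetr_metrics_url)"
```

A connection or TLS error from curl points at networking, while an HTTP status code means the backend was reached and the problem is more likely the token or the request itself.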