Prometheus Monitor

Scrape a Prometheus /metrics endpoint and validate that a specific metric meets expected criteria.

Use Cases

Configuration

monitors:
  - id: prometheus-request-rate
    name: Prometheus Request Rate
    type: prometheus
    interval: 2m
    timeout: 10s
    config:
      host: prometheus.internal.example
      port: 9090
      path: /metrics
      metric_name: http_requests_total
      expected_value: "1"
      comparison: gt
      skip_tls_verify: false

Configuration Fields

Field Required Default Description
host Prometheus server hostname or IP address
port 9090 Prometheus server port
path /metrics HTTP path to metrics endpoint
metric_name Prometheus metric name to extract (e.g., http_requests_total, up)
expected_value Expected value for comparison (numeric or string)
comparison eq Comparison operator: eq (equal), gt (greater than), lt (less than), gte (≥), lte (≤)
skip_tls_verify false Skip TLS certificate verification (not recommended for production)

Metric Format

The monitor extracts metrics from Prometheus text format (OpenMetrics):

# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1234
http_requests_total{method="POST",status="201"} 567

The monitor extracts the first matching metric name and uses its value for comparison.

Comparison Operators

Example: Monitor Application Requests

monitors:
  - id: app-request-rate
    name: App Request Rate
    type: prometheus
    groups: [Application]
    interval: 1m
    timeout: 5s
    config:
      host: app.internal.example
      port: 8080
      path: /metrics
      metric_name: app_http_requests_total
      expected_value: "100"
      comparison: gt

This checks that the app has processed more than 100 HTTP requests.

Example: Monitor System Health

monitors:
  - id: prometheus-up
    name: Prometheus Up
    type: prometheus
    groups: [Observability]
    interval: 5m
    timeout: 10s
    config:
      host: prometheus.internal.example
      port: 9090
      metric_name: up
      expected_value: "1"
      comparison: eq

This checks that the up metric (standard Prometheus metric) is 1 (healthy).

Metadata

The monitor captures the following metadata in check results:

{
  "metric_name": "http_requests_total",
  "value": "1234",
  "expected": "1000",
  "comparison": "gt"
}

Common Issues

Metric not found: Ensure the metric is exposed by the Prometheus target and the metric name matches exactly (metric names are case-sensitive).

TLS certificate errors: Use skip_tls_verify: true only in development. For production, ensure the certificate is valid or use a proxy with valid certificates.

Timeout errors: Increase the timeout if the Prometheus instance is slow to respond or the metrics endpoint is large.

See Also