Monitoring dashboards

| Panel | Query / source |
| --- | --- |
| heapUsed over time | `nodejs_heap_size_used_bytes` |
| rss over time | `process_resident_memory_bytes` |
| Request rate | `http_requests_total` |
| Deploy markers | annotations |
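
To make the memory panels concrete, here is a minimal sketch of where those two series come from inside a Node process. It assumes the metric names match what a Prometheus client library exports by default (the source does not say which exporter is in use), and `renderMemoryMetrics` is a hypothetical helper written by hand for illustration, not a library API.

```javascript
// Hand-rolled sketch: maps process.memoryUsage() fields to the metric names
// used by the memory panels. renderMemoryMetrics is a hypothetical helper;
// real services would normally let a metrics client library do this.
function renderMemoryMetrics(mem = process.memoryUsage()) {
  const gauges = [
    ['nodejs_heap_size_used_bytes', mem.heapUsed],
    ['nodejs_heap_size_total_bytes', mem.heapTotal],
    ['process_resident_memory_bytes', mem.rss],
  ];
  // One "# TYPE" line plus one sample line per gauge, in Prometheus text format.
  return gauges
    .map(([name, value]) => `# TYPE ${name} gauge\n${name} ${value}\n`)
    .join('');
}

console.log(renderMemoryMetrics());
```

All three values are gauges: they can go down as well as up, which matters when choosing query functions for them.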

Overlay request rate (RPS) on the memory panels to distinguish a leak from traffic growth: memory that climbs while RPS stays flat points to a leak.

Leak suspicion:

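delta(nodejs_heap_size_used_bytes[1h]) > 5e6   # heap grew by more than 5 MB in the last hour; the metric is a gauge, so use delta(), not rate()
and on (instance)
abs(sum by (instance) (rate(http_requests_total[1h])) / sum by (instance) (rate(http_requests_total[1h] offset 1h)) - 1) < 0.10   # request rate within 10% of the previous hour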
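
The same rule can be checked offline against exported samples. The following is a sketch only: the sample shape (`t` in seconds, `v` in bytes), the function names, and the choice of a least-squares slope to smooth out the GC sawtooth are all assumptions for illustration, not something the source prescribes.

```javascript
// Least-squares slope of v over t (units of v per second).
function slope(points) {
  const n = points.length;
  if (n < 2) return 0;
  const meanT = points.reduce((s, p) => s + p.t, 0) / n;
  const meanV = points.reduce((s, p) => s + p.v, 0) / n;
  let num = 0;
  let den = 0;
  for (const p of points) {
    num += (p.t - meanT) * (p.v - meanV);
    den += (p.t - meanT) ** 2;
  }
  return den === 0 ? 0 : num / den;
}

// Hypothetical helper: true when heap grows faster than the threshold
// while request rate stays within maxTrafficChange of the previous window.
function isLeakSuspected(heapSamples, rpsPrev, rpsNow, opts = {}) {
  const { growthBytesPerHour = 5e6, maxTrafficChange = 0.10 } = opts;
  const growthPerHour = slope(heapSamples) * 3600; // t is in seconds
  let trafficChange;
  if (rpsPrev === 0) {
    // No baseline traffic: unchanged only if there is still none.
    trafficChange = rpsNow === 0 ? 0 : Infinity;
  } else {
    trafficChange = Math.abs(rpsNow / rpsPrev - 1);
  }
  return growthPerHour > growthBytesPerHour && trafficChange < maxTrafficChange;
}

const heap = [
  { t: 0, v: 100e6 },
  { t: 1800, v: 104e6 },
  { t: 3600, v: 108e6 },
];
console.log(isLeakSuspected(heap, 100, 103)); // heap +8 MB/h, traffic +3%
console.log(isLeakSuspected(heap, 100, 150)); // heap +8 MB/h, traffic +50%
```

The first call flags a suspected leak; the second does not, because the heap growth is plausibly explained by the traffic increase.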

Imminent OOM:

expr: nodejs_heap_size_used_bytes / nodejs_heap_size_total_bytes > 0.85
for: 15m
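
The "for 15m" part means the ratio must stay above the threshold continuously, not just touch it once. A minimal sketch of that hold logic over a list of samples follows; the sample shape (`t` in seconds, `used` and `total` in bytes) and the name `isOomImminent` are hypothetical.

```javascript
// Hypothetical helper: true once used/total has stayed above `threshold`
// for at least `holdSeconds` without dropping back below it.
function isOomImminent(samples, { threshold = 0.85, holdSeconds = 15 * 60 } = {}) {
  let breachStart = null;
  for (const { t, used, total } of samples) {
    if (total > 0 && used / total > threshold) {
      if (breachStart === null) breachStart = t; // breach begins here
      if (t - breachStart >= holdSeconds) return true;
    } else {
      breachStart = null; // any dip resets the timer
    }
  }
  return false;
}

// Samples every 5 minutes, all at 90% heap usage.
const sustained = [0, 300, 600, 900].map((t) => ({ t, used: 90, total: 100 }));
console.log(isOomImminent(sustained));
```

A single sample below the threshold resets the timer, which keeps short spikes after a large allocation from paging anyone.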

Figure: "Production heap with deploy marker". RSS (GB) over 24 hours: 1.2 at 00h, 1.3 at 06h, 1.8 at 12h, 2.4 at 18h, 3.1 at 24h.

RSS starts climbing at 12h with no matching change in traffic → investigate the diff of the deploy at that point.
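
Finding where the growth starts can be done mechanically before opening the deploy diff. The sketch below uses the RSS values from the chart; `findGrowthOnset` and the 0.3 GB per-interval threshold are assumptions chosen for this example.

```javascript
// Returns the label of the first sample whose increase over the previous
// sample exceeds minDelta, or null if growth never crosses that threshold.
function findGrowthOnset(labels, values, minDelta) {
  for (let i = 1; i < values.length; i++) {
    if (values[i] - values[i - 1] > minDelta) return labels[i];
  }
  return null;
}

const hours = ['00h', '06h', '12h', '18h', '24h'];
const rssGb = [1.2, 1.3, 1.8, 2.4, 3.1];
console.log(findGrowthOnset(hours, rssGb, 0.3)); // → 12h
```

The returned label can then be compared against the deploy annotations to pick the release to inspect.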

| Tool | Role |
| --- | --- |
| Prometheus + Grafana | OSS metrics |
| Datadog / New Relic | APM + process metrics |
| clinic.js | Deep Node profiling (staging) |

See Tool comparison and the dedicated Grafana & Kubernetes lesson for cAdvisor metrics, OOMKill loops, and multi-pod debugging.