Monitoring dashboards
Minimum viable dashboard
| Panel | Query / source |
|---|---|
| heapUsed over time | nodejs_heap_size_used_bytes |
| rss over time | process_resident_memory_bytes |
| Request rate | http_requests_total |
| Deploy markers | annotations |
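Overlay RPS on the memory panels to distinguish a leak from traffic growth: memory that keeps climbing while RPS stays flat suggests a leak, while memory that tracks RPS suggests load.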
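The same comparison can be made numerically. The following is a minimal sketch, assuming heap and traffic are sampled at the same timestamps over one window. The `Sample` shape, the function `classifyMemoryGrowth`, and both default thresholds are hypothetical and chosen only for illustration.

```typescript
// Hypothetical sample shape: one point per scrape.
interface Sample {
  heapUsedBytes: number;
  requestsPerSecond: number;
}

type Verdict = "stable" | "traffic-growth" | "possible-leak";

// Relative change between the first and last value of a series.
function relativeChange(first: number, last: number): number {
  if (first === 0) return last === 0 ? 0 : Infinity;
  return (last - first) / first;
}

// Compares heap growth with traffic growth across the window.
// Both thresholds are illustrative assumptions, not recommended values.
function classifyMemoryGrowth(
  samples: Sample[],
  heapGrowthThreshold = 0.2,
  trafficGrowthThreshold = 0.1,
): Verdict {
  if (samples.length < 2) return "stable";
  const first = samples[0];
  const last = samples[samples.length - 1];
  const heapChange = relativeChange(first.heapUsedBytes, last.heapUsedBytes);
  const trafficChange = relativeChange(first.requestsPerSecond, last.requestsPerSecond);
  if (heapChange < heapGrowthThreshold) return "stable";
  // Heap grew: blame traffic only if traffic grew as well.
  return trafficChange < trafficGrowthThreshold ? "possible-leak" : "traffic-growth";
}
```

Comparing only the first and last sample keeps the sketch short; a real check would more likely fit a slope over the whole window so that a single noisy scrape does not flip the verdict.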
Alert rules (examples)
Leak suspicion:

    delta(nodejs_heap_size_used_bytes[1h]) > 5MB per hour
    AND
    rate(http_requests_total[1h]) < 10% change

nodejs_heap_size_used_bytes is a gauge, so its growth is measured with delta() rather than rate(), which is meant for counters such as http_requests_total.

Imminent OOM:

    nodejs_heap_size_used_bytes / nodejs_heap_size_total_bytes > 0.85 for 15m

Grafana-style mental model

    xychart-beta
        title "Production heap with deploy marker"
        x-axis [00h, 06h, 12h, 18h, 24h]
        y-axis "RSS GB" 0 --> 4
        line [1.2, 1.3, 1.8, 2.4, 3.1]

A step at 12h with no matching change in traffic points to the deploy; investigate the deploy diff.
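The step heuristic above can be sketched in code. This is an illustration under stated assumptions, not a production detector: `findStepIndex` and `stepMatchesDeploy` are hypothetical helpers, the 0.3 GB step size and 10% traffic tolerance are arbitrary, and the `rps` values are invented flat traffic. Only the RSS series comes from the chart. Both arrays are assumed to be sampled at the same timestamps.

```typescript
// Index of the first sample whose increase over the previous sample
// exceeds `minStep`, or -1 if no such jump exists.
function findStepIndex(series: number[], minStep: number): number {
  for (let i = 1; i < series.length; i++) {
    if (series[i] - series[i - 1] > minStep) return i;
  }
  return -1;
}

// True when the first step lands on the deploy sample and traffic stayed
// within `trafficTolerance` (relative change) across that same interval.
function stepMatchesDeploy(
  rssGb: number[],
  rps: number[],
  deployIndex: number,
  minStepGb = 0.3,
  trafficTolerance = 0.1,
): boolean {
  const stepIndex = findStepIndex(rssGb, minStepGb);
  if (stepIndex === -1 || stepIndex !== deployIndex) return false;
  const before = rps[stepIndex - 1];
  const after = rps[stepIndex];
  if (before === 0) return after === 0;
  return Math.abs(after - before) / before <= trafficTolerance;
}

const labels = ["00h", "06h", "12h", "18h", "24h"];
const rssGb = [1.2, 1.3, 1.8, 2.4, 3.1]; // RSS series from the chart
const rps = [100, 98, 101, 99, 102];     // hypothetical flat traffic
const deployIndex = labels.indexOf("12h");

const suspectDeploy = stepMatchesDeploy(rssGb, rps, deployIndex);
```

With these inputs the first jump larger than 0.3 GB falls on the 12h sample while traffic moves by about 3%, so `suspectDeploy` is `true` and the deploy diff is the first place to look.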
| Tool | Role |
|---|---|
| Prometheus + Grafana | OSS metrics |
| Datadog / New Relic | APM + process metrics |
| clinic.js | Deep Node profiling (staging) |
See Tool comparison and the dedicated Grafana & Kubernetes lesson for cAdvisor metrics, OOMKill loops, and multi-pod debugging.