The Prometheus Query Cheat Sheet is your ultimate reference for mastering PromQL — the powerful query language used to extract, analyze, and visualize time-series metrics in Prometheus. Whether you’re a DevOps engineer, SRE, or developer, this guide simplifies the most important queries, operators, and aggregation functions you’ll need for effective monitoring.
Prometheus Query Language (PromQL) lets you perform complex operations on metrics, including filtering, mathematical computations, and data aggregation — all in real time. By the end of this guide, you’ll be able to query metrics confidently, visualize performance trends, and create meaningful dashboards.
What Is Prometheus and PromQL?
Prometheus is an open-source monitoring and alerting system originally developed at SoundCloud and now maintained as a Cloud Native Computing Foundation project. It scrapes metrics from configured targets at regular intervals, stores them as time-series data, and lets you query that data using PromQL (Prometheus Query Language).
PromQL is the backbone of Prometheus — it allows you to slice, dice, and analyze your metrics in real-time. You can retrieve metrics for CPU usage, memory consumption, network activity, HTTP request latency, and much more.
Example:
up
This simple query returns the status of monitored targets (1 for up, 0 for down).
Why Use This Prometheus Query Cheat Sheet?
Prometheus has a vast query language, and memorizing every function or operator isn’t practical. This Prometheus Query Cheat Sheet gives you:
- A quick reference for all key PromQL commands.
- Ready-to-use examples for system and application monitoring.
- Aggregation, mathematical, and rate function usage.
- Simplified explanations for faster debugging and dashboard building.
PromQL Query Types Explained
Prometheus supports different query types for various use cases:
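1. Instant Queries
Instant queries evaluate an expression at a single point in time and return the most recent value of each matching series.
node_memory_MemFree_bytes
2. Range Queries
Range queries evaluate the same expression repeatedly at fixed steps between a start and an end time, which is how graphs are drawn. Note that the [5m] below is a range vector selector, not a range query: at each evaluation it hands rate() the last five minutes of samples.
rate(http_requests_total[5m])
3. Vector Queries
Strictly speaking, this is a result type rather than a query type. A selector such as http_requests_total returns an instant vector: one sample for every time series that shares the metric name.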
Understanding these types helps you choose the right query format for alerts, graphs, or analysis.
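Instant and range queries correspond to two endpoints of the Prometheus HTTP API, /api/v1/query and /api/v1/query_range. The sketch below only builds the request URLs so the difference in parameters is visible. The server address and the helper names are assumptions made for illustration, not part of the original guide.

```python
from urllib.parse import urlencode

# Assumed address of a local Prometheus server (hypothetical; adjust as needed).
BASE_URL = "http://localhost:9090"


def instant_query_url(expr, time):
    """Build a URL for /api/v1/query, which evaluates expr at one timestamp."""
    return f"{BASE_URL}/api/v1/query?" + urlencode({"query": expr, "time": time})


def range_query_url(expr, start, end, step):
    """Build a URL for /api/v1/query_range, which evaluates expr at every step."""
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{BASE_URL}/api/v1/query_range?" + urlencode(params)


print(instant_query_url("up", 1700000000))
print(range_query_url("rate(http_requests_total[5m])", 1700000000, 1700003600, "60s"))
```

The same expression can be sent to either endpoint; only the time parameters change.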
Basic Prometheus Query Examples
Here are a few commonly used PromQL queries for everyday monitoring.
| Purpose | Query Example |
|---|---|
| Show all active targets | up |
| Show CPU time rate per core and mode | rate(node_cpu_seconds_total[1m]) |
| Show memory usage | node_memory_MemTotal_bytes - node_memory_MemFree_bytes |
| Disk usage percentage | (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 |
| HTTP request rate | rate(http_requests_total[5m]) |
These queries are the foundation of system-level monitoring.
Using Labels in Prometheus Queries
Labels allow fine-grained filtering of metrics. Each time series in Prometheus is uniquely identified by its metric name and label set.
Filter by Label
http_requests_total{method="GET"}
Exclude a Label
http_requests_total{method!="POST"}
Multiple Label Filters
http_requests_total{method="GET", handler="/api"}
Labels help you zoom in on specific applications, endpoints, or nodes for precise analysis.
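To make the matcher semantics concrete, here is a minimal Python sketch that filters an in-memory list of series the way a selector does. The sample series are hypothetical, and the regex operators (=~ and !~) are included as an assumption beyond the examples above. Python's re module stands in for the RE2 syntax Prometheus uses, so treat it as an approximation.

```python
import re

# Hypothetical series, each represented as a dict of labels.
series = [
    {"__name__": "http_requests_total", "method": "GET", "handler": "/api"},
    {"__name__": "http_requests_total", "method": "POST", "handler": "/api"},
    {"__name__": "http_requests_total", "method": "GET", "handler": "/login"},
]


def matches(labels, name, op, value):
    """Apply one label matcher; a missing label is treated as the empty string."""
    actual = labels.get(name, "")
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":  # regex matchers are fully anchored in PromQL
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown matcher operator: {op}")


def select(all_series, matchers):
    """Keep the series that satisfy every matcher (matchers are ANDed)."""
    return [s for s in all_series if all(matches(s, *m) for m in matchers)]


get_api = select(series, [("method", "=", "GET"), ("handler", "=", "/api")])
not_post = select(series, [("method", "!=", "POST")])
print(len(get_api), len(not_post))
```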
Working with Rate and Increase Functions
The rate() and increase() functions are essential for analyzing time-series changes.
1. rate()
Shows per-second average rate of increase.
rate(http_requests_total[5m])
2. increase()
Displays the total increase over a time range.
increase(http_requests_total[1h])
3. irate()
Instantaneous rate — good for real-time dashboards.
irate(node_cpu_seconds_total[1m])
Use rate() for smoother graphs and irate() for more responsive charts.
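The difference between the three functions is easier to see on raw samples. The following Python sketch is a simplification, not Prometheus's actual implementation: it handles counter resets but skips the extrapolation to the window boundaries that the real rate() and increase() perform. The sample data is made up.

```python
def increase(samples):
    """Total counter increase across (timestamp, value) samples, handling resets."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the whole new value counts.
        total += (cur - prev) if cur >= prev else cur
    return total


def rate(samples):
    """Average per-second increase between the first and last sample."""
    span = samples[-1][0] - samples[0][0]
    return increase(samples) / span


def irate(samples):
    """Per-second increase computed from the last two samples only."""
    return rate(samples[-2:])


# Hypothetical counter scraped every 15 seconds; it resets between t=30 and t=45.
window = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 10.0), (60, 70.0)]
print(increase(window), rate(window), irate(window))
```

Because irate() looks at just two samples, a single burst moves it sharply, while rate() averages the burst over the whole window.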
Aggregation Operators in PromQL
Aggregations summarize data across multiple dimensions.
| Operator | Description |
|---|---|
| sum() | Sum of all values |
| avg() | Average of all values |
| max() | Maximum value |
| min() | Minimum value |
| count() | Count of series |
| stddev() | Standard deviation |
| topk() | Top K series by value |
Examples:
sum(rate(http_requests_total[5m])) by (method)
avg(rate(node_cpu_seconds_total[5m])) by (mode)
topk(5, rate(http_requests_total[5m]))
Aggregation is vital for service-level metrics like average latency or total traffic.
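As a rough mental model, sum(...) by (label) groups series by the listed labels and adds up each group, discarding every other label. The Python sketch below imitates that on a hypothetical instant vector; the values and label names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical instant vector: (labels, value) pairs, e.g. the output of rate(...).
samples = [
    ({"method": "GET", "instance": "a"}, 12.0),
    ({"method": "GET", "instance": "b"}, 8.0),
    ({"method": "POST", "instance": "a"}, 3.0),
]


def sum_by(vector, by):
    """Group series by the labels in `by` and sum each group's values."""
    groups = defaultdict(float)
    for labels, value in vector:
        key = tuple((name, labels.get(name, "")) for name in by)
        groups[key] += value
    return dict(groups)


def topk(k, vector):
    """Return the k series with the largest values, keeping their labels."""
    return sorted(vector, key=lambda s: s[1], reverse=True)[:k]


print(sum_by(samples, ["method"]))
print(topk(1, samples))
```

Note that topk() keeps the original labels of the series it selects, whereas sum by (method) leaves only the method label.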
Mathematical Operations in Prometheus Queries
PromQL allows math operations between metrics and constants.
Examples:
node_memory_MemFree_bytes / node_memory_MemTotal_bytes * 100
rate(http_requests_total[5m]) * 60
You can even combine metrics:
(rate(http_requests_errors_total[1m]) / rate(http_requests_total[1m])) * 100
This calculates the error percentage: the rate of failed requests divided by the rate of all requests.
Comparison and Logical Operators
Use these operators to compare or combine metrics.
| Operator | Meaning |
|---|---|
| == | Equal to |
| != | Not equal to |
| > | Greater than |
| < | Less than |
| and, or, unless | Logical (set) operations |
Example:
up == 0
Lists all targets that are down.
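1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9
Shows instances with CPU usage above 90%. node_cpu_seconds_total is a counter of CPU seconds, so it has to pass through rate() before comparing it to a threshold makes sense.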
Rate vs. Increase vs. Irate
Let’s summarize these commonly confused PromQL functions:
| Function | Purpose | Use Case |
|---|---|---|
| rate() | Per-second average over range | Smooth trends |
| increase() | Total increase over range | Totals per window, e.g. requests in the last hour |
| irate() | Instantaneous rate | Live dashboards |
All three are meant for counters and need at least two samples inside the range, so choose a range that spans several scrape intervals.
Working with Histogram Metrics
Histogram metrics measure distribution, such as request durations.
Example:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
This returns the 95th percentile latency for HTTP requests.
Key Parts of the Query:
- le = less than or equal to the bucket boundary.
- sum by (le) aggregates all samples by bucket.
Histograms are crucial for performance and SLA/SLO analysis.
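histogram_quantile() estimates a percentile by locating the bucket that contains the target rank and interpolating linearly inside it. The Python sketch below shows that idea under simplifying assumptions: buckets are cumulative, sorted by le, end with +Inf, include at least one finite bucket, and 0 < q <= 1. The bucket counts are invented, and edge cases handled by the real function are left out.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (le, cumulative_count) pairs sorted by le."""
    total = buckets[-1][1]  # the +Inf bucket holds the total observation count
    if total == 0:
        return math.nan
    rank = q * total
    prev_le, prev_count = 0.0, 0.0  # assume the lowest bucket starts at 0
    for le, count in buckets:
        if count >= rank:
            if math.isinf(le):
                # Rank falls in the +Inf bucket: report the highest finite boundary.
                return buckets[-2][0]
            # Linear interpolation between the bucket's lower and upper bounds.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_le + (le - prev_le) * fraction
        prev_le, prev_count = le, count
    return math.nan


# Hypothetical request-duration buckets, in seconds.
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (math.inf, 100.0)]
print(histogram_quantile(0.95, buckets))
```

Because the result is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies you care about.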
Vector Matching in Prometheus Queries
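When performing operations between two metrics, Prometheus uses vector matching to align series by labels. By default, two series are paired only if every label has the same value on both sides, so series that differ in job never match on their own. Aggregating both sides down to a shared label makes them line up.
Example:
sum by (method) (rate(http_requests_total{job="api"}[5m])) / sum by (method) (rate(http_requests_total{job="frontend"}[5m]))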
Use:
- on() to match specific labels.
- ignoring() to exclude labels from matching.
Example:
rate(requests_total[5m]) / ignoring(instance) rate(errors_total[5m])
Here the instance label is left out of the comparison, so series are paired on their remaining labels instead of failing to match.
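One way to picture one-to-one matching with ignoring(): each series gets a signature made of its labels minus the ignored ones, and only series with equal signatures are paired. The Python sketch below assumes signatures are unique on each side and uses hypothetical error and request series. It is an illustration of the idea, not the engine's real code.

```python
def signature(labels, ignoring):
    """Labels used for matching: everything except the ignored label names."""
    return tuple(sorted((k, v) for k, v in labels.items() if k not in ignoring))


def divide(left, right, ignoring=()):
    """Divide left by right for series whose signatures match one-to-one."""
    right_by_sig = {signature(labels, ignoring): value for labels, value in right}
    result = []
    for labels, value in left:
        sig = signature(labels, ignoring)
        if sig in right_by_sig:  # unmatched series are dropped from the result
            result.append((dict(sig), value / right_by_sig[sig]))
    return result


# Hypothetical per-second rates; the error series carry an extra "code" label.
errors = [({"method": "GET", "code": "500"}, 2.0), ({"method": "POST", "code": "500"}, 1.0)]
requests = [({"method": "GET"}, 40.0), ({"method": "POST"}, 10.0)]

print(divide(errors, requests))                      # no labels ignored
print(divide(errors, requests, ignoring=("code",)))  # like: errors / ignoring(code) requests
```

Without ignoring, the extra code label keeps every pair from matching and the result is empty.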
Prometheus Query Cheat Sheet for Node Exporter Metrics
Here are the most used PromQL commands for host-level metrics:
| Metric | Description | Example |
|---|---|---|
| CPU Usage | CPU time used per mode | rate(node_cpu_seconds_total[5m]) |
| Memory Usage | Free vs total memory | 1 - (node_memory_MemFree_bytes / node_memory_MemTotal_bytes) |
| Disk Usage | Used space percentage | (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes * 100 |
| Load Average | System load | node_load1 |
| Network In/Out | Bytes transferred | rate(node_network_receive_bytes_total[5m]) |
These form the backbone of system observability in Prometheus.
PromQL Cheat Sheet for Kubernetes Metrics
Prometheus integrates deeply with Kubernetes, allowing cluster-level insights.
| Use Case | Query Example |
|---|---|
| Pod CPU Usage | sum(rate(container_cpu_usage_seconds_total[5m])) by (pod) |
| Pod Memory Usage | sum(container_memory_usage_bytes) by (pod) |
| Node Disk Usage | (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 |
| Pod Restart Count | sum(increase(kube_pod_container_status_restarts_total[1h])) by (pod) |
| Running Pods | count(kube_pod_info) |
These queries are essential for monitoring Kubernetes cluster health.
Alerting Rules with Prometheus Queries
Prometheus evaluates alerting rules written in PromQL and sends firing alerts to Alertmanager, which handles routing and notifications.
Example Rule:
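- alert: HighCPULoad
  # Busy fraction = 1 - idle fraction. Summing the rate of every mode would
  # only return the number of cores, which is almost always above 0.9.
  expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) > 0.9
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "High CPU usage detected on {{ $labels.instance }}"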
By mastering PromQL, you can design intelligent alerts to catch anomalies early.
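The for clause means the expression must stay true across consecutive rule evaluations for the given duration before the alert fires; until then the alert is pending. The Python sketch below models that state machine in a simplified way. The function name, the 30-second evaluation interval, and the input sequence are assumptions for illustration only.

```python
def alert_states(condition_by_eval, eval_interval, for_duration):
    """Return the alert state after each rule evaluation (times in seconds)."""
    states = []
    active_for = None  # seconds since the condition first became true
    for is_true in condition_by_eval:
        if not is_true:
            active_for = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        active_for = 0 if active_for is None else active_for + eval_interval
        states.append("firing" if active_for >= for_duration else "pending")
    return states


# Hypothetical results of evaluating the rule every 30s with `for: 2m`.
results = [True, True, False, True, True, True, True, True]
print(alert_states(results, eval_interval=30, for_duration=120))
```

A single false evaluation resets the timer, which is why short dips below the threshold can delay or suppress an alert.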
Advanced PromQL Functions
| Function | Purpose |
|---|---|
| avg_over_time() | Average over a time range |
| sum_over_time() | Sum over time |
| max_over_time() | Maximum in range |
| quantile_over_time() | Percentile values |
| predict_linear() | Forecast trends |
| deriv() | Derivative of time series |
Example:
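predict_linear(node_filesystem_avail_bytes[1h], 3600)
Predicts the available disk space one hour from now by fitting a line to the last hour of samples. predict_linear() and deriv() are meant for gauges; for a counter such as http_requests_total, apply rate() instead.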
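predict_linear() works by fitting a straight line to the samples in the range with simple linear regression and extending that line into the future. The Python sketch below shows the calculation under assumptions: the sample values are invented gauge readings, and eval_time stands for the query evaluation timestamp, an explicit parameter here that PromQL supplies implicitly.

```python
def predict_linear(samples, seconds_ahead, eval_time):
    """Least-squares line through (timestamp, value) samples, extended forward."""
    n = len(samples)
    # Shift timestamps so that x = 0 is the evaluation time.
    xs = [t - eval_time for t, _ in samples]
    ys = [v for _, v in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance          # change in value per second
    intercept = mean_y - slope * mean_x    # fitted value at the evaluation time
    return intercept + slope * seconds_ahead


# Hypothetical gauge (e.g. free space in MB) shrinking by 6 per minute.
samples = [(0, 1000.0), (60, 994.0), (120, 988.0)]
print(predict_linear(samples, seconds_ahead=3600, eval_time=120))
```

A straight-line fit is only as good as the trend is linear, so short or noisy ranges can give misleading forecasts.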
PromQL Query Optimization Tips
- Use shorter time ranges for faster queries.
- Filter labels precisely to reduce data scans.
- Use recording rules for repetitive queries.
- Avoid using heavy joins in dashboards.
- Test queries incrementally.
These optimizations make dashboards load faster and alerts more responsive.
Best Practices for Prometheus Querying
- Always use rate() for counters (e.g., request counts).
- Use gauge metrics for instantaneous values (e.g., temperature, memory).
- Normalize units (for example, convert to percentages or per-second rates) for consistent visualization.
- Use recording rules to precompute metrics for dashboards.
- Regularly test queries in Prometheus UI or Grafana Explore.
A disciplined PromQL strategy improves observability and scalability.
Conclusion
This Prometheus Query Cheat Sheet gives you everything you need to query, aggregate, and analyze metrics efficiently. From system monitoring and Kubernetes metrics to alerting and forecasting, PromQL’s flexibility makes it a must-have skill for any DevOps engineer.
Mastering these commands and best practices will help you troubleshoot faster, optimize performance, and create meaningful dashboards — all while unlocking the full power of Prometheus.
FAQs About Prometheus Query Cheat Sheet
1. What is PromQL in Prometheus?
PromQL (Prometheus Query Language) is used to query and analyze time-series data collected by Prometheus. It supports filtering, aggregation, and mathematical operations.
2. How do I query data in Prometheus?
You can query data using the Prometheus UI or API. For example: rate(http_requests_total[5m])
3. What is the difference between rate() and increase()?
rate() shows the per-second rate of increase, while increase() shows the total increase over the selected range.
4. Can Prometheus queries be used in Grafana?
Yes, Grafana supports Prometheus as a data source. You can use PromQL queries directly in Grafana dashboards.
5. How do I create custom alerts in Prometheus?
Use alerting rules in YAML format with PromQL expressions. Prometheus evaluates these rules and forwards firing alerts to Alertmanager, which sends the notifications.