Prometheus Metric Types Reference

Counter, Gauge, Histogram, Summary types with exposition format and PromQL notes.

Reference for Prometheus metric types, covering the data model, naming conventions, and which PromQL functions apply to each. Explains counters, gauges, histograms, and summaries, and when to use each.

What is the difference between a counter and a gauge?

A counter only ever increases (or resets to zero on restart) and is used for cumulative totals like requests served. A gauge can go up or down and represents an instantaneous value like memory in use or queue depth.


Prometheus has four core metric types: Counter, Gauge, Histogram, and Summary. Each has its own data model, its own exposition output, and its own set of PromQL functions that make sense for it. Choosing the right type is one of the most important modeling decisions when instrumenting a service. This reference summarizes each type, what it exposes, and how to query it.


How it works

A Counter is a monotonically increasing value; it resets to zero only when the process restarts. The raw value is rarely useful on its own, so you apply rate() or increase() to get throughput; both detect resets and correct for them. A Gauge is a snapshot that can rise and fall, so you can read it directly or apply functions such as avg_over_time(), delta(), and deriv().
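The reset handling described above can be illustrated with a minimal Python sketch. It is not Prometheus's implementation: the function names simple_increase and simple_rate are hypothetical, the samples are a plain list of raw counter readings, and the sketch leaves out the extrapolation to the window boundaries that the real rate() and increase() perform.

```python
def simple_increase(samples):
    """Total increase across raw counter samples, treating any drop as a reset.

    Simplified sketch: real PromQL increase() also extrapolates to the
    edges of the range window, which is omitted here.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            # Normal case: the counter moved forward.
            total += curr - prev
        else:
            # The value dropped, so assume the process restarted from zero;
            # everything counted since the restart is new.
            total += curr
    return total


def simple_rate(samples, window_seconds):
    """Average per-second increase over the window (hypothetical helper)."""
    return simple_increase(samples) / window_seconds


# A counter that restarts between the second and third scrape.
readings = [100, 120, 5, 30]
print(simple_increase(readings))   # 20 before the reset, 5 + 25 after it
print(simple_rate(readings, 300))
```

A plain difference between the last and first reading would give a negative number here, which is why counters are queried through reset-aware functions rather than by subtracting raw values.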

A Histogram counts observations into configurable buckets and exposes three kinds of series: _bucket{le=...} (cumulative counts per upper bound, ending with le="+Inf"), _sum, and _count. histogram_quantile() estimates quantiles by interpolating linearly within the buckets, so accuracy depends on the bucket boundaries. Because bucket counts are additive, you can aggregate across instances before computing the quantile. A Summary computes quantiles on the client and exposes them with a quantile label, plus _sum and _count; its quantiles cannot be meaningfully aggregated across instances.
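To make the three histogram series concrete, here is a small Python sketch that turns a list of observations into exposition-style lines. The helper name histogram_series, the bucket bounds, and the sample values are assumptions chosen for illustration; a real client library also emits HELP and TYPE lines and formats numbers by its own rules.

```python
import math


def histogram_series(name, bounds, observations):
    """Return exposition-style lines for a histogram (illustrative only)."""
    # Every histogram ends with an implicit +Inf bucket that counts everything.
    upper_bounds = sorted(bounds) + [math.inf]
    lines = []
    for le in upper_bounds:
        # Buckets are cumulative: each one counts all observations <= le.
        count = sum(1 for value in observations if value <= le)
        label = "+Inf" if le == math.inf else repr(le)
        lines.append(f'{name}_bucket{{le="{label}"}} {count}')
    lines.append(f"{name}_sum {sum(observations)}")
    lines.append(f"{name}_count {len(observations)}")
    return lines


for line in histogram_series(
    "http_request_duration_seconds", [0.5, 1.0], [0.25, 0.5, 0.75, 2.0]
):
    print(line)
```

Note that the observation of exactly 0.5 lands in the le="0.5" bucket, because the le label is an inclusive upper bound, and that the +Inf bucket always equals _count.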

Tips and examples

Compute request throughput and error ratio from counters:

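# Per-second request rate over the last 5 minutes, per series
rate(http_requests_total[5m])

# Fraction of requests that returned a 5xx status code
sum(rate(http_requests_total{code=~"5.."}[5m]))
  / sum(rate(http_requests_total[5m]))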

Estimate the 95th percentile latency from a histogram:

histogram_quantile(0.95,
  sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
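What the query above does can be sketched in Python: add the cumulative buckets from each instance, then interpolate linearly inside the bucket that contains the target rank. This is a simplified model rather than Prometheus's code. The function names, the dictionary representation (upper bound mapped to cumulative count), and the sample numbers are assumptions, and raw counts stand in for the per-second rates that the PromQL query would feed in.

```python
import math


def sum_buckets(*instances):
    """Add cumulative bucket counts from several instances, keyed by upper bound."""
    total = {}
    for buckets in instances:
        for le, count in buckets.items():
            total[le] = total.get(le, 0.0) + count
    return total


def estimate_quantile(q, buckets):
    """Estimate quantile q (0 < q <= 1) from cumulative buckets by linear interpolation."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    bounds = sorted(buckets)            # math.inf sorts last
    total = buckets[bounds[-1]]         # the +Inf bucket holds the total count
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0   # lowest bucket is assumed to start at 0
    for le in bounds:
        count = buckets[le]
        if count >= rank:
            if le == math.inf:
                # No upper edge to interpolate toward; fall back to the
                # highest finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (le - prev_bound) * fraction
        prev_bound, prev_count = le, count
    return prev_bound


# Two hypothetical replicas with the same bucket layout.
replica_a = {0.1: 50, 0.5: 90, 1.0: 100, math.inf: 100}
replica_b = {0.1: 10, 0.5: 60, 1.0: 95, math.inf: 100}
print(estimate_quantile(0.95, sum_buckets(replica_a, replica_b)))
```

Summing works only because both replicas use identical bucket boundaries. The result is an estimate whose error is bounded by the width of the bucket it falls in, which is one reason bucket layout matters.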

Prefer histograms over summaries when you operate more than one replica, because you can aggregate the buckets first and then take the quantile. Place bucket boundaries at your SLO thresholds so that the fraction of requests meeting an SLO can be read from a single bucket. Name metrics with base units (_seconds, _bytes) and put a _total suffix on counters.