Metrics

As part of normal operation, CockroachDB continuously records metrics that track performance, latency, usage, and many other runtime indicators. These metrics are often useful in diagnosing problems, troubleshooting performance, or planning cluster infrastructure modifications. This page documents locations where metrics are exposed for analysis.
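
The metric names listed on this page use dots and hyphens, which Prometheus-style text output does not allow in series names. The sketch below shows one way to look up a metric from this page in such output. It assumes that every character outside `[A-Za-z0-9_]` is exported as an underscore; the helper names `to_prometheus_name` and `find_samples` and the `SAMPLE` text are hypothetical and exist only for illustration. Verify the mapping against your own cluster's output before relying on it.

```python
import re


def to_prometheus_name(metric_name: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore.

    Assumption: the exporter applies this substitution to names such as
    `clock-offset.meannanos`.
    """
    return re.sub(r"[^A-Za-z0-9_]", "_", metric_name)


def find_samples(exposition_text: str, metric_name: str) -> list[tuple[str, float]]:
    """Return (series, value) pairs whose series name matches metric_name.

    Handles only the simple `name{labels} value` line shape; lines with a
    trailing timestamp are not supported by this sketch.
    """
    wanted = to_prometheus_name(metric_name)
    samples = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        base = series.split("{", 1)[0]  # drop the label set, if any
        if base == wanted:
            samples.append((series, float(value)))
    return samples


# Hypothetical scrape output, written by hand for this example.
SAMPLE = """\
# HELP capacity_available Available storage capacity
# TYPE capacity_available gauge
capacity_available{store="1"} 1024
clock_offset_meannanos 250
"""

if __name__ == "__main__":
    print(to_prometheus_name("clock-offset.meannanos"))
    print(find_samples(SAMPLE, "capacity.available"))
```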

Available metrics

CockroachDB Metric Name Description Type Unit Supported Deployments
addsstable.applications
Number of SSTable ingestions applied (i.e. applied by Replicas) COUNTER COUNTAdvanced/self-hosted
addsstable.copies
Number of SSTable ingestions that required copying files during application COUNTER COUNT Advanced/self-hosted
addsstable.proposals
Number of SSTable ingestions proposed (i.e. sent to Raft by lease holders) COUNTER COUNTAdvanced/self-hosted
admission.io.overload
1-normalized float indicating whether IO admission control considers the store as overloaded with respect to compaction out of L0 (considers sub-level and file counts). GAUGE PERCENTself-hosted
auth.cert.conn.latency
Latency to establish and authenticate a SQL connection using certificate HISTOGRAM NANOSECONDSself-hosted
auth.gss.conn.latency
Latency to establish and authenticate a SQL connection using GSS HISTOGRAM NANOSECONDSself-hosted
auth.jwt.conn.latency
Latency to establish and authenticate a SQL connection using JWT Token HISTOGRAM NANOSECONDSself-hosted
auth.ldap.conn.latency
Latency to establish and authenticate a SQL connection using LDAP HISTOGRAM NANOSECONDSself-hosted
auth.password.conn.latency
Latency to establish and authenticate a SQL connection using password HISTOGRAM NANOSECONDSself-hosted
auth.scram.conn.latency
Latency to establish and authenticate a SQL connection using SCRAM HISTOGRAM NANOSECONDSself-hosted
build.timestamp
Build information GAUGE TIMESTAMP_SECself-hosted
capacity
Total storage capacity GAUGE BYTESAdvanced/self-hosted
capacity.available
Available storage capacity GAUGE BYTESAdvanced/self-hosted
capacity.reserved
Capacity reserved for snapshots GAUGE BYTESAdvanced/self-hosted
capacity.used
Used storage capacity GAUGE BYTESAdvanced/self-hosted
changefeed.aggregator_progress
The earliest timestamp up to which any aggregator is guaranteed to have emitted all values for GAUGE TIMESTAMP_NSself-hosted
changefeed.backfill_count
Number of changefeeds currently executing backfill GAUGE COUNTStandard/Advanced/self-hosted
changefeed.backfill_pending_ranges
Number of ranges in an ongoing backfill that are yet to be fully emitted GAUGE COUNTStandard/Advanced/self-hosted
changefeed.checkpoint_progress
The earliest timestamp of any changefeed's persisted checkpoint (values prior to this timestamp will never need to be re-emitted) GAUGE TIMESTAMP_NSself-hosted
changefeed.commit_latency
Event commit latency: the difference between an event's MVCC timestamp and the time it was acknowledged by the downstream sink. If the sink batches events, the difference between the oldest event in the batch and the acknowledgement is recorded. Excludes latency during backfill. HISTOGRAM NANOSECONDS Standard/Advanced/self-hosted
changefeed.emitted_bytes
Bytes emitted by all feeds COUNTER BYTESself-hosted
changefeed.emitted_messages
Messages emitted by all feeds COUNTER COUNTStandard/Advanced/self-hosted
changefeed.error_retries
Total retryable errors encountered by all changefeeds COUNTER COUNTStandard/Advanced/self-hosted
changefeed.failures
Total number of changefeed jobs which have failed COUNTER COUNTStandard/Advanced/self-hosted
changefeed.lagging_ranges
The number of ranges considered to be lagging behind GAUGE COUNTself-hosted
changefeed.max_behind_nanos
The most any changefeed's persisted checkpoint is behind the present GAUGE NANOSECONDSStandard/Advanced/self-hosted
changefeed.message_size_hist
Message size histogram HISTOGRAM BYTESStandard/Advanced/self-hosted
changefeed.running
Number of currently running changefeeds, including sinkless GAUGE COUNTStandard/Advanced/self-hosted
clock-offset.meannanos
Mean clock offset with other nodes GAUGE NANOSECONDSStandard/Advanced/self-hosted
clock-offset.stddevnanos
Stddev clock offset with other nodes GAUGE NANOSECONDSStandard/Advanced/self-hosted
cluster.preserve-downgrade-option.last-updated
Unix timestamp of last updated time for cluster.preserve_downgrade_option GAUGE TIMESTAMP_SECself-hosted
distsender.batches
Number of batches processed COUNTER COUNTStandard/Advanced/self-hosted
distsender.batches.partial
Number of partial batches processed after being divided on range boundaries COUNTER COUNTStandard/Advanced/self-hosted
distsender.errors.notleaseholder
Number of NotLeaseHolderErrors encountered from replica-addressed RPCs COUNTER COUNTStandard/Advanced/self-hosted
distsender.rpc.sent
Number of replica-addressed RPCs sent COUNTER COUNTStandard/Advanced/self-hosted
distsender.rpc.sent.local
Number of replica-addressed RPCs sent through the local-server optimization COUNTER COUNTStandard/Advanced/self-hosted
distsender.rpc.sent.nextreplicaerror
Number of replica-addressed RPCs sent due to per-replica errors COUNTER COUNTStandard/Advanced/self-hosted
exec.error
Number of batch KV requests that failed to execute on this node.

This count excludes transaction restart/abort errors. However, it will include other errors expected during normal operation, such as ConditionFailedError. This metric is thus not an indicator of KV health.

COUNTER COUNT Advanced/self-hosted
exec.latency
Latency of batch KV requests (including errors) executed on this node.

This measures requests already addressed to a single replica, from the moment at which they arrive at the internal gRPC endpoint to the moment at which the response (or an error) is returned.

In particular, this latency includes commit waits, conflict resolution, and replication. End users can easily produce high measurements via long-running transactions that conflict with foreground traffic, so this metric does not provide a good signal for understanding the health of the KV layer.

HISTOGRAM NANOSECONDS Advanced/self-hosted
exec.success
Number of batch KV requests executed successfully on this node.

A request is considered to have executed 'successfully' if it either returns a result or a transaction restart/abort error.

COUNTER COUNT Advanced/self-hosted
gcbytesage
Cumulative age of non-live data GAUGE SECONDSAdvanced/self-hosted
gossip.bytes.received
Number of received gossip bytes COUNTER BYTESAdvanced/self-hosted
gossip.bytes.sent
Number of sent gossip bytes COUNTER BYTESAdvanced/self-hosted
gossip.connections.incoming
Number of active incoming gossip connections GAUGE COUNTAdvanced/self-hosted
gossip.connections.outgoing
Number of active outgoing gossip connections GAUGE COUNTAdvanced/self-hosted
gossip.connections.refused
Number of refused incoming gossip connections COUNTER COUNTAdvanced/self-hosted
gossip.infos.received
Number of received gossip Info objects COUNTER COUNTAdvanced/self-hosted
gossip.infos.sent
Number of sent gossip Info objects COUNTER COUNTAdvanced/self-hosted
intentage
Cumulative age of locks GAUGE SECONDSAdvanced/self-hosted
intentbytes
Number of bytes in intent KV pairs GAUGE BYTESAdvanced/self-hosted
intentcount
Count of intent keys GAUGE COUNTAdvanced/self-hosted
jobs.auto_config_env_runner.currently_paused
Number of auto_config_env_runner jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_config_env_runner.protected_age_sec
The age of the oldest PTS record protected by auto_config_env_runner jobs GAUGE SECONDSself-hosted
jobs.auto_config_env_runner.protected_record_count
Number of protected timestamp records held by auto_config_env_runner jobs GAUGE COUNTself-hosted
jobs.auto_config_runner.currently_paused
Number of auto_config_runner jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_config_runner.protected_age_sec
The age of the oldest PTS record protected by auto_config_runner jobs GAUGE SECONDSself-hosted
jobs.auto_config_runner.protected_record_count
Number of protected timestamp records held by auto_config_runner jobs GAUGE COUNTself-hosted
jobs.auto_config_task.currently_paused
Number of auto_config_task jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_config_task.protected_age_sec
The age of the oldest PTS record protected by auto_config_task jobs GAUGE SECONDSself-hosted
jobs.auto_config_task.protected_record_count
Number of protected timestamp records held by auto_config_task jobs GAUGE COUNTself-hosted
jobs.auto_create_partial_stats.currently_paused
Number of auto_create_partial_stats jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_create_partial_stats.protected_age_sec
The age of the oldest PTS record protected by auto_create_partial_stats jobs GAUGE SECONDSself-hosted
jobs.auto_create_partial_stats.protected_record_count
Number of protected timestamp records held by auto_create_partial_stats jobs GAUGE COUNTself-hosted
jobs.auto_create_stats.currently_paused
Number of auto_create_stats jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_create_stats.currently_running
Number of auto_create_stats jobs currently running in Resume or OnFailOrCancel state GAUGE COUNTself-hosted
jobs.auto_create_stats.protected_age_sec
The age of the oldest PTS record protected by auto_create_stats jobs GAUGE SECONDSself-hosted
jobs.auto_create_stats.protected_record_count
Number of protected timestamp records held by auto_create_stats jobs GAUGE COUNTself-hosted
jobs.auto_create_stats.resume_failed
Number of auto_create_stats jobs which failed with a non-retriable error COUNTER COUNTself-hosted
jobs.auto_schema_telemetry.currently_paused
Number of auto_schema_telemetry jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_schema_telemetry.protected_age_sec
The age of the oldest PTS record protected by auto_schema_telemetry jobs GAUGE SECONDSself-hosted
jobs.auto_schema_telemetry.protected_record_count
Number of protected timestamp records held by auto_schema_telemetry jobs GAUGE COUNTself-hosted
jobs.auto_span_config_reconciliation.currently_paused
Number of auto_span_config_reconciliation jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_span_config_reconciliation.protected_age_sec
The age of the oldest PTS record protected by auto_span_config_reconciliation jobs GAUGE SECONDSself-hosted
jobs.auto_span_config_reconciliation.protected_record_count
Number of protected timestamp records held by auto_span_config_reconciliation jobs GAUGE COUNTself-hosted
jobs.auto_sql_stats_compaction.currently_paused
Number of auto_sql_stats_compaction jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_sql_stats_compaction.protected_age_sec
The age of the oldest PTS record protected by auto_sql_stats_compaction jobs GAUGE SECONDSself-hosted
jobs.auto_sql_stats_compaction.protected_record_count
Number of protected timestamp records held by auto_sql_stats_compaction jobs GAUGE COUNTself-hosted
jobs.auto_update_sql_activity.currently_paused
Number of auto_update_sql_activity jobs currently considered Paused GAUGE COUNTself-hosted
jobs.auto_update_sql_activity.protected_age_sec
The age of the oldest PTS record protected by auto_update_sql_activity jobs GAUGE SECONDSself-hosted
jobs.auto_update_sql_activity.protected_record_count
Number of protected timestamp records held by auto_update_sql_activity jobs GAUGE COUNTself-hosted
jobs.backup.currently_paused
Number of backup jobs currently considered Paused GAUGE COUNTself-hosted
jobs.backup.currently_running
Number of backup jobs currently running in Resume or OnFailOrCancel state GAUGE COUNTself-hosted
jobs.backup.protected_age_sec
The age of the oldest PTS record protected by backup jobs GAUGE SECONDSself-hosted
jobs.backup.protected_record_count
Number of protected timestamp records held by backup jobs GAUGE COUNTself-hosted
jobs.changefeed.currently_paused
Number of changefeed jobs currently considered Paused GAUGE COUNTself-hosted
jobs.changefeed.expired_pts_records
Number of expired protected timestamp records owned by changefeed jobs COUNTER COUNTself-hosted
jobs.changefeed.protected_age_sec
The age of the oldest PTS record protected by changefeed jobs GAUGE SECONDSself-hosted
jobs.changefeed.protected_record_count
Number of protected timestamp records held by changefeed jobs GAUGE COUNTself-hosted
jobs.changefeed.resume_retry_error
Number of changefeed jobs which failed with a retriable error COUNTER COUNTStandard/Advanced/self-hosted
jobs.create_stats.currently_paused
Number of create_stats jobs currently considered Paused GAUGE COUNTself-hosted
jobs.create_stats.currently_running
Number of create_stats jobs currently running in Resume or OnFailOrCancel state GAUGE COUNTself-hosted
jobs.create_stats.protected_age_sec
The age of the oldest PTS record protected by create_stats jobs GAUGE SECONDSself-hosted
jobs.create_stats.protected_record_count
Number of protected timestamp records held by create_stats jobs GAUGE COUNTself-hosted
jobs.history_retention.currently_paused
Number of history_retention jobs currently considered Paused GAUGE COUNTself-hosted
jobs.history_retention.protected_age_sec
The age of the oldest PTS record protected by history_retention jobs GAUGE SECONDSself-hosted
jobs.history_retention.protected_record_count
Number of protected timestamp records held by history_retention jobs GAUGE COUNTself-hosted
jobs.import.currently_paused
Number of import jobs currently considered Paused GAUGE COUNTself-hosted
jobs.import.protected_age_sec
The age of the oldest PTS record protected by import jobs GAUGE SECONDSself-hosted
jobs.import.protected_record_count
Number of protected timestamp records held by import jobs GAUGE COUNTself-hosted
jobs.import_rollback.currently_paused
Number of import_rollback jobs currently considered Paused GAUGE COUNTself-hosted
jobs.import_rollback.protected_age_sec
The age of the oldest PTS record protected by import_rollback jobs GAUGE SECONDSself-hosted
jobs.import_rollback.protected_record_count
Number of protected timestamp records held by import_rollback jobs GAUGE COUNTself-hosted
jobs.key_visualizer.currently_paused
Number of key_visualizer jobs currently considered Paused GAUGE COUNTself-hosted
jobs.key_visualizer.protected_age_sec
The age of the oldest PTS record protected by key_visualizer jobs GAUGE SECONDSself-hosted
jobs.key_visualizer.protected_record_count
Number of protected timestamp records held by key_visualizer jobs GAUGE COUNTself-hosted
jobs.logical_replication.currently_paused
Number of logical_replication jobs currently considered Paused GAUGE COUNTself-hosted
jobs.logical_replication.protected_age_sec
The age of the oldest PTS record protected by logical_replication jobs GAUGE SECONDSself-hosted
jobs.logical_replication.protected_record_count
Number of protected timestamp records held by logical_replication jobs GAUGE COUNTself-hosted
jobs.migration.currently_paused
Number of migration jobs currently considered Paused GAUGE COUNTself-hosted
jobs.migration.protected_age_sec
The age of the oldest PTS record protected by migration jobs GAUGE SECONDSself-hosted
jobs.migration.protected_record_count
Number of protected timestamp records held by migration jobs GAUGE COUNTself-hosted
jobs.mvcc_statistics_update.currently_paused
Number of mvcc_statistics_update jobs currently considered Paused GAUGE COUNTself-hosted
jobs.mvcc_statistics_update.protected_age_sec
The age of the oldest PTS record protected by mvcc_statistics_update jobs GAUGE SECONDSself-hosted
jobs.mvcc_statistics_update.protected_record_count
Number of protected timestamp records held by mvcc_statistics_update jobs GAUGE COUNTself-hosted
jobs.new_schema_change.currently_paused
Number of new_schema_change jobs currently considered Paused GAUGE COUNTself-hosted
jobs.new_schema_change.protected_age_sec
The age of the oldest PTS record protected by new_schema_change jobs GAUGE SECONDSself-hosted
jobs.new_schema_change.protected_record_count
Number of protected timestamp records held by new_schema_change jobs GAUGE COUNTself-hosted
jobs.poll_jobs_stats.currently_paused
Number of poll_jobs_stats jobs currently considered Paused GAUGE COUNTself-hosted
jobs.poll_jobs_stats.protected_age_sec
The age of the oldest PTS record protected by poll_jobs_stats jobs GAUGE SECONDSself-hosted
jobs.poll_jobs_stats.protected_record_count
Number of protected timestamp records held by poll_jobs_stats jobs GAUGE COUNTself-hosted
jobs.replication_stream_ingestion.currently_paused
Number of replication_stream_ingestion jobs currently considered Paused GAUGE COUNTself-hosted
jobs.replication_stream_ingestion.protected_age_sec
The age of the oldest PTS record protected by replication_stream_ingestion jobs GAUGE SECONDSself-hosted
jobs.replication_stream_ingestion.protected_record_count
Number of protected timestamp records held by replication_stream_ingestion jobs GAUGE COUNTself-hosted
jobs.replication_stream_producer.currently_paused
Number of replication_stream_producer jobs currently considered Paused GAUGE COUNTself-hosted
jobs.replication_stream_producer.protected_age_sec
The age of the oldest PTS record protected by replication_stream_producer jobs GAUGE SECONDSself-hosted
jobs.replication_stream_producer.protected_record_count
Number of protected timestamp records held by replication_stream_producer jobs GAUGE COUNTself-hosted
jobs.restore.currently_paused
Number of restore jobs currently considered Paused GAUGE COUNTself-hosted
jobs.restore.protected_age_sec
The age of the oldest PTS record protected by restore jobs GAUGE SECONDSself-hosted
jobs.restore.protected_record_count
Number of protected timestamp records held by restore jobs GAUGE COUNTself-hosted
jobs.row_level_ttl.currently_paused
Number of row_level_ttl jobs currently considered Paused GAUGE COUNTAdvanced/self-hosted
jobs.row_level_ttl.currently_running
Number of row_level_ttl jobs currently running in Resume or OnFailOrCancel state GAUGE COUNTAdvanced/self-hosted
jobs.row_level_ttl.delete_duration
Duration for delete requests during row level TTL. HISTOGRAM NANOSECONDSAdvanced/self-hosted
jobs.row_level_ttl.num_active_spans
Number of active spans the TTL job is deleting from. GAUGE COUNTAdvanced/self-hosted
jobs.row_level_ttl.protected_age_sec
The age of the oldest PTS record protected by row_level_ttl jobs GAUGE SECONDSself-hosted
jobs.row_level_ttl.protected_record_count
Number of protected timestamp records held by row_level_ttl jobs GAUGE COUNTself-hosted
jobs.row_level_ttl.resume_completed
Number of row_level_ttl jobs which successfully resumed to completion COUNTER COUNTAdvanced/self-hosted
jobs.row_level_ttl.resume_failed
Number of row_level_ttl jobs which failed with a non-retriable error COUNTER COUNTAdvanced/self-hosted
jobs.row_level_ttl.rows_deleted
Number of rows deleted by the row level TTL job. COUNTER COUNTAdvanced/self-hosted
jobs.row_level_ttl.rows_selected
Number of rows selected for deletion by the row level TTL job. COUNTER COUNTAdvanced/self-hosted
jobs.row_level_ttl.select_duration
Duration for select requests during row level TTL. HISTOGRAM NANOSECONDSAdvanced/self-hosted
jobs.row_level_ttl.span_total_duration
Duration for processing a span during row level TTL. HISTOGRAM NANOSECONDSAdvanced/self-hosted
jobs.row_level_ttl.total_expired_rows
Approximate number of rows that have expired the TTL on the TTL table. GAUGE COUNTAdvanced/self-hosted
jobs.row_level_ttl.total_rows
Approximate number of rows on the TTL table. GAUGE COUNTAdvanced/self-hosted
jobs.schema_change.currently_paused
Number of schema_change jobs currently considered Paused GAUGE COUNTself-hosted
jobs.schema_change.protected_age_sec
The age of the oldest PTS record protected by schema_change jobs GAUGE SECONDSself-hosted
jobs.schema_change.protected_record_count
Number of protected timestamp records held by schema_change jobs GAUGE COUNTself-hosted
jobs.schema_change_gc.currently_paused
Number of schema_change_gc jobs currently considered Paused GAUGE COUNTself-hosted
jobs.schema_change_gc.protected_age_sec
The age of the oldest PTS record protected by schema_change_gc jobs GAUGE SECONDSself-hosted
jobs.schema_change_gc.protected_record_count
Number of protected timestamp records held by schema_change_gc jobs GAUGE COUNTself-hosted
jobs.standby_read_ts_poller.currently_paused
Number of standby_read_ts_poller jobs currently considered Paused GAUGE COUNTself-hosted
jobs.standby_read_ts_poller.protected_age_sec
The age of the oldest PTS record protected by standby_read_ts_poller jobs GAUGE SECONDSself-hosted
jobs.standby_read_ts_poller.protected_record_count
Number of protected timestamp records held by standby_read_ts_poller jobs GAUGE COUNTself-hosted
jobs.typedesc_schema_change.currently_paused
Number of typedesc_schema_change jobs currently considered Paused GAUGE COUNTself-hosted
jobs.typedesc_schema_change.protected_age_sec
The age of the oldest PTS record protected by typedesc_schema_change jobs GAUGE SECONDSself-hosted
jobs.typedesc_schema_change.protected_record_count
Number of protected timestamp records held by typedesc_schema_change jobs GAUGE COUNTself-hosted
jobs.update_table_metadata_cache.currently_paused
Number of update_table_metadata_cache jobs currently considered Paused GAUGE COUNTself-hosted
jobs.update_table_metadata_cache.protected_age_sec
The age of the oldest PTS record protected by update_table_metadata_cache jobs GAUGE SECONDSself-hosted
jobs.update_table_metadata_cache.protected_record_count
Number of protected timestamp records held by update_table_metadata_cache jobs GAUGE COUNTself-hosted
keybytes
Number of bytes taken up by keys GAUGE BYTESAdvanced/self-hosted
keycount
Count of all keys GAUGE COUNTAdvanced/self-hosted
leases.epoch
Number of replica leaseholders using epoch-based leases GAUGE COUNTAdvanced/self-hosted
leases.error
Number of failed lease requests COUNTER COUNTAdvanced/self-hosted
leases.expiration
Number of replica leaseholders using expiration-based leases GAUGE COUNTAdvanced/self-hosted
leases.success
Number of successful lease requests COUNTER COUNTAdvanced/self-hosted
leases.transfers.error
Number of failed lease transfers COUNTER COUNTAdvanced/self-hosted
leases.transfers.success
Number of successful lease transfers COUNTER COUNTAdvanced/self-hosted
livebytes
Number of bytes of live data (keys plus values) GAUGE BYTESAdvanced/self-hosted
livecount
Count of live keys GAUGE COUNTAdvanced/self-hosted
liveness.epochincrements
Number of times this node has incremented its liveness epoch COUNTER COUNTAdvanced/self-hosted
liveness.heartbeatfailures
Number of failed node liveness heartbeats from this node COUNTER COUNTAdvanced/self-hosted
liveness.heartbeatlatency
Node liveness heartbeat latency HISTOGRAM NANOSECONDSAdvanced/self-hosted
liveness.heartbeatsuccesses
Number of successful node liveness heartbeats from this node COUNTER COUNTAdvanced/self-hosted
liveness.livenodes
Number of live nodes in the cluster (will be 0 if this node is not itself live) GAUGE COUNTAdvanced/self-hosted
node-id
Node ID with labels for advertised RPC and HTTP addresses GAUGE CONST self-hosted
physical_replication.logical_bytes
Logical bytes (sum of keys + values) ingested by all replication jobs COUNTER BYTESAdvanced/self-hosted
physical_replication.replicated_time_seconds
The replicated time of the physical replication stream in seconds since the unix epoch. GAUGE SECONDSAdvanced/self-hosted
queue.consistency.pending
Number of pending replicas in the consistency checker queue GAUGE COUNTAdvanced/self-hosted
queue.consistency.process.failure
Number of replicas which failed processing in the consistency checker queue COUNTER COUNTAdvanced/self-hosted
queue.consistency.process.success
Number of replicas successfully processed by the consistency checker queue COUNTER COUNTAdvanced/self-hosted
queue.consistency.processingnanos
Nanoseconds spent processing replicas in the consistency checker queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.gc.info.abortspanconsidered
Number of AbortSpan entries old enough to be considered for removal COUNTER COUNTAdvanced/self-hosted
queue.gc.info.abortspangcnum
Number of AbortSpan entries fit for removal COUNTER COUNTAdvanced/self-hosted
queue.gc.info.abortspanscanned
Number of transactions present in the AbortSpan scanned from the engine COUNTER COUNTAdvanced/self-hosted
queue.gc.info.clearrangefailed
Number of failed ClearRange operations during GC COUNTER COUNTself-hosted
queue.gc.info.clearrangesuccess
Number of successful ClearRange operations during GC COUNTER COUNTself-hosted
queue.gc.info.intentsconsidered
Number of 'old' intents COUNTER COUNTAdvanced/self-hosted
queue.gc.info.intenttxns
Number of associated distinct transactions COUNTER COUNTAdvanced/self-hosted
queue.gc.info.numkeysaffected
Number of keys with GC'able data COUNTER COUNTAdvanced/self-hosted
queue.gc.info.pushtxn
Number of attempted pushes COUNTER COUNTAdvanced/self-hosted
queue.gc.info.resolvesuccess
Number of successful intent resolutions COUNTER COUNTAdvanced/self-hosted
queue.gc.info.resolvetotal
Number of attempted intent resolutions COUNTER COUNTAdvanced/self-hosted
queue.gc.info.transactionspangcaborted
Number of GC'able entries corresponding to aborted txns COUNTER COUNTAdvanced/self-hosted
queue.gc.info.transactionspangccommitted
Number of GC'able entries corresponding to committed txns COUNTER COUNTAdvanced/self-hosted
queue.gc.info.transactionspangcpending
Number of GC'able entries corresponding to pending txns COUNTER COUNTAdvanced/self-hosted
queue.gc.info.transactionspanscanned
Number of entries in transaction spans scanned from the engine COUNTER COUNTAdvanced/self-hosted
queue.gc.pending
Number of pending replicas in the MVCC GC queue GAUGE COUNTAdvanced/self-hosted
queue.gc.process.failure
Number of replicas which failed processing in the MVCC GC queue COUNTER COUNTAdvanced/self-hosted
queue.gc.process.success
Number of replicas successfully processed by the MVCC GC queue COUNTER COUNTAdvanced/self-hosted
queue.gc.processingnanos
Nanoseconds spent processing replicas in the MVCC GC queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.raftlog.pending
Number of pending replicas in the Raft log queue GAUGE COUNTAdvanced/self-hosted
queue.raftlog.process.failure
Number of replicas which failed processing in the Raft log queue COUNTER COUNTAdvanced/self-hosted
queue.raftlog.process.success
Number of replicas successfully processed by the Raft log queue COUNTER COUNTAdvanced/self-hosted
queue.raftlog.processingnanos
Nanoseconds spent processing replicas in the Raft log queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.raftsnapshot.pending
Number of pending replicas in the Raft repair queue GAUGE COUNTAdvanced/self-hosted
queue.raftsnapshot.process.failure
Number of replicas which failed processing in the Raft repair queue COUNTER COUNTAdvanced/self-hosted
queue.raftsnapshot.process.success
Number of replicas successfully processed by the Raft repair queue COUNTER COUNTAdvanced/self-hosted
queue.raftsnapshot.processingnanos
Nanoseconds spent processing replicas in the Raft repair queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.replicagc.pending
Number of pending replicas in the replica GC queue GAUGE COUNTAdvanced/self-hosted
queue.replicagc.process.failure
Number of replicas which failed processing in the replica GC queue COUNTER COUNTAdvanced/self-hosted
queue.replicagc.process.success
Number of replicas successfully processed by the replica GC queue COUNTER COUNTAdvanced/self-hosted
queue.replicagc.processingnanos
Nanoseconds spent processing replicas in the replica GC queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.replicagc.removereplica
Number of replica removals attempted by the replica GC queue COUNTER COUNTAdvanced/self-hosted
queue.replicate.addreplica
Number of replica additions attempted by the replicate queue COUNTER COUNTAdvanced/self-hosted
queue.replicate.addreplica.error
Number of failed replica additions processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.addreplica.success
Number of successful replica additions processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.pending
Number of pending replicas in the replicate queue GAUGE COUNTAdvanced/self-hosted
queue.replicate.process.failure
Number of replicas which failed processing in the replicate queue COUNTER COUNTAdvanced/self-hosted
queue.replicate.process.success
Number of replicas successfully processed by the replicate queue COUNTER COUNTAdvanced/self-hosted
queue.replicate.processingnanos
Nanoseconds spent processing replicas in the replicate queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.replicate.purgatory
Number of replicas in the replicate queue's purgatory, awaiting allocation options GAUGE COUNTAdvanced/self-hosted
queue.replicate.rebalancereplica
Number of replica rebalancer-initiated additions attempted by the replicate queue COUNTER COUNTAdvanced/self-hosted
queue.replicate.removedeadreplica
Number of dead replica removals attempted by the replicate queue (typically in response to a node outage) COUNTER COUNTAdvanced/self-hosted
queue.replicate.removedeadreplica.error
Number of failed dead replica removals processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.removedeadreplica.success
Number of successful dead replica removals processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.removedecommissioningreplica.error
Number of failed decommissioning replica removals processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.removedecommissioningreplica.success
Number of successful decommissioning replica removals processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.removereplica
Number of replica removals attempted by the replicate queue (typically in response to a rebalancer-initiated addition) COUNTER COUNTAdvanced/self-hosted
queue.replicate.removereplica.error
Number of failed replica removals processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.removereplica.success
Number of successful replica removals processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.replacedeadreplica.error
Number of failed dead replica replacements processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.replacedeadreplica.success
Number of successful dead replica replacements processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.replacedecommissioningreplica.error
Number of failed decommissioning replica replacements processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.replacedecommissioningreplica.success
Number of successful decommissioning replica replacements processed by the replicate queue COUNTER COUNTself-hosted
queue.replicate.transferlease
Number of range lease transfers attempted by the replicate queue COUNTER COUNTAdvanced/self-hosted
queue.split.pending
Number of pending replicas in the split queue GAUGE COUNTAdvanced/self-hosted
queue.split.process.failure
Number of replicas which failed processing in the split queue COUNTER COUNTAdvanced/self-hosted
queue.split.process.success
Number of replicas successfully processed by the split queue COUNTER COUNTAdvanced/self-hosted
queue.split.processingnanos
Nanoseconds spent processing replicas in the split queue COUNTER NANOSECONDSAdvanced/self-hosted
queue.tsmaintenance.pending
Number of pending replicas in the time series maintenance queue GAUGE COUNTAdvanced/self-hosted
queue.tsmaintenance.process.failure
Number of replicas which failed processing in the time series maintenance queue COUNTER COUNTAdvanced/self-hosted
queue.tsmaintenance.process.success
Number of replicas successfully processed by the time series maintenance queue COUNTER COUNTAdvanced/self-hosted
queue.tsmaintenance.processingnanos
Nanoseconds spent processing replicas in the time series maintenance queue COUNTER NANOSECONDSAdvanced/self-hosted
raft.commandsapplied
Number of Raft commands applied.

This measurement is taken on the Raft apply loops of all Replicas (leaders and followers alike), meaning that it does not measure the number of Raft commands proposed (in the hypothetical extreme case, all Replicas may apply all commands through snapshots, thus not increasing this metric at all). Instead, it is a proxy for how much work is being done advancing the Replica state machines on this node.

COUNTER COUNTAdvanced/self-hosted
raft.heartbeats.pending
Number of pending heartbeats and responses waiting to be coalesced GAUGE COUNTAdvanced/self-hosted
raft.process.commandcommit.latency
Latency histogram for applying a batch of Raft commands to the state machine.

This metric is misnamed: it measures the latency for applying a batch of committed Raft commands to a Replica state machine. This requires only non-durable I/O (except for replication configuration changes).

Note that a "batch" in this context is really a sub-batch of the batch received for application during raft ready handling. The 'raft.process.applycommitted.latency' histogram is likely more suitable in most cases, as it measures the total latency across all sub-batches (i.e. the sum of commandcommit.latency for a complete batch).

HISTOGRAM NANOSECONDSAdvanced/self-hosted
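The relationship described above between the two histograms can be illustrated with a small sketch. It assumes a batch that is applied as several sub-batches: each sub-batch latency is one observation in raft.process.commandcommit.latency, and their sum is the single observation for raft.process.applycommitted.latency. The function name and the sample numbers are hypothetical and are not part of CockroachDB.

```python
from typing import Sequence


def total_apply_latency_ns(sub_batch_latencies_ns: Sequence[int]) -> int:
    """Sum per-sub-batch apply latencies into one per-batch total.

    Each element stands for one commandcommit.latency observation
    (hypothetical input); the result stands for the corresponding
    applycommitted.latency observation for the complete batch.
    """
    return sum(sub_batch_latencies_ns)


# Three made-up sub-batch latencies, in nanoseconds.
print(total_apply_latency_ns([120_000, 80_000, 50_000]))
```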
raft.process.logcommit.latency
Latency histogram for committing Raft log entries to stable storage

This measures the latency of durably committing a group of newly received Raft entries as well as the HardState entry to disk. This excludes any data processing, i.e. we measure purely the commit latency of the resulting Engine write. Homogeneous bands of p50-p99 latencies (in the presence of regular Raft traffic) make it likely that the storage layer is healthy. Spikes in the latency bands can hint either at large sets of Raft entries being received or at performance issues at the storage layer.

HISTOGRAM NANOSECONDSAdvanced/self-hosted
raft.process.tickingnanos
Nanoseconds spent in store.processRaft() processing replica.Tick() COUNTER NANOSECONDSAdvanced/self-hosted
raft.process.workingnanos
Nanoseconds spent in store.processRaft() working.

This is the sum of the measurements passed to the raft.process.handleready.latency histogram.

COUNTER NANOSECONDSAdvanced/self-hosted
raft.rcvd.app
Number of MsgApp messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.appresp
Number of MsgAppResp messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.dropped
Number of incoming Raft messages dropped (due to queue length or size) COUNTER COUNTAdvanced/self-hosted
raft.rcvd.heartbeat
Number of (coalesced, if enabled) MsgHeartbeat messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.heartbeatresp
Number of (coalesced, if enabled) MsgHeartbeatResp messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.prevote
Number of MsgPreVote messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.prevoteresp
Number of MsgPreVoteResp messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.prop
Number of MsgProp messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.snap
Number of MsgSnap messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.timeoutnow
Number of MsgTimeoutNow messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.transferleader
Number of MsgTransferLeader messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.vote
Number of MsgVote messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.rcvd.voteresp
Number of MsgVoteResp messages received by this store COUNTER COUNTAdvanced/self-hosted
raft.ticks
Number of Raft ticks queued COUNTER COUNTAdvanced/self-hosted
raftlog.behind
Number of Raft log entries followers on other stores are behind.

This gauge provides a view of the aggregate number of log entries the Raft leaders on this node think the followers are behind. It is difficult to interpret this metric meaningfully: a Raft leader may not always have a good estimate of this information for all of its followers, and followers are expected to be behind when they are not required as part of a quorum, so the aggregate scales with the count of such followers.

GAUGE COUNTAdvanced/self-hosted
raftlog.truncated
Number of Raft log entries truncated COUNTER COUNTAdvanced/self-hosted
range.adds
Number of range additions COUNTER COUNTAdvanced/self-hosted
range.merges
Number of range merges COUNTER COUNTself-hosted
range.raftleadertransfers
Number of raft leader transfers COUNTER COUNTAdvanced/self-hosted
range.removes
Number of range removals COUNTER COUNTAdvanced/self-hosted
range.snapshots.generated
Number of generated snapshots COUNTER COUNTAdvanced/self-hosted
range.snapshots.rcvd-bytes
Number of snapshot bytes received COUNTER BYTESself-hosted
range.snapshots.rebalancing.rcvd-bytes
Number of rebalancing snapshot bytes received COUNTER BYTESself-hosted
range.snapshots.rebalancing.sent-bytes
Number of rebalancing snapshot bytes sent COUNTER BYTESself-hosted
range.snapshots.recovery.rcvd-bytes
Number of raft recovery snapshot bytes received COUNTER BYTESself-hosted
range.snapshots.recovery.sent-bytes
Number of raft recovery snapshot bytes sent COUNTER BYTESself-hosted
range.snapshots.recv-in-progress
Number of non-empty snapshots being received GAUGE COUNTself-hosted
range.snapshots.recv-queue
Number of snapshots queued to receive GAUGE COUNTself-hosted
range.snapshots.recv-total-in-progress
Number of total snapshots being received GAUGE COUNTself-hosted
range.snapshots.send-in-progress
Number of non-empty snapshots being sent GAUGE COUNTself-hosted
range.snapshots.send-queue
Number of snapshots queued to send GAUGE COUNTself-hosted
range.snapshots.send-total-in-progress
Number of total snapshots being sent GAUGE COUNTself-hosted
range.snapshots.sent-bytes
Number of snapshot bytes sent COUNTER BYTESself-hosted
range.snapshots.unknown.rcvd-bytes
Number of unknown snapshot bytes received COUNTER BYTESself-hosted
range.snapshots.unknown.sent-bytes
Number of unknown snapshot bytes sent COUNTER BYTESself-hosted
range.splits
Number of range splits COUNTER COUNTAdvanced/self-hosted
rangekeybytes
Number of bytes taken up by range keys (e.g. MVCC range tombstones) GAUGE BYTESAdvanced/self-hosted
rangekeycount
Count of all range keys (e.g. MVCC range tombstones) GAUGE COUNTAdvanced/self-hosted
ranges
Number of ranges GAUGE COUNTAdvanced/self-hosted
ranges.overreplicated
Number of ranges with more live replicas than the replication target GAUGE COUNTAdvanced/self-hosted
ranges.unavailable
Number of ranges with fewer live replicas than needed for quorum GAUGE COUNTAdvanced/self-hosted
ranges.underreplicated
Number of ranges with fewer live replicas than the replication target GAUGE COUNTAdvanced/self-hosted
rangevalbytes
Number of bytes taken up by range key values (e.g. MVCC range tombstones) GAUGE BYTESAdvanced/self-hosted
rangevalcount
Count of all range key values (e.g. MVCC range tombstones) GAUGE COUNTAdvanced/self-hosted
rebalancing.cpunanospersecond
Average CPU nanoseconds spent on processing replica operations in the last 30 minutes. GAUGE NANOSECONDSAdvanced/self-hosted
rebalancing.lease.transfers
Number of lease transfers motivated by store-level load imbalances COUNTER COUNTAdvanced/self-hosted
rebalancing.queriespersecond
Number of kv-level requests received per second by the store, considering the last 30 minutes, as used in rebalancing decisions. GAUGE COUNTAdvanced/self-hosted
rebalancing.range.rebalances
Number of range rebalance operations motivated by store-level load imbalances COUNTER COUNTAdvanced/self-hosted
rebalancing.readbytespersecond
Number of bytes read recently per second, considering the last 30 minutes. GAUGE BYTESAdvanced/self-hosted
rebalancing.readspersecond
Number of keys read recently per second, considering the last 30 minutes. GAUGE COUNTAdvanced/self-hosted
rebalancing.replicas.cpunanospersecond
Histogram of average CPU nanoseconds spent on processing replica operations in the last 30 minutes. HISTOGRAM NANOSECONDSAdvanced/self-hosted
rebalancing.replicas.queriespersecond
Histogram of average kv-level requests received per second by replicas on the store in the last 30 minutes. HISTOGRAM COUNTAdvanced/self-hosted
rebalancing.requestspersecond
Number of requests received recently per second, considering the last 30 minutes. GAUGE COUNTAdvanced/self-hosted
rebalancing.state.imbalanced_overfull_options_exhausted
Number of occurrences where this store was overfull but failed to shed load after exhausting available rebalance options COUNTER COUNTAdvanced/self-hosted
rebalancing.writebytespersecond
Number of bytes written recently per second, considering the last 30 minutes. GAUGE BYTESAdvanced/self-hosted
rebalancing.writespersecond
Number of keys written (i.e. applied by raft) per second to the store, considering the last 30 minutes. GAUGE COUNTAdvanced/self-hosted
replicas
Number of replicas GAUGE COUNTAdvanced/self-hosted
replicas.leaders
Number of raft leaders GAUGE COUNTAdvanced/self-hosted
replicas.leaders_invalid_lease
Number of replicas that are Raft leaders whose lease is invalid GAUGE COUNTself-hosted
replicas.leaders_not_leaseholders
Number of replicas that are Raft leaders whose range lease is held by another store GAUGE COUNTAdvanced/self-hosted
replicas.leaseholders
Number of lease holders GAUGE COUNTAdvanced/self-hosted
replicas.quiescent
Number of quiesced replicas GAUGE COUNTAdvanced/self-hosted
replicas.reserved
Number of replicas reserved for snapshots GAUGE COUNTAdvanced/self-hosted
requests.backpressure.split
Number of backpressured writes waiting on a Range split.

A Range will backpressure (roughly) non-system traffic when the range is above the configured size until the range splits. When the rate of this metric is nonzero over extended periods of time, investigate why splits are not occurring.

GAUGE COUNTAdvanced/self-hosted
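The guidance above (investigate when the metric stays nonzero over extended periods) could be sketched as a check over scraped samples. This is a minimal sketch and not CockroachDB tooling: the function name, the `(timestamp, value)` sample format, and the 10-minute default threshold are assumptions chosen for illustration, and "nonzero" is interpreted here as the sampled value being above zero.

```python
from typing import Iterable, Optional, Tuple


def sustained_nonzero(samples: Iterable[Tuple[float, float]],
                      min_duration_s: float = 600.0) -> bool:
    """Return True if the value stays above zero for at least min_duration_s.

    samples: (unix_timestamp_seconds, value) pairs in ascending time order,
    e.g. scraped values of requests.backpressure.split (hypothetical input).
    """
    run_start: Optional[float] = None  # when the current nonzero run began
    for ts, value in samples:
        if value > 0:
            if run_start is None:
                run_start = ts
            if ts - run_start >= min_duration_s:
                return True
        else:
            run_start = None  # a zero sample ends the current run
    return False
```

A single zero sample resets the run, so brief bursts of backpressure do not trigger the check; only an uninterrupted nonzero stretch of the assumed length does.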
requests.slow.distsender
Number of range-bound RPCs currently stuck or retrying for a long time.

Note that this is not a good signal for KV health. The remote side of the RPCs tracked here may experience contention, so an end user can easily cause values for this metric to be emitted by leaving a transaction open for a long time and contending with it using a second transaction.

GAUGE COUNTStandard/Advanced/self-hosted
requests.slow.lease
Number of requests that have been stuck for a long time acquiring a lease.

A nonzero value for this gauge usually indicates range or replica unavailability and should be investigated. In the common case, 'requests.slow.raft' is also expected to register a nonzero value, indicating that the lease requests are not getting a timely response from the replication layer.

GAUGE COUNTAdvanced/self-hosted
requests.slow.raft
Number of requests that have been stuck for a long time in the replication layer.

An (evaluated) request has to pass through the replication layer, notably the quota pool and raft. If it fails to do so within a highly permissive duration, the gauge is incremented (and decremented again once the request is either applied or returns an error).

A nonzero value indicates range or replica unavailability, and should be investigated.

GAUGE COUNTAdvanced/self-hosted
rocksdb.block.cache.hits
Count of block cache hits COUNTER COUNTAdvanced/self-hosted
rocksdb.block.cache.misses
Count of block cache misses COUNTER COUNTAdvanced/self-hosted
rocksdb.block.cache.usage
Bytes used by the block cache GAUGE BYTESAdvanced/self-hosted
rocksdb.bloom.filter.prefix.checked
Number of times the bloom filter was checked COUNTER COUNTAdvanced/self-hosted
rocksdb.bloom.filter.prefix.useful
Number of times the bloom filter helped avoid iterator creation COUNTER COUNTAdvanced/self-hosted
rocksdb.compactions
Number of table compactions COUNTER COUNTAdvanced/self-hosted
rocksdb.flushes
Number of table flushes COUNTER COUNTAdvanced/self-hosted
rocksdb.memtable.total-size
Current size of memtable in bytes GAUGE BYTESAdvanced/self-hosted
rocksdb.num-sstables
Number of storage engine SSTables GAUGE COUNTAdvanced/self-hosted
rocksdb.read-amplification
Number of disk reads per query GAUGE COUNTAdvanced/self-hosted
rocksdb.table-readers-mem-estimate
Memory used by index and filter blocks GAUGE BYTESAdvanced/self-hosted
round-trip-latency
Distribution of round-trip latencies with other nodes.

This only reflects successful heartbeats and measures gRPC overhead as well as possible head-of-line blocking. Elevated values in this metric may hint at network issues and/or saturation, but they do not prove them; CPU overload can similarly elevate this metric. The operator should look to OS-level metrics such as packet loss and retransmits to conclusively diagnose network issues. Heartbeats are not very frequent (~seconds), so they may not capture rare or short-lived degradations.

HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
rpc.connection.avg_round_trip_latency
Sum of exponentially weighted moving average of round-trip latencies, as measured through a gRPC RPC.

Dividing this gauge by rpc.connection.healthy gives an approximation of average latency, but the top-level round-trip-latency histogram is more useful for that purpose. Users should instead consult the label families of this metric if they are available (which requires Prometheus and the cluster setting 'server.child_metrics.enabled'); these provide per-peer moving averages.

This metric does not track failed connections. A failed connection's contribution is reset to zero.

GAUGE NANOSECONDSself-hosted
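The division described above can be sketched as follows. This is a minimal illustration that assumes the two gauge values have already been read from a metrics endpoint; the function name and the choice to return `None` when there are no healthy connections are assumptions, not CockroachDB behavior.

```python
from typing import Optional


def approx_avg_round_trip_ns(rtt_sum_ns: float,
                             healthy_conns: int) -> Optional[float]:
    """Approximate average round-trip latency in nanoseconds.

    rtt_sum_ns: value of rpc.connection.avg_round_trip_latency
        (a sum of per-connection moving averages).
    healthy_conns: value of rpc.connection.healthy.
    Returns None when there are no healthy connections to average over.
    """
    if healthy_conns <= 0:
        return None
    return rtt_sum_ns / healthy_conns
```

The result is only an approximation of average latency, as the description notes; the round-trip-latency histogram or the per-peer label families remain the better sources when they are available.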
rpc.connection.failures
Counter of failed connections.

This includes both the event in which a healthy connection terminates as well as unsuccessful reconnection attempts.

Connections that are terminated as part of local node shutdown are excluded. Decommissioned peers are excluded.

COUNTER COUNTself-hosted
rpc.connection.healthy
Gauge of current connections in a healthy state (i.e. bidirectionally connected and heartbeating) GAUGE COUNTself-hosted
rpc.connection.healthy_nanos
Gauge of nanoseconds of healthy connection time.

On the Prometheus endpoint scraped with the cluster setting 'server.child_metrics.enabled' set, the constituent parts of this metric are available on a per-peer basis, and one can read off how long a given peer has been connected.

GAUGE NANOSECONDSself-hosted
rpc.connection.heartbeats
Counter of successful heartbeats. COUNTER COUNTself-hosted
rpc.connection.unhealthy
Gauge of current connections in an unhealthy state (not bidirectionally connected or heartbeating) GAUGE COUNTself-hosted
rpc.connection.unhealthy_nanos
Gauge of nanoseconds of unhealthy connection time.

On the Prometheus endpoint scraped with the cluster setting 'server.child_metrics.enabled' set, the constituent parts of this metric are available on a per-peer basis, and one can read off how long a given peer has been unreachable.

GAUGE NANOSECONDSself-hosted
schedules.BACKUP.failed
Number of BACKUP jobs failed COUNTER COUNTStandard/Advanced/self-hosted
schedules.BACKUP.last-completed-time
The unix timestamp of the most recently completed backup by a schedule specified as maintaining this metric GAUGE TIMESTAMP_SECStandard/Advanced/self-hosted
schedules.BACKUP.protected_age_sec
The age of the oldest PTS record protected by BACKUP schedules GAUGE SECONDSself-hosted
schedules.BACKUP.protected_record_count
Number of PTS records held by BACKUP schedules GAUGE COUNTself-hosted
schedules.BACKUP.started
Number of BACKUP jobs started COUNTER COUNTStandard/Advanced/self-hosted
schedules.BACKUP.succeeded
Number of BACKUP jobs succeeded COUNTER COUNTStandard/Advanced/self-hosted
schedules.scheduled-row-level-ttl-executor.failed
Number of scheduled-row-level-ttl-executor jobs failed COUNTER COUNTAdvanced/self-hosted
seconds.until.enterprise.license.expiry
Seconds until enterprise license expiry (0 if no license present or running without enterprise features) GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.ca
Expiration for the CA certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECAdvanced/self-hosted
security.certificate.expiration.ca-client-tenant
Expiration for the Tenant Client CA certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.client
Minimum expiration for client certificates, labeled by SQL user. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.client-ca
Expiration for the client CA certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.client-tenant
Expiration for the Tenant Client certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.node
Expiration for the node certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.node-client
Expiration for the node's client certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.ui
Expiration for the UI certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.expiration.ui-ca
Expiration for the UI CA certificate. 0 means no certificate or error. GAUGE TIMESTAMP_SECself-hosted
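When consuming the security.certificate.expiration.* gauges above, the value 0 needs special handling because it means "no certificate or error" rather than a real expiration time. The sketch below assumes, based on the TIMESTAMP_SEC unit, that a nonzero value is a Unix timestamp in seconds; the function name and the `None` return convention are hypothetical.

```python
from typing import Optional


def seconds_until_expiry(expiration_unix_s: int,
                         now_unix_s: int) -> Optional[int]:
    """Seconds remaining before a certificate expires.

    expiration_unix_s: a security.certificate.expiration.* gauge value,
        assumed to be a Unix timestamp in seconds.
    now_unix_s: the current time as a Unix timestamp in seconds.
    Returns None when the gauge is 0 (no certificate or error), and a
    negative number if the expiration time is already in the past.
    """
    if expiration_unix_s == 0:
        return None
    return expiration_unix_s - now_unix_s
```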
security.certificate.ttl.ca
Seconds till expiration for the CA certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.ca-client-tenant
Seconds till expiration for the Tenant Client CA certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.client
Seconds till expiration for the client certificates, labeled by SQL user. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.client-ca
Seconds till expiration for the client CA certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.client-tenant
Seconds till expiration for the Tenant Client certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.node
Seconds till expiration for the node certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.node-client
Seconds till expiration for the node's client certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.ui
Seconds till expiration for the UI certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
security.certificate.ttl.ui-ca
Seconds till expiration for the UI CA certificate. 0 means expired, no certificate or error. GAUGE TIMESTAMP_SECself-hosted
sql.bytesin
Number of SQL bytes received COUNTER BYTESStandard/Advanced/self-hosted
sql.bytesout
Number of SQL bytes sent COUNTER BYTESStandard/Advanced/self-hosted
sql.conn.latency
Latency to establish and authenticate a SQL connection HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
sql.conns
Number of open SQL connections GAUGE COUNTStandard/Advanced/self-hosted
sql.crud_query.count
Number of SQL SELECT, INSERT, UPDATE, DELETE statements successfully executed COUNTER COUNTself-hosted
sql.crud_query.started.count
Number of SQL SELECT, INSERT, UPDATE, DELETE statements started COUNTER COUNTself-hosted
sql.ddl.count
Number of SQL DDL statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.delete.count
Number of SQL DELETE statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.distsql.contended_queries.count
Number of SQL queries that experienced contention COUNTER COUNTStandard/Advanced/self-hosted
sql.distsql.exec.latency
Latency of DistSQL statement execution HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
sql.distsql.flows.active
Number of distributed SQL flows currently active GAUGE COUNTStandard/Advanced/self-hosted
sql.distsql.flows.total
Number of distributed SQL flows executed COUNTER COUNTStandard/Advanced/self-hosted
sql.distsql.queries.active
Number of invocations of the execution engine currently active (multiple of which may occur for a single SQL statement) GAUGE COUNTStandard/Advanced/self-hosted
sql.distsql.queries.total
Number of invocations of the execution engine executed (multiple of which may occur for a single SQL statement) COUNTER COUNTStandard/Advanced/self-hosted
sql.distsql.select.count
Number of SELECT statements planned to be distributed COUNTER COUNTStandard/Advanced/self-hosted
sql.distsql.service.latency
Latency of DistSQL request execution HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
sql.exec.latency
Latency of SQL statement execution HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
sql.exec.latency.detail
Latency of SQL statement execution, by statement fingerprint HISTOGRAM NANOSECONDSself-hosted
sql.failure.count
Number of statements resulting in a planning or runtime error COUNTER COUNTStandard/Advanced/self-hosted
sql.full.scan.count
Number of full table or index scans COUNTER COUNTStandard/Advanced/self-hosted
sql.guardrails.max_row_size_err.count
Number of rows observed violating sql.guardrails.max_row_size_err COUNTER COUNTself-hosted
sql.guardrails.max_row_size_log.count
Number of rows observed violating sql.guardrails.max_row_size_log COUNTER COUNTself-hosted
sql.insert.count
Number of SQL INSERT statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.mem.distsql.current
Current sql statement memory usage for distsql GAUGE BYTESStandard/Advanced/self-hosted
sql.mem.distsql.max
Memory usage per sql statement for distsql HISTOGRAM BYTESStandard/Advanced/self-hosted
sql.mem.internal.session.current
Current sql session memory usage for internal GAUGE BYTESStandard/Advanced/self-hosted
sql.mem.internal.session.max
Memory usage per sql session for internal HISTOGRAM BYTESStandard/Advanced/self-hosted
sql.mem.internal.txn.current
Current sql transaction memory usage for internal GAUGE BYTESStandard/Advanced/self-hosted
sql.mem.internal.txn.max
Memory usage per sql transaction for internal HISTOGRAM BYTESStandard/Advanced/self-hosted
sql.mem.root.current
Current sql statement memory usage for root GAUGE BYTESStandard/Advanced/self-hosted
sql.mem.root.max
Memory usage per sql statement for root HISTOGRAM BYTESself-hosted
sql.misc.count
Number of other SQL statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.new_conns
Number of SQL connections created COUNTER COUNTStandard/Advanced/self-hosted
sql.pgwire_cancel.ignored
Number of pgwire query cancel requests that were ignored due to rate limiting COUNTER COUNTself-hosted
sql.pgwire_cancel.successful
Number of pgwire query cancel requests that were successful COUNTER COUNTself-hosted
sql.pgwire_cancel.total
Number of pgwire query cancel requests COUNTER COUNTself-hosted
sql.query.count
Number of SQL operations started, including queries and transaction control statements COUNTER COUNTStandard/Advanced/self-hosted
sql.query.unique.count
Cardinality estimate of the set of statement fingerprints COUNTER COUNTself-hosted
sql.select.count
Number of SQL SELECT statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.service.latency
Latency of SQL request execution HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
sql.statements.active
Number of currently active user SQL statements GAUGE COUNTStandard/Advanced/self-hosted
sql.txn.abort.count
Number of SQL transaction abort errors COUNTER COUNTStandard/Advanced/self-hosted
sql.txn.begin.count
Number of SQL transaction BEGIN statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.txn.commit.count
Number of SQL transaction COMMIT statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.txn.contended.count
Number of SQL transactions that experienced contention COUNTER COUNTself-hosted
sql.txn.latency
Latency of SQL transactions HISTOGRAM NANOSECONDSStandard/Advanced/self-hosted
sql.txn.rollback.count
Number of SQL transaction ROLLBACK statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
sql.txns.open
Number of currently open user SQL transactions GAUGE COUNTStandard/Advanced/self-hosted
sql.update.count
Number of SQL UPDATE statements successfully executed COUNTER COUNTStandard/Advanced/self-hosted
storage.keys.range-key-set.count
Approximate count of RangeKeySet internal keys across the storage engine. GAUGE COUNTself-hosted
storage.l0-level-score
Compaction score of level 0 GAUGE COUNTself-hosted
storage.l0-level-size
Size of the SSTables in level 0 GAUGE BYTESself-hosted
storage.l0-num-files
Number of SSTables in Level 0 GAUGE COUNTAdvanced/self-hosted
storage.l0-sublevels
Number of Level 0 sublevels GAUGE COUNTAdvanced/self-hosted
storage.l1-level-score
Compaction score of level 1 GAUGE COUNTself-hosted
storage.l1-level-size
Size of the SSTables in level 1 GAUGE BYTESself-hosted
storage.l2-level-score
Compaction score of level 2 GAUGE COUNTself-hosted
storage.l2-level-size
Size of the SSTables in level 2 GAUGE BYTESself-hosted
storage.l3-level-score
Compaction score of level 3 GAUGE COUNTself-hosted
storage.l3-level-size
Size of the SSTables in level 3 GAUGE BYTESself-hosted
storage.l4-level-score
Compaction score of level 4 GAUGE COUNTself-hosted
storage.l4-level-size
Size of the SSTables in level 4 GAUGE BYTESself-hosted
storage.l5-level-score
Compaction score of level 5 GAUGE COUNTself-hosted
storage.l5-level-size
Size of the SSTables in level 5 GAUGE BYTESself-hosted
storage.l6-level-score
Compaction score of level 6 GAUGE COUNTself-hosted
storage.l6-level-size
Size of the SSTables in level 6 GAUGE BYTESself-hosted
storage.marked-for-compaction-files
Count of SSTables marked for compaction GAUGE COUNTself-hosted
storage.write-stalls
Number of instances of intentional write stalls to backpressure incoming writes COUNTER COUNTself-hosted
sys.cgo.allocbytes
Current bytes of memory allocated by cgo GAUGE BYTESAdvanced/self-hosted
sys.cgo.totalbytes
Total bytes of memory allocated by cgo, but not released GAUGE BYTESAdvanced/self-hosted
sys.cgocalls
Total number of cgo calls COUNTER COUNTAdvanced/self-hosted
sys.cpu.combined.percent-normalized
Current user+system cpu percentage consumed by the CRDB process, normalized 0-1 by number of cores GAUGE PERCENTAdvanced/self-hosted
sys.cpu.host.combined.percent-normalized
Current user+system cpu percentage across the whole machine, normalized 0-1 by number of cores GAUGE PERCENTself-hosted
sys.cpu.sys.ns
Total system cpu time consumed by the CRDB process COUNTER NANOSECONDSAdvanced/self-hosted
sys.cpu.sys.percent
Current system cpu percentage consumed by the CRDB process GAUGE PERCENTAdvanced/self-hosted
sys.cpu.user.ns
Total user cpu time consumed by the CRDB process COUNTER NANOSECONDSAdvanced/self-hosted
sys.cpu.user.percent
Current user cpu percentage consumed by the CRDB process GAUGE PERCENTAdvanced/self-hosted
sys.fd.open
Process open file descriptors GAUGE COUNTAdvanced/self-hosted
sys.fd.softlimit
Process open FD soft limit GAUGE COUNTAdvanced/self-hosted
sys.gc.count
Total number of GC runs COUNTER COUNTAdvanced/self-hosted
sys.gc.pause.ns
Total GC pause COUNTER NANOSECONDSAdvanced/self-hosted
sys.gc.pause.percent
Current GC pause percentage GAUGE PERCENTAdvanced/self-hosted
sys.go.allocbytes
Current bytes of memory allocated by go GAUGE BYTESAdvanced/self-hosted
sys.go.totalbytes
Total bytes of memory allocated by go, but not released GAUGE BYTESAdvanced/self-hosted
sys.goroutines
Current number of goroutines GAUGE COUNTAdvanced/self-hosted
sys.host.disk.iopsinprogress
IO operations currently in progress on this host (as reported by the OS) GAUGE COUNTAdvanced/self-hosted
sys.host.disk.read.bytes
Bytes read from all disks since this process started (as reported by the OS) COUNTER BYTESAdvanced/self-hosted
sys.host.disk.read.count
Disk read operations across all disks since this process started (as reported by the OS) COUNTER COUNTAdvanced/self-hosted
sys.host.disk.write.bytes
Bytes written to all disks since this process started (as reported by the OS) COUNTER BYTESAdvanced/self-hosted
sys.host.disk.write.count
Disk write operations across all disks since this process started (as reported by the OS) COUNTER COUNTAdvanced/self-hosted
sys.host.net.recv.bytes
Bytes received on all network interfaces since this process started (as reported by the OS) COUNTER BYTESAdvanced/self-hosted
sys.host.net.send.bytes
Bytes sent on all network interfaces since this process started (as reported by the OS) COUNTER BYTESAdvanced/self-hosted
sys.rss
Current process RSS GAUGE BYTESAdvanced/self-hosted
sys.runnable.goroutines.per.cpu
Average number of goroutines that are waiting to run, normalized by number of cores GAUGE COUNTAdvanced/self-hosted
sys.totalmem
Total memory (both free and used) GAUGE BYTESAdvanced/self-hosted
sys.uptime
Process uptime COUNTER SECONDSStandard/Advanced/self-hosted
sysbytes
Number of bytes in system KV pairs GAUGE BYTESAdvanced/self-hosted
syscount
Count of system KV pairs GAUGE COUNTAdvanced/self-hosted
tenant.consumption.cross_region_network_ru
Total number of RUs charged for cross-region network traffic COUNTER COUNTself-hosted
tenant.consumption.external_io_egress_bytes
Total number of bytes written to external services such as cloud storage providers COUNTER COUNTself-hosted
tenant.consumption.pgwire_egress_bytes
Total number of bytes transferred from a SQL pod to the client COUNTER COUNTself-hosted
tenant.consumption.read_batches
Total number of KV read batches COUNTER COUNTself-hosted
tenant.consumption.read_bytes
Total number of bytes read from KV COUNTER COUNTself-hosted
tenant.consumption.read_requests
Total number of KV read requests COUNTER COUNTself-hosted
tenant.consumption.request_units
Total RU consumption COUNTER COUNT self-hosted
tenant.consumption.sql_pods_cpu_seconds
Total amount of CPU used by SQL pods COUNTER SECONDS self-hosted
tenant.consumption.write_batches
Total number of KV write batches COUNTER COUNT self-hosted
tenant.consumption.write_bytes
Total number of bytes written to KV COUNTER COUNT self-hosted
tenant.consumption.write_requests
Total number of KV write requests COUNTER COUNT self-hosted
tenant.sql_usage.cross_region_network_ru
Total number of RUs charged for cross-region network traffic COUNTER COUNT Standard/self-hosted
tenant.sql_usage.estimated_cpu_seconds
Estimated amount of CPU consumed by a virtual cluster COUNTER SECONDS Standard/self-hosted
tenant.sql_usage.external_io_egress_bytes
Total number of bytes written to external services such as cloud storage providers COUNTER COUNT Standard/self-hosted
tenant.sql_usage.external_io_ingress_bytes
Total number of bytes read from external services such as cloud storage providers COUNTER COUNT Standard/self-hosted
tenant.sql_usage.kv_request_units
RU consumption attributable to KV COUNTER COUNT Standard/self-hosted
tenant.sql_usage.pgwire_egress_bytes
Total number of bytes transferred from a SQL pod to the client COUNTER COUNT Standard/self-hosted
tenant.sql_usage.provisioned_vcpus
Number of vCPUs available to the virtual cluster GAUGE COUNT Standard/self-hosted
tenant.sql_usage.read_batches
Total number of KV read batches COUNTER COUNT Standard/self-hosted
tenant.sql_usage.read_bytes
Total number of bytes read from KV COUNTER COUNT Standard/self-hosted
tenant.sql_usage.read_requests
Total number of KV read requests COUNTER COUNT Standard/self-hosted
tenant.sql_usage.request_units
RU consumption COUNTER COUNT Standard/self-hosted
tenant.sql_usage.sql_pods_cpu_seconds
Total amount of CPU used by SQL pods COUNTER SECONDS Standard/self-hosted
tenant.sql_usage.write_batches
Total number of KV write batches COUNTER COUNT Standard/self-hosted
tenant.sql_usage.write_bytes
Total number of bytes written to KV COUNTER COUNT Standard/self-hosted
tenant.sql_usage.write_requests
Total number of KV write requests COUNTER COUNT Standard/self-hosted
timeseries.write.bytes
Total size in bytes of metric samples written to disk COUNTER BYTES Advanced/self-hosted
timeseries.write.errors
Total errors encountered while attempting to write metrics to disk COUNTER COUNT Advanced/self-hosted
timeseries.write.samples
Total number of metric samples written to disk COUNTER COUNT Advanced/self-hosted
totalbytes
Total number of bytes taken up by keys and values including non-live data GAUGE BYTES Advanced/self-hosted
txn.aborts
Number of aborted KV transactions COUNTER COUNT Standard/Advanced/self-hosted
txn.commits
Number of committed KV transactions (including 1PC) COUNTER COUNT Standard/Advanced/self-hosted
txn.commits1PC
Number of KV transaction one-phase commits COUNTER COUNT Standard/Advanced/self-hosted
txn.durations
KV transaction durations HISTOGRAM NANOSECONDS Standard/Advanced/self-hosted
txn.restarts
Number of restarted KV transactions HISTOGRAM COUNT Standard/Advanced/self-hosted
txn.restarts.asyncwritefailure
Number of restarts due to async consensus writes that failed to leave intents COUNTER COUNT self-hosted
txn.restarts.readwithinuncertainty
Number of restarts due to reading a new value within the uncertainty interval COUNTER COUNT self-hosted
txn.restarts.serializable
Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE COUNTER COUNT Standard/Advanced/self-hosted
txn.restarts.txnaborted
Number of restarts due to an abort by a concurrent transaction (usually due to deadlock) COUNTER COUNT self-hosted
txn.restarts.txnpush
Number of restarts due to a transaction push failure COUNTER COUNT self-hosted
txn.restarts.unknown
Number of restarts due to unknown reasons COUNTER COUNT self-hosted
txn.restarts.writetooold
Number of restarts due to a concurrent writer committing first COUNTER COUNT Standard/Advanced/self-hosted
txnwaitqueue.deadlocks_total
Number of deadlocks detected by the txn wait queue COUNTER COUNT self-hosted
valbytes
Number of bytes taken up by values GAUGE BYTES Advanced/self-hosted
valcount
Count of all values GAUGE COUNT Advanced/self-hosted
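
Many of the metrics above are of type COUNTER, meaning the recorded value is cumulative and is usually interpreted as a rate between two samples. The following Python sketch illustrates that arithmetic for txn.commits and txn.aborts. It is only an illustration: the helper names (counter_rate, abort_fraction), the sample values, and the 10-second interval are all hypothetical, and the restart handling is an assumption rather than documented CockroachDB behavior.

```python
def counter_rate(prev: float, curr: float, interval_s: float) -> float:
    """Per-second rate between two samples of a cumulative counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Assumption: a counter that goes backwards means the process restarted,
    # so the current value is treated as the increase since the restart.
    delta = curr - prev if curr >= prev else curr
    return delta / interval_s


def abort_fraction(commits_delta: float, aborts_delta: float) -> float:
    """Share of finished transactions that aborted during the interval."""
    total = commits_delta + aborts_delta
    return aborts_delta / total if total else 0.0


# Hypothetical samples taken 10 seconds apart (made-up numbers).
prev = {"txn.commits": 1000, "txn.aborts": 20}
curr = {"txn.commits": 1450, "txn.aborts": 70}

commit_rate = counter_rate(prev["txn.commits"], curr["txn.commits"], 10)
abort_rate = counter_rate(prev["txn.aborts"], curr["txn.aborts"], 10)
fraction = abort_fraction(
    curr["txn.commits"] - prev["txn.commits"],
    curr["txn.aborts"] - prev["txn.aborts"],
)
print(commit_rate, abort_rate, fraction)
```

GAUGE metrics such as valbytes or totalbytes report a point-in-time value, so they would be read directly rather than differenced this way.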

See also
