Query by fqdn_label instead of instance for host metrics

Currently we use the `instance` label to query Prometheus for host
metrics. This label is set to the URL of each endpoint being
scraped.

While this works fine in one-exporter-per-compute cases, where the driver
maps the fqdn_label value to the `instance` label value, it fails
when more than one target has the same value for the fqdn
label. This is a valid case: it should be possible to query by fqdn
without caring which exporter on the host provides the metric.
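
A minimal sketch of the failure mode described above, not the driver's
actual code. The target layout, the host names and the helper
build_fqdn_instance_map are hypothetical; only the idea of mapping one
fqdn to one `instance` value comes from this message.

```python
# Hypothetical scrape targets: compute-0 runs two exporters, so two
# targets share the same fqdn label but have different instance labels.
TARGETS = [
    {"labels": {"instance": "compute-0.example.com:9100",
                "fqdn": "compute-0.example.com"}},
    {"labels": {"instance": "compute-0.example.com:9558",
                "fqdn": "compute-0.example.com"}},
    {"labels": {"instance": "compute-1.example.com:9100",
                "fqdn": "compute-1.example.com"}},
]


def build_fqdn_instance_map(targets, fqdn_label="fqdn"):
    """Map each fqdn to one instance; a later target overwrites an earlier one."""
    return {
        t["labels"][fqdn_label]: t["labels"]["instance"] for t in targets
    }


fqdn_instance_map = build_fqdn_instance_map(TARGETS)
# Only one of the two compute-0 exporters survives in the map, so a query
# filtered on that instance may miss metrics served by the other exporter.
```

With a one-to-one map, which exporter is queried depends on target
ordering rather than on which exporter actually serves the metric.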

This patch changes the queries we use for hosts to be based on the
fqdn_label instead of the instance one. To implement it, we also
simplify the way we check that the metric exists for the host by converting
prometheus_fqdn_instance_map into a prometheus_fqdn_labels set,
which stores the fqdn values found in Prometheus.
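
The approach can be sketched roughly as follows. This is an illustration
under assumptions, not the code in this patch: the target layout, the
metric name, the default label name "fqdn" and the helpers
build_fqdn_labels and host_metric_query are all hypothetical.

```python
def build_fqdn_labels(targets, fqdn_label="fqdn"):
    """Collect the distinct fqdn label values found across all targets."""
    return {
        t["labels"][fqdn_label]
        for t in targets
        if fqdn_label in t["labels"]
    }


def host_metric_query(metric, fqdn, known_fqdns, fqdn_label="fqdn"):
    """Build a PromQL selector filtered by the fqdn label.

    Returns None when the host is not among the known fqdn values.
    """
    if fqdn not in known_fqdns:
        return None
    # Doubled braces emit literal braces around the label matcher.
    return f'{metric}{{{fqdn_label}="{fqdn}"}}'


# Hypothetical targets: two exporters on compute-0 share one fqdn value.
targets = [
    {"labels": {"instance": "compute-0.example.com:9100",
                "fqdn": "compute-0.example.com"}},
    {"labels": {"instance": "compute-0.example.com:9558",
                "fqdn": "compute-0.example.com"}},
]
fqdn_labels = build_fqdn_labels(targets)
query = host_metric_query(
    "node_cpu_seconds_total", "compute-0.example.com", fqdn_labels
)
```

Because the selector matches on the fqdn label only, it covers series
from every exporter on the host, and the set is enough to decide
whether a host is known at all.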

Closes-Bug: #2103451
Change-Id: I3bcc317441b73da5c876e53edd4622370c6d575e
Alfredo Moralejo
2025-03-17 19:06:28 +01:00
parent 52bba70fec
commit a65e7e9b59
4 changed files with 173 additions and 90 deletions

@@ -0,0 +1,9 @@
---
fixes:
  - |
    When using the prometheus datasource and more than one target has the same value
    for the `fqdn_label`, the driver used the wrong instance label to query for host
    metrics. The queries no longer use the `instance` label; they use the `fqdn_label`,
    which identifies all the metrics for a specific compute node.
    .. _Bug 2103451: https://bugs.launchpad.net/watcher/+bug/2103451