For compute nodes, nova works fine if a destination node is not
specified, so this change makes sure we're not passing None when the
user does not set one to avoid an error.
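A minimal sketch of the idea, assuming a hypothetical helper named
`build_migrate_kwargs` (this is not Watcher's actual code): the
destination is only added to the call arguments when the user set one.

```python
def build_migrate_kwargs(instance_id, destination_node=None):
    """Build keyword arguments for a hypothetical migrate call.

    The "host" key is added only when a destination was given, so the
    compute API never receives an explicit None.
    """
    kwargs = {"instance_id": instance_id}
    if destination_node is not None:
        kwargs["host"] = destination_node
    return kwargs


# Without a destination the scheduler is free to pick a host.
print(build_migrate_kwargs("vm-1"))
# With a destination the host is passed through unchanged.
print(build_migrate_kwargs("vm-1", "compute-2"))
```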
Partial-Bug: 2108988
Change-Id: Ida1f18b97697c041819e29f935aa5e232848226a
Currently the host maintenance strategy also migrates instances from the
maintenance node to watcher_disabled compute nodes.
watcher_disabled compute nodes might have been disabled for some other
purpose by a different strategy. If host maintenance uses those compute
nodes for migration, it might affect customer workloads.
The host maintenance strategy should never touch disabled hosts unless the
user specifies a disabled host as the backup node.
This change drops the logic for using disabled compute nodes for
maintenance. Host maintenance already relies on the nova scheduler to
migrate instances and will continue to do so. If there is no available
node, the strategy will fail.
Closes-Bug: #2109945
Change-Id: If9795fd06f684eb67d553405cebd8a30887c3997
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
Set the default interface for keystone_client to public in the watcher
conf instead of admin.
Closes-Bug: 2109494
Change-Id: I9e0289249981ca965190df6dbdc37e09fd0951d7
pip 23.1 removed the "setup.py install" fallback for projects that do
not have pyproject.toml and now uses a pyproject.toml which is vendored
in pip [1][2]. pip 24.2 has now deprecated a similar fallback to
"setup.py develop" and plans to fully remove this in pip 25.0 [3][4][5].
pbr has supported editable installs since 6.0.0.
pip 25.1 has now been released and the removal is complete.
We address this by adding our own minimal pyproject.toml to ensure we
are using the correct build system.
This change also requires that we adapt how we generate our wsgi
entry point. When pyproject.toml is used, the wsgi console script is
not generated in an editable install such as the one used in devstack.
To address this we need to refactor our usage of our wsgi application
to use a module path instead. This change does not remove
the declaration of our wsgi_script entry point, but it should
be considered deprecated and it will be removed in the future.
To unblock the gate, the devstack plugin is modified to deploy
using the wsgi module instead of the console script.
Finally, support for the mod_wsgi wsgi mode is removed.
That mode was deprecated in devstack a few cycles ago and
support was removed in I8823e98809ed6b66c27dbcf21a00eea68ef403e8.
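As an illustration only, a wsgi module exposes a module-level
`application` callable that the server loads by module path rather than
through a generated console script. The module below is a hypothetical
stand-in, not Watcher's real wsgi module.

```python
# Hypothetical module, for example saved as example_wsgi.py. A WSGI
# server would then be pointed at "example_wsgi:application".
def application(environ, start_response):
    """Smallest possible WSGI callable, used here as a placeholder app."""
    body = b"ok\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Because the callable lives in an importable module, it is available in
editable installs even though no console script is generated.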
[1] https://pip.pypa.io/en/stable/news/#v23-1
[2] https://github.com/pypa/pip/issues/8368
[3] https://pip.pypa.io/en/stable/news/#v24-2
[4] https://github.com/pypa/pip/issues/11457
[5] https://ichard26.github.io/blog/2024/08/whats-new-in-pip-24.2/
Closes-Bug: #2109608
Depends-on: https://review.opendev.org/c/openstack/watcher/+/948502
Change-Id: Iad77939ab0403c5720c549f96edfc77d2b7d90ee
Currently we use the `instance` label to query Prometheus for host
metrics. This label is assigned to the url of each endpoint being
scraped.
While this works fine in one-exporter-per-compute cases, as the driver
maps the fqdn_label value to the `instance` label value, it fails
when more than one target has the same value for the fqdn
label. This is a valid case: it allows querying by fqdn without
caring which exporter on the host provides the metric.
This patch changes the queries we use for hosts to be based on the
fqdn_label instead of the instance one. To implement it, we also
simplify the way we check that the metric exists for the host by
converting prometheus_fqdn_instance_map into a prometheus_fqdn_labels
set which stores the fqdns found in Prometheus.
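The approach can be sketched as follows. The function names, the shape
of `targets` and the label name `fqdn` are assumptions made for
illustration; they do not mirror the actual driver code.

```python
def collect_fqdn_labels(targets, fqdn_label="fqdn"):
    """Return the set of fqdn values found across scrape targets.

    `targets` is assumed to be a list of dicts with a "labels" mapping,
    a simplification of what a Prometheus API client returns.
    """
    return {
        target["labels"][fqdn_label]
        for target in targets
        if fqdn_label in target.get("labels", {})
    }


def build_host_query(metric, fqdn, fqdn_label="fqdn"):
    """Build a PromQL selector keyed on the fqdn label, not `instance`."""
    return f'{metric}{{{fqdn_label}="{fqdn}"}}'


# Two exporters on the same host share one fqdn but differ in `instance`.
targets = [
    {"labels": {"instance": "10.0.0.1:9100", "fqdn": "compute-1.example.com"}},
    {"labels": {"instance": "10.0.0.1:9101", "fqdn": "compute-1.example.com"}},
    {"labels": {"instance": "10.0.0.2:9100"}},
]
fqdn_labels = collect_fqdn_labels(targets)
query = build_host_query("node_memory_MemAvailable_bytes",
                         "compute-1.example.com")
```

Using a set means duplicate fqdn values from several exporters collapse
into one entry, which a one-to-one fqdn-to-instance map cannot express.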
Closes-Bug: #2103451
Change-Id: I3bcc317441b73da5c876e53edd4622370c6d575e
The Monasca project was marked inactive during 2023.1. Although we have
seen multiple people showing interest in keeping the project, we haven't
seen any real progress.
Because the project is likely to be retired soon, let's deprecate the
feature dependent on Monasca so that we can remove it in a future release.
Change-Id: Ifd64f5ba59bbac238ff62302ec36a3e36954d6d0
In order to support the vm_workload_consolidation, workload_balance and
workload_stabilization strategies, some instance metrics are required.
This patch adds support for them.
Implementation is based on a prometheus store populated using sg-core
from ceilometer metrics with Pollster source.
- instance_ram_usage: rely on ceilometer_memory_usage metrics created from
ceilometer memory.usage meter.
- instance_ram_allocated: rely on the memory value provided by the
inventory created from nova and placement APIs.
- instance_cpu_usage: rely on ceilometer_cpu metric created from
ceilometer cpu meter. A max value of 100 is set in the query.
- instance_root_disk_size: rely on the `disk` value provided by the
inventory created from nova and placement APIs.
A new parameter `instance_uuid_label` has been added to the prometheus
datasource configuration to identify the label used to store the value of
the OpenStack instance uuid for each instance metric in prometheus. The
default value is `resource`.
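A rough sketch of how such queries could be assembled, assuming
hypothetical helper names and using the PromQL `clamp_max` function as
one possible way to cap a value at 100. The real CPU usage query is more
involved than the plain selector shown here.

```python
def instance_selector(metric, instance_uuid, instance_uuid_label="resource"):
    """Select an instance metric by the label holding the instance uuid."""
    return f'{metric}{{{instance_uuid_label}="{instance_uuid}"}}'


def cap_at_100(expr):
    """Wrap a PromQL expression so its value never exceeds 100."""
    return f"clamp_max({expr}, 100)"


# The uuid below is a made-up placeholder.
uuid = "11111111-2222-3333-4444-555555555555"
ram_query = instance_selector("ceilometer_memory_usage", uuid)
cpu_query = cap_at_100(instance_selector("ceilometer_cpu", uuid))
```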
Change-Id: I2f2b56aa002014e511a5e48398ef1da43fc4f5e2
This adds a new data source for the Watcher decision engine that
implements watcher.decision_engine.datasources.DataSourceBase.
The related spec was merged at [1].
Implements: blueprint prometheus-datasource
[1] https://review.opendev.org/c/openstack/watcher-specs/+/933300
Change-Id: I6a70c4acc70a864c418cf347f5f6951cb92ec906
This datasource requires Ceilometer API which was already removed some
years ago. The implementation should have been removed when dependency
on ceilometerclient was removed by [1].
Also remove some job definitions which are not actually used.
[1] 01d74d0a87
Change-Id: I29c3865dc1207f1bbbb266e4217cf8888afebfb6
This change enables codespell in pre-commit and
fixes the existing typos.
A follow-up commit will enable this in tox and CI.
Change-Id: I0a11bcd5a88247a48d3437525fc8a3cb3cdd4e58
This change adds configuration for the pre-commit tool.
Follow-up changes will address the remaining issues in a phased
approach to make the reviews simpler.
This is based on the pre-commit config used in nova
with some additional hooks.
Follow-up changes will address the FIXME comments
related to sphinx-lint and codespell, as well as update tox
to enforce these checks in ci.
Change-Id: I87681a19f7fa88366c2b0d310c8b3153aa6a137b
As per the community goal of migrating the policy file
format from JSON to YAML [1], we need to do two things:
1. Change the default value of the '[oslo_policy] policy_file'
config option from 'policy.json' to 'policy.yaml' with
upgrade checks.
2. Deprecate the JSON formatted policy file on the project side
via warnings in the docs and release notes.
Also replace references to policy.json with policy.yaml in docs and tests.
[1] https://governance.openstack.org/tc/goals/selected/wallaby/migrate-policy-format-from-json-to-yaml.html
Change-Id: I207c02ba71fe60635fd3406c9c9364c11f259bae
Add the release note for the general purpose decision engine threadpool,
including config parameters and how contributors can find relevant
documentation.
Implements: blueprint general-purpose-decision-engine-threadpool
Change-Id: I3560069b4e34f13305950559a0f05f7921f7867e
This strategy is used to centralize VMs on as few nodes as possible
through VM migration. Users can set an input parameter to decide how the
destination node is selected.
Implements: blueprint node-resource-consolidation
Closes-Bug: #1843016
Change-Id: I104c864d532c2092f5dc6f0c8f756ebeae12f09e
Add call_retry method for ModelBuilder classes along with configuration
options. This allows ModelBuilder classes to reattempt any failed calls
to external services such as Nova or Ironic.
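A minimal sketch of such a retry helper, written as a free function for
brevity. The parameter names `retries` and `retry_interval` are
placeholders and may not match the actual configuration options.

```python
import time


def call_retry(func, *args, retries=3, retry_interval=0.0, **kwargs):
    """Call func, reattempting on any exception.

    Returns the result of the first successful call, or None when all
    `retries` attempts fail.
    """
    for attempt in range(retries):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Wait before the next attempt, but not after the last one.
            if attempt < retries - 1:
                time.sleep(retry_interval)
    return None
```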
Change-Id: Ided697adebed957e5ff13b4c6b5b06c816f81c4a
This is the release note for the new grafana datasource. It refers to
the documentation on configuring grafana.
Depends-on: Ib12b6a7882703e84a27c301e821c1a034b192508
Change-Id: Icb3939d772f06ad2d66eeba9a59fa8b60822ece0
This patch implements uWSGI support for the Watcher API service.
Because mod_wsgi is deprecated, uWSGI is used to replace it.
Most OpenStack projects have already completed this migration.
Closes-Bug: #1834392
Change-Id: I3fad8d30a15aba493fb91da9337c2515ddea5167
Moves the query_retry method into the baseclass and makes the query
retry and timeout options part of the watcher_datasources config group.
This makes the query_retry behavior uniform across all datasources.
A new baseclass method named query_retry_reset is added so datasources
can define operations to perform when recovering from a query error.
Test cases are added to verify the behavior of query_retry.
The query_max_retries and query_timeout config parameters are
deprecated in the gnocchi_client group and will be removed in a future
release.
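The relationship between query_retry and query_retry_reset might look
roughly like the sketch below. The class names are hypothetical, and
treating query_timeout as the pause between attempts is an assumption.

```python
import time


class DataSourceSketch:
    """Hypothetical stand-in for the datasource base class."""

    query_max_retries = 3
    query_timeout = 0.0  # assumed here: seconds to wait between attempts

    def query_retry_reset(self, exception_instance):
        """Hook for subclasses; runs after each failed attempt."""

    def query_retry(self, func, *args, **kwargs):
        """Run func, giving the datasource a chance to recover on error."""
        for _ in range(self.query_max_retries):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                self.query_retry_reset(exc)
                time.sleep(self.query_timeout)
        return None


class ReconnectingSource(DataSourceSketch):
    """Example subclass that counts how often it had to recover."""

    def __init__(self):
        self.resets = 0

    def query_retry_reset(self, exception_instance):
        self.resets += 1
```

Keeping the loop in the base class means every datasource retries the
same way, while the hook leaves recovery details to each datasource.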
Change-Id: I33e9dc2d1f5ba8f83fcf1488ff583ca5be5529cc
This patch adds Placement to Watcher.
We plan to improve the data model and strategies in
future specs.
Change-Id: I7141459eef66557cd5d525b5887bd2a381cdac3f
Implements: blueprint support-placement-api
The [nova_client]/api_version defaults to 2.56 since
change Idd6ebc94f81ad5d65256c80885f2addc1aaeaae1. There
is compatibility code for that change but if 2.56 is
not available watcher_non_live_migrate_instance will
still fail if a destination host is used.
Since 2.56 has been available since the Queens version of
nova it should be reasonable to require at least that
version of nova is running for using Watcher.
This adds code which enforces the minimum version along
with a release note and "watcher-status upgrade check"
check method.
Note that it's kind of weird for watcher to have a config
option like nova_client.api_version since compute API
microversions are per API request even though novaclient
is constructed with the single configured version. It should
really be something the client (watcher in this case) determines
using version discovery and gracefully enables features if
the required nova API version is available, but that's a bigger
change.
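A minimal sketch of a minimum version check, assuming hypothetical
function names. Microversions are compared as integer pairs because a
plain string comparison would wrongly rank "2.9" above "2.56".

```python
MIN_NOVA_API_VERSION = "2.56"


def parse_microversion(version):
    """Turn a "major.minor" string into a tuple of ints."""
    major, minor = version.split(".")
    return int(major), int(minor)


def check_min_nova_api_version(configured):
    """Raise ValueError if the configured version is below the minimum."""
    if parse_microversion(configured) < parse_microversion(
            MIN_NOVA_API_VERSION):
        raise ValueError(
            "nova API version %s is older than the required minimum %s"
            % (configured, MIN_NOVA_API_VERSION))
```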
Change-Id: Id34938c7bb8a5ca934d997e52cac3b365414c006
Override the metric map of each datasource as soon as it is created by
the manager. This override comes from a file whose path is provided by
a setting in the config file.
Loading at creation time allows the correct datasource to be used when
get_backend is called. This allows loading a datasource whose metric
names get updated outside of Watcher's codebase.
The function 'load_metric_map' returns an empty dict in any error case.
Also, if the file is empty and safe_load is unable to find any
yaml documents, safe_load will return None. [1]
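The override step can be sketched without any YAML handling, assuming a
hypothetical helper `merge_metric_map` and made-up metric names. The
`override` argument stands for whatever was parsed from the file.

```python
def merge_metric_map(default_map, override):
    """Return a copy of default_map updated with the override entries.

    A None override (for example from an empty file) or any value that
    is not a dict is ignored, so the defaults are kept.
    """
    merged = dict(default_map)
    if isinstance(override, dict):
        merged.update(override)
    return merged


defaults = {"host_cpu_usage": "default.cpu", "host_ram_usage": "default.ram"}
overridden = merge_metric_map(defaults, {"host_cpu_usage": "custom.cpu"})
unchanged = merge_metric_map(defaults, None)
```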
Some minor refactoring in the test_manager file for readability and
added tests for file load and metric override.
[1] https://pyyaml.org/wiki/PyYAMLDocumentation
Change-Id: I1df16245f4c7dfd34066f3ab0553cd67154faa58
Implements: blueprint file-based-metric-map
Some users may want to create keystoneclient by specifying the
type of endpoint and region name, so we need to provide options
that let users choose them.
Implements: blueprint support-keystoneclient-option
Change-Id: I49b33a69ec99d2a91568ce27ef89dc80b75e7091
Allows defining a global preference for metric datasources with the
ability for strategy specific overrides. In addition, strategies which
do not require datasources have the config options removed. This is
done to prevent confusion.
Some documentation that details the inner workings of selecting
datasources is updated.
Imports for some files in watcher/common have been changed to resolve
circular dependencies and now match the overall method to import
configuration.
Additional datasources will be retrieved by the manager if the
datasource throws an error.
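The selection logic can be sketched as below. The function names and the
`factories` mapping are assumptions for illustration and do not reflect
the manager's real interface.

```python
def preferred_datasources(global_pref, strategy_pref=None):
    """Use the strategy specific order if set, else the global one."""
    return list(strategy_pref) if strategy_pref else list(global_pref)


def first_working_datasource(names, factories):
    """Return (name, backend) for the first datasource that loads.

    `factories` maps a datasource name to a callable that builds it.
    A datasource that raises is skipped and the next one is tried.
    """
    for name in names:
        try:
            return name, factories[name]()
        except Exception:
            continue
    raise RuntimeError("no datasource available")


def _unreachable():
    raise ConnectionError("datasource is down")


factories = {"gnocchi": _unreachable, "grafana": lambda: "grafana-backend"}
order = preferred_datasources(["gnocchi", "grafana"])
chosen = first_working_datasource(order, factories)
```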
Implements: blueprint global-datasource-preference
Change-Id: I6fc455b288e338c20d2c4cfec5a0c95350bebc36