Commit Graph

105 Commits

Author SHA1 Message Date
Alfredo Moralejo
90009aac84 Check result of retype action based on type and status
Currently, when there is a volume_migrate action and migration_type is
`retype`, watcher assumes that the retype always triggers a migration
and checks the result of the retype based on the fields related to
the migration action (actually, it uses the same function to check the
result when `migration_type` is `retype` or `migrate`). This creates
problems in different scenarios:

- Actions stay in ONGOING status forever for volumes which have never
  been migrated, as the migration fields of the volume are empty.
- Volumes which were migrated at any time before still have the old
  values, so the status of the retype actions may be reported wrongly.

This patch implements an entirely new function to check the result
of a retype action based on the final type and the status field of the
volume. This should be valid for any kind of retype action, with or
without migration. The criterion for a successful retype is that the type
of the volume is the destination one in the action and the status is
available or in-use.
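
A minimal sketch of the success criterion described above, not the actual Watcher code. The `Volume` class and the `retype_succeeded` helper are hypothetical names used only for illustration; the two accepted statuses come from the commit message.

```python
from dataclasses import dataclass


# Hypothetical stand-in for the volume object returned by the storage API.
@dataclass
class Volume:
    volume_type: str
    status: str


# Statuses the commit message treats as a settled volume.
RETYPE_DONE_STATUSES = ("available", "in-use")


def retype_succeeded(volume: Volume, dest_type: str) -> bool:
    """Return True when the volume has the destination type and a settled status."""
    return (volume.volume_type == dest_type
            and volume.status in RETYPE_DONE_STATUSES)
```

Because the check looks only at the type and the status, it does not depend on whether the retype involved a migration.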

Closes-Bug: #2112100

Change-Id: I76e91ed99e7a814a43a6dd906b6bcc150d471624
Signed-off-by: jgilaber <jgilaber@redhat.com>
2025-09-01 16:59:38 +02:00
Zuul
58b25101e6 Merge "Return HTTP code 400 when creating an audit with wrong parameters" 2025-05-27 19:23:25 +00:00
Zuul
20f231054a Merge "Set actionplan state to FAILED if any action has failed" 2025-05-26 14:44:37 +00:00
Alfredo Moralejo
88d81c104e Set actionplan state to FAILED if any action has failed
Currently, an actionplan state is set to SUCCEEDED once the execution
has finished, but that does not imply that all the actions finished
successfully.

This patch checks the actual state of all the actions in the plan
after the execution has finished. If any action has status FAILED, it
sets the state of the action plan to FAILED and applies the
appropriate notification parameters. This is the expected behavior according
to Watcher documentation.
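
The rule can be sketched as follows. This is an illustration under the assumption that action states are plain strings; `final_plan_state` is a hypothetical helper, not a Watcher function.

```python
def final_plan_state(action_states):
    """Return FAILED if any action failed, otherwise SUCCEEDED."""
    if any(state == "FAILED" for state in action_states):
        return "FAILED"
    return "SUCCEEDED"
```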

The patch is also fixing the unit test for this to set the expected
action plan state to FAILED and notification parameters.

Closes-Bug: #2106407
Change-Id: I7bfc6759b51cd97c26ec13b3918bd8d3b7ac9d4e
2025-05-26 14:58:03 +02:00
Zuul
26e36e1620 Merge "Handle missing dst_node parameter in zone_migration" 2025-05-20 17:14:29 +00:00
Zuul
3585e0cc3e Merge "Drop code from Host maintenance strategy migrating instance to disabled hosts" 2025-05-16 18:18:26 +00:00
jgilaber
c6302edeca Handle missing dst_node parameter in zone_migration
For compute nodes, nova works fine if a destination node is not
specified, so this change makes sure we're not passing None when the
user does not set one to avoid an error.
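
One way to avoid passing None is to add the destination only when it is set. The sketch below assumes a plain dict of action parameters; the function name and the key names are hypothetical and are not taken from the Watcher code.

```python
def build_migrate_params(migration_type, src_node, dst_node=None):
    """Build action parameters, leaving out the destination when unset."""
    params = {"migration_type": migration_type, "source_node": src_node}
    # Only include the destination when the user actually provided one,
    # so that nova is free to pick a host itself.
    if dst_node is not None:
        params["destination_node"] = dst_node
    return params
```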

Partial-Bug: 2108988

Change-Id: Ida1f18b97697c041819e29f935aa5e232848226a
2025-05-16 13:51:47 +02:00
Alfredo Moralejo
4629402f38 Return HTTP code 400 when creating an audit with wrong parameters
Currently, when trying to create an audit which misses a mandatory
parameter, watcher returns error 500 instead of 400, which is the
documented error in the API [1] and the appropriate error code for
malformed requests.

This patch catches parameter validation errors according to the json
schema for each strategy and returns error 400. It also fixes the
unit test to validate the expected behavior.
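
A simplified sketch of the idea, using only the standard library. The real change validates against each strategy's JSON schema; here only the `required` key of a schema-like dict is checked, and `InvalidParameters`, `validate_parameters` and `create_audit` are hypothetical names. The 201 status on success is also an assumption made for the example.

```python
from http import HTTPStatus


class InvalidParameters(Exception):
    """Raised when the audit parameters do not match the strategy schema."""


def validate_parameters(params, schema):
    # Simplified check: only the "required" list of the schema is enforced.
    missing = sorted(k for k in schema.get("required", []) if k not in params)
    if missing:
        raise InvalidParameters(
            "missing required parameters: " + ", ".join(missing))


def create_audit(params, schema):
    """Return an (HTTP status, message) pair for an audit creation request."""
    try:
        validate_parameters(params, schema)
    except InvalidParameters as exc:
        # A malformed request is a client error, not a server error.
        return HTTPStatus.BAD_REQUEST, str(exc)
    return HTTPStatus.CREATED, "audit created"
```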

[1] https://docs.openstack.org/api-ref/resource-optimization/#audits

Closes-Bug: #2110538
Change-Id: I23232b3b54421839bb01d54386d4e7b244f4e2a0
2025-05-16 09:35:50 +02:00
Zuul
86a260a2c7 Merge "Set keystone_client default interface to public" 2025-05-15 12:45:52 +00:00
Chandan Kumar (raukadah)
9dea55bd64 Drop code from Host maintenance strategy migrating instance to disabled hosts
Currently, the host maintenance strategy also migrates instances from the
maintenance node to watcher_disabled compute nodes.

watcher_disabled compute nodes might be disabled for some other purpose
by a different strategy. If host maintenance uses those compute nodes for
migration, it might affect customer workloads.

The host maintenance strategy should never touch disabled hosts unless the
user specifies a disabled host as the backup node.

This change drops the logic for using disabled compute nodes for maintenance.
Host maintenance already uses the nova scheduler for migrating
instances and will keep doing so. If there is no available node, the strategy
will fail.

Closes-Bug: #2109945

Change-Id: If9795fd06f684eb67d553405cebd8a30887c3997
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-05-14 09:24:25 +05:30
Douglas Viroel
17d1cf535a Deprecated Noisy Neighbor strategy
Noisy neighbor strategy is a proof of concept strategy that was
built based on LLC metric, which is not available in Nova since
Victoria release[1].
This patch marks this strategy as deprecated, to be removed in
future releases.

[1] https://docs.openstack.org/releasenotes/nova/victoria.html#relnotes-22-0-0-unmaintained-victoria-upgrade-notes

Change-Id: I940b88555007312c76a86706bd44a38fbcf7701e
2025-05-12 15:44:39 -03:00
jgilaber
ae48f65f20 Set keystone_client default interface to public
Set the default interface for keystone_client to public in the watcher
conf instead of admin.

Closes-Bug: 2109494

Change-Id: I9e0289249981ca965190df6dbdc37e09fd0951d7
2025-05-09 08:16:51 +02:00
Sean Mooney
57b248f9fe Add support for pyproject.toml and wsgi module paths
pip 23.1 removed the "setup.py install" fallback for projects that do
not have pyproject.toml and now uses a pyproject.toml which is vendored
in pip [1][2]. pip 24.2 has now deprecated a similar fallback to
"setup.py develop" and plans to fully remove this in pip 25.0 [3][4][5].
pbr supports editable installs since 6.0.0.

pip 25.1 has now been released and the removal is complete. This change
adds our own minimal pyproject.toml to ensure we are using the
correct build system.

This change also requires that we adapt how we generate our wsgi
entry point. When pyproject.toml is used, the wsgi console script is
not generated in an editable install such as the one used in devstack.

To address this we need to refactor our usage of our wsgi application
to use a module path instead. This change does not remove
the declaration of our wsgi_script entry point, but it should
be considered deprecated and it will be removed in the future.

To unblock the gate, the devstack plugin is modified to deploy
using the wsgi module instead of the console script.

Finally, support for the mod_wsgi wsgi mode is removed.
It was deprecated in devstack a few cycles ago and
support was removed in I8823e98809ed6b66c27dbcf21a00eea68ef403e8.

[1] https://pip.pypa.io/en/stable/news/#v23-1
[2] https://github.com/pypa/pip/issues/8368
[3] https://pip.pypa.io/en/stable/news/#v24-2
[4] https://github.com/pypa/pip/issues/11457
[5] https://ichard26.github.io/blog/2024/08/whats-new-in-pip-24.2/
Closes-Bug: #2109608

Depends-on: https://review.opendev.org/c/openstack/watcher/+/948502
Change-Id: Iad77939ab0403c5720c549f96edfc77d2b7d90ee
2025-05-01 00:19:59 +00:00
Alfredo Moralejo
a65e7e9b59 Query by fqdn_label instead of instance for host metrics
Currently we are using the `instance` label to query prometheus about
host metrics. This label is assigned to the url of each endpoint being
scraped.

While this works fine in one-exporter-per-compute cases, as the driver is
mapping the fqdn_label value to the `instance` label value, it fails
when there is more than one target with the same value for the fqdn
label. This is a valid case: it allows querying by fqdn without
caring about which exporter in the host is providing the metric.

This patch changes the queries we use for hosts to be based on the
fqdn_label instead of the instance one. To implement it, we are also
simplifying the way we check that the metric exists for the host by
converting prometheus_fqdn_instance_map into a prometheus_fqdn_labels set
which stores the fqdns found in prometheus.
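
As an illustration of querying by the fqdn label, the sketch below builds a PromQL selector and checks host membership in a set. The function names, the default label name `fqdn` and the metric name in the usage note are assumptions made for the example, not the driver's actual code.

```python
def host_metric_selector(metric, fqdn, fqdn_label="fqdn"):
    """Build a PromQL selector that matches a host by its fqdn label."""
    return '%s{%s="%s"}' % (metric, fqdn_label, fqdn)


def host_is_known(fqdn, known_fqdns):
    """Check a host against the set of fqdn values found in prometheus."""
    return fqdn in known_fqdns
```

For example, `host_metric_selector("node_load1", "compute-0.example.com")` matches every target on that host, whichever exporter reports the metric.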

Closes-Bug: #2103451
Change-Id: I3bcc317441b73da5c876e53edd4622370c6d575e
2025-03-19 15:25:24 +01:00
Sean Mooney
bbf5c41cab Add epoxy prelude
This change adds the prelude for the 2025.1 Epoxy release cycle.

Change-Id: I8223842a57491a91c565e47bd1819db4d142e628
2025-03-05 17:57:55 +00:00
Takashi Kajinami
977f014cba Deprecate Monasca data source
The Monasca project was marked inactive during 2023.1. Although we have
seen multiple people showing interest to keep the project, we haven't
seen any real progress.

Because the project is likely to be retired soon, let's deprecate the feature
dependent on Monasca so that we can remove it in a future release.

Change-Id: Ifd64f5ba59bbac238ff62302ec36a3e36954d6d0
2025-02-16 18:45:31 +09:00
Zuul
4527f89d8d Merge "Add support for instance metrics to prometheus datasource" 2025-02-03 13:22:28 +00:00
Zuul
e535177bc0 Merge "Remove ceilometer datasource" 2025-01-29 13:22:46 +00:00
Alfredo Moralejo
136e5d927c Add support for instance metrics to prometheus datasource
In order to support the vm_workload_consolidation, workload_balance and
workload_stabilization strategies, some instance metrics are required.
This patch adds support for them.

Implementation is based on a prometheus store populated using sg-core
from ceilometer metrics with Pollster source.

- instance_ram_usage: rely on ceilometer_memory_usage metrics created from
  ceilometer memory.usage meter.
- instance_ram_allocated: rely on the memory value provided by the
  inventory created from nova and placement APIs.
- instance_cpu_usage: rely on ceilometer_cpu metric created from
  ceilometer cpu meter. A max value of 100 is set in the query.
- instance_root_disk_size: rely on the `disk` value provided by the
  inventory created from nova and placement APIs.

A new parameter `instance_uuid_label` has been added to the prometheus
datasource configuration to identify the label used to store the value of the
OpenStack instance uuid for each instance metric in prometheus. The default
value is `resource`.
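
The mapping listed above can be summarised in a small table. In this sketch the source names ("prometheus", "inventory") and the `instance_selector` helper are hypothetical; the metric names and the default label `resource` come from the commit message.

```python
# Where each instance metric comes from, according to the commit message.
INSTANCE_METRIC_SOURCES = {
    "instance_ram_usage": ("prometheus", "ceilometer_memory_usage"),
    "instance_ram_allocated": ("inventory", "memory"),
    "instance_cpu_usage": ("prometheus", "ceilometer_cpu"),
    "instance_root_disk_size": ("inventory", "disk"),
}


def instance_selector(metric_name, instance_uuid, uuid_label="resource"):
    """Build a PromQL selector for prometheus-backed instance metrics.

    Returns None for metrics that are read from the inventory instead.
    """
    source, name = INSTANCE_METRIC_SOURCES[metric_name]
    if source != "prometheus":
        return None
    return '%s{%s="%s"}' % (name, uuid_label, instance_uuid)
```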

Change-Id: I2f2b56aa002014e511a5e48398ef1da43fc4f5e2
2025-01-23 13:23:04 +01:00
m
3f26dc47f2 Add prometheus data source for watcher decision engine
This adds a new data source for the Watcher decision engine that
implements the watcher.decision_engine.datasources.DataSourceBase.

The related spec was merged at [1].

Implements: blueprint prometheus-datasource

[1] https://review.opendev.org/c/openstack/watcher-specs/+/933300

Change-Id: I6a70c4acc70a864c418cf347f5f6951cb92ec906
2025-01-10 15:20:37 +02:00
Zuul
70ba13ca6d Merge "Update python versions, drop py3.8" 2024-12-21 01:58:27 +00:00
Takashi Kajinami
da23fdc621 Remove ceilometer datasource
This datasource requires Ceilometer API which was already removed some
years ago. The implementation should have been removed when dependency
on ceilometerclient was removed by [1].

Also remove some job definitions which are not actually used.

[1] 01d74d0a87

Change-Id: I29c3865dc1207f1bbbb266e4217cf8888afebfb6
2024-12-16 23:51:27 +09:00
Sean Mooney
5f79ab87c7 [pre-commit] fix typos and configure codespell
This change enables codespell in pre-commit and
fixes the existing typos.

A followup commit will enable this in tox and ci.

Change-Id: I0a11bcd5a88247a48d3437525fc8a3cb3cdd4e58
2024-11-07 19:50:21 +00:00
Martin Kopec
6adaedf696 Update python versions, drop py3.8
The current testing runtime [1] states testing from py3.9
to 3.12. The patch updates setup.cfg to reflect the correct
python versions.

The patch also drops python 3.8 support following [2].

[1] https://governance.openstack.org/tc/reference/runtimes/2025.1.html
[2] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/FOWV4UQZTH4DPDA67QDEROAESYU5Z3LE/

Change-Id: I2d13409c9bfffc866e31af52611a26f6037021cc
2024-11-06 16:00:11 +01:00
Sean Mooney
9d8b990fd1 [pre-commit] Add initial pre-commit config
This change adds configuration for the pre-commit tool,
follow-up changes will address the remaining issues in a phased
approach to make the reviews simpler.

This is based on the pre-commit config used in nova
with some additional hooks.

Follow-up changes will address the FIXME comments
related to sphinx-lint and codespell, as well as update tox
to enforce these checks in ci.

Change-Id: I87681a19f7fa88366c2b0d310c8b3153aa6a137b
2024-10-22 20:12:53 +01:00
Ghanshyam Mann
863815153e [goal] Deprecate the JSON formatted policy file
As per the community goal of migrating the policy file
format from JSON to YAML [1], we need to do two things:

1. Change the default value of the '[oslo_policy] policy_file'
config option from 'policy.json' to 'policy.yaml' with
upgrade checks.

2. Deprecate the JSON formatted policy file on the project side
via warning in doc and releasenotes.

Also replace policy.json references with policy.yaml in docs and tests.

[1] https://governance.openstack.org/tc/goals/selected/wallaby/migrate-policy-format-from-json-to-yaml.html

Change-Id: I207c02ba71fe60635fd3406c9c9364c11f259bae
2021-02-12 19:59:27 +00:00
Zuul
2591b03625 Merge "Add releasenote for event-driven-optimization-based" 2020-02-13 07:05:04 +00:00
licanwei
58083bb67b releasenotes: Fix reference url
Change-Id: I0da6021f6d39cb7d6e79e8f637046d8dd0285647
2020-02-05 16:48:49 +08:00
licanwei
f79321ceeb Add releasenote for event-driven-optimization-based
Change-Id: If8fa82dab2e7f0ae359805eb68cc8562cfc641e3
Implements: blueprint event-driven-optimization-based
2020-02-04 03:46:32 +00:00
Dantali0n
ba43f766b8 Releasenote for decision engine threadpool
Add the releasenote for the general purpose decision engine threadpool.
Including config parameters and how contributors can find relevant
documentation.

Implements: blueprint general-purpose-decision-engine-threadpool

Change-Id: I3560069b4e34f13305950559a0f05f7921f7867e
2019-11-30 03:13:15 +00:00
Ghanshyam Mann
17f5a65a62 [ussuri][goal] Drop python 2.7 support and testing
OpenStack is dropping py2.7 support in the Ussuri cycle.

Watcher is ready for python 3, so it is fine to drop
python 2.7 support.

Complete discussion & schedule can be found in
- http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010142.html
- https://etherpad.openstack.org/p/drop-python2-support

Ussuri Community-wide goal:
https://governance.openstack.org/tc/goals/selected/ussuri/drop-py27.html

Depends-On: https://review.opendev.org/#/c/693631/

Change-Id: I603c6d2c22779e8ef2e70eb6369fc521a77c9c3a
2019-11-16 14:55:01 +00:00
licanwei
a88e076646 Watcher planner selector releasenote
Change-Id: I632a59d9e3cb6f5d0dad8987b1b01934d9ce0b42
Implements: bp watcher-planner-selector
2019-09-18 01:59:01 -07:00
Zuul
67e9e16d62 Merge "node resource consolidation" 2019-09-16 14:44:50 +00:00
chenke
03a6216da0 Add releasenote about bp show-datamodel-api
Partially Implements:blueprint show-datamodel-api

Change-Id: I2f8a41cd8f9f805bd3796cbd639bec233546b521
2019-09-10 09:39:10 +08:00
licanwei
f1fe4b6c62 node resource consolidation
This strategy is used to consolidate VMs onto as few nodes as possible
by VM migration. The user can set an input parameter to decide how to
select the destination node.

Implements: blueprint node-resource-consolidation
Closes-Bug: #1843016
Change-Id: I104c864d532c2092f5dc6f0c8f756ebeae12f09e
2019-09-06 18:03:43 -07:00
licanwei
4b2238f9a5 add releasenote for bp improve-compute-data-model
Change-Id: I19780be28912cb0ea1cad49c7c0f43ab3ba8f6e7
Implements: blueprint improve-compute-data-model
2019-08-09 03:06:43 +00:00
Dantali0n
cadc000f32 Add call_retry for ModelBuilder for error recovery
Add call_retry method for ModelBuilder classes along with configuration
options. This allows ModelBuilder classes to reattempt any failed calls
to external services such as Nova or Ironic.

Change-Id: Ided697adebed957e5ff13b4c6b5b06c816f81c4a
2019-07-19 16:09:18 +02:00
Dantali0n
a45f5abe48 Releasenote for grafana datasource
This is the releasenote for the new grafana datasource it refers to
the documentation on configuring grafana.

Depends-on: Ib12b6a7882703e84a27c301e821c1a034b192508
Change-Id: Icb3939d772f06ad2d66eeba9a59fa8b60822ece0
2019-07-02 10:20:49 +02:00
licanwei
c1a5e443fe Add uWSGI support
This patch implements uWSGI support for the Watcher API service.
Because mod_wsgi is deprecated, uWSGI is used to replace it.
Most OpenStack projects have already done this.

Closes-Bug: #1834392
Change-Id: I3fad8d30a15aba493fb91da9337c2515ddea5167
2019-06-27 14:56:52 +08:00
Zuul
667d2d661a Merge "Move datasource query_retry into baseclass." 2019-06-14 09:04:15 +00:00
Zuul
b2111baf91 Merge "Backwards compatibility for node parameter" 2019-06-14 07:42:09 +00:00
Zuul
5f126cffe0 Merge "Add Placement helper" 2019-06-14 02:21:44 +00:00
Dantali0n
584eeefdc8 Move datasource query_retry into baseclass.
Moves the query_retry method into the baseclass and makes the query
retry and timeout options part of the watcher_datasources config group.
This makes the query_retry behavior uniform across all datasources.

A new baseclass method named query_retry_reset is added so datasources
can define operations to perform when recovering from a query error.
Test cases are added to verify the behavior of query_retry.

The query_max_retries and query_timeout config parameters are
deprecated in the gnocchi_client group and will be removed in a future
release.
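
A minimal sketch of the retry pattern described above, assuming a fixed number of attempts and a sleep between them. `DataSourceSketch` and its `resets` counter are hypothetical; only the method names `query_retry` and `query_retry_reset` come from the commit message, and their real signatures may differ.

```python
import time


class DataSourceSketch:
    """Hypothetical base class showing a shared retry loop with a reset hook."""

    def __init__(self, max_retries=3, timeout=0.0):
        self.max_retries = max_retries
        self.timeout = timeout  # seconds to wait between attempts
        self.resets = 0

    def query_retry_reset(self, exc):
        """Hook for subclasses, e.g. to rebuild a client after an error."""
        self.resets += 1

    def query_retry(self, func, *args, **kwargs):
        """Call func until it succeeds or the attempts are used up."""
        for _ in range(self.max_retries):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                self.query_retry_reset(exc)
                time.sleep(self.timeout)
        # All attempts failed; the caller has to handle the missing result.
        return None
```

Keeping the loop in the base class means each datasource only has to override the reset hook.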

Change-Id: I33e9dc2d1f5ba8f83fcf1488ff583ca5be5529cc
2019-06-13 15:52:53 +02:00
Dantali0n
dd119ca1f8 Backwards compatibility for node parameter
Adds backwards compatibility for node parameter used by strategies. If
the node value is set by the user configuration it will override the
value for compute_node which is the value used by the strategies now.
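
The override can be sketched like this, assuming the configuration is available as a plain dict. `normalize_node_parameter` is a hypothetical helper, not the function added by this change.

```python
def normalize_node_parameter(config):
    """Map the legacy 'node' key onto 'compute_node', letting it win."""
    result = dict(config)  # leave the caller's dict untouched
    if "node" in result:
        # A user-supplied legacy value overrides the new key.
        result["compute_node"] = result.pop("node")
    return result
```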

This change was introduced in: https://review.opendev.org/#/c/656622/
Resolution discussed in the meeting on the 5th of June 2019
https://eavesdrop.openstack.org/meetings/watcher/2019/watcher.2019-06-05-08.00.log.html

Change-Id: Idaea062789a6b169e64f556fecc34cfbaaee5076
2019-06-12 16:23:58 +00:00
licanwei
b57feba5e8 Add Placement helper
This patch adds a Placement helper to Watcher.
We plan to improve the data model and strategies in
future specs.

Change-Id: I7141459eef66557cd5d525b5887bd2a381cdac3f
Implements: blueprint support-placement-api
2019-06-12 11:11:13 +08:00
Zuul
88fb097539 Merge "Audit API supports new force option" 2019-05-29 10:01:04 +00:00
Zuul
59306b9a47 Merge "Require nova_client.api_version >= 2.56" 2019-05-27 03:01:47 +00:00
licanwei
2afd0dfcf5 Audit API supports new force option
Depends-on:Ia08694d2fb76907ea14e64116af2e722fe930063

Change-Id: Ib2d221ea9c994dea396c54cc8d2d32237025a1d4
Implements: blueprint add-force-field-to-audit
2019-05-27 02:08:33 +00:00
Zuul
ba92791117 Merge "support-keystoneclient-option" 2019-05-25 07:01:10 +00:00
Matt Riedemann
7489126d83 Require nova_client.api_version >= 2.56
The [nova_client]/api_version defaults to 2.56 since
change Idd6ebc94f81ad5d65256c80885f2addc1aaeaae1. There
is compatibility code for that change but if 2.56 is
not available watcher_non_live_migrate_instance will
still fail if a destination host is used.

Since 2.56 has been available since the Queens version of
nova it should be reasonable to require at least that
version of nova is running for using Watcher.

This adds code which enforces the minimum version along
with a release note and "watcher-status upgrade check"
check method.
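
A minimum-version check of this kind can be sketched as below. The helper names are hypothetical and the real enforcement code may look different; the point shown is that microversions are compared as integer pairs, not as strings or floats.

```python
MIN_NOVA_API_VERSION = "2.56"


def parse_microversion(version):
    """Turn a 'major.minor' string into a tuple of ints for comparison."""
    major, minor = version.split(".")
    return int(major), int(minor)


def check_nova_api_version(configured):
    """Raise ValueError when the configured version is below the minimum."""
    if parse_microversion(configured) < parse_microversion(MIN_NOVA_API_VERSION):
        raise ValueError(
            "nova_client.api_version must be at least %s, got %s"
            % (MIN_NOVA_API_VERSION, configured))
    return configured
```

Comparing tuples avoids treating "2.9" as newer than "2.56", which a float or string comparison would do.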

Note that it's kind of weird for watcher to have a config
option like nova_client.api_version since compute API
microversions are per API request even though novaclient
is constructed with the single configured version. It should
really be something the client (watcher in this case) determines
using version discovery and gracefully enables features if
the required nova API version is available, but that's a bigger
change.

Change-Id: Id34938c7bb8a5ca934d997e52cac3b365414c006
2019-05-23 15:49:19 -04:00