2580 Commits

Author SHA1 Message Date
Zuul
58b25101e6 Merge "Return HTTP code 400 when creating an audit with wrong parameters" 2025-05-27 19:23:25 +00:00
Zuul
690a389369 Merge "Add a unit test to check the error when creating an audit with wrong parameters" 2025-05-27 19:23:23 +00:00
Zuul
1cdd392f96 Merge "Remove deprecated executor in message handling servers" 2025-05-26 14:44:39 +00:00
Zuul
20f231054a Merge "Set actionplan state to FAILED if any action has failed" 2025-05-26 14:44:37 +00:00
Zuul
077c36be8a Merge "Add unit test to check action plan state when a nested action fails" 2025-05-26 14:27:08 +00:00
Alfredo Moralejo
88d81c104e Set actionplan state to FAILED if any action has failed
Currently, an actionplan state is set to SUCCEEDED once the execution
has finished, but that does not imply that all the actions finished
successfully.

This patch checks the actual state of all the actions in the plan
after the execution has finished. If any action has status FAILED, it
sets the state of the action plan to FAILED and applies the
appropriate notification parameters. This is the expected behavior according
to Watcher documentation.

The patch is also fixing the unit test for this to set the expected
action plan state to FAILED and notification parameters.

Closes-Bug: #2106407
Change-Id: I7bfc6759b51cd97c26ec13b3918bd8d3b7ac9d4e
2025-05-26 14:58:03 +02:00
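The state check described in 88d81c104e can be sketched roughly as follows. This is a minimal illustration, not Watcher's actual applier code: the function name `final_plan_state` and the plain-string states are assumptions made for the example.

```python
# Hypothetical sketch: derive the final action plan state from the states
# of its actions once execution has finished. Names are illustrative only.
SUCCEEDED = "SUCCEEDED"
FAILED = "FAILED"


def final_plan_state(action_states):
    """Return FAILED if any action failed, otherwise SUCCEEDED."""
    if any(state == FAILED for state in action_states):
        return FAILED
    return SUCCEEDED
```

Under these assumptions, a plan whose actions ended as `["SUCCEEDED", "FAILED"]` is reported as FAILED rather than SUCCEEDED.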
Zuul
8ac8a29fda Merge "Fix incorrect logging format" 2025-05-26 11:47:26 +00:00
Zuul
cd2910b0e9 Merge "Check logs in some cinder and nova helper tests" 2025-05-26 11:45:12 +00:00
Chandan Kumar (raukadah)
188e583dcb Drop sg_core related prometheus var
https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/950476
adds the support for passing custom scrape target and
https://github.com/openstack-k8s-operators/sg-core/pull/25
drops sg_core prometheus related vars.

So we also need to drop the sg_core related prometheus vars from our job.
This cr achieves the same.

Depends-On: https://github.com/openstack-k8s-operators/sg-core/pull/25
Depends-On: https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/950476

Change-Id: I6c8f54f8749e81b532c88e9224022294c4a1d331
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-05-21 16:52:36 +05:30
Zuul
26e36e1620 Merge "Handle missing dst_node parameter in zone_migration" 2025-05-20 17:14:29 +00:00
Sean Mooney
040a7f5c41 update tests for new oslo.context release
context.user has been deprecated for years
and renamed to user_id

the deprecated field has now been removed so this
change updates our test cases to reflect that.

Change-Id: I120441fb9392c370c57dc63d8c115d8993d25f62
2025-05-19 19:11:23 +01:00
Zuul
3585e0cc3e Merge "Drop code from Host maintenance strategy migrating instance to disabled hosts" 2025-05-16 18:18:26 +00:00
Zuul
ba8370e1ad Merge "Migrate value column of efficacy indicator on load" 2025-05-16 18:16:23 +00:00
Zuul
97c4e70847 Merge "Add test for missing destination in zone migration" 2025-05-16 17:10:18 +00:00
jgilaber
c6302edeca Handle missing dst_node parameter in zone_migration
For compute nodes, nova works fine if a destination node is not
specified, so this change makes sure we're not passing None when the
user does not set one to avoid an error.

Partial-Bug: 2108988

Change-Id: Ida1f18b97697c041819e29f935aa5e232848226a
2025-05-16 13:51:47 +02:00
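The idea in c6302edeca, leaving the destination out instead of passing None, might look like the sketch below. The function and key names (`build_migrate_kwargs`, `dest_hostname`) are hypothetical and are not taken from the Watcher code base.

```python
# Hypothetical sketch: only include the destination when the user set one,
# so that None is never forwarded to the compute API call.
def build_migrate_kwargs(instance_id, dst_node=None):
    """Build keyword arguments for an illustrative migrate call."""
    kwargs = {"instance_id": instance_id}
    if dst_node is not None:
        # A destination was requested explicitly, so pass it through.
        kwargs["dest_hostname"] = dst_node
    # Without a destination the key is simply absent and the scheduler
    # is free to pick a node.
    return kwargs
```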
Alfredo Moralejo
b36ba8399e Add unit test to check action plan state when a nested action fails
This patch is adding a new unit test to check the behavior of the action
plan when one of the actions in it fails during execution.

Note this is to show a bug, and the expected state will be changed in
the fixing patch.

Related-Bug: #2106407
Change-Id: I2f3fe8f4da772a96db098066d253e5dee330101a
2025-05-16 09:52:28 +02:00
Alfredo Moralejo
4629402f38 Return HTTP code 400 when creating an audit with wrong parameters
Currently, when trying to create an audit that is missing a mandatory
parameter, watcher returns error 500 instead of 400, which is the
documented error in the API [1] and the appropriate error code for
malformed requests.

This patch catches parameter validation errors according to the json
schema for each strategy and returns error 400. It also fixes the
unit test to validate the expected behavior.

[1] https://docs.openstack.org/api-ref/resource-optimization/#audits

Closes-Bug: #2110538
Change-Id: I23232b3b54421839bb01d54386d4e7b244f4e2a0
2025-05-16 09:35:50 +02:00
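The behavior described in 4629402f38 can be illustrated with a small, standard-library-only sketch. It assumes a simplified schema that only lists `required` parameter names; the exception class, function names, and the 201 success status are all hypothetical stand-ins, not Watcher's API code.

```python
# Hypothetical sketch: turn a parameter validation failure into a 400
# response instead of letting it surface as an unhandled 500.
class InvalidParameters(Exception):
    """Illustrative error raised when audit parameters fail validation."""


def validate_parameters(parameters, schema):
    """Check that every name listed under 'required' is present."""
    missing = [name for name in schema.get("required", [])
               if name not in parameters]
    if missing:
        raise InvalidParameters(
            "missing required parameters: " + ", ".join(missing))


def create_audit(parameters, schema):
    """Return an (HTTP status, body) pair for an illustrative request."""
    try:
        validate_parameters(parameters, schema)
    except InvalidParameters as exc:
        # Malformed request: report it as a client error.
        return 400, {"error": str(exc)}
    return 201, {"parameters": parameters}
```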
Zuul
86a260a2c7 Merge "Set keystone_client default interface to public" 2025-05-15 12:45:52 +00:00
jgilaber
63626d6fc3 Add test for missing destination in zone migration
Add some tests to show that the zone migration strategy generates
problematic input parameters for actions in some cases when destination
parameters are not passed for instances or volumes.

Change-Id: Idc3af0e6d9d2d5388ff3d152d81e63364758607b
2025-05-15 13:00:39 +02:00
afanasev.s
0f5b6a07d0 Fix incorrect logging format
Fix the logging format for calls with multiple variables. Because of the
incorrect format, these calls did not work correctly and some log messages
were skipped. The logging calls require two arguments, but they were passed
as a tuple, so the tuple was interpreted as a single argument and formatting
failed because the second argument was missing.

Closes-Bug: 2110149

Change-Id: I74ed44134b50782c105a0e82f3af34a5fa45d119
2025-05-15 12:55:18 +02:00
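The failure mode described in 0f5b6a07d0 can be reproduced with the standard `logging` module alone. The sketch below is an illustration with made-up message text and logger name; it collects formatted messages in a small custom handler so that the formatting error becomes visible.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages and formatting errors for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []
        self.errors = []

    def emit(self, record):
        try:
            self.messages.append(record.getMessage())
        except TypeError as exc:
            # Raised when the format string has more placeholders
            # than there are arguments.
            self.errors.append(str(exc))


def run_demo():
    logger = logging.getLogger("logging_format_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        # Wrong: the tuple counts as ONE argument for two placeholders.
        logger.debug("volume %s has status %s", ("vol-1", "available"))
        # Right: each placeholder gets its own positional argument.
        logger.debug("volume %s has status %s", "vol-1", "available")
    finally:
        logger.removeHandler(handler)
    return handler.messages, handler.errors
```

Only the second call produces a message; the first one fails during formatting, which is why such log lines were lost.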
jgilaber
7d90a079b0 Check logs in some cinder and nova helper tests
Check the debug logs for some methods in the cinder and nova helpers to
reproduce the errors described in bug [1]. The logger is disabled by default,
so the error was being ignored. In order to show the error, the logger
needs to be enabled for the tests in question. The logging was disabled
by alembic configuring logging in [2], so this patch also removes that
logging config to expose the errors.

[1] https://bugs.launchpad.net/watcher/+bug/2110149.
[2] https://github.com/openstack/watcher/blob/master/watcher/db/sqlalchemy/alembic/env.py#L26

Change-Id: I3598ca1d08d260602c392f8a8098821faa53f570
2025-05-15 12:55:18 +02:00
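Why a disabled logger hides this kind of bug, as described in 7d90a079b0, can be shown with a short standard-library sketch. The logger name and message are invented for the example; the point is only that a logger with `disabled` set drops the call before any formatting happens.

```python
import logging


class CollectingHandler(logging.Handler):
    """Keep every formatted message that reaches the handler."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


def count_records(disabled):
    """Log one debug message and report how many reached the handler."""
    logger = logging.getLogger("disabled_logger_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = CollectingHandler()
    logger.addHandler(handler)
    logger.disabled = disabled
    try:
        logger.debug("volume %s deleted", "vol-1")
    finally:
        # Restore the logger so the demo leaves no state behind.
        logger.disabled = False
        logger.removeHandler(handler)
    return len(handler.records)
```

With the logger disabled nothing is emitted, so a broken format string in the call would go unnoticed; with it enabled the message (or the formatting error) becomes observable in tests.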
Alfredo Moralejo
891119470c Add a unit test to check the error when creating an audit with wrong parameters
Currently, it is returning http error code 500 instead of 400, which
would be the appropriate code.

A follow-up patch will be sent with the fix, switching the error code
and message.

Related-Bug: #2110538
Change-Id: I35ccbb9cf29fc08e78c4d5f626a6518062efbed3
2025-05-14 17:01:59 +02:00
Chandan Kumar (raukadah)
9dea55bd64 Drop code from Host maintenance strategy migrating instance to disabled hosts
Currently the host maintenance strategy also migrates instances from the
maintenance node to watcher_disabled compute nodes.

watcher_disabled compute nodes might be disabled for some other purpose
by a different strategy. If host maintenance uses those compute nodes for
migration, it might affect customer workloads.

The host maintenance strategy should never touch disabled hosts unless the
user specifies a disabled host as the backup node.

This cr drops the logic for using disabled compute nodes for maintenance.
Host maintenance already uses the nova scheduler for migrating
instances, so it will keep doing so. If there is no available node, the
strategy will fail.

Closes-Bug: #2109945

Change-Id: If9795fd06f684eb67d553405cebd8a30887c3997
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-05-14 09:24:25 +05:30
Douglas Viroel
b4ef969eec Remove deprecated executor in message handling servers
Removes the deprecated message executor when creating both RPC
and notification server instances. This parameter is deprecated[1],
as well as the eventlet option.
When not defined, the server will get the one that best fits the
current context (monkey patched or not)[2]

[1] 27d833e374
[2] 412ab4de92/oslo_messaging/_utils.py (L87)

Change-Id: I784407aa7db10bddcec5dc663e1cec65174631e0
2025-05-13 14:10:18 -03:00
jgilaber
322c89d982 Migrate value column of efficacy indicator on load
In a recent change [1] we modified the database schema for efficacy
indicators to use a 'data' column. However, that patch only contained
the schema migration and a fallback to be able to read from older
databases, and not any kind of data migration. This change introduces
a migration on load, so whenever an efficacy indicator without a 'data'
column is loaded, the column is populated in the database. The change
also modifies the migration test to verify the procedure works well.

[1] https://review.opendev.org/c/openstack/watcher/+/945199

Change-Id: Ib0621b0e03451faca803018d6a2f3ad657a25fb5
2025-05-13 16:36:59 +02:00
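The "migration on load" idea from 322c89d982 can be sketched with `sqlite3` from the standard library. The table layout and the function name `load_indicator` are assumptions for the example and do not mirror Watcher's SQLAlchemy models.

```python
import sqlite3


def load_indicator(conn, indicator_id):
    """Load an indicator, back-filling 'data' from 'value' when needed."""
    row = conn.execute(
        "SELECT id, value, data FROM efficacy_indicators WHERE id = ?",
        (indicator_id,),
    ).fetchone()
    if row is None:
        return None
    ind_id, value, data = row
    if data is None and value is not None:
        # Old row: copy the legacy value into the new column and persist
        # it, so the next load no longer needs the fallback.
        data = float(value)
        conn.execute(
            "UPDATE efficacy_indicators SET data = ? WHERE id = ?",
            (data, ind_id),
        )
        conn.commit()
    return {"id": ind_id, "data": data}


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE efficacy_indicators "
        "(id INTEGER PRIMARY KEY, value NUMERIC, data REAL)"
    )
    # A row written before the 'data' column was populated.
    conn.execute(
        "INSERT INTO efficacy_indicators (id, value, data) "
        "VALUES (1, 0.75, NULL)"
    )
    loaded = load_indicator(conn, 1)
    stored = conn.execute(
        "SELECT data FROM efficacy_indicators WHERE id = 1"
    ).fetchone()[0]
    conn.close()
    return loaded, stored
```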
Zuul
59607f616a Merge "Drop nova command reference from the code" 2025-05-13 12:39:25 +00:00
Chandan Kumar (raukadah)
3f6c7e406a Drop nova command reference from the code
In a DevStack environment, the nova service-list command does not
exist. The distro suggests installing python-novaclient from a package.

In the Strategies documentation, we generate the docs from the following
code.[1]
```
       * - ``migration``
         - .. watcher-term:: watcher.applier.actions.migration.Migrate
       * - ``change_nova_service_state``
         - .. watcher-term:: watcher.applier.actions.change_nova_service_state.ChangeNovaServiceState
```
and within the code, we use the nova python binding to list services[2];
we are not calling the openstack cli from within the code.

Documenting the equivalent openstack command does not seem to be useful
in the help text, as we are using the python binding.

Links:
[1]. c4acce91d6/doc/source/strategies/host_maintenance.rst (L45)
[2]. c4acce91d6/watcher/common/nova_helper.py (L150-L152)

Change-Id: I0c663c9741fae94bdb9c30f46d3d396325a33948
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-05-13 08:49:54 +05:30
Zuul
fd3d8b67ff Merge "Set number of decimal digits in efficacy indicator" 2025-05-13 00:06:07 +00:00
Zuul
c73f126b15 Merge "Deprecated Noisy Neighbor strategy" 2025-05-12 23:50:12 +00:00
Douglas Viroel
17d1cf535a Deprecated Noisy Neighbor strategy
Noisy neighbor strategy is a proof of concept strategy that was
built based on LLC metric, which is not available in Nova since
Victoria release[1].
This patch marks this strategy as deprecated, to be removed in
future releases.

[1] https://docs.openstack.org/releasenotes/nova/victoria.html#relnotes-22-0-0-unmaintained-victoria-upgrade-notes

Change-Id: I940b88555007312c76a86706bd44a38fbcf7701e
2025-05-12 15:44:39 -03:00
jgilaber
ae48f65f20 Set keystone_client default interface to public
Set the default interface for keystone_client to public in the watcher
conf instead of admin.

Closes-Bug: 2109494

Change-Id: I9e0289249981ca965190df6dbdc37e09fd0951d7
2025-05-09 08:16:51 +02:00
jgilaber
0ed3d4de83 Set number of decimal digits in efficacy indicator
Configure the numeric type of the EfficacyIndicator value to use Float.
Add a new column named data and deprecate the existing value column.
With the current model, value will use the default scale of the
Decimal type of mysql, which in some environments is 0.

This change also adds a test with mysql as backend to reproduce the
issue, since the existing tests using sqlite do not reproduce the
problem, as well as some simple migration tests.

Closes-Bug: #2103458
Change-Id: Ib281fa32e902d2181449091f493d6506b5199094
2025-05-07 16:20:31 +02:00
jgilaber
6c5845721b Add test for EfficacyIndicator value in mysql
Add a test with mysql as backend to show that the current
EfficacyIndicator model does not store any decimal digit for the value.

Change-Id: I0cdbd7d87cd6869a10b48eda3d59558831c8dd36
2025-05-07 16:20:03 +02:00
Sean Mooney
77e7e4ef7b drop jammy jobs
ubuntu jammy is no longer part of the required
testing runtime so this change simply removes
the jammy jobs.

Change-Id: I1e3bbb14cea5b856e8146f3a32d60c3a4ffdcfcc
2025-05-02 17:41:06 +00:00
Sean Mooney
f38ab70ba4 drop suse support in the devstack plugin
suse has not been a testing runtime for a few releases
and we have no jobs currently validating it still works.

this change just removes the suse specific logic

Change-Id: I357fa71704af7aa6239054ede29d0fdcdc3fb8b5
2025-05-02 17:41:00 +00:00
Sean Mooney
7aabd6dd5a update pre-commit hook versions
This updates all hooks to their latest versions;
notably, this adds python 3.13 support to autopep8

Change-Id: Ia67ed74c9942ff26bb1f8c1d72bf57aedfcd3846
2025-05-02 17:40:50 +00:00
Zuul
1b12e80882 Merge "Make prometheus the default devstack example" 2025-05-02 13:50:50 +00:00
Zuul
9f685a8cf1 Merge "[host_maintenance] Pass des hostname in add_action solution" 2025-05-02 13:45:57 +00:00
Sean Mooney
57b248f9fe Add support for pyproject.toml and wsgi module paths
pip 23.1 removed the "setup.py install" fallback for projects that do
not have pyproject.toml and now uses a pyproject.toml which is vendored
in pip [1][2]. pip 24.2 has now deprecated a similar fallback to
"setup.py develop" and plans to fully remove this in pip 25.0 [3][4][5].
pbr supports editable installs since 6.0.0

pip 25.1 has now been released and the removal is complete.
This change adds our own minimal pyproject.toml to ensure we are using the
correct build system.

This change also requires that we adapt how we generate our wsgi
entry point. When pyproject.toml is used, the wsgi console script is
not generated in an editable install such as is used in devstack.

To address this we need to refactor our usage of our wsgi application
to use a module path instead. This change does not remove
the declaration of our wsgi_script entry point, but it should
be considered deprecated and it will be removed in the future.

To unblock the gate, the devstack plugin is modified to deploy
using the wsgi module instead of the console script.

Finally, support for the mod_wsgi wsgi mode is removed.
That was deprecated in devstack a few cycles ago and
support was removed in I8823e98809ed6b66c27dbcf21a00eea68ef403e8

[1] https://pip.pypa.io/en/stable/news/#v23-1
[2] https://github.com/pypa/pip/issues/8368
[3] https://pip.pypa.io/en/stable/news/#v24-2
[4] https://github.com/pypa/pip/issues/11457
[5] https://ichard26.github.io/blog/2024/08/whats-new-in-pip-24.2/
Closes-Bug: #2109608

Depends-on: https://review.opendev.org/c/openstack/watcher/+/948502
Change-Id: Iad77939ab0403c5720c549f96edfc77d2b7d90ee
2025-05-01 00:19:59 +00:00
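To make the "module path instead of console script" point in 57b248f9fe concrete, the sketch below shows how a `package.module:callable` path can be resolved to a WSGI callable. It is only an illustration: `load_wsgi_app` is a hypothetical helper, and `wsgiref.simple_server:demo_app` from the standard library stands in for the project's real application module, whose path is not given in the commit message.

```python
import importlib


def load_wsgi_app(path):
    """Resolve a 'package.module:callable' path to a WSGI callable.

    If no callable name is given, 'application' is assumed, which is a
    common convention for WSGI modules.
    """
    module_name, _, attr = path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr or "application")
```

A call such as `load_wsgi_app("wsgiref.simple_server:demo_app")` returns the callable directly from the importable module, so no generated console script is needed on disk.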
Chandan Kumar (raukadah)
278cb7e98c [host_maintenance] Pass des hostname in add_action solution
Currently we are passing src_node and des_node uuid when we try to run
migrate action.

In the watcher-applier log, migration fails with following exception
```
Nova client exception occurred while live migrating instance <uuid>Exception: Compute host <uuid> could not be found
```
Based on 57f55190ff/watcher/applier/actions/migration.py (L122)
and
57f55190ff/watcher/common/nova_helper.py (L322),
live_migrate_instance expects destination hostname not uuid.

This cr replaces the dest_node uuid with the hostname.

Closes-Bug: #2109309

Change-Id: I3911ff24ea612f69dddae5eab15fabb4891f938d
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-04-25 15:51:20 +05:30
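The change in 278cb7e98c amounts to passing the destination's hostname where a uuid was passed before. A minimal sketch, assuming nodes are plain dictionaries with `uuid` and `hostname` keys and using a hypothetical helper name, could look like this:

```python
# Hypothetical sketch: build migrate action parameters with the
# destination hostname, since live migration expects a hostname.
def migrate_action_parameters(instance_id, destination_node):
    """Return illustrative input parameters for a migrate action."""
    return {
        "resource_id": instance_id,
        # Use the hostname, not destination_node["uuid"].
        "destination_node": destination_node["hostname"],
    }
```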
jgilaber
2c76da2868 Make prometheus the default devstack example
Change the devstack local.conf samples and devstack multinode
contributor doc to demonstrate deploying watcher with prometheus as
datasource instead of gnocchi. Keep the gnocchi as an alternative
deployment example.

Depends-On: https://review.opendev.org/c/openstack/watcher/+/946230
Depends-On: https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/946254

Change-Id: I721b550a03f9e5350a3f1ab10292faa1c50049a7
2025-04-24 16:06:50 +02:00
Alfredo Moralejo
c4acce91d6 Add real-data based tests to experimental and weekly pipelines
This patch is adding a new job using prometheus datastore and real
workload data into the experimental pipeline so that we can run it
on-demand.

Also, it is adding it to the weekly periodic pipeline as agreed on
Watcher meeting.

Also I am excluding strategies execution with annotation `real_load` in
non-real-load jobs.

Finally, I'm moving the project configuration to the end of the file
as requested in the comments, as it's the usual location by convention.

Change-Id: Id41efda2f0dd8b1521df3f6179c3504f298e0e59
2025-04-15 16:11:21 +02:00
Zuul
adbcac9319 Merge "Replace watcherclient functional job with python-watcherclient-functional" 2025-04-15 13:32:34 +00:00
Zuul
c9a1d06e7c Merge "Aggregate by fqdn label instead instance in host cpu metrics" 2025-04-08 17:37:10 +00:00
Zuul
25c1a8207f Merge "Drop sg_core prometheus related vars" 2025-04-08 11:55:39 +00:00
Chandan Kumar (raukadah)
0702cb3869 Drop sg_core prometheus related vars
The depends-on pr removes the installation of prometheus[1] and node
exporter[2] from sg_core. We no longer need to define those vars in
the devstack config.

Links:
[1]. https://github.com/openstack-k8s-operators/sg-core/pull/21
[2]. https://github.com/openstack-k8s-operators/sg-core/pull/23

Note: We do not need to enable sg_core service on compute node,
so removing its plugin call.

Change-Id: Ie8645813a360605635de4dff9e8d1ba0d7a0cdc3
Signed-off-by: Chandan Kumar (raukadah) <raukadah@gmail.com>
2025-04-04 19:36:54 +05:30
Zuul
03c107a4ce Merge "Imported Translations from Zanata" 2025-04-03 18:49:08 +00:00
Alfredo Moralejo
c7158b08d1 Aggregate by fqdn label instead instance in host cpu metrics
While in a regular case a specific metric for a specific host will be
provided by a single instance (exporter), so aggregating by label and by
instance should be the same, it is more correct to aggregate by the same
label as the one we use to filter the metrics.

This is a follow-up to https://review.opendev.org/c/openstack/watcher/+/944795

Related-Bug: #2103451

Change-Id: Ia61f051547ddc51e0d1ccd5a56485ab49ce84c2e
2025-04-02 15:36:17 +02:00
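The point of c7158b08d1, aggregating by the same label that is used for filtering, can be illustrated by building a PromQL string. The exact query Watcher issues is not shown in the commit message, so the function name, the default label `fqdn`, and the use of the node exporter metric `node_cpu_seconds_total` are assumptions for this sketch.

```python
# Hypothetical sketch: filter AND aggregate on the same label, instead of
# filtering on the fqdn label but aggregating by 'instance'.
def host_cpu_idle_query(host, label="fqdn", period=300):
    """Return an illustrative PromQL query for a host's idle CPU rate."""
    return (
        f"avg by ({label})"
        f"(rate(node_cpu_seconds_total"
        f"{{mode='idle',{label}='{host}'}}[{period}s]))"
    )
```

For `host_cpu_idle_query("compute-1.example.com")` the label `fqdn` appears both in the selector and in the `avg by (...)` clause.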
Zuul
035e6584c7 Merge "Query by fqdn_label instead of instance for host metrics" 2025-03-20 12:50:28 +00:00
Chandan Kumar (raukadah)
253e97678c Replace watcherclient functional job with python-watcherclient-functional
https://review.opendev.org/c/openstack/python-watcherclient/+/943132
moves functional tests from watcher_tempest_plugin to watcherclient and
adds a new zuul job based on devstack-tox-functional to run functional tests.

This pr replaces the existing zuul job using tempest regex with the
devstack tox functional job. The new job will run only on watcher/api
changes.

Closes-Bug: #2100741

Depends-On: https://review.opendev.org/c/openstack/python-watcherclient/+/943132

Change-Id: Ic2371745fe8aaf6f283151111fec4f92ea6bdf69
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-03-20 10:04:13 +00:00