2627 Commits

Author SHA1 Message Date
Zuul
374750847f Merge "Merge decision engine services into a single one" 2025-07-17 13:09:11 +00:00
Zuul
9b9965265a Merge "Drop Code related to OperationNotPermitted exception" 2025-07-08 19:31:11 +00:00
Zuul
98b56b66ac Merge "Drops forbidden patch/delete/post action apis" 2025-07-08 18:38:40 +00:00
Douglas Viroel
081cd5fae9 Merge decision engine services into a single one
The decision engine process was built on two
services: a service that handles RPC requests and a
scheduler that triggers watcher periodic tasks.
With the new version of oslo.service, a new threading
backend was added, based on the cotyledon service manager,
which starts a new process for each service that it
manages. These two services can't run in different
processes since they need access to a shared in-memory
representation of the cluster (cluster data models).
This patch proposes creating a Decision Engine Service
which includes everything in a single main service.

Change-Id: I335a97ca14b6e023fef055978a56aefebf22d433
Signed-off-by: Douglas Viroel <viroel@gmail.com>
2025-07-08 09:55:32 -03:00
Zuul
1ab5babbb6 Merge "Move eventlet command scripts to a different dir" 2025-07-08 12:41:35 +00:00
Zuul
d771d00c5a Merge "sqlalchemy: Use built-in declarative" 2025-07-08 12:41:32 +00:00
Chandan Kumar (raukadah)
e3b813e27e Drop Code related to OperationNotPermitted exception
The following exception was added in the initial import of the watcher
code base [1].

In each of the controller REST APIs, it was called with a flag
stating that the request was coming from top-level resource APIs.

But this exception and its code were not used anywhere in the
REST API. It appears to be dead code, so it needs to be
cleaned up.

Note: In audit_template, under the patch API, this exception
was used to reject removal of the goal from an audit template.

Since this CR drops the exception, it is replaced there
with the NotAuthorized exception, keeping the status code the same.

Links:
[1]. d14e057da1 (diff-6d510a275605e20ba8b435157062da2b749265a88a3cfd6d90abb7e8e5feac2aR235)

Closes-Bug: #2115968

Change-Id: I82a5e4a7a51726b3a89257c84a75157fbfcb82eb
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-07-04 19:07:13 +05:30
Chandan Kumar (raukadah)
c0a5abe29c Drops forbidden patch/delete/post action apis
These APIs are not implemented within the watcher code base and
were marked as forbidden to use.

It does not make sense to keep these APIs as they are not implemented.
This CR drops the code around them to make the action APIs cleaner.

Closes-Bug: #2110895

Change-Id: I0f465157e6cd481b27665ca6016db68c198cebeb
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-07-04 11:51:40 +05:30
Zuul
bbe30f93f2 Merge "Update workload balance doc per review comments" 2025-07-03 19:57:05 +00:00
Zuul
3bc5c72039 Merge "resolve fixme comments in RequestContext" 2025-07-03 17:28:54 +00:00
Zuul
203b926be0 Merge "Drop unused fake class" 2025-07-03 17:28:52 +00:00
Zuul
e64709ea08 Merge "Add warning message for experimental integrations" 2025-07-03 17:27:39 +00:00
Zuul
94d8676db8 Merge "add missing bindeps for docs" 2025-07-03 16:03:47 +00:00
Takashi Kajinami
828bcadf6a sqlalchemy: Use built-in declarative
sqlalchemy.ext.declarative was deprecated in sqlalchemy 1.4.0, due to
the built-in implementations[1].

[1] https://github.com/sqlalchemy/sqlalchemy/commit/450f5c0d6519a439f40

Change-Id: Idb4a361d4d65ff53ecf33b8a2a6aa0d6f6ae1979
2025-06-30 22:14:33 +09:00
Zuul
93366df264 Merge "Add crosslinks to strategies table" 2025-06-30 13:02:28 +00:00
Takashi Kajinami
aa67096fe8 Drop unused fake class
It was a left-over from removal of ceilometer datasource[1].

[1] da23fdc621

Change-Id: I17ef33d6f70e2cc601add721661347d0bf210008
2025-06-28 20:35:09 +09:00
Ronelle Landy
6f72e33de5 Add crosslinks to strategies table
These replace the full external links
used previously.

Change-Id: I9c79f7b7ddebaa25d243fdbe1eb422cba25de8f1
2025-06-27 16:54:38 -04:00
Ronelle Landy
56d0a0d6ea Update workload balance doc per review comments
The original documentation update review [1]
had some additional comments for improvements.
The commit adds the suggested changes.

[1] https://review.opendev.org/c/openstack/watcher/+/951025

Change-Id: I4b4624e2dbc4c6a5f888ec77d6a03b8f66ff0a23
2025-06-27 16:46:17 -04:00
Ronelle Landy
de9eb2cd80 Add doc clarifications for Zone Migration
Adds documentation clarifications on how the
strategy and associated parameters are used.

Closes-Bug: #2112480
Change-Id: Id42c280fc5744bebb01d50b52b834e5b3b76af73
2025-06-27 16:12:41 -04:00
Zuul
76de167171 Merge "Add Integrations doc page with support matrix" 2025-06-27 16:09:51 +00:00
Zuul
70032aa477 Merge "Add table - level of test/usage per strategy" 2025-06-27 16:01:31 +00:00
Zuul
16131e5cac Merge "Update Workload Balance strategy documentation" 2025-06-27 13:36:50 +00:00
Ronelle Landy
bfbd136f4b Update Host Maintenance strategy documentation
Add clarifications to the documentation to reflect
the actual strategy usage, including:
 - updating parameter descriptions
 - extending the 'How to Use' section

Closes-Bug: #2111810
Change-Id: Ifd2876056cd8819c50658fb9f213246dc1546d42
2025-06-23 06:36:42 -04:00
Zuul
fe8d8c8839 Merge "Use KiB as unit for host_ram_usage when using prometheus datasource" 2025-06-20 16:19:50 +00:00
Zuul
b8e0e6b01c Merge "Aggregate by label when querying instance cpu usage in prometheus" 2025-06-19 14:46:07 +00:00
Alfredo Moralejo
6ea362da0b Use KiB as unit for host_ram_usage when using prometheus datasource
The prometheus datasource was reporting host_ram_usage in MiB as
described in the docstring for the base datasource interface
definition [1].

However, the gnocchi datasource reports it in KiB, following the
ceilometer metric `hardware.memory.used` [2], and the strategies
using that metric expect it to be in KiB. The best approach is therefore
to change the unit in the prometheus datasource and update the
docstring to avoid misunderstandings in the future. This patch
fixes the prometheus datasource to return host_ram_usage
in KiB instead of MiB.
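
The unit change described above amounts to a factor of 1024. A minimal sketch of the conversion, assuming a value that starts out in MiB; the helper names are hypothetical and are not taken from the watcher code base:

```python
# Hypothetical helpers illustrating the MiB <-> KiB relationship.
# 1 MiB = 1024 KiB (binary units).
KIB_PER_MIB = 1024


def mib_to_kib(value_mib):
    """Convert a memory amount from MiB to KiB."""
    return value_mib * KIB_PER_MIB


def kib_to_mib(value_kib):
    """Convert a memory amount from KiB to MiB."""
    return value_kib / KIB_PER_MIB
```

A consumer that expects KiB but receives MiB would see values 1024 times too small, which is why both datasources need to agree on the unit.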

Additionally, it is adding more unit tests for the check_threshold
method so that it covers the memory based strategy execution, validates
the calculated standard deviation and adds the cases where it is below
the threshold.

[1] 15981117ee/watcher/decision_engine/datasources/base.py (L177-L183)
[2] https://docs.openstack.org/ceilometer/train/admin/telemetry-measurements.html#snmp-based-meters

Closes-Bug: #2113776
Change-Id: Idc060d1e709c0265c64ada16062c3a206c6b04fa
2025-06-19 16:25:27 +02:00
Zuul
0f78386462 Merge "Add debug message to report calculated metric for workload_balance" 2025-06-18 12:26:24 +00:00
Alfredo Moralejo
1529e3fadd Add debug message to report calculated metric for workload_balance
The workload_balance strategy calculates host metrics based on the
instance metrics and those are the ones used to compare with the
threshold.

Currently, the strategy does not report the calculated values, which
makes troubleshooting difficult at times. This patch adds a debug
message to log those values.

This patch also adds a new unit test for filter_destination_hosts
based on ram instead of cpu and adds assertions for the new debug
messages. To implement the new test properly, I had to slightly modify
the ram usage fixtures used for the workload_balance tests.

Change-Id: Ief5e167afcf346ff53471f26adc70795c4b69f68
2025-06-17 19:11:48 +02:00
Zuul
31879d26f4 Merge "Add unit test zone migration with_attached_volume" 2025-06-13 12:17:52 +00:00
Zuul
efbae9321e Merge "devstack: Drop template for mod_wsgi" 2025-06-13 10:44:48 +00:00
Ronelle Landy
0599618add Add table - level of test/usage per strategy
This patch adds a table to the strategies page to
show the level of qualification and where the
strategy can be triggered.

Change-Id: I6991566fd5fec3f8bbae06eefa63a8b83a87eed1
2025-06-11 14:19:42 -04:00
Zuul
1d50c12e15 Merge "Adapt zuul.yaml strategies jobs to include tests with tag 'strategy'" 2025-06-11 13:47:34 +00:00
Alfredo Moralejo
3860de0b1e Aggregate by label when querying instance cpu usage in prometheus
Currently, when the prometheus datasource queries the ceilometer_cpu metric
for instance cpu usage, it aggregates by instance and filters by the
label containing the instance uuid. While this works fine in real
scenarios, where a single metric is provided in a single instance, in
some cases, such as the CI jobs where metrics are directly injected, it leads to
incorrect metric calculation.
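
The effect of the grouping key can be illustrated without Prometheus. The sketch below is plain Python, not the datasource query; the label names `instance` and `resource` and the sample values are assumptions chosen for illustration, and it shows only one way such a discrepancy could arise:

```python
from collections import defaultdict


def aggregate_avg(samples, by):
    """Average sample values grouped by the given label name."""
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels[by]].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Hypothetical samples: the same VM uuid reported under two
# different `instance` label values, as might happen when metrics
# are injected directly instead of scraped from one exporter.
samples = [
    ({"instance": "injector-a", "resource": "uuid-1"}, 10.0),
    ({"instance": "injector-b", "resource": "uuid-1"}, 30.0),
]

by_instance = aggregate_avg(samples, "instance")  # two separate series
by_resource = aggregate_avg(samples, "resource")  # one series per VM
```

Grouping by `instance` yields two results for one VM, while grouping by the uuid-carrying label yields a single value per VM.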

We applied a similar fix for the host metrics in [1] but we did not
implement it for instance cpu.

I am also converting the query formatting to the dict format to improve
understandability.

[1] https://review.opendev.org/c/openstack/watcher/+/946049

Closes-Bug: #2113936
Change-Id: I3038dec20612162c411fc77446e86a47e0354423
2025-06-11 14:49:56 +02:00
Chandan Kumar (raukadah)
15981117ee Drop unused method get_disabled_compute_nodes_with_reason
get_disabled_compute_nodes_with_reason defined in host_maintenance
strategy is not used anywhere.

This cr drops the unused method.

Change-Id: I07c0d0b63e00d476511aa8b03c0feab8ec4db95b
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-06-09 10:51:45 +05:30
Douglas Viroel
4f8c14646d Move eventlet command scripts to a different dir
This is an initial patch towards the eventlet removal in watcher.
It moves cmd scripts that depend on eventlet to an eventlet dir,
where it is always monkey patched.

Change-Id: Ie23caab018fbf68f8c29a0f748c0708b97933b4b
2025-06-08 09:05:56 -03:00
Douglas Viroel
520ec0b79b Add warning message for experimental integrations
Some service integrations are now classified as experimental
and a warning message will now appear once a client is created
for them. These integrations are not fully tested in CI and
lack documentation on how they work or should be used.
A release note was added to inform users about the status of
these integrations and related features.

Change-Id: Ib7d0ac0b3e187ae239dfa075fb53a6c0107dff29
2025-06-07 11:33:28 -03:00
Ronelle Landy
f42cb8557b Update Workload Balance strategy documentation
Adds additional parameter and usage explanations
and a combined example.

Closes-Bug: #2111848
Change-Id: Id0de4d56fa7083388ad82c61596e7484431d465b
2025-06-06 15:51:23 -04:00
Douglas Viroel
b788a67c52 Add Integrations doc page with support matrix
Adds a new documentation section that describes which service
integrations are currently supported and their integration status.
This information is not clear today and will help to cover the lack
of testing and documentation about them.

Change-Id: I26b2a2ef5672b78a575a2bdaef3a08d5bbc063bd
2025-06-05 13:31:02 -03:00
Zuul
73f8728d22 Merge "Fix audit creation with no name and no goal or audit_template" 2025-06-05 13:39:38 +00:00
Alfredo Moralejo
bf6a28bd1e Fix audit creation with no name and no goal or audit_template
Currently, audit creation fails in that case because watcher tries to
automatically create a name based on a goal and the goal is not defined.

This patch moves the check for goal specification in the audit
creation call earlier, and if there is no goal defined, it returns an
invalid call error.
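
The ordering described above can be sketched as follows. This is an illustration only: the function, the exception class, and the name format are hypothetical and are not the watcher API code:

```python
class InvalidAuditRequest(ValueError):
    """Hypothetical error standing in for an 'invalid call' response."""


def resolve_audit_name(name=None, goal=None, template_goal=None, suffix="auto"):
    """Validate the goal first, then derive a name only if none was given."""
    # The goal may come from the request or from an audit template.
    effective_goal = goal or template_goal
    if effective_goal is None:
        # Checking here, before any name generation, avoids building
        # a name from an undefined goal.
        raise InvalidAuditRequest("a goal or an audit template must be specified")
    if name:
        return name
    return f"{effective_goal}-{suffix}"
```

With the check placed first, a request with neither a name nor a goal is rejected cleanly instead of failing during name generation.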

This patch is also modifying the existing error for this case to check
the expected behavior.

Closes-Bug: #2110947

Change-Id: I6f3d73b035e8081e86ce82c205498432f0e0fc33
2025-06-04 14:46:36 +02:00
morenod
1256b24133 Adapt zuul.yaml strategies jobs to include tests with tag 'strategy'
The idea is to adapt zuul.yaml to the future test structure, where every strategy will be in its own file. For now we keep executing everything inside test_execute_strategies, but also any other test in any file with the tag 'strategy'.

Change-Id: I304c858078d35beb1f7b4f1fad4ea8bedde674af
2025-06-04 09:50:35 +00:00
Takashi Kajinami
a559c0505e devstack: Drop template for mod_wsgi
... because mod_wsgi support was already removed by [1].

[1] 57b248f9fe

Change-Id: I100169b3fb7ed68d9b01abb4fc91bdd16eb68aa9
2025-06-04 00:14:07 +09:00
Zuul
59757249bb Merge "Added unit test to validate audit creation with no goal and no name" 2025-05-27 19:32:07 +00:00
Zuul
58b25101e6 Merge "Return HTTP code 400 when creating an audit with wrong parameters" 2025-05-27 19:23:25 +00:00
Zuul
690a389369 Merge "Add a unit test to check the error when creating an audit with wrong parameters" 2025-05-27 19:23:23 +00:00
Zuul
1cdd392f96 Merge "Remove deprecated executor in message handling servers" 2025-05-26 14:44:39 +00:00
Zuul
20f231054a Merge "Set actionplan state to FAILED if any action has failed" 2025-05-26 14:44:37 +00:00
Zuul
077c36be8a Merge "Add unit test to check action plan state when a nested action fails" 2025-05-26 14:27:08 +00:00
Alfredo Moralejo
88d81c104e Set actionplan state to FAILED if any action has failed
Currently, an actionplan state is set to SUCCEEDED once the execution
has finished, but that does not imply that all the actions finished
successfully.

This patch checks the actual state of all the actions in the plan
after the execution has finished. If any action has status FAILED, it
sets the state of the action plan to FAILED and applies the
appropriate notification parameters. This is the expected behavior according
to Watcher documentation.
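
The rule described above can be sketched in a few lines. The function name and the use of plain strings for states are assumptions made for illustration, not the applier's actual code:

```python
# State names as used in the commit message; plain strings are a
# simplification for this sketch.
SUCCEEDED = "SUCCEEDED"
FAILED = "FAILED"


def final_plan_state(action_states):
    """Return FAILED if any action failed, otherwise SUCCEEDED."""
    if any(state == FAILED for state in action_states):
        return FAILED
    return SUCCEEDED
```

For example, a plan whose actions ended as SUCCEEDED, FAILED, SUCCEEDED would be reported as FAILED rather than SUCCEEDED.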

The patch is also fixing the unit test for this to set the expected
action plan state to FAILED and notification parameters.

Closes-Bug: #2106407
Change-Id: I7bfc6759b51cd97c26ec13b3918bd8d3b7ac9d4e
2025-05-26 14:58:03 +02:00
Zuul
8ac8a29fda Merge "Fix incorrect logging format" 2025-05-26 11:47:26 +00:00