Commit Graph

2657 Commits

Author SHA1 Message Date
Zuul
a963e0ff85 Merge "Fix api-ref doc for GET /infra-optim/v1/data_model" 2025-08-22 14:03:15 +00:00
Zuul
6d155c4be6 Merge "Add status_message to objects and notifications" 2025-08-21 14:59:53 +00:00
Zuul
83fea206df Merge "Add status_message column to Actions, Audits and ActionPlans tables" 2025-08-21 14:50:46 +00:00
Zuul
00a3edeac6 Merge "Add parameters to force failures in nop action" 2025-08-21 14:32:37 +00:00
Zuul
b69642181b Merge "Add patch call validation based on allowed_attrs" 2025-08-21 14:24:09 +00:00
Zuul
616c8f4cc4 Merge "Add options to disable migration in host maintenance" 2025-08-21 14:11:22 +00:00
Quang Ngo
cc26b3b334 Add options to disable migration in host maintenance
This change enhances the Host Maintenance strategy by introducing
two new input parameters: `disable_live_migration` and
`disable_cold_migration`. These parameters allow cloud
administrators to control whether live or cold migration should be
considered during host maintenance operations.

If `disable_live_migration` is set, active instances will be cold
migrated if `disable_cold_migration` is not set, otherwise
active instances will be stopped. If `disable_cold_migration` is set,
inactive instances will not be cold migrated.
If both are set, only stop actions will be performed on instances.

The strategy logic and action plan generation have been updated to
reflect these behaviors. A new "stop" action is introduced and
registered, and the weight planner is updated to handle new action.

Documentation for the Host Maintenance strategy is updated to
describe the new parameters and their effects.

Test Plan:
- Unit tests for HostMaintenance strategy with new parameters
- Integration tests for action plan generation with stop action

This implements the specification:
Spec: https://review.opendev.org/c/openstack/watcher-specs/+/943873

Change-Id: I201b8e5c52e1bc1a74f3886a0e301e3c0fa5d351
Signed-off-by: Quang Ngo <quang.ngo@canonical.com>
2025-08-20 22:32:33 +10:00
Alfredo Moralejo
5048a6e3ba Add status_message to objects and notifications
This patch is part of the skipped action blueprint. It adds the
`status_message` field to the Audit, ActionPlan and Action objects and
all related notifications.

It bumps the versions of all the affected objects and notifications and
update the tests to include the new fields.

Change-Id: I3b9467e7e37188e647379cd9c4cbbda8ed75383f
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
2025-08-19 13:01:00 +02:00
Alfredo Moralejo
84742be8c2 Add status_message column to Actions, Audits and ActionPlans tables
This patch implements the changes in the database required for the
skipped action blueprint.

It just adds a new nullable column to the required tables and add tests
for it.

Note that I am  also introducing a fix in a previous tables tests which
will be affected by the changes in the objects.

Implements: blueprint add-skip-actions

Change-Id: I027bc3861b589bd281a7216583a8c5c351a53c57
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
2025-08-19 11:05:39 +02:00
Alfredo Moralejo
1fb89aeac3 Add parameters to force failures in nop action
In order to test the different code paths for action execution
it is very useful to be able to make the actions fail in the different
execution stages.

This patch adds three new options `fail_pre_condition`, `fail_execute`
and `fail_post_condition`. Setting any of them to True makes the action
to fail in the specified step.

Change-Id: Ied8c0bb767d9bb6bdfb9209365857a3b4d606b40
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
2025-08-19 11:05:11 +02:00
Alfredo Moralejo
1a9f17748e Add patch call validation based on allowed_attrs
Currently, patch call field validations are done based on exclussion,
all the fields can be patched unless included in a list
`internal_attrs`.

This patch is adding a new validation rule based on fields inclussion
in a list `allowed_attrs`. When that list is non-empty, only the fields
included on it can be patched. in order to keep the existing behavior
for the existing patch calls, I am defining the list as empty, so that
the rest of validation rules are applied and it is not affecting the
current behavior.

Change-Id: I22010649332c8fb872446a9d0483a0303a4eba3b
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
2025-08-19 11:01:20 +02:00
Zuul
90f0c2264c Merge "use cinder migrate for swap volume" 2025-08-18 20:32:42 +00:00
Sean Mooney
3742e0a79c use cinder migrate for swap volume
This change removes watchers in tree functionality
for swapping instance volumes and defines swap as an alias
of cinder volume migrate.

The watcher native implementation was missing error handling
which could lead to irretrievable data loss.

The removed code also forged project user credentials to
perform admin request as if it was done by a member of a project.
this was unsafe an posses a security risk due to how it was
implemented. This code has been removed without replacement.

While some effort has been made to allow existing
audits that were defined to work, any reduction of functionality
as a result of this security hardening is intentional.

Closes-Bug: #2112187
Change-Id: Ic3b6bfd164e272d70fe86d7b182478dd962f8ac0
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-08-18 16:35:38 +00:00
Jaromir Wysoglad
8309d9848a Add Aetos datasource
Implement the spec for multi-tenancy support for metrics. This adds
a new 'Aetos' datasource very similar to the current Prometheus
datasource. Because of that, the original PrometheusHelper class
was split into two classes and the base class is used for
PrometheusHelper and for AetosHelper. Except for the split, there
is one more change to the original PrometheusHelper class code, which
is the addition and use of the _get_fqdn_label() and
_get_instance_uuid_label() methods.

As part of the change, I refactored the current prometheus datasource
unit tests. Most of them are now used to test the PrometheusBase class
with minimal changes. Changes I've made to the original tests:

- the ones that can be be used to test the base class are moved into the
  TestPrometheusBase class
- the _setup_prometheus_client, _get_instance_uuid_label and
  _get_fqdn_label functions are mocked in the base class tests.
  Their concrete implementations are tested in each datasource tests
  separately.
- a self._create_helper() is used to instantiate the helper class with
  correct mocking.
- all config value modification is the original tests got moved out and
  instead of modifying the config values, the _get_* methods are mocked
  to return the wanted values
- to keep similar test coverage, config retrieval is tested for each
  concrete class by testing the _get_* methods.

New watcher-aetos-integration and watcher-aetos-integration-realdata
zuul jobs are added to test the new datasource. These use the same set
of tempest tests as the current watcher-prometheus-integration jobs.
The only difference is the environment setup and the Watcher config,
so that the job deploys Aetos and Watcher uses it instead of accessing
Prometheus directly.

At first this was generated by asking cursor to implement the linked spec
with some additional prompts for some smaller changes. Afterwards I manually
went through the code doing some cleanups, ensuring it complies with
PEP8 and hacking and so on. Later on I manually adjusted the code to use
the latest observabilityclient changes.
The zuul job was also mostly generated by cursor.

Implements: https://blueprints.launchpad.net/watcher/+spec/prometheus-multitenancy-support

Generated-By: Cursor with claude-4-sonnet model
Change-Id: I72c2171f72819bbde6c9cbbf565ee895e5d2bd53
Signed-off-by: Jaromir Wysoglad <jwysogla@redhat.com>
2025-08-14 02:27:24 -04:00
Zuul
355671e979 Merge "Add a new tox environment to run unit tests in threading mode" 2025-08-13 21:37:19 +00:00
Douglas Viroel
37faf614e2 Fix api-ref doc for GET /infra-optim/v1/data_model
Some response parameters from GET /infra-optim/v1/data_model
endpoint are missing from api-ref documentation. This patch
updates the doc to include them.
For more details see, LP #2117726

Closes-Bug: #2117726

Change-Id: Iaa775f56bb8167d9c6b458cd07f1ec3cefaf70fe
Signed-off-by: Douglas Viroel <viroel@gmail.com>
2025-08-12 09:47:01 -03:00
Zuul
4080d5767d Merge "Disable real metrics on devstack injected data jobs" 2025-08-11 17:10:32 +00:00
Zuul
9925fd2cc9 Merge "Replace dateutils usage with datetime and oslo.utils" 2025-08-07 20:46:25 +00:00
Zuul
27baff5184 Merge "Extend decision engine to support threading mode" 2025-08-06 15:38:31 +00:00
Douglas Viroel
8ca794cdbb Add a new tox environment to run unit tests in threading mode
It is done by disabling the eventlet patching and configuring
oslo.service backend to threading. Once oslo.service backend is
configured, it can't be reverted to eventlet. This needs to be
done before including other modules, which may include oslo.service
library.
Adds a job that run a subset of tests with eventlet patching disabled.

Change-Id: I9f8c2c5bbcf3192313cc3b309e8f2719a3bea18f
Signed-off-by: Douglas Viroel <viroel@gmail.com>
2025-08-05 16:50:29 -03:00
Douglas Viroel
f879b10b05 Extend decision engine to support threading mode
With the events of eventlet removal, Watcher will need
to be adapted to support both modes, eventlet and threading, for
a couple of releases before removing all eventlet code.
This patch adds methods and classes that allow decision engine
modules to create futurist thread pools instead of green thread pools,
based on a environment variable that can be enabled by service.
It moves continuous audit handler instance to decison engine service,
so it can be started together with the main decision engine service.
Adds an environment variable that allows the user to disable
eventlet monkey patching and to use oslo.service threading backend.

Change-Id: I8a8be0a7cebdc44005fd77ec960543828c7da318
Signed-off-by: Douglas Viroel <viroel@gmail.com>
2025-08-05 16:45:48 -03:00
Chandan Kumar (raukadah)
95d975f339 Replace dateutils usage with datetime and oslo.utils
This cr fixes:
* Replaced ``dateutil.tz.tzlocal()`` and ``dateutil.tz.tzutc()`` with
  ``datetime.timezone`` built-in classes in audit controllers and
  continuous audit scheduling.

* Replaced ``dateutil.parser.parse()`` with
  ``oslo_utils.timeutils.parse_isotime()`` in the zone migration
  strategy for parsing datetime strings.

Closes-Bug: #2118404

Change-Id: I6d8a345fa4339a688769b147413dcdf3016bf4a0
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-08-05 23:09:50 +05:30
morenod
0435200fb1 Disable real metrics on devstack injected data jobs
We need to disable real data metrics comming from host and
instances on injected data jobs as they are creating wrong results
when they are mixed with the injected data.

We already did this on watcher-operator disabling ceilometer agent and
node_exported on [1] so now we have to do it on devstack installations,
disabling meminfo on node_exporter for host metrics (cpu is already
disabled) and sg-core for instance metrics

[1] https://github.com/openstack-k8s-operators/watcher-operator/pull/196

Change-Id: I4130ca6dd7cb52d96842e04e7720431ebc76efff
Signed-off-by: morenod <dsanzmor@redhat.com>
2025-08-04 12:41:54 +02:00
Douglas Viroel
adfe3858aa Configure watcher tempest's microversion in devstack
Adds a tempest configuration for min and max microversions supported
by watcher. This help us to define the correct range of microversion
to be tested on each stable branch.
New microversion proposals should also increase the default
max_microversion, in order to work with watcher-tempest-plugin
microversion testing.

Change-Id: I0b695ba4530eb89ed17b3935b87e938cadec84cc
Signed-off-by: Douglas Viroel <viroel@gmail.com>
2025-08-01 17:28:40 -03:00
Zuul
a1e7156c7e Merge "finalize python 3.9 support removal" 2025-07-30 15:54:12 +00:00
Zuul
71470dac73 Merge "Add comprehensive release liaison guide for DPL model" 2025-07-30 15:24:49 +00:00
Zuul
5ba086095c Merge "Fix release notes typo and extra information" 2025-07-21 18:41:57 +00:00
Sean Mooney
3e8392b8f1 finalize python 3.9 support removal
The last release of openstack to support python 3.9
was 2025.1 (epoxy), with this change watcher now requires
3.10, testing of 3.9 was removed in previous commits.

Change-Id: Ida53740293e93b0c20dec2e175b390fa18bed852
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-07-21 18:25:04 +01:00
Sean Mooney
20cd4a0394 Add comprehensive release liaison guide for DPL model
Transform Nova's PTL guide into Watcher-specific release liaison
documentation following the DPL governance model. This guide provides
chronological guidance for release liaisons managing Watcher's
cycle-with-intermediary release process.

Key features:
* DPL liaison coordination with proper precedence hierarchies
* Watcher-specific project context and repository references
* Enhanced FFE process with release liaison decision authority
* Proper RST formatting with code blocks and cross-references
* Comprehensive glossary of OpenStack release terminology
* Usage guidance for both new and experienced release liaisons

Adapts Nova's proven chronological structure while reflecting
Watcher's distributed leadership model and technical requirements.

Assisted-By: claude-code
Change-Id: I133bb06e47c14deaca162a2bf024210f68d78ab2
Signed-off-by: Sean Mooney <work@seanmooney.info>
2025-07-21 16:34:47 +01:00
Zuul
374750847f Merge "Merge decision engine services into a single one" 2025-07-17 13:09:11 +00:00
Chandan Kumar
2fe3b0cdbe Fix release notes typo and extra information
This cr fixes the release notes for
https://review.opendev.org/c/openstack/watcher/+/954120/ and
https://review.opendev.org/c/openstack/watcher/+/954120/

Related-Bug: #2110895
Related-Bug: #2115968

Change-Id: I1f3fc06549c2d5d7ba9debee424429a25a651070
Signed-off-by: Chandan Kumar <chkumar@redhat.com>
2025-07-09 15:44:20 +05:30
Zuul
9b9965265a Merge "Drop Code related to OperationNotPermitted exception" 2025-07-08 19:31:11 +00:00
Zuul
98b56b66ac Merge "Drops forbidden patch/delete/post action apis" 2025-07-08 18:38:40 +00:00
Douglas Viroel
081cd5fae9 Merge decision engine services into a single one
The decision engine process was built based on 2
services: a service that handle rpc requests and a
scheduler to trigger watcher periodic tasks.
With the new version of oslo.service, a new threading
backend was added, based on cotyledon service manager,
which starts a new process for each service tha it
manages. These two services can't run in different
process since they need access to a shared in-memory
representation of the cluster (cluster data models)
This patch proposes creating a Decision Engine Service
which includes everything in a single main service.

Change-Id: I335a97ca14b6e023fef055978a56aefebf22d433
Signed-off-by: Douglas Viroel <viroel@gmail.com>
2025-07-08 09:55:32 -03:00
Zuul
1ab5babbb6 Merge "Move eventlet command scripts to a different dir" 2025-07-08 12:41:35 +00:00
Zuul
d771d00c5a Merge "sqlalchemy: Use built-in declarative" 2025-07-08 12:41:32 +00:00
Chandan Kumar (raukadah)
e3b813e27e Drop Code related to OperationNotPermitted exception
The following exception was added in initial import of watcher
code base[1].

In each of the controller REST APIs, it was called with a flag
stating request was coming from top level resources apis.

But this exception and code was not used anywhere in the
rest api. It seems to be a dead code. So, it needs to be
cleaned up.

Note: In audit_template, under patchapi, this exception
was used for not removal goal from audit template.

Since this cr drops this exception, It replace the same
with NotAuthorized exception keeping status code same.

Links:
[1]. d14e057da1 (diff-6d510a275605e20ba8b435157062da2b749265a88a3cfd6d90abb7e8e5feac2aR235)

Closes-Bug: #2115968

Change-Id: I82a5e4a7a51726b3a89257c84a75157fbfcb82eb
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-07-04 19:07:13 +05:30
Chandan Kumar (raukadah)
c0a5abe29c Drops forbidden patch/delete/post action apis
These apis are not implemented with in the watcher code base and
was marked as a forbidden to use.

It does not make sense to keep these api as they are not implemented.
This cr drops the code around that to make the action apis cleaner.

Closes-Bug: #2110895

Change-Id: I0f465157e6cd481b27665ca6016db68c198cebeb
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2025-07-04 11:51:40 +05:30
Zuul
bbe30f93f2 Merge "Update workload balance doc per review comments" 2025-07-03 19:57:05 +00:00
Zuul
3bc5c72039 Merge "resolve fixme comments in RequestContext" 2025-07-03 17:28:54 +00:00
Zuul
203b926be0 Merge "Drop unused fake class" 2025-07-03 17:28:52 +00:00
Zuul
e64709ea08 Merge "Add warning message for experimental integrations" 2025-07-03 17:27:39 +00:00
Zuul
94d8676db8 Merge "add missing bindeps for docs" 2025-07-03 16:03:47 +00:00
Takashi Kajinami
828bcadf6a sqlalchemy: Use built-in declarative
sqlalchemy.ext.declarative was deprecated in sqlalchemy 1.4.0, due to
the built-in implementations[1].

[1] https://github.com/sqlalchemy/sqlalchemy/commit/450f5c0d6519a439f40

Change-Id: Idb4a361d4d65ff53ecf33b8a2a6aa0d6f6ae1979
2025-06-30 22:14:33 +09:00
Zuul
93366df264 Merge "Add crosslinks to strategies table" 2025-06-30 13:02:28 +00:00
Takashi Kajinami
aa67096fe8 Drop unused fake class
It was a left-over from removal of ceilometer datasource[1].

[1] da23fdc621

Change-Id: I17ef33d6f70e2cc601add721661347d0bf210008
2025-06-28 20:35:09 +09:00
Ronelle Landy
6f72e33de5 Add crosslinks to strategies table
These replace the full external links
used previously.

Change-Id: I9c79f7b7ddebaa25d243fdbe1eb422cba25de8f1
2025-06-27 16:54:38 -04:00
Ronelle Landy
56d0a0d6ea Update workload balance doc per review comments
The original documentation update review [1]
had some additional comments for improvements.
The commit adds the suggested changes.

[1] https://review.opendev.org/c/openstack/watcher/+/951025

Change-Id: I4b4624e2dbc4c6a5f888ec77d6a03b8f66ff0a23
2025-06-27 16:46:17 -04:00
Ronelle Landy
de9eb2cd80 Add doc clarifications for Zone Migration
Adds documation clarifications on how the
strategy and associated parameters as used.

Closes-Bug: #2112480
Change-Id: Id42c280fc5744bebb01d50b52b834e5b3b76af73
2025-06-27 16:12:41 -04:00
Zuul
76de167171 Merge "Add Integrations doc page with support matrix" 2025-06-27 16:09:51 +00:00