watcher

Author	SHA1	Message	Date
jgilaber	fe56660c44	Handle missing dst_pool parameter in zone_migration Unlike Nova, Cinder does not support calling the 'os-migrate_volume'[1] action without a host or a cluster. For volume migrations of type 'migrate' in watcher the dst_pool is required, but for other migrations that migrate the volumes to different types it is not needed. This change checks if the dst_pool is defined and prevents some migrations when it's missing information. Adds testing for creating audits with the Zone Migration status, validating the schema changes. [1] https://docs.openstack.org/api-ref/block-storage/v3/index.html#migrate-a-volume Closes-Bug: 2108988 Change-Id: I305c58e47093c4a884e86f1d91fdc15ef2a1cfba Signed-off-by: jgilaber <jgilaber@redhat.com>	2025-09-10 15:58:24 +02:00
Zuul	e5b18afa01	Merge "Fix doc section to enable cinder notifications"	2025-09-01 14:15:29 +00:00
jgilaber	a4b785e4f1	Fix doc section to enable cinder notifications The section in the Watcher docs that describes how to enable cinder notifications incorrectly tells the user to change the cinder config to send notification to the watcher.watcher_notifications exchange and topic. Instead, it should instruct the user to change the Watcher configuration of the notification_topics [1] to listen to the 'openstack.notifications', which is the one used by cinder by default[2]. This patch also adds 'openstack.notifications' to the default value for the 'notification_topics' parameter. [1] https://docs.openstack.org/watcher/latest/configuration/watcher.html#watcher_decision_engine.notification_topics [2] https://docs.openstack.org/cinder/latest/configuration/block-storage/samples/cinder.conf.html Partial-Bug: 2121384 Change-Id: I4dc1a72af79a23c9ca07d2da5ff41bd7741e37d8 Signed-off-by: jgilaber <jgilaber@redhat.com>	2025-09-01 11:23:00 +02:00
Sean Mooney	ef0f35192d	Make Monasca client optional and lazy-load Monasca is deprecated for removal. This change makes the Monasca client an optional dependency and ensures it is only imported and instantiated when the Monasca datasource is explicitly selected. This reduces the default footprint while preserving functionality for deployments that still rely on Monasca. What changed ============ - requirements.txt: remove python-monascaclient from hard deps - setup.cfg: add [options.extras_require] monasca extra - watcher/common/clients.py: lazy import with clear UnsupportedError - watcher/decision_engine/datasources/monasca.py: lazy client property and deferred import of monascaclient.exc; reset on Unauthorized - watcher/decision_engine/datasources/manager.py: unconditionally import Monasca helper and include in metric_map; helper is lazy - tests: conditionally include Monasca based on availability; adjust expectations instead of skipping by default; avoid over-mocking - tox.ini: enable optional extras via WATCHER_EXTRAS env var - docs: datasources index notes Monasca is deprecated and optional - releasenotes: upgrade note with install example and behavior Why === - Allow deployments not using Monasca to run without the client - Keep Monasca functional when explicitly installed via extras - Provide clear operator guidance and smooth upgrades Compatibility ============= - No change for deployments that do not use Monasca - Deployments using Monasca must install the optional extra: pip install watcher[monasca] Testing ======= - Default: tox -e py3 - With Monasca: WATCHER_EXTRAS=monasca tox -e py3 Assisted-By: GPT-5 (Cursor) Closes-Bug: #2120192 Change-Id: I7c02b74e83d656083ce612727e6da58761200ae4 Signed-off-by: Sean Mooney <work@seanmooney.info>	2025-08-28 16:53:48 +01:00
Douglas Viroel	2452c1e541	Follow up changes for skip-action blueprint These are some of the requested changes from reviews in the series of patches for add-skip-action blueprint. Some of them may require another specific patch since they would touch more files that are not related to this feature. Change-Id: I9e30ca385e7b184ab19449a60db6f6d0f3c0e1b9 Signed-off-by: Douglas Viroel <viroel@gmail.com>	2025-08-26 10:27:57 -03:00
Zuul	1668b9b9f8	Merge "API changes for skipped actions: patch actions and status_message"	2025-08-26 12:54:31 +00:00
Zuul	5e05b50048	Merge "Skip actions automatically based on pre_condition results"	2025-08-26 12:33:08 +00:00
Ronelle Landy	457819072f	Update Overload standard deviation doc Bug #2113862 details a number of suggested corrections and additions to the Workload Stabilization doc. This patch adds those suggested changes. Closes-Bug: #2113862 Assisted-By: Cursor (claude-3.5-sonnet) Change-Id: I4131a304c064d2ea397b2447025c7edf69a56e2a Signed-off-by: Ronelle Landy <rlandy@redhat.com>	2025-08-21 11:09:46 -04:00
Zuul	6d155c4be6	Merge "Add `status_message` to objects and notifications"	2025-08-21 14:59:53 +00:00
Zuul	616c8f4cc4	Merge "Add options to disable migration in host maintenance"	2025-08-21 14:11:22 +00:00
Quang Ngo	cc26b3b334	Add options to disable migration in host maintenance This change enhances the Host Maintenance strategy by introducing two new input parameters: `disable_live_migration` and `disable_cold_migration`. These parameters allow cloud administrators to control whether live or cold migration should be considered during host maintenance operations. If `disable_live_migration` is set, active instances will be cold migrated if `disable_cold_migration` is not set, otherwise active instances will be stopped. If `disable_cold_migration` is set, inactive instances will not be cold migrated. If both are set, only stop actions will be performed on instances. The strategy logic and action plan generation have been updated to reflect these behaviors. A new "stop" action is introduced and registered, and the weight planner is updated to handle new action. Documentation for the Host Maintenance strategy is updated to describe the new parameters and their effects. Test Plan: - Unit tests for HostMaintenance strategy with new parameters - Integration tests for action plan generation with stop action This implements the specification: Spec: https://review.opendev.org/c/openstack/watcher-specs/+/943873 Change-Id: I201b8e5c52e1bc1a74f3886a0e301e3c0fa5d351 Signed-off-by: Quang Ngo <quang.ngo@canonical.com>	2025-08-20 22:32:33 +10:00
Alfredo Moralejo	e06f1b0475	API changes for skipped actions: patch actions and status_message This patch implements the changes in the API required for the skipped action blueprint. It includes: - New field `status_message` is visible in API get calls for Audits, ActionPlans and Actions. - New Patch call is added to `/actions/{action_id}` which allows to manually move actions in PENDING state to SKIPPED for ActionPlans which have not been started. - A new API microversion 1.5 is added for these changes. It also adds required tests and documentation. Implements: blueprint add-skip-actions Assisted-By: Cursor (claude-4-sonnet) Change-Id: I71fb9af76085e5941a7fd3e9e4c89d6f3a3ada47 Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>	2025-08-20 13:13:19 +02:00
Alfredo Moralejo	6d35be11ec	Skip actions automatically based on pre_condition results This patch implements automatically skipping actions based on the result of the action pre_condition method. This makes it possible to properly handle situations such as migration actions for VMs which no longer exist. This patch includes: - Adding a new state SKIPPED to the Action objects. - Add a new Exception ActionSkipped. An action which raises it from the pre_condition execution is moved to SKIPPED state. - pre_condition will not be executed for any action in SKIPPED state. - execute will not be executed for any action in SKIPPED or FAILED state. - post_condition will not be executed for any action in SKIPPED state. - moving transition to ONGOING from pre_condition to execute. That means that actions raising ActionSkipped will move from PENDING to SKIPPED while actions raising any other Exception will move from PENDING to FAILED. - Adding information on action failed or skipped state to the `status_message` field. - Adding a new option to the testing action nop to simulate skipping on pre_condition, so that we can easily test it. Implements: blueprint add-skip-actions Assisted-By: Cursor (claude-4-sonnet) Change-Id: I59cb4c7006c7c3bcc5ff2071886d3e2929800f9e Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>	2025-08-20 13:10:10 +02:00
Alfredo Moralejo	5048a6e3ba	Add `status_message` to objects and notifications This patch is part of the skipped action blueprint. It adds the `status_message` field to the Audit, ActionPlan and Action objects and all related notifications. It bumps the versions of all the affected objects and notifications and update the tests to include the new fields. Change-Id: I3b9467e7e37188e647379cd9c4cbbda8ed75383f Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>	2025-08-19 13:01:00 +02:00
Jaromir Wysoglad	8309d9848a	Add Aetos datasource Implement the spec for multi-tenancy support for metrics. This adds a new 'Aetos' datasource very similar to the current Prometheus datasource. Because of that, the original PrometheusHelper class was split into two classes and the base class is used for PrometheusHelper and for AetosHelper. Except for the split, there is one more change to the original PrometheusHelper class code, which is the addition and use of the _get_fqdn_label() and _get_instance_uuid_label() methods. As part of the change, I refactored the current prometheus datasource unit tests. Most of them are now used to test the PrometheusBase class with minimal changes. Changes I've made to the original tests: - the ones that can be used to test the base class are moved into the TestPrometheusBase class - the _setup_prometheus_client, _get_instance_uuid_label and _get_fqdn_label functions are mocked in the base class tests. Their concrete implementations are tested in each datasource tests separately. - a self._create_helper() is used to instantiate the helper class with correct mocking. - all config value modification in the original tests got moved out and instead of modifying the config values, the _get_* methods are mocked to return the wanted values - to keep similar test coverage, config retrieval is tested for each concrete class by testing the _get_* methods. New watcher-aetos-integration and watcher-aetos-integration-realdata zuul jobs are added to test the new datasource. These use the same set of tempest tests as the current watcher-prometheus-integration jobs. The only difference is the environment setup and the Watcher config, so that the job deploys Aetos and Watcher uses it instead of accessing Prometheus directly. At first this was generated by asking cursor to implement the linked spec with some additional prompts for some smaller changes.
Afterwards I manually went through the code doing some cleanups, ensuring it complies with PEP8 and hacking and so on. Later on I manually adjusted the code to use the latest observabilityclient changes. The zuul job was also mostly generated by cursor. Implements: https://blueprints.launchpad.net/watcher/+spec/prometheus-multitenancy-support Generated-By: Cursor with claude-4-sonnet model Change-Id: I72c2171f72819bbde6c9cbbf565ee895e5d2bd53 Signed-off-by: Jaromir Wysoglad <jwysogla@redhat.com>	2025-08-14 02:27:24 -04:00
Zuul	27baff5184	Merge "Extend decision engine to support threading mode"	2025-08-06 15:38:31 +00:00
Douglas Viroel	f879b10b05	Extend decision engine to support threading mode With the events of eventlet removal, Watcher will need to be adapted to support both modes, eventlet and threading, for a couple of releases before removing all eventlet code. This patch adds methods and classes that allow decision engine modules to create futurist thread pools instead of green thread pools, based on an environment variable that can be enabled by service. It moves continuous audit handler instance to decision engine service, so it can be started together with the main decision engine service. Adds an environment variable that allows the user to disable eventlet monkey patching and to use oslo.service threading backend. Change-Id: I8a8be0a7cebdc44005fd77ec960543828c7da318 Signed-off-by: Douglas Viroel <viroel@gmail.com>	2025-08-05 16:45:48 -03:00
Sean Mooney	20cd4a0394	Add comprehensive release liaison guide for DPL model Transform Nova's PTL guide into Watcher-specific release liaison documentation following the DPL governance model. This guide provides chronological guidance for release liaisons managing Watcher's cycle-with-intermediary release process. Key features: * DPL liaison coordination with proper precedence hierarchies * Watcher-specific project context and repository references * Enhanced FFE process with release liaison decision authority * Proper RST formatting with code blocks and cross-references * Comprehensive glossary of OpenStack release terminology * Usage guidance for both new and experienced release liaisons Adapts Nova's proven chronological structure while reflecting Watcher's distributed leadership model and technical requirements. Assisted-By: claude-code Change-Id: I133bb06e47c14deaca162a2bf024210f68d78ab2 Signed-off-by: Sean Mooney <work@seanmooney.info>	2025-07-21 16:34:47 +01:00
Zuul	bbe30f93f2	Merge "Update workload balance doc per review comments"	2025-07-03 19:57:05 +00:00
Zuul	93366df264	Merge "Add crosslinks to strategies table"	2025-06-30 13:02:28 +00:00
Ronelle Landy	6f72e33de5	Add crosslinks to strategies table These replace the full external links used previously. Change-Id: I9c79f7b7ddebaa25d243fdbe1eb422cba25de8f1	2025-06-27 16:54:38 -04:00
Ronelle Landy	56d0a0d6ea	Update workload balance doc per review comments The original documentation update review [1] had some additional comments for improvements. The commit adds the suggested changes. [1] https://review.opendev.org/c/openstack/watcher/+/951025 Change-Id: I4b4624e2dbc4c6a5f888ec77d6a03b8f66ff0a23	2025-06-27 16:46:17 -04:00
Ronelle Landy	de9eb2cd80	Add doc clarifications for Zone Migration Adds documentation clarifications on how the strategy and associated parameters are used. Closes-Bug: #2112480 Change-Id: Id42c280fc5744bebb01d50b52b834e5b3b76af73	2025-06-27 16:12:41 -04:00
Zuul	76de167171	Merge "Add Integrations doc page with support matrix"	2025-06-27 16:09:51 +00:00
Zuul	70032aa477	Merge "Add table - level of test/usage per strategy"	2025-06-27 16:01:31 +00:00
Zuul	16131e5cac	Merge "Update Workload Balance strategy documentation"	2025-06-27 13:36:50 +00:00
Ronelle Landy	bfbd136f4b	Update Host Maintenance strategy documentation Add clarifications to the documentation to reflect the actual strategy usage, including: - updating parameter descriptions - extending the 'How to Use' section Closes-Bug: #2111810 Change-Id: Ifd2876056cd8819c50658fb9f213246dc1546d42	2025-06-23 06:36:42 -04:00
Ronelle Landy	0599618add	Add table - level of test/usage per strategy This patch adds a table to the strategies page to show the level of qualification and where the strategy can be triggered. Change-Id: I6991566fd5fec3f8bbae06eefa63a8b83a87eed1	2025-06-11 14:19:42 -04:00
Ronelle Landy	f42cb8557b	Update Workload Balance strategy documentation Adds additional parameter and usage explanations and combined example. Closes-Bug: #2111848 Change-Id: Id0de4d56fa7083388ad82c61596e7484431d465b	2025-06-06 15:51:23 -04:00
Douglas Viroel	b788a67c52	Add Integrations doc page with support matrix Adds a new documentation section that describes which service integrations are currently supported and their integration status. This information is not clear today and will help to cover the lack of testing and documentation about them. Change-Id: I26b2a2ef5672b78a575a2bdaef3a08d5bbc063bd	2025-06-05 13:31:02 -03:00
jgilaber	2c76da2868	Make prometheus the default devstack example Change the devstack local.conf samples and devstack multinode contributor doc to demonstrate deploying watcher with prometheus as datasource instead of gnocchi. Keep the gnocchi as an alternative deployment example. Depends-On: https://review.opendev.org/c/openstack/watcher/+/946230 Depends-On: https://review.opendev.org/c/openstack/devstack-plugin-prometheus/+/946254 Change-Id: I721b550a03f9e5350a3f1ab10292faa1c50049a7	2025-04-24 16:06:50 +02:00
Alfredo Moralejo	a65e7e9b59	Query by fqdn_label instead of instance for host metrics Currently we are using `instance` label to query about host metrics to prometheus. This label is assigned to the url of each endpoint being scraped. While this works fine in one-exporter-per-compute cases as the driver is mapping the fqdn_label value to the `instance` label value, it fails when there is more than one target with the same value for the fqdn label. This is a valid case, to be able to query by fqdn without caring which exporter in the host is providing the metric. This patch is changing the queries we use for hosts to be based on the fqdn_label instead of the instance one. To implement it, we are also simplifying the way we check the metric exist for the host by converting prometheus_fqdn_instance_map into a prometheus_fqdn_labels set which stores the list of fqdn found in prometheus. Closes-Bug: #2103451 Change-Id: I3bcc317441b73da5c876e53edd4622370c6d575e	2025-03-19 15:25:24 +01:00
Zuul	4527f89d8d	Merge "Add support for instance metrics to prometheus datasource"	2025-02-03 13:22:28 +00:00
Zuul	e535177bc0	Merge "Remove ceilometer datasource"	2025-01-29 13:22:46 +00:00
Alfredo Moralejo	136e5d927c	Add support for instance metrics to prometheus datasource In order to support vm_workload_consolidation, workload_balance and workload_stabilization strategies some instance metrics are required. This patch is adding support for them. Implementation is based on a prometheus store populated using sg-core from ceilometer metrics with Pollster source. - instance_ram_usage: rely on ceilometer_memory_usage metrics created from ceilometer memory.usage meter. - instance_ram_allocated: rely on the memory value provided by the inventory created from nova and placement APIs. - instance_cpu_usage: rely on ceilometer_cpu metric created from ceilometer cpu meter. A max value of 100 is set in the query. - instance_root_disk_size: rely on the `disk` value provided by the inventory created from nova and placement APIs. A new parameter `instance_uuid_label` has been added to the prometheus datasource configuration to identify the label used to store the value of the OpenStack instance uuid for each instance metric in prometheus. Default value is `resource`. Change-Id: I2f2b56aa002014e511a5e48398ef1da43fc4f5e2	2025-01-23 13:23:04 +01:00
m	3f26dc47f2	Add prometheus data source for watcher decision engine This adds a new data source for the Watcher decision engine that implements the watcher.decision_engine.datasources.DataSourceBase. related spec was merged at [1]. Implements: blueprint prometheus-datasource [1] https://review.opendev.org/c/openstack/watcher-specs/+/933300 Change-Id: I6a70c4acc70a864c418cf347f5f6951cb92ec906	2025-01-10 15:20:37 +02:00
Takashi Kajinami	da23fdc621	Remove ceilometer datasource This datasource requires Ceilometer API which was already removed some years ago. The implementation should have been removed when dependency on ceilometerclient was removed by [1]. Also remove some job definitions which are not actually used. [1] `01d74d0a87` Change-Id: I29c3865dc1207f1bbbb266e4217cf8888afebfb6	2024-12-16 23:51:27 +09:00
Sean Mooney	1f8d06e075	[docs] apply sphinx-lint to docs This change corrects the detected sphinx-lint issues in the existing docs and updates the contributor devstack guide to call out required and advanced. Mostly the changes were simple fixes like replacing the configurable default rule with explicit literal syntax `term` -> ``term``. Some inline Note: comments have been promoted to .. note:: blocks and literal blocks :: have been promoted to .. code-block:: <language> directives. Change-Id: I6320c313d22bf542ad407169e6538dc6acf79901	2024-11-19 00:43:36 +00:00
Sean Mooney	5f79ab87c7	[pre-commit] fix typos and configure codespell This change enables codespell in pre-commit and fixes the existing typos. A followup commit will enable this in tox and ci. Change-Id: I0a11bcd5a88247a48d3437525fc8a3cb3cdd4e58	2024-11-07 19:50:21 +00:00
Takashi Kajinami	b5e45b43b9	Drop unnecessary 'x' bit from doc config file This file is not actually executable. Trivial-Fix Change-Id: I64352c3c5c6bfd5d08aa4cee873016e02d736a2e	2024-10-28 13:13:24 +00:00
Sean Mooney	9d8b990fd1	[pre-commit] Add initial pre-commit config This change adds configuration for the pre-commit tool, follow-up changes will address the remaining issues in a phased approach to make the reviews simpler. This is based on the pre-commit config used in nova with some additional hooks. Follow-up changes will address the FIXME comments related to sphinx-lint and codespell, as well as update tox to enforce these checks in ci. Change-Id: I87681a19f7fa88366c2b0d310c8b3153aa6a137b	2024-10-22 20:12:53 +01:00
Takashi Kajinami	566a830f64	Bump hacking hacking 3.0.x is quite old. Bump it to the current latest version. Change-Id: I8d87fed6afe5988678c64090af261266d1ca20e6	2024-09-22 23:54:36 +09:00
Lucian Petrut	424e9a76af	vm workload consolidation: use actual host metrics The "vm workload consolidation" strategy is summing up instance usage in order to estimate host usage. The problem is that some infrastructure services (e.g. OVS or Ceph clients) may also use a significant amount of resources, which would be ignored. This can impact Watcher's ability to detect overloaded nodes and correctly rebalance the workload. This commit will use the host metrics, if available. The proposed implementation uses the maximum value between the host metric and the sum of the instance metrics. Note that we're holding a dict of host metric deltas in order to account for planned migrations. Change-Id: I82f474ee613f6c9a7c0a9d24a05cba41d2f68edb	2023-10-27 21:54:42 +03:00
Lucian Petrut	00fea975e2	Handle deprecated "cpu_util" metric The "cpu_util" metric was deprecated a few years ago. We'll obtain the same result by converting the cumulative cpu time to a percentage, leveraging the rate of change aggregation. Change-Id: I18fe0de6f74c785e674faceea0c48f44055818fe	2023-10-24 10:47:23 +00:00
chenker	c7be34fbaa	update saving_energy docs Change-Id: I3b0c86911a8d32912c2de2e2392af9539b8d9be0	2023-02-07 10:27:54 +00:00
wangjiaqi07	c55143bc21	remove unicode from code Change-Id: I747445d482a2fb40c2f39139c5fd2a0cb26c27bc	2022-08-19 14:17:10 +08:00
Dantali0n	2414f66e38	Add watcher dashboard to devstack documentation Since installing watcher dashboard is fixed in devstack deployments we can update documentation so it recommends to install dashboard plugin. Change-Id: I284a1ec31536ea258cc1979ffd46b22d3e1ac18b	2021-07-09 10:37:28 +02:00
Ghanshyam Mann	863815153e	[goal] Deprecate the JSON formatted policy file As per the community goal of migrating the policy file format from JSON to YAML[1], we need to do two things: 1. Change the default value of '[oslo_policy] policy_file' config option from 'policy.json' to 'policy.yaml' with upgrade checks. 2. Deprecate the JSON formatted policy file on the project side via warning in doc and releasenotes. Also replace policy.json to policy.yaml ref from doc and tests. [1]https://governance.openstack.org/tc/goals/selected/wallaby/migrate-policy-format-from-json-to-yaml.html Change-Id: I207c02ba71fe60635fd3406c9c9364c11f259bae	2021-02-12 19:59:27 +00:00
xuanyandong	16a0486655	Remove six Replace the following items with Python 3 style code. - six.string_types - six.integer_types - six.moves - six.PY2 Implements: blueprint six-removal Change-Id: I2a0624bd4b455c7e5a0617f1253efa05485dc673	2020-09-30 16:25:13 +08:00
Andreas Jaeger	1ff940598f	Switch to newer openstackdocstheme and reno versions Switch to openstackdocstheme 2.2.1 and reno 3.1.0 versions. Using these versions will allow especially: * Linking from HTML to PDF document * Allow parallel building of documents * Fix some rendering problems Update Sphinx version as well. Set openstackdocs_pdf_link to link to PDF file. Note that the link to the published document only works on docs.openstack.org where the PDF file is placed in the top-level html directory. The site-preview places the PDF in a pdf directory. Set openstackdocs_auto_name to False to use 'project' variable as name. Change pygments_style to 'native' since old theme version always used 'native' and the theme now respects the setting and using 'sphinx' can lead to some strange rendering. Remove docs requirements from lower-constraints, they are not needed during install or test but only for docs building. openstackdocstheme renames some variables, so follow the renames before the next release removes them. A couple of variables are also not needed anymore, remove them. See also http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014971.html Change-Id: Ia9a3fb804fb59bb70edc150a3eb20c07a279170b	2020-05-21 15:15:16 +00:00

452 Commits