watcher

Author	SHA1	Message	Date
Douglas Viroel	f879b10b05	Extend decision engine to support threading mode With the events of eventlet removal, Watcher will need to be adapted to support both modes, eventlet and threading, for a couple of releases before removing all eventlet code. This patch adds methods and classes that allow decision engine modules to create futurist thread pools instead of green thread pools, based on an environment variable that can be enabled by service. It moves the continuous audit handler instance to the decision engine service, so it can be started together with the main decision engine service. Adds an environment variable that allows the user to disable eventlet monkey patching and to use oslo.service threading backend. Change-Id: I8a8be0a7cebdc44005fd77ec960543828c7da318 Signed-off-by: Douglas Viroel <viroel@gmail.com>	2025-08-05 16:45:48 -03:00
Zuul	e64709ea08	Merge "Add warning message for experimental integrations"	2025-07-03 17:27:39 +00:00
Alfredo Moralejo	6ea362da0b	Use KiB as unit for host_ram_usage when using prometheus datasource The prometheus datasource was reporting host_ram_usage in MiB as described in the docstring for the base datasource interface definition [1]. However, the gnocchi datasource is reporting it in KiB following ceilometer metric `hardware.memory.used` [2] and the strategies using that metric expect it to be in KiB, so the best approach is to change the unit in the prometheus datasource and update the docstring to avoid misunderstandings in the future. So, this patch is fixing the prometheus datasource to return host_ram_usage in KiB instead of MiB. Additionally, it is adding more unit tests for the check_threshold method so that it covers the memory based strategy execution, validates the calculated standard deviation and adds the cases where it is below the threshold. [1] `15981117ee/watcher/decision_engine/datasources/base.py (L177-L183)` [2] https://docs.openstack.org/ceilometer/train/admin/telemetry-measurements.html#snmp-based-meters Closes-Bug: #2113776 Change-Id: Idc060d1e709c0265c64ada16062c3a206c6b04fa	2025-06-19 16:25:27 +02:00
Douglas Viroel	520ec0b79b	Add warning message for experimental integrations Some services integrations are now classified as experimental and a warning message will now appear once a client is created for them. These integrations are not fully tested in CI and miss a documentation on how they work or should be used. A release note was added to inform users about the status of these integrations and related features. Change-Id: Ib7d0ac0b3e187ae239dfa075fb53a6c0107dff29	2025-06-07 11:33:28 -03:00
Zuul	73f8728d22	Merge "Fix audit creation with no name and no goal or audit_template"	2025-06-05 13:39:38 +00:00
Alfredo Moralejo	bf6a28bd1e	Fix audit creation with no name and no goal or audit_template Currently, in that case it was failing because watcher tried to create a name based on a goal automatically and the goal is not defined. This patch is moving the check for goal specification in the audit creation call earlier, and if there is no goal defined, it returns an invalid call error. This patch is also modifying the existing error for this case to check the expected behavior. Closes-Bug: #2110947 Change-Id: I6f3d73b035e8081e86ce82c205498432f0e0fc33	2025-06-04 14:46:36 +02:00
Zuul	58b25101e6	Merge "Return HTTP code 400 when creating an audit with wrong parameters"	2025-05-27 19:23:25 +00:00
Zuul	20f231054a	Merge "Set actionplan state to FAILED if any action has failed"	2025-05-26 14:44:37 +00:00
Alfredo Moralejo	88d81c104e	Set actionplan state to FAILED if any action has failed Currently, an actionplan state is set to SUCCEEDED once the execution has finished, but that does not imply that all the actions finished successfully. This patch is checking the actual state of all the actions in the plan after the execution has finished. If any action has status FAILED, it will set the state of the action plan as FAILED and will apply the appropriate notification parameters. This is the expected behavior according to Watcher documentation. The patch is also fixing the unit test for this to set the expected action plan state to FAILED and notification parameters. Closes-Bug: #2106407 Change-Id: I7bfc6759b51cd97c26ec13b3918bd8d3b7ac9d4e	2025-05-26 14:58:03 +02:00
Zuul	26e36e1620	Merge "Handle missing dst_node parameter in zone_migration"	2025-05-20 17:14:29 +00:00
Zuul	3585e0cc3e	Merge "Drop code from Host maintenance strategy migrating instance to disabled hosts"	2025-05-16 18:18:26 +00:00
jgilaber	c6302edeca	Handle missing dst_node parameter in zone_migration For compute nodes, nova works fine if a destination node is not specified, so this change makes sure we're not passing None when the user does not set one to avoid an error. Partial-Bug: 2108988 Change-Id: Ida1f18b97697c041819e29f935aa5e232848226a	2025-05-16 13:51:47 +02:00
Alfredo Moralejo	4629402f38	Return HTTP code 400 when creating an audit with wrong parameters Currently, when trying to create an audit which misses a mandatory parameter, watcher returns error 500 instead of 400, which is the documented error in the API [1] and the appropriate error code for malformed requests. This patch catches parameter validation errors according to the json schema for each strategy and returns error 400. It also fixes the unit test to validate the expected behavior. [1] https://docs.openstack.org/api-ref/resource-optimization/#audits Closes-Bug: #2110538 Change-Id: I23232b3b54421839bb01d54386d4e7b244f4e2a0	2025-05-16 09:35:50 +02:00
Zuul	86a260a2c7	Merge "Set keystone_client default interface to public"	2025-05-15 12:45:52 +00:00
Chandan Kumar (raukadah)	9dea55bd64	Drop code from Host maintenance strategy migrating instance to disabled hosts Currently the host maintenance strategy also migrates instances from the maintenance node to watcher_disabled compute nodes. watcher_disabled compute nodes might be disabled for some other purpose by a different strategy. If host maintenance uses those compute nodes for migration, it might affect customer workloads. The host maintenance strategy should never touch disabled hosts unless the user specifies a disabled host as backup node. This change drops the logic for using disabled compute nodes for maintenance. Host maintenance is already using the nova scheduler for migrating the instance and will keep using it. If there is no available node, the strategy will fail. Closes-Bug: #2109945 Change-Id: If9795fd06f684eb67d553405cebd8a30887c3997 Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>	2025-05-14 09:24:25 +05:30
Douglas Viroel	17d1cf535a	Deprecated Noisy Neighbor strategy Noisy neighbor strategy is a proof of concept strategy that was built based on LLC metric, which is not available in Nova since Victoria release[1]. This patch marks this strategy as deprecated, to be removed in future releases. [1] https://docs.openstack.org/releasenotes/nova/victoria.html#relnotes-22-0-0-unmaintained-victoria-upgrade-notes Change-Id: I940b88555007312c76a86706bd44a38fbcf7701e	2025-05-12 15:44:39 -03:00
jgilaber	ae48f65f20	Set keystone_client default interface to public Set the default interface for keystone_client to public in the watcher conf instead of admin. Closes-Bug: 2109494 Change-Id: I9e0289249981ca965190df6dbdc37e09fd0951d7	2025-05-09 08:16:51 +02:00
Sean Mooney	57b248f9fe	Add support for pyproject.toml and wsgi module paths pip 23.1 removed the "setup.py install" fallback for projects that do not have pyproject.toml and now uses a pyproject.toml which is vendored in pip [1][2]. pip 24.2 has now deprecated a similar fallback to "setup.py develop" and plans to fully remove this in pip 25.0 [3][4][5]. pbr has supported editable installs since 6.0.0. pip 25.1 has now been released and the removal is complete. This change adds our own minimal pyproject.toml to ensure we are using the correct build system. This change also requires that we adapt how we generate our wsgi entry point. When pyproject.toml is used, the wsgi console script is not generated in an editable install such as is used in devstack. To address this we need to refactor our usage of our wsgi application to use a module path instead. This change does not remove the declaration of our wsgi_script entry point, but it should be considered deprecated and it will be removed in the future. To unblock the gate, the devstack plugin is modified to deploy using the wsgi module instead of the console script. Finally, support for the mod_wsgi wsgi mode is removed. That was deprecated in devstack a few cycles ago and support was removed in I8823e98809ed6b66c27dbcf21a00eea68ef403e8. [1] https://pip.pypa.io/en/stable/news/#v23-1 [2] https://github.com/pypa/pip/issues/8368 [3] https://pip.pypa.io/en/stable/news/#v24-2 [4] https://github.com/pypa/pip/issues/11457 [5] https://ichard26.github.io/blog/2024/08/whats-new-in-pip-24.2/ Closes-Bug: #2109608 Depends-on: https://review.opendev.org/c/openstack/watcher/+/948502 Change-Id: Iad77939ab0403c5720c549f96edfc77d2b7d90ee	2025-05-01 00:19:59 +00:00
Alfredo Moralejo	a65e7e9b59	Query by fqdn_label instead of instance for host metrics Currently we are using the `instance` label to query prometheus about host metrics. This label is assigned to the url of each endpoint being scraped. While this works fine in one-exporter-per-compute cases, as the driver is mapping the fqdn_label value to the `instance` label value, it fails when there is more than one target with the same value for the fqdn label. This is a valid case, to be able to query by fqdn and not care about which exporter in the host is providing the metric. This patch is changing the queries we use for hosts to be based on the fqdn_label instead of the instance one. To implement it, we are also simplifying the way we check that the metric exists for the host by converting prometheus_fqdn_instance_map into a prometheus_fqdn_labels set which stores the list of fqdn found in prometheus. Closes-Bug: #2103451 Change-Id: I3bcc317441b73da5c876e53edd4622370c6d575e	2025-03-19 15:25:24 +01:00
Sean Mooney	bbf5c41cab	Add epoxy prelude This change adds the prelude for the 2025.1 Epoxy release cycle. Change-Id: I8223842a57491a91c565e47bd1819db4d142e628	2025-03-05 17:57:55 +00:00
Takashi Kajinami	977f014cba	Deprecate Monasca data source The Monasca project was marked inactive during 2023.1. Although we have seen multiple people showing interest to keep the project, we haven't seen any real progress. Because the project is likely retired soon, let's deprecate the feature dependent on Monasca so that we can remove it in a future release. Change-Id: Ifd64f5ba59bbac238ff62302ec36a3e36954d6d0	2025-02-16 18:45:31 +09:00
Zuul	4527f89d8d	Merge "Add support for instance metrics to prometheus datasource"	2025-02-03 13:22:28 +00:00
Zuul	e535177bc0	Merge "Remove ceilometer datasource"	2025-01-29 13:22:46 +00:00
Alfredo Moralejo	136e5d927c	Add support for instance metrics to prometheus datasource In order to support the vm_workload_consolidation, workload_balance and workload_stabilization strategies, some instance metrics are required. This patch is adding support for them. Implementation is based on a prometheus store populated using sg-core from ceilometer metrics with Pollster source. - instance_ram_usage: rely on ceilometer_memory_usage metrics created from ceilometer memory.usage meter. - instance_ram_allocated: rely on the memory value provided by the inventory created from nova and placement APIs. - instance_cpu_usage: rely on ceilometer_cpu metric created from ceilometer cpu meter. A max value of 100 is set in the query. - instance_root_disk_size: rely on the `disk` value provided by the inventory created from nova and placement APIs. A new parameter `instance_uuid_label` has been added to the prometheus datasource configuration to identify the label used to store the value of the OpenStack instance uuid for each instance metric in prometheus. Default value is `resource`. Change-Id: I2f2b56aa002014e511a5e48398ef1da43fc4f5e2	2025-01-23 13:23:04 +01:00
m	3f26dc47f2	Add prometheus data source for watcher decision engine This adds a new data source for the Watcher decision engine that implements the watcher.decision_engine.datasources.DataSourceBase. related spec was merged at [1]. Implements: blueprint prometheus-datasource [1] https://review.opendev.org/c/openstack/watcher-specs/+/933300 Change-Id: I6a70c4acc70a864c418cf347f5f6951cb92ec906	2025-01-10 15:20:37 +02:00
Zuul	70ba13ca6d	Merge "Update python versions, drop py3.8"	2024-12-21 01:58:27 +00:00
Takashi Kajinami	da23fdc621	Remove ceilometer datasource This datasource requires Ceilometer API which was already removed some years ago. The implementation should have been removed when dependency on ceilometerclient was removed by [1]. Also remove some job definitions which are not actually used. [1] `01d74d0a87` Change-Id: I29c3865dc1207f1bbbb266e4217cf8888afebfb6	2024-12-16 23:51:27 +09:00
Sean Mooney	5f79ab87c7	[pre-commit] fix typos and configure codespell This change enables codespell in pre-commit and fixes the existing typos. A followup commit will enable this in tox and ci. Change-Id: I0a11bcd5a88247a48d3437525fc8a3cb3cdd4e58	2024-11-07 19:50:21 +00:00
Martin Kopec	6adaedf696	Update python versions, drop py3.8 The current testing runtime [1] states testing from py3.9 to 3.12. The patch updates setup.cfg to reflect the correct python versions. The patch also drops python 3.8 support following [2]. [1] https://governance.openstack.org/tc/reference/runtimes/2025.1.html [2] https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/FOWV4UQZTH4DPDA67QDEROAESYU5Z3LE/ Change-Id: I2d13409c9bfffc866e31af52611a26f6037021cc	2024-11-06 16:00:11 +01:00
Sean Mooney	9d8b990fd1	[pre-commit] Add initial pre-commit config This change adds configuration for the pre-commit tool, follow-up changes will address the remaining issues in a phased approach to make the reviews simpler. This is based on the pre-commit config used in nova with some additional hooks. Follow-up changes will address the FIXME comments related to sphinx-lint and codespell, as well as update tox to enforce these checks in ci. Change-Id: I87681a19f7fa88366c2b0d310c8b3153aa6a137b	2024-10-22 20:12:53 +01:00
Ghanshyam Mann	863815153e	[goal] Deprecate the JSON formatted policy file As per the community goal of migrating the policy file format from JSON to YAML [1], we need to do two things: 1. Change the default value of the '[oslo_policy] policy_file' config option from 'policy.json' to 'policy.yaml' with upgrade checks. 2. Deprecate the JSON formatted policy file on the project side via warning in doc and releasenotes. Also replace policy.json references with policy.yaml in doc and tests. [1] https://governance.openstack.org/tc/goals/selected/wallaby/migrate-policy-format-from-json-to-yaml.html Change-Id: I207c02ba71fe60635fd3406c9c9364c11f259bae	2021-02-12 19:59:27 +00:00
Zuul	2591b03625	Merge "Add releasenote for event-driven-optimization-based"	2020-02-13 07:05:04 +00:00
licanwei	58083bb67b	releasenotes: Fix reference url Change-Id: I0da6021f6d39cb7d6e79e8f637046d8dd0285647	2020-02-05 16:48:49 +08:00
licanwei	f79321ceeb	Add releasenote for event-driven-optimization-based Change-Id: If8fa82dab2e7f0ae359805eb68cc8562cfc641e3 Implements: blueprint event-driven-optimization-based	2020-02-04 03:46:32 +00:00
Dantali0n	ba43f766b8	Releasenote for decision engine threadpool Add the releasenote for the general purpose decision engine threadpool. Including config parameters and how contributors can find relevant documentation. Implements: blueprint general-purpose-decision-engine-threadpool Change-Id: I3560069b4e34f13305950559a0f05f7921f7867e	2019-11-30 03:13:15 +00:00
Ghanshyam Mann	17f5a65a62	[ussuri][goal] Drop python 2.7 support and testing OpenStack is dropping the py2.7 support in ussuri cycle. Watcher is ready with python 3 and ok to drop the python 2.7 support. Complete discussion & schedule can be found in - http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010142.html - https://etherpad.openstack.org/p/drop-python2-support Ussuri Community-wide goal: https://governance.openstack.org/tc/goals/selected/ussuri/drop-py27.html Depends-On: https://review.opendev.org/#/c/693631/ Change-Id: I603c6d2c22779e8ef2e70eb6369fc521a77c9c3a	2019-11-16 14:55:01 +00:00
licanwei	a88e076646	Watcher planner selector releasenote Change-Id: I632a59d9e3cb6f5d0dad8987b1b01934d9ce0b42 Implements: bp watcher-planner-selector	2019-09-18 01:59:01 -07:00
Zuul	67e9e16d62	Merge "node resource consolidation"	2019-09-16 14:44:50 +00:00
chenke	03a6216da0	Add releasenote about bp show-datamodel-api Partially Implements:blueprint show-datamodel-api Change-Id: I2f8a41cd8f9f805bd3796cbd639bec233546b521	2019-09-10 09:39:10 +08:00
licanwei	f1fe4b6c62	node resource consolidation This strategy is used to centralize VMs to as few nodes as possible by VM migration. The user can set an input parameter to decide how to select the destination node. Implements: blueprint node-resource-consolidation Closes-Bug: #1843016 Change-Id: I104c864d532c2092f5dc6f0c8f756ebeae12f09e	2019-09-06 18:03:43 -07:00
licanwei	4b2238f9a5	add releasenote for bp improve-compute-data-model Change-Id: I19780be28912cb0ea1cad49c7c0f43ab3ba8f6e7 Implements: blueprint improve-compute-data-model	2019-08-09 03:06:43 +00:00
Dantali0n	cadc000f32	Add call_retry for ModelBuilder for error recovery Add call_retry method for ModelBuilder classes along with configuration options. This allows ModelBuilder classes to reattempt any failed calls to external services such as Nova or Ironic. Change-Id: Ided697adebed957e5ff13b4c6b5b06c816f81c4a	2019-07-19 16:09:18 +02:00
Dantali0n	a45f5abe48	Releasenote for grafana datasource This is the releasenote for the new grafana datasource; it refers to the documentation on configuring grafana. Depends-on: Ib12b6a7882703e84a27c301e821c1a034b192508 Change-Id: Icb3939d772f06ad2d66eeba9a59fa8b60822ece0	2019-07-02 10:20:49 +02:00
licanwei	c1a5e443fe	Add uWSGI support This patch implements uWSGI support for the Watcher API service. Because mod_wsgi is deprecated, uwsgi is used to replace mod_wsgi. Most OpenStack projects have already made this change. Closes-Bug: #1834392 Change-Id: I3fad8d30a15aba493fb91da9337c2515ddea5167	2019-06-27 14:56:52 +08:00
Zuul	667d2d661a	Merge "Move datasource query_retry into baseclass."	2019-06-14 09:04:15 +00:00
Zuul	b2111baf91	Merge "Backwards compatibility for node parameter"	2019-06-14 07:42:09 +00:00
Zuul	5f126cffe0	Merge "Add Placement helper"	2019-06-14 02:21:44 +00:00
Dantali0n	584eeefdc8	Move datasource query_retry into baseclass. Moves the query_retry method into the baseclass and makes the query retry and timeout options part of the watcher_datasources config group. This makes the query_retry behavior uniform across all datasources. A new baseclass method named query_retry_reset is added so datasources can define operations to perform when recovering from a query error. Test cases are added to verify the behavior of query_retry. The query_max_retries and query_timeout config parameters are deprecated in the gnocchi_client group and will be removed in a future release. Change-Id: I33e9dc2d1f5ba8f83fcf1488ff583ca5be5529cc	2019-06-13 15:52:53 +02:00
Dantali0n	dd119ca1f8	Backwards compatibility for node parameter Adds backwards compatibility for node parameter used by strategies. If the node value is set by the user configuration it will override the value for compute_node which is the value used by the strategies now. This change was introduced in: https://review.opendev.org/#/c/656622/ Resolution discussed in the meeting on the 5th of June 2019 https://eavesdrop.openstack.org/meetings/watcher/2019/watcher.2019-06-05-08.00.log.html Change-Id: Idaea062789a6b169e64f556fecc34cfbaaee5076	2019-06-12 16:23:58 +00:00
licanwei	b57feba5e8	Add Placement helper This patch adds a Placement helper to Watcher. We plan to improve the data model and strategies in future specs. Change-Id: I7141459eef66557cd5d525b5887bd2a381cdac3f Implements: blueprint support-placement-api	2019-06-12 11:11:13 +08:00


110 Commits