===============================
Workload Stabilization Strategy
===============================

Synopsis
--------

**display name**: ``Workload stabilization``

**goal**: ``workload_balancing``

.. watcher-term:: watcher.decision_engine.strategy.strategies.workload_stabilization.WorkloadStabilization

Requirements
------------

Metrics
*******

The *workload_stabilization* strategy requires the following metrics:

============================ ==================================================
metric                       description
============================ ==================================================
``instance_ram_usage``       RAM usage of an instance, as a float in
                             megabytes
``instance_cpu_usage``       CPU usage of an instance, as a float between
                             0 and 100 representing the total CPU usage as
                             a percentage
``host_ram_usage``           RAM usage of a compute node, as a float in
                             megabytes
``host_cpu_usage``           CPU usage of a compute node, as a float
                             between 0 and 100 representing the total CPU
                             usage as a percentage
============================ ==================================================

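
The metrics above use different units (percentages and megabytes), while the
thresholds described under Configuration apply to usage normalized to a value
between 0 and 1. The following is a minimal sketch of such a normalization,
not the strategy's actual code. The function names, the sample values and the
``ram_total_mb`` input are assumptions made for illustration only.

.. code-block:: python

    def normalize_cpu(cpu_percent):
        """Map a CPU usage percentage (0 to 100) onto the range 0 to 1."""
        return cpu_percent / 100.0


    def normalize_ram(ram_used_mb, ram_total_mb):
        """Map RAM usage in megabytes onto the range 0 to 1.

        ``ram_total_mb`` is a hypothetical input: the metrics table does not
        list a total-memory metric, so the caller supplies the capacity.
        """
        return ram_used_mb / float(ram_total_mb)


    # Hypothetical sample values for one compute node.
    host = {"host_cpu_usage": 25.0, "host_ram_usage": 8192.0}
    normalized = {
        "host_cpu_usage": normalize_cpu(host["host_cpu_usage"]),
        "host_ram_usage": normalize_ram(host["host_ram_usage"], 32768.0),
    }
    print(normalized)
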
Cluster data model
******************

Watcher's default Compute cluster data model:

.. watcher-term:: watcher.decision_engine.model.collector.nova.NovaClusterDataModelCollector

Actions
*******

Watcher's default actions:

.. list-table::
   :widths: 30 30
   :header-rows: 1

   * - action
     - description
   * - ``migration``
     - .. watcher-term:: watcher.applier.actions.migration.Migrate

Planner
*******

Watcher's default planner:

.. watcher-term:: watcher.decision_engine.planner.weight.WeightPlanner

Configuration
-------------

Strategy parameters are:

====================== ====== =================== =============================
parameter              type   default value       description
====================== ====== =================== =============================
``metrics``            array  |metrics|           Metrics used to measure the
                                                  cluster load.
``thresholds``         object |thresholds|        Dict where each key is a
                                                  metric and each value is a
                                                  trigger value. The strategy
                                                  only looks for an action plan
                                                  when the standard deviation
                                                  of the usage of one of the
                                                  resources in ``metrics``,
                                                  normalized to a value between
                                                  0 and 1 across the hosts, is
                                                  higher than the threshold.
                                                  For a perfectly balanced
                                                  cluster the standard
                                                  deviation is 0; for a totally
                                                  unbalanced one it is 0.5,
                                                  which is the maximum value.
``weights``            object |weights|           Weights used to calculate the
                                                  common standard deviation
                                                  when optimizing resource
                                                  usage. Each weight name is
                                                  the metric name with a
                                                  ``_weight`` suffix. Higher
                                                  values give the metric more
                                                  priority when calculating the
                                                  optimal resulting cluster
                                                  distribution.
``instance_metrics``   object |instance_metrics|  Maps each instance metric in
                                                  ``metrics`` to the compute
                                                  node metric that measures
                                                  usage of the same resource.
``host_choice``        string retry               Method used to choose
                                                  candidate destination hosts
                                                  for instances: ``cycle``,
                                                  ``retry`` or ``fullsearch``.
                                                  ``cycle`` iterates over the
                                                  hosts in a cycle. ``retry``
                                                  picks hosts at random (the
                                                  count is set by
                                                  ``retry_count``).
                                                  ``fullsearch`` returns every
                                                  host in the list.
``retry_count``        number 1                   Number of randomly chosen
                                                  hosts returned by ``retry``.
``periods``            object |periods|           Time window, in seconds, used
                                                  to collect statistics of
                                                  resource usage for instance
                                                  and host metrics. Watcher
                                                  uses the most recent period
                                                  to calculate resource usage.
``granularity``        number 300                 NOT RECOMMENDED TO MODIFY:
                                                  The time between two measures
                                                  in an aggregated time series
                                                  of a metric.
``aggregation_method`` object |aggn_method|       NOT RECOMMENDED TO MODIFY:
                                                  Function used to aggregate
                                                  multiple measures into an
                                                  aggregated value.
====================== ====== =================== =============================

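
The ``thresholds`` check can be illustrated with a short calculation. The
sketch below assumes that a population standard deviation is taken over one
normalized load value per host. The function name and the sample loads are
hypothetical. The two extreme cases reproduce the values quoted in the table:
0 for a perfectly balanced cluster and 0.5 for a totally unbalanced one.

.. code-block:: python

    import statistics


    def exceeds_threshold(normalized_loads, threshold):
        """Return True when the spread of host loads is above the threshold.

        ``normalized_loads`` holds one value between 0 and 1 per host.
        Using the population standard deviation is an assumption.
        """
        return statistics.pstdev(normalized_loads) > threshold


    balanced = [0.5, 0.5, 0.5, 0.5]    # every host equally loaded
    unbalanced = [0.0, 1.0, 0.0, 1.0]  # half idle, half saturated

    print(statistics.pstdev(balanced))    # 0.0
    print(statistics.pstdev(unbalanced))  # 0.5
    print(exceeds_threshold([0.2, 0.3, 0.8, 0.7], 0.2))

With the default threshold of 0.2, the last call reports that the sample
cluster is unbalanced enough for the strategy to look for an action plan.
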
.. |metrics| replace:: ["instance_cpu_usage", "instance_ram_usage"]
.. |thresholds| replace:: {"instance_cpu_usage": 0.2, "instance_ram_usage": 0.2}
.. |weights| replace:: {"instance_cpu_usage_weight": 1.0, "instance_ram_usage_weight": 1.0}
.. |instance_metrics| replace:: {"instance_cpu_usage": "host_cpu_usage", "instance_ram_usage": "host_ram_usage"}
.. |periods| replace:: {"instance": 720, "node": 600}
.. |aggn_method| replace:: {"instance": "mean", "compute_node": "mean"}

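
The role of ``weights`` can be sketched as follows. Combining the per-metric
standard deviations as a weighted sum is an assumption made for this example,
as are the function name and the sample deviations. Only the ``_weight``
naming rule and the default weights come from the table above.

.. code-block:: python

    def weighted_sd(sd_by_metric, weights):
        """Combine per-metric standard deviations into one weighted value.

        A weighted sum is an assumption made for this sketch.
        """
        total = 0.0
        for metric, sd in sd_by_metric.items():
            # Weight names are the metric name plus the "_weight" suffix.
            total += sd * float(weights[metric + "_weight"])
        return total


    weights = {
        "instance_cpu_usage_weight": 1.0,
        "instance_ram_usage_weight": 1.0,
    }
    # Hypothetical standard deviations for each metric.
    sd_by_metric = {"instance_cpu_usage": 0.25, "instance_ram_usage": 0.125}
    print(weighted_sd(sd_by_metric, weights))

Raising one weight makes an imbalance in that metric count for more in the
combined value, which is how a metric is prioritized.
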
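
The three ``host_choice`` methods differ only in which candidate hosts they
return. The sketch below is one possible reading of the descriptions in the
table, not the strategy's implementation. The function name, the ``start``
cursor and the host names are hypothetical, and treating ``cycle`` as a
rotation of the host list is an assumption.

.. code-block:: python

    import itertools
    import random


    def candidate_hosts(hosts, host_choice="retry", retry_count=1, start=0):
        """Return destination candidates for one instance.

        ``start`` is a hypothetical cursor used only to illustrate ``cycle``.
        """
        hosts = list(hosts)
        if host_choice == "fullsearch":
            # Every host in the list is a candidate.
            return hosts
        if host_choice == "retry":
            # Pick ``retry_count`` distinct hosts at random.
            return random.sample(hosts, min(retry_count, len(hosts)))
        if host_choice == "cycle":
            # Walk the hosts in a cycle, beginning at ``start``.
            rotated = itertools.islice(
                itertools.cycle(hosts), start, start + len(hosts))
            return list(rotated)
        raise ValueError("unknown host_choice: %s" % host_choice)


    hosts = ["node-1", "node-2", "node-3"]
    print(candidate_hosts(hosts, "fullsearch"))
    print(candidate_hosts(hosts, "cycle", start=1))
    print(candidate_hosts(hosts, "retry", retry_count=2))
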
Efficacy Indicator
------------------

Global efficacy indicator:

.. watcher-func::
  :format: literal_block

  watcher.decision_engine.goal.efficacy.specs.WorkloadBalancing.get_global_efficacy_indicator

Other efficacy indicators of the goal are:

- ``instance_migrations_count``: The number of VM migrations to be performed
- ``instances_count``: The total number of instances audited by the strategy
- ``standard_deviation_after_audit``: The resulting standard deviation after
  the audit
- ``standard_deviation_before_audit``: The original standard deviation before
  the audit

Algorithm
---------

A description of the overload algorithm and the role of standard deviation is
available in the strategy specification:
https://specs.openstack.org/openstack/watcher-specs/specs/newton/implemented/sd-strategy.html

How to use it?
--------------

.. code-block:: shell

    $ openstack optimize audittemplate create \
      at1 workload_balancing --strategy workload_stabilization

    $ openstack optimize audit create -a at1 \
      -p thresholds='{"instance_ram_usage": 0.05}' \
      -p metrics='["instance_ram_usage"]'

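
Strategy parameters are passed as JSON, which is easy to misquote by hand.
The helper below is a hypothetical convenience, not part of Watcher. It
builds the same ``audit create`` command from Python values, using only the
``-a`` and ``-p`` options shown above. The quoting it produces is
shell-equivalent to the example, although the quotes are placed differently.

.. code-block:: python

    import json
    import shlex


    def audit_create_command(template, **parameters):
        """Build the ``openstack optimize audit create`` command line.

        Each keyword argument becomes one ``-p name=<JSON>`` option.
        """
        argv = ["openstack", "optimize", "audit", "create", "-a", template]
        for name, value in parameters.items():
            argv += ["-p", "%s=%s" % (name, json.dumps(value))]
        # Quote every argument so the result is safe to paste into a shell.
        return " ".join(shlex.quote(arg) for arg in argv)


    print(audit_create_command(
        "at1",
        thresholds={"instance_ram_usage": 0.05},
        metrics=["instance_ram_usage"],
    ))
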
External Links
--------------

None