Merge "Update Workload Balance strategy documentation"

2025-06-27 13:36:50 +00:00
parent bfbd136f4b f42cb8557b
commit 16131e5cac
2 changed files with 72 additions and 24 deletions
--- a/doc/source/strategies/workload_balance.rst
+++ b/doc/source/strategies/workload_balance.rst
@@ -11,25 +11,35 @@ Synopsis
    .. watcher-term:: watcher.decision_engine.strategy.strategies.workload_balance.WorkloadBalance
 Requirements
 ------------
 None.
 Metrics
 *******
 The *workload_balance* strategy requires the following metrics:
-======================= ============ ======= =========================
-metric                  service name plugins comment
-======================= ============ ======= =========================
-``cpu``                 ceilometer_  none
-``memory.resident``     ceilometer_  none
-======================= ============ ======= =========================
+======================= ============ ======= =========== ======================
+metric                  service name plugins unit        comment
+======================= ============ ======= =========== ======================
+``cpu``                 ceilometer_  none    percentage  CPU of the instance.
+                                                         Used to calculate the
+                                                         threshold
 ``memory.resident``     ceilometer_  none    MB          RAM of the instance.
                                                         Used to calculate the
                                                         threshold
 ======================= ============ ======= =========== ======================
 .. _ceilometer: https://docs.openstack.org/ceilometer/latest/admin/telemetry-measurements.html#openstack-compute
 **Notes**
 * The parameters above reference the instance CPU or RAM usage, but
  the threshold calculation is based on the CPU/RAM usage on the hypervisor.
 * The RAM usage can be calculated based on the RAM consumed by the instance,
  and the available RAM on the hypervisor.
 * The CPU percentage calculation relies on the CPU load, but also on the number
  of CPUs on the hypervisor.
 * The memory host metric is calculated by summing the RAM usage of each
  instance on the host. This measure is close to the real usage, but is not
  the exact usage on the host.
 Cluster data model
 ******************
@@ -64,16 +74,28 @@ Configuration
 Strategy parameters are:
-============== ====== ==================== ====================================
-parameter      type   default Value        description
-============== ====== ==================== ====================================
-``metrics``    String 'instance_cpu_usage' Workload balance base on cpu or ram
-                                           utilization. Choices:
-                                           ['instance_cpu_usage',
-                                           'instance_ram_usage']
-``threshold``  Number 25.0                 Workload threshold for migration
-``period``     Number 300                  Aggregate time period of ceilometer
-============== ====== ==================== ====================================
+================ ====== ==================== ==================================
+parameter        type   default value        description
+================ ====== ==================== ==================================
+``metrics``      String 'instance_cpu_usage' Workload balance based on CPU or
+                                             RAM utilization. Choices:
+                                             ['instance_cpu_usage',
+                                             'instance_ram_usage']
+``threshold``    Number 25.0                 Workload threshold for migration.
+                                             Used for both the source and the
+                                             destination calculations.
                                             Threshold is always a percentage.
 ``period``       Number 300                  Aggregate time period of
                                             ceilometer
 ``granularity``  Number 300                  The time between two measures in
                                             an aggregated timeseries of a
                                             metric.
                                             This parameter is only used
                                             with the Gnocchi data source,
                                              and it must match one of the
                                             valid archive policies for the
                                             metric.
 ================ ====== ==================== ==================================
 Efficacy Indicator
 ------------------
@@ -89,14 +111,36 @@ to: https://specs.openstack.org/openstack/watcher-specs/specs/mitaka/implemented
 How to use it ?
 ---------------
 Create an audit template using the Workload Balancing strategy.
 .. code-block:: shell
    $ openstack optimize audittemplate create \
      at1 workload_balancing --strategy workload_balance
 Run an audit using the Workload Balance strategy where
 the aim is to get a plan to move VMs from any host where the
 CPU usage is over the threshold of 26%, to a host where the
 utilization of CPU is under the threshold.
 The measurements of CPU utilization are taken from Ceilometer
 with an aggregate period of 310 seconds.
 .. code-block:: shell
    $ openstack optimize audit create -a at1 -p threshold=26.0 \
            -p period=310 -p metrics=instance_cpu_usage
 Run an audit using the Workload Balance strategy to
 obtain a plan to balance VMs over hosts with a threshold of 20%.
 In this case, the Ceilometer CPU utilization measurements are
 selected using a combination of period and granularity.
 .. code-block:: shell
    $ openstack optimize audit create -a at1 \
           -p granularity=30 -p threshold=20 -p period=300 \
           -p metrics=instance_cpu_usage --auto-trigger
 External Links
 --------------
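
The notes added to the Metrics section describe how per-instance measurements are turned into a host-level percentage. A minimal Python sketch of that arithmetic follows. It is an illustration under stated assumptions, not the strategy's code: the `Instance` and `Host` classes and all function names are hypothetical, and the CPU formula (weighting each instance's CPU percentage by its vCPU count, then dividing by the host's CPU count) is one plausible reading of "relies on the CPU load, but also on the number of CPUs on the hypervisor".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """Hypothetical stand-in for an instance and its two metrics."""
    vcpus: int
    cpu_percent: float  # the ``cpu`` metric, 0-100
    ram_mb: float       # the ``memory.resident`` metric, in MB


@dataclass
class Host:
    """Hypothetical stand-in for a hypervisor."""
    vcpus: int
    memory_mb: float
    instances: List[Instance] = field(default_factory=list)


def host_cpu_percent(host: Host) -> float:
    # Assumption: convert each instance's CPU percentage into "cores used",
    # sum them, and express the total as a share of the host's CPUs.
    used_cores = sum(i.vcpus * i.cpu_percent / 100.0 for i in host.instances)
    return used_cores / host.vcpus * 100.0


def host_ram_percent(host: Host) -> float:
    # Sum the RAM of every instance on the host. As the notes say, this is
    # close to the real host usage but not identical to it.
    used_mb = sum(i.ram_mb for i in host.instances)
    return used_mb / host.memory_mb * 100.0


def is_overloaded(host: Host, metric: str, threshold: float = 25.0) -> bool:
    # The threshold is always a percentage, whichever metric is selected.
    if metric == "instance_cpu_usage":
        return host_cpu_percent(host) > threshold
    if metric == "instance_ram_usage":
        return host_ram_percent(host) > threshold
    raise ValueError("unknown metric: %s" % metric)
```

Under these assumptions, the same `threshold` value is compared against a host-level percentage for both metrics, which is why the documentation states the threshold in percent even though `memory.resident` is reported in MB.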
--- a/watcher/decision_engine/strategy/strategies/workload_balance.py
+++ b/watcher/decision_engine/strategy/strategies/workload_balance.py
@@ -28,13 +28,16 @@ LOG = log.getLogger(__name__)
 class WorkloadBalance(base.WorkloadStabilizationBaseStrategy):
-    """[PoC]Workload balance using live migration
+    """Workload balance using live migration
    *Description*
        It is a migration strategy based on the VM workload of physical
        servers. It generates solutions to move a workload whenever a server's
        CPU or RAM utilization % is higher than the specified threshold.
        The threshold specified is used to trigger a migration,
        but it is also used to determine if there is an available host,
        with low enough utilization, to migrate the instance.
        The VM to be moved should make the host close to average workload
        of all compute nodes.
@@ -48,7 +51,6 @@ class WorkloadBalance(base.WorkloadStabilizationBaseStrategy):
    *Limitations*
-      - This is a proof of concept that is not meant to be used in production
       - We cannot forecast how many servers should be migrated. This is the
         reason why we only plan a single virtual machine migration at a time.
         So it's better to use this algorithm with `CONTINUOUS` audits.
@@ -105,7 +107,9 @@ class WorkloadBalance(base.WorkloadStabilizationBaseStrategy):
                    "default": "instance_cpu_usage"
                },
                "threshold": {
-                    "description": "workload threshold for migration",
+                    "description": "Workload threshold for migration - "
                                   "used for source and destination hosts. "
                                   "It is always a percentage value.",
                    "type": "number",
                    "default": 25.0
                },
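
The updated docstring and `threshold` description say the same percentage is used twice: to decide which host needs relief, and to decide whether another host has enough headroom to receive an instance. The sketch below illustrates that dual role in plain Python. It is a simplified assumption-laden example, not the strategy's implementation: `plan_single_migration` and its arguments are hypothetical, hosts are assumed to be the same size so that an instance's load can be expressed in percentage points, and the smallest instance is tried first, whereas the real strategy chooses the VM that brings the host closest to the cluster average.

```python
from typing import Dict, Optional, Tuple


def plan_single_migration(
    host_util: Dict[str, float],
    instance_load: Dict[str, Dict[str, float]],
    threshold: float = 25.0,
) -> Optional[Tuple[str, str, str]]:
    """Return (instance, source, destination) or None.

    host_util maps host name -> utilization in percent.
    instance_load maps host name -> {instance name -> load in percentage
    points of a host}; identically sized hosts are an assumption here.
    """
    # Source side: only hosts above the threshold are candidates.
    sources = [h for h, u in host_util.items() if u > threshold]
    for src in sorted(sources, key=lambda h: host_util[h], reverse=True):
        loads = instance_load.get(src, {})
        for inst, load in sorted(loads.items(), key=lambda kv: kv[1]):
            # Destination side: the same threshold caps the utilization
            # the destination would reach after receiving the instance.
            candidates = [
                h for h, u in host_util.items()
                if h != src and u + load < threshold
            ]
            if candidates:
                dst = min(candidates, key=lambda h: host_util[h])
                # Only one migration is planned per run, mirroring the
                # limitation listed in the docstring.
                return inst, src, dst
    return None
```

Because only one migration is returned per call, a caller would need to re-run the planning after each move, which matches the docstring's advice to pair the strategy with `CONTINUOUS` audits.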