diff --git a/doc/source/glossary.rst b/doc/source/glossary.rst index 3d753f113..59c06384f 100644 --- a/doc/source/glossary.rst +++ b/doc/source/glossary.rst @@ -20,97 +20,14 @@ They are sorted in alphabetical order. Action ====== -An :ref:`Action ` is what enables Watcher to transform the -current state of a :ref:`Cluster ` after an -:ref:`Audit `. - -An :ref:`Action ` is an atomic task which changes the -current state of a target :ref:`Managed resource ` -of the OpenStack :ref:`Cluster ` such as: - -- Live migration of an instance from one compute node to another compute - node with Nova -- Changing the power level of a compute node (ACPI level, ...) -- Changing the current state of an hypervisor (enable or disable) with Nova - -In most cases, an :ref:`Action ` triggers some concrete -commands on an existing OpenStack module (Nova, Neutron, Cinder, Ironic, etc.) -via a :ref:`Primitive `. - -An :ref:`Action ` has a life-cycle and its current state may -be one of the following: - -- **PENDING** : the :ref:`Action ` has not been executed - yet by the :ref:`Watcher Applier ` -- **ONGOING** : the :ref:`Action ` is currently being - processed by the :ref:`Watcher Applier ` -- **SUCCEEDED** : the :ref:`Action ` has been executed - successfully -- **FAILED** : an error occured while trying to execute the - :ref:`Action ` -- **DELETED** : the :ref:`Action ` is still stored in the - :ref:`Watcher database ` but is not returned - any more through the Watcher APIs. -- **CANCELLED** : the :ref:`Action ` was in **PENDING** or - **ONGOING** state and was cancelled by the - :ref:`Administrator ` +.. watcher-term:: watcher.api.controllers.v1.action .. _action_plan_definition: Action Plan =========== -An :ref:`Action Plan ` is a flow of -:ref:`Actions ` that should be executed in order to satisfy -a given :ref:`Goal `. 
- -An :ref:`Action Plan ` is generated by Watcher when an -:ref:`Audit ` is successful which implies that the -:ref:`Strategy ` -which was used has found a :ref:`Solution ` to achieve the -:ref:`Goal ` of this :ref:`Audit `. - -In the default implementation of Watcher, an -:ref:`Action Plan ` -is only composed of successive :ref:`Actions ` -(i.e., a Workflow of :ref:`Actions ` belonging to a unique -branch). - -However, Watcher provides abstract interfaces for many of its components, -allowing other implementations to generate and handle more complex -:ref:`Action Plan(s) ` -composed of two types of Action Item(s): - -- simple :ref:`Actions `: atomic tasks, which means it - can not be split into smaller tasks or commands from an OpenStack point of - view. -- composite Actions: which are composed of several simple - :ref:`Actions ` - ordered in sequential and/or parallel flows. - -An :ref:`Action Plan ` may be described using -standard workflow model description formats such as -`Business Process Model and Notation 2.0 (BPMN 2.0) `_ -or `Unified Modeling Language (UML) `_. - -An :ref:`Action Plan ` has a life-cycle and its current -state may be one of the following: - -- **RECOMMENDED** : the :ref:`Action Plan ` is waiting - for a validation from the :ref:`Administrator ` -- **ONGOING** : the :ref:`Action Plan ` is currently - being processed by the :ref:`Watcher Applier ` -- **SUCCEEDED** : the :ref:`Action Plan ` has been - executed successfully (i.e. all :ref:`Actions ` that it - contains have been executed successfully) -- **FAILED** : an error occured while executing the - :ref:`Action Plan ` -- **DELETED** : the :ref:`Action Plan ` is still - stored in the :ref:`Watcher database ` but is - not returned any more through the Watcher APIs. -- **CANCELLED** : the :ref:`Action Plan ` was in - **PENDING** or **ONGOING** state and was cancelled by the - :ref:`Administrator ` +.. watcher-term:: watcher.api.controllers.v1.action_plan .. 
_administrator_definition: @@ -144,73 +61,14 @@ any Watcher configuration files and to restart Watcher services. Audit ===== -In the Watcher system, an :ref:`Audit ` is a request for -optimizing a :ref:`Cluster `. - -The optimization is done in order to satisfy one :ref:`Goal ` -on a given :ref:`Cluster `. - -For each :ref:`Audit `, the Watcher system generates an -:ref:`Action Plan `. - -An :ref:`Audit ` has a life-cycle and its current state may -be one of the following: - -- **PENDING** : a request for an :ref:`Audit ` has been - submitted (either manually by the - :ref:`Administrator ` or automatically via some - event handling mechanism) and is in the queue for being processed by the - :ref:`Watcher Decision Engine ` -- **ONGOING** : the :ref:`Audit ` is currently being - processed by the - :ref:`Watcher Decision Engine ` -- **SUCCEEDED** : the :ref:`Audit ` has been executed - successfully (note that it may not necessarily produce a - :ref:`Solution `). -- **FAILED** : an error occured while executing the - :ref:`Audit ` -- **DELETED** : the :ref:`Audit ` is still stored in the - :ref:`Watcher database ` but is not returned - any more through the Watcher APIs. -- **CANCELLED** : the :ref:`Audit ` was in **PENDING** or - **ONGOING** state and was cancelled by the - :ref:`Administrator ` +.. watcher-term:: watcher.api.controllers.v1.audit .. _audit_template_definition: Audit Template ============== -An :ref:`Audit ` may be launched several times with the same -settings (:ref:`Goal `, thresholds, ...). Therefore it makes -sense to save those settings in some sort of Audit preset object, which is -known as an :ref:`Audit Template `. - -An :ref:`Audit Template ` contains at least the -:ref:`Goal ` of the :ref:`Audit `. 
- -It may also contain some error handling settings indicating whether: - -- :ref:`Watcher Applier ` stops the - entire operation -- :ref:`Watcher Applier ` performs a rollback - -and how many retries should be attempted before failure occurs (also the latter -can be complex: for example the scenario in which there are many first-time -failures on ultimately successful :ref:`Actions `). - -Moreover, an :ref:`Audit Template ` may contain some -settings related to the level of automation for the -:ref:`Action Plan ` that will be generated by the -:ref:`Audit `. -A flag will indicate whether the :ref:`Action Plan ` -will be launched automatically or will need a manual confirmation from the -:ref:`Administrator `. - -Last but not least, an :ref:`Audit Template ` may -contain a list of extra parameters related to the -:ref:`Strategy ` configuration. These parameters can be -provided as a list of key-value pairs. +.. watcher-term:: watcher.api.controllers.v1.audit_template .. _availability_zone_definition: @@ -241,136 +99,14 @@ The :ref:`Cluster ` may be divided in one or several Cluster Data Model ================== -A :ref:`Cluster Data Model ` is a logical -representation of the current state and topology of the -:ref:`Cluster ` -:ref:`Managed resources `. - -It is represented as a set of -:ref:`Managed resources ` -(which may be a simple tree or a flat list of key-value pairs) -which enables Watcher :ref:`Strategies ` to know the -current relationships between the different -:ref:`resources `) of the -:ref:`Cluster ` during an :ref:`Audit ` -and enables the :ref:`Strategy ` to request information -such as: - -- What compute nodes are in a given -:ref:`Availability Zone ` - or a given :ref:`Host Aggregate ` ? -- What :ref:`Instances ` are hosted on a given compute - node ? -- What is the current load of a compute node ? -- What is the current free memory of a compute node ? -- What is the network link between two compute nodes ? 
-- What is the available bandwidth on a given network link ? -- What is the current space available on a given virtual disk of a given - :ref:`Instance ` ? -- What is the current state of a given :ref:`Instance `? -- ... - -In a word, this data model enables the :ref:`Strategy ` -to know: - -- the current topology of the :ref:`Cluster ` -- the current capacity for each - :ref:`Managed resource ` -- the current amount of used/free space for each - :ref:`Managed resource ` -- the current state of each - :ref:`Managed resources ` - -In the Watcher project, we aim at providing a generic and very basic -:ref:`Cluster Data Model ` for each -:ref:`Goal `, usable in the associated -:ref:`Strategies ` through some helper classes in order -to: - -- simplify the development of a new - :ref:`Strategy ` for a given - :ref:`Goal ` when there already are some existing - :ref:`Strategies ` associated to the same - :ref:`Goal ` -- avoid duplicating the same code in several - :ref:`Strategies ` associated to the same - :ref:`Goal ` -- have a better consistency between the different - :ref:`Strategies ` for a given - :ref:`Goal ` -- avoid any strong coupling with any external - :ref:`Cluster Data Model ` - (the proposed data model acts as a pivot data model) - -There may be various -:ref:`generic and basic Cluster Data Models ` -proposed in Watcher helpers, each of them being adapted to achieving a given -:ref:`Goal `: - -- For example, for a - :ref:`Goal ` which aims at optimizing the network - :ref:`resources ` the - :ref:`Strategy ` may need to know which - :ref:`resources ` are communicating together. -- Whereas for a :ref:`Goal ` which aims at optimizing thermal - and power conditions, the :ref:`Strategy ` may need to - know the location of each compute node in the racks and the location of each - rack in the room. 
- -Note however that a developer can use his/her own -:ref:`Cluster Data Model ` if the proposed data -model does not fit his/her needs as long as the -:ref:`Strategy ` is able to produce a -:ref:`Solution ` for the requested -:ref:`Goal `. -For example, a developer could rely on the Nova Data Model to optimize some -compute resources. - -The :ref:`Cluster Data Model ` may be persisted -in any appropriate storage system (SQL database, NoSQL database, JSON file, -XML File, In Memory Database, ...). +.. watcher-term:: watcher.metrics_engine.cluster_model_collector.api .. _cluster_history_definition: Cluster History =============== -The :ref:`Cluster History ` contains all the -previously collected timestamped data such as metrics and events associated -to any :ref:`managed resource ` of the -:ref:`Cluster `. - -Just like the :ref:`Cluster Data Model `, this -history may be used by any :ref:`Strategy ` in order to -find the most optimal :ref:`Solution ` during an -:ref:`Audit `. - -In the Watcher project, a generic -:ref:`Cluster History ` -API is proposed with some helper classes in order to : - -- share a common measurement (events or metrics) naming based on what is - defined in Ceilometer. - See `the full list of available measurements `_ -- share common meter types (Cumulative, Delta, Gauge) based on what is - defined in Ceilometer. - See `the full list of meter types `_ -- simplify the development of a new :ref:`Strategy ` -- avoid duplicating the same code in several -:ref:`Strategies ` -- have a better consistency between the different -:ref:`Strategies ` -- avoid any strong coupling with any external metrics/events storage system - (the proposed API and measurement naming system acts as a pivot format) - -Note however that a developer can use his/her own history management system if -the Ceilometer system does not fit his/her needs as long as the -:ref:`Strategy ` is able to produce a -:ref:`Solution ` for the requested -:ref:`Goal `. 
- -The :ref:`Cluster History ` data may be persisted -in any appropriate storage system (InfluxDB, OpenTSDB, MongoDB,...). +.. watcher-term:: watcher.metrics_engine.cluster_history.api .. _controller_node_definition: @@ -419,20 +155,7 @@ them, or at least reported to them. Goal ==== -A :ref:`Goal ` is a human readable, observable and measurable -end result having one objective to be achieved. - -Here are some examples of :ref:`Goals `: - -- minimize the energy consumption -- minimize the number of compute nodes (consolidation) -- balance the workload among compute nodes -- minimize the license cost (some softwares have a licensing model which is - based on the number of sockets or cores where the software is deployed) -- find the most appropriate moment for a planned maintenance on a - given group of host (which may be an entire availability zone): - power supply replacement, cooling system replacement, hardware - modification, ... +.. watcher-term:: watcher.api.controllers.v1.goal .. _host_aggregates_definition: @@ -544,17 +267,7 @@ Please, read `the official OpenStack definition of a Project ` is the component that carries out a -certain type of atomic :ref:`Actions ` on a given -:ref:`Managed resource ` (nova, swift, neutron, -glance,..). A :ref:`Primitive ` is a part of the -:ref:`Watcher Applier ` module. - -For example, there can be a :ref:`Primitive ` which is -responsible for creating a snapshot of a given instance on a Nova compute node. -This :ref:`Primitive ` knows exactly how to send -the appropriate commands to Nova for this type of -:ref:`Actions `. +.. watcher-term:: watcher.applier.primitives.base .. _sla_definition: @@ -610,67 +323,21 @@ which provides a good definition. Solution ======== -A :ref:`Solution ` is a set of -:ref:`Actions ` generated by a -:ref:`Strategy ` (i.e., an algorithm) in order to achieve -the :ref:`Goal ` of an :ref:`Audit `. 
- -A :ref:`Solution ` is different from an -:ref:`Action Plan ` because it contains the -non-scheduled list of :ref:`Actions ` which is produced by a -:ref:`Strategy `. In other words, the list of Actions in -a :ref:`Solution ` has not yet been re-ordered by the -:ref:`Watcher Planner `. - -Note that some algorithms (i.e. :ref:`Strategies `) may -generate several :ref:`Solutions `. This gives rise to the -problem of determining which :ref:`Solution ` should be -applied. - -Two approaches to dealing with this can be envisaged: - -- **fully automated mode**: only the :ref:`Solution ` - with the highest ranking (i.e., the highest - :ref:`Optimization Efficiency `) - will be sent to the :ref:`Watcher Planner ` and - translated into concrete :ref:`Actions `. -- **manual mode**: several :ref:`Solutions ` are proposed - to the :ref:`Administrator ` with a detailed - measurement of the estimated - :ref:`Optimization Efficiency ` and he/she decides - which one will be launched. +.. watcher-term:: watcher.decision_engine.solution.base .. _strategy_definition: Strategy ======== -A :ref:`Strategy ` is an algorithm implementation which is -able to find a :ref:`Solution ` for a given -:ref:`Goal `. - -There may be several potential strategies which are able to achieve the same -:ref:`Goal `. This is why it is possible to configure which -specific :ref:`Strategy ` should be used for each -:ref:`Goal `. - -Some strategies may provide better optimization results but may take more time -to find an optimal :ref:`Solution `. - -When a new :ref:`Goal ` is added to the Watcher configuration, -at least one default associated :ref:`Strategy ` should be -provided as well. +.. watcher-term:: watcher.decision_engine.strategy.strategies.base .. _watcher_applier_definition: Watcher Applier =============== -This component is in charge of executing the -:ref:`Action Plan ` built by the -:ref:`Watcher Decision Engine `. - -See :doc:`architecture` for more details on this component. +.. 
watcher-term:: watcher.applier.base .. _watcher_database_definition: @@ -696,47 +363,12 @@ See :doc:`architecture` for more details on this component. Watcher Decision Engine ======================= -This component is responsible for computing a set of potential optimization -:ref:`Actions ` in order to fulfill the -:ref:`Goal ` of an :ref:`Audit `. - -It first reads the parameters of the :ref:`Audit ` from the -associated :ref:`Audit Template ` and knows the -:ref:`Goal ` to achieve. - -It then selects the most appropriate :ref:`Strategy ` -depending on how Watcher was configured for this :ref:`Goal `. - -The :ref:`Strategy ` is then executed and generates a set -of :ref:`Actions ` which are scheduled in time by the -:ref:`Watcher Planner ` (i.e., it generates an -:ref:`Action Plan `). - -See :doc:`architecture` for more details on this component. +.. watcher-term:: watcher.decision_engine.manager .. _watcher_planner_definition: Watcher Planner =============== -The :ref:`Watcher Planner ` is part of the -:ref:`Watcher Decision Engine `. - -This module takes the set of :ref:`Actions ` generated by a -:ref:`Strategy ` and builds the design of a workflow which -defines how-to schedule in time those different -:ref:`Actions ` and for each -:ref:`Action ` what are the prerequisite conditions. - -It is important to schedule :ref:`Actions ` in time in order -to prevent overload of the :ref:`Cluster ` while applying -the :ref:`Action Plan `. For example, it is important -not to migrate too many instances at the same time in order to avoid a network -congestion which may decrease the :ref:`SLA ` for -:ref:`Customers `. - -It is also important to schedule :ref:`Actions ` in order to -avoid security issues such as denial of service on core OpenStack services. - -See :doc:`architecture` for more details on this component. +.. 
watcher-term:: watcher.decision_engine.planner.base diff --git a/watcher/api/controllers/v1/action.py b/watcher/api/controllers/v1/action.py index 7555fb7c9..9a23ba120 100644 --- a/watcher/api/controllers/v1/action.py +++ b/watcher/api/controllers/v1/action.py @@ -15,6 +15,43 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +An :ref:`Action ` is what enables Watcher to transform the +current state of a :ref:`Cluster ` after an +:ref:`Audit `. + +An :ref:`Action ` is an atomic task which changes the +current state of a target :ref:`Managed resource ` +of the OpenStack :ref:`Cluster ` such as: + +- Live migration of an instance from one compute node to another compute + node with Nova +- Changing the power level of a compute node (ACPI level, ...) +- Changing the current state of a hypervisor (enable or disable) with Nova + +In most cases, an :ref:`Action ` triggers some concrete +commands on an existing OpenStack module (Nova, Neutron, Cinder, Ironic, etc.) +via a :ref:`Primitive `. + +An :ref:`Action ` has a life-cycle and its current state may +be one of the following: + +- **PENDING** : the :ref:`Action ` has not been executed + yet by the :ref:`Watcher Applier ` +- **ONGOING** : the :ref:`Action ` is currently being + processed by the :ref:`Watcher Applier ` +- **SUCCEEDED** : the :ref:`Action ` has been executed + successfully +- **FAILED** : an error occurred while trying to execute the + :ref:`Action ` +- **DELETED** : the :ref:`Action ` is still stored in the + :ref:`Watcher database ` but is not returned + any more through the Watcher APIs. 
+- **CANCELLED** : the :ref:`Action ` was in **PENDING** or + **ONGOING** state and was cancelled by the + :ref:`Administrator ` +""" + import datetime import pecan diff --git a/watcher/api/controllers/v1/action_plan.py b/watcher/api/controllers/v1/action_plan.py index 53f34c62e..c55d3a3cc 100644 --- a/watcher/api/controllers/v1/action_plan.py +++ b/watcher/api/controllers/v1/action_plan.py @@ -15,6 +15,60 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +An :ref:`Action Plan ` is a flow of +:ref:`Actions ` that should be executed in order to satisfy +a given :ref:`Goal `. + +An :ref:`Action Plan ` is generated by Watcher when an +:ref:`Audit ` is successful which implies that the +:ref:`Strategy ` +which was used has found a :ref:`Solution ` to achieve the +:ref:`Goal ` of this :ref:`Audit `. + +In the default implementation of Watcher, an +:ref:`Action Plan ` +is only composed of successive :ref:`Actions ` +(i.e., a Workflow of :ref:`Actions ` belonging to a unique +branch). + +However, Watcher provides abstract interfaces for many of its components, +allowing other implementations to generate and handle more complex +:ref:`Action Plan(s) ` +composed of two types of Action Item(s): + +- simple :ref:`Actions `: atomic tasks, which means it + can not be split into smaller tasks or commands from an OpenStack point of + view. +- composite Actions: which are composed of several simple + :ref:`Actions ` + ordered in sequential and/or parallel flows. + +An :ref:`Action Plan ` may be described using +standard workflow model description formats such as +`Business Process Model and Notation 2.0 (BPMN 2.0) `_ +or `Unified Modeling Language (UML) `_. 
+ +An :ref:`Action Plan ` has a life-cycle and its current +state may be one of the following: + +- **RECOMMENDED** : the :ref:`Action Plan ` is waiting + for a validation from the :ref:`Administrator ` +- **ONGOING** : the :ref:`Action Plan ` is currently + being processed by the :ref:`Watcher Applier ` +- **SUCCEEDED** : the :ref:`Action Plan ` has been + executed successfully (i.e. all :ref:`Actions ` that it + contains have been executed successfully) +- **FAILED** : an error occurred while executing the + :ref:`Action Plan ` +- **DELETED** : the :ref:`Action Plan ` is still + stored in the :ref:`Watcher database ` but is + not returned any more through the Watcher APIs. +- **CANCELLED** : the :ref:`Action Plan ` was in + **PENDING** or **ONGOING** state and was cancelled by the + :ref:`Administrator ` +""" # noqa + import datetime import pecan diff --git a/watcher/api/controllers/v1/audit.py b/watcher/api/controllers/v1/audit.py index 31c59a4ec..eb14cdcfa 100644 --- a/watcher/api/controllers/v1/audit.py +++ b/watcher/api/controllers/v1/audit.py @@ -15,6 +15,40 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +In the Watcher system, an :ref:`Audit ` is a request for +optimizing a :ref:`Cluster `. + +The optimization is done in order to satisfy one :ref:`Goal ` +on a given :ref:`Cluster `. + +For each :ref:`Audit `, the Watcher system generates an +:ref:`Action Plan `. 
+ +An :ref:`Audit ` has a life-cycle and its current state may +be one of the following: + +- **PENDING** : a request for an :ref:`Audit ` has been + submitted (either manually by the + :ref:`Administrator ` or automatically via some + event handling mechanism) and is in the queue for being processed by the + :ref:`Watcher Decision Engine ` +- **ONGOING** : the :ref:`Audit ` is currently being + processed by the + :ref:`Watcher Decision Engine ` +- **SUCCEEDED** : the :ref:`Audit ` has been executed + successfully (note that it may not necessarily produce a + :ref:`Solution `). +- **FAILED** : an error occurred while executing the + :ref:`Audit ` +- **DELETED** : the :ref:`Audit ` is still stored in the + :ref:`Watcher database ` but is not returned + any more through the Watcher APIs. +- **CANCELLED** : the :ref:`Audit ` was in **PENDING** or + **ONGOING** state and was cancelled by the + :ref:`Administrator ` +""" + import datetime import pecan diff --git a/watcher/api/controllers/v1/audit_template.py b/watcher/api/controllers/v1/audit_template.py index 5c957aea8..774b5896f 100644 --- a/watcher/api/controllers/v1/audit_template.py +++ b/watcher/api/controllers/v1/audit_template.py @@ -15,6 +15,39 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +An :ref:`Audit ` may be launched several times with the same +settings (:ref:`Goal `, thresholds, ...). Therefore it makes +sense to save those settings in some sort of Audit preset object, which is +known as an :ref:`Audit Template `. + +An :ref:`Audit Template ` contains at least the +:ref:`Goal ` of the :ref:`Audit `. 
+ +It may also contain some error handling settings indicating whether: + +- :ref:`Watcher Applier ` stops the + entire operation +- :ref:`Watcher Applier ` performs a rollback + +and how many retries should be attempted before failure occurs (also the latter +can be complex: for example the scenario in which there are many first-time +failures on ultimately successful :ref:`Actions `). + +Moreover, an :ref:`Audit Template ` may contain some +settings related to the level of automation for the +:ref:`Action Plan ` that will be generated by the +:ref:`Audit `. +A flag will indicate whether the :ref:`Action Plan ` +will be launched automatically or will need a manual confirmation from the +:ref:`Administrator `. + +Last but not least, an :ref:`Audit Template ` may +contain a list of extra parameters related to the +:ref:`Strategy ` configuration. These parameters can be +provided as a list of key-value pairs. +""" + import datetime import pecan diff --git a/watcher/api/controllers/v1/goal.py b/watcher/api/controllers/v1/goal.py index 3f9559a53..939f8bcd9 100644 --- a/watcher/api/controllers/v1/goal.py +++ b/watcher/api/controllers/v1/goal.py @@ -15,6 +15,23 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +A :ref:`Goal ` is a human readable, observable and measurable +end result having one objective to be achieved. + +Here are some examples of :ref:`Goals `: + +- minimize the energy consumption +- minimize the number of compute nodes (consolidation) +- balance the workload among compute nodes +- minimize the license cost (some software has a licensing model which is + based on the number of sockets or cores where the software is deployed) +- find the most appropriate moment for a planned maintenance on a + given group of hosts (which may be an entire availability zone): + power supply replacement, cooling system replacement, hardware + modification, ... 
+""" + from oslo_config import cfg import pecan diff --git a/watcher/applier/base.py b/watcher/applier/base.py index 039bd87d8..55f2187a5 100644 --- a/watcher/applier/base.py +++ b/watcher/applier/base.py @@ -17,6 +17,14 @@ # limitations under the License. # +""" +This component is in charge of executing the +:ref:`Action Plan ` built by the +:ref:`Watcher Decision Engine `. + +See :doc:`architecture` for more details on this component. +""" + import abc import six diff --git a/watcher/applier/primitives/base.py b/watcher/applier/primitives/base.py index 64eadcc7a..e118e5961 100644 --- a/watcher/applier/primitives/base.py +++ b/watcher/applier/primitives/base.py @@ -16,6 +16,21 @@ # See the License for the specific language governing permissions and # limitations under the License. # + +""" +A :ref:`Primitive ` is the component that carries out a +certain type of atomic :ref:`Actions ` on a given +:ref:`Managed resource ` (nova, swift, neutron, +glance,..). A :ref:`Primitive ` is a part of the +:ref:`Watcher Applier ` module. + +For example, there can be a :ref:`Primitive ` which is +responsible for creating a snapshot of a given instance on a Nova compute node. +This :ref:`Primitive ` knows exactly how to send +the appropriate commands to Nova for this type of +:ref:`Actions `. +""" + import abc import six from watcher.applier.promise import Promise diff --git a/watcher/decision_engine/manager.py b/watcher/decision_engine/manager.py index e088094c1..13078a750 100644 --- a/watcher/decision_engine/manager.py +++ b/watcher/decision_engine/manager.py @@ -17,6 +17,26 @@ # limitations under the License. # +""" +This component is responsible for computing a set of potential optimization +:ref:`Actions ` in order to fulfill the +:ref:`Goal ` of an :ref:`Audit `. + +It first reads the parameters of the :ref:`Audit ` from the +associated :ref:`Audit Template ` and knows the +:ref:`Goal ` to achieve. 
+ +It then selects the most appropriate :ref:`Strategy ` +depending on how Watcher was configured for this :ref:`Goal `. + +The :ref:`Strategy ` is then executed and generates a set +of :ref:`Actions ` which are scheduled in time by the +:ref:`Watcher Planner ` (i.e., it generates an +:ref:`Action Plan `). + +See :doc:`architecture` for more details on this component. +""" + from oslo_config import cfg from oslo_log import log diff --git a/watcher/decision_engine/planner/base.py b/watcher/decision_engine/planner/base.py index dde9053d9..13255baf0 100644 --- a/watcher/decision_engine/planner/base.py +++ b/watcher/decision_engine/planner/base.py @@ -16,6 +16,30 @@ # See the License for the specific language governing permissions and # limitations under the License. # + +""" +The :ref:`Watcher Planner ` is part of the +:ref:`Watcher Decision Engine `. + +This module takes the set of :ref:`Actions ` generated by a +:ref:`Strategy ` and builds the design of a workflow which +defines how to schedule in time those different +:ref:`Actions ` and for each +:ref:`Action ` what are the prerequisite conditions. + +It is important to schedule :ref:`Actions ` in time in order +to prevent overload of the :ref:`Cluster ` while applying +the :ref:`Action Plan `. For example, it is important +not to migrate too many instances at the same time in order to avoid a network +congestion which may decrease the :ref:`SLA ` for +:ref:`Customers `. + +It is also important to schedule :ref:`Actions ` in order to +avoid security issues such as denial of service on core OpenStack services. + +See :doc:`architecture` for more details on this component. 
+""" + import abc import six diff --git a/watcher/decision_engine/solution/base.py b/watcher/decision_engine/solution/base.py index 57f387e1d..182fd46bb 100644 --- a/watcher/decision_engine/solution/base.py +++ b/watcher/decision_engine/solution/base.py @@ -16,6 +16,39 @@ # See the License for the specific language governing permissions and # limitations under the License. # + +""" +A :ref:`Solution ` is a set of +:ref:`Actions ` generated by a +:ref:`Strategy ` (i.e., an algorithm) in order to achieve +the :ref:`Goal ` of an :ref:`Audit `. + +A :ref:`Solution ` is different from an +:ref:`Action Plan ` because it contains the +non-scheduled list of :ref:`Actions ` which is produced by a +:ref:`Strategy `. In other words, the list of Actions in +a :ref:`Solution ` has not yet been re-ordered by the +:ref:`Watcher Planner `. + +Note that some algorithms (i.e. :ref:`Strategies `) may +generate several :ref:`Solutions `. This gives rise to the +problem of determining which :ref:`Solution ` should be +applied. + +Two approaches to dealing with this can be envisaged: + +- **fully automated mode**: only the :ref:`Solution ` + with the highest ranking (i.e., the highest + :ref:`Optimization Efficiency `) + will be sent to the :ref:`Watcher Planner ` and + translated into concrete :ref:`Actions `. +- **manual mode**: several :ref:`Solutions ` are proposed + to the :ref:`Administrator ` with a detailed + measurement of the estimated + :ref:`Optimization Efficiency ` and he/she decides + which one will be launched. +""" + import abc import six diff --git a/watcher/decision_engine/strategy/strategies/base.py b/watcher/decision_engine/strategy/strategies/base.py index 5fd7a1ff0..c8917d490 100644 --- a/watcher/decision_engine/strategy/strategies/base.py +++ b/watcher/decision_engine/strategy/strategies/base.py @@ -14,6 +14,24 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+""" +A :ref:`Strategy ` is an algorithm implementation which is +able to find a :ref:`Solution ` for a given +:ref:`Goal `. + +There may be several potential strategies which are able to achieve the same +:ref:`Goal `. This is why it is possible to configure which +specific :ref:`Strategy ` should be used for each +:ref:`Goal `. + +Some strategies may provide better optimization results but may take more time +to find an optimal :ref:`Solution `. + +When a new :ref:`Goal ` is added to the Watcher configuration, +at least one default associated :ref:`Strategy ` should be +provided as well. +""" + import abc from oslo_log import log import six diff --git a/watcher/metrics_engine/cluster_history/api.py b/watcher/metrics_engine/cluster_history/api.py index b89c330a5..a9d826041 100644 --- a/watcher/metrics_engine/cluster_history/api.py +++ b/watcher/metrics_engine/cluster_history/api.py @@ -16,6 +16,46 @@ # See the License for the specific language governing permissions and # limitations under the License. # + +""" +The :ref:`Cluster History ` contains all the +previously collected timestamped data such as metrics and events associated +to any :ref:`managed resource ` of the +:ref:`Cluster `. + +Just like the :ref:`Cluster Data Model `, this +history may be used by any :ref:`Strategy ` in order to +find the most optimal :ref:`Solution ` during an +:ref:`Audit `. + +In the Watcher project, a generic +:ref:`Cluster History ` +API is proposed with some helper classes in order to : + +- share a common measurement (events or metrics) naming based on what is + defined in Ceilometer. + See `the full list of available measurements `_ +- share common meter types (Cumulative, Delta, Gauge) based on what is + defined in Ceilometer. 
+ See `the full list of meter types `_ +- simplify the development of a new :ref:`Strategy ` +- avoid duplicating the same code in several +:ref:`Strategies ` +- achieve better consistency between the different +:ref:`Strategies ` +- avoid any strong coupling with any external metrics/events storage system + (the proposed API and measurement naming system acts as a pivot format) + +Note however that a developer can use his/her own history management system if +the Ceilometer system does not fit his/her needs as long as the +:ref:`Strategy ` is able to produce a +:ref:`Solution ` for the requested +:ref:`Goal `. + +The :ref:`Cluster History ` data may be persisted +in any appropriate storage system (InfluxDB, OpenTSDB, MongoDB, ...). +""" # noqa + import abc import six diff --git a/watcher/metrics_engine/cluster_model_collector/api.py b/watcher/metrics_engine/cluster_model_collector/api.py index fada3fd63..ae4fcf46d 100644 --- a/watcher/metrics_engine/cluster_model_collector/api.py +++ b/watcher/metrics_engine/cluster_model_collector/api.py @@ -16,6 +16,98 @@ # See the License for the specific language governing permissions and # limitations under the License. # + +""" +A :ref:`Cluster Data Model ` is a logical +representation of the current state and topology of the +:ref:`Cluster ` +:ref:`Managed resources `. + +It is represented as a set of +:ref:`Managed resources ` +(which may be a simple tree or a flat list of key-value pairs) +which enables Watcher :ref:`Strategies ` to know the +current relationships between the different +:ref:`resources ` of the +:ref:`Cluster ` during an :ref:`Audit ` +and enables the :ref:`Strategy ` to request information +such as: + +- Which compute nodes are in a given +:ref:`Availability Zone ` + or a given :ref:`Host Aggregate ` ? +- Which :ref:`Instances ` are hosted on a given compute + node ? +- What is the current load of a compute node ? +- What is the current free memory of a compute node ?
+- What is the network link between two compute nodes ? +- What is the available bandwidth on a given network link ? +- What is the current space available on a given virtual disk of a given + :ref:`Instance ` ? +- What is the current state of a given :ref:`Instance `? +- ... + +In short, this data model enables the :ref:`Strategy ` +to know: + +- the current topology of the :ref:`Cluster ` +- the current capacity for each + :ref:`Managed resource ` +- the current amount of used/free space for each + :ref:`Managed resource ` +- the current state of each + :ref:`Managed resource ` + +In the Watcher project, we aim to provide a generic and very basic +:ref:`Cluster Data Model ` for each +:ref:`Goal `, usable in the associated +:ref:`Strategies ` through some helper classes in order +to: + +- simplify the development of a new + :ref:`Strategy ` for a given + :ref:`Goal ` when there already are some existing + :ref:`Strategies ` associated with the same + :ref:`Goal ` +- avoid duplicating the same code in several + :ref:`Strategies ` associated with the same + :ref:`Goal ` +- achieve better consistency between the different + :ref:`Strategies ` for a given + :ref:`Goal ` +- avoid any strong coupling with any external + :ref:`Cluster Data Model ` + (the proposed data model acts as a pivot data model) + +There may be various +:ref:`generic and basic Cluster Data Models ` +proposed in Watcher helpers, each of them being adapted to achieving a given +:ref:`Goal `: + +- For example, for a + :ref:`Goal ` which aims at optimizing the network + :ref:`resources `, the + :ref:`Strategy ` may need to know which + :ref:`resources ` are communicating with each other. +- Whereas for a :ref:`Goal ` which aims at optimizing thermal + and power conditions, the :ref:`Strategy ` may need to + know the location of each compute node in the racks and the location of each + rack in the room.
+ +Note however that a developer can use his/her own +:ref:`Cluster Data Model ` if the proposed data +model does not fit his/her needs as long as the +:ref:`Strategy ` is able to produce a +:ref:`Solution ` for the requested +:ref:`Goal `. +For example, a developer could rely on the Nova Data Model to optimize some +compute resources. + +The :ref:`Cluster Data Model ` may be persisted +in any appropriate storage system (SQL database, NoSQL database, JSON file, +XML file, in-memory database, ...). +""" + import abc import six diff --git a/watcher/objects/action_plan.py b/watcher/objects/action_plan.py index 9ea6e923c..b69a51fee 100644 --- a/watcher/objects/action_plan.py +++ b/watcher/objects/action_plan.py @@ -14,6 +14,59 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +An :ref:`Action Plan ` is a flow of +:ref:`Actions ` that should be executed in order to satisfy +a given :ref:`Goal `. + +An :ref:`Action Plan ` is generated by Watcher when an +:ref:`Audit ` is successful, which implies that the +:ref:`Strategy ` +which was used has found a :ref:`Solution ` to achieve the +:ref:`Goal ` of this :ref:`Audit `. + +In the default implementation of Watcher, an +:ref:`Action Plan ` +is only composed of successive :ref:`Actions ` +(i.e., a Workflow of :ref:`Actions ` belonging to a unique +branch). + +However, Watcher provides abstract interfaces for many of its components, +allowing other implementations to generate and handle more complex +:ref:`Action Plan(s) ` +composed of two types of Action Item(s): + +- simple :ref:`Actions `: atomic tasks, which means they + cannot be split into smaller tasks or commands from an OpenStack point of + view. +- composite Actions, which are composed of several simple + :ref:`Actions ` + ordered in sequential and/or parallel flows.
+ +An :ref:`Action Plan ` may be described using +standard workflow model description formats such as +`Business Process Model and Notation 2.0 (BPMN 2.0) `_ +or `Unified Modeling Language (UML) `_. + +An :ref:`Action Plan ` has a life-cycle and its current +state may be one of the following: + +- **RECOMMENDED** : the :ref:`Action Plan ` is waiting + for validation from the :ref:`Administrator ` +- **ONGOING** : the :ref:`Action Plan ` is currently + being processed by the :ref:`Watcher Applier ` +- **SUCCEEDED** : the :ref:`Action Plan ` has been + executed successfully (i.e. all :ref:`Actions ` that it + contains have been executed successfully) +- **FAILED** : an error occurred while executing the + :ref:`Action Plan ` +- **DELETED** : the :ref:`Action Plan ` is still + stored in the :ref:`Watcher database ` but is + not returned any more through the Watcher APIs. +- **CANCELLED** : the :ref:`Action Plan ` was in + **PENDING** or **ONGOING** state and was cancelled by the + :ref:`Administrator ` +""" # noqa from watcher.common import exception from watcher.common import utils diff --git a/watcher/objects/audit.py b/watcher/objects/audit.py index 7232d4073..e933395b8 100644 --- a/watcher/objects/audit.py +++ b/watcher/objects/audit.py @@ -14,6 +14,39 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +In the Watcher system, an :ref:`Audit ` is a request for +optimizing a :ref:`Cluster `. + +The optimization is done in order to satisfy one :ref:`Goal ` +on a given :ref:`Cluster `. + +For each :ref:`Audit `, the Watcher system generates an +:ref:`Action Plan `.
+ +An :ref:`Audit ` has a life-cycle and its current state may +be one of the following: + +- **PENDING** : a request for an :ref:`Audit ` has been + submitted (either manually by the + :ref:`Administrator ` or automatically via some + event handling mechanism) and is queued for processing by the + :ref:`Watcher Decision Engine ` +- **ONGOING** : the :ref:`Audit ` is currently being + processed by the + :ref:`Watcher Decision Engine ` +- **SUCCEEDED** : the :ref:`Audit ` has been executed + successfully (note that it may not necessarily produce a + :ref:`Solution `). +- **FAILED** : an error occurred while executing the + :ref:`Audit ` +- **DELETED** : the :ref:`Audit ` is still stored in the + :ref:`Watcher database ` but is not returned + any more through the Watcher APIs. +- **CANCELLED** : the :ref:`Audit ` was in **PENDING** or + **ONGOING** state and was cancelled by the + :ref:`Administrator ` +""" from watcher.common import exception from watcher.common import utils diff --git a/watcher/objects/audit_template.py b/watcher/objects/audit_template.py index f98d88681..7dac13b99 100644 --- a/watcher/objects/audit_template.py +++ b/watcher/objects/audit_template.py @@ -14,6 +14,38 @@ # See the License for the specific language governing permissions and # limitations under the License. +""" +An :ref:`Audit ` may be launched several times with the same +settings (:ref:`Goal `, thresholds, ...). Therefore, it makes +sense to save those settings in some sort of Audit preset object, which is +known as an :ref:`Audit Template `. + +An :ref:`Audit Template ` contains at least the +:ref:`Goal ` of the :ref:`Audit `.
+ +It may also contain some error handling settings indicating whether: + +- :ref:`Watcher Applier ` stops the + entire operation +- :ref:`Watcher Applier ` performs a rollback + +and how many retries should be attempted before failure occurs (the latter +can itself be complex: for example, the scenario in which there are many +first-time failures on ultimately successful :ref:`Actions `). + +Moreover, an :ref:`Audit Template ` may contain some +settings related to the level of automation for the +:ref:`Action Plan ` that will be generated by the +:ref:`Audit `. +A flag will indicate whether the :ref:`Action Plan ` +will be launched automatically or will need manual confirmation from the +:ref:`Administrator `. + +Finally, an :ref:`Audit Template ` may +contain a list of extra parameters related to the +:ref:`Strategy ` configuration. These parameters can be +provided as a list of key-value pairs. +""" from oslo_config import cfg from watcher.common import exception