The fields(vcpus, memory and disk_capacity) in the Watcher ComputeNode
do not take allocation ratios used for overcommit into account so there
may be disparity between this and the used count.
This patch added some new fields to solve this problem.
Partially Implements: blueprint improve-compute-data-model
Change-Id: Id33496f368fb23cb8e744c7e8451e1cd1397866b
Add call_retry method for ModelBuilder classes along with configuration
options. This allows ModelBuilder classes to reattempt any failed calls
to external services such as Nova or Ironic.
Change-Id: Ided697adebed957e5ff13b4c6b5b06c816f81c4a
This lets all the ModelBuilder classes use one baseclass and forces
ClusterDataModelCollector's to pass the scope.
The scopes are still unused in the case of Ironic and Cinder.
The idea is to do several follow ups to this and in the end have a
similar method to query_retry in the datasources baseclass.
Change-Id: Ibbdedd3087fef5298d7f4c9d3abdba05d1fbb2f0
aggregate list and availability_zone list may return ironic type
compute nodes. When building compute data model we should check
the hypervisor_type and remove ironic compute nodes.
Change-Id: Idf404c104c30368baf95ef7d05ad8fc3e7adca38
Related-Bug: #1835183
We want to set the value of uuid field of Watcher ComputeNode
to hypversion id(as uuid). We need a method to get compute
node by name.
Change-Id: I0975500f359de92b6d6fdea2e01614cf0ba73f05
Related-Bug: #1835192
The problem is that watcher is passing limit=-1 to novaclient when
listing servers which will always make at least two API calls to be
sure it's done paging:
https://github.com/openstack/python-novaclient/blob/13.0.1/novaclient/v2/servers.py#L896
If we can determine before we list servers that there are only a
certain number where the number of servers is less than 1000. For
example: 4, we should just pass the limit=len(servers) to novaclient
and avoid the second call for paging which takes extra time and
yields no results.
Change-Id: I797ad934a0f8496dbcbf65798e28b0443f238137
Closes-Bug: #1834679
In the process of handling instance_created.end,
there is a KeyError exception output log. This is because
invoking get_instance_by_uuid before creating the instance
in the data model.
During the review of https://review.opendev.org/#/c/663489/,
reviewers think that it's better to remove the KeyError exception.
This patche seperates the process of instance_created.end from
other Nova notifications and removes the call of get_instance_by_uuid.
Change-Id: Ie9e2d4f5b32ee7a5b52bbcd50abfa81dcabab7bb
In get_node_by_instance_uuid, an exception ComputeNodeNotFound
will be thrown if can't find a node through instance uuid.
But the exception information replaces the node name with
instance uuid, which is misleading, so we define a new exception.
Closes-Bug: #1832156
Change-Id: Ic6c44ae44da7c3b9a1c20e9b24a036063af266ba
When receiving Nova notification instance.create.end,
map instance to its node after adding instance to datamodel.
Related-Bug: #1832156
Change-Id: I6f39e8d935195c611f668f71590e1d9ff52ced0d
In the process of creating an instance, Nova will emit an
instance.update notification with 'building' state.
This will cause a KeyError exception because this instance
isn't in Watcher datamodel.
So we should ignore the notification instance.update with
'building' state.
Closes-Bug: #1832154
Change-Id: I950eec50d2cee38bd22c47a70ae6f88bbf049080
The nova CDM builder code and notification handling
code had some inefficiencies when it came to looking
up a hypevisor to get details. The general pattern
used before was:
1. get the minimal hypervisor information by hypervisor_hostname
2. make another query to get the hypervisor details by id
In the notifications case, it was actually three calls because
the first is listing hyprvisors to filter client-side by service
host.
This change collapses 1 and 2 above into a single API call
to get the hypervisor by hypervisor_hostname with details
which will include the service (compute) host information
which is what get_compute_node_by_id() was being used for.
Now that nothing is using get_compute_node_by_id it is removed.
There is more work we could do in get_compute_node_by_hostname
if the compute API allowed filtering hypervisors by service
host so a TODO is left for that.
One final thing: the TODO in get_compute_node_by_hostname about
there being more than one hypervisor per compute service host
for vmware vcenter is not accurate - nova's vcenter driver
hasn't supported a host:node 1:M topology like that since the
Liberty release [1]. The only in-tree driver in nova that supports
1:M is the ironic baremetal driver, so the comment is updated.
[1] Ifc17c5049e3ed29c8dd130339207907b00433960
Depends-On: https://review.opendev.org/661785/
Change-Id: I5e0e88d7b2dd1a69117ab03e0e66851c687606da
This does two things:
1. Rather than make an API call per server on the host,
get all of the servers in a single API call by
filtering on the host. The os-hypervisors API results
to use make this require a bit of refactoring since
get_compute_node_by_name does not have the service
entry in it and get_compute_node_by_id does not have the
servers entry in it. A TODO is added to clean that up
with a single call to os-hypervisors once we have the
support in python-novaclient.
2. Pulls get_node_by_uuid() out of the loop.
A test is added for the nova_helper get_instance_list method
since one did not exist before.
The fake compute node mocks in test_nova_cdmc_execute are
also cleaned up since, as noted above, get_compute_node_by_name
and get_compute_node_by_id don't both return all the details.
Change-Id: Ifd9f83c2f399d4c1765b0c520f4d5a62ad0f5fbd
Changes to the baseclass for datasources so strategies can be made
compatible with every datasource. Baseclass methods clearly describe
expected values and types for both parameters and for method returns.
query_retry has been added as base method since every current
datasource implements it.
Ceilometer is updated to work with the new baseclass. Several methods
which are not part of the baseclass and are not used by any strategies
are removed. The signature of these methods would have to be changed
to fit with the new base class while it would limit strategies to
only work with Ceilometer.
Gnocchi is updated to work with the new baseclass.
Gnocchi and Ceilometer will perform a transformation for the
host_airflow metric as it retrieves 1/10 th of the actual CFM
Monasca is updated to work with the new baseclass.
FakeMetrics for Gnocchi, Monasca and Ceilometer are updated to work
with the new method signatures of the baseclass.
FakeClusterAndMetrics for Ceilometer and Gnocchi are updated to work
with the new method signatures of the baseclass.
The strategies workload_balance, vm_workload_consolidation,
workload_stabilization, basic_consolidation, noisy_neighbour,
outlet_temp_control and uniform_airflow are updated to work with the
new datasource baseclass.
This patch will break compatibility with plugin strategies and
datasources due to the changes in signatures.
Depends-on: I7aa52a9b82f4aa849f2378d4d1c03453e45c0c78
Change-Id: Ie30ca3dbf01062cbb20d3be5d514ec6b5155cd7c
Implements: blueprint formal-datasource-interface
As of change Ic4659d1f18af181203439a8bf1b38805ff34c309 the
nova CDM will not be built until an audit is performed.
Instances and services (compute hosts) can be created and
deleted before an audit is performed which will attempt
to use the notification callback function which relies
on the CDM being built already, and if not results in
an AttributeError.
This change side-steps that issue by checking to see that the
nova CDM exists before trying to call the notification
callback function.
An alternative to this is forcefully create the nova CDM when
notifications are received before an audit which is what happend
before change Ic4659d1f18af181203439a8bf1b38805ff34c309.
Change-Id: I16990afb82019821c443c9df26d3e515e52efa69
Closes-Bug: #1828582
_post_live_migration[1] runs on the source host and calls
post_live_migration_at_destination on the dest host which
emits the instance.live_migration_post_dest.end notification:[2]
But it's not the last notification for the live migration operation.
so we should use instance.live_migration_post.end instead of
instance.live_migration_post_dest.end notification.
[1]daa2ac2287/nova/compute/manager.py (L6907)
[2]daa2ac2287/nova/compute/manager.py (L7035)
Change-Id: Id1e2d98f56d5a95d49e32f98d2910660b9f48ce6
The _add_virtual_layer and _add_virtual_servers methods
have not been used since Ic4659d1f18af181203439a8bf1b38805ff34c309
in Stein so this change removes them.
Change-Id: I8c05f29c3c03aa5897cb182bb492948771c42881
This resolves problems with the audit scope such as the scope being
ignored, the scope not merging due to a type in .append, change update
into .add method when adding single elements to a set and making the
access of dict keys and values as lists work in python 3.7.
All these methods from the model builder now have tests to prevent
regressions.
Co-Authored-By: Canwei Li <li.canwei2@zte.com.cn>
Change-Id: I287763d5e426ff860aefabc4a1f3fe3f51accd76
This patch adds a scope to the datamodel, which only gets the VMs
of the specified nodes, and no longer gets all VMs from nova.
Implements: blueprint scope-for-watcher-datamodel
Change-Id: Ic4659d1f18af181203439a8bf1b38805ff34c309
Bare metal cluster data model was introduced in Queens cycle.
Since the model is different from compute data model, we
need add CDM scoper for bare metal cluster data model
Change-Id: Idd041cefb692085d4545252d229ebe8602926b58
Implements: blueprint audit-scoper-for-baremetal-data-model
The method is quite simple and it doesn't need a dostring.
Also existing docstring was incorrect. The name of the expected
parameter is 'name', not 'node'. And it cannot be an object
of the type node.StorageNode
Change-Id: I94124d327c490d45eae4d2ded218beadfbc33ad7
The correct type of parameter 'pool' in method build_storage_pool is
<class 'cinderclient.v2.pools.Pool'>
Change-Id: I986f707e4e740ebec94a46c6ee413f9a70197dad