Compare commits

...

78 Commits

Author SHA1 Message Date
7ecc6a2635 Bump version to 0.9.31 2021-07-30 12:08:12 -04:00
73e8149cb0 Remove explicit image-features from rbd cmd
This should be managed in ceph.conf with the `rbd default
features` configuration option instead, and thus can be tailored to the
underlying OS version.
2021-07-30 11:33:59 -04:00
4a7246b8c0 Ensure RBD resize has bytes appended
If it isn't, the resize will be interpreted as an MB value and result
in an absurdly large volume instead. This is the same consistency
validation that occurs on add.
2021-07-30 11:25:13 -04:00
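A minimal sketch of the size-normalization rule this commit describes, assuming a small helper called before invoking the `rbd resize` command; the function name and suffix list are illustrative, not PVC's actual code:

```python
# Hypothetical helper: if the requested size carries no unit suffix, append "B"
# so the rbd CLI does not interpret the bare number as a megabyte count.
def normalize_rbd_size(size):
    valid_suffixes = ('B', 'K', 'M', 'G', 'T', 'P')
    size = str(size)
    if not size.upper().endswith(valid_suffixes):
        size = '{}B'.format(size)
    return size

# normalize_rbd_size('104857600') -> '104857600B'; normalize_rbd_size('100G') -> '100G'
```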
c49351469b Revert "Ensure consistent sizing of volumes"
This reverts commit dc03e95bbf.
2021-07-29 15:30:00 -04:00
dc03e95bbf Ensure consistent sizing of volumes
Convert from human to bytes, then to megabytes and always pass this to
the RBD command. This ensures consistency regardless of what is actually
passed by the user.
2021-07-29 15:14:25 -04:00
c460aa051a Add missing floppy RASD type for compat 2021-07-27 16:32:32 -04:00
3ab6365a53 Adjust receive output to show proper source 2021-07-22 15:43:08 -04:00
32613ff119 Remove obsolete Suggests lines from control 2021-07-20 00:35:21 -04:00
2a99a27feb Bump version to 0.9.30 2021-07-20 00:01:45 -04:00
45f23c12ea Remove logs from schema validation
These are managed entirely by the logging subsystem, not by the schema
handler, due to catch-22s.
2021-07-20 00:00:37 -04:00
fa1d93e933 Bump version to 0.9.29 2021-07-19 16:55:41 -04:00
b14bc7e3a3 Add retry to log writes 2021-07-19 13:11:28 -04:00
4d6842f942 Don't bail out if write fails, keep retrying 2021-07-19 13:09:36 -04:00
6ead21a308 Handle cleanup from a failure properly 2021-07-19 12:39:13 -04:00
b7c8c2ee3d Fix handling of this_node and d_domain in cleanup 2021-07-19 12:36:35 -04:00
d48f58930b Use harder exits and add cleanup termination 2021-07-19 12:27:16 -04:00
7c36388c8f Add post-networking delay and adjust daemon delay 2021-07-19 12:23:45 -04:00
e9df043c0a Ensure ZK logging does not block startup 2021-07-19 12:19:59 -04:00
71e4d0b32a Bump version to 0.9.28 2021-07-19 09:29:34 -04:00
f16bad4691 Revamp confirmation options for vm modify
Before, "-y"/"--yes" only confirmed the reboot portion. Instead, modify
this to confirm both the diff portion and the restart portion, and add
separate flags to bypass one or the other independently, ensuring the
administrator has lots of flexibility. UNSAFE mode implies "-y" so both
would be auto-confirmed if that option is set.
2021-07-19 00:25:43 -04:00
15d92c483f Bump version to 0.9.27 2021-07-19 00:03:40 -04:00
7dd17e71e7 Fix bug with VM editing with file
Current config is needed for the diff but it was in a conditional.
2021-07-19 00:02:19 -04:00
5be968123f Readd 1 second queue get timeout
Otherwise daemon stops will sometimes inexplicably block.
2021-07-18 22:17:57 -04:00
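A minimal sketch of the pattern this commit restores, assuming a worker loop reading from a standard `queue.Queue`; the surrounding names are illustrative:

```python
from queue import Queue, Empty

def log_worker(message_queue, stop_requested):
    # A 1-second timeout lets the loop notice a stop request instead of
    # blocking forever on an empty queue when the daemon is shutting down.
    while not stop_requested():
        try:
            message = message_queue.get(timeout=1)
        except Empty:
            continue
        print(message)
```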
99fd7ebe63 Fix excessive CPU due to looping 2021-07-18 22:06:50 -04:00
cffc96d156 Fix failure in creating base keys 2021-07-18 21:00:23 -04:00
602093029c Bump version to 0.9.26 2021-07-18 20:49:52 -04:00
bd7a773d6b Add node log following functionality 2021-07-18 20:37:53 -04:00
8d671b3422 Add some tag tests to test-cluster.sh 2021-07-18 20:37:37 -04:00
2358ad6bbe Reduce the number of lines per call
500 was a lot every half second; 200 seems more reasonable. Even a fast
kernel boot should generate < 200 lines in half a second.
2021-07-18 20:23:45 -04:00
a0e9b57d39 Increase log line frequency 2021-07-18 20:19:59 -04:00
2d48127e9c Use even better/faster set comparison 2021-07-18 20:18:35 -04:00
55f2b00366 Add some spaces for better readability 2021-07-18 20:18:23 -04:00
ba257048ad Improve output formatting of node logs 2021-07-18 20:06:08 -04:00
b770e15a91 Fix final termination of logger
We need to do a bit more finagling with the logger on termination to
ensure that all messages are written and the queue drained before
actually terminating.
2021-07-18 19:53:00 -04:00
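A short sketch of the drain-on-termination behaviour described above, assuming a simple queue-backed logger; names are illustrative:

```python
def terminate_logger(message_queue, write_line):
    # Flush everything still queued before the logger actually exits, so no
    # messages are lost during shutdown.
    while not message_queue.empty():
        write_line(message_queue.get_nowait())
```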
e23a65128a Remove del of logger item 2021-07-18 19:03:47 -04:00
982dfd52c6 Adjust date output format 2021-07-18 19:00:54 -04:00
3a2478ee0c Cleanly terminate logger on cleanup 2021-07-18 18:57:44 -04:00
a088aa4484 Add node log functions to API and CLI 2021-07-18 18:54:28 -04:00
323c7c41ae Implement node logging into Zookeeper
Adds the ability to send node daemon logs to Zookeeper to facilitate a
command like "pvc node log", similar to "pvc vm log". Each node stores
its logs in a separate tree under "/logs" which can then be combined or
queried. By default, set by config, only 2000 lines are kept.
2021-07-18 17:11:43 -04:00
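A hedged sketch of the mechanism described above, using Kazoo directly; the key layout and helper name are illustrative and differ from PVC's real schema handler, while the 2000-line default comes from the commit message:

```python
from kazoo.client import KazooClient

MAX_LINES = 2000  # default retention, per the commit message

def append_node_log(zk, node, new_lines):
    # Each node keeps its own log buffer under a separate tree below /logs,
    # trimmed to the newest MAX_LINES entries on every write.
    path = '/logs/{}/messages'.format(node)  # illustrative key layout
    zk.ensure_path(path)
    current, _ = zk.get(path)
    lines = current.decode('utf8').split('\n') if current else []
    lines = (lines + new_lines)[-MAX_LINES:]
    zk.set(path, '\n'.join(lines).encode('utf8'))

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()
append_node_log(zk, 'hv1', ['2021-07-18 17:11:43 Node daemon starting'])
```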
cd1db3d587 Ensure node name is part of config 2021-07-18 16:38:58 -04:00
401f102344 Add serial BIOS to default libvirt schema 2021-07-15 10:45:14 -04:00
4ac020888b Add some tag tests to test-cluster.sh 2021-07-14 15:02:03 -04:00
8f3b68d48a Mention multiple option for tags in VM define 2021-07-14 01:12:10 -04:00
6d4c26c8d8 Don't show tag line in info if no tags 2021-07-14 00:59:24 -04:00
75fb60b1b4 Add VM list filtering by tag
Uses same method as state or node filtering, rather than altering how
the main LIMIT field works.
2021-07-14 00:59:20 -04:00
9ea9ac3b8a Revamp tag handling and display
Add an additional protected class, limit manipulation to one at a time,
and ensure future flexibility. Also makes display consistent with other
VM elements.
2021-07-13 22:39:52 -04:00
27f1758791 Add tags manipulation to API
Also fixes some checks for Metadata, since these two actions are
almost identical, and adds tags to the define endpoint.
2021-07-13 19:05:33 -04:00
c0a3467b70 Simplify VM metadata reads
Directly call the new common getDomainMetadata function to avoid
excessive Zookeeper calls for this information.
2021-07-13 19:05:33 -04:00
9a199992a1 Add functions for manipulating VM tags
Adds tags to schema (v3), to VM definition, adds function to modify
tags, adds function to get tags, and adds tags to VM data output.

Tags will enable more granular classification of VMs based either on
administrator configuration or on automated system events.
2021-07-13 19:05:33 -04:00
c6d552ae57 Rework success checks for IPMI fencing
Previously, if the node failed to restart, it was declared a "bad fence"
and no further action would be taken. However, there are some
situations, for instance critical hardware failures, where intelligent
systems will not attempt (or succeed at) starting up the node, which
would result in dead, known-offline nodes without recovery.

Tweak this behaviour somewhat. The main path of Reboot -> Check On ->
Success + fence-flush is retained, but some additional side-paths are
now defined:

1. We attempt to power "on" the chassis 1 second after the reboot, just
in case it is off and can be recovered. We then wait another 2 seconds
and check the power status (as we did before).

2. If the reboot succeeded, follow this series of choices:

    a. If the chassis is on, the fence succeeded.

    b. If the chassis is off, the fence "succeeded" as well.

    c. If the chassis is in some other state, the fence failed.

3. If the reboot failed, follow this series of choices:

    a. If the chassis is off, the fence itself failed, but we can treat
    it as "succeeded"" since the chassis is in a known-offline state.
    This is the most likely situation when there is a critical hardware
    failure, and the server's IPMI does not allow itself to start back
    up again.

    b. If the chassis is in any other state ("on" or unknown), the fence
    itself failed and we must treat this as a fence failure.

Overall, this should alleviate the aforementioned issue, where a
critical failure that rendered the node persistently "off" would not
trigger a fence-flush, and should make fencing more robust.
2021-07-13 17:54:41 -04:00
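A minimal sketch of the decision tree above; the `ipmi` object and its methods are hypothetical stand-ins for the IPMI chassis-power calls, not PVC's actual fencing functions:

```python
import time

def determine_fence_success(ipmi):
    # Main path: issue the reboot, then try a power-on shortly after in case
    # the chassis is off and can be recovered, then check the power status.
    reboot_ok = ipmi.chassis_reset()
    time.sleep(1)
    ipmi.chassis_power_on()
    time.sleep(2)
    state = ipmi.chassis_power_state()  # e.g. 'on', 'off', or unknown

    if reboot_ok:
        # 2a/2b: "on" or "off" both count as a successful fence; 2c: anything else fails.
        return state in ('on', 'off')
    # 3a: a chassis that is verifiably off is a known-offline state, so treat the
    # fence as succeeded; 3b: any other state is a genuine fence failure.
    return state == 'off'
```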
2e9f6ac201 Bump version to 0.9.25 2021-07-11 23:19:09 -04:00
f09849bedf Don't overwrite shutdown state on termination
Just a minor quibble and not really impactful.
2021-07-11 23:18:14 -04:00
8c975e5c46 Add chroot context manager example to debootstrap
Closes #132
2021-07-11 23:10:41 -04:00
c76149141f Only log ZK connections when persistent
Prevents spam in the API logs.
2021-07-10 23:35:49 -04:00
f00c4d07f4 Add date output to keepalive
Helps track when there is a log follow in "-o cat" mode.
2021-07-10 23:24:59 -04:00
20b66c10e1 Move two more commands to Rados library 2021-07-10 17:28:42 -04:00
cfeba50b17 Revert "Return to all command-based Ceph gathering"
This reverts commit 65d14ccd92.

This was actually a bad idea. For inexplicable reasons, running these
Ceph commands manually (not even via Python, but in a normal shell)
takes roughly two to three orders of magnitude longer than running them
with the Rados module; so long, in fact, that some basic commands like
"ceph health" would sometimes take longer than the 1 second timeout to
complete. The Rados calls, by contrast, take about 1ms.

Despite the occasional issues when monitors drop out, the Rados module
is clearly far superior to the shell commands for any moderately-loaded
Ceph cluster. We can look into solving timeouts another way (perhaps
with Processes instead of Threads) at a later time.

Rados module "ceph health":
    b'{"checks":{},"status":"HEALTH_OK"}'
    0.001204 (s)
    b'{"checks":{},"status":"HEALTH_OK"}'
    0.001258 (s)
Command "ceph health":
    joshua@hv1.c.bonilan.net ~ $ time ceph health >/dev/null
    real    0m0.772s
    user    0m0.707s
    sys     0m0.046s
    joshua@hv1.c.bonilan.net ~ $ time ceph health >/dev/null
    real    0m0.796s
    user    0m0.728s
    sys     0m0.054s
2021-07-10 03:47:45 -04:00
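For reference, a short sketch of the Rados-module call path being timed above, assuming the standard admin keyring location on a node; the exact configuration PVC passes may differ:

```python
import json
import time

import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                      conf={'keyring': '/etc/ceph/ceph.client.admin.keyring'})
cluster.connect()

start = time.time()
# Equivalent of "ceph health" via the monitor command interface
ret, outbuf, outs = cluster.mon_command(
    json.dumps({'prefix': 'health', 'format': 'json'}), b'', timeout=1)
print(outbuf)                              # b'{"checks":{},"status":"HEALTH_OK"}'
print('{:.6f} (s)'.format(time.time() - start))

cluster.shutdown()
```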
0699c48d10 Fix bad schema path name 2021-07-09 16:47:09 -04:00
551bae2518 Bump version to 0.9.24 2021-07-09 15:58:36 -04:00
4832245d9c Handle non-RBD disks and non-RBD errors better 2021-07-09 15:48:57 -04:00
2138f2f59f Fail VM removal on disk removal failures
Prevents bad states where the VM is "removed" but some of its disks
remain due to e.g. stuck watchers.

Rearrange the sequence so it goes stop, delete disks, then delete VM,
and then return a failure if any of the disk(s) fail to remove, allowing
the task to be rerun after fixing the problem.
2021-07-09 15:39:06 -04:00
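An illustrative sketch of the reordered sequence (helper names are hypothetical): if any disk fails to remove, the VM definition is left in place so the removal can simply be rerun once the stuck watcher or other problem is cleared:

```python
def remove_vm(zkhandler, vm):
    stop_vm(zkhandler, vm)
    # Delete disks first, tracking any that fail (e.g. due to stuck RBD watchers)
    failed = [disk for disk in vm_rbd_disks(vm) if not remove_rbd_volume(zkhandler, disk)]
    if failed:
        return False, 'Failed to remove disk(s): {}'.format(', '.join(failed))
    # Only remove the VM definition once all of its disks are gone
    undefine_vm(zkhandler, vm)
    return True, 'Removed VM {}'.format(vm)
```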
d1d355a96b Avoid errors if stats data is None 2021-07-09 13:13:54 -04:00
2b5dc286ab Correct failure to get ceph_health data 2021-07-09 13:10:28 -04:00
c0c9327a7d Return an empty log if the value is None 2021-07-09 13:08:00 -04:00
5ffabcfef5 Avoid failing if we can't get the future data 2021-07-09 13:05:37 -04:00
330cf14638 Remove return statements in keepalive collectors
These seem to bork the keepalive timer process, so just remove them and
let it continue to press on.
2021-07-09 13:04:17 -04:00
9d0eb20197 Mention UUID matching in vm list help 2021-07-09 11:51:20 -04:00
3f5b7045a2 Allow raw listing of cluster names in CLI 2021-07-09 10:53:20 -04:00
80fe96b24d Add some additional docstrings 2021-07-07 12:28:08 -04:00
80f04ce8ee Remove connection renewal in state handler
Regenerating the ZK connection was fraught with issues, including
duplicate connections, strange failures to reconnect, and various other
wonkiness.

Instead let Kazoo handle states sensibly. Kazoo moves to SUSPENDED state
when it loses connectivity, and stays there indefinitely (based on
cursory tests). And Kazoo seems to always resume from this just fine on
its own. Thus all that hackery did nothing but complicate reconnection.

This therefore turns the listener into a purely informational function,
providing logs of when/why it failed, and we also add some additional
output messages during initial connection and final disconnection.
2021-07-07 11:55:12 -04:00
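A minimal sketch of the "purely informational" listener described above, using Kazoo's documented state-listener API; the log text is illustrative:

```python
from kazoo.client import KazooClient, KazooState

zk = KazooClient(hosts='127.0.0.1:2181')

def zk_listener(state):
    # Only report transitions; reconnection is left entirely to Kazoo itself.
    if state == KazooState.SUSPENDED:
        print('Connection to Zookeeper lost; Kazoo will retry on its own')
    elif state == KazooState.LOST:
        print('Zookeeper session expired')
    else:  # KazooState.CONNECTED
        print('Connection to Zookeeper (re-)established')

zk.add_listener(zk_listener)
zk.start()
```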
65d14ccd92 Return to all command-based Ceph gathering
Using the Rados module was very problematic, specifically because it had
no sensible timeout parameters and thus would hang for many seconds.
This has poor implications since it blocks further keepalives.

Instead, remove the Rados usage entirely and go back completely to using
manual OS commands to gather this information. While this may cause PID
exhaustion more quickly, it's worthwhile to avoid failure scenarios when
Ceph stats time out.

Closes #137
2021-07-06 11:30:45 -04:00
adc022f55d Add missing install of pvcapid-worker.sh 2021-07-06 09:40:42 -04:00
7082982a33 Bump version to 0.9.23 2021-07-05 23:40:32 -04:00
5b6ef71909 Ensure daemon mode is updated on startup
Fixes the side effect of the previous bug during deploys of 0.9.22.
2021-07-05 23:39:23 -04:00
a8c28786dd Better handle empty ipaths in schema
When trying to write to sub-item paths that don't yet exist, the
previous method would just blindly write to whatever the root key is,
which is never what we actually want.

Instead, check explicitly for a "base path" situation, and handle that.
Then, if we try to get a subpath that isn't valid, return None. Finally
in the various functions, if the path is None, just continue (or return
false/None) and (try to) chug along.
2021-07-05 23:35:03 -04:00
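An illustrative sketch of the rule described above (not PVC's actual zkhandler schema code): an empty sub-path resolves to the base key, an unknown sub-path resolves to None, and writers simply skip paths that resolved to None:

```python
def resolve_path(base_paths, sub_paths, key, sub=''):
    if sub == '':
        return base_paths[key]                 # explicit "base path" case
    if sub not in sub_paths.get(key, {}):
        return None                            # invalid sub-path: never guess the root key
    return '{}/{}'.format(base_paths[key], sub_paths[key][sub])

def write_key(zk, path, value):
    if path is None:
        return False                           # caller just continues on
    zk.set(path, value)
    return True
```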
be7b0be8ed Fix typo in schema path name 2021-07-05 23:23:23 -04:00
c45804e8c1 Revert "Return none if a schema path is not found"
This reverts commit b1fcf6a4a5.
2021-07-05 23:16:39 -04:00
b1fcf6a4a5 Return none if a schema path is not found
This can cause overwriting of unintended keys, so should not be
happening. Will have to find the bugs this causes.
2021-07-05 17:15:55 -04:00
34 changed files with 1735 additions and 281 deletions

View File

@ -1 +1 @@
0.9.22
0.9.31

View File

@ -42,6 +42,57 @@ To get started with PVC, please see the [About](https://parallelvirtualcluster.r
## Changelog
#### v0.9.31
* [Packages] Cleans up obsolete Suggests lines
* [Node Daemon] Adjusts log text of VM migrations to show the correct source node
* [API Daemon] Adjusts the OVA importer to support floppy RASD types for compatibility
* [API Daemon] Ensures that volume resize commands without a suffix get B appended
* [API Daemon] Removes the explicit setting of image-features in PVC; defaulting to the limited set has been moved to the ceph.conf configuration on nodes via PVC Ansible
#### v0.9.30
* [Node Daemon] Fixes bug with schema validation
#### v0.9.29
* [Node Daemon] Corrects numerous bugs with node logging framework
#### v0.9.28
* [CLI Client] Revamp confirmation options for "vm modify" command
#### v0.9.27
* [CLI Client] Fixes a bug with vm modify command when passed a file
#### v0.9.26
* [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures
* [All] Implements VM tagging functionality
* [All] Implements Node log access via PVC functionality
#### v0.9.25
* [Node Daemon] Returns to Rados library calls for Ceph due to performance problems
* [Node Daemon] Adds a date output to keepalive messages
* [Daemons] Configures ZK connection logging only for persistent connections
* [API Provisioner] Adds a context manager-based chroot to the Debootstrap example script
* [Node Daemon] Fixes a bug where shutdown daemon state was overwritten
#### v0.9.24
* [Node Daemon] Removes Rados module polling of Ceph cluster and returns to command-based polling for timeout purposes, and removes some flaky return statements
* [Node Daemon] Removes flaky Zookeeper connection renewals that caused problems
* [CLI Client] Allows raw lists of clusters from `pvc cluster list`
* [API Daemon] Fixes several issues when getting VM data without stats
* [API Daemon] Fixes issues with removing VMs while disks are still in use (failed provisioning, etc.)
#### v0.9.23
* [Daemons] Fixes a critical overwriting bug in zkhandler when schema paths are not yet valid
* [Node Daemon] Ensures the daemon mode is updated on every startup (fixes the side effect of the above bug in 0.9.22)
#### v0.9.22
* [API Daemon] Drastically improves performance when getting large lists (e.g. VMs)

View File

@ -34,6 +34,29 @@
# with that.
import os
from contextlib import contextmanager
# Create a chroot context manager
# This can be used later in the script to chroot to the destination directory
# for instance to run commands within the target.
@contextmanager
def chroot_target(destination):
try:
real_root = os.open("/", os.O_RDONLY)
os.chroot(destination)
fake_root = os.open("/", os.O_RDONLY)
os.fchdir(fake_root)
yield
finally:
os.fchdir(real_root)
os.chroot(".")
os.fchdir(real_root)
os.close(fake_root)
os.close(real_root)
del fake_root
del real_root
# Installation function - performs a debootstrap install of a Debian system
# Note that the only arguments are keyword arguments.
@ -193,40 +216,25 @@ GRUB_DISABLE_LINUX_UUID=false
fh.write(data)
# Chroot, do some in-root tasks, then exit the chroot
# EXITING THE CHROOT IS VERY IMPORTANT OR THE FOLLOWING STAGES OF THE PROVISIONER
# WILL FAIL IN UNEXPECTED WAYS! Keep this in mind when using chroot in your scripts.
real_root = os.open("/", os.O_RDONLY)
os.chroot(temporary_directory)
fake_root = os.open("/", os.O_RDONLY)
os.fchdir(fake_root)
# Install and update GRUB
os.system(
"grub-install --force /dev/rbd/{}/{}_{}".format(root_disk['pool'], vm_name, root_disk['disk_id'])
)
os.system(
"update-grub"
)
# Set a really dumb root password [TEMPORARY]
os.system(
"echo root:test123 | chpasswd"
)
# Enable cloud-init target on (first) boot
# NOTE: Your user-data should handle this and disable it once done, or things get messy.
# That cloud-init won't run without this hack seems like a bug... but even the official
# Debian cloud images are affected, so who knows.
os.system(
"systemctl enable cloud-init.target"
)
# Restore our original root/exit the chroot
# EXITING THE CHROOT IS VERY IMPORTANT OR THE FOLLOWING STAGES OF THE PROVISIONER
# WILL FAIL IN UNEXPECTED WAYS! Keep this in mind when using chroot in your scripts.
os.fchdir(real_root)
os.chroot(".")
os.fchdir(real_root)
os.close(fake_root)
os.close(real_root)
with chroot_target(temporary_directory):
# Install and update GRUB
os.system(
"grub-install --force /dev/rbd/{}/{}_{}".format(root_disk['pool'], vm_name, root_disk['disk_id'])
)
os.system(
"update-grub"
)
# Set a really dumb root password [TEMPORARY]
os.system(
"echo root:test123 | chpasswd"
)
# Enable cloud-init target on (first) boot
# NOTE: Your user-data should handle this and disable it once done, or things get messy.
# That cloud-init won't run without this hack seems like a bug... but even the official
# Debian cloud images are affected, so who knows.
os.system(
"systemctl enable cloud-init.target"
)
# Unmount the bound devfs
os.system(
@ -235,8 +243,4 @@ GRUB_DISABLE_LINUX_UUID=false
)
)
# Clean up file handles so paths can be unmounted
del fake_root
del real_root
# Everything else is done via cloud-init user-data

View File

@ -29,7 +29,7 @@
# This script will run under root privileges as the provisioner does. Be careful
# with that.
# Installation function - performs a debootstrap install of a Debian system
# Installation function - performs no actions then returns
# Note that the only arguments are keyword arguments.
def install(**kwargs):
# The provisioner has already mounted the disks on kwargs['temporary_directory'].

View File

@ -25,7 +25,7 @@ import yaml
from distutils.util import strtobool as dustrtobool
# Daemon version
version = '0.9.22'
version = '0.9.31'
# API version
API_VERSION = 1.0

View File

@ -592,7 +592,7 @@ class API_Node_Root(Resource):
name: limit
type: string
required: false
description: A search limit; fuzzy by default, use ^/$ to force exact matches
description: A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches
- in: query
name: daemon_state
type: string
@ -834,6 +834,52 @@ class API_Node_DomainState(Resource):
api.add_resource(API_Node_DomainState, '/node/<node>/domain-state')
# /node/<node>/log
class API_Node_Log(Resource):
@RequestParser([
{'name': 'lines'}
])
@Authenticator
def get(self, node, reqargs):
"""
Return the recent logs of {node}
---
tags:
- node
parameters:
- in: query
name: lines
type: integer
required: false
description: The number of lines to retrieve
responses:
200:
description: OK
schema:
type: object
id: NodeLog
properties:
name:
type: string
description: The name of the Node
data:
type: string
description: The recent log text
404:
description: Node not found
schema:
type: object
id: Message
"""
return api_helper.node_log(
node,
reqargs.get('lines', None)
)
api.add_resource(API_Node_Log, '/node/<node>/log')
##########################################################
# Client API - VM
##########################################################
@ -844,6 +890,7 @@ class API_VM_Root(Resource):
{'name': 'limit'},
{'name': 'node'},
{'name': 'state'},
{'name': 'tag'},
])
@Authenticator
def get(self, reqargs):
@ -892,6 +939,22 @@ class API_VM_Root(Resource):
migration_method:
type: string
description: The preferred migration method (live, shutdown, none)
tags:
type: array
description: The tag(s) of the VM
items:
type: object
id: VMTag
properties:
name:
type: string
description: The name of the tag
type:
type: string
description: The type of the tag (user, system)
protected:
type: boolean
description: Whether the tag is protected or not
description:
type: string
description: The description of the VM
@ -1076,7 +1139,7 @@ class API_VM_Root(Resource):
name: limit
type: string
required: false
description: A name search limit; fuzzy by default, use ^/$ to force exact matches
description: A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches
- in: query
name: node
type: string
@ -1087,6 +1150,11 @@ class API_VM_Root(Resource):
type: string
required: false
description: Limit list to VMs in this state
- in: query
name: tag
type: string
required: false
description: Limit list to VMs with this tag
responses:
200:
description: OK
@ -1098,6 +1166,7 @@ class API_VM_Root(Resource):
return api_helper.vm_list(
reqargs.get('node', None),
reqargs.get('state', None),
reqargs.get('tag', None),
reqargs.get('limit', None)
)
@ -1107,6 +1176,8 @@ class API_VM_Root(Resource):
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms', 'none'), 'helptext': "A valid selector must be specified"},
{'name': 'autostart'},
{'name': 'migration_method', 'choices': ('live', 'shutdown', 'none'), 'helptext': "A valid migration_method must be specified"},
{'name': 'user_tags', 'action': 'append'},
{'name': 'protected_tags', 'action': 'append'},
{'name': 'xml', 'required': True, 'helptext': "A Libvirt XML document must be specified"},
])
@Authenticator
@ -1158,6 +1229,20 @@ class API_VM_Root(Resource):
- live
- shutdown
- none
- in: query
name: user_tags
type: array
required: false
description: The user tag(s) of the VM
items:
type: string
- in: query
name: protected_tags
type: array
required: false
description: The protected user tag(s) of the VM
items:
type: string
responses:
200:
description: OK
@ -1170,13 +1255,22 @@ class API_VM_Root(Resource):
type: object
id: Message
"""
user_tags = reqargs.get('user_tags', None)
if user_tags is None:
user_tags = []
protected_tags = reqargs.get('protected_tags', None)
if protected_tags is None:
protected_tags = []
return api_helper.vm_define(
reqargs.get('xml'),
reqargs.get('node', None),
reqargs.get('limit', None),
reqargs.get('selector', 'none'),
bool(strtobool(reqargs.get('autostart', 'false'))),
reqargs.get('migration_method', 'none')
reqargs.get('migration_method', 'none'),
user_tags,
protected_tags
)
@ -1203,7 +1297,7 @@ class API_VM_Element(Resource):
type: object
id: Message
"""
return api_helper.vm_list(None, None, vm, is_fuzzy=False)
return api_helper.vm_list(None, None, None, vm, is_fuzzy=False)
@RequestParser([
{'name': 'limit'},
@ -1211,6 +1305,8 @@ class API_VM_Element(Resource):
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms', 'none'), 'helptext': "A valid selector must be specified"},
{'name': 'autostart'},
{'name': 'migration_method', 'choices': ('live', 'shutdown', 'none'), 'helptext': "A valid migration_method must be specified"},
{'name': 'user_tags', 'action': 'append'},
{'name': 'protected_tags', 'action': 'append'},
{'name': 'xml', 'required': True, 'helptext': "A Libvirt XML document must be specified"},
])
@Authenticator
@ -1265,6 +1361,20 @@ class API_VM_Element(Resource):
- live
- shutdown
- none
- in: query
name: user_tags
type: array
required: false
description: The user tag(s) of the VM
items:
type: string
- in: query
name: protected_tags
type: array
required: false
description: The protected user tag(s) of the VM
items:
type: string
responses:
200:
description: OK
@ -1277,13 +1387,22 @@ class API_VM_Element(Resource):
type: object
id: Message
"""
user_tags = reqargs.get('user_tags', None)
if user_tags is None:
user_tags = []
protected_tags = reqargs.get('protected_tags', None)
if protected_tags is None:
protected_tags = []
return api_helper.vm_define(
reqargs.get('xml'),
reqargs.get('node', None),
reqargs.get('limit', None),
reqargs.get('selector', 'none'),
bool(strtobool(reqargs.get('autostart', 'false'))),
reqargs.get('migration_method', 'none')
reqargs.get('migration_method', 'none'),
user_tags,
protected_tags
)
@RequestParser([
@ -1401,7 +1520,7 @@ class API_VM_Metadata(Resource):
type: string
description: The preferred migration method (live, shutdown, none)
404:
description: Not found
description: VM not found
schema:
type: object
id: Message
@ -1469,6 +1588,11 @@ class API_VM_Metadata(Resource):
schema:
type: object
id: Message
404:
description: VM not found
schema:
type: object
id: Message
"""
return api_helper.update_vm_meta(
vm,
@ -1483,6 +1607,99 @@ class API_VM_Metadata(Resource):
api.add_resource(API_VM_Metadata, '/vm/<vm>/meta')
# /vm/<vm>/tags
class API_VM_Tags(Resource):
@Authenticator
def get(self, vm):
"""
Return the tags of {vm}
---
tags:
- vm
responses:
200:
description: OK
schema:
type: object
id: VMTags
properties:
name:
type: string
description: The name of the VM
tags:
type: array
description: The tag(s) of the VM
items:
type: object
id: VMTag
404:
description: VM not found
schema:
type: object
id: Message
"""
return api_helper.get_vm_tags(vm)
@RequestParser([
{'name': 'action', 'choices': ('add', 'remove'), 'helptext': "A valid action must be specified"},
{'name': 'tag'},
{'name': 'protected'}
])
@Authenticator
def post(self, vm, reqargs):
"""
Set the tags of {vm}
---
tags:
- vm
parameters:
- in: query
name: action
type: string
required: true
description: The action to perform with the tag
enum:
- add
- remove
- in: query
name: tag
type: string
required: true
description: The text value of the tag
- in: query
name: protected
type: boolean
required: false
default: false
description: Set the protected state of the tag
responses:
200:
description: OK
schema:
type: object
id: Message
400:
description: Bad request
schema:
type: object
id: Message
404:
description: VM not found
schema:
type: object
id: Message
"""
return api_helper.update_vm_tag(
vm,
reqargs.get('action'),
reqargs.get('tag'),
reqargs.get('protected', False)
)
api.add_resource(API_VM_Tags, '/vm/<vm>/tags')
# /vm/<vm>/state
class API_VM_State(Resource):
@Authenticator

View File

@ -307,6 +307,34 @@ def node_ready(zkhandler, node, wait):
return output, retcode
@ZKConnection(config)
def node_log(zkhandler, node, lines=None):
"""
Return the current logs for Node.
"""
# Default to 10 lines of log if not set
try:
lines = int(lines)
except TypeError:
lines = 10
retflag, retdata = pvc_node.get_node_log(zkhandler, node, lines)
if retflag:
retcode = 200
retdata = {
'name': node,
'data': retdata
}
else:
retcode = 400
retdata = {
'message': retdata
}
return retdata, retcode
#
# VM functions
#
@ -326,7 +354,7 @@ def vm_state(zkhandler, vm):
"""
Return the state of virtual machine VM.
"""
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, vm, is_fuzzy=False)
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, None, vm, is_fuzzy=False)
if retflag:
if retdata:
@ -355,7 +383,7 @@ def vm_node(zkhandler, vm):
"""
Return the current node of virtual machine VM.
"""
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, vm, is_fuzzy=False)
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, None, vm, is_fuzzy=False)
if retflag:
if retdata:
@ -409,11 +437,11 @@ def vm_console(zkhandler, vm, lines=None):
@pvc_common.Profiler(config)
@ZKConnection(config)
def vm_list(zkhandler, node=None, state=None, limit=None, is_fuzzy=True):
def vm_list(zkhandler, node=None, state=None, tag=None, limit=None, is_fuzzy=True):
"""
Return a list of VMs with limit LIMIT.
"""
retflag, retdata = pvc_vm.get_list(zkhandler, node, state, limit, is_fuzzy)
retflag, retdata = pvc_vm.get_list(zkhandler, node, state, tag, limit, is_fuzzy)
if retflag:
if retdata:
@ -433,7 +461,7 @@ def vm_list(zkhandler, node=None, state=None, limit=None, is_fuzzy=True):
@ZKConnection(config)
def vm_define(zkhandler, xml, node, limit, selector, autostart, migration_method):
def vm_define(zkhandler, xml, node, limit, selector, autostart, migration_method, user_tags=[], protected_tags=[]):
"""
Define a VM from Libvirt XML in the PVC cluster.
"""
@ -444,7 +472,13 @@ def vm_define(zkhandler, xml, node, limit, selector, autostart, migration_method
except Exception as e:
return {'message': 'XML is malformed or incorrect: {}'.format(e)}, 400
retflag, retdata = pvc_vm.define_vm(zkhandler, new_cfg, node, limit, selector, autostart, migration_method, profile=None)
tags = list()
for tag in user_tags:
tags.append({'name': tag, 'type': 'user', 'protected': False})
for tag in protected_tags:
tags.append({'name': tag, 'type': 'user', 'protected': True})
retflag, retdata = pvc_vm.define_vm(zkhandler, new_cfg, node, limit, selector, autostart, migration_method, profile=None, tags=tags)
if retflag:
retcode = 200
@ -463,28 +497,20 @@ def get_vm_meta(zkhandler, vm):
"""
Get metadata of a VM.
"""
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, vm, is_fuzzy=False)
dom_uuid = pvc_vm.getDomainUUID(zkhandler, vm)
if not dom_uuid:
return {"message": "VM not found."}, 404
if retflag:
if retdata:
retcode = 200
retdata = {
'name': vm,
'node_limit': retdata['node_limit'],
'node_selector': retdata['node_selector'],
'node_autostart': retdata['node_autostart'],
'migration_method': retdata['migration_method']
}
else:
retcode = 404
retdata = {
'message': 'VM not found.'
}
else:
retcode = 400
retdata = {
'message': retdata
}
domain_node_limit, domain_node_selector, domain_node_autostart, domain_migrate_method = pvc_common.getDomainMetadata(zkhandler, dom_uuid)
retcode = 200
retdata = {
'name': vm,
'node_limit': domain_node_limit,
'node_selector': domain_node_selector,
'node_autostart': domain_node_autostart,
'migration_method': domain_migrate_method
}
return retdata, retcode
@ -494,11 +520,16 @@ def update_vm_meta(zkhandler, vm, limit, selector, autostart, provisioner_profil
"""
Update metadata of a VM.
"""
dom_uuid = pvc_vm.getDomainUUID(zkhandler, vm)
if not dom_uuid:
return {"message": "VM not found."}, 404
if autostart is not None:
try:
autostart = bool(strtobool(autostart))
except Exception:
autostart = False
retflag, retdata = pvc_vm.modify_vm_metadata(zkhandler, vm, limit, selector, autostart, provisioner_profile, migration_method)
if retflag:
@ -512,6 +543,51 @@ def update_vm_meta(zkhandler, vm, limit, selector, autostart, provisioner_profil
return output, retcode
@ZKConnection(config)
def get_vm_tags(zkhandler, vm):
"""
Get the tags of a VM.
"""
dom_uuid = pvc_vm.getDomainUUID(zkhandler, vm)
if not dom_uuid:
return {"message": "VM not found."}, 404
tags = pvc_common.getDomainTags(zkhandler, dom_uuid)
retcode = 200
retdata = {
'name': vm,
'tags': tags
}
return retdata, retcode
@ZKConnection(config)
def update_vm_tag(zkhandler, vm, action, tag, protected=False):
"""
Update a tag of a VM.
"""
if action not in ['add', 'remove']:
return {"message": "Tag action must be one of 'add', 'remove'."}, 400
dom_uuid = pvc_vm.getDomainUUID(zkhandler, vm)
if not dom_uuid:
return {"message": "VM not found."}, 404
retflag, retdata = pvc_vm.modify_vm_tag(zkhandler, vm, action, tag, protected=protected)
if retflag:
retcode = 200
else:
retcode = 400
output = {
'message': retdata.replace('\"', '\'')
}
return output, retcode
@ZKConnection(config)
def vm_modify(zkhandler, name, restart, xml):
"""
@ -752,7 +828,7 @@ def vm_flush_locks(zkhandler, vm):
"""
Flush locks of a (stopped) VM.
"""
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, vm, is_fuzzy=False)
retflag, retdata = pvc_vm.get_list(zkhandler, None, None, None, vm, is_fuzzy=False)
if retdata[0].get('state') not in ['stop', 'disable']:
return {"message": "VM must be stopped to flush locks"}, 400

View File

@ -41,6 +41,7 @@ libvirt_header = """<domain type='kvm'>
<bootmenu enable='yes'/>
<boot dev='cdrom'/>
<boot dev='hd'/>
<bios useserial='yes' rebootTimeout='5'/>
</os>
<features>
<acpi/>

View File

@ -414,6 +414,7 @@ class OVFParser(object):
"5": "ide-controller",
"6": "scsi-controller",
"10": "ethernet-adapter",
"14": "floppy",
"15": "cdrom",
"17": "disk",
"20": "other-storage-device",

View File

@ -19,6 +19,8 @@
#
###############################################################################
import time
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import call_api
@ -69,6 +71,89 @@ def node_domain_state(config, node, action, wait):
return retstatus, response.json().get('message', '')
def view_node_log(config, node, lines=100):
"""
Return node log lines from the API (and display them in a pager in the main CLI)
API endpoint: GET /node/{node}/log
API arguments: lines={lines}
API schema: {"name":"{node}","data":"{node_log}"}
"""
params = {
'lines': lines
}
response = call_api(config, 'get', '/node/{node}/log'.format(node=node), params=params)
if response.status_code != 200:
return False, response.json().get('message', '')
node_log = response.json()['data']
# Shrink the log buffer to length lines
shrunk_log = node_log.split('\n')[-lines:]
loglines = '\n'.join(shrunk_log)
return True, loglines
def follow_node_log(config, node, lines=10):
"""
Return and follow node log lines from the API
API endpoint: GET /node/{node}/log
API arguments: lines={lines}
API schema: {"name":"{nodename}","data":"{node_log}"}
"""
# We always grab 200 to match the follow call, but only _show_ `lines` number
params = {
'lines': 200
}
response = call_api(config, 'get', '/node/{node}/log'.format(node=node), params=params)
if response.status_code != 200:
return False, response.json().get('message', '')
# Shrink the log buffer to length lines
node_log = response.json()['data']
shrunk_log = node_log.split('\n')[-int(lines):]
loglines = '\n'.join(shrunk_log)
# Print the initial data and begin following
print(loglines, end='')
print('\n', end='')
while True:
# Grab the next line set (200 is a reasonable number of lines per half-second; any more are skipped)
try:
params = {
'lines': 200
}
response = call_api(config, 'get', '/node/{node}/log'.format(node=node), params=params)
new_node_log = response.json()['data']
except Exception:
break
# Split the new and old log strings into constituent lines
old_node_loglines = node_log.split('\n')
new_node_loglines = new_node_log.split('\n')
# Set the node log to the new log value for the next iteration
node_log = new_node_log
# Get the difference between the two sets of lines
old_node_loglines_set = set(old_node_loglines)
diff_node_loglines = [x for x in new_node_loglines if x not in old_node_loglines_set]
# If there's a difference, print it out
if len(diff_node_loglines) > 0:
print('\n'.join(diff_node_loglines), end='')
print('\n', end='')
# Wait half a second
time.sleep(0.5)
return True, ''
def node_info(config, node):
"""
Get information about node

View File

@ -54,12 +54,12 @@ def vm_info(config, vm):
return False, response.json().get('message', '')
def vm_list(config, limit, target_node, target_state):
def vm_list(config, limit, target_node, target_state, target_tag):
"""
Get list information about VMs (limited by {limit}, {target_node}, or {target_state})
API endpoint: GET /api/v1/vm
API arguments: limit={limit}, node={target_node}, state={target_state}
API arguments: limit={limit}, node={target_node}, state={target_state}, tag={target_tag}
API schema: [{json_data_object},{json_data_object},etc.]
"""
params = dict()
@ -69,6 +69,8 @@ def vm_list(config, limit, target_node, target_state):
params['node'] = target_node
if target_state:
params['state'] = target_state
if target_tag:
params['tag'] = target_tag
response = call_api(config, 'get', '/vm', params=params)
@ -78,12 +80,12 @@ def vm_list(config, limit, target_node, target_state):
return False, response.json().get('message', '')
def vm_define(config, xml, node, node_limit, node_selector, node_autostart, migration_method):
def vm_define(config, xml, node, node_limit, node_selector, node_autostart, migration_method, user_tags, protected_tags):
"""
Define a new VM on the cluster
API endpoint: POST /vm
API arguments: xml={xml}, node={node}, limit={node_limit}, selector={node_selector}, autostart={node_autostart}, migration_method={migration_method}
API arguments: xml={xml}, node={node}, limit={node_limit}, selector={node_selector}, autostart={node_autostart}, migration_method={migration_method}, user_tags={user_tags}, protected_tags={protected_tags}
API schema: {"message":"{data}"}
"""
params = {
@ -91,7 +93,9 @@ def vm_define(config, xml, node, node_limit, node_selector, node_autostart, migr
'limit': node_limit,
'selector': node_selector,
'autostart': node_autostart,
'migration_method': migration_method
'migration_method': migration_method,
'user_tags': user_tags,
'protected_tags': protected_tags
}
data = {
'xml': xml
@ -155,7 +159,7 @@ def vm_metadata(config, vm, node_limit, node_selector, node_autostart, migration
"""
Modify PVC metadata of a VM
API endpoint: GET /vm/{vm}/meta, POST /vm/{vm}/meta
API endpoint: POST /vm/{vm}/meta
API arguments: limit={node_limit}, selector={node_selector}, autostart={node_autostart}, migration_method={migration_method} profile={provisioner_profile}
API schema: {"message":"{data}"}
"""
@ -188,6 +192,119 @@ def vm_metadata(config, vm, node_limit, node_selector, node_autostart, migration
return retstatus, response.json().get('message', '')
def vm_tags_get(config, vm):
"""
Get PVC tags of a VM
API endpoint: GET /vm/{vm}/tags
API arguments:
API schema: {{"name": "{name}", "type": "{type}"},...}
"""
response = call_api(config, 'get', '/vm/{vm}/tags'.format(vm=vm))
if response.status_code == 200:
retstatus = True
retdata = response.json()
else:
retstatus = False
retdata = response.json().get('message', '')
return retstatus, retdata
def vm_tag_set(config, vm, action, tag, protected=False):
"""
Modify PVC tags of a VM
API endpoint: POST /vm/{vm}/tags
API arguments: action={action}, tag={tag}, protected={protected}
API schema: {"message":"{data}"}
"""
params = {
'action': action,
'tag': tag,
'protected': protected
}
# Update the tags
response = call_api(config, 'post', '/vm/{vm}/tags'.format(vm=vm), params=params)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get('message', '')
def format_vm_tags(config, name, tags):
"""
Format the output of a tags dictionary in a nice table
"""
if len(tags) < 1:
return "No tags found."
output_list = []
name_length = 5
_name_length = len(name) + 1
if _name_length > name_length:
name_length = _name_length
tags_name_length = 4
tags_type_length = 5
tags_protected_length = 10
for tag in tags:
_tags_name_length = len(tag['name']) + 1
if _tags_name_length > tags_name_length:
tags_name_length = _tags_name_length
_tags_type_length = len(tag['type']) + 1
if _tags_type_length > tags_type_length:
tags_type_length = _tags_type_length
_tags_protected_length = len(str(tag['protected'])) + 1
if _tags_protected_length > tags_protected_length:
tags_protected_length = _tags_protected_length
output_list.append(
'{bold}{tags_name: <{tags_name_length}} \
{tags_type: <{tags_type_length}} \
{tags_protected: <{tags_protected_length}}{end_bold}'.format(
name_length=name_length,
tags_name_length=tags_name_length,
tags_type_length=tags_type_length,
tags_protected_length=tags_protected_length,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
tags_name='Name',
tags_type='Type',
tags_protected='Protected'
)
)
for tag in sorted(tags, key=lambda t: t['name']):
output_list.append(
'{bold}{tags_name: <{tags_name_length}} \
{tags_type: <{tags_type_length}} \
{tags_protected: <{tags_protected_length}}{end_bold}'.format(
name_length=name_length,
tags_type_length=tags_type_length,
tags_name_length=tags_name_length,
tags_protected_length=tags_protected_length,
bold='',
end_bold='',
tags_name=tag['name'],
tags_type=tag['type'],
tags_protected=str(tag['protected'])
)
)
return '\n'.join(output_list)
def vm_remove(config, vm, delete_disks=False):
"""
Remove a VM
@ -1098,9 +1215,9 @@ def follow_console_log(config, vm, lines=10):
API arguments: lines={lines}
API schema: {"name":"{vmname}","data":"{console_log}"}
"""
# We always grab 500 to match the follow call, but only _show_ `lines` number
# We always grab 200 to match the follow call, but only _show_ `lines` number
params = {
'lines': 500
'lines': 200
}
response = call_api(config, 'get', '/vm/{vm}/console'.format(vm=vm), params=params)
@ -1116,10 +1233,10 @@ def follow_console_log(config, vm, lines=10):
print(loglines, end='')
while True:
# Grab the next line set (500 is a reasonable number of lines per second; any more are skipped)
# Grab the next line set (200 is a reasonable number of lines per half-second; any more are skipped)
try:
params = {
'lines': 500
'lines': 200
}
response = call_api(config, 'get', '/vm/{vm}/console'.format(vm=vm), params=params)
new_console_log = response.json()['data']
@ -1128,8 +1245,10 @@ def follow_console_log(config, vm, lines=10):
# Split the new and old log strings into constituent lines
old_console_loglines = console_log.split('\n')
new_console_loglines = new_console_log.split('\n')
# Set the console log to the new log value for the next iteration
console_log = new_console_log
# Remove the lines from the old log until we hit the first line of the new log; this
# ensures that the old log is a string that we can remove from the new log entirely
for index, line in enumerate(old_console_loglines, start=0):
@ -1144,8 +1263,8 @@ def follow_console_log(config, vm, lines=10):
# If there's a difference, print it out
if diff_console_log:
print(diff_console_log, end='')
# Wait a second
time.sleep(1)
# Wait half a second
time.sleep(0.5)
return True, ''
@ -1248,6 +1367,54 @@ def format_info(config, domain_information, long_output):
ainformation.append('{}Autostart:{} {}'.format(ansiprint.purple(), ansiprint.end(), formatted_node_autostart))
ainformation.append('{}Migration Method:{} {}'.format(ansiprint.purple(), ansiprint.end(), formatted_migration_method))
# Tag list
tags_name_length = 5
tags_type_length = 5
tags_protected_length = 10
for tag in domain_information['tags']:
_tags_name_length = len(tag['name']) + 1
if _tags_name_length > tags_name_length:
tags_name_length = _tags_name_length
_tags_type_length = len(tag['type']) + 1
if _tags_type_length > tags_type_length:
tags_type_length = _tags_type_length
_tags_protected_length = len(str(tag['protected'])) + 1
if _tags_protected_length > tags_protected_length:
tags_protected_length = _tags_protected_length
if len(domain_information['tags']) > 0:
ainformation.append('')
ainformation.append('{purple}Tags:{end} {bold}{tags_name: <{tags_name_length}} {tags_type: <{tags_type_length}} {tags_protected: <{tags_protected_length}}{end}'.format(
purple=ansiprint.purple(),
bold=ansiprint.bold(),
end=ansiprint.end(),
tags_name_length=tags_name_length,
tags_type_length=tags_type_length,
tags_protected_length=tags_protected_length,
tags_name='Name',
tags_type='Type',
tags_protected='Protected'
))
for tag in sorted(domain_information['tags'], key=lambda t: t['type'] + t['name']):
ainformation.append(' {tags_name: <{tags_name_length}} {tags_type: <{tags_type_length}} {tags_protected: <{tags_protected_length}}'.format(
tags_name_length=tags_name_length,
tags_type_length=tags_type_length,
tags_protected_length=tags_protected_length,
tags_name=tag['name'],
tags_type=tag['type'],
tags_protected=str(tag['protected'])
))
else:
ainformation.append('')
ainformation.append('{purple}Tags:{end} N/A'.format(
purple=ansiprint.purple(),
bold=ansiprint.bold(),
end=ansiprint.end(),
))
# Network list
net_list = []
cluster_net_list = call_api(config, 'get', '/network').json()
@ -1331,6 +1498,14 @@ def format_list(config, vm_list, raw):
net_list.append(net['vni'])
return net_list
# Function to get tag names and return a nicer list
def getNiceTagName(domain_information):
# Tag list
tag_list = []
for tag in sorted(domain_information['tags'], key=lambda t: t['type'] + t['name']):
tag_list.append(tag['name'])
return tag_list
# Handle raw mode since it just lists the names
if raw:
ainformation = list()
@ -1344,6 +1519,7 @@ def format_list(config, vm_list, raw):
# Dynamic columns: node_name, node, migrated
vm_name_length = 5
vm_state_length = 6
vm_tags_length = 5
vm_nets_length = 9
vm_ram_length = 8
vm_vcpu_length = 6
@ -1351,6 +1527,7 @@ def format_list(config, vm_list, raw):
vm_migrated_length = 9
for domain_information in vm_list:
net_list = getNiceNetID(domain_information)
tag_list = getNiceTagName(domain_information)
# vm_name column
_vm_name_length = len(domain_information['name']) + 1
if _vm_name_length > vm_name_length:
@ -1359,6 +1536,10 @@ def format_list(config, vm_list, raw):
_vm_state_length = len(domain_information['state']) + 1
if _vm_state_length > vm_state_length:
vm_state_length = _vm_state_length
# vm_tags column
_vm_tags_length = len(','.join(tag_list)) + 1
if _vm_tags_length > vm_tags_length:
vm_tags_length = _vm_tags_length
# vm_nets column
_vm_nets_length = len(','.join(net_list)) + 1
if _vm_nets_length > vm_nets_length:
@ -1375,12 +1556,12 @@ def format_list(config, vm_list, raw):
# Format the string (header)
vm_list_output.append(
'{bold}{vm_header: <{vm_header_length}} {resource_header: <{resource_header_length}} {node_header: <{node_header_length}}{end_bold}'.format(
vm_header_length=vm_name_length + vm_state_length + 1,
vm_header_length=vm_name_length + vm_state_length + vm_tags_length + 2,
resource_header_length=vm_nets_length + vm_ram_length + vm_vcpu_length + 2,
node_header_length=vm_node_length + vm_migrated_length + 1,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
vm_header='VMs ' + ''.join(['-' for _ in range(4, vm_name_length + vm_state_length)]),
vm_header='VMs ' + ''.join(['-' for _ in range(4, vm_name_length + vm_state_length + vm_tags_length + 1)]),
resource_header='Resources ' + ''.join(['-' for _ in range(10, vm_nets_length + vm_ram_length + vm_vcpu_length + 1)]),
node_header='Node ' + ''.join(['-' for _ in range(5, vm_node_length + vm_migrated_length)])
)
@ -1389,12 +1570,14 @@ def format_list(config, vm_list, raw):
vm_list_output.append(
'{bold}{vm_name: <{vm_name_length}} \
{vm_state_colour}{vm_state: <{vm_state_length}}{end_colour} \
{vm_tags: <{vm_tags_length}} \
{vm_networks: <{vm_nets_length}} \
{vm_memory: <{vm_ram_length}} {vm_vcpu: <{vm_vcpu_length}} \
{vm_node: <{vm_node_length}} \
{vm_migrated: <{vm_migrated_length}}{end_bold}'.format(
vm_name_length=vm_name_length,
vm_state_length=vm_state_length,
vm_tags_length=vm_tags_length,
vm_nets_length=vm_nets_length,
vm_ram_length=vm_ram_length,
vm_vcpu_length=vm_vcpu_length,
@ -1406,6 +1589,7 @@ def format_list(config, vm_list, raw):
end_colour='',
vm_name='Name',
vm_state='State',
vm_tags='Tags',
vm_networks='Networks',
vm_memory='RAM (M)',
vm_vcpu='vCPUs',
@ -1434,6 +1618,9 @@ def format_list(config, vm_list, raw):
# Handle colouring for an invalid network config
net_list = getNiceNetID(domain_information)
tag_list = getNiceTagName(domain_information)
if len(tag_list) < 1:
tag_list = ['N/A']
vm_net_colour = ''
for net_vni in net_list:
if net_vni not in ['cluster', 'storage', 'upstream'] and not re.match(r'^macvtap:.*', net_vni) and not re.match(r'^hostdev:.*', net_vni):
@ -1443,12 +1630,14 @@ def format_list(config, vm_list, raw):
vm_list_output.append(
'{bold}{vm_name: <{vm_name_length}} \
{vm_state_colour}{vm_state: <{vm_state_length}}{end_colour} \
{vm_tags: <{vm_tags_length}} \
{vm_net_colour}{vm_networks: <{vm_nets_length}}{end_colour} \
{vm_memory: <{vm_ram_length}} {vm_vcpu: <{vm_vcpu_length}} \
{vm_node: <{vm_node_length}} \
{vm_migrated: <{vm_migrated_length}}{end_bold}'.format(
vm_name_length=vm_name_length,
vm_state_length=vm_state_length,
vm_tags_length=vm_tags_length,
vm_nets_length=vm_nets_length,
vm_ram_length=vm_ram_length,
vm_vcpu_length=vm_vcpu_length,
@ -1460,6 +1649,7 @@ def format_list(config, vm_list, raw):
end_colour=ansiprint.end(),
vm_name=domain_information['name'],
vm_state=domain_information['state'],
vm_tags=','.join(tag_list),
vm_net_colour=vm_net_colour,
vm_networks=','.join(net_list),
vm_memory=domain_information['memory'],

View File

@ -251,7 +251,11 @@ def cluster_remove(name):
# pvc cluster list
###############################################################################
@click.command(name='list', short_help='List all available clusters.')
def cluster_list():
@click.option(
'-r', '--raw', 'raw', is_flag=True, default=False,
help='Display the raw list of cluster names only.'
)
def cluster_list(raw):
"""
List all the available PVC clusters configured in this CLI instance.
"""
@ -302,27 +306,28 @@ def cluster_list():
if _api_key_length > api_key_length:
api_key_length = _api_key_length
# Display the data nicely
click.echo("Available clusters:")
click.echo()
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
name="Name",
name_length=name_length,
description="Description",
description_length=description_length,
address="Address",
address_length=address_length,
port="Port",
port_length=port_length,
scheme="Scheme",
scheme_length=scheme_length,
api_key="API Key",
api_key_length=api_key_length
if not raw:
# Display the data nicely
click.echo("Available clusters:")
click.echo()
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
name="Name",
name_length=name_length,
description="Description",
description_length=description_length,
address="Address",
address_length=address_length,
port="Port",
port_length=port_length,
scheme="Scheme",
scheme_length=scheme_length,
api_key="API Key",
api_key_length=api_key_length
)
)
)
for cluster in clusters:
cluster_details = clusters[cluster]
@ -341,24 +346,27 @@ def cluster_list():
if not api_key:
api_key = 'N/A'
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold='',
end_bold='',
name=cluster,
name_length=name_length,
description=description,
description_length=description_length,
address=address,
address_length=address_length,
port=port,
port_length=port_length,
scheme=scheme,
scheme_length=scheme_length,
api_key=api_key,
api_key_length=api_key_length
if not raw:
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold='',
end_bold='',
name=cluster,
name_length=name_length,
description=description,
description_length=description_length,
address=address,
address_length=address_length,
port=port,
port_length=port_length,
scheme=scheme,
scheme_length=scheme_length,
api_key=api_key,
api_key_length=api_key_length
)
)
)
else:
click.echo(cluster)
# Validate that the cluster is set for a given command
@ -532,6 +540,43 @@ def node_unflush(node, wait):
cleanup(retcode, retmsg)
###############################################################################
# pvc node log
###############################################################################
@click.command(name='log', short_help='Show logs of a node.')
@click.argument(
'node'
)
@click.option(
'-l', '--lines', 'lines', default=None, show_default=False,
help='Display this many log lines from the end of the log buffer. [default: 1000; with follow: 10]'
)
@click.option(
'-f', '--follow', 'follow', is_flag=True, default=False,
help='Follow the log buffer; output may be delayed by a few seconds relative to the live system. The --lines value defaults to 10 for the initial output.'
)
@cluster_req
def node_log(node, lines, follow):
"""
Show the daemon logs of node NODE in a pager or continuously.
"""
# Set the default here so we can handle it
if lines is None:
if follow:
lines = 10
else:
lines = 1000
if follow:
retcode, retmsg = pvc_node.follow_node_log(config, node, lines)
else:
retcode, retmsg = pvc_node.view_node_log(config, node, lines)
click.echo_via_pager(retmsg)
retmsg = ''
cleanup(retcode, retmsg)
###############################################################################
# pvc node info
###############################################################################
@ -630,11 +675,21 @@ def cli_vm():
type=click.Choice(['none', 'live', 'shutdown']),
help='The preferred migration method of the VM between nodes; saved with VM.'
)
@click.option(
'-g', '--tag', 'user_tags',
default=[], multiple=True,
help='User tag for the VM; can be specified multiple times, once per tag.'
)
@click.option(
'-G', '--protected-tag', 'protected_tags',
default=[], multiple=True,
help='Protected user tag for the VM; can be specified multiple times, once per tag.'
)
@click.argument(
'vmconfig', type=click.File()
)
@cluster_req
def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart, migration_method):
def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart, migration_method, user_tags, protected_tags):
"""
Define a new virtual machine from Libvirt XML configuration file VMCONFIG.
"""
@ -650,7 +705,7 @@ def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart,
except Exception:
cleanup(False, 'Error: XML is malformed or invalid')
retcode, retmsg = pvc_vm.vm_define(config, new_cfg, target_node, node_limit, node_selector, node_autostart, migration_method)
retcode, retmsg = pvc_vm.vm_define(config, new_cfg, target_node, node_limit, node_selector, node_autostart, migration_method, user_tags, protected_tags)
cleanup(retcode, retmsg)
@ -709,9 +764,19 @@ def vm_meta(domain, node_limit, node_selector, node_autostart, migration_method,
help='Immediately restart VM to apply new config.'
)
@click.option(
'-y', '--yes', 'confirm_flag',
'-d', '--confirm-diff', 'confirm_diff_flag',
is_flag=True, default=False,
help='Confirm the restart'
help='Confirm the diff.'
)
@click.option(
'-c', '--confirm-restart', 'confirm_restart_flag',
is_flag=True, default=False,
help='Confirm the restart.'
)
@click.option(
'-y', '--yes', 'confirm_all_flag',
is_flag=True, default=False,
help='Confirm the diff and the restart.'
)
@click.argument(
'domain'
@ -719,7 +784,7 @@ def vm_meta(domain, node_limit, node_selector, node_autostart, migration_method,
@click.argument(
'cfgfile', type=click.File(), default=None, required=False
)
def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
def vm_modify(domain, cfgfile, editor, restart, confirm_diff_flag, confirm_restart_flag, confirm_all_flag):
"""
Modify existing virtual machine DOMAIN, either in-editor or with replacement CONFIG. DOMAIN may be a UUID or name.
"""
@ -733,12 +798,12 @@ def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
dom_name = vm_information.get('name')
if editor is True:
# Grab the current config
current_vm_cfg_raw = vm_information.get('xml')
xml_data = etree.fromstring(current_vm_cfg_raw)
current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8').strip()
# Grab the current config
current_vm_cfg_raw = vm_information.get('xml')
xml_data = etree.fromstring(current_vm_cfg_raw)
current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8').strip()
if editor is True:
new_vm_cfgfile = click.edit(text=current_vm_cfgfile, require_save=True, extension='.xml')
if new_vm_cfgfile is None:
click.echo('Aborting with no modifications.')
@ -776,9 +841,10 @@ def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
except Exception as e:
cleanup(False, 'Error: XML is malformed or invalid: {}'.format(e))
click.confirm('Write modifications to cluster?', abort=True)
if not confirm_diff_flag and not confirm_all_flag and not config['unsafe']:
click.confirm('Write modifications to cluster?', abort=True)
if restart and not confirm_flag and not config['unsafe']:
if restart and not confirm_restart_flag and not confirm_all_flag and not config['unsafe']:
try:
click.confirm('Restart VM {}'.format(domain), prompt_suffix='? ', abort=True)
except Exception:
@ -1103,6 +1169,90 @@ def vm_flush_locks(domain):
cleanup(retcode, retmsg)
###############################################################################
# pvc vm tag
###############################################################################
@click.group(name='tag', short_help='Manage tags of a virtual machine.', context_settings=CONTEXT_SETTINGS)
def vm_tags():
"""
Manage the tags of a virtual machine in the PVC cluster.
"""
pass
###############################################################################
# pvc vm tag get
###############################################################################
@click.command(name='get', short_help='Get the current tags of a virtual machine.')
@click.argument(
'domain'
)
@click.option(
'-r', '--raw', 'raw', is_flag=True, default=False,
help='Display the raw value only without formatting.'
)
@cluster_req
def vm_tags_get(domain, raw):
"""
Get the current tags of the virtual machine DOMAIN.
"""
retcode, retdata = pvc_vm.vm_tags_get(config, domain)
if retcode:
if not raw:
retdata = pvc_vm.format_vm_tags(config, domain, retdata['tags'])
else:
if len(retdata['tags']) > 0:
retdata = '\n'.join([tag['name'] for tag in retdata['tags']])
else:
retdata = 'No tags found.'
cleanup(retcode, retdata)
###############################################################################
# pvc vm tag add
###############################################################################
@click.command(name='add', short_help='Add new tags to a virtual machine.')
@click.argument(
'domain'
)
@click.argument(
'tag'
)
@click.option(
'-p', '--protected', 'protected', is_flag=True, required=False, default=False,
help="Set this tag as protected; protected tags cannot be removed."
)
@cluster_req
def vm_tags_add(domain, tag, protected):
"""
Add TAG to the virtual machine DOMAIN.
"""
retcode, retmsg = pvc_vm.vm_tag_set(config, domain, 'add', tag, protected)
cleanup(retcode, retmsg)
###############################################################################
# pvc vm tag remove
###############################################################################
@click.command(name='remove', short_help='Remove tags from a virtual machine.')
@click.argument(
'domain'
)
@click.argument(
'tag'
)
@cluster_req
def vm_tags_remove(domain, tag):
"""
Remove TAG from the virtual machine DOMAIN.
"""
retcode, retmsg = pvc_vm.vm_tag_set(config, domain, 'remove', tag)
cleanup(retcode, retmsg)
###############################################################################
# pvc vm vcpu
###############################################################################
@ -1645,19 +1795,23 @@ def vm_dump(filename, domain):
'-s', '--state', 'target_state', default=None,
help='Limit list to VMs in the specified state.'
)
@click.option(
'-g', '--tag', 'target_tag', default=None,
help='Limit list to VMs with the specified tag.'
)
@click.option(
'-r', '--raw', 'raw', is_flag=True, default=False,
help='Display the raw list of VM names only.'
)
@cluster_req
def vm_list(target_node, target_state, limit, raw):
def vm_list(target_node, target_state, target_tag, limit, raw):
"""
List all virtual machines; optionally only match names matching regex LIMIT.
List all virtual machines; optionally only match names or full UUIDs matching regex LIMIT.
NOTE: Red-coloured network lists indicate one or more configured networks are missing/invalid.
"""
retcode, retdata = pvc_vm.vm_list(config, limit, target_node, target_state)
retcode, retdata = pvc_vm.vm_list(config, limit, target_node, target_state, target_tag)
if retcode:
retdata = pvc_vm.format_list(config, retdata, raw)
else:
@ -4601,9 +4755,14 @@ cli_node.add_command(node_primary)
cli_node.add_command(node_flush)
cli_node.add_command(node_ready)
cli_node.add_command(node_unflush)
cli_node.add_command(node_log)
cli_node.add_command(node_info)
cli_node.add_command(node_list)
vm_tags.add_command(vm_tags_get)
vm_tags.add_command(vm_tags_add)
vm_tags.add_command(vm_tags_remove)
vm_vcpu.add_command(vm_vcpu_get)
vm_vcpu.add_command(vm_vcpu_set)
@ -4634,6 +4793,7 @@ cli_vm.add_command(vm_move)
cli_vm.add_command(vm_migrate)
cli_vm.add_command(vm_unmigrate)
cli_vm.add_command(vm_flush_locks)
cli_vm.add_command(vm_tags)
cli_vm.add_command(vm_vcpu)
cli_vm.add_command(vm_memory)
cli_vm.add_command(vm_network)


@ -2,7 +2,7 @@ from setuptools import setup
setup(
name='pvc',
version='0.9.22',
version='0.9.31',
packages=['pvc', 'pvc.cli_lib'],
install_requires=[
'Click',


@ -491,7 +491,7 @@ def add_volume(zkhandler, pool, name, size):
size = '{}B'.format(size)
# 2. Create the volume
retcode, stdout, stderr = common.run_os_command('rbd create --size {} --image-feature layering,exclusive-lock {}/{}'.format(size, pool, name))
retcode, stdout, stderr = common.run_os_command('rbd create --size {} {}/{}'.format(size, pool, name))
if retcode:
return False, 'ERROR: Failed to create RBD volume "{}": {}'.format(name, stderr)
@ -536,6 +536,10 @@ def resize_volume(zkhandler, pool, name, size):
if not verifyVolume(zkhandler, pool, name):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
# Add 'B' if the volume is in bytes
if re.match(r'^[0-9]+$', size):
size = '{}B'.format(size)
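# Illustrative note (not part of the original change): this mirrors the same
# normalisation done in add_volume() above, so a bare byte count such as
# '1073741824' becomes '1073741824B', while an already-suffixed value like
# '100G' passes through unchanged.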
# 1. Resize the volume
retcode, stdout, stderr = common.run_os_command('rbd resize --size {} {}/{}'.format(size, pool, name))
if retcode:


@ -60,7 +60,7 @@ def getClusterInformation(zkhandler):
retcode, node_list = pvc_node.get_list(zkhandler, None)
# Get vm information object list
retcode, vm_list = pvc_vm.get_list(zkhandler, None, None, None)
retcode, vm_list = pvc_vm.get_list(zkhandler, None, None, None, None)
# Get network information object list
retcode, network_list = pvc_network.get_list(zkhandler, None, None)


@ -306,6 +306,50 @@ def getDomainDiskList(zkhandler, dom_uuid):
return disk_list
#
# Get a list of domain tags
#
def getDomainTags(zkhandler, dom_uuid):
"""
Get a list of tags for domain dom_uuid
The UUID must be validated before calling this function!
"""
tags = list()
for tag in zkhandler.children(('domain.meta.tags', dom_uuid)):
tag_type = zkhandler.read(('domain.meta.tags', dom_uuid, 'tag.type', tag))
protected = bool(strtobool(zkhandler.read(('domain.meta.tags', dom_uuid, 'tag.protected', tag))))
tags.append({'name': tag, 'type': tag_type, 'protected': protected})
return tags
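# Illustrative example (not part of the original change): for a VM carrying a
# single protected user tag named "backup" (a hypothetical tag name), this
# function would return:
#   [{'name': 'backup', 'type': 'user', 'protected': True}]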
#
# Get a set of domain metadata
#
def getDomainMetadata(zkhandler, dom_uuid):
"""
Get the domain metadata for domain dom_uuid
The UUID must be validated before calling this function!
"""
domain_node_limit = zkhandler.read(('domain.meta.node_limit', dom_uuid))
domain_node_selector = zkhandler.read(('domain.meta.node_selector', dom_uuid))
domain_node_autostart = zkhandler.read(('domain.meta.autostart', dom_uuid))
domain_migration_method = zkhandler.read(('domain.meta.migrate_method', dom_uuid))
if not domain_node_limit:
domain_node_limit = None
else:
domain_node_limit = domain_node_limit.split(',')
if not domain_node_autostart:
domain_node_autostart = None
return domain_node_limit, domain_node_selector, domain_node_autostart, domain_migration_method
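# Illustrative example (not part of the original change): with a stored
# node_limit of "hv1,hv2" (hypothetical node names), the returned limit is the
# list ['hv1', 'hv2']; an empty node_limit or autostart value is normalised to
# None before being returned.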
#
# Get domain information from XML
#
@ -319,19 +363,8 @@ def getInformationFromXML(zkhandler, uuid):
domain_lastnode = zkhandler.read(('domain.last_node', uuid))
domain_failedreason = zkhandler.read(('domain.failed_reason', uuid))
domain_node_limit = zkhandler.read(('domain.meta.node_limit', uuid))
domain_node_selector = zkhandler.read(('domain.meta.node_selector', uuid))
domain_node_autostart = zkhandler.read(('domain.meta.autostart', uuid))
domain_migration_method = zkhandler.read(('domain.meta.migrate_method', uuid))
if not domain_node_limit:
domain_node_limit = None
else:
domain_node_limit = domain_node_limit.split(',')
if not domain_node_autostart:
domain_node_autostart = None
domain_node_limit, domain_node_selector, domain_node_autostart, domain_migration_method = getDomainMetadata(zkhandler, uuid)
domain_tags = getDomainTags(zkhandler, uuid)
domain_profile = zkhandler.read(('domain.profile', uuid))
domain_vnc = zkhandler.read(('domain.console.vnc', uuid))
@ -343,8 +376,13 @@ def getInformationFromXML(zkhandler, uuid):
parsed_xml = getDomainXML(zkhandler, uuid)
stats_data = loads(zkhandler.read(('domain.stats', uuid)))
if stats_data is None:
stats_data = zkhandler.read(('domain.stats', uuid))
if stats_data is not None:
try:
stats_data = loads(stats_data)
except Exception:
stats_data = {}
else:
stats_data = {}
domain_uuid, domain_name, domain_description, domain_memory, domain_vcpu, domain_vcputopo = getDomainMainDetails(parsed_xml)
@ -373,6 +411,7 @@ def getInformationFromXML(zkhandler, uuid):
'node_selector': domain_node_selector,
'node_autostart': bool(strtobool(domain_node_autostart)),
'migration_method': domain_migration_method,
'tags': domain_tags,
'description': domain_description,
'profile': domain_profile,
'memory': int(domain_memory),


@ -1,6 +1,6 @@
#!/usr/bin/env python3
# log.py - Output (stdout + logfile) functions
# log.py - PVC daemon logger functions
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
@ -19,7 +19,13 @@
#
###############################################################################
import datetime
from collections import deque
from threading import Thread
from queue import Queue
from datetime import datetime
from time import sleep
from daemon_lib.zkhandler import ZKHandler
class Logger(object):
@ -77,17 +83,39 @@ class Logger(object):
self.last_colour = ''
self.last_prompt = ''
if self.config['zookeeper_logging']:
self.zookeeper_queue = Queue()
self.zookeeper_logger = ZookeeperLogger(self.config, self.zookeeper_queue)
self.zookeeper_logger.start()
# Provide a hup function to close and reopen the writer
def hup(self):
self.writer.close()
self.writer = open(self.logfile, 'a', buffering=0)
# Provide a termination function so all messages are flushed before terminating the main daemon
def terminate(self):
if self.config['file_logging']:
self.writer.close()
if self.config['zookeeper_logging']:
self.out("Waiting 15s for Zookeeper message queue to drain", state='s')
tick_count = 0
while not self.zookeeper_queue.empty():
sleep(0.5)
tick_count += 1
if tick_count > 30:
break
self.zookeeper_logger.stop()
self.zookeeper_logger.join()
# Output function
def out(self, message, state=None, prefix=''):
# Get the date
if self.config['log_dates']:
date = '{} - '.format(datetime.datetime.now().strftime('%Y/%m/%d %H:%M:%S.%f'))
date = '{} '.format(datetime.now().strftime('%Y/%m/%d %H:%M:%S.%f'))
else:
date = ''
@ -123,6 +151,98 @@ class Logger(object):
if self.config['file_logging']:
self.writer.write(message + '\n')
# Log to Zookeeper
if self.config['zookeeper_logging']:
self.zookeeper_queue.put(message)
# Set last message variables
self.last_colour = colour
self.last_prompt = prompt
class ZookeeperLogger(Thread):
"""
Defines a threaded writer for Zookeeper logs. Threading prevents the blocking of other
daemon events while the records are written. They will be eventually-consistent.
"""
def __init__(self, config, zookeeper_queue):
self.config = config
self.node = self.config['node']
self.max_lines = self.config['node_log_lines']
self.zookeeper_queue = zookeeper_queue
self.connected = False
self.running = False
self.zkhandler = None
Thread.__init__(self, args=(), kwargs=None)
def start_zkhandler(self):
# We must open our own dedicated Zookeeper instance because we can't guarantee one already exists when this starts
if self.zkhandler is not None:
try:
self.zkhandler.disconnect()
except Exception:
pass
while True:
try:
self.zkhandler = ZKHandler(self.config, logger=None)
self.zkhandler.connect(persistent=True)
break
except Exception:
sleep(0.5)
continue
self.connected = True
# Ensure the root keys for this are instantiated
self.zkhandler.write([
('base.logs', ''),
(('logs', self.node), '')
])
def run(self):
while not self.connected:
self.start_zkhandler()
sleep(1)
self.running = True
# Get the logs that are currently in Zookeeper and populate our deque
raw_logs = self.zkhandler.read(('logs.messages', self.node))
if raw_logs is None:
raw_logs = ''
logs = deque(raw_logs.split('\n'), self.max_lines)
while self.running:
# Get a new message
try:
message = self.zookeeper_queue.get(timeout=1)
if not message:
continue
except Exception:
continue
if not self.config['log_dates']:
# We want to log dates here, even if the log_dates config is not set
date = '{} '.format(datetime.now().strftime('%Y/%m/%d %H:%M:%S.%f'))
else:
date = ''
# Add the message to the deque
logs.append(f'{date}{message}')
tick_count = 0
while True:
try:
# Write the updated messages into Zookeeper
self.zkhandler.write([(('logs.messages', self.node), '\n'.join(logs))])
break
except Exception:
# The write failed (connection loss, etc.) so retry for 15 seconds
sleep(0.5)
tick_count += 1
if tick_count > 30:
break
else:
continue
return
def stop(self):
self.running = False
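# Rough usage sketch (illustrative, not part of the original change): the Logger
# enqueues each formatted message and this thread drains the queue into a deque
# bounded at node_log_lines, rewriting the node's log key on every message.
# Assuming the daemon constructs the logger from its config as elsewhere:
#
#   logger = Logger(config)                                # starts ZookeeperLogger if zookeeper_logging is enabled
#   logger.out('Hypothetical startup message', state='o')  # printed, written to file, and queued for Zookeeper
#   logger.terminate()                                      # waits up to 15s for the Zookeeper queue to drain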


@ -0,0 +1 @@
{"version": "3", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "data.pvc_version": "/pvcversion", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "meta.tags": "/tags", "migrate.sync_lock": "/migrate_sync_lock"}, "tag": {"name": "", "type": "/type", "protected": "/protected"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, "osd": {"id": "", "node": "/node", "device": "/device", "stats": 
"/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}


@ -0,0 +1 @@
{"version": "4", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "logs": "/logs", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "logs": {"node": "", "messages": "/messages"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "data.pvc_version": "/pvcversion", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "meta.tags": "/tags", "migrate.sync_lock": "/migrate_sync_lock"}, "tag": {"name": "", "type": "/type", "protected": "/protected"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, 
"osd": {"id": "", "node": "/node", "device": "/device", "stats": "/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}


@ -182,6 +182,24 @@ def ready_node(zkhandler, node, wait=False):
return True, retmsg
def get_node_log(zkhandler, node, lines=2000):
# Verify node is valid
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Get the data from ZK
node_log = zkhandler.read(('logs.messages', node))
if node_log is None:
return True, ''
# Shrink the log buffer to length lines
shrunk_log = node_log.split('\n')[-lines:]
loglines = '\n'.join(shrunk_log)
return True, loglines
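# Illustrative example (not part of the original change): with lines=3 and a
# stored log of "a\nb\nc\nd\ne", the returned text is "c\nd\ne", i.e. only the
# most recent entries are kept.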
def get_info(zkhandler, node):
# Verify node is valid
if not common.verifyNode(zkhandler, node):


@ -24,6 +24,7 @@ import re
import lxml.objectify
import lxml.etree
from distutils.util import strtobool
from uuid import UUID
from concurrent.futures import ThreadPoolExecutor
@ -174,7 +175,7 @@ def flush_locks(zkhandler, domain):
return success, message
def define_vm(zkhandler, config_data, target_node, node_limit, node_selector, node_autostart, migration_method=None, profile=None, initial_state='stop'):
def define_vm(zkhandler, config_data, target_node, node_limit, node_selector, node_autostart, migration_method=None, profile=None, tags=[], initial_state='stop'):
# Parse the XML data
try:
parsed_xml = lxml.objectify.fromstring(config_data)
@ -246,9 +247,18 @@ def define_vm(zkhandler, config_data, target_node, node_limit, node_selector, no
(('domain.meta.migrate_method', dom_uuid), migration_method),
(('domain.meta.node_limit', dom_uuid), formatted_node_limit),
(('domain.meta.node_selector', dom_uuid), node_selector),
(('domain.meta.tags', dom_uuid), ''),
(('domain.migrate.sync_lock', dom_uuid), ''),
])
for tag in tags:
tag_name = tag['name']
zkhandler.write([
(('domain.meta.tags', dom_uuid, 'tag.name', tag_name), tag['name']),
(('domain.meta.tags', dom_uuid, 'tag.type', tag_name), tag['type']),
(('domain.meta.tags', dom_uuid, 'tag.protected', tag_name), tag['protected']),
])
return True, 'Added new VM with Name "{}" and UUID "{}" to database.'.format(dom_name, dom_uuid)
@ -282,6 +292,38 @@ def modify_vm_metadata(zkhandler, domain, node_limit, node_selector, node_autost
return True, 'Successfully modified PVC metadata of VM "{}".'.format(domain)
def modify_vm_tag(zkhandler, domain, action, tag, protected=False):
dom_uuid = getDomainUUID(zkhandler, domain)
if not dom_uuid:
return False, 'ERROR: Could not find VM "{}" in the cluster!'.format(domain)
if action == 'add':
zkhandler.write([
(('domain.meta.tags', dom_uuid, 'tag.name', tag), tag),
(('domain.meta.tags', dom_uuid, 'tag.type', tag), 'user'),
(('domain.meta.tags', dom_uuid, 'tag.protected', tag), protected),
])
return True, 'Successfully added tag "{}" to VM "{}".'.format(tag, domain)
elif action == 'remove':
if not zkhandler.exists(('domain.meta.tags', dom_uuid, 'tag', tag)):
return False, 'The tag "{}" does not exist.'.format(tag)
if zkhandler.read(('domain.meta.tags', dom_uuid, 'tag.type', tag)) != 'user':
return False, 'The tag "{}" is not a user tag and cannot be removed.'.format(tag)
if bool(strtobool(zkhandler.read(('domain.meta.tags', dom_uuid, 'tag.protected', tag)))):
return False, 'The tag "{}" is protected and cannot be removed.'.format(tag)
zkhandler.delete([
(('domain.meta.tags', dom_uuid, 'tag', tag))
])
return True, 'Successfully removed tag "{}" from VM "{}".'.format(tag, domain)
else:
return False, 'Specified tag action is not available.'
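# Illustrative usage (not part of the original change), mirroring the CLI
# commands above; 'myvm' and 'backup' are hypothetical names:
#   modify_vm_tag(zkhandler, 'myvm', 'add', 'backup', protected=True)
#   modify_vm_tag(zkhandler, 'myvm', 'remove', 'backup')  # refused: the tag is protected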
def modify_vm(zkhandler, domain, restart, new_vm_config):
dom_uuid = getDomainUUID(zkhandler, domain)
if not dom_uuid:
@ -403,7 +445,7 @@ def rename_vm(zkhandler, domain, new_domain):
undefine_vm(zkhandler, dom_uuid)
# Define the new VM
define_vm(zkhandler, vm_config_new, dom_info['node'], dom_info['node_limit'], dom_info['node_selector'], dom_info['node_autostart'], migration_method=dom_info['migration_method'], profile=dom_info['profile'], initial_state='stop')
define_vm(zkhandler, vm_config_new, dom_info['node'], dom_info['node_limit'], dom_info['node_selector'], dom_info['node_autostart'], migration_method=dom_info['migration_method'], profile=dom_info['profile'], tags=dom_info['tags'], initial_state='stop')
# If the VM is migrated, store that
if dom_info['migrated'] != 'no':
@ -449,14 +491,6 @@ def remove_vm(zkhandler, domain):
if current_vm_state != 'stop':
change_state(zkhandler, dom_uuid, 'stop')
# Gracefully terminate the class instances
change_state(zkhandler, dom_uuid, 'delete')
# Delete the configurations
zkhandler.delete([
('domain', dom_uuid)
])
# Wait for 1 second to allow state to flow to all nodes
time.sleep(1)
@ -465,11 +499,28 @@ def remove_vm(zkhandler, domain):
# vmpool/vmname_volume
try:
disk_pool, disk_name = disk.split('/')
retcode, message = ceph.remove_volume(zkhandler, disk_pool, disk_name)
except ValueError:
continue
return True, 'Removed VM "{}" and disks from the cluster.'.format(domain)
retcode, message = ceph.remove_volume(zkhandler, disk_pool, disk_name)
if not retcode:
if re.match('^ERROR: No volume with name', message):
continue
else:
return False, message
# Gracefully terminate the class instances
change_state(zkhandler, dom_uuid, 'delete')
# Wait for 1/2 second to allow state to flow to all nodes
time.sleep(0.5)
# Delete the VM configuration from Zookeeper
zkhandler.delete([
('domain', dom_uuid)
])
return True, 'Removed VM "{}" and its disks from the cluster.'.format(domain)
def start_vm(zkhandler, domain):
@ -789,7 +840,10 @@ def get_console_log(zkhandler, domain, lines=1000):
return False, 'ERROR: Could not find VM "{}" in the cluster!'.format(domain)
# Get the data from ZK
console_log = zkhandler.read(('domain.log.console', dom_uuid))
console_log = zkhandler.read(('domain.console.log', dom_uuid))
if console_log is None:
return True, ''
# Shrink the log buffer to length lines
shrunk_log = console_log.split('\n')[-lines:]
@ -812,7 +866,7 @@ def get_info(zkhandler, domain):
return True, domain_information
def get_list(zkhandler, node, state, limit, is_fuzzy=True):
def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True):
if node:
# Verify node is valid
if not common.verifyNode(zkhandler, node):
@ -850,6 +904,7 @@ def get_list(zkhandler, node, state, limit, is_fuzzy=True):
for vm in full_vm_list:
name = zkhandler.read(('domain', vm))
is_limit_match = False
is_tag_match = False
is_node_match = False
is_state_match = False
@ -866,6 +921,13 @@ def get_list(zkhandler, node, state, limit, is_fuzzy=True):
else:
is_limit_match = True
if tag:
vm_tags = zkhandler.children(('domain.meta.tags', vm))
if tag in vm_tags:
is_tag_match = True
else:
is_tag_match = True
# Check on node
if node:
vm_node = zkhandler.read(('domain.node', vm))
@ -882,7 +944,7 @@ def get_list(zkhandler, node, state, limit, is_fuzzy=True):
else:
is_state_match = True
get_vm_info[vm] = True if is_limit_match and is_node_match and is_state_match else False
get_vm_info[vm] = True if is_limit_match and is_tag_match and is_node_match and is_state_match else False
# Obtain our VM data in a thread pool
# This helps parallelize the numerous Zookeeper calls a bit, within the bounds of the GIL, and
@ -897,6 +959,9 @@ def get_list(zkhandler, node, state, limit, is_fuzzy=True):
for vm_uuid in vm_execute_list:
futures.append(executor.submit(common.getInformationFromXML, zkhandler, vm_uuid))
for future in futures:
vm_data_list.append(future.result())
try:
vm_data_list.append(future.result())
except Exception:
pass
return True, vm_data_list


@ -124,37 +124,29 @@ class ZKHandler(object):
# State/connection management
#
def listener(self, state):
"""
Listen for KazooState changes and log accordingly.
This function does not do anything except log the state; Kazoo handles the rest.
"""
if state == KazooState.CONNECTED:
self.log('Connection to Zookeeper started', state='o')
self.log('Connection to Zookeeper resumed', state='o')
else:
self.log('Connection to Zookeeper lost', state='w')
while True:
time.sleep(0.5)
_zk_conn = KazooClient(hosts=self.coordinators)
try:
_zk_conn.start()
except Exception:
del _zk_conn
continue
self.zk_conn = _zk_conn
self.zk_conn.add_listener(self.listener)
break
self.log('Connection to Zookeeper lost with state {}'.format(state), state='w')
def connect(self, persistent=False):
"""
Start the zk_conn object and connect to the cluster, then load the current schema version
Start the zk_conn object and connect to the cluster
"""
try:
self.zk_conn.start()
if persistent:
self.log('Connection to Zookeeper started', state='o')
self.zk_conn.add_listener(self.listener)
except Exception as e:
raise ZKConnectionException(self, e)
def disconnect(self):
def disconnect(self, persistent=False):
"""
Stop and close the zk_conn object and disconnect from the cluster
@ -162,11 +154,27 @@ class ZKHandler(object):
"""
self.zk_conn.stop()
self.zk_conn.close()
if persistent:
self.log('Connection to Zookeeper terminated', state='o')
#
# Schema helper actions
#
def get_schema_path(self, key):
"""
Get the Zookeeper path for {key} from the current schema based on its format.
If {key} is a tuple of length 2, it's treated as a path plus an item instance of that path (e.g. a node, a VM, etc.).
If {key} is a tuple of length 4, it is treated as a path plus an item instance, as well as another item instance of the subpath.
If {key} is just a string, it's treated as a lone path (mostly used for the 'base' schema group).
Otherwise, returns None since this is not a valid key.
This function also handles the special case where a string that looks like an existing path (i.e. starts with '/') is passed;
in that case it will silently return the same path back. This exists mostly for migration purposes and is deprecated.
"""
if isinstance(key, tuple):
# This is a key tuple with both an ipath and an item
if len(key) == 2:
@ -201,6 +209,10 @@ class ZKHandler(object):
Check if a key exists
"""
path = self.get_schema_path(key)
if path is None:
# This path is invalid; this is likely due to missing schema entries, so return False
return False
stat = self.zk_conn.exists(path)
if stat:
return True
@ -213,11 +225,13 @@ class ZKHandler(object):
"""
try:
path = self.get_schema_path(key)
data = self.zk_conn.get(path)[0].decode(self.encoding)
except NoNodeError:
data = None
if path is None:
# This path is invalid; this is likely due to missing schema entries, so return None
return None
return data
return self.zk_conn.get(path)[0].decode(self.encoding)
except NoNodeError:
return None
def write(self, kvpairs):
"""
@ -238,6 +252,9 @@ class ZKHandler(object):
value = kvpair[1]
path = self.get_schema_path(key)
if path is None:
# This path is invalid; this is likely due to missing schema entries, so continue
continue
if not self.exists(key):
# Creating a new key
@ -276,9 +293,9 @@ class ZKHandler(object):
keys = [keys]
for key in keys:
path = self.get_schema_path(key)
if self.exists(key):
try:
path = self.get_schema_path(key)
self.zk_conn.delete(path, recursive=recursive)
except Exception as e:
self.log("ZKHandler error: Failed to delete key {}: {}".format(path, e), state='e')
@ -292,11 +309,13 @@ class ZKHandler(object):
"""
try:
path = self.get_schema_path(key)
children = self.zk_conn.get_children(path)
except NoNodeError:
children = None
if path is None:
# This path is invalid; this is likely due to missing schema entries, so return None
return None
return children
return self.zk_conn.get_children(path)
except NoNodeError:
return None
def rename(self, kkpairs):
"""
@ -327,13 +346,20 @@ class ZKHandler(object):
source_key = kkpair[0]
source_path = self.get_schema_path(source_key)
if source_path is None:
# This path is invalid; this is likely due to missing schema entries, so continue
continue
destination_key = kkpair[1]
destination_path = self.get_schema_path(destination_key)
if destination_path is None:
# This path is invalid; this is likely due to missing schema entries, so continue
continue
if not self.exists(source_key):
self.log("ZKHander error: Source key '{}' does not exist".format(source_path), state='e')
return False
if self.exists(destination_key):
self.log("ZKHander error: Destination key '{}' already exists".format(destination_path), state='e')
return False
@ -440,7 +466,7 @@ class ZKHandler(object):
#
class ZKSchema(object):
# Current version
_version = 2
_version = 4
# Root for doing nested keys
_schema_root = ''
@ -464,6 +490,7 @@ class ZKSchema(object):
'cmd.node': f'{_schema_root}/cmd/nodes',
'cmd.domain': f'{_schema_root}/cmd/domains',
'cmd.ceph': f'{_schema_root}/cmd/ceph',
'logs': '/logs',
'node': f'{_schema_root}/nodes',
'domain': f'{_schema_root}/domains',
'network': f'{_schema_root}/networks',
@ -474,6 +501,11 @@ class ZKSchema(object):
'volume': f'{_schema_root}/ceph/volumes',
'snapshot': f'{_schema_root}/ceph/snapshots',
},
# The schema of an individual logs entry (/logs/{node_name})
'logs': {
'node': '', # The root key
'messages': '/messages',
},
# The schema of an individual node entry (/nodes/{node_name})
'node': {
'name': '', # The root key
@ -550,8 +582,15 @@ class ZKSchema(object):
'meta.migrate_method': '/migration_method',
'meta.node_selector': '/node_selector',
'meta.node_limit': '/node_limit',
'meta.tags': '/tags',
'migrate.sync_lock': '/migrate_sync_lock'
},
# The schema of an individual domain tag entry (/domains/{domain}/tags/{tag})
'tag': {
'name': '', # The root key
'type': '/type',
'protected': '/protected'
},
# The schema of an individual network entry (/networks/{vni})
'network': {
'vni': '', # The root key
@ -698,9 +737,16 @@ class ZKSchema(object):
if base_path is None:
# This should only really happen for second-layer key types where the helper functions join them together
base_path = ''
if not ipath:
# This is a root path
return f'{base_path}/{item}'
sub_path = self.schema.get(itype).get('.'.join(ipath))
if sub_path is None:
sub_path = ''
# We didn't find the path we're looking for, so we don't want to do anything
return None
return f'{base_path}/{item}{sub_path}'
# Get keys of a schema location

debian/changelog

@ -1,3 +1,72 @@
pvc (0.9.31-0) unstable; urgency=high
* [Packages] Cleans up obsolete Suggests lines
* [Node Daemon] Adjusts log text of VM migrations to show the correct source node
* [API Daemon] Adjusts the OVA importer to support floppy RASD types for compatibility
* [API Daemon] Ensures that volume resize commands without a suffix get B appended
* [API Daemon] Removes the explicit setting of image-features in PVC; the limited default feature set is now managed via the ceph.conf configuration on nodes by PVC Ansible
-- Joshua M. Boniface <joshua@boniface.me> Fri, 30 Jul 2021 12:08:12 -0400
pvc (0.9.30-0) unstable; urgency=high
* [Node Daemon] Fixes bug with schema validation
-- Joshua M. Boniface <joshua@boniface.me> Tue, 20 Jul 2021 00:01:45 -0400
pvc (0.9.29-0) unstable; urgency=high
* [Node Daemon] Corrects numerous bugs with node logging framework
-- Joshua M. Boniface <joshua@boniface.me> Mon, 19 Jul 2021 16:55:41 -0400
pvc (0.9.28-0) unstable; urgency=high
* [CLI Client] Revamp confirmation options for "vm modify" command
-- Joshua M. Boniface <joshua@boniface.me> Mon, 19 Jul 2021 09:29:34 -0400
pvc (0.9.27-0) unstable; urgency=high
* [CLI Client] Fixes a bug with vm modify command when passed a file
-- Joshua M. Boniface <joshua@boniface.me> Mon, 19 Jul 2021 00:03:40 -0400
pvc (0.9.26-0) unstable; urgency=high
* [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures
* [All] Implements VM tagging functionality
* [All] Implements Node log access via PVC functionality
-- Joshua M. Boniface <joshua@boniface.me> Sun, 18 Jul 2021 20:49:52 -0400
pvc (0.9.25-0) unstable; urgency=high
* [Node Daemon] Returns to Rados library calls for Ceph due to performance problems
* [Node Daemon] Adds a date output to keepalive messages
* [Daemons] Configures ZK connection logging only for persistent connections
* [API Provisioner] Add context manager-based chroot to Debootstrap example script
* [Node Daemon] Fixes a bug where shutdown daemon state was overwritten
-- Joshua M. Boniface <joshua@boniface.me> Sun, 11 Jul 2021 23:19:09 -0400
pvc (0.9.24-0) unstable; urgency=high
* [Node Daemon] Removes Rados module polling of Ceph cluster and returns to command-based polling for timeout purposes, and removes some flaky return statements
* [Node Daemon] Removes flaky Zookeeper connection renewals that caused problems
* [CLI Client] Allow raw lists of clusters from `pvc cluster list`
* [API Daemon] Fixes several issues when getting VM data without stats
* [API Daemon] Fixes issues with removing VMs while disks are still in use (failed provisioning, etc.)
-- Joshua M. Boniface <joshua@boniface.me> Fri, 09 Jul 2021 15:58:36 -0400
pvc (0.9.23-0) unstable; urgency=high
* [Daemons] Fixes a critical overwriting bug in zkhandler when schema paths are not yet valid
* [Node Daemon] Ensures the daemon mode is updated on every startup (fixes the side effect of the above bug in 0.9.22)
-- Joshua M. Boniface <joshua@boniface.me> Mon, 05 Jul 2021 23:40:32 -0400
pvc (0.9.22-0) unstable; urgency=high
* [API Daemon] Drastically improves performance when getting large lists (e.g. VMs)

debian/control

@ -9,7 +9,6 @@ X-Python3-Version: >= 3.2
Package: pvc-daemon-node
Architecture: all
Depends: systemd, pvc-daemon-common, python3-kazoo, python3-psutil, python3-apscheduler, python3-libvirt, python3-psycopg2, python3-dnspython, python3-yaml, python3-distutils, python3-rados, python3-gevent, ipmitool, libvirt-daemon-system, arping, vlan, bridge-utils, dnsmasq, nftables, pdns-server, pdns-backend-pgsql
Suggests: pvc-client-api, pvc-client-cli
Description: Parallel Virtual Cluster node daemon (Python 3)
A KVM/Zookeeper/Ceph-based VM and private cloud manager
.


@ -5,5 +5,6 @@ api-daemon/pvcapid.sample.yaml etc/pvc
api-daemon/pvcapid usr/share/pvc
api-daemon/pvcapid.service lib/systemd/system
api-daemon/pvcapid-worker.service lib/systemd/system
api-daemon/pvcapid-worker.sh usr/share/pvc
api-daemon/provisioner usr/share/pvc
api-daemon/migrations usr/share/pvc


@ -42,6 +42,57 @@ To get started with PVC, please see the [About](https://parallelvirtualcluster.r
## Changelog
#### v0.9.31
* [Packages] Cleans up obsolete Suggests lines
* [Node Daemon] Adjusts log text of VM migrations to show the correct source node
* [API Daemon] Adjusts the OVA importer to support floppy RASD types for compatibility
* [API Daemon] Ensures that volume resize commands without a suffix get B appended
* [API Daemon] Removes the explicit setting of image-features in PVC; the limited default feature set is now managed via the ceph.conf configuration on nodes by PVC Ansible
#### v0.9.30
* [Node Daemon] Fixes bug with schema validation
#### v0.9.29
* [Node Daemon] Corrects numerous bugs with node logging framework
#### v0.9.28
* [CLI Client] Revamp confirmation options for "vm modify" command
#### v0.9.27
* [CLI Client] Fixes a bug with vm modify command when passed a file
#### v0.9.26
* [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures
* [All] Implements VM tagging functionality
* [All] Implements Node log access via PVC functionality
#### v0.9.25
* [Node Daemon] Returns to Rados library calls for Ceph due to performance problems
* [Node Daemon] Adds a date output to keepalive messages
* [Daemons] Configures ZK connection logging only for persistent connections
* [API Provisioner] Add context manager-based chroot to Debootstrap example script
* [Node Daemon] Fixes a bug where shutdown daemon state was overwritten
#### v0.9.24
* [Node Daemon] Removes Rados module polling of Ceph cluster and returns to command-based polling for timeout purposes, and removes some flaky return statements
* [Node Daemon] Removes flaky Zookeeper connection renewals that caused problems
* [CLI Client] Allow raw lists of clusters from `pvc cluster list`
* [API Daemon] Fixes several issues when getting VM data without stats
* [API Daemon] Fixes issues with removing VMs while disks are still in use (failed provisioning, etc.)
#### v0.9.23
* [Daemons] Fixes a critical overwriting bug in zkhandler when schema paths are not yet valid
* [Node Daemon] Ensures the daemon mode is updated on every startup (fixes the side effect of the above bug in 0.9.22)
#### v0.9.22
* [API Daemon] Drastically improves performance when getting large lists (e.g. VMs)


@ -144,6 +144,19 @@
},
"type": "object"
},
"NodeLog": {
"properties": {
"data": {
"description": "The recent log text",
"type": "string"
},
"name": {
"description": "The name of the Node",
"type": "string"
}
},
"type": "object"
},
"VMLog": {
"properties": {
"data": {
@ -215,6 +228,23 @@
},
"type": "object"
},
"VMTags": {
"properties": {
"name": {
"description": "The name of the VM",
"type": "string"
},
"tags": {
"description": "The tag(s) of the VM",
"items": {
"id": "VMTag",
"type": "object"
},
"type": "array"
}
},
"type": "object"
},
"acl": {
"properties": {
"description": {
@ -1370,6 +1400,28 @@
"description": "The current state of the VM",
"type": "string"
},
"tags": {
"description": "The tag(s) of the VM",
"items": {
"id": "VMTag",
"properties": {
"name": {
"description": "The name of the tag",
"type": "string"
},
"protected": {
"description": "Whether the tag is protected or not",
"type": "boolean"
},
"type": {
"description": "The type of the tag (user, system)",
"type": "string"
}
},
"type": "object"
},
"type": "array"
},
"type": {
"description": "The type of the VM",
"type": "string"
@ -2415,7 +2467,7 @@
"description": "",
"parameters": [
{
"description": "A search limit; fuzzy by default, use ^/$ to force exact matches",
"description": "A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches",
"in": "query",
"name": "limit",
"required": false,
@ -2626,6 +2678,38 @@
]
}
},
"/api/v1/node/{node}/log": {
"get": {
"description": "",
"parameters": [
{
"description": "The number of lines to retrieve",
"in": "query",
"name": "lines",
"required": false,
"type": "integer"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/NodeLog"
}
},
"404": {
"description": "Node not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Return the recent logs of {node}",
"tags": [
"node"
]
}
},
"/api/v1/provisioner/create": {
"post": {
"description": "Note: Starts a background job in the pvc-provisioner-worker Celery worker while returning a task ID; the task ID can be used to query the \"GET /provisioner/status/<task_id>\" endpoint for the job status",
@ -5795,7 +5879,7 @@
"description": "",
"parameters": [
{
"description": "A name search limit; fuzzy by default, use ^/$ to force exact matches",
"description": "A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches",
"in": "query",
"name": "limit",
"required": false,
@ -5814,6 +5898,13 @@
"name": "state",
"required": false,
"type": "string"
},
{
"description": "Limit list to VMs with this tag",
"in": "query",
"name": "tag",
"required": false,
"type": "string"
}
],
"responses": {
@ -5889,6 +5980,26 @@
"name": "migration_method",
"required": false,
"type": "string"
},
{
"description": "The user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "user_tags",
"required": false,
"type": "array"
},
{
"description": "The protected user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "protected_tags",
"required": false,
"type": "array"
}
],
"responses": {
@ -6027,6 +6138,26 @@
"name": "migration_method",
"required": false,
"type": "string"
},
{
"description": "The user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "user_tags",
"required": false,
"type": "array"
},
{
"description": "The protected user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "protected_tags",
"required": false,
"type": "array"
}
],
"responses": {
@ -6151,7 +6282,7 @@
}
},
"404": {
"description": "Not found",
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
@ -6225,6 +6356,12 @@
"schema": {
"$ref": "#/definitions/Message"
}
},
"404": {
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Set the metadata of {vm}",
@ -6412,6 +6549,84 @@
"vm"
]
}
},
"/api/v1/vm/{vm}/tags": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/VMTags"
}
},
"404": {
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Return the tags of {vm}",
"tags": [
"vm"
]
},
"post": {
"description": "",
"parameters": [
{
"description": "The action to perform with the tag",
"enum": [
"add",
"remove"
],
"in": "query",
"name": "action",
"required": true,
"type": "string"
},
{
"description": "The text value of the tag",
"in": "query",
"name": "tag",
"required": true,
"type": "string"
},
{
"default": false,
"description": "Set the protected state of the tag",
"in": "query",
"name": "protected",
"required": false,
"type": "boolean"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/Message"
}
},
"400": {
"description": "Bad request",
"schema": {
"$ref": "#/definitions/Message"
}
},
"404": {
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Set the tags of {vm}",
"tags": [
"vm"
]
}
}
},
"swagger": "2.0"


@ -140,6 +140,8 @@ pvc:
file_logging: True
# stdout_logging: Enable or disable logging to stdout (i.e. journald)
stdout_logging: True
# zookeeper_logging: Enable or disable logging to Zookeeper (for `pvc node log` functionality)
zookeeper_logging: True
# log_colours: Enable or disable ANSI colours in log output
log_colours: True
# log_dates: Enable or disable date strings in log output
@ -152,10 +154,12 @@ pvc:
log_keepalive_storage_details: True
# console_log_lines: Number of console log lines to store in Zookeeper per VM
console_log_lines: 1000
# node_log_lines: Number of node log lines to store in Zookeeper per node
node_log_lines: 2000
# networking: PVC networking configuration
# OPTIONAL if enable_networking: False
networking:
# bridge_device: Underlying device to use for bridged vLAN networks; usually the device underlying <cluster>
# bridge_device: Underlying device to use for bridged vLAN networks; usually the device of <cluster>
bridge_device: ens4
# sriov_enable: Enable or disable (default if absent) SR-IOV network support
sriov_enable: False


@ -32,6 +32,7 @@ import yaml
import json
from socket import gethostname
from datetime import datetime
from threading import Thread
from ipaddress import ip_address, ip_network
from apscheduler.schedulers.background import BackgroundScheduler
@ -55,7 +56,7 @@ import pvcnoded.CephInstance as CephInstance
import pvcnoded.MetadataAPIInstance as MetadataAPIInstance
# Version string for startup output
version = '0.9.22'
version = '0.9.31'
###############################################################################
# PVCD - node daemon startup program
@ -75,8 +76,11 @@ version = '0.9.22'
# Daemon functions
###############################################################################
# Ensure the update_timer is None until it's set for real
# Ensure update_timer, this_node, and d_domain are None until they're set for real
# Ensures cleanup() doesn't fail due to these items not being created yet
update_timer = None
this_node = None
d_domain = None
# Create timer to update this node in Zookeeper
@ -109,7 +113,7 @@ try:
pvcnoded_config_file = os.environ['PVCD_CONFIG_FILE']
except Exception:
print('ERROR: The "PVCD_CONFIG_FILE" environment variable must be set before starting pvcnoded.')
exit(1)
os._exit(1)
# Set local hostname and domain variables
myfqdn = gethostname()
@ -141,11 +145,12 @@ def readConfig(pvcnoded_config_file, myhostname):
o_config = yaml.load(cfgfile, Loader=yaml.SafeLoader)
except Exception as e:
print('ERROR: Failed to parse configuration file: {}'.format(e))
exit(1)
os._exit(1)
# Handle the basic config (hypervisor-only)
try:
config_general = {
'node': o_config['pvc']['node'],
'coordinators': o_config['pvc']['cluster']['coordinators'],
'enable_hypervisor': o_config['pvc']['functions']['enable_hypervisor'],
'enable_networking': o_config['pvc']['functions']['enable_networking'],
@ -156,12 +161,14 @@ def readConfig(pvcnoded_config_file, myhostname):
'console_log_directory': o_config['pvc']['system']['configuration']['directories']['console_log_directory'],
'file_logging': o_config['pvc']['system']['configuration']['logging']['file_logging'],
'stdout_logging': o_config['pvc']['system']['configuration']['logging']['stdout_logging'],
'zookeeper_logging': o_config['pvc']['system']['configuration']['logging'].get('zookeeper_logging', False),
'log_colours': o_config['pvc']['system']['configuration']['logging']['log_colours'],
'log_dates': o_config['pvc']['system']['configuration']['logging']['log_dates'],
'log_keepalives': o_config['pvc']['system']['configuration']['logging']['log_keepalives'],
'log_keepalive_cluster_details': o_config['pvc']['system']['configuration']['logging']['log_keepalive_cluster_details'],
'log_keepalive_storage_details': o_config['pvc']['system']['configuration']['logging']['log_keepalive_storage_details'],
'console_log_lines': o_config['pvc']['system']['configuration']['logging']['console_log_lines'],
'node_log_lines': o_config['pvc']['system']['configuration']['logging'].get('node_log_lines', 0),
'vm_shutdown_timeout': int(o_config['pvc']['system']['intervals']['vm_shutdown_timeout']),
'keepalive_interval': int(o_config['pvc']['system']['intervals']['keepalive_interval']),
'fence_intervals': int(o_config['pvc']['system']['intervals']['fence_intervals']),
@ -175,7 +182,7 @@ def readConfig(pvcnoded_config_file, myhostname):
}
except Exception as e:
print('ERROR: Failed to load configuration: {}'.format(e))
exit(1)
cleanup(failure=True)
config = config_general
# Handle debugging config
@ -232,7 +239,7 @@ def readConfig(pvcnoded_config_file, myhostname):
except Exception as e:
print('ERROR: Failed to load configuration: {}'.format(e))
exit(1)
cleanup(failure=True)
config = {**config, **config_networking}
# Create the by-id address entries
@ -246,7 +253,7 @@ def readConfig(pvcnoded_config_file, myhostname):
network = ip_network(config[network_key])
except Exception:
print('ERROR: Network address {} for {} is not valid!'.format(config[network_key], network_key))
exit(1)
cleanup(failure=True)
# If we should be autoselected
if config[address_key] == 'by-id':
@ -266,7 +273,7 @@ def readConfig(pvcnoded_config_file, myhostname):
raise
except Exception:
print('ERROR: Floating address {} for {} is not valid!'.format(config[floating_key], floating_key))
exit(1)
cleanup(failure=True)
# Handle the storage config
if config['enable_storage']:
@ -277,7 +284,7 @@ def readConfig(pvcnoded_config_file, myhostname):
}
except Exception as e:
print('ERROR: Failed to load configuration: {}'.format(e))
exit(1)
cleanup(failure=True)
config = {**config, **config_storage}
# Handle an empty ipmi_hostname
@ -484,6 +491,9 @@ if enable_networking:
else:
common.run_os_command('ip route add default via {} dev {}'.format(upstream_gateway, 'brupstream'))
logger.out('Waiting 3s for networking to come up', state='s')
time.sleep(3)
###############################################################################
# PHASE 2c - Prepare sysctl for pvcnoded
###############################################################################
@ -555,8 +565,8 @@ if enable_storage:
logger.out('Starting Ceph manager daemon', state='i')
common.run_os_command('systemctl start ceph-mgr@{}'.format(myhostname))
logger.out('Waiting 5s for daemons to start', state='s')
time.sleep(5)
logger.out('Waiting 3s for daemons to start', state='s')
time.sleep(3)
###############################################################################
# PHASE 4 - Attempt to connect to the coordinators and start zookeeper client
@ -571,7 +581,7 @@ try:
zkhandler.connect(persistent=True)
except Exception as e:
logger.out('ERROR: Failed to connect to Zookeeper cluster: {}'.format(e), state='e')
exit(1)
os._exit(1)
logger.out('Validating Zookeeper schema', state='i')
@ -658,7 +668,7 @@ def update_schema(new_schema_version, stat, event=''):
# Restart ourselves with the new schema
logger.out('Reloading node daemon', state='s')
try:
zkhandler.disconnect()
zkhandler.disconnect(persistent=True)
del zkhandler
except Exception:
pass
@ -692,8 +702,8 @@ else:
# Cleanup function
def cleanup():
global zkhandler, update_timer, d_domain
def cleanup(failure=False):
global logger, zkhandler, update_timer, d_domain
logger.out('Terminating pvcnoded and cleaning up', state='s')
@ -704,19 +714,19 @@ def cleanup():
# Waiting for any flushes to complete
logger.out('Waiting for any active flushes', state='s')
while this_node.flush_thread is not None:
time.sleep(0.5)
if this_node is not None:
while this_node.flush_thread is not None:
time.sleep(0.5)
# Stop console logging on all VMs
logger.out('Stopping domain console watchers', state='s')
for domain in d_domain:
if d_domain[domain].getnode() == myhostname:
try:
d_domain[domain].console_log_instance.stop()
except NameError:
pass
except AttributeError:
pass
if d_domain is not None:
for domain in d_domain:
if d_domain[domain].getnode() == myhostname:
try:
d_domain[domain].console_log_instance.stop()
except Exception:
pass
# Force into secondary coordinator state if needed
try:
@ -733,13 +743,11 @@ def cleanup():
# Stop keepalive thread
try:
stopKeepaliveTimer()
except NameError:
pass
except AttributeError:
pass
logger.out('Performing final keepalive update', state='s')
node_keepalive()
logger.out('Performing final keepalive update', state='s')
node_keepalive()
except Exception:
pass
# Set stop state in Zookeeper
zkhandler.write([
@ -751,18 +759,25 @@ def cleanup():
# Close the Zookeeper connection
try:
zkhandler.disconnect()
zkhandler.disconnect(persistent=True)
del zkhandler
except Exception:
pass
logger.out('Terminated pvc daemon', state='s')
os._exit(0)
logger.terminate()
if failure:
retcode = 1
else:
retcode = 0
os._exit(retcode)
# Termination function
def term(signum='', frame=''):
cleanup()
cleanup(failure=False)
# Hangup (logrotate) function
@ -791,6 +806,7 @@ if zkhandler.exists(('node', myhostname)):
logger.out("Node is " + fmt_green + "present" + fmt_end + " in Zookeeper", state='i')
# Update static data just in case it's changed
zkhandler.write([
(('node', myhostname), config['daemon_mode']),
(('node.mode', myhostname), config['daemon_mode']),
(('node.state.daemon', myhostname), 'init'),
(('node.state.router', myhostname), init_routerstate),
@ -861,7 +877,7 @@ if enable_hypervisor:
lv_conn.close()
except Exception as e:
logger.out('ERROR: Failed to connect to Libvirt daemon: {}'.format(e), state='e')
exit(1)
cleanup(failure=True)
###############################################################################
# PHASE 7c - Ensure NFT is running on the local host
@ -1333,11 +1349,13 @@ def collect_ceph_stats(queue):
ceph_health = health_status['status']
except Exception as e:
logger.out('Failed to obtain Ceph health data: {}'.format(e), state='e')
return
ceph_health = 'HEALTH_UNKN'
if ceph_health == 'HEALTH_OK':
if ceph_health in ['HEALTH_OK']:
ceph_health_colour = fmt_green
elif ceph_health == 'HEALTH_WARN':
elif ceph_health in ['HEALTH_UNKN']:
ceph_health_colour = fmt_cyan
elif ceph_health in ['HEALTH_WARN']:
ceph_health_colour = fmt_yellow
else:
ceph_health_colour = fmt_red
@ -1355,7 +1373,6 @@ def collect_ceph_stats(queue):
])
except Exception as e:
logger.out('Failed to set Ceph status data: {}'.format(e), state='e')
return
if debug:
logger.out("Set ceph rados df information in zookeeper (primary only)", state='d', prefix='ceph-thread')
@ -1369,15 +1386,15 @@ def collect_ceph_stats(queue):
])
except Exception as e:
logger.out('Failed to set Ceph utilization data: {}'.format(e), state='e')
return
if debug:
logger.out("Set pool information in zookeeper (primary only)", state='d', prefix='ceph-thread')
# Get pool info
retcode, stdout, stderr = common.run_os_command('ceph df --format json', timeout=1)
command = {"prefix": "df", "format": "json"}
ceph_df_output = ceph_conn.mon_command(json.dumps(command), b'', timeout=1)[1].decode('ascii')
try:
ceph_pool_df_raw = json.loads(stdout)['pools']
ceph_pool_df_raw = json.loads(ceph_df_output)['pools']
except Exception as e:
logger.out('Failed to obtain Pool data (ceph df): {}'.format(e), state='w')
ceph_pool_df_raw = []
@@ -1448,9 +1465,9 @@ def collect_ceph_stats(queue):
osd_dump = dict()
command = {"prefix": "osd dump", "format": "json"}
osd_dump_output = ceph_conn.mon_command(json.dumps(command), b'', timeout=1)[1].decode('ascii')
try:
retcode, stdout, stderr = common.run_os_command('ceph osd dump --format json --connect-timeout 2', timeout=2)
osd_dump_raw = json.loads(stdout)['osds']
osd_dump_raw = json.loads(osd_dump_output)['osds']
except Exception as e:
logger.out('Failed to obtain OSD data: {}'.format(e), state='w')
osd_dump_raw = []
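Both of these hunks replace shelling out to the ceph CLI with a monitor query over the already-open librados connection, avoiding the fork and the CLI's own connection timeout. A minimal standalone sketch of that call pattern, assuming default admin credentials (the conffile and keyring paths are assumptions, not taken from the patch):
import json
import rados
# Open a librados connection; these paths are common defaults, adjust as needed.
ceph_conn = rados.Rados(conffile='/etc/ceph/ceph.conf',
                        conf={'keyring': '/etc/ceph/ceph.client.admin.keyring'})
ceph_conn.connect()
# mon_command() takes a JSON command string and returns (retcode, outbuf, outs).
command = {"prefix": "df", "format": "json"}
retcode, outbuf, outs = ceph_conn.mon_command(json.dumps(command), b'', timeout=1)
pools = json.loads(outbuf.decode('ascii')).get('pools', []) if retcode == 0 else []
ceph_conn.shutdown()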
@@ -1607,7 +1624,6 @@ def collect_vm_stats(queue):
lv_conn = libvirt.open(libvirt_name)
if lv_conn is None:
logger.out('Failed to open connection to "{}"'.format(libvirt_name), state='e')
return
memalloc = 0
memprov = 0
@@ -1777,8 +1793,9 @@ def node_keepalive():
# Get past state and update if needed
if debug:
logger.out("Get past state and update if needed", state='d', prefix='main-thread')
past_state = zkhandler.read(('node.state.daemon', this_node.name))
if past_state != 'run':
if past_state != 'run' and past_state != 'shutdown':
this_node.daemon_state = 'run'
zkhandler.write([
(('node.state.daemon', this_node.name), 'run')
@@ -1867,7 +1884,6 @@ def node_keepalive():
])
except Exception:
logger.out('Failed to set keepalive data', state='e')
return
# Display node information to the terminal
if config['log_keepalives']:
@@ -1878,9 +1894,10 @@ def node_keepalive():
else:
cst_colour = fmt_cyan
logger.out(
'{}{} keepalive{} [{}{}{}]'.format(
'{}{} keepalive @ {}{} [{}{}{}]'.format(
fmt_purple,
myhostname,
datetime.now(),
fmt_end,
fmt_bold + cst_colour,
this_node.router_state,


@@ -180,7 +180,7 @@ class MetadataAPIInstance(object):
client_macaddr = host_information.get('mac_address', None)
# Find the VM with that MAC address - we can't assume that the hostname is actually right
_discard, vm_list = pvc_vm.get_list(self.zkhandler, None, None, None)
_discard, vm_list = pvc_vm.get_list(self.zkhandler, None, None, None, None)
vm_details = dict()
for vm in vm_list:
try:


@@ -246,7 +246,7 @@ class NodeInstance(object):
if data != self.domain_list:
self.domain_list = data
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.count.provisioned_domainss', self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.count.provisioned_domains', self.name))
def watch_node_domainscount(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
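This one-character fix matters because a kazoo DataWatch registered on a mistyped path watches a znode that never exists, so the callback simply never fires and the provisioned-domain count silently stops updating. A minimal standalone sketch of the watch pattern (the ZK address and path are hypothetical, not the PVC schema path):
from kazoo.client import KazooClient
zk = KazooClient(hosts='127.0.0.1:2181')  # hypothetical ensemble address
zk.start()
# The decorated callback fires on every data change to the watched znode.
@zk.DataWatch('/nodes/hv1/provisioned_domains')  # hypothetical path
def watch_node_domainscount(data, stat, event=None):
    if event is not None and event.type == 'DELETED':
        return False  # returning False removes the watcher
    if data is not None:
        print('provisioned domains:', data.decode())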


@@ -635,7 +635,7 @@ class VMInstance(object):
self.inreceive = True
self.logger.out('Receiving VM migration from node "{}"'.format(self.node), state='i', prefix='Domain {}'.format(self.domuuid))
self.logger.out('Receiving VM migration from node "{}"'.format(self.last_currentnode), state='i', prefix='Domain {}'.format(self.domuuid))
# Short delay to ensure sender is in sync
time.sleep(0.5)


@@ -133,31 +133,46 @@ def rebootViaIPMI(ipmi_hostname, ipmi_user, ipmi_password, logger):
if ipmi_reset_retcode != 0:
logger.out('Failed to reboot dead node', state='e')
print(ipmi_reset_stderr)
return False
time.sleep(1)
# Power on the node (just in case it is offline)
ipmi_command_start = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power on'.format(
ipmi_hostname, ipmi_user, ipmi_password
)
ipmi_start_retcode, ipmi_start_stdout, ipmi_start_stderr = common.run_os_command(ipmi_command_start)
time.sleep(2)
# Ensure the node is powered on
# Check the chassis power state
logger.out('Checking power state of dead node', state='i')
ipmi_command_status = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power status'.format(
ipmi_hostname, ipmi_user, ipmi_password
)
ipmi_status_retcode, ipmi_status_stdout, ipmi_status_stderr = common.run_os_command(ipmi_command_status)
# Trigger a power start if needed
if ipmi_status_stdout != "Chassis Power is on":
ipmi_command_start = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power on'.format(
ipmi_hostname, ipmi_user, ipmi_password
)
ipmi_start_retcode, ipmi_start_stdout, ipmi_start_stderr = common.run_os_command(ipmi_command_start)
if ipmi_start_retcode != 0:
logger.out('Failed to start powered-off dead node', state='e')
print(ipmi_reset_stderr)
if ipmi_reset_retcode == 0:
if ipmi_status_stdout == "Chassis Power is on":
# We successfully rebooted the node and it is powered on; this is a successful fence
logger.out('Successfully rebooted dead node', state='o')
return True
elif ipmi_status_stdout == "Chassis Power is off":
# We successfully rebooted the node but it is powered off; this might be expected or not, but the node is confirmed off so we can call it a successful fence
logger.out('Chassis power is in confirmed off state after successful IPMI reboot; proceeding with fence-flush', state='o')
return True
else:
# We successfully rebooted the node but it is in some unknown power state; since this might indicate a silent failure, we must call it a failed fence
logger.out('Chassis power is in an unknown state after successful IPMI reboot; not performing fence-flush', state='e')
return False
else:
if ipmi_status_stdout == "Chassis Power is off":
# We failed to reboot the node but it is powered off; it has probably suffered a serious hardware failure, but the node is confirmed off so we can call it a successful fence
logger.out('Chassis power is in confirmed off state after failed IPMI reboot; proceeding with fence-flush', state='o')
return True
else:
# We failed to reboot the node but it is in some unknown power state (including "on"); since this might indicate a silent failure, we must call it a failed fence
logger.out('Chassis power is not in confirmed off state after failed IPMI reboot; not performing fence-flush', state='e')
return False
# Declare success
logger.out('Successfully rebooted dead node', state='o')
return True
#
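The reworked logic only declares a fence successful when the node's power state can be positively confirmed, rather than trusting the reset return code alone. A compact sketch of the same decision table, assuming power_status is the trimmed stdout of ipmitool chassis power status:
def fence_succeeded(reset_retcode, power_status):
    # Reset worked and the chassis came back up: the normal successful fence.
    if reset_retcode == 0 and power_status == "Chassis Power is on":
        return True
    # Chassis confirmed off, whether or not the reset worked: safe to fence-flush.
    if power_status == "Chassis Power is off":
        return True
    # Anything else is ambiguous and must be treated as a failed fence.
    return False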


@@ -38,7 +38,7 @@ sleep 30
_pvc vm stop --yes testX
_pvc vm disable testX
_pvc vm undefine --yes testX
_pvc vm define --target hv3 ${vm_tmp}
_pvc vm define --target hv3 --tag pvc-test ${vm_tmp}
_pvc vm start testX
sleep 30
_pvc vm restart --yes --wait testX
@@ -50,6 +50,10 @@ sleep 5
_pvc vm move --wait --target hv1 testX
sleep 5
_pvc vm meta testX --limit hv1 --selector vms --method live --profile test --no-autostart
_pvc vm tag add testX mytag
_pvc vm tag get testX
_pvc vm list --tag mytag
_pvc vm tag remove testX mytag
_pvc vm network get testX
_pvc vm vcpu set testX 4
_pvc vm vcpu get testX