Compare commits

..

288 Commits

Author SHA1 Message Date
15d92c483f Bump version to 0.9.27 2021-07-19 00:03:40 -04:00
7dd17e71e7 Fix bug with VM editing with file
Current config is needed for the diff but it was in a conditional.
2021-07-19 00:02:19 -04:00
5be968123f Re-add 1 second queue get timeout
Otherwise daemon stops will sometimes inexplicably block.
2021-07-18 22:17:57 -04:00
99fd7ebe63 Fix excessive CPU due to looping 2021-07-18 22:06:50 -04:00
cffc96d156 Fix failure in creating base keys 2021-07-18 21:00:23 -04:00
602093029c Bump version to 0.9.26 2021-07-18 20:49:52 -04:00
bd7a773d6b Add node log following functionality 2021-07-18 20:37:53 -04:00
8d671b3422 Add some tag tests to test-cluster.sh 2021-07-18 20:37:37 -04:00
2358ad6bbe Reduce the number of lines per call
500 was a lot every half second; 200 seems more reasonable. Even a fast
kernel boot should generate < 200 lines in half a second.
2021-07-18 20:23:45 -04:00
a0e9b57d39 Increase log line frequency 2021-07-18 20:19:59 -04:00
2d48127e9c Use even better/faster set comparison 2021-07-18 20:18:35 -04:00
55f2b00366 Add some spaces for better readability 2021-07-18 20:18:23 -04:00
ba257048ad Improve output formatting of node logs 2021-07-18 20:06:08 -04:00
b770e15a91 Fix final termination of logger
We need to do a bit more finagling with the logger on termination to
ensure that all messages are written and the queue drained before
actually terminating.
2021-07-18 19:53:00 -04:00
e23a65128a Remove del of logger item 2021-07-18 19:03:47 -04:00
982dfd52c6 Adjust date output format 2021-07-18 19:00:54 -04:00
3a2478ee0c Cleanly terminate logger on cleanup 2021-07-18 18:57:44 -04:00
a088aa4484 Add node log functions to API and CLI 2021-07-18 18:54:28 -04:00
323c7c41ae Implement node logging into Zookeeper
Adds the ability to send node daemon logs to Zookeeper to facilitate a
command like "pvc node log", similar to "pvc vm log". Each node stores
its logs in a separate tree under "/logs" which can then be combined or
queried. By default, set by config, only 2000 lines are kept.
2021-07-18 17:11:43 -04:00
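
As an illustrative sketch only (the actual PVC schema and helper names differ), per-node log storage in Zookeeper with a line cap could look like this with Kazoo:

    # Hedged sketch: append a log line for a node under a hypothetical
    # /logs/<node>/messages tree and trim to the most recent 2000 entries.
    from kazoo.client import KazooClient

    MAX_LINES = 2000

    def append_node_log(zk, node, line):
        base = '/logs/{}/messages'.format(node)
        zk.ensure_path(base)
        # One sequential child per log line
        zk.create(base + '/line-', line.encode(), sequence=True)
        # Drop the oldest entries beyond the cap
        for old in sorted(zk.get_children(base))[:-MAX_LINES]:
            zk.delete('{}/{}'.format(base, old))

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.start()
    append_node_log(zk, 'hv1', 'keepalive OK')
    zk.stop()
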
cd1db3d587 Ensure node name is part of config 2021-07-18 16:38:58 -04:00
401f102344 Add serial BIOS to default libvirt schema 2021-07-15 10:45:14 -04:00
4ac020888b Add some tag tests to test-cluster.sh 2021-07-14 15:02:03 -04:00
8f3b68d48a Mention multiple option for tags in VM define 2021-07-14 01:12:10 -04:00
6d4c26c8d8 Don't show tag line in info if no tags 2021-07-14 00:59:24 -04:00
75fb60b1b4 Add VM list filtering by tag
Uses same method as state or node filtering, rather than altering how
the main LIMIT field works.
2021-07-14 00:59:20 -04:00
9ea9ac3b8a Revamp tag handling and display
Add an additional protected class, limit manipulation to one at a time,
and ensure future flexibility. Also makes display consistent with other
VM elements.
2021-07-13 22:39:52 -04:00
27f1758791 Add tags manipulation to API
Also fixes some checks for Metadata too since these two actions are
almost identical, and adds tags to define endpoint.
2021-07-13 19:05:33 -04:00
c0a3467b70 Simplify VM metadata reads
Directly call the new common getDomainMetadata function to avoid
excessive Zookeeper calls for this information.
2021-07-13 19:05:33 -04:00
9a199992a1 Add functions for manipulating VM tags
Adds tags to schema (v3), to VM definition, adds function to modify
tags, adds function to get tags, and adds tags to VM data output.

Tags will enable more granular classification of VMs based either on
administrator configuration or from automated system events.
2021-07-13 19:05:33 -04:00
c6d552ae57 Rework success checks for IPMI fencing
Previously, if the node failed to restart, it was declared a "bad fence"
and no further action would be taken. However, there are some
situations, for instance critical hardware failures, where intelligent
systems will not attempt (or succeed at) starting up the node in such a
case, which would result in dead, known-offline nodes without recovery.

Tweak this behaviour somewhat. The main path of Reboot -> Check On ->
Success + fence-flush is retained, but some additional side-paths are
now defined:

1. We attempt to power "on" the chassis 1 second after the reboot, just
in case it is off and can be recovered. We then wait another 2 seconds
and check the power status (as we did before).

2. If the reboot succeeded, follow this series of choices:

    a. If the chassis is on, the fence succeeded.

    b. If the chassis is off, the fence "succeeded" as well.

    c. If the chassis is in some other state, the fence failed.

3. If the reboot failed, follow this series of choices:

    a. If the chassis is off, the fence itself failed, but we can treat
    it as "succeeded"" since the chassis is in a known-offline state.
    This is the most likely situation when there is a critical hardware
    failure, and the server's IPMI does not allow itself to start back
    up again.

    b. If the chassis is in any other state ("on" or unknown), the fence
    itself failed and we must treat this as a fence failure.

Overall, this should alleviate the aforementioned issue of a critical
failure rendering the node persistently "off" not triggering a
fence-flush and ensure fencing is more robust.
2021-07-13 17:54:41 -04:00
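
The outcome logic above can be summarized as a small pure function; this is an illustrative sketch, not the daemon's actual code:

    # Hedged sketch of the fence outcome decision described above.
    # reboot_ok: whether the IPMI reboot command reported success
    # chassis_state: 'on', 'off', or anything else/unknown
    def fence_succeeded(reboot_ok, chassis_state):
        if reboot_ok:
            # 2a/2b: "on" or "off" both count as success; 2c: anything else fails
            return chassis_state in ('on', 'off')
        # 3a: reboot failed but chassis is off: known-offline, treat as success
        # 3b: any other state is a fence failure
        return chassis_state == 'off'
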
2e9f6ac201 Bump version to 0.9.25 2021-07-11 23:19:09 -04:00
f09849bedf Don't overwrite shutdown state on termination
Just a minor quibble and not really impactful.
2021-07-11 23:18:14 -04:00
8c975e5c46 Add chroot context manager example to debootstrap
Closes #132
2021-07-11 23:10:41 -04:00
c76149141f Only log ZK connections when persistent
Prevents spam in the API logs.
2021-07-10 23:35:49 -04:00
f00c4d07f4 Add date output to keepalive
Helps track when there is a log follow in "-o cat" mode.
2021-07-10 23:24:59 -04:00
20b66c10e1 Move two more commands to Rados library 2021-07-10 17:28:42 -04:00
cfeba50b17 Revert "Return to all command-based Ceph gathering"
This reverts commit 65d14ccd92.

This was actually a bad idea. For inexplicable reasons, running these
Ceph commands manually (not even via Python, but in a normal shell)
takes roughly 700 times longer than running them with the Rados module,
so long in fact that some basic commands like "ceph health" would
sometimes take longer than the 1-second timeout to complete. The Rados
calls, by contrast, take about 1 ms.

Despite the occasional issues when monitors drop out, the Rados module
is clearly far superior to the shell commands for any moderately-loaded
Ceph cluster. We can look into solving timeouts another way (perhaps
with Processes instead of Threads) at a later time.

Rados module "ceph health":
    b'{"checks":{},"status":"HEALTH_OK"}'
    0.001204 (s)
    b'{"checks":{},"status":"HEALTH_OK"}'
    0.001258 (s)
Command "ceph health":
    joshua@hv1.c.bonilan.net ~ $ time ceph health >/dev/null
    real    0m0.772s
    user    0m0.707s
    sys     0m0.046s
    joshua@hv1.c.bonilan.net ~ $ time ceph health >/dev/null
    real    0m0.796s
    user    0m0.728s
    sys     0m0.054s
2021-07-10 03:47:45 -04:00
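
For reference, the Rados-side timing above can be reproduced with a short snippet like the following, assuming a standard ceph.conf and admin keyring (the paths are illustrative):

    # Hedged sketch: time a "health" query via the rados Python bindings.
    import json
    import time
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                          conf=dict(keyring='/etc/ceph/ceph.client.admin.keyring'))
    cluster.connect()
    cmd = json.dumps({'prefix': 'health', 'format': 'json'})
    start = time.time()
    retcode, output, status = cluster.mon_command(cmd, b'', timeout=5)
    print(output)
    print('{:.6f} (s)'.format(time.time() - start))
    cluster.shutdown()
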
0699c48d10 Fix bad schema path name 2021-07-09 16:47:09 -04:00
551bae2518 Bump version to 0.9.24 2021-07-09 15:58:36 -04:00
4832245d9c Handle non-RBD disks and non-RBD errors better 2021-07-09 15:48:57 -04:00
2138f2f59f Fail VM removal on disk removal failures
Prevents bad states where the VM is "removed" but some of its disks
remain due to e.g. stuck watchers.

Rearrange the sequence so it goes stop, delete disks, then delete VM,
and then return a failure if any of the disk(s) fail to remove, allowing
the task to be rerun after fixing the problem.
2021-07-09 15:39:06 -04:00
d1d355a96b Avoid errors if stats data is None 2021-07-09 13:13:54 -04:00
2b5dc286ab Correct failure to get ceph_health data 2021-07-09 13:10:28 -04:00
c0c9327a7d Return an empty log if the value is None 2021-07-09 13:08:00 -04:00
5ffabcfef5 Avoid failing if we can't get the future data 2021-07-09 13:05:37 -04:00
330cf14638 Remove return statements in keepalive collectors
These seem to bork the keepalive timer process, so just remove them and
let it continue to press on.
2021-07-09 13:04:17 -04:00
9d0eb20197 Mention UUID matching in vm list help 2021-07-09 11:51:20 -04:00
3f5b7045a2 Allow raw listing of cluster names in CLI 2021-07-09 10:53:20 -04:00
80fe96b24d Add some additional docstrings 2021-07-07 12:28:08 -04:00
80f04ce8ee Remove connection renewal in state handler
Regenerating the ZK connection was fraught with issues, including
duplicate connections, strange failures to reconnect, and various other
wonkiness.

Instead let Kazoo handle states sensibly. Kazoo moves to SUSPENDED state
when it loses connectivity, and stays there indefinitely (based on
cursory tests). And Kazoo seems to always resume from this just fine on
its own. Thus all that hackery did nothing but complicate reconnection.

This therefore turns the listener into a purely informational function,
providing logs of when/why it failed, and we also add some additional
output messages during initial connection and final disconnection.
2021-07-07 11:55:12 -04:00
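
A purely informational listener of the kind described is only a few lines; a sketch (log output via print for brevity):

    # Hedged sketch: log Kazoo connection state changes without attempting
    # any manual reconnection; Kazoo resumes from SUSPENDED on its own.
    from kazoo.client import KazooClient, KazooState

    def zk_listener(state):
        if state == KazooState.SUSPENDED:
            print('Connection to Zookeeper lost; waiting for Kazoo to resume it')
        elif state == KazooState.LOST:
            print('Connection to Zookeeper closed')
        else:  # KazooState.CONNECTED
            print('Connection to Zookeeper (re)established')

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.add_listener(zk_listener)
    zk.start()
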
65d14ccd92 Return to all command-based Ceph gathering
Using the Rados module was very problematic, specifically because it had
no sensible timeout parameters and thus would hang for many seconds.
This has poor implications since it blocks further keepalives.

Instead, remove the Rados usage entirely and go back completely to using
manual OS commands to gather this information. While this may cause PID
exhaustion more quickly, it's worthwhile to avoid failure scenarios when
Ceph stats time out.

Closes #137
2021-07-06 11:30:45 -04:00
adc022f55d Add missing install of pvcapid-worker.sh 2021-07-06 09:40:42 -04:00
7082982a33 Bump version to 0.9.23 2021-07-05 23:40:32 -04:00
5b6ef71909 Ensure daemon mode is updated on startup
Fixes the side effect of the previous bug during deploys of 0.9.22.
2021-07-05 23:39:23 -04:00
a8c28786dd Better handle empty ipaths in schema
When trying to write to sub-item paths that don't yet exist, the
previous method would just blindly write to whatever the root key is,
which is never what we actually want.

Instead, check explicitly for a "base path" situation, and handle that.
Then, if we try to get a subpath that isn't valid, return None. Finally
in the various functions, if the path is None, just continue (or return
false/None) and (try to) chug along.
2021-07-05 23:35:03 -04:00
be7b0be8ed Fix typo in schema path name 2021-07-05 23:23:23 -04:00
c45804e8c1 Revert "Return none if a schema path is not found"
This reverts commit b1fcf6a4a5.
2021-07-05 23:16:39 -04:00
b1fcf6a4a5 Return none if a schema path is not found
This can cause overwriting of unintended keys, so should not be
happening. Will have to find the bugs this causes.
2021-07-05 17:15:55 -04:00
47f39a1a2a Fix ordering issue in test-cluster script 2021-07-05 15:14:34 -04:00
54f82a3ea0 Fix bug in VM network list with SR-IOV 2021-07-05 15:14:01 -04:00
37cd278bc2 Bump version to 0.9.22 2021-07-05 14:18:51 -04:00
47a522f8af Use manual zkhandler creation in Benchmark job
Like the other Celery job this does not work properly with the
ZKConnection decorator due to conflicting "self", so just connect
manually exactly like the provisioner task does.
2021-07-05 14:12:56 -04:00
087c23859c Adjust layout of Provisioner lists output
Use the same header format as the others.
2021-07-05 14:06:22 -04:00
6c21a52714 Adjust layout of Ceph/storage lists output
Use the same header format as node, VM, and network lists.
2021-07-05 12:57:18 -04:00
afde436cd0 Adjust layout of Network lists output
Use the same header format as node and VM lists.
2021-07-05 11:48:39 -04:00
1fe71969ca Adjust layout of VM list output
Matches the new node list output format with the additional header line,
as well as revamps some other aspects:

    1. Adjusts the UUID to be under the name in the info output.
    2. Removes the UUID from the list output to save space, because this
       is generally not needed in day-to-day quick-list output.
    3. Renames the "Node" header to "Current" to better reflect what
       that column actually means and avoid conflicting with the parent
       header.
2021-07-05 10:52:48 -04:00
2b04df22a6 Add PVC version to node information output
Also adjusts the layout of the node list output to avoid excessively
long lines. Adds another header line with categories and spacing dashes
for easier visual parsing.
2021-07-05 10:45:20 -04:00
a69105569f Add node PVC version data to Node information
Allows API client to see the currently-active version of the node
daemon.
2021-07-05 09:57:38 -04:00
21a1a7da9e Fix bad schema reference
Not sure how this didn't cause an issue until now, but the wrong key
path was used and this was getting unexpected data with the newly-added
version string instead of the proper mode string.
2021-07-05 09:53:51 -04:00
e44f3d623e Remove unnecessary try/except blocks from VM reads
The zkhandler read() function takes care of ensuring there is a None
value returned if these fail, so these aren't required. Makes the code a
fair bit more readable here.
2021-07-02 12:01:58 -04:00
f0fd3d3f0e Make extra sure VMs terminate when told
When doing a stop_vm or terminate_vm, check again after 0.2 seconds
and try re-terminating if it's still running. Covers cases where a VM
doesn't stop if given the 'stop' state.
2021-07-02 11:40:34 -04:00
f12de6727d Adjust logo slightly and add debug state 2021-07-02 02:32:08 -04:00
e94f5354e6 Update startup messages with new ASCII logo 2021-07-02 02:21:30 -04:00
c51023ba81 Add profiler to keepalive function 2021-07-02 01:55:15 -04:00
61465ef38f Add profiler to several other functions in API 2021-07-02 01:53:19 -04:00
64c6b04352 Ensure all edited files are restored 2021-07-02 01:50:25 -04:00
20542c3653 Add profiler to cluster status function 2021-07-01 17:35:29 -04:00
00b503824e Set unstable version in API and CLI too 2021-07-01 17:35:11 -04:00
43009486ae Move Ceph pool/volume list assembly to thread pool
Same reasons as the VM list, though less impactful.
2021-07-01 17:33:13 -04:00
58789f1db4 Move VM list assembly to thread pool
This helps parallelize the numerous Zookeeper calls a little bit, at
least within the bounds of the GIL, to improve performance when getting
a large list of VMs. The max_workers value is capped at 32 to avoid
causing too many threads during concurrent executions, but still
provides a noticeable speedup (on the order of 0.2-0.4 seconds with 75
VMs, scaling up further as counts grow).
2021-07-01 17:32:47 -04:00
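
A minimal sketch of the thread-pool fan-out, with a hypothetical get_vm_information() standing in for the real per-VM assembly:

    # Hedged sketch: parallelize per-VM Zookeeper reads with a capped pool.
    from concurrent.futures import ThreadPoolExecutor

    def get_vm_information(zkhandler, vm_uuid):
        # Placeholder for the real per-VM data gathering
        return {'uuid': vm_uuid}

    def get_vm_list(zkhandler, vm_uuids):
        # Cap workers so concurrent API requests don't spawn excessive threads
        workers = min(32, len(vm_uuids) or 1)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = pool.map(lambda u: get_vm_information(zkhandler, u), vm_uuids)
        return list(results)
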
baf4c3fbc7 Add performance profiler function
Usable anywhere that the global daemon "config" parameter can be passed
in (e.g. pvcapid/helper.py, pvcnoded/Daemon.py, etc.). Stores results in
a subdirectory of the PVC logdir called "profiler" if this directory can
be created, or prints results.

The debug config parameter ensures that the profiler can be added to
functions and not run unless the server is explicitly in debug mode.
Might not be useful as I don't initially plan to add this to every
function (only when investigating performance problems), but this
flexibility allows that to change later.
2021-07-01 14:01:33 -04:00
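
One way such a debug-gated profiler could be structured, using cProfile and a "profiler" subdirectory of the log directory (the config keys and names here are assumptions, not the actual PVC interface):

    # Hedged sketch: profile a call only when debug mode is enabled, dumping
    # stats under <logdir>/profiler or printing them if that fails.
    import cProfile
    import os
    import pstats
    import time

    def run_with_profiler(config, func, *args, **kwargs):
        if not config.get('debug'):
            return func(*args, **kwargs)
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            return func(*args, **kwargs)
        finally:
            profiler.disable()
            outdir = os.path.join(config.get('log_directory', '/tmp'), 'profiler')
            try:
                os.makedirs(outdir, exist_ok=True)
                outfile = '{}-{}.prof'.format(func.__name__, int(time.time()))
                profiler.dump_stats(os.path.join(outdir, outfile))
            except OSError:
                pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)
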
e093efceb1 Add NoNodeError handlers in ZK locks
Instead of looping 5+ times acquiring an impossible lock on a
nonexistent key, just fail on a different error and return failure
immediately.

This is likely a major corner case that shouldn't happen, but better to
be safe than 500.
2021-07-01 01:17:38 -04:00
a080598781 Avoid superfluous ZK exists calls
These cause a major (2x) slowdown in read calls since Zookeeper
connections are expensive/slow. Instead, just try the thing and return
None if there's no key there.

Also wrap the children command in similar error handling since that did
not exist and could likely cause some bugs at some point.
2021-07-01 01:15:51 -04:00
39e82ee426 Cast base schema version to int
Or all our comparisons will fail later and nodes can't start.
2021-06-30 09:40:33 -04:00
fe0a1d582a Bump version to 0.9.21 2021-06-29 19:21:31 -04:00
64feabf7c0 Fix adds in bump-version 2021-06-29 19:20:13 -04:00
cc841898b4 Pad dimensions of logos slightly 2021-06-29 19:16:35 -04:00
de5599b800 Revert "Try dark image instead"
This reverts commit 87f963df4c.
2021-06-29 19:11:38 -04:00
87f963df4c Try dark image instead 2021-06-29 19:09:39 -04:00
de23c8c57e Add background colours to logos 2021-06-29 19:08:10 -04:00
c62a0a6c6a Revamp introduction text 2021-06-29 18:47:01 -04:00
12212177ef Update to new logo 2021-06-29 18:41:36 -04:00
6adaf1f669 Fix incorrect handling of deletions in init 2021-06-29 18:41:02 -04:00
b05c93e260 Fix bad return from initialize call 2021-06-29 18:31:56 -04:00
ffdd6bf3f8 Fix typo in command argument 2021-06-29 18:22:39 -04:00
aae9ae2e80 Fix incorrect handling of overwrite flag 2021-06-29 18:22:01 -04:00
f91c07fdcf Re-add UUID limit matching for full UUIDs
This *was* valuable when passing a full UUID in, so go back to that.
Verify first that the limit string is an actual UUID, and then compare
against it if applicable.
2021-06-28 12:27:43 -04:00
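
The check described here amounts to verifying that the limit parses as a full UUID before comparing against it; an illustrative sketch:

    # Hedged sketch: only compare the limit against the VM UUID if it is a
    # full, valid UUID string; otherwise match against the name.
    from uuid import UUID

    def is_full_uuid(limit):
        try:
            return str(UUID(limit)) == limit.lower()
        except (ValueError, AttributeError, TypeError):
            return False

    def vm_matches(limit, vm_name, vm_uuid):
        if is_full_uuid(limit):
            return limit.lower() == vm_uuid.lower()
        return limit in vm_name
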
4e2a1c3e52 Add worker wrapper to fix Deb incompatibility
Celery 5.x introduced a new worker argument format that is not
backwards-compatible with the older Celery 4.x format. This created a
conundrum since we use one service unit for both Debian 10 (4.x) and
Debian 11 (5.x). Instead of worse hacks, create a wrapper script to
start the worker with the correct arguments instead.
2021-06-28 12:19:29 -04:00
dbfa339cfb Ensure postinst and prerm always succeed 2021-06-23 20:35:40 -04:00
c54f66efa8 Limit match only on VM name
I can see no possible reason to want to do limits against UUIDs, but
supporting that means match is not what one would expect since a random
UUID could match the limit. So only limit based on the name.
2021-06-23 19:17:35 -04:00
cd860bae6b Optimize VM list in API
With many VMs this slows down linearly. Rework it a bit so there are
fewer calls to getInformationFromXML and so the processing could happen
in parallel at some point.
2021-06-23 19:14:26 -04:00
bbb132414c Restore shebang and don't do store if completion 2021-06-23 05:26:50 -04:00
04fa63f081 Only hit the network endpoint once
Otherwise this is hit for every VM which gets very slow very fast.
2021-06-23 05:15:48 -04:00
f248d579df Convert pvc-client-cli into a proper Python module
Also fixes up the Debian packaging such that this works how I would
want, with proper module installation while leaving everything else
untouched. Finally implements automatic installation and removal of the
BASH completion for the PVC command.
2021-06-23 05:03:19 -04:00
f0db631947 Ignore root-level venv for testing 2021-06-23 02:15:49 -04:00
e91a597591 Merge branch 'sriov'
Implement SR-IOV support in PVC.

Closes #130
2021-06-23 00:58:44 -04:00
8d21da9041 Add some additional interaction tests 2021-06-22 22:08:51 -04:00
1ae34c1960 Fix bad messages in volume remove 2021-06-22 04:31:02 -04:00
75f2560217 Add documentation on SR-IOV client networks 2021-06-22 04:20:38 -04:00
7d2b7441c2 Mention SR-IOV in the Daemon and Ansible manuals 2021-06-22 03:55:19 -04:00
5ec198bf98 Update API doc with remaining items 2021-06-22 03:47:27 -04:00
e6b26745ce Adjust some help messages in pvc.py 2021-06-22 03:40:21 -04:00
3490ecbb59 Remove explicit ZK address from Patronictl command 2021-06-22 03:31:06 -04:00
2928d695c9 Ensure migration method is updated on state changes 2021-06-22 03:20:15 -04:00
7d2a3b5361 Ensure Macvtap NICs can use a model
Defaults to virtio like a bridged NIC. Otherwise performance is abysmal.
2021-06-22 02:38:16 -04:00
07dbd55f03 Use list comprehension to compare against source 2021-06-22 02:31:14 -04:00
26dd24e3f5 Ensure MTU is set on VF when starting up 2021-06-22 02:26:14 -04:00
6cd0ccf0ad Fix network check on VM config modification 2021-06-22 02:21:55 -04:00
1787a970ab Fix bug in address check format string 2021-06-22 02:21:32 -04:00
e623909a43 Store PHY MAC for VFs and restore after free 2021-06-22 00:56:47 -04:00
60e1da09dd Don't try any shenanigans when updating NICs
Trying to do this on the VMInstance side had problems because we can't
differentiate the 3 types of migration there. So, just update this in
the API side and hope everything goes well.

This introduces an edge bug: if a VM is using a macvtap SR-IOV device,
and then tries to migrate, and the migrate is aborted, the NIC lists
will be inconsistent.

When I revamp the VMInstance in the future, I should be able to correct
this, but for now we'll have to live with that edge case.
2021-06-22 00:00:50 -04:00
dc560c1dcb Better handle retcodes in migrate update 2021-06-21 23:46:47 -04:00
68c7481aa2 Ensure offline migrations update SR-IOV NIC states 2021-06-21 23:35:52 -04:00
7d42fba373 Ensure being in migrate doesn't abort shutdown 2021-06-21 23:28:53 -04:00
b532bc9104 Add missing managed flag for hostdev 2021-06-21 23:22:36 -04:00
24ce361a04 Ensure SR-IOV NIC states are updated on migration 2021-06-21 23:18:34 -04:00
eeb83da97d Add support for SR-IOV NICs to VMs 2021-06-21 23:18:22 -04:00
93c2fdec93 Swap order of networks and disks in provisioner
Done to make the resulting config match the expectations when using "vm
network add", which is that networks are below disks, not above.

Not a functional change, just ensures the VM XML is consistent after
many changes.
2021-06-21 21:59:57 -04:00
904337b677 Fix busted changelog from previous commit 2021-06-21 21:19:06 -04:00
64d1a37b3c Add PCIe device paths to SR-IOV VF information
This will be used when adding VM network interfaces of type hostdev.
2021-06-21 21:08:46 -04:00
13cc0f986f Implement SR-IOV VF config set
Also fixes some random bugs, adds proper interface sorting, and assorted
tweaks.
2021-06-21 18:40:11 -04:00
e13baf8bd3 Add initial SR-IOV list/info to CLI 2021-06-21 17:12:53 -04:00
ae480d6cc1 Add SR-IOV listing/info endpoints to API 2021-06-21 17:12:45 -04:00
33195c3c29 Ensure VF list is sorted 2021-06-21 17:11:48 -04:00
a697c2db2e Add SRIOV PF and VF listing to API 2021-06-21 01:42:55 -04:00
ca11dbf491 Sort the list of VFs for easier parsing 2021-06-21 01:40:05 -04:00
e8bd1bf2c4 Ensure used/used_by are set on creation 2021-06-21 01:25:38 -04:00
bff6d71e18 Add logging to SRIOVVFInstance and fix bug 2021-06-17 02:02:41 -04:00
57b041dc62 Ensure default for vLAN and QOS is 0 not empty 2021-06-17 01:54:37 -04:00
509afd4d05 Add hostdev net_type to handler as well 2021-06-17 01:52:58 -04:00
5607a6bb62 Avoid overwriting VF data
Ensures that the configuration of a VF is not overwritten in Zookeeper
on a node restart. The SRIOVVFInstance handlers were modified to start
with None values, so that the DataWatch statements will always trigger
updates to the live system interfaces on daemon startup, thus ensuring
that the config stored in Zookeeper is applied to the system on startup
(mostly relevant after a cold boot or if the API changes them during a
daemon restart).
2021-06-17 01:45:22 -04:00
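
The start-as-None pattern described above is the standard Kazoo DataWatch idiom; a hedged sketch with an assumed key path:

    # Hedged sketch: keep the cached value as None so the first DataWatch
    # callback (fired immediately on registration) always applies the stored
    # config to the live interface, even after a cold boot.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts='127.0.0.1:2181')
    zk.start()

    current_vlan = None

    @zk.DataWatch('/sriov/vf/ens4v0/config.vlan_id')
    def watch_vlan(data, stat):
        global current_vlan
        new_vlan = data.decode() if data else '0'
        if new_vlan != current_vlan:
            current_vlan = new_vlan
            # Apply to the live interface here, e.g. via "ip link set ... vf ..."
            print('Applying vLAN {} to VF'.format(new_vlan))
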
8f1af2a642 Ignore hostdev interfaces in VM net stat gathering
Prevents errors if a SR-IOV hostdev interface is configured until this
is more defined.
2021-06-17 01:33:11 -04:00
e7b6a3eac1 Implement SR-IOV PF and VF instances
Adds support for the node daemon managing SR-IOV PF and VF instances.

PFs are added to Zookeeper automatically based on the config at startup
during network configuration, and are otherwise completely static. PFs
are automatically removed from Zookeeper, along with all corresponding
VFs, should the PF phy device be removed from the configuration.

VFs are configured based on the (autocreated) VFs of each PF device,
added to Zookeeper, and then a new class instance, SRIOVVFInstance, is
used to watch them for configuration changes. This will enable the
runtime management of VF settings by the API. The set of keys ensures
that both configuration and details of the NIC can be tracked.

Most keys are self-explanatory, especially for PFs and the basic keys
for VFs. The configuration tree is also self-explanatory, being based
entirely on the options available in the `ip link set {dev} vf` command.

Two additional keys are also present: `used` and `used_by`, which will
be able to track the (boolean) state of usage, as well as the VM that
uses a given VIF. Since the VM side implementation will support both
macvtap and direct "hostdev" assignments, this will ensure that this
state can be tracked on both the VF and the VM side.
2021-06-17 01:33:03 -04:00
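
Since the configuration tree mirrors the options of `ip link set {dev} vf`, applying a VF configuration is essentially building and running that command; a hedged sketch with assumed key names:

    # Hedged sketch: apply a VF configuration dict (key names assumed for
    # illustration) to the live system via "ip link set ... vf ...".
    from subprocess import run

    def apply_vf_config(pf, vfid, cfg):
        cmd = ['ip', 'link', 'set', pf, 'vf', str(vfid)]
        if cfg.get('mac'):
            cmd += ['mac', cfg['mac']]
        # vLAN and QOS default to 0 rather than empty
        cmd += ['vlan', str(cfg.get('vlan_id', 0)), 'qos', str(cfg.get('vlan_qos', 0))]
        if 'spoof_check' in cfg:
            cmd += ['spoofchk', 'on' if cfg['spoof_check'] else 'off']
        return run(cmd, check=False).returncode
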
0ad6d55dff Add initial SR-IOV support to node daemon
Adds configuration values for enabled flag and SR-IOV devices to the
configuration and sets up the initial SR-IOV configuration on daemon
startup (inserting the module, configuring the VF count, etc.).
2021-06-15 22:56:09 -04:00
eada5db5e4 Add diagram and info about invalid georedundancy 2021-06-15 10:20:42 -04:00
164becd3ef Fix info and list matching 2021-06-15 02:32:34 -04:00
e4a65230a1 Just do the shutdown command itself 2021-06-15 02:32:14 -04:00
da48304d4a Avoid hackery in VNI list and support direct type 2021-06-15 00:31:13 -04:00
f540dd320b Allow VNI for "direct" type vNICs 2021-06-15 00:27:01 -04:00
284c581845 Ensure shutdown migrations actually time out
Without this a VM that fails to respond to a shutdown will just spin
forever, blocking state changes.
2021-06-15 00:23:15 -04:00
7b85d5e3f3 Stop VM before removing 2021-06-14 21:44:17 -04:00
23318524b9 Ensure validate writes a valid schema version 2021-06-14 21:27:37 -04:00
5f11b3198b Fix base schema None issue in handler too 2021-06-14 21:13:40 -04:00
953e46055a Fix issue with loading None version schema 2021-06-14 21:09:55 -04:00
96f1d7df83 Fix bad quote 2021-06-14 20:36:28 -04:00
d2bcfe5cf7 Bump version to 0.9.20 2021-06-14 18:06:27 -04:00
ef1701b4c8 Handle an additional exception case 2021-06-14 17:15:40 -04:00
08dc756549 Actually disable the pvcapid service
Prevents it from trying to start itself during updates or reboots on
non-primary coordinators.
2021-06-14 17:13:22 -04:00
0a9c0c1ccb Use a nicer reload method on hot schema update
Instead of exiting and trusting systemd to restart us, instead leverage
the os.execv() call to reload the process in the current PID context.

Also improves the log messages so it's very clear what's going on.
2021-06-14 17:10:21 -04:00
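
The reload itself is essentially a one-liner around os.execv(), roughly as sketched below (log wording assumed):

    # Hedged sketch: re-exec the daemon in place so the PID (and the systemd
    # unit tracking it) stay the same while the updated code is loaded.
    import os
    import sys

    def reload_daemon(logger):
        logger('Reloading daemon via exec to apply the schema update')
        os.execv(sys.executable, [sys.executable] + sys.argv)
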
e34a7d4d2a Handle hot reloads properly
A hot reload isn't possible due to DataWatch and ChildrenWatch
constructs, so we instead need to terminate the daemon to "apply" the
schema update. Thus we use exit code 150 (Application defined in LSB)
and reorder some of the elements of the schema validation to ensure
things happen in the right order.
2021-06-14 12:52:43 -04:00
ddd3eeedda Remove needless literal_eval statements 2021-06-14 01:46:30 -04:00
6fdc6674cf Fix grabbing existing version
The schema `version = ` now messes this up.
2021-06-14 01:40:10 -04:00
78453a173c Add functional testing script
Since trying to unit test this monstrous program at this point is a
daunting task, instead create a functional testing script. Can be
theoretically run against any cluster with an appropriate "test"
provisioner profile, but I mostly just run it against my own.
2021-06-14 01:14:20 -04:00
20c773413c Fix bug in snapshot rename 2021-06-14 00:55:26 -04:00
49f4feb482 Fix typo bug in key rename 2021-06-14 00:51:45 -04:00
a2205bec13 Allow VM dump to file directly
Similar to the cluster backup task.
2021-06-13 22:32:54 -04:00
7727221b59 Correctly use the Click file in backups 2021-06-13 22:17:35 -04:00
30a160d5ff Fix invalid type_key 2021-06-13 21:20:10 -04:00
1cbc66dccf Fix bugs in lease listing 2021-06-13 21:10:42 -04:00
bbd903e568 Fix bad schema name 2021-06-13 21:02:44 -04:00
1f49bfa1b2 Fix name of schema element 2021-06-13 20:56:17 -04:00
9511dc9864 Correct issue with invalid ACL ordering 2021-06-13 20:55:28 -04:00
3013973975 Fix bad schema names 2021-06-13 20:32:41 -04:00
8269930d40 Fix bad entry in network add 2021-06-13 18:22:13 -04:00
647bce2a22 Ensure we don't grab None data 2021-06-13 16:43:25 -04:00
ae79113f7c Correct key typo and add error handler 2021-06-13 15:49:30 -04:00
3bad3de720 Verify if key exists before reading 2021-06-13 15:39:43 -04:00
d2f93b3a2e Fix call to celery 2021-06-13 14:56:09 -04:00
680c62a6e4 Fix schema path call and version check 2021-06-13 14:46:30 -04:00
26b1f531e9 Fix bad variable interpolation 2021-06-13 14:37:23 -04:00
be9f1e8636 Use more compatible is_alive in thread 2021-06-13 14:36:27 -04:00
88a1d89501 Fix bad key name 2021-06-13 14:29:54 -04:00
7110a42e5f Add final schema elements after refactoring 2021-06-13 14:26:17 -04:00
01c82f5d19 Move backup and restore into common 2021-06-13 14:25:51 -04:00
059230d369 Convert vm.py to new ZK schema handler 2021-06-13 13:41:21 -04:00
f6e37906a9 Convert node.py to new ZK schema handler 2021-06-13 13:18:34 -04:00
0a162b304a Convert network.py to new ZK schema handler 2021-06-12 18:40:25 -04:00
f071343333 Add DHCP lease schema and temp workaround 2021-06-12 18:22:43 -04:00
01c762a362 Convert common.py to new ZK schema handler 2021-06-12 17:59:09 -04:00
9b1bd8476f Convert cluster.py to new ZK schema handler 2021-06-12 17:11:32 -04:00
6d00ec07b5 Convert ceph.py to new ZK schema handler 2021-06-12 17:09:29 -04:00
247ae4fe2d Fix pre-refactor path bug 2021-06-10 01:18:33 -04:00
b694945010 Fix incorrect name bug 2021-06-10 01:11:14 -04:00
b1c13c9fc1 Fix another bug with read call 2021-06-10 01:08:18 -04:00
75fc40a1e8 Fix bug with nkipath 2021-06-10 01:00:40 -04:00
2aa7f87ca9 Fix bug in creating child path keys 2021-06-10 00:55:54 -04:00
5273c4ebfa Fix bug with encoding raw creates 2021-06-10 00:52:07 -04:00
8dc9fd6dcb Fix bug with sub self command path/key 2021-06-10 00:49:01 -04:00
058c2ceef3 Convert VXNetworkInstance to new ZK schema handler 2021-06-10 00:36:18 -04:00
e7d60260a0 Fix typo in CephInstance path 2021-06-10 00:36:02 -04:00
f030ed974c Correct schema and handling of network subkeys
Required a bit of refactoring in the validation code to ensure we have
direct access, without relying on the translations done in the normal
zkhandler functions.
2021-06-10 00:35:42 -04:00
9985e1dadd Add support for 2-level dynamic keys 2021-06-09 23:52:21 -04:00
85aba7cc18 Convert VMInstance to new ZK schema handler 2021-06-09 23:15:08 -04:00
7e42118e6f Adjust lock schema in NodeInstance and VMInstance
Removes a superfluous lock and puts the sync_lock keys in more usable
places.
2021-06-09 22:51:00 -04:00
24663a3333 Add missing VM schema entry 2021-06-09 22:12:24 -04:00
2704badfbe Convert VMConsole... to new ZK schema handler 2021-06-09 22:08:32 -04:00
450bf6b153 Convert NodeInstance to new ZK schema handler 2021-06-09 22:07:32 -04:00
b94fe88405 Convert fencing to new ZK schema handler 2021-06-09 21:29:01 -04:00
610f6e8f2c Convert CephInstance to new ZK schema handler 2021-06-09 21:17:09 -04:00
f913f42a6d Replace schema paths with updated zkhandler 2021-06-09 20:29:42 -04:00
a9a57533a7 Integrate schema handling within ZKHandler
Abstracts away the schema management, especially when doing actions, to
prevent duplication in other areas.
2021-06-09 13:23:57 -04:00
76c37e6628 Tweak some field names slightly and add missing 2021-06-09 09:58:18 -04:00
0a04adf8f9 Allow empty sub_paths 2021-06-09 01:54:29 -04:00
ae269bdfde Add scripts to generate ZK migration JSON 2021-06-09 00:04:38 -04:00
f2b55ba937 Fix some bugs with migrations 2021-06-09 00:04:16 -04:00
e475552391 Fix some bugs with hot reload 2021-06-09 00:03:26 -04:00
5540bdc86b Add automatic schema upgrade to nodes
Performs an automatic schema upgrade when all nodes are updated to the
latest version.

Addresses #129
2021-06-08 23:35:39 -04:00
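
A sketch of the "upgrade once every node agrees" check (key paths and the migrate() hook are assumptions for illustration):

    # Hedged sketch: run the schema migration only when all nodes report the
    # latest schema version.
    def maybe_upgrade_schema(zk, latest_version, migrate):
        node_versions = [
            int(zk.get('/nodes/{}/data.active_schema'.format(n))[0])
            for n in zk.get_children('/nodes')
        ]
        if node_versions and all(v == latest_version for v in node_versions):
            migrate(latest_version)
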
3c102b3769 Add per-node schema tracking
This will allow nodes to start with their own schema versions, and then
be updated simultaneously by the API.

References #129
2021-06-08 23:35:39 -04:00
a4aaf89681 Add ZKSchema loading and validation to Daemon
Also removes some previous hack migrations from pre-0.9.19.

Addresses #129
2021-06-08 23:35:39 -04:00
602dd7b714 Update version 0 schema and add full validation
Addresses #129
2021-06-08 23:35:39 -04:00
126f0742cd Add Zookeeper schema manager to zkhandler
Adds a new class, ZKSchema, to handle schema management in Zookeeper in
an automated and consistent way. This should solve several issues:

1. Pain in managing changes to ZK keys
2. Pain in handling those changes during live upgrades
3. Simplifying the codebase to remove hardcoded ZK paths

The current master schema for PVC 0.9.19 is committed as version 0.

Addresses #129
2021-06-08 23:35:39 -04:00
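
A minimal sketch of the underlying idea, keeping paths in a versioned schema object so callers reference logical names instead of hardcoded Zookeeper paths (the structure and keys here are assumptions, not the real ZKSchema):

    # Hedged sketch: centralize key paths in a versioned schema map.
    class ZKSchemaSketch:
        version = 0
        paths = {
            'node': '/nodes/{name}',
            'node.state': '/nodes/{name}/daemonstate',
            'vm': '/domains/{uuid}',
        }

        def path(self, key, **kwargs):
            return self.paths[key].format(**kwargs)

    schema = ZKSchemaSketch()
    print(schema.path('node.state', name='hv1'))
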
5843d8aff4 Fix fence call to findTargetNode 2021-06-08 23:34:49 -04:00
fb78be3d8d Add mentions of Debian Bullseye support 2021-06-06 18:09:16 -04:00
cf96bb009f Bump version to 0.9.19 2021-06-06 01:47:41 -04:00
719954b70b Fix missing list comma 2021-06-06 01:39:43 -04:00
f0dc0fc782 Avoid duplicating maintenance state change
This makes no functional difference, but is technically more correct.
2021-06-05 01:36:40 -04:00
5d88e92acc Avoid returning errors with duplicate router mode
Like the previous (new) flush change, these shouldn't be errors, but
simply information "what you want is already done" messages.
2021-06-05 01:14:31 -04:00
505c109875 Avoid re-flush or re-ready nodes if unnecessary 2021-06-05 01:08:32 -04:00
3eedfaa7d5 Collect database model error 2021-06-03 00:22:48 -04:00
7de7e1bc71 Properly handle cluster networks in provisioner 2021-06-02 15:57:46 -04:00
34ef055954 Adjust VNI column for provisioner to text
Allows the storing of the textual cluster labels (e.g. 'upstream') as
valid VNI values in the template.
2021-06-02 15:45:22 -04:00
7dea5d2fac Move logger to common, fix buffering 2021-06-01 18:50:26 -04:00
3a5226b893 Add missing flushed output 2021-06-01 18:30:18 -04:00
de2ff2e01b Fix removed function args 2021-06-01 17:02:36 -04:00
cd75413667 Increase initial lock timer
With the new library the reader seems to be a little too quick, so hold
the write lock for 1 second instead of 1/2 second to ensure it is
caught.
2021-06-01 17:00:11 -04:00
9764090d6d Merge node common with daemon common 2021-06-01 12:22:11 -04:00
f73c433fc7 Remove useless try and import 2021-06-01 12:05:17 -04:00
12ac3686de Convert missed elements to new zkhandler 2021-06-01 11:57:21 -04:00
5740d0f2d5 Remove obsolete zkhandler.py 2021-06-01 11:55:44 -04:00
889f4cdf47 Convert common to new zkhandler 2021-06-01 11:55:32 -04:00
8f66a8d00e Fix missed zkhandler conversion 2021-06-01 11:53:33 -04:00
6beea0693c Convert fencing to new zkhandler 2021-06-01 11:53:21 -04:00
1c9a7a6479 Convert VXNetworkInstance to new zkhandler 2021-06-01 11:49:39 -04:00
790098f181 Convert VMInstance to new zkhandler 2021-06-01 11:46:27 -04:00
8a4a41e092 Convert NodeInstance to new zkhandler 2021-06-01 11:27:35 -04:00
a48bf2d71e More gracefully handle none selectors
Allow selection of "none" as the node selector, and handle this by
always using the cluster default instead of writing it in.
2021-06-01 11:13:13 -04:00
a0b9087167 Set Daemon migration selector in zookeeper 2021-06-01 10:52:41 -04:00
33a54cf7f2 Move configuration keys to /config tree 2021-06-01 10:48:55 -04:00
d6a8cf9780 Convert MetadataAPIInstance to new zkhandler 2021-05-31 19:55:09 -04:00
abd619a3c1 Convert DNSAggregatorInstance to new zkhandler 2021-05-31 19:55:01 -04:00
ef5fe78125 Convert CephInstance to new zkhandler 2021-05-31 19:51:27 -04:00
f6d0e89568 Properly add absent node type 2021-05-31 19:26:27 -04:00
d3b5b5236a Remove transactional delete
This just doesn't work due to the darn limit on recursive deletes in
transactions.
2021-05-31 19:22:01 -04:00
8625e9bd3e Update Delete to recursive method 2021-05-31 03:14:09 -04:00
ede3e88cd7 Modify node daemon root to use updated zkhandler 2021-05-31 03:14:09 -04:00
ed4f84a3ec Add log handling and persistent listener 2021-05-31 03:14:09 -04:00
a1969eb981 Allow overwrite during init command 2021-05-31 00:12:28 -04:00
c7992000eb Explicitly output JSON cluster data 2021-05-30 23:50:42 -04:00
a1e8cc5867 Skip patroni tree during backups 2021-05-30 23:39:37 -04:00
ac0c3b0ec9 Ensure temp_dir exists before starting
Otherwise some failures throw the wrong error.
2021-05-30 16:04:38 -04:00
60db800d9c Use full ZKHandler in provisioner
Required due to references to self from Celery that are replaced by the
ZKConnection self instance.
2021-05-30 15:59:37 -04:00
9be426507a Fix erroneous lock calls 2021-05-30 15:31:17 -04:00
58a5b00aa1 Remove extraneous zkhandler reference 2021-05-30 01:01:40 -04:00
73407e245f Move startup code to an entrypoint function
Prevents further issues with startup.
2021-05-30 00:18:04 -04:00
25f80a4478 Move API version string location to Daemon
Prevents a startup bug with pvcapid-manage.py.
2021-05-30 00:11:24 -04:00
c23a53d082 Add daemon_lib symlink to pvcnoded 2021-05-30 00:00:07 -04:00
b4f2cf879e Rework vm library for new zkhandler 2021-05-29 21:17:19 -04:00
3603b782c0 Rework node library for new zkhandler 2021-05-29 20:56:21 -04:00
62cb72b62f Rework network library for new zkhandler 2021-05-29 20:53:42 -04:00
b186a75b4e Rework common library for new zkhandler 2021-05-29 20:35:28 -04:00
6205dba451 Rework cluster library for new zkhandler 2021-05-29 20:32:20 -04:00
688d1a6ae2 Rework ceph library for new zkhandler 2021-05-29 20:29:51 -04:00
163015bd4a Port remaining helper functions to ZKConnection 2021-05-29 00:30:42 -04:00
49bbad8021 Port provisioner to ZKConnection 2021-05-29 00:26:15 -04:00
2c0bafc313 Port benchmark to ZKConnection 2021-05-29 00:24:53 -04:00
1963f2c336 Convert OVA helper to ZKConnection 2021-05-29 00:22:06 -04:00
9cd121ef9f Convert remaining VM functions 2021-05-29 00:16:26 -04:00
ea63a58b21 Port two more functions to new decorator 2021-05-28 23:38:53 -04:00
0eceec0341 Disable SQLAlchemy modification tracking 2021-05-28 23:36:36 -04:00
c6bececb55 Revamp config parsing and imports
Brings sanity to the passing of the config variable around the various
submodules for use in the ZKConnection decorator.
2021-05-28 23:33:36 -04:00
4554a0d6af Add line break to lint output 2021-05-28 00:20:03 -04:00
f82da03a62 Add first wrappers and exceptions 2021-05-28 00:19:39 -04:00
fef230ad98 Implement class-based version of zkhandler 2021-05-27 22:50:00 -04:00
3128c8fa70 Correct flawed conditional in some commands 2021-05-25 09:59:20 -04:00
0c75a127b2 Bump version to 0.9.18 2021-05-23 17:23:10 -04:00
f46c2e7f6a Implement VM rename functionality
Closes #125
2021-05-23 17:21:19 -04:00
9de14c46fb Bump version to 0.9.17 2021-05-19 17:06:29 -04:00
1b8b101b64 Fix bugs in log follow command 2021-05-19 16:22:48 -04:00
84 changed files with 8122 additions and 11399 deletions

.gitignore

@ -1,3 +1,8 @@
*.pyc
*.tmp
*.swp
# Ignore build artifacts
debian/pvc-*/
debian/*.log
debian/*.substvars
debian/files


@ -5,7 +5,7 @@ pushd $( git rev-parse --show-toplevel ) &>/dev/null
ex=0
# Linting
echo -n "Linting... "
echo "Linting..."
./lint
if [[ $? -ne 0 ]]; then
echo "Aborting commit due to linting errors."

.version (new file)

@ -0,0 +1 @@
0.9.27


@ -672,8 +672,3 @@ may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.
----
Logo contains elements Copyright Anjara Begue via <http://www.vecteezy.com>
which are released under a Creative Commons Attribution license.

README.md

@ -1,25 +1,137 @@
# PVC - The Parallel Virtual Cluster system
<p align="center">
<img alt="Logo banner" src="https://git.bonifacelabs.ca/uploads/-/system/project/avatar/135/pvc_logo.png"/>
<img alt="Logo banner" src="docs/images/pvc_logo_black.png"/>
<br/><br/>
<a href="https://github.com/parallelvirtualcluster/pvc"><img alt="License" src="https://img.shields.io/github/license/parallelvirtualcluster/pvc"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
<a href="https://parallelvirtualcluster.readthedocs.io/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
</p>
**NOTICE FOR GITHUB**: This repository is a read-only mirror of the PVC repositories from my personal GitLab instance. Pull requests submitted here will not be merged. Issues submitted here will however be treated as authoritative.
## What is PVC?
PVC is a KVM+Ceph+Zookeeper-based, Free Software, scalable, redundant, self-healing, and self-managing private cloud solution designed with administrator simplicity in mind. It is built from the ground-up to be redundant at the host layer, allowing the cluster to gracefully handle the loss of nodes or their components, both due to hardware failure or due to maintenance. It is able to scale from a minimum of 3 nodes up to 12 or more nodes, while retaining performance and flexibility, allowing the administrator to build a small cluster today and grow it as needed.
PVC is a virtual machine-based hyperconverged infrastructure (HCI) virtualization cluster solution that is fully Free Software, scalable, redundant, self-healing, self-managing, and designed for administrator simplicity. It is an alternative to other HCI solutions such as Harvester, Nutanix, and VMWare, as well as to other common virtualization stacks such as ProxMox and OpenStack.
PVC is a complete HCI solution, built from well-known and well-trusted Free Software tools, to assist an administrator in creating and managing a cluster of servers to run virtual machines, as well as self-managing several important aspects including storage failover, node failure and recovery, virtual machine failure and recovery, and network plumbing. It is designed to act consistently, reliably, and unobtrusively, letting the administrator concentrate on more important things.
PVC is highly scalable. From a minimum (production) node count of 3, up to 12 or more, and supporting many dozens of VMs, PVC scales along with your workload and requirements. Deploy a cluster once and grow it as your needs expand.
As a consequence of its features, PVC makes administrating very high-uptime VMs extremely easy, featuring VM live migration, built-in always-enabled shared storage with transparent multi-node replication, and consistent network plumbing throughout the cluster. Nodes can also be seamlessly removed from or added to service, with zero VM downtime, to facilitate maintenance, upgrades, or other work.
PVC also features an optional, fully customizable VM provisioning framework, designed to automate and simplify VM deployments using custom provisioning profiles, scripts, and CloudInit userdata API support.
Installation of PVC is accomplished by two main components: a [Node installer ISO](https://github.com/parallelvirtualcluster/pvc-installer) which creates on-demand installer ISOs, and an [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) to configure, bootstrap, and administrate the nodes. Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client or WebUI.
Just give it physical servers, and it will run your VMs without you having to think about it, all in just an hour or two of setup time.
## What is it based on?
The core node and API daemons, as well as the CLI API client, are written in Python 3 and are fully Free Software (GNU GPL v3). In addition to these, PVC makes use of the following software tools to provide a holistic hyperconverged infrastructure solution:
* Debian GNU/Linux as the base OS.
* Linux KVM, QEMU, and Libvirt for VM management.
* Linux `ip`, FRRouting, NFTables, DNSMasq, and PowerDNS for network management.
* Ceph for storage management.
* Apache Zookeeper for the primary cluster state database.
* Patroni PostgreSQL manager for the secondary relation databases (DNS aggregation, Provisioner configuration).
The major goal of PVC is to be administrator friendly, providing the power of Enterprise-grade private clouds like OpenStack, Nutanix, and VMWare to homelabbers, SMBs, and small ISPs, without the cost or complexity. It believes in picking the best tool for a job and abstracting it behind the cluster as a whole, freeing the administrator from the boring and time-consuming task of selecting the best component, and letting them get on with the things that really matter. Administration can be done from a simple CLI or via a RESTful API capable of building full-featured web frontends or additional applications, taking a self-documenting approach to keep the administrator learning curve as low as possible. Setup is easy and straightforward with an [ISO-based node installer](https://git.bonifacelabs.ca/parallelvirtualcluster/pvc-installer) and [Ansible role framework](https://git.bonifacelabs.ca/parallelvirtualcluster/pvc-ansible) designed to get a cluster up and running as quickly as possible. Build your cloud in an hour, grow it as you need, and never worry about it: just add physical servers.
## Getting Started
To get started with PVC, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page for general information about the project, and the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) page for details on configuring your cluster.
To get started with PVC, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page for general information about the project, and the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) page for details on configuring your first cluster.
## Changelog
#### v0.9.27
* [CLI Client] Fixes a bug with vm modify command when passed a file
#### v0.9.26
* [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures
* [All] Implements VM tagging functionality
* [All] Implements Node log access via PVC functionality
#### v0.9.25
* [Node Daemon] Returns to Rados library calls for Ceph due to performance problems
* [Node Daemon] Adds a date output to keepalive messages
* [Daemons] Configures ZK connection logging only for persistent connections
* [API Provisioner] Add context manager-based chroot to Debootstrap example script
* [Node Daemon] Fixes a bug where shutdown daemon state was overwritten
#### v0.9.24
* [Node Daemon] Removes Rados module polling of Ceph cluster and returns to command-based polling for timeout purposes, and removes some flaky return statements
* [Node Daemon] Removes flaky Zookeeper connection renewals that caused problems
* [CLI Client] Allow raw lists of clusters from `pvc cluster list`
* [API Daemon] Fixes several issues when getting VM data without stats
* [API Daemon] Fixes issues with removing VMs while disks are still in use (failed provisioning, etc.)
#### v0.9.23
* [Daemons] Fixes a critical overwriting bug in zkhandler when schema paths are not yet valid
* [Node Daemon] Ensures the daemon mode is updated on every startup (fixes the side effect of the above bug in 0.9.22)
#### v0.9.22
* [API Daemon] Drastically improves performance when getting large lists (e.g. VMs)
* [Daemons] Adds profiler functions for use in debug mode
* [Daemons] Improves reliability of ZK locking
* [Daemons] Adds the new logo in ASCII form to the Daemon startup message
* [Node Daemon] Fixes bug where VMs would sometimes not stop
* [Node Daemon] Code cleanups in various classes
* [Node Daemon] Fixes a bug when reading node schema data
* [All] Adds node PVC version information to the list output
* [CLI Client] Improves the style and formatting of list output including a new header line
* [API Worker] Fixes a bug that prevented the storage benchmark job from running
#### v0.9.21
* [API Daemon] Ensures VMs stop before removing them
* [Node Daemon] Fixes a bug with VM shutdowns not timing out
* [Documentation] Adds information about georedundancy caveats
* [All] Adds support for SR-IOV NICs (hostdev and macvtap) and surrounding documentation
* [Node Daemon] Fixes a bug where shutdown aborted migrations unexpectedly
* [Node Daemon] Fixes a bug where the migration method was not updated realtime
* [Node Daemon] Adjusts the Patroni commands to remove reference to Zookeeper path
* [CLI Client] Adjusts several help messages and fixes some typos
* [CLI Client] Converts the CLI client to a proper Python module
* [API Daemon] Improves VM list performance
* [API Daemon] Adjusts VM list matching criteria (only matches against the UUID if it's a full UUID)
* [API Worker] Fixes incompatibility between Deb 10 and 11 in launching Celery worker
* [API Daemon] Corrects several bugs with initialization command
* [Documentation] Adds a shiny new logo and revamps introduction text
#### v0.9.20
* [Daemons] Implemented a Zookeeper schema handler and version 0 schema
* [Daemons] Completes major refactoring of codebase to make use of the schema handler
* [Daemons] Adds support for dynamic schema changes and "hot reloading" of pvcnoded processes
* [Daemons] Adds a functional testing script for verifying operation against a test cluster
* [Daemons, CLI] Fixes several minor bugs found by the above script
* [Daemons, CLI] Add support for Debian 11 "Bullseye"
#### v0.9.19
* [CLI] Corrects some flawed conditionals
* [API] Disables SQLAlchemy modification tracking functionality (not used by us)
* [Daemons] Implements new zkhandler module for improved reliability and reusability
* [Daemons] Refactors some code to use new zkhandler module
* [API, CLI] Adds support for "none" migration selector (uses cluster default instead)
* [Daemons] Moves some configuration keys to new /config tree
* [Node Daemon] Increases initial lock timeout for VM migrations to avoid out-of-sync potential
* [Provisioner] Support storing and using textual cluster network labels ("upstream", "storage", "cluster") in templates
* [API] Avoid duplicating existing node states
#### v0.9.18
* Adds VM rename functionality to API and CLI client
#### v0.9.17
* [CLI] Fixes bugs in log follow output
#### v0.9.16
* Improves some CLI help messages


@ -0,0 +1,28 @@
"""PVC version 0.9.18
Revision ID: bae4d5a77c74
Revises: 3efe890e1d87
Create Date: 2021-06-02 15:41:40.061806
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = 'bae4d5a77c74'
down_revision = '3efe890e1d87'
branch_labels = None
depends_on = None
def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.execute('ALTER TABLE network ALTER COLUMN vni TYPE TEXT')
# ### end Alembic commands ###
def downgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.execute('ALTER TABLE network ALTER COLUMN vni TYPE INTEGER USING vni::integer')
# ### end Alembic commands ###


@ -34,6 +34,29 @@
# with that.
import os
from contextlib import contextmanager
# Create a chroot context manager
# This can be used later in the script to chroot to the destination directory
# for instance to run commands within the target.
@contextmanager
def chroot_target(destination):
try:
real_root = os.open("/", os.O_RDONLY)
os.chroot(destination)
fake_root = os.open("/", os.O_RDONLY)
os.fchdir(fake_root)
yield
finally:
os.fchdir(real_root)
os.chroot(".")
os.fchdir(real_root)
os.close(fake_root)
os.close(real_root)
del fake_root
del real_root
# Installation function - performs a debootstrap install of a Debian system
# Note that the only arguments are keyword arguments.
@ -193,40 +216,25 @@ GRUB_DISABLE_LINUX_UUID=false
fh.write(data)
# Chroot, do some in-root tasks, then exit the chroot
# EXITING THE CHROOT IS VERY IMPORTANT OR THE FOLLOWING STAGES OF THE PROVISIONER
# WILL FAIL IN UNEXPECTED WAYS! Keep this in mind when using chroot in your scripts.
real_root = os.open("/", os.O_RDONLY)
os.chroot(temporary_directory)
fake_root = os.open("/", os.O_RDONLY)
os.fchdir(fake_root)
# Install and update GRUB
os.system(
"grub-install --force /dev/rbd/{}/{}_{}".format(root_disk['pool'], vm_name, root_disk['disk_id'])
)
os.system(
"update-grub"
)
# Set a really dumb root password [TEMPORARY]
os.system(
"echo root:test123 | chpasswd"
)
# Enable cloud-init target on (first) boot
# NOTE: Your user-data should handle this and disable it once done, or things get messy.
# That cloud-init won't run without this hack seems like a bug... but even the official
# Debian cloud images are affected, so who knows.
os.system(
"systemctl enable cloud-init.target"
)
# Restore our original root/exit the chroot
# EXITING THE CHROOT IS VERY IMPORTANT OR THE FOLLOWING STAGES OF THE PROVISIONER
# WILL FAIL IN UNEXPECTED WAYS! Keep this in mind when using chroot in your scripts.
os.fchdir(real_root)
os.chroot(".")
os.fchdir(real_root)
os.close(fake_root)
os.close(real_root)
with chroot_target(temporary_directory):
# Install and update GRUB
os.system(
"grub-install --force /dev/rbd/{}/{}_{}".format(root_disk['pool'], vm_name, root_disk['disk_id'])
)
os.system(
"update-grub"
)
# Set a really dumb root password [TEMPORARY]
os.system(
"echo root:test123 | chpasswd"
)
# Enable cloud-init target on (first) boot
# NOTE: Your user-data should handle this and disable it once done, or things get messy.
# That cloud-init won't run without this hack seems like a bug... but even the official
# Debian cloud images are affected, so who knows.
os.system(
"systemctl enable cloud-init.target"
)
# Unmount the bound devfs
os.system(
@ -235,8 +243,4 @@ GRUB_DISABLE_LINUX_UUID=false
)
)
# Clean up file handles so paths can be unmounted
del fake_root
del real_root
# Everything else is done via cloud-init user-data


@ -29,7 +29,7 @@
# This script will run under root privileges as the provisioner does. Be careful
# with that.
# Installation function - performs a debootstrap install of a Debian system
# Installation function - performs no actions then returns
# Note that the only arguments are keyword arguments.
def install(**kwargs):
# The provisioner has already mounted the disks on kwargs['temporary_directory'].

api-daemon/pvcapid-manage-zk.py (new executable file)

@ -0,0 +1,24 @@
#!/usr/bin/env python3
# pvcapid-manage-zk.py - PVC Zookeeper migration generator
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
from daemon_lib.zkhandler import ZKSchema
ZKSchema.write()


@ -9,7 +9,7 @@ Type = simple
WorkingDirectory = /usr/share/pvc
Environment = PYTHONUNBUFFERED=true
Environment = PVC_CONFIG_FILE=/etc/pvc/pvcapid.yaml
ExecStart = /usr/bin/celery worker -A pvcapid.flaskapi.celery --concurrency 1 --loglevel INFO
ExecStart = /usr/share/pvc/pvcapid-worker.sh
Restart = on-failure
[Install]

api-daemon/pvcapid-worker.sh (new executable file)

@ -0,0 +1,40 @@
#!/usr/bin/env bash
# pvcapid-worker.sh - API Celery worker daemon startup stub
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
CELERY_BIN="$( which celery )"
# This absolute hackery is needed because Celery got the bright idea to change how their
# app arguments work in a non-backwards-compatible way with Celery 5.
case "$( cat /etc/debian_version )" in
10.*)
CELERY_ARGS="worker --app pvcapid.flaskapi.celery --concurrency 1 --loglevel INFO"
;;
11.*)
CELERY_ARGS="--app pvcapid.flaskapi.celery worker --concurrency 1 --loglevel INFO"
;;
*)
echo "Invalid Debian version found!"
exit 1
;;
esac
${CELERY_BIN} ${CELERY_ARGS}
exit $?


@ -20,3 +20,5 @@
###############################################################################
import pvcapid.Daemon # noqa: F401
pvcapid.Daemon.entrypoint()


@ -19,37 +19,119 @@
#
###############################################################################
import pvcapid.flaskapi as pvc_api
import os
import yaml
from distutils.util import strtobool as dustrtobool
# Daemon version
version = '0.9.27'
# API version
API_VERSION = 1.0
##########################################################
# Helper Functions
##########################################################
def strtobool(stringv):
if stringv is None:
return False
if isinstance(stringv, bool):
return bool(stringv)
try:
return bool(dustrtobool(stringv))
except Exception:
return False
##########################################################
# Configuration Parsing
##########################################################
# Parse the configuration file
try:
pvcapid_config_file = os.environ['PVC_CONFIG_FILE']
except Exception:
print('Error: The "PVC_CONFIG_FILE" environment variable must be set before starting pvcapid.')
exit(1)
print('Loading configuration from file "{}"'.format(pvcapid_config_file))
# Read in the config
try:
with open(pvcapid_config_file, 'r') as cfgfile:
o_config = yaml.load(cfgfile, Loader=yaml.BaseLoader)
except Exception as e:
print('ERROR: Failed to parse configuration file: {}'.format(e))
exit(1)
try:
# Create the config object
config = {
'debug': strtobool(o_config['pvc']['debug']),
'coordinators': o_config['pvc']['coordinators'],
'listen_address': o_config['pvc']['api']['listen_address'],
'listen_port': int(o_config['pvc']['api']['listen_port']),
'auth_enabled': strtobool(o_config['pvc']['api']['authentication']['enabled']),
'auth_secret_key': o_config['pvc']['api']['authentication']['secret_key'],
'auth_tokens': o_config['pvc']['api']['authentication']['tokens'],
'ssl_enabled': strtobool(o_config['pvc']['api']['ssl']['enabled']),
'ssl_key_file': o_config['pvc']['api']['ssl']['key_file'],
'ssl_cert_file': o_config['pvc']['api']['ssl']['cert_file'],
'database_host': o_config['pvc']['provisioner']['database']['host'],
'database_port': int(o_config['pvc']['provisioner']['database']['port']),
'database_name': o_config['pvc']['provisioner']['database']['name'],
'database_user': o_config['pvc']['provisioner']['database']['user'],
'database_password': o_config['pvc']['provisioner']['database']['pass'],
'queue_host': o_config['pvc']['provisioner']['queue']['host'],
'queue_port': o_config['pvc']['provisioner']['queue']['port'],
'queue_path': o_config['pvc']['provisioner']['queue']['path'],
'storage_hosts': o_config['pvc']['provisioner']['ceph_cluster']['storage_hosts'],
'storage_domain': o_config['pvc']['provisioner']['ceph_cluster']['storage_domain'],
'ceph_monitor_port': o_config['pvc']['provisioner']['ceph_cluster']['ceph_monitor_port'],
'ceph_storage_secret_uuid': o_config['pvc']['provisioner']['ceph_cluster']['ceph_storage_secret_uuid']
}
# Use coordinators as storage hosts if not explicitly specified
if not config['storage_hosts']:
config['storage_hosts'] = config['coordinators']
except Exception as e:
print('ERROR: Failed to load configuration: {}'.format(e))
exit(1)
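For orientation, the parser above expects a pvcapid.yaml shaped roughly like the following; this is an abbreviated, hypothetical sketch limited to the keys read here, with placeholder values (it is not the canonical example configuration):

# Hypothetical minimal pvcapid.yaml, parsed the same way as above; every value
# below is a placeholder, not a documented default.
import yaml

o_config = yaml.load("""
pvc:
  debug: "False"
  coordinators: ["hv1", "hv2", "hv3"]
  api:
    listen_address: "0.0.0.0"
    listen_port: "7370"
    authentication:
      enabled: "False"
      secret_key: ""
      tokens: []
    ssl:
      enabled: "False"
      key_file: ""
      cert_file: ""
  provisioner:
    database:
      host: "localhost"
      port: "5432"
      name: "pvcapi"
      user: "pvcapi"
      pass: "changeme"
    queue:
      host: "localhost"
      port: "6379"
      path: "/0"
    ceph_cluster:
      storage_hosts: []
      storage_domain: "storage.example.com"
      ceph_monitor_port: "6789"
      ceph_storage_secret_uuid: ""
""", Loader=yaml.BaseLoader)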
##########################################################
# Entrypoint
##########################################################
# Version string for startup output
version = '0.9.16'
def entrypoint():
import pvcapid.flaskapi as pvc_api # noqa: E402
if pvc_api.config['ssl_enabled']:
context = (pvc_api.config['ssl_cert_file'], pvc_api.config['ssl_key_file'])
else:
context = None
if config['ssl_enabled']:
context = (config['ssl_cert_file'], config['ssl_key_file'])
else:
context = None
# Print our startup messages
print('')
print('|--------------------------------------------------|')
print('| ######## ## ## ###### |')
print('| ## ## ## ## ## ## |')
print('| ## ## ## ## ## |')
print('| ######## ## ## ## |')
print('| ## ## ## ## |')
print('| ## ## ## ## ## |')
print('| ## ### ###### |')
print('|--------------------------------------------------|')
print('| Parallel Virtual Cluster API daemon v{0: <11} |'.format(version))
print('| API version: v{0: <34} |'.format(pvc_api.API_VERSION))
print('| Listen: {0: <40} |'.format('{}:{}'.format(pvc_api.config['listen_address'], pvc_api.config['listen_port'])))
print('| SSL: {0: <43} |'.format(str(pvc_api.config['ssl_enabled'])))
print('| Authentication: {0: <32} |'.format(str(pvc_api.config['auth_enabled'])))
print('|--------------------------------------------------|')
print('')
# Print our startup messages
print('')
print('|----------------------------------------------------------|')
print('| |')
print('| ███████████ ▜█▙ ▟█▛ █████ █ █ █ |')
print('| ██ ▜█▙ ▟█▛ ██ |')
print('| ███████████ ▜█▙ ▟█▛ ██ |')
print('| ██ ▜█▙▟█▛ ███████████ |')
print('| |')
print('|----------------------------------------------------------|')
print('| Parallel Virtual Cluster API daemon v{0: <19} |'.format(version))
print('| Debug: {0: <49} |'.format(str(config['debug'])))
print('| API version: v{0: <42} |'.format(API_VERSION))
print('| Listen: {0: <48} |'.format('{}:{}'.format(config['listen_address'], config['listen_port'])))
print('| SSL: {0: <51} |'.format(str(config['ssl_enabled'])))
print('| Authentication: {0: <40} |'.format(str(config['auth_enabled'])))
print('|----------------------------------------------------------|')
print('')
pvc_api.app.run(pvc_api.config['listen_address'], pvc_api.config['listen_port'], threaded=True, ssl_context=context)
pvc_api.app.run(config['listen_address'], config['listen_port'], threaded=True, ssl_context=context)
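Since the configuration is now parsed at import time in pvcapid.Daemon, a manual (non-systemd) launch reduces to pointing PVC_CONFIG_FILE at a config file and calling the entrypoint; a minimal sketch, assuming a config exists at the path shown:

# Minimal manual-launch sketch; in production pvcapid.service does this instead.
import os

# Must be set before importing pvcapid.Daemon, since the module parses the
# configuration at import time (see above).
os.environ["PVC_CONFIG_FILE"] = "/etc/pvc/pvcapid.yaml"

from pvcapid.Daemon import entrypoint

entrypoint()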


@ -22,24 +22,13 @@
import psycopg2
import psycopg2.extras
from distutils.util import strtobool as dustrtobool
from pvcapid.Daemon import config
from daemon_lib.zkhandler import ZKHandler
import daemon_lib.common as pvc_common
import daemon_lib.ceph as pvc_ceph
config = None # Set in this namespace by flaskapi
def strtobool(stringv):
if stringv is None:
return False
if isinstance(stringv, bool):
return bool(stringv)
try:
return bool(dustrtobool(stringv))
except Exception:
return False
#
# Exceptions (used by Celery tasks)
@ -48,7 +37,7 @@ class BenchmarkError(Exception):
"""
An exception that results from the Benchmark job.
"""
def __init__(self, message, cur_time=None, db_conn=None, db_cur=None, zk_conn=None):
def __init__(self, message, cur_time=None, db_conn=None, db_cur=None, zkhandler=None):
self.message = message
if cur_time is not None:
# Clean up our dangling result
@ -58,7 +47,7 @@ class BenchmarkError(Exception):
db_conn.commit()
# Close the database connections cleanly
close_database(db_conn, db_cur)
pvc_common.stopZKConnection(zk_conn)
zkhandler.disconnect()
def __str__(self):
return str(self.message)
@ -134,7 +123,8 @@ def run_benchmark(self, pool):
raise Exception
try:
zk_conn = pvc_common.startZKConnection(config['coordinators'])
zkhandler = ZKHandler(config)
zkhandler.connect()
except Exception:
print('FATAL - failed to connect to Zookeeper')
raise Exception
@ -146,7 +136,7 @@ def run_benchmark(self, pool):
db_cur.execute(query, args)
db_conn.commit()
except Exception as e:
raise BenchmarkError("Failed to store running status: {}".format(e), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zk_conn=zk_conn)
raise BenchmarkError("Failed to store running status: {}".format(e), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zkhandler=zkhandler)
# Phase 1 - volume preparation
self.update_state(state='RUNNING', meta={'current': 1, 'total': 3, 'status': 'Creating benchmark volume'})
@ -155,9 +145,9 @@ def run_benchmark(self, pool):
volume = 'pvcbenchmark'
# Create the RBD volume
retcode, retmsg = pvc_ceph.add_volume(zk_conn, pool, volume, "8G")
retcode, retmsg = pvc_ceph.add_volume(zkhandler, pool, volume, "8G")
if not retcode:
raise BenchmarkError('Failed to create volume "{}": {}'.format(volume, retmsg), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zk_conn=zk_conn)
raise BenchmarkError('Failed to create volume "{}": {}'.format(volume, retmsg), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zkhandler=zkhandler)
else:
print(retmsg)
@ -244,7 +234,7 @@ def run_benchmark(self, pool):
retcode, stdout, stderr = pvc_common.run_os_command(fio_cmd)
if retcode:
raise BenchmarkError("Failed to run fio test: {}".format(stderr), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zk_conn=zk_conn)
raise BenchmarkError("Failed to run fio test: {}".format(stderr), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zkhandler=zkhandler)
# Parse the terse results to avoid storing tons of junk
# Reference: https://fio.readthedocs.io/en/latest/fio_doc.html#terse-output
@ -445,9 +435,9 @@ def run_benchmark(self, pool):
time.sleep(1)
# Remove the RBD volume
retcode, retmsg = pvc_ceph.remove_volume(zk_conn, pool, volume)
retcode, retmsg = pvc_ceph.remove_volume(zkhandler, pool, volume)
if not retcode:
raise BenchmarkError('Failed to remove volume "{}": {}'.format(volume, retmsg), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zk_conn=zk_conn)
raise BenchmarkError('Failed to remove volume "{}": {}'.format(volume, retmsg), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zkhandler=zkhandler)
else:
print(retmsg)
@ -458,8 +448,10 @@ def run_benchmark(self, pool):
db_cur.execute(query, args)
db_conn.commit()
except Exception as e:
raise BenchmarkError("Failed to store test results: {}".format(e), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zk_conn=zk_conn)
raise BenchmarkError("Failed to store test results: {}".format(e), cur_time=cur_time, db_conn=db_conn, db_cur=db_cur, zkhandler=zkhandler)
close_database(db_conn, db_cur)
pvc_common.stopZKConnection(zk_conn)
zkhandler.disconnect()
del zkhandler
return {'status': "Storage benchmark '{}' completed successfully.", 'current': 3, 'total': 3}


@ -19,15 +19,14 @@
#
###############################################################################
import yaml
import os
import flask
from distutils.util import strtobool as dustrtobool
from functools import wraps
from flask_restful import Resource, Api, reqparse, abort
from celery import Celery
from pvcapid.Daemon import config, strtobool, API_VERSION
import pvcapid.helper as api_helper
import pvcapid.provisioner as api_provisioner
import pvcapid.benchmark as api_benchmark
@ -35,84 +34,12 @@ import pvcapid.ova as api_ova
from flask_sqlalchemy import SQLAlchemy
API_VERSION = 1.0
def strtobool(stringv):
if stringv is None:
return False
if isinstance(stringv, bool):
return bool(stringv)
try:
return bool(dustrtobool(stringv))
except Exception:
return False
# Parse the configuration file
try:
pvcapid_config_file = os.environ['PVC_CONFIG_FILE']
except Exception:
print('Error: The "PVC_CONFIG_FILE" environment variable must be set before starting pvcapid.')
exit(1)
print('Loading configuration from file "{}"'.format(pvcapid_config_file))
# Read in the config
try:
with open(pvcapid_config_file, 'r') as cfgfile:
o_config = yaml.load(cfgfile, Loader=yaml.BaseLoader)
except Exception as e:
print('ERROR: Failed to parse configuration file: {}'.format(e))
exit(1)
try:
# Create the config object
config = {
'debug': strtobool(o_config['pvc']['debug']),
'coordinators': o_config['pvc']['coordinators'],
'listen_address': o_config['pvc']['api']['listen_address'],
'listen_port': int(o_config['pvc']['api']['listen_port']),
'auth_enabled': strtobool(o_config['pvc']['api']['authentication']['enabled']),
'auth_secret_key': o_config['pvc']['api']['authentication']['secret_key'],
'auth_tokens': o_config['pvc']['api']['authentication']['tokens'],
'ssl_enabled': strtobool(o_config['pvc']['api']['ssl']['enabled']),
'ssl_key_file': o_config['pvc']['api']['ssl']['key_file'],
'ssl_cert_file': o_config['pvc']['api']['ssl']['cert_file'],
'database_host': o_config['pvc']['provisioner']['database']['host'],
'database_port': int(o_config['pvc']['provisioner']['database']['port']),
'database_name': o_config['pvc']['provisioner']['database']['name'],
'database_user': o_config['pvc']['provisioner']['database']['user'],
'database_password': o_config['pvc']['provisioner']['database']['pass'],
'queue_host': o_config['pvc']['provisioner']['queue']['host'],
'queue_port': o_config['pvc']['provisioner']['queue']['port'],
'queue_path': o_config['pvc']['provisioner']['queue']['path'],
'storage_hosts': o_config['pvc']['provisioner']['ceph_cluster']['storage_hosts'],
'storage_domain': o_config['pvc']['provisioner']['ceph_cluster']['storage_domain'],
'ceph_monitor_port': o_config['pvc']['provisioner']['ceph_cluster']['ceph_monitor_port'],
'ceph_storage_secret_uuid': o_config['pvc']['provisioner']['ceph_cluster']['ceph_storage_secret_uuid']
}
# Use coordinators as storage hosts if not explicitly specified
if not config['storage_hosts']:
config['storage_hosts'] = config['coordinators']
# Set the config object in the api_helper namespace
api_helper.config = config
# Set the config object in the api_provisioner namespace
api_provisioner.config = config
# Set the config object in the api_benchmark namespace
api_benchmark.config = config
# Set the config object in the api_ova namespace
api_ova.config = config
except Exception as e:
print('ERROR: Failed to load configuration: {}'.format(e))
exit(1)
# Create Flask app and set config values
app = flask.Flask(__name__)
app.config['CELERY_BROKER_URL'] = 'redis://{}:{}{}'.format(config['queue_host'], config['queue_port'], config['queue_path'])
app.config['CELERY_RESULT_BACKEND'] = 'redis://{}:{}{}'.format(config['queue_host'], config['queue_port'], config['queue_path'])
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://{}:{}@{}:{}/{}'.format(config['database_user'], config['database_password'], config['database_host'], config['database_port'], config['database_name'])
if config['debug']:
@ -333,17 +260,27 @@ api.add_resource(API_Logout, '/logout')
# /initialize
class API_Initialize(Resource):
@RequestParser([
{'name': 'yes-i-really-mean-it', 'required': True, 'helptext': "Initialization is destructive; please confirm with the argument 'yes-i-really-mean-it'."}
{'name': 'overwrite', 'required': False},
{'name': 'yes-i-really-mean-it', 'required': True, 'helptext': "Initialization is destructive; please confirm with the argument 'yes-i-really-mean-it'."},
])
@Authenticator
def post(self, reqargs):
"""
Initialize a new PVC cluster
Note: Normally used only once during cluster bootstrap; checks for the existence of the "/primary_node" key before proceeding and returns 400 if found
If the 'overwrite' option is not True, the cluster will return 400 if the `/config/primary_node` key is found. If 'overwrite' is True, the existing cluster
data will be erased and new, empty data written in its place.
All node daemons should be stopped before running this command, and the API daemon started manually to avoid undefined behavior.
---
tags:
- root
parameters:
- in: query
name: overwrite
type: boolean
required: false
description: A flag to enable or disable (default) overwriting existing data
- in: query
name: yes-i-really-mean-it
type: string
@ -362,10 +299,12 @@ class API_Initialize(Resource):
400:
description: Bad request
"""
if api_helper.initialize_cluster():
return {"message": "Successfully initialized a new PVC cluster"}, 200
if reqargs.get('overwrite', 'False') == 'True':
overwrite_flag = True
else:
return {"message": "PVC cluster already initialized"}, 400
overwrite_flag = False
return api_helper.initialize_cluster(overwrite=overwrite_flag)
api.add_resource(API_Initialize, '/initialize')
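A hedged usage sketch of the reworked initialize endpoint with Python requests; the base URL, port, API prefix, and X-Api-Key header below are assumptions for illustration, and the visible parser only requires that yes-i-really-mean-it be present:

# Illustration only: (re)initialize the cluster, overwriting any existing data.
import requests

resp = requests.post(
    "http://pvc.local:7370/api/v1/initialize",    # assumed address/port/prefix
    params={"overwrite": "True", "yes-i-really-mean-it": "yes"},
    headers={"X-Api-Key": "MY_API_TOKEN"},        # only if authentication is enabled
)
print(resp.status_code, resp.json())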
@ -596,6 +535,9 @@ class API_Node_Root(Resource):
domain_state:
type: string
description: The current domain (VM) state
pvc_version:
type: string
description: The current running PVC node daemon version
cpu_count:
type: integer
description: The number of available CPU cores
@ -650,7 +592,7 @@ class API_Node_Root(Resource):
name: limit
type: string
required: false
description: A search limit; fuzzy by default, use ^/$ to force exact matches
description: A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches
- in: query
name: daemon_state
type: string
@ -892,6 +834,52 @@ class API_Node_DomainState(Resource):
api.add_resource(API_Node_DomainState, '/node/<node>/domain-state')
# /node/<node>/log
class API_Node_Log(Resource):
@RequestParser([
{'name': 'lines'}
])
@Authenticator
def get(self, node, reqargs):
"""
Return the recent logs of {node}
---
tags:
- node
parameters:
- in: query
name: lines
type: integer
required: false
description: The number of lines to retrieve
responses:
200:
description: OK
schema:
type: object
id: NodeLog
properties:
name:
type: string
description: The name of the Node
data:
type: string
description: The recent log text
404:
description: Node not found
schema:
type: object
id: Message
"""
return api_helper.node_log(
node,
reqargs.get('lines', None)
)
api.add_resource(API_Node_Log, '/node/<node>/log')
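A short, hedged example of the new node log endpoint via Python requests (connection details and auth header are assumptions, as above):

# Illustration only: fetch the most recent 200 log lines from node hv1.
import requests

resp = requests.get(
    "http://pvc.local:7370/api/v1/node/hv1/log",  # assumed address/port/prefix
    params={"lines": 200},
    headers={"X-Api-Key": "MY_API_TOKEN"},
)
if resp.status_code == 200:
    print(resp.json().get("data", ""))
else:
    print(resp.json())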
##########################################################
# Client API - VM
##########################################################
@ -902,6 +890,7 @@ class API_VM_Root(Resource):
{'name': 'limit'},
{'name': 'node'},
{'name': 'state'},
{'name': 'tag'},
])
@Authenticator
def get(self, reqargs):
@ -950,6 +939,22 @@ class API_VM_Root(Resource):
migration_method:
type: string
description: The preferred migration method (live, shutdown, none)
tags:
type: array
description: The tag(s) of the VM
items:
type: object
id: VMTag
properties:
name:
type: string
description: The name of the tag
type:
type: string
description: The type of the tag (user, system)
protected:
type: boolean
description: Whether the tag is protected or not
description:
type: string
description: The description of the VM
@ -1134,7 +1139,7 @@ class API_VM_Root(Resource):
name: limit
type: string
required: false
description: A name search limit; fuzzy by default, use ^/$ to force exact matches
description: A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches
- in: query
name: node
type: string
@ -1145,6 +1150,11 @@ class API_VM_Root(Resource):
type: string
required: false
description: Limit list to VMs in this state
- in: query
name: tag
type: string
required: false
description: Limit list to VMs with this tag
responses:
200:
description: OK
@ -1156,15 +1166,18 @@ class API_VM_Root(Resource):
return api_helper.vm_list(
reqargs.get('node', None),
reqargs.get('state', None),
reqargs.get('tag', None),
reqargs.get('limit', None)
)
@RequestParser([
{'name': 'limit'},
{'name': 'node'},
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms'), 'helptext': "A valid selector must be specified"},
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms', 'none'), 'helptext': "A valid selector must be specified"},
{'name': 'autostart'},
{'name': 'migration_method', 'choices': ('live', 'shutdown', 'none'), 'helptext': "A valid migration_method must be specified"},
{'name': 'user_tags', 'action': 'append'},
{'name': 'protected_tags', 'action': 'append'},
{'name': 'xml', 'required': True, 'helptext': "A Libvirt XML document must be specified"},
])
@Authenticator
@ -1216,6 +1229,20 @@ class API_VM_Root(Resource):
- live
- shutdown
- none
- in: query
name: user_tags
type: array
required: false
description: The user tag(s) of the VM
items:
type: string
- in: query
name: protected_tags
type: array
required: false
description: The protected user tag(s) of the VM
items:
type: string
responses:
200:
description: OK
@ -1228,13 +1255,22 @@ class API_VM_Root(Resource):
type: object
id: Message
"""
user_tags = reqargs.get('user_tags', None)
if user_tags is None:
user_tags = []
protected_tags = reqargs.get('protected_tags', None)
if protected_tags is None:
protected_tags = []
return api_helper.vm_define(
reqargs.get('xml'),
reqargs.get('node', None),
reqargs.get('limit', None),
reqargs.get('selector', 'mem'),
reqargs.get('selector', 'none'),
bool(strtobool(reqargs.get('autostart', 'false'))),
reqargs.get('migration_method', 'none')
reqargs.get('migration_method', 'none'),
user_tags,
protected_tags
)
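Because user_tags and protected_tags use the 'append' parser action, they are passed as repeated query parameters; a hedged sketch of defining a VM with tags via Python requests follows (connection details, auth, and the myvm.xml path are placeholders):

# Illustration only: define a VM from a Libvirt XML document with two user tags
# and one protected tag; repeated keys map onto the 'append' action above.
import requests

with open("myvm.xml") as f:
    vm_xml = f.read()

resp = requests.post(
    "http://pvc.local:7370/api/v1/vm",            # assumed address/port/prefix
    headers={"X-Api-Key": "MY_API_TOKEN"},
    data={"xml": vm_xml, "selector": "none"},
    params=[
        ("user_tags", "web"),
        ("user_tags", "staging"),
        ("protected_tags", "billing"),
    ],
)
print(resp.status_code, resp.json())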
@ -1261,14 +1297,16 @@ class API_VM_Element(Resource):
type: object
id: Message
"""
return api_helper.vm_list(None, None, vm, is_fuzzy=False)
return api_helper.vm_list(None, None, None, vm, is_fuzzy=False)
@RequestParser([
{'name': 'limit'},
{'name': 'node'},
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms'), 'helptext': "A valid selector must be specified"},
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms', 'none'), 'helptext': "A valid selector must be specified"},
{'name': 'autostart'},
{'name': 'migration_method', 'choices': ('live', 'shutdown', 'none'), 'helptext': "A valid migration_method must be specified"},
{'name': 'user_tags', 'action': 'append'},
{'name': 'protected_tags', 'action': 'append'},
{'name': 'xml', 'required': True, 'helptext': "A Libvirt XML document must be specified"},
])
@Authenticator
@ -1307,6 +1345,7 @@ class API_VM_Element(Resource):
- vcpus
- load
- vms
- none (cluster default)
- in: query
name: autostart
type: boolean
@ -1322,6 +1361,20 @@ class API_VM_Element(Resource):
- live
- shutdown
- none
- in: query
name: user_tags
type: array
required: false
description: The user tag(s) of the VM
items:
type: string
- in: query
name: protected_tags
type: array
required: false
description: The protected user tag(s) of the VM
items:
type: string
responses:
200:
description: OK
@ -1334,13 +1387,22 @@ class API_VM_Element(Resource):
type: object
id: Message
"""
user_tags = reqargs.get('user_tags', None)
if user_tags is None:
user_tags = []
protected_tags = reqargs.get('protected_tags', None)
if protected_tags is None:
protected_tags = []
return api_helper.vm_define(
reqargs.get('xml'),
reqargs.get('node', None),
reqargs.get('limit', None),
reqargs.get('selector', 'mem'),
reqargs.get('selector', 'none'),
bool(strtobool(reqargs.get('autostart', 'false'))),
reqargs.get('migration_method', 'none')
reqargs.get('migration_method', 'none'),
user_tags,
protected_tags
)
@RequestParser([
@ -1458,7 +1520,7 @@ class API_VM_Metadata(Resource):
type: string
description: The preferred migration method (live, shutdown, none)
404:
description: Not found
description: VM not found
schema:
type: object
id: Message
@ -1467,7 +1529,7 @@ class API_VM_Metadata(Resource):
@RequestParser([
{'name': 'limit'},
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms'), 'helptext': "A valid selector must be specified"},
{'name': 'selector', 'choices': ('mem', 'vcpus', 'load', 'vms', 'none'), 'helptext': "A valid selector must be specified"},
{'name': 'autostart'},
{'name': 'profile'},
{'name': 'migration_method', 'choices': ('live', 'shutdown', 'none'), 'helptext': "A valid migration_method must be specified"},
@ -1526,6 +1588,11 @@ class API_VM_Metadata(Resource):
schema:
type: object
id: Message
404:
description: VM not found
schema:
type: object
id: Message
"""
return api_helper.update_vm_meta(
vm,
@ -1540,6 +1607,99 @@ class API_VM_Metadata(Resource):
api.add_resource(API_VM_Metadata, '/vm/<vm>/meta')
# /vm/<vm>/tags
class API_VM_Tags(Resource):
@Authenticator
def get(self, vm):
"""
Return the tags of {vm}
---
tags:
- vm
responses:
200:
description: OK
schema:
type: object
id: VMTags
properties:
name:
type: string
description: The name of the VM
tags:
type: array
description: The tag(s) of the VM
items:
type: object
id: VMTag
404:
description: VM not found
schema:
type: object
id: Message
"""
return api_helper.get_vm_tags(vm)
@RequestParser([
{'name': 'action', 'choices': ('add', 'remove'), 'helptext': "A valid action must be specified"},
{'name': 'tag'},
{'name': 'protected'}
])
@Authenticator
def post(self, vm, reqargs):
"""
Set the tags of {vm}
---
tags:
- vm
parameters:
- in: query
name: action
type: string
required: true
description: The action to perform with the tag
enum:
- add
- remove
- in: query
name: tag
type: string
required: true
description: The text value of the tag
- in: query
name: protected
type: boolean
required: false
default: false
description: Set the protected state of the tag
responses:
200:
description: OK
schema:
type: object
id: Message
400:
description: Bad request
schema:
type: object
id: Message
404:
description: VM not found
schema:
type: object
id: Message
"""
return api_helper.update_vm_tag(
vm,
reqargs.get('action'),
reqargs.get('tag'),
reqargs.get('protected', False)
)
api.add_resource(API_VM_Tags, '/vm/<vm>/tags')
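A hedged sketch of the new tag endpoint (one tag per call, per the parser above); connection details and auth header are again placeholders:

# Illustration only: add a protected tag, drop a user tag, then list the result.
import requests

base = "http://pvc.local:7370/api/v1"             # assumed address/port/prefix
headers = {"X-Api-Key": "MY_API_TOKEN"}

requests.post(base + "/vm/myvm/tags", headers=headers,
              params={"action": "add", "tag": "billing", "protected": "true"})
requests.post(base + "/vm/myvm/tags", headers=headers,
              params={"action": "remove", "tag": "staging"})
print(requests.get(base + "/vm/myvm/tags", headers=headers).json())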
# /vm/<vm>/state
class API_VM_State(Resource):
@Authenticator
@ -1804,6 +1964,45 @@ class API_VM_Console(Resource):
api.add_resource(API_VM_Console, '/vm/<vm>/console')
# /vm/<vm>/rename
class API_VM_Rename(Resource):
@RequestParser([
{'name': 'new_name'}
])
@Authenticator
def post(self, vm, reqargs):
"""
Rename VM {vm}, and all connected disk volumes which include this name, to {new_name}
---
tags:
- vm
parameters:
- in: query
name: new_name
type: string
required: true
description: The new name of the VM
responses:
200:
description: OK
schema:
type: object
id: Message
400:
description: Bad request
schema:
type: object
id: Message
"""
return api_helper.vm_rename(
vm,
reqargs.get('new_name', None)
)
api.add_resource(API_VM_Rename, '/vm/<vm>/rename')
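And a matching sketch for the new rename endpoint, under the same assumptions:

# Illustration only: rename VM "myvm", and the volumes carrying its name, to "newvm".
import requests

resp = requests.post(
    "http://pvc.local:7370/api/v1/vm/myvm/rename",
    params={"new_name": "newvm"},
    headers={"X-Api-Key": "MY_API_TOKEN"},
)
print(resp.status_code, resp.json())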
##########################################################
# Client API - Network
##########################################################
@ -2739,6 +2938,301 @@ class API_Network_ACL_Element(Resource):
api.add_resource(API_Network_ACL_Element, '/network/<vni>/acl/<description>')
##########################################################
# Client API - SR-IOV
##########################################################
# /sriov
class API_SRIOV_Root(Resource):
@Authenticator
def get(self):
pass
api.add_resource(API_SRIOV_Root, '/sriov')
# /sriov/pf
class API_SRIOV_PF_Root(Resource):
@RequestParser([
{'name': 'node', 'required': True, 'helptext': "A valid node must be specified."},
])
@Authenticator
def get(self, reqargs):
"""
Return a list of SR-IOV PFs on a given node
---
tags:
- network / sriov
responses:
200:
description: OK
schema:
type: object
id: sriov_pf
properties:
phy:
type: string
description: The name of the SR-IOV PF device
mtu:
type: string
description: The MTU of the SR-IOV PF device
vfs:
type: array
items:
type: string
description: The PHY name of a VF of this PF
"""
return api_helper.sriov_pf_list(reqargs.get('node'))
api.add_resource(API_SRIOV_PF_Root, '/sriov/pf')
# /sriov/pf/<node>
class API_SRIOV_PF_Node(Resource):
@Authenticator
def get(self, node):
"""
Return a list of SR-IOV PFs on node {node}
---
tags:
- network / sriov
responses:
200:
description: OK
schema:
$ref: '#/definitions/sriov_pf'
"""
return api_helper.sriov_pf_list(node)
api.add_resource(API_SRIOV_PF_Node, '/sriov/pf/<node>')
# /sriov/vf
class API_SRIOV_VF_Root(Resource):
@RequestParser([
{'name': 'node', 'required': True, 'helptext': "A valid node must be specified."},
{'name': 'pf', 'required': False, 'helptext': "A PF parent may be specified."},
])
@Authenticator
def get(self, reqargs):
"""
Return a list of SR-IOV VFs on a given node, optionally limited to those in the specified PF
---
tags:
- network / sriov
responses:
200:
description: OK
schema:
type: object
id: sriov_vf
properties:
phy:
type: string
description: The name of the SR-IOV VF device
pf:
type: string
description: The name of the SR-IOV PF parent of this VF device
mtu:
type: integer
description: The current MTU of the VF device
mac:
type: string
description: The current MAC address of the VF device
config:
type: object
id: sriov_vf_config
properties:
vlan_id:
type: string
description: The tagged vLAN ID of the SR-IOV VF device
vlan_qos:
type: string
description: The QOS group of the tagged vLAN
tx_rate_min:
type: string
description: The minimum TX rate of the SR-IOV VF device
tx_rate_max:
type: string
description: The maximum TX rate of the SR-IOV VF device
spoof_check:
type: boolean
description: Whether device spoof checking is enabled or disabled
link_state:
type: string
description: The current SR-IOV VF link state (either enabled, disabled, or auto)
trust:
type: boolean
description: Whether guest device trust is enabled or disabled
query_rss:
type: boolean
description: Whether VF RSS querying is enabled or disabled
usage:
type: object
id: sriov_vf_usage
properties:
used:
type: boolean
description: Whether the SR-IOV VF is currently used by a VM or not
domain:
type: string
description: The UUID of the domain the SR-IOV VF is currently used by
"""
return api_helper.sriov_vf_list(reqargs.get('node'), reqargs.get('pf', None))
api.add_resource(API_SRIOV_VF_Root, '/sriov/vf')
# /sriov/vf/<node>
class API_SRIOV_VF_Node(Resource):
@RequestParser([
{'name': 'pf', 'required': False, 'helptext': "A PF parent may be specified."},
])
@Authenticator
def get(self, node, reqargs):
"""
Return a list of SR-IOV VFs on node {node}, optionally limited to those in the specified PF
---
tags:
- network / sriov
responses:
200:
description: OK
schema:
$ref: '#/definitions/sriov_vf'
"""
return api_helper.sriov_vf_list(node, reqargs.get('pf', None))
api.add_resource(API_SRIOV_VF_Node, '/sriov/vf/<node>')
# /sriov/vf/<node>/<vf>
class API_SRIOV_VF_Element(Resource):
@Authenticator
def get(self, node, vf):
"""
Return information about {vf} on {node}
---
tags:
- network / sriov
responses:
200:
description: OK
schema:
$ref: '#/definitions/sriov_vf'
404:
description: Not found
schema:
type: object
id: Message
"""
vf_list = list()
full_vf_list, _ = api_helper.sriov_vf_list(node)
for vf_element in full_vf_list:
if vf_element['phy'] == vf:
vf_list.append(vf_element)
if len(vf_list) == 1:
return vf_list, 200
else:
return {'message': "No VF '{}' found on node '{}'".format(vf, node)}, 404
@RequestParser([
{'name': 'vlan_id'},
{'name': 'vlan_qos'},
{'name': 'tx_rate_min'},
{'name': 'tx_rate_max'},
{'name': 'link_state', 'choices': ('auto', 'enable', 'disable'), 'helptext': "A valid state must be specified"},
{'name': 'spoof_check'},
{'name': 'trust'},
{'name': 'query_rss'},
])
@Authenticator
def put(self, node, vf, reqargs):
"""
Set the configuration of {vf} on {node}
---
tags:
- network / sriov
parameters:
- in: query
name: vlan_id
type: integer
required: false
description: The vLAN ID for vLAN tagging (0 is disabled)
- in: query
name: vlan_qos
type: integer
required: false
description: The vLAN QOS priority (0 is disabled)
- in: query
name: tx_rate_min
type: integer
required: false
description: The minimum TX rate (0 is disabled)
- in: query
name: tx_rate_max
type: integer
required: false
description: The maximum TX rate (0 is disabled)
- in: query
name: link_state
type: string
required: false
description: The administrative link state
enum:
- auto
- enable
- disable
- in: query
name: spoof_check
type: boolean
required: false
description: Enable or disable spoof checking
- in: query
name: trust
type: boolean
required: false
description: Enable or disable VF user trust
- in: query
name: query_rss
type: boolean
required: false
description: Enable or disable query RSS support
responses:
200:
description: OK
schema:
type: object
id: Message
400:
description: Bad request
schema:
type: object
id: Message
"""
return api_helper.update_sriov_vf_config(
node,
vf,
reqargs.get('vlan_id', None),
reqargs.get('vlan_qos', None),
reqargs.get('tx_rate_min', None),
reqargs.get('tx_rate_max', None),
reqargs.get('link_state', None),
reqargs.get('spoof_check', None),
reqargs.get('trust', None),
reqargs.get('query_rss', None),
)
api.add_resource(API_SRIOV_VF_Element, '/sriov/vf/<node>/<vf>')
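Finally, a hedged sketch of driving the new SR-IOV VF configuration endpoint; node name, VF name, and connection details are placeholders:

# Illustration only: put VF ens1f0v3 on node hv1 onto vLAN 100, enable trust,
# and leave the administrative link state on auto.
import requests

resp = requests.put(
    "http://pvc.local:7370/api/v1/sriov/vf/hv1/ens1f0v3",  # assumed address/prefix
    params={"vlan_id": 100, "trust": "true", "link_state": "auto"},
    headers={"X-Api-Key": "MY_API_TOKEN"},
)
print(resp.status_code, resp.json())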
##########################################################
# Client API - Storage
##########################################################

File diff suppressed because it is too large


@ -41,6 +41,7 @@ libvirt_header = """<domain type='kvm'>
<bootmenu enable='yes'/>
<boot dev='cdrom'/>
<boot dev='hd'/>
<bios useserial='yes' rebootTimeout='5'/>
</os>
<features>
<acpi/>


@ -77,7 +77,7 @@ class DBNetworkElement(db.Model):
id = db.Column(db.Integer, primary_key=True)
network_template = db.Column(db.Integer, db.ForeignKey("network_template.id"), nullable=False)
vni = db.Column(db.Integer, nullable=False)
vni = db.Column(db.Text, nullable=False)
def __init__(self, network_template, vni):
self.network_template = network_template


@ -30,13 +30,15 @@ import lxml.etree
from werkzeug.formparser import parse_form_data
from pvcapid.Daemon import config
from daemon_lib.zkhandler import ZKConnection
import daemon_lib.common as pvc_common
import daemon_lib.ceph as pvc_ceph
import pvcapid.provisioner as provisioner
config = None # Set in this namespace by flaskapi
#
# Common functions
@ -110,7 +112,8 @@ def list_ova(limit, is_fuzzy=True):
return {'message': 'No OVAs found.'}, 404
def delete_ova(name):
@ZKConnection(config)
def delete_ova(zkhandler, name):
ova_data, retcode = list_ova(name, is_fuzzy=False)
if retcode != 200:
retmsg = {'message': 'The OVA "{}" does not exist.'.format(name)}
@ -127,9 +130,8 @@ def delete_ova(name):
volumes = cur.fetchall()
# Remove each volume for this OVA
zk_conn = pvc_common.startZKConnection(config['coordinators'])
for volume in volumes:
pvc_ceph.remove_volume(zk_conn, volume.get('pool'), volume.get('volume_name'))
pvc_ceph.remove_volume(zkhandler, volume.get('pool'), volume.get('volume_name'))
# Delete the volume entries from the database
query = "DELETE FROM ova_volume WHERE ova = %s;"
@ -160,7 +162,8 @@ def delete_ova(name):
return retmsg, retcode
def upload_ova(pool, name, ova_size):
@ZKConnection(config)
def upload_ova(zkhandler, pool, name, ova_size):
ova_archive = None
# Cleanup function
@ -168,21 +171,17 @@ def upload_ova(pool, name, ova_size):
# Close the OVA archive
if ova_archive:
ova_archive.close()
zk_conn = pvc_common.startZKConnection(config['coordinators'])
# Unmap the OVA temporary blockdev
retflag, retdata = pvc_ceph.unmap_volume(zk_conn, pool, "ova_{}".format(name))
retflag, retdata = pvc_ceph.unmap_volume(zkhandler, pool, "ova_{}".format(name))
# Remove the OVA temporary blockdev
retflag, retdata = pvc_ceph.remove_volume(zk_conn, pool, "ova_{}".format(name))
pvc_common.stopZKConnection(zk_conn)
retflag, retdata = pvc_ceph.remove_volume(zkhandler, pool, "ova_{}".format(name))
# Normalize the OVA size to bytes
ova_size_bytes = pvc_ceph.format_bytes_fromhuman(ova_size)
ova_size = '{}B'.format(ova_size_bytes)
# Verify that the cluster has enough space to store the OVA volumes (2x OVA size, temporarily, 1x permanently)
zk_conn = pvc_common.startZKConnection(config['coordinators'])
pool_information = pvc_ceph.getPoolInformation(zk_conn, pool)
pvc_common.stopZKConnection(zk_conn)
pool_information = pvc_ceph.getPoolInformation(zkhandler, pool)
pool_free_space_bytes = int(pool_information['stats']['free_bytes'])
if ova_size_bytes * 2 >= pool_free_space_bytes:
output = {
@ -196,9 +195,7 @@ def upload_ova(pool, name, ova_size):
return output, retcode
# Create a temporary OVA blockdev
zk_conn = pvc_common.startZKConnection(config['coordinators'])
retflag, retdata = pvc_ceph.add_volume(zk_conn, pool, "ova_{}".format(name), ova_size)
pvc_common.stopZKConnection(zk_conn)
retflag, retdata = pvc_ceph.add_volume(zkhandler, pool, "ova_{}".format(name), ova_size)
if not retflag:
output = {
'message': retdata.replace('\"', '\'')
@ -208,9 +205,7 @@ def upload_ova(pool, name, ova_size):
return output, retcode
# Map the temporary OVA blockdev
zk_conn = pvc_common.startZKConnection(config['coordinators'])
retflag, retdata = pvc_ceph.map_volume(zk_conn, pool, "ova_{}".format(name))
pvc_common.stopZKConnection(zk_conn)
retflag, retdata = pvc_ceph.map_volume(zkhandler, pool, "ova_{}".format(name))
if not retflag:
output = {
'message': retdata.replace('\"', '\'')
@ -276,15 +271,11 @@ def upload_ova(pool, name, ova_size):
dev_size = '{}B'.format(pvc_ceph.format_bytes_fromhuman(dev_size_raw))
def cleanup_img_maps():
zk_conn = pvc_common.startZKConnection(config['coordinators'])
# Unmap the temporary blockdev
retflag, retdata = pvc_ceph.unmap_volume(zk_conn, pool, volume)
pvc_common.stopZKConnection(zk_conn)
retflag, retdata = pvc_ceph.unmap_volume(zkhandler, pool, volume)
# Create the blockdev
zk_conn = pvc_common.startZKConnection(config['coordinators'])
retflag, retdata = pvc_ceph.add_volume(zk_conn, pool, volume, dev_size)
pvc_common.stopZKConnection(zk_conn)
retflag, retdata = pvc_ceph.add_volume(zkhandler, pool, volume, dev_size)
if not retflag:
output = {
'message': retdata.replace('\"', '\'')
@ -295,9 +286,7 @@ def upload_ova(pool, name, ova_size):
return output, retcode
# Map the blockdev
zk_conn = pvc_common.startZKConnection(config['coordinators'])
retflag, retdata = pvc_ceph.map_volume(zk_conn, pool, volume)
pvc_common.stopZKConnection(zk_conn)
retflag, retdata = pvc_ceph.map_volume(zkhandler, pool, volume)
if not retflag:
output = {
'message': retdata.replace('\"', '\'')


@ -24,7 +24,9 @@ import psycopg2
import psycopg2.extras
import re
from distutils.util import strtobool as dustrtobool
from pvcapid.Daemon import config, strtobool
from daemon_lib.zkhandler import ZKHandler
import daemon_lib.common as pvc_common
import daemon_lib.node as pvc_node
@ -36,19 +38,6 @@ import pvcapid.libvirt_schema as libvirt_schema
from pvcapid.ova import list_ova
config = None # Set in this namespace by flaskapi
def strtobool(stringv):
if stringv is None:
return False
if isinstance(stringv, bool):
return bool(stringv)
try:
return bool(dustrtobool(stringv))
except Exception:
return False
#
# Exceptions (used by Celery tasks)
@ -230,6 +219,9 @@ def create_template_system(name, vcpu_count, vram_mb, serial=False, vnc=False, v
retcode = 400
return retmsg, retcode
if node_selector == 'none':
node_selector = None
query = "INSERT INTO system_template (name, vcpu_count, vram_mb, serial, vnc, vnc_bind, node_limit, node_selector, node_autostart, migration_method, ova) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s);"
args = (name, vcpu_count, vram_mb, serial, vnc, vnc_bind, node_limit, node_selector, node_autostart, migration_method, ova)
@ -276,7 +268,7 @@ def create_template_network_element(name, vni):
networks = []
found_vni = False
for network in networks:
if int(network['vni']) == int(vni):
if network['vni'] == vni:
found_vni = True
if found_vni:
retmsg = {'message': 'The VNI "{}" in network template "{}" already exists.'.format(vni, name)}
@ -425,6 +417,9 @@ def modify_template_system(name, vcpu_count=None, vram_mb=None, serial=None, vnc
fields.append({'field': 'node_limit', 'data': node_limit})
if node_selector is not None:
if node_selector == 'none':
node_selector = 'None'
fields.append({'field': 'node_selector', 'data': node_selector})
if node_autostart is not None:
@ -1070,6 +1065,8 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
import datetime
import random
temp_dir = None
time.sleep(2)
print("Starting provisioning of VM '{}' with profile '{}'".format(vm_name, vm_profile))
@ -1078,14 +1075,13 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
try:
db_conn, db_cur = open_database(config)
except Exception:
print('FATAL - failed to connect to Postgres')
raise Exception
raise ClusterError('Failed to connect to Postgres')
try:
zk_conn = pvc_common.startZKConnection(config['coordinators'])
zkhandler = ZKHandler(config)
zkhandler.connect()
except Exception:
print('FATAL - failed to connect to Zookeeper')
raise Exception
raise ClusterError('Failed to connect to Zookeeper')
# Phase 1 - setup
# * Get the profile elements
@ -1187,11 +1183,11 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
time.sleep(1)
# Verify that a VM with this name does not already exist
if pvc_vm.searchClusterByName(zk_conn, vm_name):
if pvc_vm.searchClusterByName(zkhandler, vm_name):
raise ClusterError("A VM with the name '{}' already exists in the cluster.".format(vm_name))
# Verify that at least one host has enough free RAM to run the VM
_discard, nodes = pvc_node.get_list(zk_conn, None)
_discard, nodes = pvc_node.get_list(zkhandler, None)
target_node = None
last_free = 0
for node in nodes:
@ -1212,10 +1208,10 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
print('Selecting target node "{}" with "{}" MB free RAM'.format(target_node, last_free))
# Verify that all configured networks are present on the cluster
cluster_networks, _discard = pvc_network.getClusterNetworkList(zk_conn)
cluster_networks, _discard = pvc_network.getClusterNetworkList(zkhandler)
for network in vm_data['networks']:
vni = str(network['vni'])
if vni not in cluster_networks:
if vni not in cluster_networks and vni not in ['upstream', 'cluster', 'storage']:
raise ClusterError('The network VNI "{}" is not present on the cluster.'.format(vni))
print("All configured networks for VM are valid")
@ -1224,7 +1220,7 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
pools = dict()
for volume in vm_data['volumes']:
if volume.get('source_volume') is not None:
volume_data = pvc_ceph.getVolumeInformation(zk_conn, volume['pool'], volume['source_volume'])
volume_data = pvc_ceph.getVolumeInformation(zkhandler, volume['pool'], volume['source_volume'])
if not volume_data:
raise ClusterError('The source volume {}/{} could not be found.'.format(volume['pool'], volume['source_volume']))
if not volume['pool'] in pools:
@ -1239,7 +1235,7 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
for pool in pools:
try:
pool_information = pvc_ceph.getPoolInformation(zk_conn, pool)
pool_information = pvc_ceph.getPoolInformation(zkhandler, pool)
if not pool_information:
raise
except Exception:
@ -1327,11 +1323,38 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
vm_architecture=system_architecture
)
# Add disk devices
monitor_list = list()
coordinator_names = config['storage_hosts']
for coordinator in coordinator_names:
monitor_list.append("{}.{}".format(coordinator, config['storage_domain']))
ceph_storage_secret = config['ceph_storage_secret_uuid']
for volume in vm_data['volumes']:
vm_schema += libvirt_schema.devices_disk_header.format(
ceph_storage_secret=ceph_storage_secret,
disk_pool=volume['pool'],
vm_name=vm_name,
disk_id=volume['disk_id']
)
for monitor in monitor_list:
vm_schema += libvirt_schema.devices_disk_coordinator.format(
coordinator_name=monitor,
coordinator_ceph_mon_port=config['ceph_monitor_port']
)
vm_schema += libvirt_schema.devices_disk_footer
vm_schema += libvirt_schema.devices_vhostmd
# Add network devices
network_id = 0
for network in vm_data['networks']:
vni = network['vni']
eth_bridge = "vmbr{}".format(vni)
if vni in ['upstream', 'cluster', 'storage']:
eth_bridge = "br{}".format(vni)
else:
eth_bridge = "vmbr{}".format(vni)
vm_id_hex = '{:x}'.format(int(vm_id % 16))
net_id_hex = '{:x}'.format(int(network_id % 16))
@ -1365,30 +1388,6 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
network_id += 1
# Add disk devices
monitor_list = list()
coordinator_names = config['storage_hosts']
for coordinator in coordinator_names:
monitor_list.append("{}.{}".format(coordinator, config['storage_domain']))
ceph_storage_secret = config['ceph_storage_secret_uuid']
for volume in vm_data['volumes']:
vm_schema += libvirt_schema.devices_disk_header.format(
ceph_storage_secret=ceph_storage_secret,
disk_pool=volume['pool'],
vm_name=vm_name,
disk_id=volume['disk_id']
)
for monitor in monitor_list:
vm_schema += libvirt_schema.devices_disk_coordinator.format(
coordinator_name=monitor,
coordinator_ceph_mon_port=config['ceph_monitor_port']
)
vm_schema += libvirt_schema.devices_disk_footer
vm_schema += libvirt_schema.devices_vhostmd
# Add default devices
vm_schema += libvirt_schema.devices_default
@ -1437,7 +1436,7 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
node_selector = vm_data['system_details']['node_selector']
node_autostart = vm_data['system_details']['node_autostart']
migration_method = vm_data['system_details']['migration_method']
retcode, retmsg = pvc_vm.define_vm(zk_conn, vm_schema.strip(), target_node, node_limit, node_selector, node_autostart, migration_method, vm_profile, initial_state='provision')
retcode, retmsg = pvc_vm.define_vm(zkhandler, vm_schema.strip(), target_node, node_limit, node_selector, node_autostart, migration_method, vm_profile, initial_state='provision')
print(retmsg)
else:
print("Skipping VM definition")
@ -1449,12 +1448,12 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
for volume in vm_data['volumes']:
if volume.get('source_volume') is not None:
success, message = pvc_ceph.clone_volume(zk_conn, volume['pool'], volume['source_volume'], "{}_{}".format(vm_name, volume['disk_id']))
success, message = pvc_ceph.clone_volume(zkhandler, volume['pool'], volume['source_volume'], "{}_{}".format(vm_name, volume['disk_id']))
print(message)
if not success:
raise ProvisioningError('Failed to clone volume "{}" to "{}".'.format(volume['source_volume'], volume['disk_id']))
else:
success, message = pvc_ceph.add_volume(zk_conn, volume['pool'], "{}_{}".format(vm_name, volume['disk_id']), "{}G".format(volume['disk_size_gb']))
success, message = pvc_ceph.add_volume(zkhandler, volume['pool'], "{}_{}".format(vm_name, volume['disk_id']), "{}G".format(volume['disk_size_gb']))
print(message)
if not success:
raise ProvisioningError('Failed to create volume "{}".'.format(volume['disk_id']))
@ -1478,11 +1477,11 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
print('Converting {} source volume {} to raw format on {}'.format(volume['volume_format'], src_volume, dst_volume))
# Map the target RBD device
retcode, retmsg = pvc_ceph.map_volume(zk_conn, volume['pool'], dst_volume_name)
retcode, retmsg = pvc_ceph.map_volume(zkhandler, volume['pool'], dst_volume_name)
if not retcode:
raise ProvisioningError('Failed to map destination volume "{}": {}'.format(dst_volume_name, retmsg))
# Map the source RBD device
retcode, retmsg = pvc_ceph.map_volume(zk_conn, volume['pool'], src_volume_name)
retcode, retmsg = pvc_ceph.map_volume(zkhandler, volume['pool'], src_volume_name)
if not retcode:
raise ProvisioningError('Failed to map source volume "{}": {}'.format(src_volume_name, retmsg))
# Convert from source to target
@ -1497,11 +1496,11 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
raise ProvisioningError('Failed to convert {} volume "{}" to raw volume "{}": {}'.format(volume['volume_format'], src_volume, dst_volume, stderr))
# Unmap the source RBD device (don't bother later)
retcode, retmsg = pvc_ceph.unmap_volume(zk_conn, volume['pool'], src_volume_name)
retcode, retmsg = pvc_ceph.unmap_volume(zkhandler, volume['pool'], src_volume_name)
if not retcode:
raise ProvisioningError('Failed to unmap source volume "{}": {}'.format(src_volume_name, retmsg))
# Unmap the target RBD device (don't bother later)
retcode, retmsg = pvc_ceph.unmap_volume(zk_conn, volume['pool'], dst_volume_name)
retcode, retmsg = pvc_ceph.unmap_volume(zkhandler, volume['pool'], dst_volume_name)
if not retcode:
raise ProvisioningError('Failed to unmap destination volume "{}": {}'.format(dst_volume_name, retmsg))
else:
@ -1521,7 +1520,7 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
filesystem_args = ' '.join(filesystem_args_list)
# Map the RBD device
retcode, retmsg = pvc_ceph.map_volume(zk_conn, volume['pool'], dst_volume_name)
retcode, retmsg = pvc_ceph.map_volume(zkhandler, volume['pool'], dst_volume_name)
if not retcode:
raise ProvisioningError('Failed to map volume "{}": {}'.format(dst_volume, retmsg))
@ -1659,8 +1658,10 @@ def create_vm(self, vm_name, vm_profile, define_vm=True, start_vm=True, script_r
if start_vm:
self.update_state(state='RUNNING', meta={'current': 10, 'total': 10, 'status': 'Starting VM'})
time.sleep(1)
retcode, retmsg = pvc_vm.start_vm(zk_conn, vm_name)
retcode, retmsg = pvc_vm.start_vm(zkhandler, vm_name)
print(retmsg)
pvc_common.stopZKConnection(zk_conn)
zkhandler.disconnect()
del zkhandler
return {'status': 'VM "{}" with profile "{}" has been provisioned and started successfully'.format(vm_name, vm_profile), 'current': 10, 'total': 10}


@ -10,9 +10,11 @@ new_ver="${base_ver}~git-$(git rev-parse --short HEAD)"
echo ${new_ver} >&3
# Back up the existing changelog and Daemon.py files
tmpdir=$( mktemp -d )
cp -a debian/changelog node-daemon/pvcnoded/Daemon.py ${tmpdir}/
cp -a debian/changelog client-cli/setup.py ${tmpdir}/
cp -a node-daemon/pvcnoded/Daemon.py ${tmpdir}/node-Daemon.py
cp -a api-daemon/pvcapid/Daemon.py ${tmpdir}/api-Daemon.py
# Replace the "base" version with the git revision version
sed -i "s/version = '${base_ver}'/version = '${new_ver}'/" node-daemon/pvcnoded/Daemon.py
sed -i "s/version = '${base_ver}'/version = '${new_ver}'/" node-daemon/pvcnoded/Daemon.py api-daemon/pvcapid/Daemon.py client-cli/setup.py
sed -i "s/${base_ver}-0/${new_ver}/" debian/changelog
cat <<EOF > debian/changelog
pvc (${new_ver}) unstable; urgency=medium
@ -27,7 +29,10 @@ dh_make -p pvc_${new_ver} --createorig --single --yes
dpkg-buildpackage -us -uc
# Restore original changelog and Daemon.py files
cp -a ${tmpdir}/changelog debian/changelog
cp -a ${tmpdir}/Daemon.py node-daemon/pvcnoded/Daemon.py
cp -a ${tmpdir}/setup.py client-cli/setup.py
cp -a ${tmpdir}/node-Daemon.py node-daemon/pvcnoded/Daemon.py
cp -a ${tmpdir}/api-Daemon.py api-daemon/pvcapid/Daemon.py
# Clean up
rm -r ${tmpdir}
dh_clean


@ -7,7 +7,7 @@ if [[ -z ${new_version} ]]; then
exit 1
fi
current_version="$( grep 'version = ' node-daemon/pvcnoded/Daemon.py | awk -F "'" '{ print $2 }' )"
current_version="$( cat .version )"
echo "${current_version} -> ${new_version}"
changelog_file=$( mktemp )
@ -18,6 +18,8 @@ changelog="$( cat ${changelog_file} | grep -v '^#' | sed 's/^*/ */' )"
sed -i "s,version = '${current_version}',version = '${new_version}'," node-daemon/pvcnoded/Daemon.py
sed -i "s,version = '${current_version}',version = '${new_version}'," api-daemon/pvcapid/Daemon.py
sed -i "s,version='${current_version}',version='${new_version}'," client-cli/setup.py
echo ${new_version} > .version
readme_tmpdir=$( mktemp -d )
cp README.md ${readme_tmpdir}/
@ -47,7 +49,7 @@ echo -e "${deb_changelog_new}" >> ${deb_changelog_file}
echo -e "${deb_changelog_orig}" >> ${deb_changelog_file}
mv ${deb_changelog_file} debian/changelog
git add node-daemon/pvcnoded/Daemon.py api-daemon/pvcapid/Daemon.py README.md docs/index.md debian/changelog
git add node-daemon/pvcnoded/Daemon.py api-daemon/pvcapid/Daemon.py client-cli/setup.py README.md docs/index.md debian/changelog .version
git commit -v
echo


@ -24,8 +24,8 @@ import math
from requests_toolbelt.multipart.encoder import MultipartEncoder, MultipartEncoderMonitor
import cli_lib.ansiprint as ansiprint
from cli_lib.common import UploadProgressBar, call_api
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import UploadProgressBar, call_api
#
# Supplemental functions
@ -419,6 +419,21 @@ def format_list_osd(osd_list):
osd_rddata_length = _osd_rddata_length
# Format the output header
osd_list_output.append('{bold}{osd_header: <{osd_header_length}} {state_header: <{state_header_length}} {details_header: <{details_header_length}} {read_header: <{read_header_length}} {write_header: <{write_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
osd_header_length=osd_id_length + osd_node_length + 1,
state_header_length=osd_up_length + osd_in_length + 1,
details_header_length=osd_size_length + osd_pgs_length + osd_weight_length + osd_reweight_length + osd_used_length + osd_free_length + osd_util_length + osd_var_length + 7,
read_header_length=osd_rdops_length + osd_rddata_length + 1,
write_header_length=osd_wrops_length + osd_wrdata_length + 1,
osd_header='OSDs ' + ''.join(['-' for _ in range(5, osd_id_length + osd_node_length)]),
state_header='State ' + ''.join(['-' for _ in range(6, osd_up_length + osd_in_length)]),
details_header='Details ' + ''.join(['-' for _ in range(8, osd_size_length + osd_pgs_length + osd_weight_length + osd_reweight_length + osd_used_length + osd_free_length + osd_util_length + osd_var_length + 6)]),
read_header='Read ' + ''.join(['-' for _ in range(5, osd_rdops_length + osd_rddata_length)]),
write_header='Write ' + ''.join(['-' for _ in range(6, osd_wrops_length + osd_wrdata_length)]))
)
osd_list_output.append('{bold}\
{osd_id: <{osd_id_length}} \
{osd_node: <{osd_node_length}} \
@ -428,13 +443,13 @@ def format_list_osd(osd_list):
{osd_pgs: <{osd_pgs_length}} \
{osd_weight: <{osd_weight_length}} \
{osd_reweight: <{osd_reweight_length}} \
Sp: {osd_used: <{osd_used_length}} \
{osd_used: <{osd_used_length}} \
{osd_free: <{osd_free_length}} \
{osd_util: <{osd_util_length}} \
{osd_var: <{osd_var_length}} \
Rd: {osd_rdops: <{osd_rdops_length}} \
{osd_rdops: <{osd_rdops_length}} \
{osd_rddata: <{osd_rddata_length}} \
Wr: {osd_wrops: <{osd_wrops_length}} \
{osd_wrops: <{osd_wrops_length}} \
{osd_wrdata: <{osd_wrdata_length}} \
{end_bold}'.format(
bold=ansiprint.bold(),
@ -495,13 +510,13 @@ Wr: {osd_wrops: <{osd_wrops_length}} \
{osd_pgs: <{osd_pgs_length}} \
{osd_weight: <{osd_weight_length}} \
{osd_reweight: <{osd_reweight_length}} \
{osd_used: <{osd_used_length}} \
{osd_used: <{osd_used_length}} \
{osd_free: <{osd_free_length}} \
{osd_util: <{osd_util_length}} \
{osd_var: <{osd_var_length}} \
{osd_rdops: <{osd_rdops_length}} \
{osd_rdops: <{osd_rdops_length}} \
{osd_rddata: <{osd_rddata_length}} \
{osd_wrops: <{osd_wrops_length}} \
{osd_wrops: <{osd_wrops_length}} \
{osd_wrdata: <{osd_wrdata_length}} \
{end_bold}'.format(
bold='',
@ -648,7 +663,7 @@ def format_list_pool(pool_list):
pool_name_length = 5
pool_id_length = 3
pool_used_length = 5
pool_usedpct_length = 5
pool_usedpct_length = 6
pool_free_length = 5
pool_num_objects_length = 6
pool_num_clones_length = 7
@ -737,19 +752,32 @@ def format_list_pool(pool_list):
pool_read_data_length = _pool_read_data_length
# Format the output header
pool_list_output.append('{bold}{pool_header: <{pool_header_length}} {objects_header: <{objects_header_length}} {read_header: <{read_header_length}} {write_header: <{write_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
pool_header_length=pool_id_length + pool_name_length + pool_used_length + pool_usedpct_length + pool_free_length + 4,
objects_header_length=pool_num_objects_length + pool_num_clones_length + pool_num_copies_length + pool_num_degraded_length + 3,
read_header_length=pool_read_ops_length + pool_read_data_length + 1,
write_header_length=pool_write_ops_length + pool_write_data_length + 1,
pool_header='Pools ' + ''.join(['-' for _ in range(6, pool_id_length + pool_name_length + pool_used_length + pool_usedpct_length + pool_free_length + 3)]),
objects_header='Objects ' + ''.join(['-' for _ in range(8, pool_num_objects_length + pool_num_clones_length + pool_num_copies_length + pool_num_degraded_length + 2)]),
read_header='Read ' + ''.join(['-' for _ in range(5, pool_read_ops_length + pool_read_data_length)]),
write_header='Write ' + ''.join(['-' for _ in range(6, pool_write_ops_length + pool_write_data_length)]))
)
pool_list_output.append('{bold}\
{pool_id: <{pool_id_length}} \
{pool_name: <{pool_name_length}} \
{pool_used: <{pool_used_length}} \
{pool_usedpct: <{pool_usedpct_length}} \
{pool_free: <{pool_free_length}} \
Obj: {pool_objects: <{pool_objects_length}} \
{pool_objects: <{pool_objects_length}} \
{pool_clones: <{pool_clones_length}} \
{pool_copies: <{pool_copies_length}} \
{pool_degraded: <{pool_degraded_length}} \
Rd: {pool_read_ops: <{pool_read_ops_length}} \
{pool_read_ops: <{pool_read_ops_length}} \
{pool_read_data: <{pool_read_data_length}} \
Wr: {pool_write_ops: <{pool_write_ops_length}} \
{pool_write_ops: <{pool_write_ops_length}} \
{pool_write_data: <{pool_write_data_length}} \
{end_bold}'.format(
bold=ansiprint.bold(),
@ -770,7 +798,7 @@ Wr: {pool_write_ops: <{pool_write_ops_length}} \
pool_id='ID',
pool_name='Name',
pool_used='Used',
pool_usedpct='%',
pool_usedpct='Used%',
pool_free='Free',
pool_objects='Count',
pool_clones='Clones',
@ -790,13 +818,13 @@ Wr: {pool_write_ops: <{pool_write_ops_length}} \
{pool_used: <{pool_used_length}} \
{pool_usedpct: <{pool_usedpct_length}} \
{pool_free: <{pool_free_length}} \
{pool_objects: <{pool_objects_length}} \
{pool_objects: <{pool_objects_length}} \
{pool_clones: <{pool_clones_length}} \
{pool_copies: <{pool_copies_length}} \
{pool_degraded: <{pool_degraded_length}} \
{pool_read_ops: <{pool_read_ops_length}} \
{pool_read_ops: <{pool_read_ops_length}} \
{pool_read_data: <{pool_read_data_length}} \
{pool_write_ops: <{pool_write_ops_length}} \
{pool_write_ops: <{pool_write_ops_length}} \
{pool_write_data: <{pool_write_data_length}} \
{end_bold}'.format(
bold='',
@ -1057,6 +1085,15 @@ def format_list_volume(volume_list):
volume_features_length = _volume_features_length
# Format the output header
volume_list_output.append('{bold}{volume_header: <{volume_header_length}} {details_header: <{details_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
volume_header_length=volume_name_length + volume_pool_length + 1,
details_header_length=volume_size_length + volume_objects_length + volume_order_length + volume_format_length + volume_features_length + 4,
volume_header='Volumes ' + ''.join(['-' for _ in range(8, volume_name_length + volume_pool_length)]),
details_header='Details ' + ''.join(['-' for _ in range(8, volume_size_length + volume_objects_length + volume_order_length + volume_format_length + volume_features_length + 3)]))
)
volume_list_output.append('{bold}\
{volume_name: <{volume_name_length}} \
{volume_pool: <{volume_pool_length}} \
@ -1084,7 +1121,7 @@ def format_list_volume(volume_list):
volume_features='Features')
)
for volume_information in volume_list:
for volume_information in sorted(volume_list, key=lambda v: v['pool'] + v['name']):
volume_list_output.append('{bold}\
{volume_name: <{volume_name_length}} \
{volume_pool: <{volume_pool_length}} \
@ -1112,7 +1149,7 @@ def format_list_volume(volume_list):
volume_features=','.join(volume_information['stats']['features']))
)
return '\n'.join(sorted(volume_list_output))
return '\n'.join(volume_list_output)
#
@ -1263,6 +1300,13 @@ def format_list_snapshot(snapshot_list):
snapshot_pool_length = _snapshot_pool_length
# Format the output header
snapshot_list_output.append('{bold}{snapshot_header: <{snapshot_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
snapshot_header_length=snapshot_name_length + snapshot_volume_length + snapshot_pool_length + 2,
snapshot_header='Snapshots ' + ''.join(['-' for _ in range(10, snapshot_name_length + snapshot_volume_length + snapshot_pool_length + 1)]))
)
snapshot_list_output.append('{bold}\
{snapshot_name: <{snapshot_name_length}} \
{snapshot_volume: <{snapshot_volume_length}} \
@ -1278,7 +1322,7 @@ def format_list_snapshot(snapshot_list):
snapshot_pool='Pool')
)
for snapshot_information in snapshot_list:
for snapshot_information in sorted(snapshot_list, key=lambda s: s['pool'] + s['volume'] + s['snapshot']):
snapshot_name = snapshot_information['snapshot']
snapshot_volume = snapshot_information['volume']
snapshot_pool = snapshot_information['pool']
@ -1297,7 +1341,7 @@ def format_list_snapshot(snapshot_list):
snapshot_pool=snapshot_pool)
)
return '\n'.join(sorted(snapshot_list_output))
return '\n'.join(snapshot_list_output)
#
@ -1365,6 +1409,11 @@ def format_list_benchmark(config, benchmark_information):
benchmark_bandwidth_length[test] = 7
benchmark_iops_length[test] = 6
benchmark_seq_bw_length = 15
benchmark_seq_iops_length = 10
benchmark_rand_bw_length = 15
benchmark_rand_iops_length = 10
for benchmark in benchmark_information:
benchmark_job = benchmark['job']
_benchmark_job_length = len(benchmark_job)
@ -1373,53 +1422,68 @@ def format_list_benchmark(config, benchmark_information):
if benchmark['benchmark_result'] == 'Running':
continue
benchmark_data = json.loads(benchmark['benchmark_result'])
benchmark_bandwidth = dict()
benchmark_iops = dict()
for test in ["seq_read", "seq_write", "rand_read_4K", "rand_write_4K"]:
benchmark_data = json.loads(benchmark['benchmark_result'])
benchmark_bandwidth[test] = format_bytes_tohuman(int(benchmark_data[test]['overall']['bandwidth']) * 1024)
benchmark_iops[test] = format_ops_tohuman(int(benchmark_data[test]['overall']['iops']))
_benchmark_bandwidth_length = len(benchmark_bandwidth[test]) + 1
if _benchmark_bandwidth_length > benchmark_bandwidth_length[test]:
benchmark_bandwidth_length[test] = _benchmark_bandwidth_length
seq_benchmark_bandwidth = "{} / {}".format(benchmark_bandwidth['seq_read'], benchmark_bandwidth['seq_write'])
seq_benchmark_iops = "{} / {}".format(benchmark_iops['seq_read'], benchmark_iops['seq_write'])
rand_benchmark_bandwidth = "{} / {}".format(benchmark_bandwidth['rand_read_4K'], benchmark_bandwidth['rand_write_4K'])
rand_benchmark_iops = "{} / {}".format(benchmark_iops['rand_read_4K'], benchmark_iops['rand_write_4K'])
_benchmark_iops_length = len(benchmark_iops[test]) + 1
if _benchmark_iops_length > benchmark_bandwidth_length[test]:
benchmark_iops_length[test] = _benchmark_iops_length
_benchmark_seq_bw_length = len(seq_benchmark_bandwidth) + 1
if _benchmark_seq_bw_length > benchmark_seq_bw_length:
benchmark_seq_bw_length = _benchmark_seq_bw_length
_benchmark_seq_iops_length = len(seq_benchmark_iops) + 1
if _benchmark_seq_iops_length > benchmark_seq_iops_length:
benchmark_seq_iops_length = _benchmark_seq_iops_length
_benchmark_rand_bw_length = len(rand_benchmark_bandwidth) + 1
if _benchmark_rand_bw_length > benchmark_rand_bw_length:
benchmark_rand_bw_length = _benchmark_rand_bw_length
_benchmark_rand_iops_length = len(rand_benchmark_iops) + 1
if _benchmark_rand_iops_length > benchmark_rand_iops_length:
benchmark_rand_iops_length = _benchmark_rand_iops_length
# Format the output header line 1
benchmark_list_output.append('{bold}\
{benchmark_job: <{benchmark_job_length}} \
{seq_header: <{seq_header_length}} \
{rand_header: <{rand_header_length}} \
{benchmark_job: <{benchmark_job_length}} \
{seq_header: <{seq_header_length}} \
{rand_header: <{rand_header_length}}\
{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
benchmark_job_length=benchmark_job_length,
seq_header_length=benchmark_bandwidth_length['seq_read'] + benchmark_bandwidth_length['seq_write'] + benchmark_iops_length['seq_read'] + benchmark_iops_length['seq_write'] + 3,
rand_header_length=benchmark_bandwidth_length['rand_read_4K'] + benchmark_bandwidth_length['rand_write_4K'] + benchmark_iops_length['rand_read_4K'] + benchmark_iops_length['rand_write_4K'] + 2,
benchmark_job='Benchmark Job',
seq_header='Sequential (4M blocks):',
rand_header='Random (4K blocks):')
seq_header_length=benchmark_seq_bw_length + benchmark_seq_iops_length + 1,
rand_header_length=benchmark_rand_bw_length + benchmark_rand_iops_length + 1,
benchmark_job='Benchmarks ' + ''.join(['-' for _ in range(11, benchmark_job_length - 1)]),
seq_header='Sequential (4M blocks) ' + ''.join(['-' for _ in range(23, benchmark_seq_bw_length + benchmark_seq_iops_length)]),
rand_header='Random (4K blocks) ' + ''.join(['-' for _ in range(19, benchmark_rand_bw_length + benchmark_rand_iops_length)]))
)
benchmark_list_output.append('{bold}\
{benchmark_job: <{benchmark_job_length}} \
{seq_benchmark_bandwidth: <{seq_benchmark_bandwidth_length}} \
{seq_benchmark_iops: <{seq_benchmark_iops_length}} \
{benchmark_job: <{benchmark_job_length}} \
{seq_benchmark_bandwidth: <{seq_benchmark_bandwidth_length}} \
{seq_benchmark_iops: <{seq_benchmark_iops_length}} \
{rand_benchmark_bandwidth: <{rand_benchmark_bandwidth_length}} \
{rand_benchmark_iops: <{rand_benchmark_iops_length}} \
{rand_benchmark_iops: <{rand_benchmark_iops_length}}\
{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
benchmark_job_length=benchmark_job_length,
seq_benchmark_bandwidth_length=benchmark_bandwidth_length['seq_read'] + benchmark_bandwidth_length['seq_write'] + 2,
seq_benchmark_iops_length=benchmark_iops_length['seq_read'] + benchmark_iops_length['seq_write'],
rand_benchmark_bandwidth_length=benchmark_bandwidth_length['rand_read_4K'] + benchmark_bandwidth_length['rand_write_4K'] + 1,
rand_benchmark_iops_length=benchmark_iops_length['rand_read_4K'] + benchmark_iops_length['rand_write_4K'],
benchmark_job='',
seq_benchmark_bandwidth_length=benchmark_seq_bw_length,
seq_benchmark_iops_length=benchmark_seq_iops_length,
rand_benchmark_bandwidth_length=benchmark_rand_bw_length,
rand_benchmark_iops_length=benchmark_rand_iops_length,
benchmark_job='Job',
seq_benchmark_bandwidth='R/W Bandwidth/s',
seq_benchmark_iops='R/W IOPS',
rand_benchmark_bandwidth='R/W Bandwidth/s',
@ -1448,19 +1512,19 @@ def format_list_benchmark(config, benchmark_information):
rand_benchmark_iops = "{} / {}".format(benchmark_iops['rand_read_4K'], benchmark_iops['rand_write_4K'])
benchmark_list_output.append('{bold}\
{benchmark_job: <{benchmark_job_length}} \
{seq_benchmark_bandwidth: <{seq_benchmark_bandwidth_length}} \
{seq_benchmark_iops: <{seq_benchmark_iops_length}} \
{benchmark_job: <{benchmark_job_length}} \
{seq_benchmark_bandwidth: <{seq_benchmark_bandwidth_length}} \
{seq_benchmark_iops: <{seq_benchmark_iops_length}} \
{rand_benchmark_bandwidth: <{rand_benchmark_bandwidth_length}} \
{rand_benchmark_iops: <{rand_benchmark_iops_length}} \
{rand_benchmark_iops: <{rand_benchmark_iops_length}}\
{end_bold}'.format(
bold='',
end_bold='',
benchmark_job_length=benchmark_job_length,
seq_benchmark_bandwidth_length=benchmark_bandwidth_length['seq_read'] + benchmark_bandwidth_length['seq_write'] + 2,
seq_benchmark_iops_length=benchmark_iops_length['seq_read'] + benchmark_iops_length['seq_write'],
rand_benchmark_bandwidth_length=benchmark_bandwidth_length['rand_read_4K'] + benchmark_bandwidth_length['rand_write_4K'] + 1,
rand_benchmark_iops_length=benchmark_iops_length['rand_read_4K'] + benchmark_iops_length['rand_write_4K'],
seq_benchmark_bandwidth_length=benchmark_seq_bw_length,
seq_benchmark_iops_length=benchmark_seq_iops_length,
rand_benchmark_bandwidth_length=benchmark_rand_bw_length,
rand_benchmark_iops_length=benchmark_rand_iops_length,
benchmark_job=benchmark_job,
seq_benchmark_bandwidth=seq_benchmark_bandwidth,
seq_benchmark_iops=seq_benchmark_iops,

View File

@ -21,20 +21,21 @@
import json
import cli_lib.ansiprint as ansiprint
from cli_lib.common import call_api
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import call_api
def initialize(config):
def initialize(config, overwrite=False):
"""
Initialize the PVC cluster
API endpoint: GET /api/v1/initialize
API arguments: yes-i-really-mean-it
API arguments: overwrite, yes-i-really-mean-it
API schema: {json_data_object}
"""
params = {
'yes-i-really-mean-it': 'yes'
'yes-i-really-mean-it': 'yes',
'overwrite': overwrite
}
response = call_api(config, 'post', '/initialize', params=params)
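# A minimal usage sketch, assuming a loaded CLI `config` dict and that this wrapper
# follows the same (status, message) return convention as the other API calls in
# this file; overwrite=True re-initializes an already-initialized cluster:
retflag, retmsg = initialize(config, overwrite=False)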

View File

@ -20,8 +20,8 @@
###############################################################################
import re
import cli_lib.ansiprint as ansiprint
from cli_lib.common import call_api
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import call_api
def isValidMAC(macaddr):
@ -360,7 +360,6 @@ def net_acl_add(config, net, direction, description, rule, order):
def net_acl_remove(config, net, description):
"""
Remove a network ACL
@ -378,27 +377,135 @@ def net_acl_remove(config, net, description):
return retstatus, response.json().get('message', '')
#
# SR-IOV functions
#
def net_sriov_pf_list(config, node):
"""
List all PFs on NODE
API endpoint: GET /api/v1/sriov/pf/<node>
API arguments: node={node}
API schema: [{json_data_object},{json_data_object},etc.]
"""
response = call_api(config, 'get', '/sriov/pf/{}'.format(node))
if response.status_code == 200:
return True, response.json()
else:
return False, response.json().get('message', '')
def net_sriov_vf_set(config, node, vf, vlan_id, vlan_qos, tx_rate_min, tx_rate_max, link_state, spoof_check, trust, query_rss):
"""
Modify the configuration of an SR-IOV VF
API endpoint: PUT /api/v1/sriov/vf/<node>/<vf>
API arguments: vlan_id={vlan_id}, vlan_qos={vlan_qos}, tx_rate_min={tx_rate_min}, tx_rate_max={tx_rate_max},
link_state={link_state}, spoof_check={spoof_check}, trust={trust}, query_rss={query_rss}
API schema: {"message": "{data}"}
"""
params = dict()
# Update any params that we've sent
if vlan_id is not None:
params['vlan_id'] = vlan_id
if vlan_qos is not None:
params['vlan_qos'] = vlan_qos
if tx_rate_min is not None:
params['tx_rate_min'] = tx_rate_min
if tx_rate_max is not None:
params['tx_rate_max'] = tx_rate_max
if link_state is not None:
params['link_state'] = link_state
if spoof_check is not None:
params['spoof_check'] = spoof_check
if trust is not None:
params['trust'] = trust
if query_rss is not None:
params['query_rss'] = query_rss
# Write the new configuration to the API
response = call_api(config, 'put', '/sriov/vf/{node}/{vf}'.format(node=node, vf=vf), params=params)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get('message', '')
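# A minimal usage sketch with hypothetical node and VF names, assuming a loaded
# CLI `config` dict; any parameter left as None is simply not sent to the API:
retflag, retmsg = net_sriov_vf_set(
    config, 'hv1', 'ens1f0v3',
    vlan_id=100, vlan_qos=None,
    tx_rate_min=None, tx_rate_max=None,
    link_state='up', spoof_check=None,
    trust=None, query_rss=None)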
def net_sriov_vf_list(config, node, pf=None):
"""
List all VFs on NODE, optionally limited by PF
API endpoint: GET /api/v1/sriov/vf/<node>
API arguments: node={node}, pf={pf}
API schema: [{json_data_object},{json_data_object},etc.]
"""
params = dict()
params['pf'] = pf
response = call_api(config, 'get', '/sriov/vf/{}'.format(node), params=params)
if response.status_code == 200:
return True, response.json()
else:
return False, response.json().get('message', '')
def net_sriov_vf_info(config, node, vf):
"""
Get info about VF on NODE
API endpoint: GET /api/v1/sriov/vf/<node>/<vf>
API arguments:
API schema: [{json_data_object}]
"""
response = call_api(config, 'get', '/sriov/vf/{}/{}'.format(node, vf))
if response.status_code == 200:
if isinstance(response.json(), list) and len(response.json()) != 1:
# No exact match; return not found
return False, "VF not found."
else:
# Return a single instance if the response is a list
if isinstance(response.json(), list):
return True, response.json()[0]
# This shouldn't happen, but is here just in case
else:
return True, response.json()
else:
return False, response.json().get('message', '')
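# A minimal sketch chaining the two read calls above, with a hypothetical node and
# PF name; the list call returns VF dicts whose 'phy' key names the VF device:
ok, vfs = net_sriov_vf_list(config, 'hv1', pf='ens1f0')
if ok and vfs:
    ok, vf_detail = net_sriov_vf_info(config, 'hv1', vfs[0]['phy'])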
#
# Output display functions
#
def getOutputColours(network_information):
if network_information['ip6']['network'] != "None":
v6_flag_colour = ansiprint.green()
def getColour(value):
if value in ['True', "start"]:
return ansiprint.green()
elif value in ["restart", "shutdown"]:
return ansiprint.yellow()
elif value in ["stop", "fail"]:
return ansiprint.red()
else:
v6_flag_colour = ansiprint.blue()
if network_information['ip4']['network'] != "None":
v4_flag_colour = ansiprint.green()
else:
v4_flag_colour = ansiprint.blue()
return ansiprint.blue()
if network_information['ip6']['dhcp_flag'] == "True":
dhcp6_flag_colour = ansiprint.green()
else:
dhcp6_flag_colour = ansiprint.blue()
if network_information['ip4']['dhcp_flag'] == "True":
dhcp4_flag_colour = ansiprint.green()
else:
dhcp4_flag_colour = ansiprint.blue()
def getOutputColours(network_information):
v6_flag_colour = getColour(network_information['ip6']['network'])
v4_flag_colour = getColour(network_information['ip4']['network'])
dhcp6_flag_colour = getColour(network_information['ip6']['dhcp_flag'])
dhcp4_flag_colour = getColour(network_information['ip4']['dhcp_flag'])
return v6_flag_colour, v4_flag_colour, dhcp6_flag_colour, dhcp4_flag_colour
@ -492,6 +599,14 @@ def format_list(config, network_list):
net_domain_length = _net_domain_length
# Format the string (header)
network_list_output.append('{bold}{networks_header: <{networks_header_length}} {config_header: <{config_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
networks_header_length=net_vni_length + net_description_length + 1,
config_header_length=net_nettype_length + net_domain_length + net_v6_flag_length + net_dhcp6_flag_length + net_v4_flag_length + net_dhcp4_flag_length + 6,
networks_header='Networks ' + ''.join(['-' for _ in range(9, net_vni_length + net_description_length)]),
config_header='Config ' + ''.join(['-' for _ in range(7, net_nettype_length + net_domain_length + net_v6_flag_length + net_dhcp6_flag_length + net_v4_flag_length + net_dhcp4_flag_length + 5)]))
)
network_list_output.append('{bold}\
{net_vni: <{net_vni_length}} \
{net_description: <{net_description_length}} \
@ -522,7 +637,7 @@ def format_list(config, network_list):
net_dhcp4_flag='DHCPv4')
)
for network_information in network_list:
for network_information in sorted(network_list, key=lambda n: int(n['vni'])):
v6_flag_colour, v4_flag_colour, dhcp6_flag_colour, dhcp4_flag_colour = getOutputColours(network_information)
if network_information['ip4']['network'] != "None":
v4_flag = 'True'
@ -569,7 +684,7 @@ def format_list(config, network_list):
colour_off=ansiprint.end())
)
return '\n'.join(sorted(network_list_output))
return '\n'.join(network_list_output)
def format_list_dhcp(dhcp_lease_list):
@ -579,7 +694,7 @@ def format_list_dhcp(dhcp_lease_list):
lease_hostname_length = 9
lease_ip4_address_length = 11
lease_mac_address_length = 13
lease_timestamp_length = 13
lease_timestamp_length = 10
for dhcp_lease_information in dhcp_lease_list:
# hostname column
_lease_hostname_length = len(str(dhcp_lease_information['hostname'])) + 1
@ -593,8 +708,19 @@ def format_list_dhcp(dhcp_lease_list):
_lease_mac_address_length = len(str(dhcp_lease_information['mac_address'])) + 1
if _lease_mac_address_length > lease_mac_address_length:
lease_mac_address_length = _lease_mac_address_length
# timestamp column
_lease_timestamp_length = len(str(dhcp_lease_information['timestamp'])) + 1
if _lease_timestamp_length > lease_timestamp_length:
lease_timestamp_length = _lease_timestamp_length
# Format the string (header)
dhcp_lease_list_output.append('{bold}{lease_header: <{lease_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
lease_header_length=lease_hostname_length + lease_ip4_address_length + lease_mac_address_length + lease_timestamp_length + 3,
lease_header='Leases ' + ''.join(['-' for _ in range(7, lease_hostname_length + lease_ip4_address_length + lease_mac_address_length + lease_timestamp_length + 2)]))
)
dhcp_lease_list_output.append('{bold}\
{lease_hostname: <{lease_hostname_length}} \
{lease_ip4_address: <{lease_ip4_address_length}} \
@ -613,7 +739,7 @@ def format_list_dhcp(dhcp_lease_list):
lease_timestamp='Timestamp')
)
for dhcp_lease_information in dhcp_lease_list:
for dhcp_lease_information in sorted(dhcp_lease_list, key=lambda l: l['hostname']):
dhcp_lease_list_output.append('{bold}\
{lease_hostname: <{lease_hostname_length}} \
{lease_ip4_address: <{lease_ip4_address_length}} \
@ -632,7 +758,7 @@ def format_list_dhcp(dhcp_lease_list):
lease_timestamp=str(dhcp_lease_information['timestamp']))
)
return '\n'.join(sorted(dhcp_lease_list_output))
return '\n'.join(dhcp_lease_list_output)
def format_list_acl(acl_list):
@ -662,6 +788,13 @@ def format_list_acl(acl_list):
acl_rule_length = _acl_rule_length
# Format the string (header)
acl_list_output.append('{bold}{acl_header: <{acl_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
acl_header_length=acl_direction_length + acl_order_length + acl_description_length + acl_rule_length + 3,
acl_header='ACLs ' + ''.join(['-' for _ in range(5, acl_direction_length + acl_order_length + acl_description_length + acl_rule_length + 2)]))
)
acl_list_output.append('{bold}\
{acl_direction: <{acl_direction_length}} \
{acl_order: <{acl_order_length}} \
@ -680,7 +813,7 @@ def format_list_acl(acl_list):
acl_rule='Rule')
)
for acl_information in acl_list:
for acl_information in sorted(acl_list, key=lambda l: l['direction'] + str(l['order'])):
acl_list_output.append('{bold}\
{acl_direction: <{acl_direction_length}} \
{acl_order: <{acl_order_length}} \
@ -699,4 +832,264 @@ def format_list_acl(acl_list):
acl_rule=acl_information['rule'])
)
return '\n'.join(sorted(acl_list_output))
return '\n'.join(acl_list_output)
def format_list_sriov_pf(pf_list):
# The maximum column width of the VFs column
max_vfs_length = 70
# Handle when we get an empty entry
if not pf_list:
pf_list = list()
pf_list_output = []
# Determine optimal column widths
pf_phy_length = 6
pf_mtu_length = 4
pf_vfs_length = 4
for pf_information in pf_list:
# phy column
_pf_phy_length = len(str(pf_information['phy'])) + 1
if _pf_phy_length > pf_phy_length:
pf_phy_length = _pf_phy_length
# mtu column
_pf_mtu_length = len(str(pf_information['mtu'])) + 1
if _pf_mtu_length > pf_mtu_length:
pf_mtu_length = _pf_mtu_length
# vfs column
_pf_vfs_length = len(str(', '.join(pf_information['vfs']))) + 1
if _pf_vfs_length > pf_vfs_length:
pf_vfs_length = _pf_vfs_length
# We handle columnizing very long lists later
if pf_vfs_length > max_vfs_length:
pf_vfs_length = max_vfs_length
# Format the string (header)
pf_list_output.append('{bold}{pf_header: <{pf_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
pf_header_length=pf_phy_length + pf_mtu_length + pf_vfs_length + 2,
pf_header='PFs ' + ''.join(['-' for _ in range(4, pf_phy_length + pf_mtu_length + pf_vfs_length + 1)]))
)
pf_list_output.append('{bold}\
{pf_phy: <{pf_phy_length}} \
{pf_mtu: <{pf_mtu_length}} \
{pf_vfs: <{pf_vfs_length}} \
{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
pf_phy_length=pf_phy_length,
pf_mtu_length=pf_mtu_length,
pf_vfs_length=pf_vfs_length,
pf_phy='Device',
pf_mtu='MTU',
pf_vfs='VFs')
)
for pf_information in sorted(pf_list, key=lambda p: p['phy']):
# Figure out how to nicely columnize our list
nice_vfs_list = [list()]
vfs_lines = 0
cur_vfs_length = 0
for vfs in pf_information['vfs']:
vfs_len = len(vfs)
cur_vfs_length += vfs_len + 2 # for the comma and space
if cur_vfs_length > max_vfs_length:
cur_vfs_length = 0
vfs_lines += 1
nice_vfs_list.append(list())
nice_vfs_list[vfs_lines].append(vfs)
# Append the lines
pf_list_output.append('{bold}\
{pf_phy: <{pf_phy_length}} \
{pf_mtu: <{pf_mtu_length}} \
{pf_vfs: <{pf_vfs_length}} \
{end_bold}'.format(
bold='',
end_bold='',
pf_phy_length=pf_phy_length,
pf_mtu_length=pf_mtu_length,
pf_vfs_length=pf_vfs_length,
pf_phy=pf_information['phy'],
pf_mtu=pf_information['mtu'],
pf_vfs=', '.join(nice_vfs_list[0]))
)
if len(nice_vfs_list) > 1:
for idx in range(1, len(nice_vfs_list)):
pf_list_output.append('{bold}\
{pf_phy: <{pf_phy_length}} \
{pf_mtu: <{pf_mtu_length}} \
{pf_vfs: <{pf_vfs_length}} \
{end_bold}'.format(
bold='',
end_bold='',
pf_phy_length=pf_phy_length,
pf_mtu_length=pf_mtu_length,
pf_vfs_length=pf_vfs_length,
pf_phy='',
pf_mtu='',
pf_vfs=', '.join(nice_vfs_list[idx]))
)
return '\n'.join(pf_list_output)
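# A minimal standalone sketch of the VF column wrapping above, using hypothetical
# VF names and a 20-character column instead of the real 70-character maximum:
max_vfs_length = 20
rows, cur_len = [[]], 0
for vf in ['ens1f0v0', 'ens1f0v1', 'ens1f0v2', 'ens1f0v3']:
    cur_len += len(vf) + 2          # account for the ", " separator
    if cur_len > max_vfs_length:    # wrap to a new display row when the column fills
        rows.append([])
        cur_len = 0
    rows[-1].append(vf)
# rows -> [['ens1f0v0', 'ens1f0v1'], ['ens1f0v2', 'ens1f0v3']]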
def format_list_sriov_vf(vf_list):
# Handle when we get an empty entry
if not vf_list:
vf_list = list()
vf_list_output = []
# Determine optimal column widths
vf_phy_length = 4
vf_pf_length = 3
vf_mtu_length = 4
vf_mac_length = 11
vf_used_length = 5
vf_domain_length = 5
for vf_information in vf_list:
# phy column
_vf_phy_length = len(str(vf_information['phy'])) + 1
if _vf_phy_length > vf_phy_length:
vf_phy_length = _vf_phy_length
# pf column
_vf_pf_length = len(str(vf_information['pf'])) + 1
if _vf_pf_length > vf_pf_length:
vf_pf_length = _vf_pf_length
# mtu column
_vf_mtu_length = len(str(vf_information['mtu'])) + 1
if _vf_mtu_length > vf_mtu_length:
vf_mtu_length = _vf_mtu_length
# mac column
_vf_mac_length = len(str(vf_information['mac'])) + 1
if _vf_mac_length > vf_mac_length:
vf_mac_length = _vf_mac_length
# used column
_vf_used_length = len(str(vf_information['usage']['used'])) + 1
if _vf_used_length > vf_used_length:
vf_used_length = _vf_used_length
# domain column
_vf_domain_length = len(str(vf_information['usage']['domain'])) + 1
if _vf_domain_length > vf_domain_length:
vf_domain_length = _vf_domain_length
# Format the string (header)
vf_list_output.append('{bold}{vf_header: <{vf_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
vf_header_length=vf_phy_length + vf_pf_length + vf_mtu_length + vf_mac_length + vf_used_length + vf_domain_length + 5,
vf_header='VFs ' + ''.join(['-' for _ in range(4, vf_phy_length + vf_pf_length + vf_mtu_length + vf_mac_length + vf_used_length + vf_domain_length + 4)]))
)
vf_list_output.append('{bold}\
{vf_phy: <{vf_phy_length}} \
{vf_pf: <{vf_pf_length}} \
{vf_mtu: <{vf_mtu_length}} \
{vf_mac: <{vf_mac_length}} \
{vf_used: <{vf_used_length}} \
{vf_domain: <{vf_domain_length}} \
{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
vf_phy_length=vf_phy_length,
vf_pf_length=vf_pf_length,
vf_mtu_length=vf_mtu_length,
vf_mac_length=vf_mac_length,
vf_used_length=vf_used_length,
vf_domain_length=vf_domain_length,
vf_phy='Device',
vf_pf='PF',
vf_mtu='MTU',
vf_mac='MAC Address',
vf_used='Used',
vf_domain='Domain')
)
for vf_information in sorted(vf_list, key=lambda v: v['phy']):
vf_domain = vf_information['usage']['domain']
if not vf_domain:
vf_domain = 'N/A'
vf_list_output.append('{bold}\
{vf_phy: <{vf_phy_length}} \
{vf_pf: <{vf_pf_length}} \
{vf_mtu: <{vf_mtu_length}} \
{vf_mac: <{vf_mac_length}} \
{vf_used: <{vf_used_length}} \
{vf_domain: <{vf_domain_length}} \
{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
vf_phy_length=vf_phy_length,
vf_pf_length=vf_pf_length,
vf_mtu_length=vf_mtu_length,
vf_mac_length=vf_mac_length,
vf_used_length=vf_used_length,
vf_domain_length=vf_domain_length,
vf_phy=vf_information['phy'],
vf_pf=vf_information['pf'],
vf_mtu=vf_information['mtu'],
vf_mac=vf_information['mac'],
vf_used=vf_information['usage']['used'],
vf_domain=vf_domain)
)
return '\n'.join(vf_list_output)
def format_info_sriov_vf(config, vf_information, node):
if not vf_information:
return "No VF found"
# Get information on the using VM if applicable
if vf_information['usage']['used'] == 'True' and vf_information['usage']['domain']:
vm_information = call_api(config, 'get', '/vm/{vm}'.format(vm=vf_information['usage']['domain'])).json()
if isinstance(vm_information, list) and len(vm_information) > 0:
vm_information = vm_information[0]
else:
vm_information = None
# Format a nice output: do this line-by-line then concat the elements at the end
ainformation = []
ainformation.append('{}SR-IOV VF information:{}'.format(ansiprint.bold(), ansiprint.end()))
ainformation.append('')
# Basic information
ainformation.append('{}PHY:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['phy']))
ainformation.append('{}PF:{} {} @ {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['pf'], node))
ainformation.append('{}MTU:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['mtu']))
ainformation.append('{}MAC Address:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['mac']))
ainformation.append('')
# Configuration information
ainformation.append('{}vLAN ID:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['config']['vlan_id']))
ainformation.append('{}vLAN QOS priority:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['config']['vlan_qos']))
ainformation.append('{}Minimum TX Rate:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['config']['tx_rate_min']))
ainformation.append('{}Maximum TX Rate:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['config']['tx_rate_max']))
ainformation.append('{}Link State:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['config']['link_state']))
ainformation.append('{}Spoof Checking:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), getColour(vf_information['config']['spoof_check']), vf_information['config']['spoof_check'], ansiprint.end()))
ainformation.append('{}VF User Trust:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), getColour(vf_information['config']['trust']), vf_information['config']['trust'], ansiprint.end()))
ainformation.append('{}Query RSS Config:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), getColour(vf_information['config']['query_rss']), vf_information['config']['query_rss'], ansiprint.end()))
ainformation.append('')
# PCIe bus information
ainformation.append('{}PCIe domain:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['pci']['domain']))
ainformation.append('{}PCIe bus:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['pci']['bus']))
ainformation.append('{}PCIe slot:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['pci']['slot']))
ainformation.append('{}PCIe function:{} {}'.format(ansiprint.purple(), ansiprint.end(), vf_information['pci']['function']))
ainformation.append('')
# Usage information
ainformation.append('{}VF Used:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), getColour(vf_information['usage']['used']), vf_information['usage']['used'], ansiprint.end()))
if vf_information['usage']['used'] == 'True' and vm_information is not None:
ainformation.append('{}Using Domain:{} {} ({}) ({}{}{})'.format(ansiprint.purple(), ansiprint.end(), vf_information['usage']['domain'], vm_information['name'], getColour(vm_information['state']), vm_information['state'], ansiprint.end()))
else:
ainformation.append('{}Using Domain:{} N/A'.format(ansiprint.purple(), ansiprint.end()))
# Join it all together
return '\n'.join(ainformation)

View File

@ -19,8 +19,10 @@
#
###############################################################################
import cli_lib.ansiprint as ansiprint
from cli_lib.common import call_api
import time
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import call_api
#
@ -69,6 +71,89 @@ def node_domain_state(config, node, action, wait):
return retstatus, response.json().get('message', '')
def view_node_log(config, node, lines=100):
"""
Return node log lines from the API (and display them in a pager in the main CLI)
API endpoint: GET /node/{node}/log
API arguments: lines={lines}
API schema: {"name":"{node}","data":"{node_log}"}
"""
params = {
'lines': lines
}
response = call_api(config, 'get', '/node/{node}/log'.format(node=node), params=params)
if response.status_code != 200:
return False, response.json().get('message', '')
node_log = response.json()['data']
# Shrink the log buffer to length lines
shrunk_log = node_log.split('\n')[-lines:]
loglines = '\n'.join(shrunk_log)
return True, loglines
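# A minimal usage sketch, assuming a loaded CLI `config` dict and a node named
# "hv1"; on success the second element is the trimmed log text:
retflag, loglines = view_node_log(config, 'hv1', lines=50)
if retflag:
    print(loglines)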
def follow_node_log(config, node, lines=10):
"""
Return and follow node log lines from the API
API endpoint: GET /node/{node}/log
API arguments: lines={lines}
API schema: {"name":"{nodename}","data":"{node_log}"}
"""
# We always grab 200 to match the follow call, but only _show_ the requested `lines` count
params = {
'lines': 200
}
response = call_api(config, 'get', '/node/{node}/log'.format(node=node), params=params)
if response.status_code != 200:
return False, response.json().get('message', '')
# Shrink the log buffer to length lines
node_log = response.json()['data']
shrunk_log = node_log.split('\n')[-int(lines):]
loglines = '\n'.join(shrunk_log)
# Print the initial data and begin following
print(loglines, end='')
print('\n', end='')
while True:
# Grab the next line set (200 is a reasonable number of lines per half-second; any more are skipped)
try:
params = {
'lines': 200
}
response = call_api(config, 'get', '/node/{node}/log'.format(node=node), params=params)
new_node_log = response.json()['data']
except Exception:
break
# Split the new and old log strings into constituent lines
old_node_loglines = node_log.split('\n')
new_node_loglines = new_node_log.split('\n')
# Set the node log to the new log value for the next iteration
node_log = new_node_log
# Get the difference between the two sets of lines
old_node_loglines_set = set(old_node_loglines)
diff_node_loglines = [x for x in new_node_loglines if x not in old_node_loglines_set]
# If there's a difference, print it out
if len(diff_node_loglines) > 0:
print('\n'.join(diff_node_loglines), end='')
print('\n', end='')
# Wait half a second
time.sleep(0.5)
return True, ''
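# The follow loop above only prints lines that did not appear in the previous
# poll; a minimal standalone sketch of that set-difference comparison on two
# hypothetical log buffers:
old_log = 'boot: starting services\nboot: services started'
new_log = old_log + '\nboot: node ready'
old_set = set(old_log.split('\n'))
new_lines = [x for x in new_log.split('\n') if x not in old_set]
print('\n'.join(new_lines))   # -> 'boot: node ready'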
def node_info(config, node):
"""
Get information about node
@ -169,6 +254,7 @@ def format_info(node_information, long_output):
ainformation = []
# Basic information
ainformation.append('{}Name:{} {}'.format(ansiprint.purple(), ansiprint.end(), node_information['name']))
ainformation.append('{}PVC Version:{} {}'.format(ansiprint.purple(), ansiprint.end(), node_information['pvc_version']))
ainformation.append('{}Daemon State:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), daemon_state_colour, node_information['daemon_state'], ansiprint.end()))
ainformation.append('{}Coordinator State:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), coordinator_state_colour, node_information['coordinator_state'], ansiprint.end()))
ainformation.append('{}Domain State:{} {}{}{}'.format(ansiprint.purple(), ansiprint.end(), domain_state_colour, node_information['domain_state'], ansiprint.end()))
@ -204,9 +290,10 @@ def format_list(node_list, raw):
# Determine optimal column widths
node_name_length = 5
pvc_version_length = 8
daemon_state_length = 7
coordinator_state_length = 12
domain_state_length = 8
domain_state_length = 7
domains_count_length = 4
cpu_count_length = 6
load_length = 5
@ -220,6 +307,10 @@ def format_list(node_list, raw):
_node_name_length = len(node_information['name']) + 1
if _node_name_length > node_name_length:
node_name_length = _node_name_length
# node_pvc_version column
_pvc_version_length = len(node_information.get('pvc_version', 'N/A')) + 1
if _pvc_version_length > pvc_version_length:
pvc_version_length = _pvc_version_length
# daemon_state column
_daemon_state_length = len(node_information['daemon_state']) + 1
if _daemon_state_length > daemon_state_length:
@ -268,11 +359,27 @@ def format_list(node_list, raw):
# Format the string (header)
node_list_output.append(
'{bold}{node_name: <{node_name_length}} \
St: {daemon_state_colour}{node_daemon_state: <{daemon_state_length}}{end_colour} {coordinator_state_colour}{node_coordinator_state: <{coordinator_state_length}}{end_colour} {domain_state_colour}{node_domain_state: <{domain_state_length}}{end_colour} \
Res: {node_domains_count: <{domains_count_length}} {node_cpu_count: <{cpu_count_length}} {node_load: <{load_length}} \
Mem (M): {node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length}} {node_mem_free: <{mem_free_length}} {node_mem_allocated: <{mem_alloc_length}} {node_mem_provisioned: <{mem_prov_length}}{end_bold}'.format(
'{bold}{node_header: <{node_header_length}} {state_header: <{state_header_length}} {resource_header: <{resource_header_length}} {memory_header: <{memory_header_length}}{end_bold}'.format(
node_header_length=node_name_length + pvc_version_length + 1,
state_header_length=daemon_state_length + coordinator_state_length + domain_state_length + 2,
resource_header_length=domains_count_length + cpu_count_length + load_length + 2,
memory_header_length=mem_total_length + mem_used_length + mem_free_length + mem_alloc_length + mem_prov_length + 4,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
node_header='Nodes ' + ''.join(['-' for _ in range(6, node_name_length + pvc_version_length)]),
state_header='States ' + ''.join(['-' for _ in range(7, daemon_state_length + coordinator_state_length + domain_state_length + 1)]),
resource_header='Resources ' + ''.join(['-' for _ in range(10, domains_count_length + cpu_count_length + load_length + 1)]),
memory_header='Memory (M) ' + ''.join(['-' for _ in range(11, mem_total_length + mem_used_length + mem_free_length + mem_alloc_length + mem_prov_length + 3)])
)
)
node_list_output.append(
'{bold}{node_name: <{node_name_length}} {node_pvc_version: <{pvc_version_length}} \
{daemon_state_colour}{node_daemon_state: <{daemon_state_length}}{end_colour} {coordinator_state_colour}{node_coordinator_state: <{coordinator_state_length}}{end_colour} {domain_state_colour}{node_domain_state: <{domain_state_length}}{end_colour} \
{node_domains_count: <{domains_count_length}} {node_cpu_count: <{cpu_count_length}} {node_load: <{load_length}} \
{node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length}} {node_mem_free: <{mem_free_length}} {node_mem_allocated: <{mem_alloc_length}} {node_mem_provisioned: <{mem_prov_length}}{end_bold}'.format(
node_name_length=node_name_length,
pvc_version_length=pvc_version_length,
daemon_state_length=daemon_state_length,
coordinator_state_length=coordinator_state_length,
domain_state_length=domain_state_length,
@ -291,6 +398,7 @@ Mem (M): {node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length
domain_state_colour='',
end_colour='',
node_name='Name',
node_pvc_version='Version',
node_daemon_state='Daemon',
node_coordinator_state='Coordinator',
node_domain_state='Domain',
@ -306,14 +414,15 @@ Mem (M): {node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length
)
# Format the string (elements)
for node_information in node_list:
for node_information in sorted(node_list, key=lambda n: n['name']):
daemon_state_colour, coordinator_state_colour, domain_state_colour, mem_allocated_colour, mem_provisioned_colour = getOutputColours(node_information)
node_list_output.append(
'{bold}{node_name: <{node_name_length}} \
{daemon_state_colour}{node_daemon_state: <{daemon_state_length}}{end_colour} {coordinator_state_colour}{node_coordinator_state: <{coordinator_state_length}}{end_colour} {domain_state_colour}{node_domain_state: <{domain_state_length}}{end_colour} \
{node_domains_count: <{domains_count_length}} {node_cpu_count: <{cpu_count_length}} {node_load: <{load_length}} \
{node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length}} {node_mem_free: <{mem_free_length}} {mem_allocated_colour}{node_mem_allocated: <{mem_alloc_length}}{end_colour} {mem_provisioned_colour}{node_mem_provisioned: <{mem_prov_length}}{end_colour}{end_bold}'.format(
'{bold}{node_name: <{node_name_length}} {node_pvc_version: <{pvc_version_length}} \
{daemon_state_colour}{node_daemon_state: <{daemon_state_length}}{end_colour} {coordinator_state_colour}{node_coordinator_state: <{coordinator_state_length}}{end_colour} {domain_state_colour}{node_domain_state: <{domain_state_length}}{end_colour} \
{node_domains_count: <{domains_count_length}} {node_cpu_count: <{cpu_count_length}} {node_load: <{load_length}} \
{node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length}} {node_mem_free: <{mem_free_length}} {mem_allocated_colour}{node_mem_allocated: <{mem_alloc_length}}{end_colour} {mem_provisioned_colour}{node_mem_provisioned: <{mem_prov_length}}{end_colour}{end_bold}'.format(
node_name_length=node_name_length,
pvc_version_length=pvc_version_length,
daemon_state_length=daemon_state_length,
coordinator_state_length=coordinator_state_length,
domain_state_length=domain_state_length,
@ -334,6 +443,7 @@ Mem (M): {node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length
mem_provisioned_colour=mem_allocated_colour,
end_colour=ansiprint.end(),
node_name=node_information['name'],
node_pvc_version=node_information.get('pvc_version', 'N/A'),
node_daemon_state=node_information['daemon_state'],
node_coordinator_state=node_information['coordinator_state'],
node_domain_state=node_information['domain_state'],
@ -348,4 +458,4 @@ Mem (M): {node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length
)
)
return '\n'.join(sorted(node_list_output))
return '\n'.join(node_list_output)

View File

@ -19,12 +19,10 @@
#
###############################################################################
import ast
from requests_toolbelt.multipart.encoder import MultipartEncoder, MultipartEncoderMonitor
import cli_lib.ansiprint as ansiprint
from cli_lib.common import UploadProgressBar, call_api
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import UploadProgressBar, call_api
#
@ -721,10 +719,10 @@ def task_status(config, task_id=None, is_watching=False):
task['type'] = task_type
task['worker'] = task_host
task['id'] = task_job.get('id')
task_args = ast.literal_eval(task_job.get('args'))
task_args = task_job.get('args')
task['vm_name'] = task_args[0]
task['vm_profile'] = task_args[1]
task_kwargs = ast.literal_eval(task_job.get('kwargs'))
task_kwargs = task_job.get('kwargs')
task['vm_define'] = str(bool(task_kwargs['define_vm']))
task['vm_start'] = str(bool(task_kwargs['start_vm']))
task_data.append(task)
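# A minimal sketch of the read above with a hypothetical task payload; the args and
# kwargs presumably now arrive as native JSON structures, so ast.literal_eval is no
# longer needed to unpack them:
task_job = {'args': ['test-vm', 'default-profile'],
            'kwargs': {'define_vm': True, 'start_vm': False}}
vm_name = task_job.get('args')[0]                          # 'test-vm'
vm_start = str(bool(task_job.get('kwargs')['start_vm']))   # 'False'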
@ -755,22 +753,16 @@ def format_list_template(template_data, template_type=None):
normalized_template_data = template_data
if 'system' in template_types:
ainformation.append('System templates:')
ainformation.append('')
ainformation.append(format_list_template_system(normalized_template_data['system_templates']))
if len(template_types) > 1:
ainformation.append('')
if 'network' in template_types:
ainformation.append('Network templates:')
ainformation.append('')
ainformation.append(format_list_template_network(normalized_template_data['network_templates']))
if len(template_types) > 1:
ainformation.append('')
if 'storage' in template_types:
ainformation.append('Storage templates:')
ainformation.append('')
ainformation.append(format_list_template_storage(normalized_template_data['storage_templates']))
return '\n'.join(ainformation)
@ -783,13 +775,13 @@ def format_list_template_system(template_data):
template_list_output = []
# Determine optimal column widths
template_name_length = 5
template_id_length = 3
template_name_length = 15
template_id_length = 5
template_vcpu_length = 6
template_vram_length = 10
template_vram_length = 9
template_serial_length = 7
template_vnc_length = 4
template_vnc_bind_length = 10
template_vnc_bind_length = 9
template_node_limit_length = 6
template_node_selector_length = 9
template_node_autostart_length = 10
@ -842,16 +834,33 @@ def format_list_template_system(template_data):
template_migration_method_length = _template_migration_method_length
# Format the string (header)
template_list_output_header = '{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
template_list_output.append('{bold}{template_header: <{template_header_length}} {resources_header: <{resources_header_length}} {consoles_header: <{consoles_header_length}} {metadata_header: <{metadata_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
template_header_length=template_name_length + template_id_length + 1,
resources_header_length=template_vcpu_length + template_vram_length + 1,
consoles_header_length=template_serial_length + template_vnc_length + template_vnc_bind_length + 2,
metadata_header_length=template_node_limit_length + template_node_selector_length + template_node_autostart_length + template_migration_method_length + 3,
template_header='System Templates ' + ''.join(['-' for _ in range(17, template_name_length + template_id_length)]),
resources_header='Resources ' + ''.join(['-' for _ in range(10, template_vcpu_length + template_vram_length)]),
consoles_header='Consoles ' + ''.join(['-' for _ in range(9, template_serial_length + template_vnc_length + template_vnc_bind_length + 1)]),
metadata_header='Metadata ' + ''.join(['-' for _ in range(9, template_node_limit_length + template_node_selector_length + template_node_autostart_length + template_migration_method_length + 2)]))
)
template_list_output.append('{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
{template_vcpu: <{template_vcpu_length}} \
{template_vram: <{template_vram_length}} \
Console: {template_serial: <{template_serial_length}} \
{template_serial: <{template_serial_length}} \
{template_vnc: <{template_vnc_length}} \
{template_vnc_bind: <{template_vnc_bind_length}} \
Meta: {template_node_limit: <{template_node_limit_length}} \
{template_node_limit: <{template_node_limit_length}} \
{template_node_selector: <{template_node_selector_length}} \
{template_node_autostart: <{template_node_autostart_length}} \
{template_migration_method: <{template_migration_method_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
template_state_colour='',
end_colour='',
template_name_length=template_name_length,
template_id_length=template_id_length,
template_vcpu_length=template_vcpu_length,
@ -863,14 +872,10 @@ Meta: {template_node_limit: <{template_node_limit_length}} \
template_node_selector_length=template_node_selector_length,
template_node_autostart_length=template_node_autostart_length,
template_migration_method_length=template_migration_method_length,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
template_state_colour='',
end_colour='',
template_name='Name',
template_id='ID',
template_vcpu='vCPUs',
template_vram='vRAM [MB]',
template_vram='vRAM [M]',
template_serial='Serial',
template_vnc='VNC',
template_vnc_bind='VNC bind',
@ -878,6 +883,7 @@ Meta: {template_node_limit: <{template_node_limit_length}} \
template_node_selector='Selector',
template_node_autostart='Autostart',
template_migration_method='Migration')
)
# Format the string (elements)
for template in sorted(template_data, key=lambda i: i.get('name', None)):
@ -885,10 +891,10 @@ Meta: {template_node_limit: <{template_node_limit_length}} \
'{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
{template_vcpu: <{template_vcpu_length}} \
{template_vram: <{template_vram_length}} \
{template_serial: <{template_serial_length}} \
{template_serial: <{template_serial_length}} \
{template_vnc: <{template_vnc_length}} \
{template_vnc_bind: <{template_vnc_bind_length}} \
{template_node_limit: <{template_node_limit_length}} \
{template_node_limit: <{template_node_limit_length}} \
{template_node_selector: <{template_node_selector_length}} \
{template_node_autostart: <{template_node_autostart_length}} \
{template_migration_method: <{template_migration_method_length}}{end_bold}'.format(
@ -919,9 +925,7 @@ Meta: {template_node_limit: <{template_node_limit_length}} \
)
)
return '\n'.join([template_list_output_header] + template_list_output)
return True, ''
return '\n'.join(template_list_output)
def format_list_template_network(template_template):
@ -931,8 +935,8 @@ def format_list_template_network(template_template):
template_list_output = []
# Determine optimal column widths
template_name_length = 5
template_id_length = 3
template_name_length = 18
template_id_length = 5
template_mac_template_length = 13
template_networks_length = 10
@ -962,7 +966,16 @@ def format_list_template_network(template_template):
template_networks_length = _template_networks_length
# Format the string (header)
template_list_output_header = '{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
template_list_output.append('{bold}{template_header: <{template_header_length}} {details_header: <{details_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
template_header_length=template_name_length + template_id_length + 1,
details_header_length=template_mac_template_length + template_networks_length + 1,
template_header='Network Templates ' + ''.join(['-' for _ in range(18, template_name_length + template_id_length)]),
details_header='Details ' + ''.join(['-' for _ in range(8, template_mac_template_length + template_networks_length)]))
)
template_list_output.append('{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
{template_mac_template: <{template_mac_template_length}} \
{template_networks: <{template_networks_length}}{end_bold}'.format(
template_name_length=template_name_length,
@ -975,6 +988,7 @@ def format_list_template_network(template_template):
template_id='ID',
template_mac_template='MAC template',
template_networks='Network VNIs')
)
# Format the string (elements)
for template in sorted(template_template, key=lambda i: i.get('name', None)):
@ -995,7 +1009,7 @@ def format_list_template_network(template_template):
)
)
return '\n'.join([template_list_output_header] + template_list_output)
return '\n'.join(template_list_output)
def format_list_template_storage(template_template):
@ -1005,12 +1019,12 @@ def format_list_template_storage(template_template):
template_list_output = []
# Determine optimal column widths
template_name_length = 5
template_id_length = 3
template_name_length = 18
template_id_length = 5
template_disk_id_length = 8
template_disk_pool_length = 8
template_disk_pool_length = 5
template_disk_source_length = 14
template_disk_size_length = 10
template_disk_size_length = 9
template_disk_filesystem_length = 11
template_disk_fsargs_length = 10
template_disk_mountpoint_length = 10
@ -1056,7 +1070,16 @@ def format_list_template_storage(template_template):
template_disk_mountpoint_length = _template_disk_mountpoint_length
# Format the string (header)
template_list_output_header = '{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
template_list_output.append('{bold}{template_header: <{template_header_length}} {details_header: <{details_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
template_header_length=template_name_length + template_id_length + 1,
details_header_length=template_disk_id_length + template_disk_pool_length + template_disk_source_length + template_disk_size_length + template_disk_filesystem_length + template_disk_fsargs_length + template_disk_mountpoint_length + 7,
template_header='Storage Templates ' + ''.join(['-' for _ in range(18, template_name_length + template_id_length)]),
details_header='Details ' + ''.join(['-' for _ in range(8, template_disk_id_length + template_disk_pool_length + template_disk_source_length + template_disk_size_length + template_disk_filesystem_length + template_disk_fsargs_length + template_disk_mountpoint_length + 6)]))
)
template_list_output.append('{bold}{template_name: <{template_name_length}} {template_id: <{template_id_length}} \
{template_disk_id: <{template_disk_id_length}} \
{template_disk_pool: <{template_disk_pool_length}} \
{template_disk_source: <{template_disk_source_length}} \
@ -1080,10 +1103,11 @@ def format_list_template_storage(template_template):
template_disk_id='Disk ID',
template_disk_pool='Pool',
template_disk_source='Source Volume',
template_disk_size='Size [GB]',
template_disk_size='Size [G]',
template_disk_filesystem='Filesystem',
template_disk_fsargs='Arguments',
template_disk_mountpoint='Mountpoint')
)
# Format the string (elements)
for template in sorted(template_template, key=lambda i: i.get('name', None)):
@ -1130,7 +1154,7 @@ def format_list_template_storage(template_template):
)
)
return '\n'.join([template_list_output_header] + template_list_output)
return '\n'.join(template_list_output)
def format_list_userdata(userdata_data, lines=None):
@ -1140,8 +1164,9 @@ def format_list_userdata(userdata_data, lines=None):
userdata_list_output = []
# Determine optimal column widths
userdata_name_length = 5
userdata_id_length = 3
userdata_name_length = 12
userdata_id_length = 5
userdata_document_length = 92 - userdata_name_length - userdata_id_length
for userdata in userdata_data:
# userdata_name column
@ -1154,7 +1179,14 @@ def format_list_userdata(userdata_data, lines=None):
userdata_id_length = _userdata_id_length
# Format the string (header)
userdata_list_output_header = '{bold}{userdata_name: <{userdata_name_length}} {userdata_id: <{userdata_id_length}} \
userdata_list_output.append('{bold}{userdata_header: <{userdata_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
userdata_header_length=userdata_name_length + userdata_id_length + userdata_document_length + 2,
userdata_header='Userdata ' + ''.join(['-' for _ in range(9, userdata_name_length + userdata_id_length + userdata_document_length + 1)]))
)
userdata_list_output.append('{bold}{userdata_name: <{userdata_name_length}} {userdata_id: <{userdata_id_length}} \
{userdata_data}{end_bold}'.format(
userdata_name_length=userdata_name_length,
userdata_id_length=userdata_id_length,
@ -1163,6 +1195,7 @@ def format_list_userdata(userdata_data, lines=None):
userdata_name='Name',
userdata_id='ID',
userdata_data='Document')
)
# Format the string (elements)
for data in sorted(userdata_data, key=lambda i: i.get('name', None)):
@ -1204,7 +1237,7 @@ def format_list_userdata(userdata_data, lines=None):
)
)
return '\n'.join([userdata_list_output_header] + userdata_list_output)
return '\n'.join(userdata_list_output)
def format_list_script(script_data, lines=None):
@ -1214,8 +1247,9 @@ def format_list_script(script_data, lines=None):
script_list_output = []
# Determine optimal column widths
script_name_length = 5
script_id_length = 3
script_name_length = 12
script_id_length = 5
script_data_length = 92 - script_name_length - script_id_length
for script in script_data:
# script_name column
@ -1228,7 +1262,14 @@ def format_list_script(script_data, lines=None):
script_id_length = _script_id_length
# Format the string (header)
script_list_output_header = '{bold}{script_name: <{script_name_length}} {script_id: <{script_id_length}} \
script_list_output.append('{bold}{script_header: <{script_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
script_header_length=script_name_length + script_id_length + script_data_length + 2,
script_header='Script ' + ''.join(['-' for _ in range(7, script_name_length + script_id_length + script_data_length + 1)]))
)
script_list_output.append('{bold}{script_name: <{script_name_length}} {script_id: <{script_id_length}} \
{script_data}{end_bold}'.format(
script_name_length=script_name_length,
script_id_length=script_id_length,
@ -1237,6 +1278,7 @@ def format_list_script(script_data, lines=None):
script_name='Name',
script_id='ID',
script_data='Script')
)
# Format the string (elements)
for script in sorted(script_data, key=lambda i: i.get('name', None)):
@ -1278,7 +1320,7 @@ def format_list_script(script_data, lines=None):
)
)
return '\n'.join([script_list_output_header] + script_list_output)
return '\n'.join(script_list_output)
def format_list_ova(ova_data):
@ -1288,8 +1330,8 @@ def format_list_ova(ova_data):
ova_list_output = []
# Determine optimal column widths
ova_name_length = 5
ova_id_length = 3
ova_name_length = 18
ova_id_length = 5
ova_disk_id_length = 8
ova_disk_size_length = 10
ova_disk_pool_length = 5
@ -1329,7 +1371,16 @@ def format_list_ova(ova_data):
ova_disk_volume_name_length = _ova_disk_volume_name_length
# Format the string (header)
ova_list_output_header = '{bold}{ova_name: <{ova_name_length}} {ova_id: <{ova_id_length}} \
ova_list_output.append('{bold}{ova_header: <{ova_header_length}} {details_header: <{details_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
ova_header_length=ova_name_length + ova_id_length + 1,
details_header_length=ova_disk_id_length + ova_disk_size_length + ova_disk_pool_length + ova_disk_volume_format_length + ova_disk_volume_name_length + 4,
ova_header='OVAs ' + ''.join(['-' for _ in range(5, ova_name_length + ova_id_length)]),
details_header='Details ' + ''.join(['-' for _ in range(8, ova_disk_id_length + ova_disk_size_length + ova_disk_pool_length + ova_disk_volume_format_length + ova_disk_volume_name_length + 3)]))
)
ova_list_output.append('{bold}{ova_name: <{ova_name_length}} {ova_id: <{ova_id_length}} \
{ova_disk_id: <{ova_disk_id_length}} \
{ova_disk_size: <{ova_disk_size_length}} \
{ova_disk_pool: <{ova_disk_pool_length}} \
@ -1351,6 +1402,7 @@ def format_list_ova(ova_data):
ova_disk_pool='Pool',
ova_disk_volume_format='Format',
ova_disk_volume_name='Source Volume')
)
# Format the string (elements)
for ova in sorted(ova_data, key=lambda i: i.get('name', None)):
@ -1391,7 +1443,7 @@ def format_list_ova(ova_data):
)
)
return '\n'.join([ova_list_output_header] + ova_list_output)
return '\n'.join(ova_list_output)
def format_list_profile(profile_data):
@ -1411,8 +1463,8 @@ def format_list_profile(profile_data):
profile_list_output = []
# Determine optimal column widths
profile_name_length = 5
profile_id_length = 3
profile_name_length = 18
profile_id_length = 5
profile_source_length = 7
profile_system_template_length = 7
@ -1420,6 +1472,7 @@ def format_list_profile(profile_data):
profile_storage_template_length = 8
profile_userdata_length = 9
profile_script_length = 7
profile_arguments_length = 18
for profile in profile_data:
# profile_name column
@ -1456,11 +1509,22 @@ def format_list_profile(profile_data):
profile_script_length = _profile_script_length
# Format the string (header)
profile_list_output_header = '{bold}{profile_name: <{profile_name_length}} {profile_id: <{profile_id_length}} {profile_source: <{profile_source_length}} \
Templates: {profile_system_template: <{profile_system_template_length}} \
profile_list_output.append('{bold}{profile_header: <{profile_header_length}} {templates_header: <{templates_header_length}} {data_header: <{data_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
profile_header_length=profile_name_length + profile_id_length + profile_source_length + 2,
templates_header_length=profile_system_template_length + profile_network_template_length + profile_storage_template_length + 2,
data_header_length=profile_userdata_length + profile_script_length + profile_arguments_length + 2,
profile_header='Profiles ' + ''.join(['-' for _ in range(9, profile_name_length + profile_id_length + profile_source_length + 1)]),
templates_header='Templates ' + ''.join(['-' for _ in range(10, profile_system_template_length + profile_network_template_length + profile_storage_template_length + 1)]),
data_header='Data ' + ''.join(['-' for _ in range(5, profile_userdata_length + profile_script_length + profile_arguments_length + 1)]))
)
profile_list_output.append('{bold}{profile_name: <{profile_name_length}} {profile_id: <{profile_id_length}} {profile_source: <{profile_source_length}} \
{profile_system_template: <{profile_system_template_length}} \
{profile_network_template: <{profile_network_template_length}} \
{profile_storage_template: <{profile_storage_template_length}} \
Data: {profile_userdata: <{profile_userdata_length}} \
{profile_userdata: <{profile_userdata_length}} \
{profile_script: <{profile_script_length}} \
{profile_arguments}{end_bold}'.format(
profile_name_length=profile_name_length,
@ -1482,15 +1546,19 @@ Data: {profile_userdata: <{profile_userdata_length}} \
profile_userdata='Userdata',
profile_script='Script',
profile_arguments='Script Arguments')
)
# Format the string (elements)
for profile in sorted(profile_data, key=lambda i: i.get('name', None)):
arguments_list = ', '.join(profile['arguments'])
if not arguments_list:
arguments_list = 'N/A'
profile_list_output.append(
'{bold}{profile_name: <{profile_name_length}} {profile_id: <{profile_id_length}} {profile_source: <{profile_source_length}} \
{profile_system_template: <{profile_system_template_length}} \
{profile_system_template: <{profile_system_template_length}} \
{profile_network_template: <{profile_network_template_length}} \
{profile_storage_template: <{profile_storage_template_length}} \
{profile_userdata: <{profile_userdata_length}} \
{profile_userdata: <{profile_userdata_length}} \
{profile_script: <{profile_script_length}} \
{profile_arguments}{end_bold}'.format(
profile_name_length=profile_name_length,
@ -1511,11 +1579,11 @@ Data: {profile_userdata: <{profile_userdata_length}} \
profile_storage_template=profile['storage_template'],
profile_userdata=profile['userdata'],
profile_script=profile['script'],
profile_arguments=', '.join(profile['arguments'])
profile_arguments=arguments_list,
)
)
return '\n'.join([profile_list_output_header] + profile_list_output)
return '\n'.join(profile_list_output)
def format_list_task(task_data):
@ -1524,17 +1592,21 @@ def format_list_task(task_data):
# Determine optimal column widths
task_id_length = 7
task_type_length = 7
task_worker_length = 7
task_vm_name_length = 5
task_vm_profile_length = 8
task_vm_define_length = 8
task_vm_start_length = 7
task_worker_length = 8
for task in task_data:
# task_id column
_task_id_length = len(str(task['id'])) + 1
if _task_id_length > task_id_length:
task_id_length = _task_id_length
# task_worker column
_task_worker_length = len(str(task['worker'])) + 1
if _task_worker_length > task_worker_length:
task_worker_length = _task_worker_length
# task_type column
_task_type_length = len(str(task['type'])) + 1
if _task_type_length > task_type_length:
@ -1555,15 +1627,20 @@ def format_list_task(task_data):
_task_vm_start_length = len(str(task['vm_start'])) + 1
if _task_vm_start_length > task_vm_start_length:
task_vm_start_length = _task_vm_start_length
# task_worker column
_task_worker_length = len(str(task['worker'])) + 1
if _task_worker_length > task_worker_length:
task_worker_length = _task_worker_length
# Format the string (header)
task_list_output_header = '{bold}{task_id: <{task_id_length}} {task_type: <{task_type_length}} \
task_list_output.append('{bold}{task_header: <{task_header_length}} {vms_header: <{vms_header_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
task_header_length=task_id_length + task_type_length + task_worker_length + 2,
vms_header_length=task_vm_name_length + task_vm_profile_length + task_vm_define_length + task_vm_start_length + 3,
task_header='Tasks ' + ''.join(['-' for _ in range(6, task_id_length + task_type_length + task_worker_length + 1)]),
vms_header='VM Details ' + ''.join(['-' for _ in range(11, task_vm_name_length + task_vm_profile_length + task_vm_define_length + task_vm_start_length + 2)]))
)
task_list_output.append('{bold}{task_id: <{task_id_length}} {task_type: <{task_type_length}} \
{task_worker: <{task_worker_length}} \
VM: {task_vm_name: <{task_vm_name_length}} \
{task_vm_name: <{task_vm_name_length}} \
{task_vm_profile: <{task_vm_profile_length}} \
{task_vm_define: <{task_vm_define_length}} \
{task_vm_start: <{task_vm_start_length}}{end_bold}'.format(
@ -1583,13 +1660,14 @@ VM: {task_vm_name: <{task_vm_name_length}} \
task_vm_profile='Profile',
task_vm_define='Define?',
task_vm_start='Start?')
)
# Format the string (elements)
for task in sorted(task_data, key=lambda i: i.get('type', None)):
task_list_output.append(
'{bold}{task_id: <{task_id_length}} {task_type: <{task_type_length}} \
{task_worker: <{task_worker_length}} \
{task_vm_name: <{task_vm_name_length}} \
{task_vm_name: <{task_vm_name_length}} \
{task_vm_profile: <{task_vm_profile_length}} \
{task_vm_define: <{task_vm_define_length}} \
{task_vm_start: <{task_vm_start_length}}{end_bold}'.format(
@ -1612,4 +1690,4 @@ VM: {task_vm_name: <{task_vm_name_length}} \
)
)
return '\n'.join([task_list_output_header] + task_list_output)
return '\n'.join(task_list_output)


@ -22,8 +22,8 @@
import time
import re
import cli_lib.ansiprint as ansiprint
from cli_lib.common import call_api, format_bytes, format_metric
import pvc.cli_lib.ansiprint as ansiprint
from pvc.cli_lib.common import call_api, format_bytes, format_metric
#
@ -54,12 +54,12 @@ def vm_info(config, vm):
return False, response.json().get('message', '')
def vm_list(config, limit, target_node, target_state):
def vm_list(config, limit, target_node, target_state, target_tag):
"""
Get list information about VMs (limited by {limit}, {target_node}, {target_state}, or {target_tag})
API endpoint: GET /api/v1/vm
API arguments: limit={limit}, node={target_node}, state={target_state}
API arguments: limit={limit}, node={target_node}, state={target_state}, tag={target_tag}
API schema: [{json_data_object},{json_data_object},etc.]
"""
params = dict()
@ -69,6 +69,8 @@ def vm_list(config, limit, target_node, target_state):
params['node'] = target_node
if target_state:
params['state'] = target_state
if target_tag:
params['tag'] = target_tag
response = call_api(config, 'get', '/vm', params=params)
@ -78,12 +80,12 @@ def vm_list(config, limit, target_node, target_state):
return False, response.json().get('message', '')
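As an illustrative sketch only (the cluster config dict and the tag value are placeholders), the new tag filter composes with the existing node/state filters through this wrapper:

# Hypothetical usage; 'config' is an existing CLI cluster config dict
ok, vms = vm_list(config, limit=None, target_node=None, target_state='start', target_tag='backup')
if ok:
    # vms is assumed to be the JSON list of VM objects, as with the sibling list wrappers
    print([vm['name'] for vm in vms])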
def vm_define(config, xml, node, node_limit, node_selector, node_autostart, migration_method):
def vm_define(config, xml, node, node_limit, node_selector, node_autostart, migration_method, user_tags, protected_tags):
"""
Define a new VM on the cluster
API endpoint: POST /vm
API arguments: xml={xml}, node={node}, limit={node_limit}, selector={node_selector}, autostart={node_autostart}, migration_method={migration_method}
API arguments: xml={xml}, node={node}, limit={node_limit}, selector={node_selector}, autostart={node_autostart}, migration_method={migration_method}, user_tags={user_tags}, protected_tags={protected_tags}
API schema: {"message":"{data}"}
"""
params = {
@ -91,7 +93,9 @@ def vm_define(config, xml, node, node_limit, node_selector, node_autostart, migr
'limit': node_limit,
'selector': node_selector,
'autostart': node_autostart,
'migration_method': migration_method
'migration_method': migration_method,
'user_tags': user_tags,
'protected_tags': protected_tags
}
data = {
'xml': xml
@ -130,11 +134,32 @@ def vm_modify(config, vm, xml, restart):
return retstatus, response.json().get('message', '')
def vm_rename(config, vm, new_name):
"""
Rename VM to new name
API endpoint: POST /vm/{vm}/rename
API arguments: new_name={new_name}
API schema: {"message":"{data}"}
"""
params = {
'new_name': new_name
}
response = call_api(config, 'post', '/vm/{vm}/rename'.format(vm=vm), params=params)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get('message', '')
def vm_metadata(config, vm, node_limit, node_selector, node_autostart, migration_method, provisioner_profile):
"""
Modify PVC metadata of a VM
API endpoint: GET /vm/{vm}/meta, POST /vm/{vm}/meta
API endpoint: POST /vm/{vm}/meta
API arguments: limit={node_limit}, selector={node_selector}, autostart={node_autostart}, migration_method={migration_method} profile={provisioner_profile}
API schema: {"message":"{data}"}
"""
@ -167,6 +192,119 @@ def vm_metadata(config, vm, node_limit, node_selector, node_autostart, migration
return retstatus, response.json().get('message', '')
def vm_tags_get(config, vm):
"""
Get PVC tags of a VM
API endpoint: GET /vm/{vm}/tags
API arguments:
API schema: {{"name": "{name}", "type": "{type}"},...}
"""
response = call_api(config, 'get', '/vm/{vm}/tags'.format(vm=vm))
if response.status_code == 200:
retstatus = True
retdata = response.json()
else:
retstatus = False
retdata = response.json().get('message', '')
return retstatus, retdata
def vm_tag_set(config, vm, action, tag, protected=False):
"""
Modify PVC tags of a VM
API endpoint: POST /vm/{vm}/tags
API arguments: action={action}, tag={tag}, protected={protected}
API schema: {"message":"{data}"}
"""
params = {
'action': action,
'tag': tag,
'protected': protected
}
# Update the tags
response = call_api(config, 'post', '/vm/{vm}/tags'.format(vm=vm), params=params)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get('message', '')
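A minimal sketch of how the two tag wrappers fit together (VM name and tag are placeholders; 'config' is an existing CLI cluster config dict):

# Add a protected tag, then read the tag list back
ok, msg = vm_tag_set(config, 'testvm', 'add', 'production', protected=True)
ok, data = vm_tags_get(config, 'testvm')
if ok:
    for tag in data['tags']:
        print(tag['name'], tag['type'], tag['protected'])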
def format_vm_tags(config, name, tags):
"""
Format the output of a tags dictionary in a nice table
"""
if len(tags) < 1:
return "No tags found."
output_list = []
name_length = 5
_name_length = len(name) + 1
if _name_length > name_length:
name_length = _name_length
tags_name_length = 4
tags_type_length = 5
tags_protected_length = 10
for tag in tags:
_tags_name_length = len(tag['name']) + 1
if _tags_name_length > tags_name_length:
tags_name_length = _tags_name_length
_tags_type_length = len(tag['type']) + 1
if _tags_type_length > tags_type_length:
tags_type_length = _tags_type_length
_tags_protected_length = len(str(tag['protected'])) + 1
if _tags_protected_length > tags_protected_length:
tags_protected_length = _tags_protected_length
output_list.append(
'{bold}{tags_name: <{tags_name_length}} \
{tags_type: <{tags_type_length}} \
{tags_protected: <{tags_protected_length}}{end_bold}'.format(
name_length=name_length,
tags_name_length=tags_name_length,
tags_type_length=tags_type_length,
tags_protected_length=tags_protected_length,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
tags_name='Name',
tags_type='Type',
tags_protected='Protected'
)
)
for tag in sorted(tags, key=lambda t: t['name']):
output_list.append(
'{bold}{tags_name: <{tags_name_length}} \
{tags_type: <{tags_type_length}} \
{tags_protected: <{tags_protected_length}}{end_bold}'.format(
name_length=name_length,
tags_type_length=tags_type_length,
tags_name_length=tags_name_length,
tags_protected_length=tags_protected_length,
bold='',
end_bold='',
tags_name=tag['name'],
tags_type=tag['type'],
tags_protected=str(tag['protected'])
)
)
return '\n'.join(output_list)
def vm_remove(config, vm, delete_disks=False):
"""
Remove a VM
@ -480,7 +618,7 @@ def format_vm_memory(config, name, memory):
return '\n'.join(output_list)
def vm_networks_add(config, vm, network, macaddr, model, restart):
def vm_networks_add(config, vm, network, macaddr, model, sriov, sriov_mode, restart):
"""
Add a new network to the VM
@ -491,19 +629,21 @@ def vm_networks_add(config, vm, network, macaddr, model, restart):
from lxml.objectify import fromstring
from lxml.etree import tostring
from random import randint
import cli_lib.network as pvc_network
import pvc.cli_lib.network as pvc_network
# Verify that the provided network is valid
retcode, retdata = pvc_network.net_info(config, network)
if not retcode:
# Ignore the three special networks
if network not in ['upstream', 'cluster', 'storage']:
return False, "Network {} is not present in the cluster.".format(network)
# Verify that the provided network is valid (not in SR-IOV mode)
if not sriov:
retcode, retdata = pvc_network.net_info(config, network)
if not retcode:
# Ignore the three special networks
if network not in ['upstream', 'cluster', 'storage']:
return False, "Network {} is not present in the cluster.".format(network)
if network in ['upstream', 'cluster', 'storage']:
br_prefix = 'br'
else:
br_prefix = 'vmbr'
# Set the bridge prefix
if network in ['upstream', 'cluster', 'storage']:
br_prefix = 'br'
else:
br_prefix = 'vmbr'
status, domain_information = vm_info(config, vm)
if not status:
@ -530,24 +670,74 @@ def vm_networks_add(config, vm, network, macaddr, model, restart):
octetC=random_octet_C
)
device_string = '<interface type="bridge"><mac address="{macaddr}"/><source bridge="{bridge}"/><model type="{model}"/></interface>'.format(
macaddr=macaddr,
bridge="{}{}".format(br_prefix, network),
model=model
)
# Add an SR-IOV network
if sriov:
valid, sriov_vf_information = pvc_network.net_sriov_vf_info(config, domain_information['node'], network)
if not valid:
return False, 'Specified SR-IOV VF "{}" does not exist on VM node "{}".'.format(network, domain_information['node'])
# Add a hostdev (direct PCIe) SR-IOV network
if sriov_mode == 'hostdev':
bus_address = 'domain="0x{pci_domain}" bus="0x{pci_bus}" slot="0x{pci_slot}" function="0x{pci_function}"'.format(
pci_domain=sriov_vf_information['pci']['domain'],
pci_bus=sriov_vf_information['pci']['bus'],
pci_slot=sriov_vf_information['pci']['slot'],
pci_function=sriov_vf_information['pci']['function'],
)
device_string = '<interface type="hostdev" managed="yes"><mac address="{macaddr}"/><source><address type="pci" {bus_address}/></source><sriov_device>{network}</sriov_device></interface>'.format(
macaddr=macaddr,
bus_address=bus_address,
network=network
)
# Add a macvtap SR-IOV network
elif sriov_mode == 'macvtap':
device_string = '<interface type="direct"><mac address="{macaddr}"/><source dev="{network}" mode="passthrough"/><model type="{model}"/></interface>'.format(
macaddr=macaddr,
network=network,
model=model
)
else:
return False, "ERROR: Invalid SR-IOV mode specified."
# Add a normal bridged PVC network
else:
device_string = '<interface type="bridge"><mac address="{macaddr}"/><source bridge="{bridge}"/><model type="{model}"/></interface>'.format(
macaddr=macaddr,
bridge="{}{}".format(br_prefix, network),
model=model
)
device_xml = fromstring(device_string)
last_interface = None
all_interfaces = parsed_xml.devices.find('interface')
if all_interfaces is None:
all_interfaces = []
for interface in all_interfaces:
last_interface = re.match(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1)
if last_interface == network:
return False, 'Network {} is already configured for VM {}.'.format(network, vm)
if last_interface is not None:
for interface in parsed_xml.devices.find('interface'):
if last_interface == re.match(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1):
if sriov:
if sriov_mode == 'hostdev':
if interface.attrib.get('type') == 'hostdev':
interface_address = 'domain="{pci_domain}" bus="{pci_bus}" slot="{pci_slot}" function="{pci_function}"'.format(
pci_domain=interface.source.address.attrib.get('domain'),
pci_bus=interface.source.address.attrib.get('bus'),
pci_slot=interface.source.address.attrib.get('slot'),
pci_function=interface.source.address.attrib.get('function')
)
if interface_address == bus_address:
return False, 'Network "{}" is already configured for VM "{}".'.format(network, vm)
elif sriov_mode == 'macvtap':
if interface.attrib.get('type') == 'direct':
interface_dev = interface.source.attrib.get('dev')
if interface_dev == network:
return False, 'Network "{}" is already configured for VM "{}".'.format(network, vm)
else:
if interface.attrib.get('type') == 'bridge':
interface_vni = re.match(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1)
if interface_vni == network:
return False, 'Network "{}" is already configured for VM "{}".'.format(network, vm)
# Add the interface at the end of the list (or, right above emulator)
if len(all_interfaces) > 0:
for idx, interface in enumerate(parsed_xml.devices.find('interface')):
if idx == len(all_interfaces) - 1:
interface.addnext(device_xml)
else:
parsed_xml.devices.find('emulator').addprevious(device_xml)
@ -560,7 +750,7 @@ def vm_networks_add(config, vm, network, macaddr, model, restart):
return vm_modify(config, vm, new_xml, restart)
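For illustration, substituting hypothetical values (MAC 52:54:00:12:34:56, VF ens1f0v3, model virtio) into the macvtap branch above produces a device_string of roughly:

<interface type="direct"><mac address="52:54:00:12:34:56"/><source dev="ens1f0v3" mode="passthrough"/><model type="virtio"/></interface>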
def vm_networks_remove(config, vm, network, restart):
def vm_networks_remove(config, vm, network, sriov, restart):
"""
Remove a network from the VM
@ -584,17 +774,33 @@ def vm_networks_remove(config, vm, network, restart):
except Exception:
return False, 'ERROR: Failed to parse XML data.'
changed = False
for interface in parsed_xml.devices.find('interface'):
if_vni = re.match(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1)
if network == if_vni:
interface.getparent().remove(interface)
if sriov:
if interface.attrib.get('type') == 'hostdev':
if_dev = str(interface.sriov_device)
if network == if_dev:
interface.getparent().remove(interface)
changed = True
elif interface.attrib.get('type') == 'direct':
if_dev = str(interface.source.attrib.get('dev'))
if network == if_dev:
interface.getparent().remove(interface)
changed = True
else:
if_vni = re.match(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1)
if network == if_vni:
interface.getparent().remove(interface)
changed = True
if changed:
try:
new_xml = tostring(parsed_xml, pretty_print=True)
except Exception:
return False, 'ERROR: Failed to dump XML data.'
try:
new_xml = tostring(parsed_xml, pretty_print=True)
except Exception:
return False, 'ERROR: Failed to dump XML data.'
return vm_modify(config, vm, new_xml, restart)
return vm_modify(config, vm, new_xml, restart)
else:
return False, 'ERROR: Network "{}" does not exist on VM.'.format(network)
def vm_networks_get(config, vm):
@ -624,7 +830,14 @@ def vm_networks_get(config, vm):
for interface in parsed_xml.devices.find('interface'):
mac_address = interface.mac.attrib.get('address')
model = interface.model.attrib.get('type')
network = re.match(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1)
interface_type = interface.attrib.get('type')
if interface_type == 'bridge':
network = re.search(r'[vm]*br([0-9a-z]+)', interface.source.attrib.get('bridge')).group(1)
elif interface_type == 'direct':
network = 'macvtap:{}'.format(interface.source.attrib.get('dev'))
elif interface_type == 'hostdev':
network = 'hostdev:{}'.format(interface.source.attrib.get('dev'))
network_data.append((network, mac_address, model))
return True, network_data
@ -711,7 +924,7 @@ def vm_volumes_add(config, vm, volume, disk_id, bus, disk_type, restart):
from lxml.objectify import fromstring
from lxml.etree import tostring
from copy import deepcopy
import cli_lib.ceph as pvc_ceph
import pvc.cli_lib.ceph as pvc_ceph
if disk_type == 'rbd':
# Verify that the provided volume is valid
@ -823,7 +1036,7 @@ def vm_volumes_remove(config, vm, volume, restart):
xml = domain_information.get('xml', None)
if xml is None:
return False, "VM does not have a valid XML doccument."
return False, "VM does not have a valid XML document."
try:
parsed_xml = fromstring(xml)
@ -1002,8 +1215,9 @@ def follow_console_log(config, vm, lines=10):
API arguments: lines={lines}
API schema: {"name":"{vmname}","data":"{console_log}"}
"""
# We always grab 200 to match the follow call, but only _show_ the last `lines` of them
params = {
'lines': lines
'lines': 200
}
response = call_api(config, 'get', '/vm/{vm}/console'.format(vm=vm), params=params)
@ -1012,17 +1226,17 @@ def follow_console_log(config, vm, lines=10):
# Shrink the log buffer to length lines
console_log = response.json()['data']
shrunk_log = console_log.split('\n')[-lines:]
shrunk_log = console_log.split('\n')[-int(lines):]
loglines = '\n'.join(shrunk_log)
# Print the initial data and begin following
print(loglines, end='')
while True:
# Grab the next line set (500 is a reasonable number of lines per second; any more are skipped)
# Grab the next line set (200 is a reasonable number of lines per half-second; any more are skipped)
try:
params = {
'lines': 500
'lines': 200
}
response = call_api(config, 'get', '/vm/{vm}/console'.format(vm=vm), params=params)
new_console_log = response.json()['data']
@ -1031,8 +1245,10 @@ def follow_console_log(config, vm, lines=10):
# Split the new and old log strings into constituent lines
old_console_loglines = console_log.split('\n')
new_console_loglines = new_console_log.split('\n')
# Set the console log to the new log value for the next iteration
console_log = new_console_log
# Remove the lines from the old log until we hit the first line of the new log; this
# ensures that the old log is a string that we can remove from the new log entirely
for index, line in enumerate(old_console_loglines, start=0):
@ -1047,8 +1263,8 @@ def follow_console_log(config, vm, lines=10):
# If there's a difference, print it out
if diff_console_log:
print(diff_console_log, end='')
# Wait a second
time.sleep(1)
# Wait half a second
time.sleep(0.5)
return True, ''
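A minimal sketch of the buffer-trimming step described in the comments above, assuming the old and new logs are newline-joined strings as in this function (names mirror the code; the body is illustrative, not the exact implementation):

# Drop old lines preceding the first line of the new buffer, then strip the
# remaining overlap from the new buffer so only genuinely new lines are printed
for index, line in enumerate(old_console_loglines, start=0):
    if line == new_console_loglines[0]:
        old_console_loglines = old_console_loglines[index:]
        break
diff_console_log = new_console_log.replace('\n'.join(old_console_loglines), '')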
@ -1062,8 +1278,8 @@ def format_info(config, domain_information, long_output):
ainformation.append('{}Virtual machine information:{}'.format(ansiprint.bold(), ansiprint.end()))
ainformation.append('')
# Basic information
ainformation.append('{}UUID:{} {}'.format(ansiprint.purple(), ansiprint.end(), domain_information['uuid']))
ainformation.append('{}Name:{} {}'.format(ansiprint.purple(), ansiprint.end(), domain_information['name']))
ainformation.append('{}UUID:{} {}'.format(ansiprint.purple(), ansiprint.end(), domain_information['uuid']))
ainformation.append('{}Description:{} {}'.format(ansiprint.purple(), ansiprint.end(), domain_information['description']))
ainformation.append('{}Profile:{} {}'.format(ansiprint.purple(), ansiprint.end(), domain_information['profile']))
ainformation.append('{}Memory (M):{} {}'.format(ansiprint.purple(), ansiprint.end(), domain_information['memory']))
@ -1151,19 +1367,64 @@ def format_info(config, domain_information, long_output):
ainformation.append('{}Autostart:{} {}'.format(ansiprint.purple(), ansiprint.end(), formatted_node_autostart))
ainformation.append('{}Migration Method:{} {}'.format(ansiprint.purple(), ansiprint.end(), formatted_migration_method))
# Tag list
tags_name_length = 5
tags_type_length = 5
tags_protected_length = 10
for tag in domain_information['tags']:
_tags_name_length = len(tag['name']) + 1
if _tags_name_length > tags_name_length:
tags_name_length = _tags_name_length
_tags_type_length = len(tag['type']) + 1
if _tags_type_length > tags_type_length:
tags_type_length = _tags_type_length
_tags_protected_length = len(str(tag['protected'])) + 1
if _tags_protected_length > tags_protected_length:
tags_protected_length = _tags_protected_length
if len(domain_information['tags']) > 0:
ainformation.append('')
ainformation.append('{purple}Tags:{end} {bold}{tags_name: <{tags_name_length}} {tags_type: <{tags_type_length}} {tags_protected: <{tags_protected_length}}{end}'.format(
purple=ansiprint.purple(),
bold=ansiprint.bold(),
end=ansiprint.end(),
tags_name_length=tags_name_length,
tags_type_length=tags_type_length,
tags_protected_length=tags_protected_length,
tags_name='Name',
tags_type='Type',
tags_protected='Protected'
))
for tag in sorted(domain_information['tags'], key=lambda t: t['type'] + t['name']):
ainformation.append(' {tags_name: <{tags_name_length}} {tags_type: <{tags_type_length}} {tags_protected: <{tags_protected_length}}'.format(
tags_name_length=tags_name_length,
tags_type_length=tags_type_length,
tags_protected_length=tags_protected_length,
tags_name=tag['name'],
tags_type=tag['type'],
tags_protected=str(tag['protected'])
))
else:
ainformation.append('')
ainformation.append('{purple}Tags:{end} N/A'.format(
purple=ansiprint.purple(),
bold=ansiprint.bold(),
end=ansiprint.end(),
))
# Network list
net_list = []
cluster_net_list = call_api(config, 'get', '/network').json()
for net in domain_information['networks']:
# Split out just the numerical (VNI) part of the brXXXX name
net_vnis = re.findall(r'\d+', net['source'])
if net_vnis:
net_vni = net_vnis[0]
else:
net_vni = re.sub('br', '', net['source'])
response = call_api(config, 'get', '/network/{net}'.format(net=net_vni))
if response.status_code != 200 and net_vni not in ['cluster', 'storage', 'upstream']:
net_list.append(ansiprint.red() + net_vni + ansiprint.end() + ' [invalid]')
net_vni = net['vni']
if net_vni not in ['cluster', 'storage', 'upstream'] and not re.match(r'^macvtap:.*', net_vni) and not re.match(r'^hostdev:.*', net_vni):
if int(net_vni) not in [net['vni'] for net in cluster_net_list]:
net_list.append(ansiprint.red() + net_vni + ansiprint.end() + ' [invalid]')
else:
net_list.append(net_vni)
else:
net_list.append(net_vni)
@ -1191,17 +1452,31 @@ def format_info(config, domain_information, long_output):
width=name_length
))
ainformation.append('')
ainformation.append('{}Interfaces:{} {}ID Type Source Model MAC Data (r/w) Packets (r/w) Errors (r/w){}'.format(ansiprint.purple(), ansiprint.end(), ansiprint.bold(), ansiprint.end()))
ainformation.append('{}Interfaces:{} {}ID Type Source Model MAC Data (r/w) Packets (r/w) Errors (r/w){}'.format(ansiprint.purple(), ansiprint.end(), ansiprint.bold(), ansiprint.end()))
for net in domain_information['networks']:
ainformation.append(' {0: <3} {1: <7} {2: <10} {3: <8} {4: <18} {5: <12} {6: <15} {7: <12}'.format(
net_type = net['type']
net_source = net['source']
net_mac = net['mac']
if net_type in ['direct', 'hostdev']:
net_model = 'N/A'
net_bytes = 'N/A'
net_packets = 'N/A'
net_errors = 'N/A'
elif net_type in ['bridge']:
net_model = net['model']
net_bytes = '/'.join([str(format_bytes(net.get('rd_bytes', 0))), str(format_bytes(net.get('wr_bytes', 0)))])
net_packets = '/'.join([str(format_metric(net.get('rd_packets', 0))), str(format_metric(net.get('wr_packets', 0)))])
net_errors = '/'.join([str(format_metric(net.get('rd_errors', 0))), str(format_metric(net.get('wr_errors', 0)))])
ainformation.append(' {0: <3} {1: <8} {2: <12} {3: <8} {4: <18} {5: <12} {6: <15} {7: <12}'.format(
domain_information['networks'].index(net),
net['type'],
net['source'],
net['model'],
net['mac'],
'/'.join([str(format_bytes(net.get('rd_bytes', 0))), str(format_bytes(net.get('wr_bytes', 0)))]),
'/'.join([str(format_metric(net.get('rd_packets', 0))), str(format_metric(net.get('wr_packets', 0)))]),
'/'.join([str(format_metric(net.get('rd_errors', 0))), str(format_metric(net.get('wr_errors', 0)))]),
net_type,
net_source,
net_model,
net_mac,
net_bytes,
net_packets,
net_errors
))
# Controller list
ainformation.append('')
@ -1220,15 +1495,17 @@ def format_list(config, vm_list, raw):
# Network list
net_list = []
for net in domain_information['networks']:
# Split out just the numerical (VNI) part of the brXXXX name
net_vnis = re.findall(r'\d+', net['source'])
if net_vnis:
net_vni = net_vnis[0]
else:
net_vni = re.sub('br', '', net['source'])
net_list.append(net_vni)
net_list.append(net['vni'])
return net_list
# Function to get tag names and return a nicer list
def getNiceTagName(domain_information):
# Tag list
tag_list = []
for tag in sorted(domain_information['tags'], key=lambda t: t['type'] + t['name']):
tag_list.append(tag['name'])
return tag_list
# Handle raw mode since it just lists the names
if raw:
ainformation = list()
@ -1241,15 +1518,16 @@ def format_list(config, vm_list, raw):
# Determine optimal column widths
# Dynamic columns: node_name, node, migrated
vm_name_length = 5
vm_uuid_length = 37
vm_state_length = 6
vm_tags_length = 5
vm_nets_length = 9
vm_ram_length = 8
vm_vcpu_length = 6
vm_node_length = 8
vm_migrated_length = 10
vm_migrated_length = 9
for domain_information in vm_list:
net_list = getNiceNetID(domain_information)
tag_list = getNiceTagName(domain_information)
# vm_name column
_vm_name_length = len(domain_information['name']) + 1
if _vm_name_length > vm_name_length:
@ -1258,6 +1536,10 @@ def format_list(config, vm_list, raw):
_vm_state_length = len(domain_information['state']) + 1
if _vm_state_length > vm_state_length:
vm_state_length = _vm_state_length
# vm_tags column
_vm_tags_length = len(','.join(tag_list)) + 1
if _vm_tags_length > vm_tags_length:
vm_tags_length = _vm_tags_length
# vm_nets column
_vm_nets_length = len(','.join(net_list)) + 1
if _vm_nets_length > vm_nets_length:
@ -1273,15 +1555,29 @@ def format_list(config, vm_list, raw):
# Format the string (header)
vm_list_output.append(
'{bold}{vm_name: <{vm_name_length}} {vm_uuid: <{vm_uuid_length}} \
'{bold}{vm_header: <{vm_header_length}} {resource_header: <{resource_header_length}} {node_header: <{node_header_length}}{end_bold}'.format(
vm_header_length=vm_name_length + vm_state_length + vm_tags_length + 2,
resource_header_length=vm_nets_length + vm_ram_length + vm_vcpu_length + 2,
node_header_length=vm_node_length + vm_migrated_length + 1,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
vm_header='VMs ' + ''.join(['-' for _ in range(4, vm_name_length + vm_state_length + vm_tags_length + 1)]),
resource_header='Resources ' + ''.join(['-' for _ in range(10, vm_nets_length + vm_ram_length + vm_vcpu_length + 1)]),
node_header='Node ' + ''.join(['-' for _ in range(5, vm_node_length + vm_migrated_length)])
)
)
vm_list_output.append(
'{bold}{vm_name: <{vm_name_length}} \
{vm_state_colour}{vm_state: <{vm_state_length}}{end_colour} \
{vm_tags: <{vm_tags_length}} \
{vm_networks: <{vm_nets_length}} \
{vm_memory: <{vm_ram_length}} {vm_vcpu: <{vm_vcpu_length}} \
{vm_node: <{vm_node_length}} \
{vm_migrated: <{vm_migrated_length}}{end_bold}'.format(
vm_name_length=vm_name_length,
vm_uuid_length=vm_uuid_length,
vm_state_length=vm_state_length,
vm_tags_length=vm_tags_length,
vm_nets_length=vm_nets_length,
vm_ram_length=vm_ram_length,
vm_vcpu_length=vm_vcpu_length,
@ -1292,20 +1588,21 @@ def format_list(config, vm_list, raw):
vm_state_colour='',
end_colour='',
vm_name='Name',
vm_uuid='UUID',
vm_state='State',
vm_tags='Tags',
vm_networks='Networks',
vm_memory='RAM (M)',
vm_vcpu='vCPUs',
vm_node='Node',
vm_node='Current',
vm_migrated='Migrated'
)
)
# Keep track of nets we found to be valid to cut down on duplicate API hits
valid_net_list = []
# Get a list of cluster networks for validity comparisons
cluster_net_list = call_api(config, 'get', '/network').json()
# Format the string (elements)
for domain_information in vm_list:
for domain_information in sorted(vm_list, key=lambda v: v['name']):
if domain_information['state'] == 'start':
vm_state_colour = ansiprint.green()
elif domain_information['state'] == 'restart':
@ -1320,29 +1617,27 @@ def format_list(config, vm_list, raw):
vm_state_colour = ansiprint.blue()
# Handle colouring for an invalid network config
raw_net_list = getNiceNetID(domain_information)
net_list = []
net_list = getNiceNetID(domain_information)
tag_list = getNiceTagName(domain_information)
if len(tag_list) < 1:
tag_list = ['N/A']
vm_net_colour = ''
for net_vni in raw_net_list:
if net_vni not in valid_net_list:
response = call_api(config, 'get', '/network/{net}'.format(net=net_vni))
if response.status_code != 200 and net_vni not in ['cluster', 'storage', 'upstream']:
for net_vni in net_list:
if net_vni not in ['cluster', 'storage', 'upstream'] and not re.match(r'^macvtap:.*', net_vni) and not re.match(r'^hostdev:.*', net_vni):
if int(net_vni) not in [net['vni'] for net in cluster_net_list]:
vm_net_colour = ansiprint.red()
else:
valid_net_list.append(net_vni)
net_list.append(net_vni)
vm_list_output.append(
'{bold}{vm_name: <{vm_name_length}} {vm_uuid: <{vm_uuid_length}} \
'{bold}{vm_name: <{vm_name_length}} \
{vm_state_colour}{vm_state: <{vm_state_length}}{end_colour} \
{vm_tags: <{vm_tags_length}} \
{vm_net_colour}{vm_networks: <{vm_nets_length}}{end_colour} \
{vm_memory: <{vm_ram_length}} {vm_vcpu: <{vm_vcpu_length}} \
{vm_node: <{vm_node_length}} \
{vm_migrated: <{vm_migrated_length}}{end_bold}'.format(
vm_name_length=vm_name_length,
vm_uuid_length=vm_uuid_length,
vm_state_length=vm_state_length,
vm_tags_length=vm_tags_length,
vm_nets_length=vm_nets_length,
vm_ram_length=vm_ram_length,
vm_vcpu_length=vm_vcpu_length,
@ -1353,8 +1648,8 @@ def format_list(config, vm_list, raw):
vm_state_colour=vm_state_colour,
end_colour=ansiprint.end(),
vm_name=domain_information['name'],
vm_uuid=domain_information['uuid'],
vm_state=domain_information['state'],
vm_tags=','.join(tag_list),
vm_net_colour=vm_net_colour,
vm_networks=','.join(net_list),
vm_memory=domain_information['memory'],
@ -1364,4 +1659,4 @@ def format_list(config, vm_list, raw):
)
)
return '\n'.join(sorted(vm_list_output))
return '\n'.join(vm_list_output)


@ -34,16 +34,17 @@ from distutils.util import strtobool
from functools import wraps
import cli_lib.ansiprint as ansiprint
import cli_lib.cluster as pvc_cluster
import cli_lib.node as pvc_node
import cli_lib.vm as pvc_vm
import cli_lib.network as pvc_network
import cli_lib.ceph as pvc_ceph
import cli_lib.provisioner as pvc_provisioner
import pvc.cli_lib.ansiprint as ansiprint
import pvc.cli_lib.cluster as pvc_cluster
import pvc.cli_lib.node as pvc_node
import pvc.cli_lib.vm as pvc_vm
import pvc.cli_lib.network as pvc_network
import pvc.cli_lib.ceph as pvc_ceph
import pvc.cli_lib.provisioner as pvc_provisioner
myhostname = socket.gethostname().split('.')[0]
zk_host = ''
is_completion = True if os.environ.get('_PVC_COMPLETE', '') == 'complete' else False
default_store_data = {
'cfgfile': '/etc/pvc/pvcapid.yaml'
@ -133,32 +134,31 @@ def update_store(store_path, store_data):
fh.write(json.dumps(store_data, sort_keys=True, indent=4))
pvc_client_dir = os.environ.get('PVC_CLIENT_DIR', None)
home_dir = os.environ.get('HOME', None)
if pvc_client_dir:
store_path = '{}'.format(pvc_client_dir)
elif home_dir:
store_path = '{}/.config/pvc'.format(home_dir)
else:
print('WARNING: No client or home config dir found, using /tmp instead')
store_path = '/tmp/pvc'
if not is_completion:
pvc_client_dir = os.environ.get('PVC_CLIENT_DIR', None)
home_dir = os.environ.get('HOME', None)
if pvc_client_dir:
store_path = '{}'.format(pvc_client_dir)
elif home_dir:
store_path = '{}/.config/pvc'.format(home_dir)
else:
print('WARNING: No client or home config dir found, using /tmp instead')
store_path = '/tmp/pvc'
if not os.path.isdir(store_path):
os.makedirs(store_path)
if not os.path.isfile(store_path + '/pvc-cli.json'):
update_store(store_path, {"local": default_store_data})
if not os.path.isdir(store_path):
os.makedirs(store_path)
if not os.path.isfile(store_path + '/pvc-cli.json'):
update_store(store_path, {"local": default_store_data})
CONTEXT_SETTINGS = dict(help_option_names=['-h', '--help'], max_content_width=120)
def cleanup(retcode, retmsg):
if retmsg != '':
click.echo(retmsg)
if retcode is True:
if retmsg != '':
click.echo(retmsg)
exit(0)
else:
if retmsg != '':
click.echo(retmsg)
exit(1)
@ -251,7 +251,11 @@ def cluster_remove(name):
# pvc cluster list
###############################################################################
@click.command(name='list', short_help='List all available clusters.')
def cluster_list():
@click.option(
'-r', '--raw', 'raw', is_flag=True, default=False,
help='Display the raw list of cluster names only.'
)
def cluster_list(raw):
"""
List all the available PVC clusters configured in this CLI instance.
"""
@ -302,27 +306,28 @@ def cluster_list():
if _api_key_length > api_key_length:
api_key_length = _api_key_length
# Display the data nicely
click.echo("Available clusters:")
click.echo()
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
name="Name",
name_length=name_length,
description="Description",
description_length=description_length,
address="Address",
address_length=address_length,
port="Port",
port_length=port_length,
scheme="Scheme",
scheme_length=scheme_length,
api_key="API Key",
api_key_length=api_key_length
if not raw:
# Display the data nicely
click.echo("Available clusters:")
click.echo()
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
name="Name",
name_length=name_length,
description="Description",
description_length=description_length,
address="Address",
address_length=address_length,
port="Port",
port_length=port_length,
scheme="Scheme",
scheme_length=scheme_length,
api_key="API Key",
api_key_length=api_key_length
)
)
)
for cluster in clusters:
cluster_details = clusters[cluster]
@ -341,24 +346,27 @@ def cluster_list():
if not api_key:
api_key = 'N/A'
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold='',
end_bold='',
name=cluster,
name_length=name_length,
description=description,
description_length=description_length,
address=address,
address_length=address_length,
port=port,
port_length=port_length,
scheme=scheme,
scheme_length=scheme_length,
api_key=api_key,
api_key_length=api_key_length
if not raw:
click.echo(
'{bold}{name: <{name_length}} {description: <{description_length}} {address: <{address_length}} {port: <{port_length}} {scheme: <{scheme_length}} {api_key: <{api_key_length}}{end_bold}'.format(
bold='',
end_bold='',
name=cluster,
name_length=name_length,
description=description,
description_length=description_length,
address=address,
address_length=address_length,
port=port,
port_length=port_length,
scheme=scheme,
scheme_length=scheme_length,
api_key=api_key,
api_key_length=api_key_length
)
)
)
else:
click.echo(cluster)
# Validate that the cluster is set for a given command
@ -532,6 +540,43 @@ def node_unflush(node, wait):
cleanup(retcode, retmsg)
###############################################################################
# pvc node log
###############################################################################
@click.command(name='log', short_help='Show logs of a node.')
@click.argument(
'node'
)
@click.option(
'-l', '--lines', 'lines', default=None, show_default=False,
help='Display this many log lines from the end of the log buffer. [default: 1000; with follow: 10]'
)
@click.option(
'-f', '--follow', 'follow', is_flag=True, default=False,
help='Follow the log buffer; output may be delayed by a few seconds relative to the live system. The --lines value defaults to 10 for the initial output.'
)
@cluster_req
def node_log(node, lines, follow):
"""
Show daemon logs of NODE in a pager, or follow them continuously with --follow.
"""
# Set the default here so we can handle it
if lines is None:
if follow:
lines = 10
else:
lines = 1000
if follow:
retcode, retmsg = pvc_node.follow_node_log(config, node, lines)
else:
retcode, retmsg = pvc_node.view_node_log(config, node, lines)
click.echo_via_pager(retmsg)
retmsg = ''
cleanup(retcode, retmsg)
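For example (the node name is a placeholder), the new command supports both a one-shot pager view and continuous following:

pvc node log hv1 --lines 200
pvc node log hv1 --follow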
###############################################################################
# pvc node info
###############################################################################
@ -618,8 +663,8 @@ def cli_vm():
)
@click.option(
'-s', '--selector', 'node_selector', default='mem', show_default=True,
type=click.Choice(['mem', 'load', 'vcpus', 'vms']),
help='Method to determine optimal target node during autoselect; saved with VM.'
type=click.Choice(['mem', 'load', 'vcpus', 'vms', 'none']),
help='Method to determine optimal target node during autoselect; "none" will use the default for the cluster.'
)
@click.option(
'-a/-A', '--autostart/--no-autostart', 'node_autostart', is_flag=True, default=False,
@ -630,11 +675,21 @@ def cli_vm():
type=click.Choice(['none', 'live', 'shutdown']),
help='The preferred migration method of the VM between nodes; saved with VM.'
)
@click.option(
'-g', '--tag', 'user_tags',
default=[], multiple=True,
help='User tag for the VM; can be specified multiple times, once per tag.'
)
@click.option(
'-G', '--protected-tag', 'protected_tags',
default=[], multiple=True,
help='Protected user tag for the VM; can be specified multiple times, once per tag.'
)
@click.argument(
'vmconfig', type=click.File()
)
@cluster_req
def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart, migration_method):
def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart, migration_method, user_tags, protected_tags):
"""
Define a new virtual machine from Libvirt XML configuration file VMCONFIG.
"""
@ -650,7 +705,7 @@ def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart,
except Exception:
cleanup(False, 'Error: XML is malformed or invalid')
retcode, retmsg = pvc_vm.vm_define(config, new_cfg, target_node, node_limit, node_selector, node_autostart, migration_method)
retcode, retmsg = pvc_vm.vm_define(config, new_cfg, target_node, node_limit, node_selector, node_autostart, migration_method, user_tags, protected_tags)
cleanup(retcode, retmsg)
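A hypothetical invocation combining the new tag options (file and tag names are placeholders):

pvc vm define ./testvm.xml --tag web --tag backup --protected-tag production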
@ -664,8 +719,8 @@ def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart,
)
@click.option(
'-s', '--selector', 'node_selector', default=None, show_default=False,
type=click.Choice(['mem', 'load', 'vcpus', 'vms']),
help='Method to determine optimal target node during autoselect.'
type=click.Choice(['mem', 'load', 'vcpus', 'vms', 'none']),
help='Method to determine optimal target node during autoselect; "none" will use the default for the cluster.'
)
@click.option(
'-a/-A', '--autostart/--no-autostart', 'node_autostart', is_flag=True, default=None,
@ -674,7 +729,7 @@ def vm_define(vmconfig, target_node, node_limit, node_selector, node_autostart,
@click.option(
'-m', '--method', 'migration_method', default='none', show_default=True,
type=click.Choice(['none', 'live', 'shutdown']),
help='The preferred migration method of the VM between nodes; saved with VM.'
help='The preferred migration method of the VM between nodes.'
)
@click.option(
'-p', '--profile', 'provisioner_profile', default=None, show_default=False,
@ -728,17 +783,17 @@ def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
cleanup(False, 'Either an XML config file or the "--editor" option must be specified.')
retcode, vm_information = pvc_vm.vm_info(config, domain)
if not retcode and not vm_information.get('name', None):
if not retcode or not vm_information.get('name', None):
cleanup(False, 'ERROR: Could not find VM "{}"!'.format(domain))
dom_name = vm_information.get('name')
if editor is True:
# Grab the current config
current_vm_cfg_raw = vm_information.get('xml')
xml_data = etree.fromstring(current_vm_cfg_raw)
current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8').strip()
# Grab the current config
current_vm_cfg_raw = vm_information.get('xml')
xml_data = etree.fromstring(current_vm_cfg_raw)
current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8').strip()
if editor is True:
new_vm_cfgfile = click.edit(text=current_vm_cfgfile, require_save=True, extension='.xml')
if new_vm_cfgfile is None:
click.echo('Aborting with no modifications.')
@ -790,6 +845,36 @@ def vm_modify(domain, cfgfile, editor, restart, confirm_flag):
cleanup(retcode, retmsg)
###############################################################################
# pvc vm rename
###############################################################################
@click.command(name='rename', short_help='Rename a virtual machine.')
@click.argument(
'domain'
)
@click.argument(
'new_name'
)
@click.option(
'-y', '--yes', 'confirm_flag',
is_flag=True, default=False,
help='Confirm the rename'
)
@cluster_req
def vm_rename(domain, new_name, confirm_flag):
"""
Rename virtual machine DOMAIN, and all its connected disk volumes, to NEW_NAME. DOMAIN may be a UUID or name.
"""
if not confirm_flag and not config['unsafe']:
try:
click.confirm('Rename VM {} to {}'.format(domain, new_name), prompt_suffix='? ', abort=True)
except Exception:
exit(0)
retcode, retmsg = pvc_vm.vm_rename(config, domain, new_name)
cleanup(retcode, retmsg)
###############################################################################
# pvc vm undefine
###############################################################################
@ -1073,6 +1158,90 @@ def vm_flush_locks(domain):
cleanup(retcode, retmsg)
###############################################################################
# pvc vm tag
###############################################################################
@click.group(name='tag', short_help='Manage tags of a virtual machine.', context_settings=CONTEXT_SETTINGS)
def vm_tags():
"""
Manage the tags of a virtual machine in the PVC cluster.
"""
pass
###############################################################################
# pvc vm tag get
###############################################################################
@click.command(name='get', short_help='Get the current tags of a virtual machine.')
@click.argument(
'domain'
)
@click.option(
'-r', '--raw', 'raw', is_flag=True, default=False,
help='Display the raw value only without formatting.'
)
@cluster_req
def vm_tags_get(domain, raw):
"""
Get the current tags of the virtual machine DOMAIN.
"""
retcode, retdata = pvc_vm.vm_tags_get(config, domain)
if retcode:
if not raw:
retdata = pvc_vm.format_vm_tags(config, domain, retdata['tags'])
else:
if len(retdata['tags']) > 0:
retdata = '\n'.join([tag['name'] for tag in retdata['tags']])
else:
retdata = 'No tags found.'
cleanup(retcode, retdata)
###############################################################################
# pvc vm tag add
###############################################################################
@click.command(name='add', short_help='Add new tags to a virtual machine.')
@click.argument(
'domain'
)
@click.argument(
'tag'
)
@click.option(
'-p', '--protected', 'protected', is_flag=True, required=False, default=False,
help="Set this tag as protected; protected tags cannot be removed."
)
@cluster_req
def vm_tags_add(domain, tag, protected):
"""
Add TAG to the virtual machine DOMAIN.
"""
retcode, retmsg = pvc_vm.vm_tag_set(config, domain, 'add', tag, protected)
cleanup(retcode, retmsg)
###############################################################################
# pvc vm tag remove
###############################################################################
@click.command(name='remove', short_help='Remove tags from a virtual machine.')
@click.argument(
'domain'
)
@click.argument(
'tag'
)
@cluster_req
def vm_tags_remove(domain, tag):
"""
Remove TAG from the virtual machine DOMAIN.
"""
retcode, retmsg = pvc_vm.vm_tag_set(config, domain, 'remove', tag)
cleanup(retcode, retmsg)
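Taken together, a hypothetical tag workflow (VM and tag names are placeholders) might look like:

pvc vm tag add testvm web
pvc vm tag add testvm production --protected
pvc vm tag get testvm
pvc vm tag remove testvm web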
###############################################################################
# pvc vm vcpu
###############################################################################
@ -1281,15 +1450,24 @@ def vm_network_get(domain, raw):
'domain'
)
@click.argument(
'vni'
'net'
)
@click.option(
'-a', '--macaddr', 'macaddr', default=None,
help='Use this MAC address instead of random generation; must be a valid MAC address in colon-deliniated format.'
help='Use this MAC address instead of random generation; must be a valid MAC address in colon-delimited format.'
)
@click.option(
'-m', '--model', 'model', default='virtio',
help='The model for the interface; must be a valid libvirt model.'
'-m', '--model', 'model', default='virtio', show_default=True,
help='The model for the interface; must be a valid libvirt model. Not used for "hostdev" SR-IOV NETs.'
)
@click.option(
'-s', '--sriov', 'sriov', is_flag=True, default=False,
help='Identify that NET is an SR-IOV device name and not a VNI. Required for adding SR-IOV NETs.'
)
@click.option(
'-d', '--sriov-mode', 'sriov_mode', default='macvtap', show_default=True,
type=click.Choice(['hostdev', 'macvtap']),
help='For SR-IOV NETs, the SR-IOV network device mode.'
)
@click.option(
'-r', '--restart', 'restart', is_flag=True, default=False,
@ -1301,9 +1479,18 @@ def vm_network_get(domain, raw):
help='Confirm the restart'
)
@cluster_req
def vm_network_add(domain, vni, macaddr, model, restart, confirm_flag):
def vm_network_add(domain, net, macaddr, model, sriov, sriov_mode, restart, confirm_flag):
"""
Add the network VNI to the virtual machine DOMAIN. Networks are always addded to the end of the current list of networks in the virtual machine.
Add the network NET to the virtual machine DOMAIN. Networks are always added to the end of the current list of networks in the virtual machine.
NET may be a PVC network VNI, which is added as a bridged device, or an SR-IOV VF device connected in the given mode.
NOTE: Adding an SR-IOV network device in the "hostdev" mode has the following caveats:
1. The VM will not be able to be live migrated; it must be shut down to migrate between nodes. The VM metadata will be updated to force this.
2. If an identical SR-IOV VF device is not present on the target node, post-migration startup will fail. It may be prudent to use a node limit here.
"""
if restart and not confirm_flag and not config['unsafe']:
try:
@ -1311,7 +1498,7 @@ def vm_network_add(domain, vni, macaddr, model, restart, confirm_flag):
except Exception:
restart = False
retcode, retmsg = pvc_vm.vm_networks_add(config, domain, vni, macaddr, model, restart)
retcode, retmsg = pvc_vm.vm_networks_add(config, domain, net, macaddr, model, sriov, sriov_mode, restart)
if retcode and not restart:
retmsg = retmsg + " Changes will be applied on next VM start/restart."
cleanup(retcode, retmsg)
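For example (VM name, VNI, and VF device name are placeholders), a bridged attachment and an SR-IOV hostdev attachment might look like:

pvc vm network add testvm 1000 --restart
pvc vm network add testvm ens1f0v3 --sriov --sriov-mode hostdev --restart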
@ -1325,7 +1512,11 @@ def vm_network_add(domain, vni, macaddr, model, restart, confirm_flag):
'domain'
)
@click.argument(
'vni'
'net'
)
@click.option(
'-s', '--sriov', 'sriov', is_flag=True, default=False,
help='Identify that NET is an SR-IOV device name and not a VNI. Required for removing SR-IOV NETs.'
)
@click.option(
'-r', '--restart', 'restart', is_flag=True, default=False,
@ -1337,9 +1528,11 @@ def vm_network_add(domain, vni, macaddr, model, restart, confirm_flag):
help='Confirm the restart'
)
@cluster_req
def vm_network_remove(domain, vni, restart, confirm_flag):
def vm_network_remove(domain, net, sriov, restart, confirm_flag):
"""
Remove the network VNI to the virtual machine DOMAIN.
Remove the network NET from the virtual machine DOMAIN.
NET may be a PVC network VNI or an SR-IOV VF device name.
"""
if restart and not confirm_flag and not config['unsafe']:
try:
@ -1347,7 +1540,7 @@ def vm_network_remove(domain, vni, restart, confirm_flag):
except Exception:
restart = False
retcode, retmsg = pvc_vm.vm_networks_remove(config, domain, vni, restart)
retcode, retmsg = pvc_vm.vm_networks_remove(config, domain, net, sriov, restart)
if retcode and not restart:
retmsg = retmsg + " Changes will be applied on next VM start/restart."
cleanup(retcode, retmsg)
@ -1454,7 +1647,7 @@ def vm_volume_add(domain, volume, disk_id, bus, disk_type, restart, confirm_flag
'domain'
)
@click.argument(
'vni'
'volume'
)
@click.option(
'-r', '--restart', 'restart', is_flag=True, default=False,
@ -1466,9 +1659,9 @@ def vm_volume_add(domain, volume, disk_id, bus, disk_type, restart, confirm_flag
help='Confirm the restart'
)
@cluster_req
def vm_volume_remove(domain, vni, restart, confirm_flag):
def vm_volume_remove(domain, volume, restart, confirm_flag):
"""
Remove the volume VNI to the virtual machine DOMAIN.
Remove VOLUME from the virtual machine DOMAIN; VOLUME must be a file path or RBD path in 'pool/volume' format.
"""
if restart and not confirm_flag and not config['unsafe']:
try:
@ -1476,7 +1669,7 @@ def vm_volume_remove(domain, vni, restart, confirm_flag):
except Exception:
restart = False
retcode, retmsg = pvc_vm.vm_volumes_remove(config, domain, vni, restart)
retcode, retmsg = pvc_vm.vm_volumes_remove(config, domain, volume, restart)
if retcode and not restart:
retmsg = retmsg + " Changes will be applied on next VM start/restart."
cleanup(retcode, retmsg)
@ -1546,24 +1739,34 @@ def vm_info(domain, long_output):
# pvc vm dump
###############################################################################
@click.command(name='dump', short_help='Dump a virtual machine XML to stdout.')
@click.option(
'-f', '--file', 'filename',
default=None, type=click.File(mode='w'),
help='Write VM XML to this file.'
)
@click.argument(
'domain'
)
@cluster_req
def vm_dump(domain):
def vm_dump(filename, domain):
"""
Dump the Libvirt XML definition of virtual machine DOMAIN to stdout, or to a file with "--file". DOMAIN may be a UUID or name.
"""
retcode, vm_information = pvc_vm.vm_info(config, domain)
if not retcode and not vm_information.get('name', None):
retcode, retdata = pvc_vm.vm_info(config, domain)
if not retcode or not retdata.get('name', None):
cleanup(False, 'ERROR: Could not find VM "{}"!'.format(domain))
# Grab the current config
current_vm_cfg_raw = vm_information.get('xml')
current_vm_cfg_raw = retdata.get('xml')
xml_data = etree.fromstring(current_vm_cfg_raw)
current_vm_cfgfile = etree.tostring(xml_data, pretty_print=True).decode('utf8')
click.echo(current_vm_cfgfile.strip())
xml = current_vm_cfgfile.strip()
if filename is not None:
filename.write(xml)
cleanup(retcode, 'VM XML written to "{}".'.format(filename.name))
else:
cleanup(retcode, xml)
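For example (VM and file names are placeholders):

pvc vm dump testvm --file testvm.xml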
###############################################################################
@ -1581,19 +1784,23 @@ def vm_dump(domain):
'-s', '--state', 'target_state', default=None,
help='Limit list to VMs in the specified state.'
)
@click.option(
'-g', '--tag', 'target_tag', default=None,
help='Limit list to VMs with the specified tag.'
)
@click.option(
'-r', '--raw', 'raw', is_flag=True, default=False,
help='Display the raw list of VM names only.'
)
@cluster_req
def vm_list(target_node, target_state, limit, raw):
def vm_list(target_node, target_state, target_tag, limit, raw):
"""
List all virtual machines; optionally only match names matching regex LIMIT.
List all virtual machines; optionally only match names or full UUIDs matching regex LIMIT.
NOTE: Red-coloured network lists indicate one or more configured networks are missing/invalid.
"""
retcode, retdata = pvc_vm.vm_list(config, limit, target_node, target_state)
retcode, retdata = pvc_vm.vm_list(config, limit, target_node, target_state, target_tag)
if retcode:
retdata = pvc_vm.format_list(config, retdata, raw)
else:
@ -2063,6 +2270,154 @@ def net_acl_list(net, limit, direction):
cleanup(retcode, retdata)
###############################################################################
# pvc network sriov
###############################################################################
@click.group(name='sriov', short_help='Manage SR-IOV network resources.', context_settings=CONTEXT_SETTINGS)
def net_sriov():
"""
Manage SR-IOV network resources on nodes (PFs and VFs).
"""
pass
###############################################################################
# pvc network sriov pf
###############################################################################
@click.group(name='pf', short_help='Manage PF devices.', context_settings=CONTEXT_SETTINGS)
def net_sriov_pf():
"""
Manage SR-IOV PF devices on nodes.
"""
pass
###############################################################################
# pvc network sriov pf list
###############################################################################
@click.command(name='list', short_help='List PF devices.')
@click.argument(
'node'
)
@cluster_req
def net_sriov_pf_list(node):
"""
List all SR-IOV PFs on NODE.
"""
retcode, retdata = pvc_network.net_sriov_pf_list(config, node)
if retcode:
retdata = pvc_network.format_list_sriov_pf(retdata)
cleanup(retcode, retdata)
###############################################################################
# pvc network sriov vf
###############################################################################
@click.group(name='vf', short_help='Manage VF devices.', context_settings=CONTEXT_SETTINGS)
def net_sriov_vf():
"""
Manage SR-IOV VF devices on nodes.
"""
pass
###############################################################################
# pvc network sriov vf set
###############################################################################
@click.command(name='set', short_help='Set VF device properties.')
@click.option(
'--vlan-id', 'vlan_id', default=None, show_default=False,
help='The vLAN ID for vLAN tagging.'
)
@click.option(
'--qos-prio', 'vlan_qos', default=None, show_default=False,
help='The vLAN QOS priority.'
)
@click.option(
'--tx-min', 'tx_rate_min', default=None, show_default=False,
help='The minimum TX rate.'
)
@click.option(
'--tx-max', 'tx_rate_max', default=None, show_default=False,
help='The maximum TX rate.'
)
@click.option(
'--link-state', 'link_state', default=None, show_default=False,
type=click.Choice(['auto', 'enable', 'disable']),
help='The administrative link state.'
)
@click.option(
'--spoof-check/--no-spoof-check', 'spoof_check', is_flag=True, default=None, show_default=False,
help='Enable or disable spoof checking.'
)
@click.option(
'--trust/--no-trust', 'trust', is_flag=True, default=None, show_default=False,
help='Enable or disable VF user trust.'
)
@click.option(
'--query-rss/--no-query-rss', 'query_rss', is_flag=True, default=None, show_default=False,
help='Enable or disable query RSS support.'
)
@click.argument(
'node'
)
@click.argument(
'vf'
)
@cluster_req
def net_sriov_vf_set(node, vf, vlan_id, vlan_qos, tx_rate_min, tx_rate_max, link_state, spoof_check, trust, query_rss):
"""
Set a property of SR-IOV VF on NODE.
"""
if vlan_id is None and vlan_qos is None and tx_rate_min is None and tx_rate_max is None and link_state is None and spoof_check is None and trust is None and query_rss is None:
cleanup(False, 'At least one configuration property must be specified to update.')
retcode, retmsg = pvc_network.net_sriov_vf_set(config, node, vf, vlan_id, vlan_qos, tx_rate_min, tx_rate_max, link_state, spoof_check, trust, query_rss)
cleanup(retcode, retmsg)
###############################################################################
# pvc network sriov vf list
###############################################################################
@click.command(name='list', short_help='List VF devices.')
@click.argument(
'node'
)
@click.argument(
'pf', default=None, required=False
)
@cluster_req
def net_sriov_vf_list(node, pf):
"""
List all SR-IOV VFs on NODE, optionally limited to device PF.
"""
retcode, retdata = pvc_network.net_sriov_vf_list(config, node, pf)
if retcode:
retdata = pvc_network.format_list_sriov_vf(retdata)
cleanup(retcode, retdata)
###############################################################################
# pvc network sriov vf info
###############################################################################
@click.command(name='info', short_help='Show details of VF device.')
@click.argument(
'node'
)
@click.argument(
'vf'
)
@cluster_req
def net_sriov_vf_info(node, vf):
"""
Show details of the SR-IOV VF on NODE.
"""
retcode, retdata = pvc_network.net_sriov_vf_info(config, node, vf)
if retcode:
retdata = pvc_network.format_info_sriov_vf(config, retdata, node)
cleanup(retcode, retdata)
###############################################################################
# pvc storage
###############################################################################
@ -2865,9 +3220,9 @@ def provisioner_template_system_list(limit):
)
@click.option(
'--node-selector', 'node_selector',
type=click.Choice(['mem', 'vcpus', 'vms', 'load'], case_sensitive=False),
default=None, # Use cluster default
help='Use this selector to determine the optimal node during migrations.'
type=click.Choice(['mem', 'vcpus', 'vms', 'load', 'none'], case_sensitive=False),
default='none',
help='Method to determine optimal target node during autoselect; "none" will use the default for the cluster.'
)
@click.option(
'--node-autostart', 'node_autostart',
@ -2943,8 +3298,8 @@ def provisioner_template_system_add(name, vcpus, vram, serial, vnc, vnc_bind, no
)
@click.option(
'--node-selector', 'node_selector',
type=click.Choice(['mem', 'vcpus', 'vms', 'load'], case_sensitive=False),
help='Use this selector to determine the optimal node during migrations.'
type=click.Choice(['mem', 'vcpus', 'vms', 'load', 'none'], case_sensitive=False),
help='Method to determine optimal target node during autoselect; "none" will use the default for the cluster.'
)
@click.option(
'--node-autostart', 'node_autostart',
@ -4222,7 +4577,7 @@ def cli_task():
@click.command(name='backup', short_help='Create JSON backup of cluster.')
@click.option(
'-f', '--file', 'filename',
default=None, type=click.File(),
default=None, type=click.File(mode='w'),
help='Write backup data to this file.'
)
@cluster_req
@ -4232,11 +4587,14 @@ def task_backup(filename):
"""
retcode, retdata = pvc_cluster.backup(config)
if filename:
with open(filename, 'wb') as fh:
fh.write(retdata)
retdata = 'Data written to {}'.format(filename)
cleanup(retcode, retdata)
if retcode:
if filename is not None:
json.dump(json.loads(retdata), filename)
cleanup(retcode, 'Backup written to "{}".'.format(filename.name))
else:
cleanup(retcode, retdata)
else:
cleanup(retcode, retdata)
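The backup handler above now lets Click open the target as a writable text handle and serializes with json.dump, rather than opening the path itself in binary mode. A minimal sketch of the same round trip outside of Click, assuming the payload is the plain dict returned by the backup call (the path and data below are hypothetical):

import json

def write_backup(backup_data, path):
    # Serialize the backup dict to a JSON file, as click.File(mode='w') + json.dump does above
    with open(path, 'w') as fh:
        json.dump(backup_data, fh)

def read_backup(path):
    # Reload the backup for a later restore
    with open(path, 'r') as fh:
        return json.load(fh)

# write_backup({'/config/primary_node': 'hv1'}, '/tmp/pvc-backup.json')
# print(read_backup('/tmp/pvc-backup.json'))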
###############################################################################
@ -4274,15 +4632,26 @@ def task_restore(filename, confirm_flag):
# pvc task init
###############################################################################
@click.command(name='init', short_help='Initialize a new cluster.')
@click.option(
'-o', '--overwrite', 'overwrite_flag',
is_flag=True, default=False,
help='Remove and overwrite any existing data'
)
@click.option(
'-y', '--yes', 'confirm_flag',
is_flag=True, default=False,
help='Confirm the initialization'
)
@cluster_req
def task_init(confirm_flag):
def task_init(confirm_flag, overwrite_flag):
"""
Perform initialization of a new PVC cluster.
If the '-o'/'--overwrite' option is specified, all existing data in the cluster will be deleted
before new, empty data is written.
It is not advisable to do this against a running cluster - all node daemons should be stopped
first and the API daemon started manually before running this command.
"""
if not confirm_flag and not config['unsafe']:
@ -4294,7 +4663,7 @@ def task_init(confirm_flag):
# Easter-egg
click.echo("Some music while we're Layin' Pipe? https://youtu.be/sw8S_Kv89IU")
retcode, retmsg = pvc_cluster.initialize(config)
retcode, retmsg = pvc_cluster.initialize(config, overwrite_flag)
cleanup(retcode, retmsg)
@ -4375,9 +4744,14 @@ cli_node.add_command(node_primary)
cli_node.add_command(node_flush)
cli_node.add_command(node_ready)
cli_node.add_command(node_unflush)
cli_node.add_command(node_log)
cli_node.add_command(node_info)
cli_node.add_command(node_list)
vm_tags.add_command(vm_tags_get)
vm_tags.add_command(vm_tags_add)
vm_tags.add_command(vm_tags_remove)
vm_vcpu.add_command(vm_vcpu_get)
vm_vcpu.add_command(vm_vcpu_set)
@ -4395,6 +4769,7 @@ vm_volume.add_command(vm_volume_remove)
cli_vm.add_command(vm_define)
cli_vm.add_command(vm_meta)
cli_vm.add_command(vm_modify)
cli_vm.add_command(vm_rename)
cli_vm.add_command(vm_undefine)
cli_vm.add_command(vm_remove)
cli_vm.add_command(vm_dump)
@ -4407,6 +4782,7 @@ cli_vm.add_command(vm_move)
cli_vm.add_command(vm_migrate)
cli_vm.add_command(vm_unmigrate)
cli_vm.add_command(vm_flush_locks)
cli_vm.add_command(vm_tags)
cli_vm.add_command(vm_vcpu)
cli_vm.add_command(vm_memory)
cli_vm.add_command(vm_network)
@ -4422,6 +4798,7 @@ cli_network.add_command(net_info)
cli_network.add_command(net_list)
cli_network.add_command(net_dhcp)
cli_network.add_command(net_acl)
cli_network.add_command(net_sriov)
net_dhcp.add_command(net_dhcp_list)
net_dhcp.add_command(net_dhcp_add)
@ -4431,6 +4808,15 @@ net_acl.add_command(net_acl_add)
net_acl.add_command(net_acl_remove)
net_acl.add_command(net_acl_list)
net_sriov.add_command(net_sriov_pf)
net_sriov.add_command(net_sriov_vf)
net_sriov_pf.add_command(net_sriov_pf_list)
net_sriov_vf.add_command(net_sriov_vf_list)
net_sriov_vf.add_command(net_sriov_vf_info)
net_sriov_vf.add_command(net_sriov_vf_set)
ceph_benchmark.add_command(ceph_benchmark_run)
ceph_benchmark.add_command(ceph_benchmark_info)
ceph_benchmark.add_command(ceph_benchmark_list)

client-cli/setup.py (new file)

@ -0,0 +1,20 @@
from setuptools import setup
setup(
name='pvc',
version='0.9.27',
packages=['pvc', 'pvc.cli_lib'],
install_requires=[
'Click',
'PyYAML',
'lxml',
'colorama',
'requests',
'requests-toolbelt'
],
entry_points={
'console_scripts': [
'pvc = pvc.pvc:cli',
],
},
)
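The console_scripts entry point above is what makes a plain "pvc" command available once the package is installed; setuptools generates a small wrapper that imports and calls pvc.pvc:cli. Roughly equivalent to that generated script (illustrative sketch only, not the file setuptools actually writes):

#!/usr/bin/env python3
# Approximation of the generated "pvc" console script
import sys
from pvc.pvc import cli

if __name__ == '__main__':
    sys.exit(cli())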


@ -25,8 +25,9 @@ import json
import time
import math
from concurrent.futures import ThreadPoolExecutor
import daemon_lib.vm as vm
import daemon_lib.zkhandler as zkhandler
import daemon_lib.common as common
@ -35,42 +36,30 @@ import daemon_lib.common as common
#
# Verify OSD is valid in cluster
def verifyOSD(zk_conn, osd_id):
if zkhandler.exists(zk_conn, '/ceph/osds/{}'.format(osd_id)):
return True
else:
return False
def verifyOSD(zkhandler, osd_id):
return zkhandler.exists(('osd', osd_id))
# Verify Pool is valid in cluster
def verifyPool(zk_conn, name):
if zkhandler.exists(zk_conn, '/ceph/pools/{}'.format(name)):
return True
else:
return False
def verifyPool(zkhandler, name):
return zkhandler.exists(('pool', name))
# Verify Volume is valid in cluster
def verifyVolume(zk_conn, pool, name):
if zkhandler.exists(zk_conn, '/ceph/volumes/{}/{}'.format(pool, name)):
return True
else:
return False
def verifyVolume(zkhandler, pool, name):
return zkhandler.exists(('volume', f'{pool}/{name}'))
# Verify Snapshot is valid in cluster
def verifySnapshot(zk_conn, pool, volume, name):
if zkhandler.exists(zk_conn, '/ceph/snapshots/{}/{}/{}'.format(pool, volume, name)):
return True
else:
return False
def verifySnapshot(zkhandler, pool, volume, name):
return zkhandler.exists(('snapshot', f'{pool}/{volume}/{name}'))
# Verify OSD path is valid in cluster
def verifyOSDBlock(zk_conn, node, device):
for osd in zkhandler.listchildren(zk_conn, '/ceph/osds'):
osd_node = zkhandler.readdata(zk_conn, '/ceph/osds/{}/node'.format(osd))
osd_device = zkhandler.readdata(zk_conn, '/ceph/osds/{}/device'.format(osd))
def verifyOSDBlock(zkhandler, node, device):
for osd in zkhandler.children('base.osd'):
osd_node = zkhandler.read(('osd.node', osd))
osd_device = zkhandler.read(('osd.device', osd))
if node == osd_node and device == osd_device:
return osd
return None
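A recurring pattern in this refactor: raw Zookeeper paths such as '/ceph/osds/{id}' are replaced by schema keys, either a plain name or a (name, item) tuple, which the new ZKHandler resolves to a path internally. A rough sketch of the idea with a hypothetical path map (not PVC's actual schema):

# Hypothetical illustration of schema-keyed path resolution; the real ZKSchema differs.
SCHEMA_PATHS = {
    'osd': '/ceph/osds/{}',
    'pool': '/ceph/pools/{}',
    'volume': '/ceph/volumes/{}',
    'snapshot': '/ceph/snapshots/{}',
}

def resolve(key):
    # Accept either a bare schema name or a (name, item) tuple
    if isinstance(key, tuple):
        name, item = key
        return SCHEMA_PATHS[name].format(item)
    return SCHEMA_PATHS[key]

# resolve(('volume', 'vms/test_root')) -> '/ceph/volumes/vms/test_root'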
@ -156,9 +145,9 @@ def format_pct_tohuman(datapct):
#
# Status functions
#
def get_status(zk_conn):
primary_node = zkhandler.readdata(zk_conn, '/primary_node')
ceph_status = zkhandler.readdata(zk_conn, '/ceph').rstrip()
def get_status(zkhandler):
primary_node = zkhandler.read('base.config.primary_node')
ceph_status = zkhandler.read('base.storage').rstrip()
# Create a data structure for the information
status_data = {
@ -169,9 +158,9 @@ def get_status(zk_conn):
return True, status_data
def get_util(zk_conn):
primary_node = zkhandler.readdata(zk_conn, '/primary_node')
ceph_df = zkhandler.readdata(zk_conn, '/ceph/util').rstrip()
def get_util(zkhandler):
primary_node = zkhandler.read('base.config.primary_node')
ceph_df = zkhandler.read('base.storage.util').rstrip()
# Create a data structure for the information
status_data = {
@ -185,15 +174,14 @@ def get_util(zk_conn):
#
# OSD functions
#
def getClusterOSDList(zk_conn):
def getClusterOSDList(zkhandler):
# Get a list of OSD IDs by listing the children of the OSD tree
osd_list = zkhandler.listchildren(zk_conn, '/ceph/osds')
return osd_list
return zkhandler.children('base.osd')
def getOSDInformation(zk_conn, osd_id):
def getOSDInformation(zkhandler, osd_id):
# Parse the stats data
osd_stats_raw = zkhandler.readdata(zk_conn, '/ceph/osds/{}/stats'.format(osd_id))
osd_stats_raw = zkhandler.read(('osd.stats', osd_id))
osd_stats = dict(json.loads(osd_stats_raw))
osd_information = {
@ -205,26 +193,27 @@ def getOSDInformation(zk_conn, osd_id):
# OSD addition and removal uses the /cmd/ceph pipe
# These actions must occur on the specific node they reference
def add_osd(zk_conn, node, device, weight):
def add_osd(zkhandler, node, device, weight):
# Verify the target node exists
if not common.verifyNode(zk_conn, node):
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Verify target block device isn't in use
block_osd = verifyOSDBlock(zk_conn, node, device)
block_osd = verifyOSDBlock(zkhandler, node, device)
if block_osd:
return False, 'ERROR: Block device "{}" on node "{}" is used by OSD "{}"'.format(device, node, block_osd)
# Tell the cluster to create a new OSD for the host
add_osd_string = 'osd_add {},{},{}'.format(node, device, weight)
zkhandler.writedata(zk_conn, {'/cmd/ceph': add_osd_string})
zkhandler.write([
('base.cmd.ceph', add_osd_string)
])
# Wait 1/2 second for the cluster to get the message and start working
time.sleep(0.5)
# Acquire a read lock, so we get the return exclusively
lock = zkhandler.readlock(zk_conn, '/cmd/ceph')
with lock:
with zkhandler.readlock('base.cmd.ceph'):
try:
result = zkhandler.readdata(zk_conn, '/cmd/ceph').split()[0]
result = zkhandler.read('base.cmd.ceph').split()[0]
if result == 'success-osd_add':
message = 'Created new OSD with block device "{}" on node "{}".'.format(device, node)
success = True
@ -236,28 +225,30 @@ def add_osd(zk_conn, node, device, weight):
success = False
# Acquire a write lock to ensure things go smoothly
lock = zkhandler.writelock(zk_conn, '/cmd/ceph')
with lock:
with zkhandler.writelock('base.cmd.ceph'):
time.sleep(0.5)
zkhandler.writedata(zk_conn, {'/cmd/ceph': ''})
zkhandler.write([
('base.cmd.ceph', '')
])
return success, message
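Both add_osd and remove_osd drive the same request/response protocol over the base.cmd.ceph key: write the command string, give the target node a moment to pick it up, read the result under a read lock, then clear the key under a write lock. A condensed sketch of that shared pattern, assuming only the zkhandler methods already used above:

import time

def run_ceph_command(zkhandler, command_string, success_tag):
    # Post the command for the responsible node to pick up
    zkhandler.write([
        ('base.cmd.ceph', command_string)
    ])
    time.sleep(0.5)

    # Read the outcome exclusively
    with zkhandler.readlock('base.cmd.ceph'):
        try:
            result = zkhandler.read('base.cmd.ceph').split()[0]
            success = result == 'success-{}'.format(success_tag)
        except Exception:
            success = False

    # Clear the pipe so the next command starts clean
    with zkhandler.writelock('base.cmd.ceph'):
        time.sleep(0.5)
        zkhandler.write([
            ('base.cmd.ceph', '')
        ])
    return success

# e.g. run_ceph_command(zkhandler, 'osd_remove 3', 'osd_remove')  # hypothetical OSD ID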
def remove_osd(zk_conn, osd_id):
if not verifyOSD(zk_conn, osd_id):
def remove_osd(zkhandler, osd_id):
if not verifyOSD(zkhandler, osd_id):
return False, 'ERROR: No OSD with ID "{}" is present in the cluster.'.format(osd_id)
# Tell the cluster to remove an OSD
remove_osd_string = 'osd_remove {}'.format(osd_id)
zkhandler.writedata(zk_conn, {'/cmd/ceph': remove_osd_string})
zkhandler.write([
('base.cmd.ceph', remove_osd_string)
])
# Wait 1/2 second for the cluster to get the message and start working
time.sleep(0.5)
# Acquire a read lock, so we get the return exclusively
lock = zkhandler.readlock(zk_conn, '/cmd/ceph')
with lock:
with zkhandler.readlock('base.cmd.ceph'):
try:
result = zkhandler.readdata(zk_conn, '/cmd/ceph').split()[0]
result = zkhandler.read('base.cmd.ceph').split()[0]
if result == 'success-osd_remove':
message = 'Removed OSD "{}" from the cluster.'.format(osd_id)
success = True
@ -269,16 +260,17 @@ def remove_osd(zk_conn, osd_id):
message = 'ERROR Command ignored by node.'
# Acquire a write lock to ensure things go smoothly
lock = zkhandler.writelock(zk_conn, '/cmd/ceph')
with lock:
with zkhandler.writelock('base.cmd.ceph'):
time.sleep(0.5)
zkhandler.writedata(zk_conn, {'/cmd/ceph': ''})
zkhandler.write([
('base.cmd.ceph', '')
])
return success, message
def in_osd(zk_conn, osd_id):
if not verifyOSD(zk_conn, osd_id):
def in_osd(zkhandler, osd_id):
if not verifyOSD(zkhandler, osd_id):
return False, 'ERROR: No OSD with ID "{}" is present in the cluster.'.format(osd_id)
retcode, stdout, stderr = common.run_os_command('ceph osd in {}'.format(osd_id))
@ -288,8 +280,8 @@ def in_osd(zk_conn, osd_id):
return True, 'Set OSD {} online.'.format(osd_id)
def out_osd(zk_conn, osd_id):
if not verifyOSD(zk_conn, osd_id):
def out_osd(zkhandler, osd_id):
if not verifyOSD(zkhandler, osd_id):
return False, 'ERROR: No OSD with ID "{}" is present in the cluster.'.format(osd_id)
retcode, stdout, stderr = common.run_os_command('ceph osd out {}'.format(osd_id))
@ -299,7 +291,7 @@ def out_osd(zk_conn, osd_id):
return True, 'Set OSD {} offline.'.format(osd_id)
def set_osd(zk_conn, option):
def set_osd(zkhandler, option):
retcode, stdout, stderr = common.run_os_command('ceph osd set {}'.format(option))
if retcode:
return False, 'ERROR: Failed to set property "{}": {}'.format(option, stderr)
@ -307,7 +299,7 @@ def set_osd(zk_conn, option):
return True, 'Set OSD property "{}".'.format(option)
def unset_osd(zk_conn, option):
def unset_osd(zkhandler, option):
retcode, stdout, stderr = common.run_os_command('ceph osd unset {}'.format(option))
if retcode:
return False, 'ERROR: Failed to unset property "{}": {}'.format(option, stderr)
@ -315,9 +307,9 @@ def unset_osd(zk_conn, option):
return True, 'Unset OSD property "{}".'.format(option)
def get_list_osd(zk_conn, limit, is_fuzzy=True):
def get_list_osd(zkhandler, limit, is_fuzzy=True):
osd_list = []
full_osd_list = zkhandler.listchildren(zk_conn, '/ceph/osds')
full_osd_list = zkhandler.children('base.osd')
if is_fuzzy and limit:
# Implicitly assume fuzzy limits
@ -330,11 +322,11 @@ def get_list_osd(zk_conn, limit, is_fuzzy=True):
if limit:
try:
if re.match(limit, osd):
osd_list.append(getOSDInformation(zk_conn, osd))
osd_list.append(getOSDInformation(zkhandler, osd))
except Exception as e:
return False, 'Regex Error: {}'.format(e)
else:
osd_list.append(getOSDInformation(zk_conn, osd))
osd_list.append(getOSDInformation(zkhandler, osd))
return True, sorted(osd_list, key=lambda x: int(x['id']))
@ -342,11 +334,11 @@ def get_list_osd(zk_conn, limit, is_fuzzy=True):
#
# Pool functions
#
def getPoolInformation(zk_conn, pool):
def getPoolInformation(zkhandler, pool):
# Parse the stats data
pool_stats_raw = zkhandler.readdata(zk_conn, '/ceph/pools/{}/stats'.format(pool))
pool_stats_raw = zkhandler.read(('pool.stats', pool))
pool_stats = dict(json.loads(pool_stats_raw))
volume_count = len(getCephVolumes(zk_conn, pool))
volume_count = len(getCephVolumes(zkhandler, pool))
pool_information = {
'name': pool,
@ -356,7 +348,7 @@ def getPoolInformation(zk_conn, pool):
return pool_information
def add_pool(zk_conn, name, pgs, replcfg):
def add_pool(zkhandler, name, pgs, replcfg):
# Prepare the copies/mincopies variables
try:
copies, mincopies = replcfg.split(',')
@ -388,24 +380,24 @@ def add_pool(zk_conn, name, pgs, replcfg):
return False, 'ERROR: Failed to enable RBD application on pool "{}" : {}'.format(name, stderr)
# 4. Add the new pool to Zookeeper
zkhandler.writedata(zk_conn, {
'/ceph/pools/{}'.format(name): '',
'/ceph/pools/{}/pgs'.format(name): pgs,
'/ceph/pools/{}/stats'.format(name): '{}',
'/ceph/volumes/{}'.format(name): '',
'/ceph/snapshots/{}'.format(name): '',
})
zkhandler.write([
(('pool', name), ''),
(('pool.pgs', name), pgs),
(('pool.stats', name), '{}'),
(('volume', name), ''),
(('snapshot', name), ''),
])
return True, 'Created RBD pool "{}" with {} PGs'.format(name, pgs)
def remove_pool(zk_conn, name):
if not verifyPool(zk_conn, name):
def remove_pool(zkhandler, name):
if not verifyPool(zkhandler, name):
return False, 'ERROR: No pool with name "{}" is present in the cluster.'.format(name)
# 1. Remove pool volumes
for volume in zkhandler.listchildren(zk_conn, '/ceph/volumes/{}'.format(name)):
remove_volume(zk_conn, name, volume)
for volume in zkhandler.children(('volume', name)):
remove_volume(zkhandler, name, volume)
# 2. Remove the pool
retcode, stdout, stderr = common.run_os_command('ceph osd pool rm {pool} {pool} --yes-i-really-really-mean-it'.format(pool=name))
@ -413,54 +405,68 @@ def remove_pool(zk_conn, name):
return False, 'ERROR: Failed to remove pool "{}": {}'.format(name, stderr)
# 3. Delete pool from Zookeeper
zkhandler.deletekey(zk_conn, '/ceph/pools/{}'.format(name))
zkhandler.deletekey(zk_conn, '/ceph/volumes/{}'.format(name))
zkhandler.deletekey(zk_conn, '/ceph/snapshots/{}'.format(name))
zkhandler.delete([
('pool', name),
('volume', name),
('snapshot', name),
])
return True, 'Removed RBD pool "{}" and all volumes.'.format(name)
def get_list_pool(zk_conn, limit, is_fuzzy=True):
pool_list = []
full_pool_list = zkhandler.listchildren(zk_conn, '/ceph/pools')
def get_list_pool(zkhandler, limit, is_fuzzy=True):
full_pool_list = zkhandler.children('base.pool')
if limit:
if not is_fuzzy:
limit = '^' + limit + '$'
get_pool_info = dict()
for pool in full_pool_list:
is_limit_match = False
if limit:
try:
if re.match(limit, pool):
pool_list.append(getPoolInformation(zk_conn, pool))
is_limit_match = True
except Exception as e:
return False, 'Regex Error: {}'.format(e)
else:
pool_list.append(getPoolInformation(zk_conn, pool))
is_limit_match = True
return True, sorted(pool_list, key=lambda x: int(x['stats']['id']))
get_pool_info[pool] = True if is_limit_match else False
pool_execute_list = [pool for pool in full_pool_list if get_pool_info[pool]]
pool_data_list = list()
with ThreadPoolExecutor(max_workers=32, thread_name_prefix='pool_list') as executor:
futures = []
for pool in pool_execute_list:
futures.append(executor.submit(getPoolInformation, zkhandler, pool))
for future in futures:
pool_data_list.append(future.result())
return True, sorted(pool_data_list, key=lambda x: int(x['stats']['id']))
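get_list_pool (and get_list_volume below) now fan the per-item information lookups out across a thread pool instead of fetching them serially, since each lookup is an independent set of Zookeeper reads. The generic shape of that fan-out, with fetch_info standing in for getPoolInformation or getVolumeInformation:

from concurrent.futures import ThreadPoolExecutor

def gather_parallel(items, fetch_info, max_workers=32):
    # Submit one lookup per item, then collect results in submission order
    results = []
    with ThreadPoolExecutor(max_workers=max_workers, thread_name_prefix='info_list') as executor:
        futures = [executor.submit(fetch_info, item) for item in items]
        for future in futures:
            results.append(future.result())
    return results

# e.g. gather_parallel(pool_execute_list, lambda pool: getPoolInformation(zkhandler, pool))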
#
# Volume functions
#
def getCephVolumes(zk_conn, pool):
def getCephVolumes(zkhandler, pool):
volume_list = list()
if not pool:
pool_list = zkhandler.listchildren(zk_conn, '/ceph/pools')
pool_list = zkhandler.children('base.pool')
else:
pool_list = [pool]
for pool_name in pool_list:
for volume_name in zkhandler.listchildren(zk_conn, '/ceph/volumes/{}'.format(pool_name)):
for volume_name in zkhandler.children(('volume', pool_name)):
volume_list.append('{}/{}'.format(pool_name, volume_name))
return volume_list
def getVolumeInformation(zk_conn, pool, volume):
def getVolumeInformation(zkhandler, pool, volume):
# Parse the stats data
volume_stats_raw = zkhandler.readdata(zk_conn, '/ceph/volumes/{}/{}/stats'.format(pool, volume))
volume_stats_raw = zkhandler.read(('volume.stats', f'{pool}/{volume}'))
volume_stats = dict(json.loads(volume_stats_raw))
# Format the size to something nicer
volume_stats['size'] = format_bytes_tohuman(volume_stats['size'])
@ -473,9 +479,9 @@ def getVolumeInformation(zk_conn, pool, volume):
return volume_information
def add_volume(zk_conn, pool, name, size):
def add_volume(zkhandler, pool, name, size):
# 1. Verify the size of the volume
pool_information = getPoolInformation(zk_conn, pool)
pool_information = getPoolInformation(zkhandler, pool)
size_bytes = format_bytes_fromhuman(size)
if size_bytes >= int(pool_information['stats']['free_bytes']):
return False, 'ERROR: Requested volume size is greater than the available free space in the pool'
@ -494,17 +500,17 @@ def add_volume(zk_conn, pool, name, size):
volstats = stdout
# 3. Add the new volume to Zookeeper
zkhandler.writedata(zk_conn, {
'/ceph/volumes/{}/{}'.format(pool, name): '',
'/ceph/volumes/{}/{}/stats'.format(pool, name): volstats,
'/ceph/snapshots/{}/{}'.format(pool, name): '',
})
zkhandler.write([
(('volume', f'{pool}/{name}'), ''),
(('volume.stats', f'{pool}/{name}'), volstats),
(('snapshot', f'{pool}/{name}'), ''),
])
return True, 'Created RBD volume "{}/{}" ({}).'.format(pool, name, size)
def clone_volume(zk_conn, pool, name_src, name_new):
if not verifyVolume(zk_conn, pool, name_src):
def clone_volume(zkhandler, pool, name_src, name_new):
if not verifyVolume(zkhandler, pool, name_src):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name_src, pool)
# 1. Clone the volume
@ -517,17 +523,17 @@ def clone_volume(zk_conn, pool, name_src, name_new):
volstats = stdout
# 3. Add the new volume to Zookeeper
zkhandler.writedata(zk_conn, {
'/ceph/volumes/{}/{}'.format(pool, name_new): '',
'/ceph/volumes/{}/{}/stats'.format(pool, name_new): volstats,
'/ceph/snapshots/{}/{}'.format(pool, name_new): '',
})
zkhandler.write([
(('volume', f'{pool}/{name_new}'), ''),
(('volume.stats', f'{pool}/{name_new}'), volstats),
(('snapshot', f'{pool}/{name_new}'), ''),
])
return True, 'Cloned RBD volume "{}" to "{}" in pool "{}"'.format(name_src, name_new, pool)
def resize_volume(zk_conn, pool, name, size):
if not verifyVolume(zk_conn, pool, name):
def resize_volume(zkhandler, pool, name, size):
if not verifyVolume(zkhandler, pool, name):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
# 1. Resize the volume
@ -538,7 +544,7 @@ def resize_volume(zk_conn, pool, name, size):
# 2a. Determine the node running this VM if applicable
active_node = None
volume_vm_name = name.split('_')[0]
retcode, vm_info = vm.get_info(zk_conn, volume_vm_name)
retcode, vm_info = vm.get_info(zkhandler, volume_vm_name)
if retcode:
for disk in vm_info['disks']:
# This block device is present in this VM so we can continue
@ -564,17 +570,17 @@ def resize_volume(zk_conn, pool, name, size):
volstats = stdout
# 3. Add the new volume to Zookeeper
zkhandler.writedata(zk_conn, {
'/ceph/volumes/{}/{}'.format(pool, name): '',
'/ceph/volumes/{}/{}/stats'.format(pool, name): volstats,
'/ceph/snapshots/{}/{}'.format(pool, name): '',
})
zkhandler.write([
(('volume', f'{pool}/{name}'), ''),
(('volume.stats', f'{pool}/{name}'), volstats),
(('snapshot', f'{pool}/{name}'), ''),
])
return True, 'Resized RBD volume "{}" to size "{}" in pool "{}".'.format(name, size, pool)
def rename_volume(zk_conn, pool, name, new_name):
if not verifyVolume(zk_conn, pool, name):
def rename_volume(zkhandler, pool, name, new_name):
if not verifyVolume(zkhandler, pool, name):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
# 1. Rename the volume
@ -583,30 +589,30 @@ def rename_volume(zk_conn, pool, name, new_name):
return False, 'ERROR: Failed to rename volume "{}" to "{}" in pool "{}": {}'.format(name, new_name, pool, stderr)
# 2. Rename the volume in Zookeeper
zkhandler.renamekey(zk_conn, {
'/ceph/volumes/{}/{}'.format(pool, name): '/ceph/volumes/{}/{}'.format(pool, new_name),
'/ceph/snapshots/{}/{}'.format(pool, name): '/ceph/snapshots/{}/{}'.format(pool, new_name)
})
zkhandler.rename([
(('volume', f'{pool}/{name}'), ('volume', f'{pool}/{new_name}')),
(('snapshot', f'{pool}/{name}'), ('snapshot', f'{pool}/{new_name}')),
])
# 3. Get volume stats
retcode, stdout, stderr = common.run_os_command('rbd info --format json {}/{}'.format(pool, new_name))
volstats = stdout
# 4. Update the volume stats in Zookeeper
zkhandler.writedata(zk_conn, {
'/ceph/volumes/{}/{}/stats'.format(pool, new_name): volstats,
})
zkhandler.write([
(('volume.stats', f'{pool}/{new_name}'), volstats),
])
return True, 'Renamed RBD volume "{}" to "{}" in pool "{}".'.format(name, new_name, pool)
def remove_volume(zk_conn, pool, name):
if not verifyVolume(zk_conn, pool, name):
def remove_volume(zkhandler, pool, name):
if not verifyVolume(zkhandler, pool, name):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
# 1. Remove volume snapshots
for snapshot in zkhandler.listchildren(zk_conn, '/ceph/snapshots/{}/{}'.format(pool, name)):
remove_snapshot(zk_conn, pool, name, snapshot)
for snapshot in zkhandler.children(('snapshot', f'{pool}/{name}')):
remove_snapshot(zkhandler, pool, name, snapshot)
# 2. Remove the volume
retcode, stdout, stderr = common.run_os_command('rbd rm {}/{}'.format(pool, name))
@ -614,14 +620,16 @@ def remove_volume(zk_conn, pool, name):
return False, 'ERROR: Failed to remove RBD volume "{}" in pool "{}": {}'.format(name, pool, stderr)
# 3. Delete volume from Zookeeper
zkhandler.deletekey(zk_conn, '/ceph/volumes/{}/{}'.format(pool, name))
zkhandler.deletekey(zk_conn, '/ceph/snapshots/{}/{}'.format(pool, name))
zkhandler.delete([
('volume', f'{pool}/{name}'),
('snapshot', f'{pool}/{name}'),
])
return True, 'Removed RBD volume "{}" in pool "{}".'.format(name, pool)
def map_volume(zk_conn, pool, name):
if not verifyVolume(zk_conn, pool, name):
def map_volume(zkhandler, pool, name):
if not verifyVolume(zkhandler, pool, name):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
# 1. Map the volume onto the local system
@ -639,8 +647,8 @@ def map_volume(zk_conn, pool, name):
return True, mapped_volume
def unmap_volume(zk_conn, pool, name):
if not verifyVolume(zk_conn, pool, name):
def unmap_volume(zkhandler, pool, name):
if not verifyVolume(zkhandler, pool, name):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(name, pool)
mapped_volume = '/dev/rbd/{}/{}'.format(pool, name)
@ -657,12 +665,11 @@ def unmap_volume(zk_conn, pool, name):
return True, 'Unmapped RBD volume at "{}".'.format(mapped_volume)
def get_list_volume(zk_conn, pool, limit, is_fuzzy=True):
volume_list = []
if pool and not verifyPool(zk_conn, pool):
def get_list_volume(zkhandler, pool, limit, is_fuzzy=True):
if pool and not verifyPool(zkhandler, pool):
return False, 'ERROR: No pool with name "{}" is present in the cluster.'.format(pool)
full_volume_list = getCephVolumes(zk_conn, pool)
full_volume_list = getCephVolumes(zkhandler, pool)
if limit:
if not is_fuzzy:
@ -674,28 +681,46 @@ def get_list_volume(zk_conn, pool, limit, is_fuzzy=True):
if not re.match(r'.*\$', limit):
limit = limit + '.*'
get_volume_info = dict()
for volume in full_volume_list:
pool_name, volume_name = volume.split('/')
is_limit_match = False
# Check on limit
if limit:
# Try to match the limit against the volume name
try:
if re.match(limit, volume_name):
volume_list.append(getVolumeInformation(zk_conn, pool_name, volume_name))
is_limit_match = True
except Exception as e:
return False, 'Regex Error: {}'.format(e)
else:
volume_list.append(getVolumeInformation(zk_conn, pool_name, volume_name))
is_limit_match = True
return True, sorted(volume_list, key=lambda x: str(x['name']))
get_volume_info[volume] = True if is_limit_match else False
# Obtain our volume data in a thread pool
volume_execute_list = [volume for volume in full_volume_list if get_volume_info[volume]]
volume_data_list = list()
with ThreadPoolExecutor(max_workers=32, thread_name_prefix='volume_list') as executor:
futures = []
for volume in volume_execute_list:
pool_name, volume_name = volume.split('/')
futures.append(executor.submit(getVolumeInformation, zkhandler, pool_name, volume_name))
for future in futures:
volume_data_list.append(future.result())
return True, sorted(volume_data_list, key=lambda x: str(x['name']))
#
# Snapshot functions
#
def getCephSnapshots(zk_conn, pool, volume):
def getCephSnapshots(zkhandler, pool, volume):
snapshot_list = list()
volume_list = list()
volume_list = getCephVolumes(zk_conn, pool)
volume_list = getCephVolumes(zkhandler, pool)
if volume:
for volume_entry in volume_list:
volume_pool, volume_name = volume_entry.split('/')
@ -703,14 +728,14 @@ def getCephSnapshots(zk_conn, pool, volume):
volume_list = ['{}/{}'.format(volume_pool, volume_name)]
for volume_entry in volume_list:
for snapshot_name in zkhandler.listchildren(zk_conn, '/ceph/snapshots/{}'.format(volume_entry)):
for snapshot_name in zkhandler.children(('snapshot', volume_entry)):
snapshot_list.append('{}@{}'.format(volume_entry, snapshot_name))
return snapshot_list
def add_snapshot(zk_conn, pool, volume, name):
if not verifyVolume(zk_conn, pool, volume):
def add_snapshot(zkhandler, pool, volume, name):
if not verifyVolume(zkhandler, pool, volume):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(volume, pool)
# 1. Create the snapshot
@ -719,47 +744,47 @@ def add_snapshot(zk_conn, pool, volume, name):
return False, 'ERROR: Failed to create RBD snapshot "{}" of volume "{}" in pool "{}": {}'.format(name, volume, pool, stderr)
# 2. Add the snapshot to Zookeeper
zkhandler.writedata(zk_conn, {
'/ceph/snapshots/{}/{}/{}'.format(pool, volume, name): '',
'/ceph/snapshots/{}/{}/{}/stats'.format(pool, volume, name): '{}'
})
zkhandler.write([
(('snapshot', f'{pool}/{volume}/{name}'), ''),
(('snapshot.stats', f'{pool}/{volume}/{name}'), '{}'),
])
# 3. Update the count of snapshots on this volume
volume_stats_raw = zkhandler.readdata(zk_conn, '/ceph/volumes/{}/{}/stats'.format(pool, volume))
volume_stats_raw = zkhandler.read(('volume.stats', f'{pool}/{volume}'))
volume_stats = dict(json.loads(volume_stats_raw))
# Increment the snapshot count on this volume
volume_stats['snapshot_count'] = volume_stats['snapshot_count'] + 1
volume_stats_raw = json.dumps(volume_stats)
zkhandler.writedata(zk_conn, {
'/ceph/volumes/{}/{}/stats'.format(pool, volume): volume_stats_raw
})
zkhandler.write([
(('volume.stats', f'{pool}/{volume}'), volume_stats_raw),
])
return True, 'Created RBD snapshot "{}" of volume "{}" in pool "{}".'.format(name, volume, pool)
def rename_snapshot(zk_conn, pool, volume, name, new_name):
if not verifyVolume(zk_conn, pool, volume):
def rename_snapshot(zkhandler, pool, volume, name, new_name):
if not verifyVolume(zkhandler, pool, volume):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(volume, pool)
if not verifySnapshot(zk_conn, pool, volume, name):
if not verifySnapshot(zkhandler, pool, volume, name):
return False, 'ERROR: No snapshot with name "{}" is present for volume "{}" in pool "{}".'.format(name, volume, pool)
# 1. Rename the snapshot
retcode, stdout, stderr = common.run_os_command('rbd snap rename {}/{}@{} {}'.format(pool, volume, name, new_name))
retcode, stdout, stderr = common.run_os_command('rbd snap rename {pool}/{volume}@{name} {pool}/{volume}@{new_name}'.format(pool=pool, volume=volume, name=name, new_name=new_name))
if retcode:
return False, 'ERROR: Failed to rename RBD snapshot "{}" to "{}" for volume "{}" in pool "{}": {}'.format(name, new_name, volume, pool, stderr)
# 2. Rename the snapshot in ZK
zkhandler.renamekey(zk_conn, {
'/ceph/snapshots/{}/{}/{}'.format(pool, volume, name): '/ceph/snapshots/{}/{}/{}'.format(pool, volume, new_name)
})
zkhandler.rename([
(('snapshot', f'{pool}/{volume}/{name}'), ('snapshot', f'{pool}/{volume}/{new_name}')),
])
return True, 'Renamed RBD snapshot "{}" to "{}" for volume "{}" in pool "{}".'.format(name, new_name, volume, pool)
def remove_snapshot(zk_conn, pool, volume, name):
if not verifyVolume(zk_conn, pool, volume):
def remove_snapshot(zkhandler, pool, volume, name):
if not verifyVolume(zkhandler, pool, volume):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(volume, pool)
if not verifySnapshot(zk_conn, pool, volume, name):
if not verifySnapshot(zkhandler, pool, volume, name):
return False, 'ERROR: No snapshot with name "{}" is present of volume {} in pool {}.'.format(name, volume, pool)
# 1. Remove the snapshot
@ -768,30 +793,32 @@ def remove_snapshot(zk_conn, pool, volume, name):
return False, 'Failed to remove RBD snapshot "{}" of volume "{}" in pool "{}": {}'.format(name, volume, pool, stderr)
# 2. Delete snapshot from Zookeeper
zkhandler.deletekey(zk_conn, '/ceph/snapshots/{}/{}/{}'.format(pool, volume, name))
zkhandler.delete([
('snapshot', f'{pool}/{volume}/{name}')
])
# 3. Update the count of snapshots on this volume
volume_stats_raw = zkhandler.readdata(zk_conn, '/ceph/volumes/{}/{}/stats'.format(pool, volume))
volume_stats_raw = zkhandler.read(('volume.stats', f'{pool}/{volume}'))
volume_stats = dict(json.loads(volume_stats_raw))
# Decrement the snapshot count on this volume
volume_stats['snapshot_count'] = volume_stats['snapshot_count'] - 1
volume_stats_raw = json.dumps(volume_stats)
zkhandler.writedata(zk_conn, {
'/ceph/volumes/{}/{}/stats'.format(pool, volume): volume_stats_raw
})
zkhandler.write([
(('volume.stats', f'{pool}/{volume}'), volume_stats_raw)
])
return True, 'Removed RBD snapshot "{}" of volume "{}" in pool "{}".'.format(name, volume, pool)
def get_list_snapshot(zk_conn, pool, volume, limit, is_fuzzy=True):
def get_list_snapshot(zkhandler, pool, volume, limit, is_fuzzy=True):
snapshot_list = []
if pool and not verifyPool(zk_conn, pool):
if pool and not verifyPool(zkhandler, pool):
return False, 'ERROR: No pool with name "{}" is present in the cluster.'.format(pool)
if volume and not verifyPool(zk_conn, volume):
if volume and not verifyPool(zkhandler, volume):
return False, 'ERROR: No volume with name "{}" is present in the cluster.'.format(volume)
full_snapshot_list = getCephSnapshots(zk_conn, pool, volume)
full_snapshot_list = getCephSnapshots(zkhandler, pool, volume)
if is_fuzzy and limit:
# Implicitly assume fuzzy limits


@ -21,7 +21,6 @@
import re
import daemon_lib.zkhandler as zkhandler
import daemon_lib.common as common
import daemon_lib.vm as pvc_vm
import daemon_lib.node as pvc_node
@ -29,43 +28,48 @@ import daemon_lib.network as pvc_network
import daemon_lib.ceph as pvc_ceph
def set_maintenance(zk_conn, maint_state):
try:
def set_maintenance(zkhandler, maint_state):
current_maint_state = zkhandler.read('base.config.maintenance')
if maint_state == current_maint_state:
if maint_state == 'true':
zkhandler.writedata(zk_conn, {'/maintenance': 'true'})
return True, 'Successfully set cluster in maintenance mode'
return True, 'Cluster is already in maintenance mode'
else:
zkhandler.writedata(zk_conn, {'/maintenance': 'false'})
return True, 'Successfully set cluster in normal mode'
except Exception:
return False, 'Failed to set cluster maintenance state'
return True, 'Cluster is already in normal mode'
if maint_state == 'true':
zkhandler.write([
('base.config.maintenance', 'true')
])
return True, 'Successfully set cluster in maintenance mode'
else:
zkhandler.write([
('base.config.maintenance', 'false')
])
return True, 'Successfully set cluster in normal mode'
def getClusterInformation(zk_conn):
def getClusterInformation(zkhandler):
# Get cluster maintenance state
try:
maint_state = zkhandler.readdata(zk_conn, '/maintenance')
except Exception:
maint_state = 'false'
maint_state = zkhandler.read('base.config.maintenance')
# List of messages to display to the clients
cluster_health_msg = []
storage_health_msg = []
# Get node information object list
retcode, node_list = pvc_node.get_list(zk_conn, None)
retcode, node_list = pvc_node.get_list(zkhandler, None)
# Get vm information object list
retcode, vm_list = pvc_vm.get_list(zk_conn, None, None, None)
retcode, vm_list = pvc_vm.get_list(zkhandler, None, None, None, None)
# Get network information object list
retcode, network_list = pvc_network.get_list(zk_conn, None, None)
retcode, network_list = pvc_network.get_list(zkhandler, None, None)
# Get storage information object list
retcode, ceph_osd_list = pvc_ceph.get_list_osd(zk_conn, None)
retcode, ceph_pool_list = pvc_ceph.get_list_pool(zk_conn, None)
retcode, ceph_volume_list = pvc_ceph.get_list_volume(zk_conn, None, None)
retcode, ceph_snapshot_list = pvc_ceph.get_list_snapshot(zk_conn, None, None, None)
retcode, ceph_osd_list = pvc_ceph.get_list_osd(zkhandler, None)
retcode, ceph_pool_list = pvc_ceph.get_list_pool(zkhandler, None)
retcode, ceph_volume_list = pvc_ceph.get_list_volume(zkhandler, None, None)
retcode, ceph_snapshot_list = pvc_ceph.get_list_snapshot(zkhandler, None, None, None)
# Determine, for each subsection, the total count
node_count = len(node_list)
@ -164,7 +168,7 @@ def getClusterInformation(zk_conn):
cluster_health = 'Optimal'
# Find out our storage health from Ceph
ceph_status = zkhandler.readdata(zk_conn, '/ceph').split('\n')
ceph_status = zkhandler.read('base.storage').split('\n')
ceph_health = ceph_status[2].split()[-1]
# Parse the status output to get the health indicators
@ -234,8 +238,8 @@ def getClusterInformation(zk_conn):
'health_msg': cluster_health_msg,
'storage_health': storage_health,
'storage_health_msg': storage_health_msg,
'primary_node': common.getPrimaryNode(zk_conn),
'upstream_ip': zkhandler.readdata(zk_conn, '/upstream_ip'),
'primary_node': common.getPrimaryNode(zkhandler),
'upstream_ip': zkhandler.read('base.config.upstream_ip'),
'nodes': formatted_node_states,
'vms': formatted_vm_states,
'networks': network_count,
@ -248,10 +252,88 @@ def getClusterInformation(zk_conn):
return cluster_information
def get_info(zk_conn):
def get_info(zkhandler):
# This is a thin wrapper function for naming purposes
cluster_information = getClusterInformation(zk_conn)
cluster_information = getClusterInformation(zkhandler)
if cluster_information:
return True, cluster_information
else:
return False, 'ERROR: Failed to obtain cluster information!'
def cluster_initialize(zkhandler, overwrite=False):
# Abort if we've initialized the cluster before
if zkhandler.exists('base.config.primary_node') and not overwrite:
return False, 'ERROR: Cluster contains data and overwrite not set.'
if overwrite:
# Delete the existing keys
for key in zkhandler.schema.keys('base'):
if key == 'root':
# Don't delete the root key
continue
status = zkhandler.delete('base.{}'.format(key), recursive=True)
if not status:
return False, 'ERROR: Failed to delete data in cluster; running nodes perhaps?'
# Create the root keys
zkhandler.schema.apply(zkhandler)
return True, 'Successfully initialized cluster'
def cluster_backup(zkhandler):
# Dictionary of values to come
cluster_data = dict()
def get_data(path):
data = zkhandler.read(path)
children = zkhandler.children(path)
cluster_data[path] = data
if children:
if path == '/':
child_prefix = '/'
else:
child_prefix = path + '/'
for child in children:
if child_prefix + child == '/zookeeper':
# We must skip the built-in /zookeeper tree
continue
if child_prefix + child == '/patroni':
# We must skip the /patroni tree
continue
get_data(child_prefix + child)
try:
get_data('/')
except Exception as e:
return False, 'ERROR: Failed to obtain backup: {}'.format(e)
return True, cluster_data
def cluster_restore(zkhandler, cluster_data):
# Build a key+value list
kv = []
schema_version = None
for key in cluster_data:
if key == zkhandler.schema.path('base.schema.version'):
schema_version = cluster_data[key]
data = cluster_data[key]
kv.append((key, data))
if int(schema_version) != int(zkhandler.schema.version):
return False, 'ERROR: Schema version of backup ({}) does not match cluster schema version ({}).'.format(schema_version, zkhandler.schema.version)
# Write the backup data back into the cluster
result = zkhandler.write(kv)
if result:
return True, 'Restore completed successfully.'
else:
return False, 'Restore failed.'
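cluster_backup walks the whole Zookeeper tree from '/' into a flat path-to-data dict, skipping the built-in /zookeeper tree and the /patroni tree, and cluster_restore refuses to write that dict back unless its stored schema version matches the running cluster's. An illustrative sketch of the same recursive walk against a raw kazoo client; the daemon code does this through its ZKHandler wrapper instead, and the host below is hypothetical:

from kazoo.client import KazooClient

def walk_tree(zk, path='/', skip=('/zookeeper', '/patroni')):
    # Collect this key's data, then recurse into its children
    data, _ = zk.get(path)
    tree = {path: data.decode() if data else ''}
    prefix = '/' if path == '/' else path + '/'
    for child in zk.get_children(path):
        child_path = prefix + child
        if child_path in skip:
            continue
        tree.update(walk_tree(zk, child_path, skip))
    return tree

# zk = KazooClient(hosts='127.0.0.1:2181')
# zk.start()
# backup = walk_tree(zk)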


@ -22,48 +22,153 @@
import time
import uuid
import lxml
import shlex
import subprocess
import kazoo.client
import signal
from json import loads
from re import match as re_match
from re import split as re_split
from distutils.util import strtobool
from threading import Thread
from shlex import split as shlex_split
from functools import wraps
###############################################################################
# Performance Profiler decorator
###############################################################################
# Get performance statistics on a function or class
class Profiler(object):
def __init__(self, config):
self.is_debug = config['debug']
self.pvc_logdir = '/var/log/pvc'
def __call__(self, function):
if not callable(function):
return
if not self.is_debug:
return function
@wraps(function)
def profiler_wrapper(*args, **kwargs):
import cProfile
import pstats
from os import path, makedirs
from datetime import datetime
if not path.exists(self.pvc_logdir):
print('Profiler: Requested profiling of {} but no log dir present; printing instead.'.format(str(function.__name__)))
log_result = False
else:
log_result = True
profiler_logdir = '{}/profiler'.format(self.pvc_logdir)
if not path.exists(profiler_logdir):
makedirs(profiler_logdir)
pr = cProfile.Profile()
pr.enable()
ret = function(*args, **kwargs)
pr.disable()
stats = pstats.Stats(pr)
stats.sort_stats(pstats.SortKey.TIME)
if log_result:
stats.dump_stats(filename='{}/{}_{}.log'.format(profiler_logdir, str(function.__name__), str(datetime.now()).replace(' ', '_')))
else:
print('Profiler stats for function {} at {}:'.format(str(function.__name__), str(datetime.now())))
stats.print_stats()
return ret
return profiler_wrapper
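The Profiler decorator is a no-op unless debug is enabled in the daemon config; when active it wraps the call in cProfile and dumps a pstats file under /var/log/pvc/profiler, or prints the stats if that directory is missing. A small usage sketch (the config and decorated function are hypothetical):

config = {'debug': True}

@Profiler(config)
def list_everything(zkhandler):
    # ... expensive Zookeeper reads ...
    return []

# Each call now leaves a timestamped pstats dump in /var/log/pvc/profiler/.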
import daemon_lib.zkhandler as zkhandler
###############################################################################
# Supplemental functions
###############################################################################
#
# Run a local OS daemon in the background
#
class OSDaemon(object):
def __init__(self, command_string, environment, logfile):
command = shlex_split(command_string)
# Set stdout to be a logfile if set
if logfile:
stdout = open(logfile, 'a')
else:
stdout = subprocess.PIPE
# Invoke the process
self.proc = subprocess.Popen(
command,
env=environment,
stdout=stdout,
stderr=stdout,
)
# Signal the process
def signal(self, sent_signal):
signal_map = {
'hup': signal.SIGHUP,
'int': signal.SIGINT,
'term': signal.SIGTERM,
'kill': signal.SIGKILL
}
self.proc.send_signal(signal_map[sent_signal])
def run_os_daemon(command_string, environment=None, logfile=None):
daemon = OSDaemon(command_string, environment, logfile)
return daemon
#
# Run a local OS command via shell
#
def run_os_command(command_string, background=False, environment=None, timeout=None, shell=False):
command = shlex.split(command_string)
try:
command_output = subprocess.run(
command,
shell=shell,
env=environment,
timeout=timeout,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
retcode = command_output.returncode
except subprocess.TimeoutExpired:
retcode = 128
def run_os_command(command_string, background=False, environment=None, timeout=None):
command = shlex_split(command_string)
if background:
def runcmd():
try:
subprocess.run(
command,
env=environment,
timeout=timeout,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
except subprocess.TimeoutExpired:
pass
thread = Thread(target=runcmd, args=())
thread.start()
return 0, None, None
else:
try:
command_output = subprocess.run(
command,
env=environment,
timeout=timeout,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
retcode = command_output.returncode
except subprocess.TimeoutExpired:
retcode = 128
except Exception:
retcode = 255
try:
stdout = command_output.stdout.decode('ascii')
except Exception:
stdout = ''
try:
stderr = command_output.stderr.decode('ascii')
except Exception:
stderr = ''
return retcode, stdout, stderr
try:
stdout = command_output.stdout.decode('ascii')
except Exception:
stdout = ''
try:
stderr = command_output.stderr.decode('ascii')
except Exception:
stderr = ''
return retcode, stdout, stderr
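run_os_command returns a (retcode, stdout, stderr) tuple, with retcode 128 on timeout and 255 on any other failure; with background=True the command is launched in a thread and (0, None, None) is returned immediately. A short usage sketch (the commands are illustrative):

retcode, stdout, stderr = run_os_command('ceph osd stat', timeout=30)
if retcode:
    print('command failed: {}'.format(stderr))

# Fire-and-forget variant; returns (0, None, None) right away
run_os_command('sync', background=True)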
#
@ -77,34 +182,12 @@ def validateUUID(dom_uuid):
return False
#
# Connect and disconnect from Zookeeper
#
def startZKConnection(zk_host):
zk_conn = kazoo.client.KazooClient(hosts=zk_host)
try:
zk_conn.start()
except kazoo.handlers.threading.KazooTimeoutError:
print('Timed out connecting to Zookeeper at "{}".'.format(zk_host))
exit(1)
except Exception as e:
print('Failed to connect to Zookeeper at "{}": {}'.format(zk_host, e))
exit(1)
return zk_conn
def stopZKConnection(zk_conn):
zk_conn.stop()
zk_conn.close()
return 0
#
# Parse a Domain XML object
#
def getDomainXML(zk_conn, dom_uuid):
def getDomainXML(zkhandler, dom_uuid):
try:
xml = zkhandler.readdata(zk_conn, '/domains/{}/xml'.format(dom_uuid))
xml = zkhandler.read(('domain.xml', dom_uuid))
except Exception:
return None
@ -214,8 +297,8 @@ def getDomainDisks(parsed_xml, stats_data):
#
# Get a list of disk devices
#
def getDomainDiskList(zk_conn, dom_uuid):
domain_information = getInformationFromXML(zk_conn, dom_uuid)
def getDomainDiskList(zkhandler, dom_uuid):
domain_information = getInformationFromXML(zkhandler, dom_uuid)
disk_list = []
for disk in domain_information['disks']:
disk_list.append(disk['name'])
@ -224,34 +307,37 @@ def getDomainDiskList(zk_conn, dom_uuid):
#
# Get domain information from XML
# Get a list of domain tags
#
def getInformationFromXML(zk_conn, uuid):
def getDomainTags(zkhandler, dom_uuid):
"""
Gather information about a VM from the Libvirt XML configuration in the Zookeeper database
and return a dict() containing it.
"""
domain_state = zkhandler.readdata(zk_conn, '/domains/{}/state'.format(uuid))
domain_node = zkhandler.readdata(zk_conn, '/domains/{}/node'.format(uuid))
domain_lastnode = zkhandler.readdata(zk_conn, '/domains/{}/lastnode'.format(uuid))
domain_failedreason = zkhandler.readdata(zk_conn, '/domains/{}/failedreason'.format(uuid))
Get a list of tags for domain dom_uuid
try:
domain_node_limit = zkhandler.readdata(zk_conn, '/domains/{}/node_limit'.format(uuid))
except Exception:
domain_node_limit = None
try:
domain_node_selector = zkhandler.readdata(zk_conn, '/domains/{}/node_selector'.format(uuid))
except Exception:
domain_node_selector = None
try:
domain_node_autostart = zkhandler.readdata(zk_conn, '/domains/{}/node_autostart'.format(uuid))
except Exception:
domain_node_autostart = None
try:
domain_migration_method = zkhandler.readdata(zk_conn, '/domains/{}/migration_method'.format(uuid))
except Exception:
domain_migration_method = None
The UUID must be validated before calling this function!
"""
tags = list()
for tag in zkhandler.children(('domain.meta.tags', dom_uuid)):
tag_type = zkhandler.read(('domain.meta.tags', dom_uuid, 'tag.type', tag))
protected = bool(strtobool(zkhandler.read(('domain.meta.tags', dom_uuid, 'tag.protected', tag))))
tags.append({'name': tag, 'type': tag_type, 'protected': protected})
return tags
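Each tag lives as a child key under the domain's tag tree with a type and a protected flag, so the function returns a list of small dicts. An illustrative example of the returned shape (names and values are hypothetical):

example_tags = [
    {'name': 'production', 'type': 'user', 'protected': False},
    {'name': 'migrated', 'type': 'system', 'protected': True},
]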
#
# Get a set of domain metadata
#
def getDomainMetadata(zkhandler, dom_uuid):
"""
Get the domain metadata for domain dom_uuid
The UUID must be validated before calling this function!
"""
domain_node_limit = zkhandler.read(('domain.meta.node_limit', dom_uuid))
domain_node_selector = zkhandler.read(('domain.meta.node_selector', dom_uuid))
domain_node_autostart = zkhandler.read(('domain.meta.autostart', dom_uuid))
domain_migration_method = zkhandler.read(('domain.meta.migrate_method', dom_uuid))
if not domain_node_limit:
domain_node_limit = None
@ -261,23 +347,42 @@ def getInformationFromXML(zk_conn, uuid):
if not domain_node_autostart:
domain_node_autostart = None
try:
domain_profile = zkhandler.readdata(zk_conn, '/domains/{}/profile'.format(uuid))
except Exception:
domain_profile = None
return domain_node_limit, domain_node_selector, domain_node_autostart, domain_migration_method
try:
domain_vnc = zkhandler.readdata(zk_conn, '/domains/{}/vnc'.format(uuid))
#
# Get domain information from XML
#
def getInformationFromXML(zkhandler, uuid):
"""
Gather information about a VM from the Libvirt XML configuration in the Zookeeper database
and return a dict() containing it.
"""
domain_state = zkhandler.read(('domain.state', uuid))
domain_node = zkhandler.read(('domain.node', uuid))
domain_lastnode = zkhandler.read(('domain.last_node', uuid))
domain_failedreason = zkhandler.read(('domain.failed_reason', uuid))
domain_node_limit, domain_node_selector, domain_node_autostart, domain_migration_method = getDomainMetadata(zkhandler, uuid)
domain_tags = getDomainTags(zkhandler, uuid)
domain_profile = zkhandler.read(('domain.profile', uuid))
domain_vnc = zkhandler.read(('domain.console.vnc', uuid))
if domain_vnc:
domain_vnc_listen, domain_vnc_port = domain_vnc.split(':')
except Exception:
else:
domain_vnc_listen = 'None'
domain_vnc_port = 'None'
parsed_xml = getDomainXML(zk_conn, uuid)
parsed_xml = getDomainXML(zkhandler, uuid)
try:
stats_data = loads(zkhandler.readdata(zk_conn, '/domains/{}/stats'.format(uuid)))
except Exception:
stats_data = zkhandler.read(('domain.stats', uuid))
if stats_data is not None:
try:
stats_data = loads(stats_data)
except Exception:
stats_data = {}
else:
stats_data = {}
domain_uuid, domain_name, domain_description, domain_memory, domain_vcpu, domain_vcputopo = getDomainMainDetails(parsed_xml)
@ -306,6 +411,7 @@ def getInformationFromXML(zk_conn, uuid):
'node_selector': domain_node_selector,
'node_autostart': bool(strtobool(domain_node_autostart)),
'migration_method': domain_migration_method,
'tags': domain_tags,
'description': domain_description,
'profile': domain_profile,
'memory': int(domain_memory),
@ -343,23 +449,28 @@ def getDomainNetworks(parsed_xml, stats_data):
net_type = device.attrib.get('type')
except Exception:
net_type = None
try:
net_mac = device.mac.attrib.get('address')
except Exception:
net_mac = None
try:
net_bridge = device.source.attrib.get(net_type)
except Exception:
net_bridge = None
try:
net_model = device.model.attrib.get('type')
except Exception:
net_model = None
try:
net_stats_list = [x for x in stats_data.get('net_stats', []) if x.get('bridge') == net_bridge]
net_stats = net_stats_list[0]
except Exception:
net_stats = {}
net_rd_bytes = net_stats.get('rd_bytes', 0)
net_rd_packets = net_stats.get('rd_packets', 0)
net_rd_errors = net_stats.get('rd_errors', 0)
@ -368,9 +479,19 @@ def getDomainNetworks(parsed_xml, stats_data):
net_wr_packets = net_stats.get('wr_packets', 0)
net_wr_errors = net_stats.get('wr_errors', 0)
net_wr_drops = net_stats.get('wr_drops', 0)
if net_type == 'direct':
net_vni = 'macvtap:' + device.source.attrib.get('dev')
net_bridge = device.source.attrib.get('dev')
elif net_type == 'hostdev':
net_vni = 'hostdev:' + str(device.sriov_device)
net_bridge = str(device.sriov_device)
else:
net_vni = re_match(r'[vm]*br([0-9a-z]+)', net_bridge).group(1)
net_obj = {
'type': net_type,
'vni': re_match(r'[vm]*br([0-9a-z]+)', net_bridge).group(1),
'vni': net_vni,
'mac': net_mac,
'source': net_bridge,
'model': net_model,
@ -409,21 +530,18 @@ def getDomainControllers(parsed_xml):
#
# Verify node is valid in cluster
#
def verifyNode(zk_conn, node):
if zkhandler.exists(zk_conn, '/nodes/{}'.format(node)):
return True
else:
return False
def verifyNode(zkhandler, node):
return zkhandler.exists(('node', node))
#
# Get the primary coordinator node
#
def getPrimaryNode(zk_conn):
def getPrimaryNode(zkhandler):
failcount = 0
while True:
try:
primary_node = zkhandler.readdata(zk_conn, '/primary_node')
primary_node = zkhandler.read('base.config.primary_node')
except Exception:
primary_node = 'none'
@ -444,10 +562,10 @@ def getPrimaryNode(zk_conn):
#
# Find a migration target
#
def findTargetNode(zk_conn, dom_uuid):
def findTargetNode(zkhandler, dom_uuid):
# Determine VM node limits; set config value if read fails
try:
node_limit = zkhandler.readdata(zk_conn, '/domains/{}/node_limit'.format(dom_uuid)).split(',')
node_limit = zkhandler.read(('domain.meta.node_limit', dom_uuid)).split(',')
if not any(node_limit):
node_limit = None
except Exception:
@ -455,39 +573,42 @@ def findTargetNode(zk_conn, dom_uuid):
# Determine VM search field or use default; set config value if read fails
try:
search_field = zkhandler.readdata(zk_conn, '/domains/{}/node_selector'.format(dom_uuid))
search_field = zkhandler.read(('domain.meta.node_selector', dom_uuid))
except Exception:
search_field = 'mem'
search_field = None
# If our search field is invalid, use the default
if search_field is None or search_field == 'None':
search_field = zkhandler.read('base.config.migration_target_selector')
# Execute the search
if search_field == 'mem':
return findTargetNodeMem(zk_conn, node_limit, dom_uuid)
return findTargetNodeMem(zkhandler, node_limit, dom_uuid)
if search_field == 'load':
return findTargetNodeLoad(zk_conn, node_limit, dom_uuid)
return findTargetNodeLoad(zkhandler, node_limit, dom_uuid)
if search_field == 'vcpus':
return findTargetNodeVCPUs(zk_conn, node_limit, dom_uuid)
return findTargetNodeVCPUs(zkhandler, node_limit, dom_uuid)
if search_field == 'vms':
return findTargetNodeVMs(zk_conn, node_limit, dom_uuid)
return findTargetNodeVMs(zkhandler, node_limit, dom_uuid)
# Nothing was found
return None
#
# Get the list of valid target nodes
def getNodes(zk_conn, node_limit, dom_uuid):
#
def getNodes(zkhandler, node_limit, dom_uuid):
valid_node_list = []
full_node_list = zkhandler.listchildren(zk_conn, '/nodes')
try:
current_node = zkhandler.readdata(zk_conn, '/domains/{}/node'.format(dom_uuid))
except kazoo.exceptions.NoNodeError:
current_node = None
full_node_list = zkhandler.children('base.node')
current_node = zkhandler.read(('domain.node', dom_uuid))
for node in full_node_list:
if node_limit and node not in node_limit:
continue
daemon_state = zkhandler.readdata(zk_conn, '/nodes/{}/daemonstate'.format(node))
domain_state = zkhandler.readdata(zk_conn, '/nodes/{}/domainstate'.format(node))
daemon_state = zkhandler.read(('node.state.daemon', node))
domain_state = zkhandler.read(('node.state.domain', node))
if node == current_node:
continue
@ -500,16 +621,18 @@ def getNodes(zk_conn, node_limit, dom_uuid):
return valid_node_list
#
# via free memory (relative to allocated memory)
def findTargetNodeMem(zk_conn, node_limit, dom_uuid):
#
def findTargetNodeMem(zkhandler, node_limit, dom_uuid):
most_provfree = 0
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
node_list = getNodes(zkhandler, node_limit, dom_uuid)
for node in node_list:
memprov = int(zkhandler.readdata(zk_conn, '/nodes/{}/memprov'.format(node)))
memused = int(zkhandler.readdata(zk_conn, '/nodes/{}/memused'.format(node)))
memfree = int(zkhandler.readdata(zk_conn, '/nodes/{}/memfree'.format(node)))
memprov = int(zkhandler.read(('node.memory.provisioned', node)))
memused = int(zkhandler.read(('node.memory.used', node)))
memfree = int(zkhandler.read(('node.memory.free', node)))
memtotal = memused + memfree
provfree = memtotal - memprov
@ -520,14 +643,16 @@ def findTargetNodeMem(zk_conn, node_limit, dom_uuid):
return target_node
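findTargetNodeMem ranks candidate nodes by provisioned-free memory: total physical memory minus the memory already provisioned to VMs, which can differ from actual free memory when guests are not consuming their full allocation. A worked example with hypothetical numbers (values in MiB):

memused, memfree, memprov = 48000, 80000, 96000
memtotal = memused + memfree     # 128000
provfree = memtotal - memprov    # 32000; the node with the largest value wins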
#
# via load average
def findTargetNodeLoad(zk_conn, node_limit, dom_uuid):
#
def findTargetNodeLoad(zkhandler, node_limit, dom_uuid):
least_load = 9999.0
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
node_list = getNodes(zkhandler, node_limit, dom_uuid)
for node in node_list:
load = float(zkhandler.readdata(zk_conn, '/nodes/{}/cpuload'.format(node)))
load = float(zkhandler.read(('node.cpu.load', node)))
if load < least_load:
least_load = load
@ -536,14 +661,16 @@ def findTargetNodeLoad(zk_conn, node_limit, dom_uuid):
return target_node
#
# via total vCPUs
def findTargetNodeVCPUs(zk_conn, node_limit, dom_uuid):
#
def findTargetNodeVCPUs(zkhandler, node_limit, dom_uuid):
least_vcpus = 9999
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
node_list = getNodes(zkhandler, node_limit, dom_uuid)
for node in node_list:
vcpus = int(zkhandler.readdata(zk_conn, '/nodes/{}/vcpualloc'.format(node)))
vcpus = int(zkhandler.read(('node.vcpu.allocated', node)))
if vcpus < least_vcpus:
least_vcpus = vcpus
@ -552,14 +679,16 @@ def findTargetNodeVCPUs(zk_conn, node_limit, dom_uuid):
return target_node
#
# via total VMs
def findTargetNodeVMs(zk_conn, node_limit, dom_uuid):
#
def findTargetNodeVMs(zkhandler, node_limit, dom_uuid):
least_vms = 9999
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
node_list = getNodes(zkhandler, node_limit, dom_uuid)
for node in node_list:
vms = int(zkhandler.readdata(zk_conn, '/nodes/{}/domainscount'.format(node)))
vms = int(zkhandler.read(('node.count.provisioned_domains', node)))
if vms < least_vms:
least_vms = vms
@ -568,7 +697,9 @@ def findTargetNodeVMs(zk_conn, node_limit, dom_uuid):
return target_node
# Connect to the primary host and run a command
#
# Connect to the primary node and run a command
#
def runRemoteCommand(node, command, become=False):
import paramiko
import hashlib
@ -598,3 +729,69 @@ def runRemoteCommand(node, command, become=False):
ssh_client.connect(node)
stdin, stdout, stderr = ssh_client.exec_command(command)
return stdout.read().decode('ascii').rstrip(), stderr.read().decode('ascii').rstrip()
#
# Reload the firewall rules of the system
#
def reload_firewall_rules(rules_file, logger=None):
if logger is not None:
logger.out('Reloading firewall configuration', state='o')
retcode, stdout, stderr = run_os_command('/usr/sbin/nft -f {}'.format(rules_file))
if retcode != 0 and logger is not None:
logger.out('Failed to reload configuration: {}'.format(stderr), state='e')
#
# Create an IP address
#
def createIPAddress(ipaddr, cidrnetmask, dev):
run_os_command(
'ip address add {}/{} dev {}'.format(
ipaddr,
cidrnetmask,
dev
)
)
run_os_command(
'arping -P -U -W 0.02 -c 2 -i {dev} -S {ip} {ip}'.format(
dev=dev,
ip=ipaddr
)
)
#
# Remove an IP address
#
def removeIPAddress(ipaddr, cidrnetmask, dev):
run_os_command(
'ip address delete {}/{} dev {}'.format(
ipaddr,
cidrnetmask,
dev
)
)
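A minimal usage sketch of the two helpers above; the address and device are placeholders:
# Hypothetical floating IP on ens4: createIPAddress also sends gratuitous ARPs
# via arping so neighbours update their caches quickly.
createIPAddress('10.100.0.254', 24, 'ens4')
removeIPAddress('10.100.0.254', 24, 'ens4')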
#
# Sort a set of interface names (e.g. ens1f1v10)
#
def sortInterfaceNames(interface_names):
# We can't handle non-list inputs
if not isinstance(interface_names, list):
return interface_names
def atoi(text):
return int(text) if text.isdigit() else text
def natural_keys(text):
"""
alist.sort(key=natural_keys) sorts in human order
http://nedbatchelder.com/blog/200712/human_sorting.html
(See Toothy's implementation in the comments)
"""
return [atoi(c) for c in re_split(r'(\d+)', text)]
return sorted(interface_names, key=natural_keys)
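A quick illustration of the natural sort above, using hypothetical interface names; numeric parts compare as integers, so v2 sorts ahead of v10 where a plain lexical sort would not:
names = ['ens1f1v10', 'ens1f1v2', 'ens1f0v1']
print(sortInterfaceNames(names))
# -> ['ens1f0v1', 'ens1f1v2', 'ens1f1v10']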


@ -1,6 +1,6 @@
#!/usr/bin/env python3
# log.py - Output (stdout + logfile) functions
# log.py - PVC daemon logger functions
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
@ -19,7 +19,12 @@
#
###############################################################################
import datetime
from collections import deque
from threading import Thread
from queue import Queue
from datetime import datetime
from daemon_lib.zkhandler import ZKHandler
class Logger(object):
@ -72,22 +77,37 @@ class Logger(object):
if self.config['file_logging']:
self.logfile = self.config['log_directory'] + '/pvc.log'
# We open the logfile for the duration of our session, but have a hup function
self.writer = open(self.logfile, 'a', buffering=1)
self.writer = open(self.logfile, 'a', buffering=0)
self.last_colour = ''
self.last_prompt = ''
if self.config['zookeeper_logging']:
self.zookeeper_logger = ZookeeperLogger(config)
self.zookeeper_logger.start()
# Provide a hup function to close and reopen the writer
def hup(self):
self.writer.close()
self.writer = open(self.logfile, 'a', buffering=0)
# Provide a termination function so all messages are flushed before terminating the main daemon
def terminate(self):
if self.config['file_logging']:
self.writer.close()
if self.config['zookeeper_logging']:
self.out("Waiting for Zookeeper message queue to drain", state='s')
while not self.zookeeper_logger.queue.empty():
pass
self.zookeeper_logger.stop()
self.zookeeper_logger.join()
# Output function
def out(self, message, state=None, prefix=''):
# Get the date
if self.config['log_dates']:
date = '{} - '.format(datetime.datetime.now().strftime('%Y/%m/%d %H:%M:%S.%f'))
date = '{} '.format(datetime.now().strftime('%Y/%m/%d %H:%M:%S.%f'))
else:
date = ''
@ -123,6 +143,71 @@ class Logger(object):
if self.config['file_logging']:
self.writer.write(message + '\n')
# Log to Zookeeper
if self.config['zookeeper_logging']:
self.zookeeper_logger.queue.put(message)
# Set last message variables
self.last_colour = colour
self.last_prompt = prompt
class ZookeeperLogger(Thread):
"""
Defines a threaded writer for Zookeeper logs. Threading prevents the blocking of other
daemon events while the records are written. They will be eventually consistent.
"""
def __init__(self, config):
self.config = config
self.node = self.config['node']
self.max_lines = self.config['node_log_lines']
self.queue = Queue()
self.zkhandler = None
self.start_zkhandler()
# Ensure the root keys for this are instantiated
self.zkhandler.write([
('base.logs', ''),
(('logs', self.node), '')
])
self.running = False
Thread.__init__(self, args=(), kwargs=None)
def start_zkhandler(self):
# We must open our own dedicated Zookeeper instance because we can't guarantee one already exists when this starts
if self.zkhandler is not None:
try:
self.zkhandler.disconnect()
except Exception:
pass
self.zkhandler = ZKHandler(self.config, logger=None)
self.zkhandler.connect(persistent=True)
def run(self):
self.running = True
# Get the logs that are currently in Zookeeper and populate our deque
raw_logs = self.zkhandler.read(('logs.messages', self.node))
if raw_logs is None:
raw_logs = ''
logs = deque(raw_logs.split('\n'), self.max_lines)
while self.running:
# Get a new message
try:
message = self.queue.get(timeout=1)
if not message:
continue
except Exception:
continue
if not self.config['log_dates']:
# We want to log dates here, even if the log_dates config is not set
date = '{} '.format(datetime.now().strftime('%Y/%m/%d %H:%M:%S.%f'))
else:
date = ''
# Add the message to the deque
logs.append(f'{date}{message}')
# Write the updated messages into Zookeeper
self.zkhandler.write([(('logs.messages', self.node), '\n'.join(logs))])
return
def stop(self):
self.running = False
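The writer above is essentially a per-node ring buffer; a standalone sketch of that behaviour with a made-up line limit (standing in for config['node_log_lines']):
from collections import deque

max_lines = 3                              # hypothetical node_log_lines value
logs = deque(['line 1', 'line 2', 'line 3'], max_lines)
logs.append('line 4')                      # oldest entry drops off automatically
print('\n'.join(logs))                     # the string written back to logs.messages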


@ -0,0 +1 @@
{"version": "0", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "migrate.sync_lock": "/migrate_sync_lock"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, "osd": {"id": "", "node": "/node", "device": "/device", "stats": "/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}


@ -0,0 +1 @@
{"version": "1", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "migrate.sync_lock": "/migrate_sync_lock"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, "osd": {"id": "", "node": "/node", "device": "/device", "stats": "/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": 
{"name": "", "stats": "/stats"}}


@ -0,0 +1 @@
{"version": "2", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "data.pvc_version": "/pvcversion", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "migrate.sync_lock": "/migrate_sync_lock"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, "osd": {"id": "", "node": "/node", "device": "/device", "stats": "/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", 
"stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}


@ -0,0 +1 @@
{"version": "3", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "data.pvc_version": "/pvcversion", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "meta.tags": "/tags", "migrate.sync_lock": "/migrate_sync_lock"}, "tag": {"name": "", "type": "/type", "protected": "/protected"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, "osd": {"id": "", "node": "/node", "device": "/device", "stats": 
"/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}


@ -0,0 +1 @@
{"version": "4", "root": "", "base": {"root": "", "schema": "/schema", "schema.version": "/schema/version", "config": "/config", "config.maintenance": "/config/maintenance", "config.primary_node": "/config/primary_node", "config.primary_node.sync_lock": "/config/primary_node/sync_lock", "config.upstream_ip": "/config/upstream_ip", "config.migration_target_selector": "/config/migration_target_selector", "cmd": "/cmd", "cmd.node": "/cmd/nodes", "cmd.domain": "/cmd/domains", "cmd.ceph": "/cmd/ceph", "logs": "/logs", "node": "/nodes", "domain": "/domains", "network": "/networks", "storage": "/ceph", "storage.util": "/ceph/util", "osd": "/ceph/osds", "pool": "/ceph/pools", "volume": "/ceph/volumes", "snapshot": "/ceph/snapshots"}, "logs": {"node": "", "messages": "/messages"}, "node": {"name": "", "keepalive": "/keepalive", "mode": "/daemonmode", "data.active_schema": "/activeschema", "data.latest_schema": "/latestschema", "data.static": "/staticdata", "data.pvc_version": "/pvcversion", "running_domains": "/runningdomains", "count.provisioned_domains": "/domainscount", "count.networks": "/networkscount", "state.daemon": "/daemonstate", "state.router": "/routerstate", "state.domain": "/domainstate", "cpu.load": "/cpuload", "vcpu.allocated": "/vcpualloc", "memory.total": "/memtotal", "memory.used": "/memused", "memory.free": "/memfree", "memory.allocated": "/memalloc", "memory.provisioned": "/memprov", "ipmi.hostname": "/ipmihostname", "ipmi.username": "/ipmiusername", "ipmi.password": "/ipmipassword", "sriov": "/sriov", "sriov.pf": "/sriov/pf", "sriov.vf": "/sriov/vf"}, "sriov_pf": {"phy": "", "mtu": "/mtu", "vfcount": "/vfcount"}, "sriov_vf": {"phy": "", "pf": "/pf", "mtu": "/mtu", "mac": "/mac", "phy_mac": "/phy_mac", "config": "/config", "config.vlan_id": "/config/vlan_id", "config.vlan_qos": "/config/vlan_qos", "config.tx_rate_min": "/config/tx_rate_min", "config.tx_rate_max": "/config/tx_rate_max", "config.spoof_check": "/config/spoof_check", "config.link_state": "/config/link_state", "config.trust": "/config/trust", "config.query_rss": "/config/query_rss", "pci": "/pci", "pci.domain": "/pci/domain", "pci.bus": "/pci/bus", "pci.slot": "/pci/slot", "pci.function": "/pci/function", "used": "/used", "used_by": "/used_by"}, "domain": {"name": "", "xml": "/xml", "state": "/state", "profile": "/profile", "stats": "/stats", "node": "/node", "last_node": "/lastnode", "failed_reason": "/failedreason", "storage.volumes": "/rbdlist", "console.log": "/consolelog", "console.vnc": "/vnc", "meta.autostart": "/node_autostart", "meta.migrate_method": "/migration_method", "meta.node_selector": "/node_selector", "meta.node_limit": "/node_limit", "meta.tags": "/tags", "migrate.sync_lock": "/migrate_sync_lock"}, "tag": {"name": "", "type": "/type", "protected": "/protected"}, "network": {"vni": "", "type": "/nettype", "rule": "/firewall_rules", "rule.in": "/firewall_rules/in", "rule.out": "/firewall_rules/out", "nameservers": "/name_servers", "domain": "/domain", "reservation": "/dhcp4_reservations", "lease": "/dhcp4_leases", "ip4.gateway": "/ip4_gateway", "ip4.network": "/ip4_network", "ip4.dhcp": "/dhcp4_flag", "ip4.dhcp_start": "/dhcp4_start", "ip4.dhcp_end": "/dhcp4_end", "ip6.gateway": "/ip6_gateway", "ip6.network": "/ip6_network", "ip6.dhcp": "/dhcp6_flag"}, "reservation": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname"}, "lease": {"mac": "", "ip": "/ipaddr", "hostname": "/hostname", "expiry": "/expiry", "client_id": "/clientid"}, "rule": {"description": "", "rule": "/rule", "order": "/order"}, 
"osd": {"id": "", "node": "/node", "device": "/device", "stats": "/stats"}, "pool": {"name": "", "pgs": "/pgs", "stats": "/stats"}, "volume": {"name": "", "stats": "/stats"}, "snapshot": {"name": "", "stats": "/stats"}}


@ -21,28 +21,26 @@
import re
from kazoo.exceptions import NoNodeError
import daemon_lib.zkhandler as zkhandler
import daemon_lib.common as common
#
# Cluster search functions
#
def getClusterNetworkList(zk_conn):
def getClusterNetworkList(zkhandler):
# Get a list of VNIs by listing the children of /networks
vni_list = zkhandler.listchildren(zk_conn, '/networks')
vni_list = zkhandler.children('base.network')
description_list = []
# For each VNI, get the corresponding description from the data
for vni in vni_list:
description_list.append(zkhandler.readdata(zk_conn, '/networks/{}'.format(vni)))
description_list.append(zkhandler.read(('network', vni)))
return vni_list, description_list
def searchClusterByVNI(zk_conn, vni):
def searchClusterByVNI(zkhandler, vni):
try:
# Get the lists
vni_list, description_list = getClusterNetworkList(zk_conn)
vni_list, description_list = getClusterNetworkList(zkhandler)
# We're looking for the VNI, so find that element's index
index = vni_list.index(vni)
# Get the description_list element at that index
@ -54,10 +52,10 @@ def searchClusterByVNI(zk_conn, vni):
return description
def searchClusterByDescription(zk_conn, description):
def searchClusterByDescription(zkhandler, description):
try:
# Get the lists
vni_list, description_list = getClusterNetworkList(zk_conn)
vni_list, description_list = getClusterNetworkList(zkhandler)
# We're looking for the description, so find that element's index
index = description_list.index(description)
# Get the vni_list element at that index
@ -69,43 +67,43 @@ def searchClusterByDescription(zk_conn, description):
return vni
def getNetworkVNI(zk_conn, network):
def getNetworkVNI(zkhandler, network):
# Validate and obtain alternate passed value
if network.isdigit():
net_description = searchClusterByVNI(zk_conn, network)
net_vni = searchClusterByDescription(zk_conn, net_description)
net_description = searchClusterByVNI(zkhandler, network)
net_vni = searchClusterByDescription(zkhandler, net_description)
else:
net_vni = searchClusterByDescription(zk_conn, network)
net_description = searchClusterByVNI(zk_conn, net_vni)
net_vni = searchClusterByDescription(zkhandler, network)
net_description = searchClusterByVNI(zkhandler, net_vni)
return net_vni
def getNetworkDescription(zk_conn, network):
def getNetworkDescription(zkhandler, network):
# Validate and obtain alternate passed value
if network.isdigit():
net_description = searchClusterByVNI(zk_conn, network)
net_vni = searchClusterByDescription(zk_conn, net_description)
net_description = searchClusterByVNI(zkhandler, network)
net_vni = searchClusterByDescription(zkhandler, net_description)
else:
net_vni = searchClusterByDescription(zk_conn, network)
net_description = searchClusterByVNI(zk_conn, net_vni)
net_vni = searchClusterByDescription(zkhandler, network)
net_description = searchClusterByVNI(zkhandler, net_vni)
return net_description
def getNetworkDHCPLeases(zk_conn, vni):
def getNetworkDHCPLeases(zkhandler, vni):
# Get a list of DHCP leases by listing the children of /networks/<vni>/dhcp4_leases
dhcp4_leases = zkhandler.listchildren(zk_conn, '/networks/{}/dhcp4_leases'.format(vni))
return sorted(dhcp4_leases)
leases = zkhandler.children(('network.lease', vni))
return sorted(leases)
def getNetworkDHCPReservations(zk_conn, vni):
def getNetworkDHCPReservations(zkhandler, vni):
# Get a list of DHCP reservations by listing the children of /networks/<vni>/dhcp4_reservations
dhcp4_reservations = zkhandler.listchildren(zk_conn, '/networks/{}/dhcp4_reservations'.format(vni))
return sorted(dhcp4_reservations)
reservations = zkhandler.children(('network.reservation', vni))
return sorted(reservations)
def getNetworkACLs(zk_conn, vni, _direction):
def getNetworkACLs(zkhandler, vni, _direction):
# Get the (sorted) list of active ACLs
if _direction == 'both':
directions = ['in', 'out']
@ -114,32 +112,39 @@ def getNetworkACLs(zk_conn, vni, _direction):
full_acl_list = []
for direction in directions:
unordered_acl_list = zkhandler.listchildren(zk_conn, '/networks/{}/firewall_rules/{}'.format(vni, direction))
unordered_acl_list = zkhandler.children((f'network.rule.{direction}', vni))
if len(unordered_acl_list) < 1:
continue
ordered_acls = dict()
for acl in unordered_acl_list:
order = zkhandler.readdata(zk_conn, '/networks/{}/firewall_rules/{}/{}/order'.format(vni, direction, acl))
order = zkhandler.read((f'network.rule.{direction}', vni, 'rule.order', acl))
if order is None:
continue
ordered_acls[order] = acl
for order in sorted(ordered_acls.keys()):
rule = zkhandler.readdata(zk_conn, '/networks/{}/firewall_rules/{}/{}/rule'.format(vni, direction, acl))
rule = zkhandler.read((f'network.rule.{direction}', vni, 'rule.rule', acl))
if rule is None:
continue
full_acl_list.append({'direction': direction, 'order': int(order), 'description': ordered_acls[order], 'rule': rule})
return full_acl_list
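For reference, a hypothetical call and return value for the function above, for a network with a single inbound rule:
acls = getNetworkACLs(zkhandler, 100, 'in')
# -> [{'direction': 'in', 'order': 0, 'description': 'allow-ssh',
#      'rule': 'tcp dport 22 counter accept'}]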
def getNetworkInformation(zk_conn, vni):
description = zkhandler.readdata(zk_conn, '/networks/{}'.format(vni))
nettype = zkhandler.readdata(zk_conn, '/networks/{}/nettype'.format(vni))
domain = zkhandler.readdata(zk_conn, '/networks/{}/domain'.format(vni))
name_servers = zkhandler.readdata(zk_conn, '/networks/{}/name_servers'.format(vni))
ip6_network = zkhandler.readdata(zk_conn, '/networks/{}/ip6_network'.format(vni))
ip6_gateway = zkhandler.readdata(zk_conn, '/networks/{}/ip6_gateway'.format(vni))
dhcp6_flag = zkhandler.readdata(zk_conn, '/networks/{}/dhcp6_flag'.format(vni))
ip4_network = zkhandler.readdata(zk_conn, '/networks/{}/ip4_network'.format(vni))
ip4_gateway = zkhandler.readdata(zk_conn, '/networks/{}/ip4_gateway'.format(vni))
dhcp4_flag = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_flag'.format(vni))
dhcp4_start = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_start'.format(vni))
dhcp4_end = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_end'.format(vni))
def getNetworkInformation(zkhandler, vni):
description = zkhandler.read(('network', vni))
nettype = zkhandler.read(('network.type', vni))
domain = zkhandler.read(('network.domain', vni))
name_servers = zkhandler.read(('network.nameservers', vni))
ip6_network = zkhandler.read(('network.ip6.network', vni))
ip6_gateway = zkhandler.read(('network.ip6.gateway', vni))
dhcp6_flag = zkhandler.read(('network.ip6.dhcp', vni))
ip4_network = zkhandler.read(('network.ip4.network', vni))
ip4_gateway = zkhandler.read(('network.ip4.gateway', vni))
dhcp4_flag = zkhandler.read(('network.ip4.dhcp', vni))
dhcp4_start = zkhandler.read(('network.ip4.dhcp_start', vni))
dhcp4_end = zkhandler.read(('network.ip4.dhcp_end', vni))
# Construct a data structure to represent the data
network_information = {
@ -164,19 +169,19 @@ def getNetworkInformation(zk_conn, vni):
return network_information
def getDHCPLeaseInformation(zk_conn, vni, mac_address):
def getDHCPLeaseInformation(zkhandler, vni, mac_address):
# Check whether this is a dynamic or static lease
try:
zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_leases/{}'.format(vni, mac_address))
type_key = 'dhcp4_leases'
except NoNodeError:
zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_reservations/{}'.format(vni, mac_address))
type_key = 'dhcp4_reservations'
if zkhandler.exists(('network.lease', vni, 'lease', mac_address)):
type_key = 'lease'
elif zkhandler.exists(('network.reservation', vni, 'reservation', mac_address)):
type_key = 'reservation'
else:
return None
hostname = zkhandler.readdata(zk_conn, '/networks/{}/{}/{}/hostname'.format(vni, type_key, mac_address))
ip4_address = zkhandler.readdata(zk_conn, '/networks/{}/{}/{}/ipaddr'.format(vni, type_key, mac_address))
if type_key == 'dhcp4_leases':
timestamp = zkhandler.readdata(zk_conn, '/networks/{}/{}/{}/expiry'.format(vni, type_key, mac_address))
hostname = zkhandler.read((f'network.{type_key}', vni, f'{type_key}.hostname', mac_address))
ip4_address = zkhandler.read((f'network.{type_key}', vni, f'{type_key}.ip', mac_address))
if type_key == 'lease':
timestamp = zkhandler.read((f'network.{type_key}', vni, f'{type_key}.expiry', mac_address))
else:
timestamp = 'static'
@ -190,14 +195,14 @@ def getDHCPLeaseInformation(zk_conn, vni, mac_address):
return lease_information
def getACLInformation(zk_conn, vni, direction, description):
order = zkhandler.readdata(zk_conn, '/networks/{}/firewall_rules/{}/{}/order'.format(vni, direction, description))
rule = zkhandler.readdata(zk_conn, '/networks/{}/firewall_rules/{}/{}/rule'.format(vni, direction, description))
def getACLInformation(zkhandler, vni, direction, acl):
order = zkhandler.read((f'network.rule.{direction}', vni, 'rule.order', acl))
rule = zkhandler.read((f'network.rule.{direction}', vni, 'rule.rule', acl))
# Construct a data structure to represent the data
acl_information = {
'order': order,
'description': description,
'description': acl,
'rule': rule,
'direction': direction
}
@ -205,12 +210,7 @@ def getACLInformation(zk_conn, vni, direction, description):
def isValidMAC(macaddr):
allowed = re.compile(r"""
(
^([0-9A-F]{2}[:]){5}([0-9A-F]{2})$
)
""",
re.VERBOSE | re.IGNORECASE)
allowed = re.compile(r'(^([0-9A-F]{2}[:]){5}([0-9A-F]{2})$)', re.VERBOSE | re.IGNORECASE)
if allowed.match(macaddr):
return True
@ -235,7 +235,7 @@ def isValidIP(ipaddr):
#
# Direct functions
#
def add_network(zk_conn, vni, description, nettype,
def add_network(zkhandler, vni, description, nettype,
domain, name_servers, ip4_network, ip4_gateway, ip6_network, ip6_gateway,
dhcp4_flag, dhcp4_start, dhcp4_end):
# Ensure start and end DHCP ranges are set if the flag is set
@ -243,10 +243,11 @@ def add_network(zk_conn, vni, description, nettype,
return False, 'ERROR: DHCPv4 start and end addresses are required for a DHCPv4-enabled network.'
# Check if a network with this VNI or description already exists
if zkhandler.exists(zk_conn, '/networks/{}'.format(vni)):
if zkhandler.exists(('network', vni)):
return False, 'ERROR: A network with VNI "{}" already exists!'.format(vni)
for network in zkhandler.listchildren(zk_conn, '/networks'):
network_description = zkhandler.readdata(zk_conn, '/networks/{}'.format(network))
for network in zkhandler.children('base.network'):
network_description = zkhandler.read(('network', network))
if network_description == description:
return False, 'ERROR: A network with description "{}" already exists!'.format(description)
@ -259,91 +260,96 @@ def add_network(zk_conn, vni, description, nettype,
else:
dhcp6_flag = 'False'
if nettype == 'managed' and not domain:
if nettype in ['managed'] and not domain:
domain = '{}.local'.format(description)
# Add the new network to Zookeeper
zkhandler.writedata(zk_conn, {
'/networks/{}'.format(vni): description,
'/networks/{}/nettype'.format(vni): nettype,
'/networks/{}/domain'.format(vni): domain,
'/networks/{}/name_servers'.format(vni): name_servers,
'/networks/{}/ip6_network'.format(vni): ip6_network,
'/networks/{}/ip6_gateway'.format(vni): ip6_gateway,
'/networks/{}/dhcp6_flag'.format(vni): dhcp6_flag,
'/networks/{}/ip4_network'.format(vni): ip4_network,
'/networks/{}/ip4_gateway'.format(vni): ip4_gateway,
'/networks/{}/dhcp4_flag'.format(vni): dhcp4_flag,
'/networks/{}/dhcp4_start'.format(vni): dhcp4_start,
'/networks/{}/dhcp4_end'.format(vni): dhcp4_end,
'/networks/{}/dhcp4_leases'.format(vni): '',
'/networks/{}/dhcp4_reservations'.format(vni): '',
'/networks/{}/firewall_rules'.format(vni): '',
'/networks/{}/firewall_rules/in'.format(vni): '',
'/networks/{}/firewall_rules/out'.format(vni): ''
})
result = zkhandler.write([
(('network', vni), description),
(('network.type', vni), nettype),
(('network.domain', vni), domain),
(('network.nameservers', vni), name_servers),
(('network.ip6.network', vni), ip6_network),
(('network.ip6.gateway', vni), ip6_gateway),
(('network.ip6.dhcp', vni), dhcp6_flag),
(('network.ip4.network', vni), ip4_network),
(('network.ip4.gateway', vni), ip4_gateway),
(('network.ip4.dhcp', vni), dhcp4_flag),
(('network.ip4.dhcp_start', vni), dhcp4_start),
(('network.ip4.dhcp_end', vni), dhcp4_end),
(('network.lease', vni), ''),
(('network.reservation', vni), ''),
(('network.rule', vni), ''),
(('network.rule.in', vni), ''),
(('network.rule.out', vni), '')
])
return True, 'Network "{}" added successfully!'.format(description)
if result:
return True, 'Network "{}" added successfully!'.format(description)
else:
return False, 'ERROR: Failed to add network.'
def modify_network(zk_conn, vni, description=None, domain=None, name_servers=None,
def modify_network(zkhandler, vni, description=None, domain=None, name_servers=None,
ip4_network=None, ip4_gateway=None, ip6_network=None, ip6_gateway=None,
dhcp4_flag=None, dhcp4_start=None, dhcp4_end=None):
# Add the modified parameters to Zookeeper
zk_data = dict()
update_data = list()
if description is not None:
zk_data.update({'/networks/{}'.format(vni): description})
update_data.append((('network', vni), description))
if domain is not None:
zk_data.update({'/networks/{}/domain'.format(vni): domain})
update_data.append((('network.domain', vni), domain))
if name_servers is not None:
zk_data.update({'/networks/{}/name_servers'.format(vni): name_servers})
update_data.append((('network.nameservers', vni), name_servers))
if ip4_network is not None:
zk_data.update({'/networks/{}/ip4_network'.format(vni): ip4_network})
update_data.append((('network.ip4.network', vni), ip4_network))
if ip4_gateway is not None:
zk_data.update({'/networks/{}/ip4_gateway'.format(vni): ip4_gateway})
update_data.append((('network.ip4.gateway', vni), ip4_gateway))
if ip6_network is not None:
zk_data.update({'/networks/{}/ip6_network'.format(vni): ip6_network})
update_data.append((('network.ip6.network', vni), ip6_network))
if ip6_network:
zk_data.update({'/networks/{}/dhcp6_flag'.format(vni): 'True'})
update_data.append((('network.ip6.dhcp', vni), 'True'))
else:
zk_data.update({'/networks/{}/dhcp6_flag'.format(vni): 'False'})
update_data.append((('network.ip6.dhcp', vni), 'False'))
if ip6_gateway is not None:
zk_data.update({'/networks/{}/ip6_gateway'.format(vni): ip6_gateway})
update_data.append((('network.ip6.gateway', vni), ip6_gateway))
else:
# If we're changing the network, but don't also specify the gateway,
# generate a new one automatically
if ip6_network:
ip6_netpart, ip6_maskpart = ip6_network.split('/')
ip6_gateway = '{}1'.format(ip6_netpart)
zk_data.update({'/networks/{}/ip6_gateway'.format(vni): ip6_gateway})
update_data.append((('network.ip6.gateway', vni), ip6_gateway))
if dhcp4_flag is not None:
zk_data.update({'/networks/{}/dhcp4_flag'.format(vni): dhcp4_flag})
update_data.append((('network.ip4.dhcp', vni), dhcp4_flag))
if dhcp4_start is not None:
zk_data.update({'/networks/{}/dhcp4_start'.format(vni): dhcp4_start})
update_data.append((('network.ip4.dhcp_start', vni), dhcp4_start))
if dhcp4_end is not None:
zk_data.update({'/networks/{}/dhcp4_end'.format(vni): dhcp4_end})
update_data.append((('network.ip4.dhcp_end', vni), dhcp4_end))
zkhandler.writedata(zk_conn, zk_data)
zkhandler.write(update_data)
return True, 'Network "{}" modified successfully!'.format(vni)
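A brief usage sketch of the modify path above; the VNI and addresses are placeholders:
# Hypothetical call: enable DHCPv4 on VNI 100 and set its lease range.
modify_network(zkhandler, '100',
               dhcp4_flag='True',
               dhcp4_start='10.100.0.100',
               dhcp4_end='10.100.0.199')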
def remove_network(zk_conn, network):
def remove_network(zkhandler, network):
# Validate and obtain alternate passed value
vni = getNetworkVNI(zk_conn, network)
description = getNetworkDescription(zk_conn, network)
vni = getNetworkVNI(zkhandler, network)
description = getNetworkDescription(zkhandler, network)
if not vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
# Delete the configuration
zkhandler.deletekey(zk_conn, '/networks/{}'.format(vni))
zkhandler.delete([
('network', vni)
])
return True, 'Network "{}" removed successfully!'.format(description)
def add_dhcp_reservation(zk_conn, network, ipaddress, macaddress, hostname):
def add_dhcp_reservation(zkhandler, network, ipaddress, macaddress, hostname):
# Validate and obtain standard passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
@ -356,71 +362,67 @@ def add_dhcp_reservation(zk_conn, network, ipaddress, macaddress, hostname):
if not isValidIP(ipaddress):
return False, 'ERROR: IP address "{}" is not valid!'.format(ipaddress)
if zkhandler.exists(zk_conn, '/networks/{}/dhcp4_reservations/{}'.format(net_vni, macaddress)):
if zkhandler.exists(('network.reservation', net_vni, 'reservation', macaddress)):
return False, 'ERROR: A reservation with MAC "{}" already exists!'.format(macaddress)
# Add the new static lease to ZK
try:
zkhandler.writedata(zk_conn, {
'/networks/{}/dhcp4_reservations/{}'.format(net_vni, macaddress): 'static',
'/networks/{}/dhcp4_reservations/{}/hostname'.format(net_vni, macaddress): hostname,
'/networks/{}/dhcp4_reservations/{}/ipaddr'.format(net_vni, macaddress): ipaddress
})
except Exception as e:
return False, 'ERROR: Failed to write to Zookeeper! Exception: "{}".'.format(e)
zkhandler.write([
(('network.reservation', net_vni, 'reservation', macaddress), 'static'),
(('network.reservation', net_vni, 'reservation.hostname', macaddress), hostname),
(('network.reservation', net_vni, 'reservation.ip', macaddress), ipaddress),
])
return True, 'DHCP reservation "{}" added successfully!'.format(macaddress)
def remove_dhcp_reservation(zk_conn, network, reservation):
def remove_dhcp_reservation(zkhandler, network, reservation):
# Validate and obtain standard passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
match_description = ''
# Check if the reservation matches a static reservation description, a mac, or an IP address currently in the database
dhcp4_reservations_list = getNetworkDHCPReservations(zk_conn, net_vni)
dhcp4_reservations_list = getNetworkDHCPReservations(zkhandler, net_vni)
for macaddr in dhcp4_reservations_list:
hostname = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_reservations/{}/hostname'.format(net_vni, macaddr))
ipaddress = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_reservations/{}/ipaddr'.format(net_vni, macaddr))
hostname = zkhandler.read(('network.reservation', net_vni, 'reservation.hostname', macaddr))
ipaddress = zkhandler.read(('network.reservation', net_vni, 'reservation.ip', macaddr))
if reservation == macaddr or reservation == hostname or reservation == ipaddress:
match_description = macaddr
lease_type_zk = 'reservations'
lease_type_zk = 'reservation'
lease_type_human = 'static reservation'
# Check if the reservation matches a dynamic reservation description, a mac, or an IP address currently in the database
dhcp4_leases_list = getNetworkDHCPLeases(zk_conn, net_vni)
dhcp4_leases_list = getNetworkDHCPLeases(zkhandler, net_vni)
for macaddr in dhcp4_leases_list:
hostname = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_leases/{}/hostname'.format(net_vni, macaddr))
ipaddress = zkhandler.readdata(zk_conn, '/networks/{}/dhcp4_leases/{}/ipaddr'.format(net_vni, macaddr))
hostname = zkhandler.read(('network.lease', net_vni, 'lease.hostname', macaddr))
ipaddress = zkhandler.read(('network.lease', net_vni, 'lease.ip', macaddr))
if reservation == macaddr or reservation == hostname or reservation == ipaddress:
match_description = macaddr
lease_type_zk = 'leases'
lease_type_zk = 'lease'
lease_type_human = 'dynamic lease'
if not match_description:
return False, 'ERROR: No DHCP reservation or lease exists matching "{}"!'.format(reservation)
# Remove the entry from zookeeper
try:
zkhandler.deletekey(zk_conn, '/networks/{}/dhcp4_{}/{}'.format(net_vni, lease_type_zk, match_description))
except Exception:
return False, 'ERROR: Failed to write to Zookeeper!'
zkhandler.delete([
(f'network.{lease_type_zk}', net_vni, f'{lease_type_zk}', match_description),
])
return True, 'DHCP {} "{}" removed successfully!'.format(lease_type_human, match_description)
def add_acl(zk_conn, network, direction, description, rule, order):
def add_acl(zkhandler, network, direction, description, rule, order):
# Validate and obtain standard passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
# Check if the ACL matches a description currently in the database
match_description = ''
full_acl_list = getNetworkACLs(zk_conn, net_vni, 'both')
full_acl_list = getNetworkACLs(zkhandler, net_vni, 'both')
for acl in full_acl_list:
if acl['description'] == description:
match_description = acl['description']
@ -435,7 +437,7 @@ def add_acl(zk_conn, network, direction, description, rule, order):
direction = "out"
# Handle reordering
full_acl_list = getNetworkACLs(zk_conn, net_vni, direction)
full_acl_list = getNetworkACLs(zkhandler, net_vni, direction)
acl_list_length = len(full_acl_list)
# Set order to len
if not order or int(order) > acl_list_length:
@ -448,44 +450,37 @@ def add_acl(zk_conn, network, direction, description, rule, order):
full_acl_list.insert(order, {'direction': direction, 'description': description, 'rule': rule})
# Update the existing ordering
updated_orders = dict()
for idx, acl in enumerate(full_acl_list):
if acl['description'] == description:
continue
updated_orders[
'/networks/{}/firewall_rules/{}/{}/order'.format(net_vni, direction, acl['description'])
] = idx
if updated_orders:
try:
zkhandler.writedata(zk_conn, updated_orders)
except Exception as e:
return False, 'ERROR: Failed to write to Zookeeper! Exception: "{}".'.format(e)
if idx == acl['order']:
continue
else:
zkhandler.write([
((f'network.rule.{direction}', net_vni, 'rule.order', acl['description']), idx)
])
# Add the new rule
try:
zkhandler.writedata(zk_conn, {
'/networks/{}/firewall_rules/{}/{}'.format(net_vni, direction, description): '',
'/networks/{}/firewall_rules/{}/{}/order'.format(net_vni, direction, description): order,
'/networks/{}/firewall_rules/{}/{}/rule'.format(net_vni, direction, description): rule
})
except Exception as e:
return False, 'ERROR: Failed to write to Zookeeper! Exception: "{}".'.format(e)
zkhandler.write([
((f'network.rule.{direction}', net_vni, 'rule', description), ''),
((f'network.rule.{direction}', net_vni, 'rule.order', description), order),
((f'network.rule.{direction}', net_vni, 'rule.rule', description), rule),
])
return True, 'Firewall rule "{}" added successfully!'.format(description)
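A worked illustration of the reordering above, with hypothetical rule names; the end state matches what the per-rule order writes produce:
rules = [{'description': 'allow-ssh', 'order': 0},
         {'description': 'allow-web', 'order': 1}]
rules.insert(1, {'description': 'allow-dns', 'order': 1})   # new rule at order 1
for idx, acl in enumerate(rules):
    acl['order'] = idx
# -> allow-ssh keeps 0, allow-dns takes 1, allow-web is pushed to 2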
def remove_acl(zk_conn, network, description):
def remove_acl(zkhandler, network, description):
# Validate and obtain standard passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
match_description = ''
# Check if the ACL matches a description currently in the database
acl_list = getNetworkACLs(zk_conn, net_vni, 'both')
acl_list = getNetworkACLs(zkhandler, net_vni, 'both')
for acl in acl_list:
if acl['description'] == description:
match_description = acl['description']
@ -495,77 +490,75 @@ def remove_acl(zk_conn, network, description):
return False, 'ERROR: No firewall rule exists matching description "{}"!'.format(description)
# Remove the entry from zookeeper
try:
zkhandler.deletekey(zk_conn, '/networks/{}/firewall_rules/{}/{}'.format(net_vni, match_direction, match_description))
except Exception as e:
return False, 'ERROR: Failed to write to Zookeeper! Exception: "{}".'.format(e)
zkhandler.delete([
(f'network.rule.{match_direction}', net_vni, 'rule', match_description)
])
# Update the existing ordering
updated_acl_list = getNetworkACLs(zk_conn, net_vni, match_direction)
updated_orders = dict()
updated_acl_list = getNetworkACLs(zkhandler, net_vni, match_direction)
for idx, acl in enumerate(updated_acl_list):
updated_orders[
'/networks/{}/firewall_rules/{}/{}/order'.format(net_vni, match_direction, acl['description'])
] = idx
if acl['description'] == description:
continue
if updated_orders:
try:
zkhandler.writedata(zk_conn, updated_orders)
except Exception as e:
return False, 'ERROR: Failed to write to Zookeeper! Exception: "{}".'.format(e)
if idx == acl['order']:
continue
else:
zkhandler.write([
((f'network.rule.{match_direction}', net_vni, 'rule.order', acl['description']), idx),
])
return True, 'Firewall rule "{}" removed successfully!'.format(match_description)
def get_info(zk_conn, network):
def get_info(zkhandler, network):
# Validate and obtain alternate passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
network_information = getNetworkInformation(zk_conn, network)
network_information = getNetworkInformation(zkhandler, network)
if not network_information:
return False, 'ERROR: Could not get information about network "{}"'.format(network)
return True, network_information
def get_list(zk_conn, limit, is_fuzzy=True):
def get_list(zkhandler, limit, is_fuzzy=True):
net_list = []
full_net_list = zkhandler.listchildren(zk_conn, '/networks')
full_net_list = zkhandler.children('base.network')
for net in full_net_list:
description = zkhandler.readdata(zk_conn, '/networks/{}'.format(net))
description = zkhandler.read(('network', net))
if limit:
try:
if not is_fuzzy:
limit = '^' + limit + '$'
if re.match(limit, net):
net_list.append(getNetworkInformation(zk_conn, net))
net_list.append(getNetworkInformation(zkhandler, net))
if re.match(limit, description):
net_list.append(getNetworkInformation(zk_conn, net))
net_list.append(getNetworkInformation(zkhandler, net))
except Exception as e:
return False, 'Regex Error: {}'.format(e)
else:
net_list.append(getNetworkInformation(zk_conn, net))
net_list.append(getNetworkInformation(zkhandler, net))
return True, net_list
def get_list_dhcp(zk_conn, network, limit, only_static=False, is_fuzzy=True):
def get_list_dhcp(zkhandler, network, limit, only_static=False, is_fuzzy=True):
# Validate and obtain alternate passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
dhcp_list = []
if only_static:
full_dhcp_list = getNetworkDHCPReservations(zk_conn, net_vni)
full_dhcp_list = getNetworkDHCPReservations(zkhandler, net_vni)
else:
full_dhcp_list = getNetworkDHCPReservations(zk_conn, net_vni)
full_dhcp_list += getNetworkDHCPLeases(zk_conn, net_vni)
full_dhcp_list = getNetworkDHCPReservations(zkhandler, net_vni)
full_dhcp_list += getNetworkDHCPLeases(zkhandler, net_vni)
if limit:
try:
@ -591,14 +584,14 @@ def get_list_dhcp(zk_conn, network, limit, only_static=False, is_fuzzy=True):
valid_lease = True
if valid_lease:
dhcp_list.append(getDHCPLeaseInformation(zk_conn, net_vni, lease))
dhcp_list.append(getDHCPLeaseInformation(zkhandler, net_vni, lease))
return True, dhcp_list
def get_list_acl(zk_conn, network, limit, direction, is_fuzzy=True):
def get_list_acl(zkhandler, network, limit, direction, is_fuzzy=True):
# Validate and obtain alternate passed value
net_vni = getNetworkVNI(zk_conn, network)
net_vni = getNetworkVNI(zkhandler, network)
if not net_vni:
return False, 'ERROR: Could not find network "{}" in the cluster!'.format(network)
@ -611,7 +604,7 @@ def get_list_acl(zk_conn, network, limit, direction, is_fuzzy=True):
direction = "out"
acl_list = []
full_acl_list = getNetworkACLs(zk_conn, net_vni, direction)
full_acl_list = getNetworkACLs(zkhandler, net_vni, direction)
if limit:
try:
@ -638,3 +631,226 @@ def get_list_acl(zk_conn, network, limit, direction, is_fuzzy=True):
acl_list.append(acl)
return True, acl_list
#
# SR-IOV functions
#
# These are separate since they don't work like other network types
#
def getSRIOVPFInformation(zkhandler, node, pf):
mtu = zkhandler.read(('node.sriov.pf', node, 'sriov_pf.mtu', pf))
retcode, vf_list = get_list_sriov_vf(zkhandler, node, pf)
if retcode:
vfs = common.sortInterfaceNames([vf['phy'] for vf in vf_list if vf['pf'] == pf])
else:
vfs = []
# Construct a data structure to represent the data
pf_information = {
'phy': pf,
'mtu': mtu,
'vfs': vfs,
}
return pf_information
def get_info_sriov_pf(zkhandler, node, pf):
pf_information = getSRIOVPFInformation(zkhandler, node, pf)
if not pf_information:
return False, 'ERROR: Could not get information about SR-IOV PF "{}" on node "{}"'.format(pf, node)
return True, pf_information
def get_list_sriov_pf(zkhandler, node):
pf_list = list()
pf_phy_list = zkhandler.children(('node.sriov.pf', node))
for phy in pf_phy_list:
retcode, pf_information = get_info_sriov_pf(zkhandler, node, phy)
if retcode:
pf_list.append(pf_information)
return True, pf_list
def getSRIOVVFInformation(zkhandler, node, vf):
if not zkhandler.exists(('node.sriov.vf', node, 'sriov_vf', vf)):
return []
pf = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.pf', vf))
mtu = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.mtu', vf))
mac = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.mac', vf))
vlan_id = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.vlan_id', vf))
vlan_qos = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.vlan_qos', vf))
tx_rate_min = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.tx_rate_min', vf))
tx_rate_max = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.tx_rate_max', vf))
link_state = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.link_state', vf))
spoof_check = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.spoof_check', vf))
trust = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.trust', vf))
query_rss = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.config.query_rss', vf))
pci_domain = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.pci.domain', vf))
pci_bus = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.pci.bus', vf))
pci_slot = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.pci.slot', vf))
pci_function = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.pci.function', vf))
used = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.used', vf))
used_by_domain = zkhandler.read(('node.sriov.vf', node, 'sriov_vf.used_by', vf))
vf_information = {
'phy': vf,
'pf': pf,
'mtu': mtu,
'mac': mac,
'config': {
'vlan_id': vlan_id,
'vlan_qos': vlan_qos,
'tx_rate_min': tx_rate_min,
'tx_rate_max': tx_rate_max,
'link_state': link_state,
'spoof_check': spoof_check,
'trust': trust,
'query_rss': query_rss,
},
'pci': {
'domain': pci_domain,
'bus': pci_bus,
'slot': pci_slot,
'function': pci_function,
},
'usage': {
'used': used,
'domain': used_by_domain,
}
}
return vf_information
def get_info_sriov_vf(zkhandler, node, vf):
# Verify node is valid
valid_node = common.verifyNode(zkhandler, node)
if not valid_node:
return False, 'ERROR: Specified node "{}" is invalid.'.format(node)
vf_information = getSRIOVVFInformation(zkhandler, node, vf)
if not vf_information:
return False, 'ERROR: Could not find SR-IOV VF "{}" on node "{}"'.format(vf, node)
return True, vf_information
def get_list_sriov_vf(zkhandler, node, pf=None):
# Verify node is valid
valid_node = common.verifyNode(zkhandler, node)
if not valid_node:
return False, 'ERROR: Specified node "{}" is invalid.'.format(node)
vf_list = list()
vf_phy_list = common.sortInterfaceNames(zkhandler.children(('node.sriov.vf', node)))
for phy in vf_phy_list:
retcode, vf_information = get_info_sriov_vf(zkhandler, node, phy)
if retcode:
if pf is not None:
if vf_information['pf'] == pf:
vf_list.append(vf_information)
else:
vf_list.append(vf_information)
return True, vf_list
def set_sriov_vf_config(zkhandler, node, vf, vlan_id=None, vlan_qos=None, tx_rate_min=None, tx_rate_max=None, link_state=None, spoof_check=None, trust=None, query_rss=None):
# Verify node is valid
valid_node = common.verifyNode(zkhandler, node)
if not valid_node:
return False, 'ERROR: Specified node "{}" is invalid.'.format(node)
# Verify VF is valid
vf_information = getSRIOVVFInformation(zkhandler, node, vf)
if not vf_information:
return False, 'ERROR: Could not find SR-IOV VF "{}" on node "{}".'.format(vf, node)
update_list = list()
if vlan_id is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.vlan_id', vf), vlan_id))
if vlan_qos is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.vlan_qos', vf), vlan_qos))
if tx_rate_min is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.tx_rate_min', vf), tx_rate_min))
if tx_rate_max is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.tx_rate_max', vf), tx_rate_max))
if link_state is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.link_state', vf), link_state))
if spoof_check is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.spoof_check', vf), spoof_check))
if trust is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.trust', vf), trust))
if query_rss is not None:
update_list.append((('node.sriov.vf', node, 'sriov_vf.config.query_rss', vf), query_rss))
if len(update_list) < 1:
return False, 'ERROR: No changes to apply.'
result = zkhandler.write(update_list)
if result:
return True, 'Successfully modified configuration of SR-IOV VF "{}" on node "{}".'.format(vf, node)
else:
return False, 'Failed to modify configuration of SR-IOV VF "{}" on node "{}".'.format(vf, node)
def set_sriov_vf_vm(zkhandler, vm_uuid, node, vf, vf_macaddr, vf_type):
# Verify node is valid
valid_node = common.verifyNode(zkhandler, node)
if not valid_node:
return False
# Verify VF is valid
vf_information = getSRIOVVFInformation(zkhandler, node, vf)
if not vf_information:
return False
update_list = [
(('node.sriov.vf', node, 'sriov_vf.used', vf), 'True'),
(('node.sriov.vf', node, 'sriov_vf.used_by', vf), vm_uuid),
(('node.sriov.vf', node, 'sriov_vf.mac', vf), vf_macaddr),
]
# Hostdev type SR-IOV prevents the guest from live migrating
if vf_type == 'hostdev':
update_list.append(
(('domain.meta.migrate_method', vm_uuid), 'shutdown')
)
zkhandler.write(update_list)
return True
def unset_sriov_vf_vm(zkhandler, node, vf):
# Verify node is valid
valid_node = common.verifyNode(zkhandler, node)
if not valid_node:
return False
# Verify VF is valid
vf_information = getSRIOVVFInformation(zkhandler, node, vf)
if not vf_information:
return False
update_list = [
(('node.sriov.vf', node, 'sriov_vf.used', vf), 'False'),
(('node.sriov.vf', node, 'sriov_vf.used_by', vf), ''),
(('node.sriov.vf', node, 'sriov_vf.mac', vf), zkhandler.read(('node.sriov.vf', node, 'sriov_vf.phy_mac', vf)))
]
zkhandler.write(update_list)
return True
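As a quick illustration of how these helpers fit together, the following sketch reserves a VF for a VM and later releases it, using only the functions shown above. It assumes an already-connected zkhandler instance; the node name, VF name, MAC address, and VM UUID are placeholders, not values from this changeset.
# Sketch only: reserve an SR-IOV VF for a VM, inspect it, then release it.
# Assumes a connected zkhandler; all names below are placeholders.
vm_uuid = '00000000-0000-0000-0000-000000000000'
node = 'hv1'
vf = 'ens1f0v0'
# Mark the VF as used by the VM; 'hostdev' also forces the VM's migration
# method to 'shutdown', since hostdev passthrough blocks live migration.
if set_sriov_vf_vm(zkhandler, vm_uuid, node, vf, '52:54:00:12:34:56', 'hostdev'):
    retcode, info = get_info_sriov_vf(zkhandler, node, vf)
    if retcode:
        print(info['usage'])  # e.g. {'used': 'True', 'domain': vm_uuid}
# When the VM no longer needs the device, free the VF and restore its
# physical MAC address.
unset_sriov_vf_vm(zkhandler, node, vf)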

View File

@ -22,31 +22,31 @@
import time
import re
import daemon_lib.zkhandler as zkhandler
import daemon_lib.common as common
def getNodeInformation(zk_conn, node_name):
def getNodeInformation(zkhandler, node_name):
"""
Gather information about a node from the Zookeeper database and return a dict() containing it.
"""
node_daemon_state = zkhandler.readdata(zk_conn, '/nodes/{}/daemonstate'.format(node_name))
node_coordinator_state = zkhandler.readdata(zk_conn, '/nodes/{}/routerstate'.format(node_name))
node_domain_state = zkhandler.readdata(zk_conn, '/nodes/{}/domainstate'.format(node_name))
node_static_data = zkhandler.readdata(zk_conn, '/nodes/{}/staticdata'.format(node_name)).split()
node_daemon_state = zkhandler.read(('node.state.daemon', node_name))
node_coordinator_state = zkhandler.read(('node.state.router', node_name))
node_domain_state = zkhandler.read(('node.state.domain', node_name))
node_static_data = zkhandler.read(('node.data.static', node_name)).split()
node_pvc_version = zkhandler.read(('node.data.pvc_version', node_name))
node_cpu_count = int(node_static_data[0])
node_kernel = node_static_data[1]
node_os = node_static_data[2]
node_arch = node_static_data[3]
node_vcpu_allocated = int(zkhandler.readdata(zk_conn, 'nodes/{}/vcpualloc'.format(node_name)))
node_mem_total = int(zkhandler.readdata(zk_conn, '/nodes/{}/memtotal'.format(node_name)))
node_mem_allocated = int(zkhandler.readdata(zk_conn, '/nodes/{}/memalloc'.format(node_name)))
node_mem_provisioned = int(zkhandler.readdata(zk_conn, '/nodes/{}/memprov'.format(node_name)))
node_mem_used = int(zkhandler.readdata(zk_conn, '/nodes/{}/memused'.format(node_name)))
node_mem_free = int(zkhandler.readdata(zk_conn, '/nodes/{}/memfree'.format(node_name)))
node_load = float(zkhandler.readdata(zk_conn, '/nodes/{}/cpuload'.format(node_name)))
node_domains_count = int(zkhandler.readdata(zk_conn, '/nodes/{}/domainscount'.format(node_name)))
node_running_domains = zkhandler.readdata(zk_conn, '/nodes/{}/runningdomains'.format(node_name)).split()
node_vcpu_allocated = int(zkhandler.read(('node.vcpu.allocated', node_name)))
node_mem_total = int(zkhandler.read(('node.memory.total', node_name)))
node_mem_allocated = int(zkhandler.read(('node.memory.allocated', node_name)))
node_mem_provisioned = int(zkhandler.read(('node.memory.provisioned', node_name)))
node_mem_used = int(zkhandler.read(('node.memory.used', node_name)))
node_mem_free = int(zkhandler.read(('node.memory.free', node_name)))
node_load = float(zkhandler.read(('node.cpu.load', node_name)))
node_domains_count = int(zkhandler.read(('node.count.provisioned_domains', node_name)))
node_running_domains = zkhandler.read(('node.running_domains', node_name)).split()
# Construct a data structure to represent the data
node_information = {
@ -54,6 +54,7 @@ def getNodeInformation(zk_conn, node_name):
'daemon_state': node_daemon_state,
'coordinator_state': node_coordinator_state,
'domain_state': node_domain_state,
'pvc_version': node_pvc_version,
'cpu_count': node_cpu_count,
'kernel': node_kernel,
'os': node_os,
@ -79,118 +80,142 @@ def getNodeInformation(zk_conn, node_name):
#
# Direct Functions
#
def secondary_node(zk_conn, node):
def secondary_node(zkhandler, node):
# Verify node is valid
if not common.verifyNode(zk_conn, node):
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Ensure node is a coordinator
daemon_mode = zkhandler.readdata(zk_conn, '/nodes/{}/daemonmode'.format(node))
daemon_mode = zkhandler.read(('node.mode', node))
if daemon_mode == 'hypervisor':
return False, 'ERROR: Cannot change router mode on non-coordinator node "{}"'.format(node)
# Ensure node is in run daemonstate
daemon_state = zkhandler.readdata(zk_conn, '/nodes/{}/daemonstate'.format(node))
daemon_state = zkhandler.read(('node.state.daemon', node))
if daemon_state != 'run':
return False, 'ERROR: Node "{}" is not active'.format(node)
# Get current state
current_state = zkhandler.readdata(zk_conn, '/nodes/{}/routerstate'.format(node))
if current_state == 'primary':
retmsg = 'Setting node {} in secondary router mode.'.format(node)
zkhandler.writedata(zk_conn, {
'/primary_node': 'none'
})
else:
return False, 'Node "{}" is already in secondary router mode.'.format(node)
return True, retmsg
def primary_node(zk_conn, node):
# Verify node is valid
if not common.verifyNode(zk_conn, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Ensure node is a coordinator
daemon_mode = zkhandler.readdata(zk_conn, '/nodes/{}/daemonmode'.format(node))
if daemon_mode == 'hypervisor':
return False, 'ERROR: Cannot change router mode on non-coordinator node "{}"'.format(node)
# Ensure node is in run daemonstate
daemon_state = zkhandler.readdata(zk_conn, '/nodes/{}/daemonstate'.format(node))
if daemon_state != 'run':
return False, 'ERROR: Node "{}" is not active'.format(node)
# Get current state
current_state = zkhandler.readdata(zk_conn, '/nodes/{}/routerstate'.format(node))
current_state = zkhandler.read(('node.state.router', node))
if current_state == 'secondary':
retmsg = 'Setting node {} in primary router mode.'.format(node)
zkhandler.writedata(zk_conn, {
'/primary_node': node
})
else:
return False, 'Node "{}" is already in primary router mode.'.format(node)
return True, 'Node "{}" is already in secondary router mode.'.format(node)
retmsg = 'Setting node {} in secondary router mode.'.format(node)
zkhandler.write([
('base.config.primary_node', 'none')
])
return True, retmsg
def flush_node(zk_conn, node, wait=False):
def primary_node(zkhandler, node):
# Verify node is valid
if not common.verifyNode(zk_conn, node):
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Ensure node is a coordinator
daemon_mode = zkhandler.read(('node.mode', node))
if daemon_mode == 'hypervisor':
return False, 'ERROR: Cannot change router mode on non-coordinator node "{}"'.format(node)
# Ensure node is in run daemonstate
daemon_state = zkhandler.read(('node.state.daemon', node))
if daemon_state != 'run':
return False, 'ERROR: Node "{}" is not active'.format(node)
# Get current state
current_state = zkhandler.read(('node.state.router', node))
if current_state == 'primary':
return True, 'Node "{}" is already in primary router mode.'.format(node)
retmsg = 'Setting node {} in primary router mode.'.format(node)
zkhandler.write([
('base.config.primary_node', node)
])
return True, retmsg
def flush_node(zkhandler, node, wait=False):
# Verify node is valid
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
if zkhandler.read(('node.state.domain', node)) == 'flushed':
return True, 'Hypervisor {} is already flushed.'.format(node)
retmsg = 'Flushing hypervisor {} of running VMs.'.format(node)
# Add the new domain to Zookeeper
zkhandler.writedata(zk_conn, {
'/nodes/{}/domainstate'.format(node): 'flush'
})
zkhandler.write([
(('node.state.domain', node), 'flush')
])
if wait:
while zkhandler.readdata(zk_conn, '/nodes/{}/domainstate'.format(node)) == 'flush':
while zkhandler.read(('node.state.domain', node)) == 'flush':
time.sleep(1)
retmsg = 'Flushed hypervisor {} of running VMs.'.format(node)
return True, retmsg
def ready_node(zk_conn, node, wait=False):
def ready_node(zkhandler, node, wait=False):
# Verify node is valid
if not common.verifyNode(zk_conn, node):
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
if zkhandler.read(('node.state.domain', node)) == 'ready':
return True, 'Hypervisor {} is already ready.'.format(node)
retmsg = 'Restoring hypervisor {} to active service.'.format(node)
# Add the new domain to Zookeeper
zkhandler.writedata(zk_conn, {
'/nodes/{}/domainstate'.format(node): 'unflush'
})
zkhandler.write([
(('node.state.domain', node), 'unflush')
])
if wait:
while zkhandler.readdata(zk_conn, '/nodes/{}/domainstate'.format(node)) == 'unflush':
while zkhandler.read(('node.state.domain', node)) == 'unflush':
time.sleep(1)
retmsg = 'Restored hypervisor {} to active service.'.format(node)
return True, retmsg
def get_info(zk_conn, node):
def get_node_log(zkhandler, node, lines=2000):
# Verify node is valid
if not common.verifyNode(zk_conn, node):
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Get the data from ZK
node_log = zkhandler.read(('logs.messages', node))
if node_log is None:
return True, ''
# Shrink the log buffer to length lines
shrunk_log = node_log.split('\n')[-lines:]
loglines = '\n'.join(shrunk_log)
return True, loglines
def get_info(zkhandler, node):
# Verify node is valid
if not common.verifyNode(zkhandler, node):
return False, 'ERROR: No node named "{}" is present in the cluster.'.format(node)
# Get information about node in a pretty format
node_information = getNodeInformation(zk_conn, node)
node_information = getNodeInformation(zkhandler, node)
if not node_information:
return False, 'ERROR: Could not get information about node "{}".'.format(node)
return True, node_information
def get_list(zk_conn, limit, daemon_state=None, coordinator_state=None, domain_state=None, is_fuzzy=True):
def get_list(zkhandler, limit, daemon_state=None, coordinator_state=None, domain_state=None, is_fuzzy=True):
node_list = []
full_node_list = zkhandler.listchildren(zk_conn, '/nodes')
full_node_list = zkhandler.children('base.node')
for node in full_node_list:
if limit:
@ -199,11 +224,11 @@ def get_list(zk_conn, limit, daemon_state=None, coordinator_state=None, domain_s
limit = '^' + limit + '$'
if re.match(limit, node):
node_list.append(getNodeInformation(zk_conn, node))
node_list.append(getNodeInformation(zkhandler, node))
except Exception as e:
return False, 'Regex Error: {}'.format(e)
else:
node_list.append(getNodeInformation(zk_conn, node))
node_list.append(getNodeInformation(zkhandler, node))
if daemon_state or coordinator_state or domain_state:
limited_node_list = []

File diff suppressed because it is too large

File diff suppressed because it is too large

112
debian/changelog vendored
View File

@ -1,3 +1,115 @@
pvc (0.9.27-0) unstable; urgency=high
* [CLI Client] Fixes a bug with vm modify command when passed a file
-- Joshua M. Boniface <joshua@boniface.me> Mon, 19 Jul 2021 00:03:40 -0400
pvc (0.9.26-0) unstable; urgency=high
* [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures
* [All] Implements VM tagging functionality
* [All] Implements Node log access via PVC functionality
-- Joshua M. Boniface <joshua@boniface.me> Sun, 18 Jul 2021 20:49:52 -0400
pvc (0.9.25-0) unstable; urgency=high
* [Node Daemon] Returns to Rados library calls for Ceph due to performance problems
* [Node Daemon] Adds a date output to keepalive messages
* [Daemons] Configures ZK connection logging only for persistent connections
* [API Provisioner] Add context manager-based chroot to Debootstrap example script
* [Node Daemon] Fixes a bug where shutdown daemon state was overwritten
-- Joshua M. Boniface <joshua@boniface.me> Sun, 11 Jul 2021 23:19:09 -0400
pvc (0.9.24-0) unstable; urgency=high
* [Node Daemon] Removes Rados module polling of Ceph cluster and returns to command-based polling for timeout purposes, and removes some flaky return statements
* [Node Daemon] Removes flaky Zookeeper connection renewals that caused problems
* [CLI Client] Allow raw lists of clusters from `pvc cluster list`
* [API Daemon] Fixes several issues when getting VM data without stats
* [API Daemon] Fixes issues with removing VMs while disks are still in use (failed provisioning, etc.)
-- Joshua M. Boniface <joshua@boniface.me> Fri, 09 Jul 2021 15:58:36 -0400
pvc (0.9.23-0) unstable; urgency=high
* [Daemons] Fixes a critical overwriting bug in zkhandler when schema paths are not yet valid
* [Node Daemon] Ensures the daemon mode is updated on every startup (fixes the side effect of the above bug in 0.9.22)
-- Joshua M. Boniface <joshua@boniface.me> Mon, 05 Jul 2021 23:40:32 -0400
pvc (0.9.22-0) unstable; urgency=high
* [API Daemon] Drastically improves performance when getting large lists (e.g. VMs)
* [Daemons] Adds profiler functions for use in debug mode
* [Daemons] Improves reliability of ZK locking
* [Daemons] Adds the new logo in ASCII form to the Daemon startup message
* [Node Daemon] Fixes bug where VMs would sometimes not stop
* [Node Daemon] Code cleanups in various classes
* [Node Daemon] Fixes a bug when reading node schema data
* [All] Adds node PVC version information to the list output
* [CLI Client] Improves the style and formatting of list output including a new header line
* [API Worker] Fixes a bug that prevented the storage benchmark job from running
-- Joshua M. Boniface <joshua@boniface.me> Mon, 05 Jul 2021 14:18:51 -0400
pvc (0.9.21-0) unstable; urgency=high
* [API Daemon] Ensures VMs stop before removing them
* [Node Daemon] Fixes a bug with VM shutdowns not timing out
* [Documentation] Adds information about georedundancy caveats
* [All] Adds support for SR-IOV NICs (hostdev and macvtap) and surrounding documentation
* [Node Daemon] Fixes a bug where shutdown aborted migrations unexpectedly
* [Node Daemon] Fixes a bug where the migration method was not updated realtime
* [Node Daemon] Adjusts the Patroni commands to remove reference to Zookeeper path
* [CLI Client] Adjusts several help messages and fixes some typos
* [CLI Client] Converts the CLI client to a proper Python module
* [API Daemon] Improves VM list performance
* [API Daemon] Adjusts VM list matching criteria (only matches against the UUID if it's a full UUID)
* [API Worker] Fixes incompatibility between Deb 10 and 11 in launching Celery worker
* [API Daemon] Corrects several bugs with initialization command
* [Documentation] Adds a shiny new logo and revamps introduction text
-- Joshua M. Boniface <joshua@boniface.me> Tue, 29 Jun 2021 19:21:31 -0400
pvc (0.9.20-0) unstable; urgency=high
* [Daemons] Implemented a Zookeeper schema handler and version 0 schema
* [Daemons] Completes major refactoring of codebase to make use of the schema handler
* [Daemons] Adds support for dynamic schema changes and "hot reloading" of pvcnoded processes
* [Daemons] Adds a functional testing script for verifying operation against a test cluster
* [Daemons, CLI] Fixes several minor bugs found by the above script
* [Daemons, CLI] Add support for Debian 11 "Bullseye"
-- Joshua M. Boniface <joshua@boniface.me> Mon, 14 Jun 2021 18:06:27 -0400
pvc (0.9.19-0) unstable; urgency=high
* [CLI] Corrects some flawed conditionals
* [API] Disables SQLAlchemy modification tracking functionality (not used by us)
* [Daemons] Implements new zkhandler module for improved reliability and reusability
* [Daemons] Refactors some code to use new zkhandler module
* [API, CLI] Adds support for "none" migration selector (uses cluster default instead)
* [Daemons] Moves some configuration keys to new /config tree
* [Node Daemon] Increases initial lock timeout for VM migrations to avoid out-of-sync potential
* [Provisioner] Support storing and using textual cluster network labels ("upstream", "storage", "cluster") in templates
* [API] Avoid duplicating existing node states
-- Joshua M. Boniface <joshua@boniface.me> Sun, 06 Jun 2021 01:47:41 -0400
pvc (0.9.18-0) unstable; urgency=high
* Adds VM rename functionality to API and CLI client
-- Joshua M. Boniface <joshua@boniface.me> Sun, 23 May 2021 17:23:10 -0400
pvc (0.9.17-0) unstable; urgency=high
* [CLI] Fixes bugs in log follow output
-- Joshua M. Boniface <joshua@boniface.me> Wed, 19 May 2021 17:06:29 -0400
pvc (0.9.16-0) unstable; urgency=high
* Improves some CLI help messages

View File

@ -1,3 +0,0 @@
client-cli/pvc.py usr/share/pvc
client-cli/cli_lib usr/share/pvc
client-cli/scripts usr/share/pvc

View File

@ -1,4 +1,8 @@
#!/bin/sh
# Install client binary to /usr/bin via symlink
ln -s /usr/share/pvc/pvc.py /usr/bin/pvc
# Generate the bash completion configuration
if [ -d /etc/bash_completion.d ]; then
_PVC_COMPLETE=source_bash pvc > /etc/bash_completion.d/pvc
fi
exit 0

View File

@ -1,4 +1,8 @@
#!/bin/sh
# Remove client binary symlink
rm -f /usr/bin/pvc
# Remove the bash completion
if [ -f /etc/bash_completion.d/pvc ]; then
rm -f /etc/bash_completion.d/pvc
fi
exit 0

View File

@ -5,5 +5,6 @@ api-daemon/pvcapid.sample.yaml etc/pvc
api-daemon/pvcapid usr/share/pvc
api-daemon/pvcapid.service lib/systemd/system
api-daemon/pvcapid-worker.service lib/systemd/system
api-daemon/pvcapid-worker.sh usr/share/pvc
api-daemon/provisioner usr/share/pvc
api-daemon/migrations usr/share/pvc

12
debian/rules vendored
View File

@ -1,13 +1,19 @@
#!/usr/bin/make -f
# See debhelper(7) (uncomment to enable)
# output every command that modifies files on the build system.
#export DH_VERBOSE = 1
export DH_VERBOSE = 1
%:
dh $@
dh $@ --with python3
override_dh_python3:
cd $(CURDIR)/client-cli; pybuild --system=distutils --dest-dir=../debian/pvc-client-cli/
mkdir -p debian/pvc-client-cli/usr/lib/python3
mv debian/pvc-client-cli/usr/lib/python3*/* debian/pvc-client-cli/usr/lib/python3/
rm -r $(CURDIR)/client-cli/.pybuild $(CURDIR)/client-cli/pvc.egg-info
override_dh_auto_clean:
find . -name "__pycache__" -exec rm -r {} \; || true
find . -name "__pycache__" -o -name ".pybuild" -exec rm -r {} \; || true
# If you need to rebuild the Sphinx documentation
# Add sphinxdoc to the dh --with line

View File

@ -49,7 +49,7 @@ Node network routing for managed networks providing EBGP VXLAN and route-learnin
The storage subsystem is provided by Ceph, a distributed object-based storage subsystem with extensive scalability, self-managing, and self-healing functionality. The Ceph RBD (RADOS Block Device) subsystem is used to provide VM block devices similar to traditional LVM or ZFS zvols, but in a distributed, shared-storage manner.
All the components are designed to be run on top of Debian GNU/Linux, specifically Debian 10.X "Buster", with the SystemD system service manager. This OS provides a stable base to run the various other subsystems while remaining truly Free Software, while SystemD provides functionality such as automatic daemon restarting and complex startup/shutdown ordering.
All the components are designed to be run on top of Debian GNU/Linux, specifically Debian 10.x "Buster" or 11.x "Bullseye", with the SystemD system service manager. This OS provides a stable base to run the various other subsystems while remaining truly Free Software, while SystemD provides functionality such as automatic daemon restarting and complex startup/shutdown ordering.
## Cluster Architecture
@ -156,7 +156,7 @@ For optimal performance, nodes should use at least 10-Gigabit Ethernet network i
#### What Ceph version does PVC use?
PVC requires Ceph 14.x (Nautilus). The official PVC repository at https://repo.bonifacelabs.ca includes Ceph 14.2.x (updated regularly), since Debian Buster by default includes only 12.x (Luminous).
PVC requires Ceph 14.x (Nautilus). The official PVC repository at https://repo.bonifacelabs.ca includes Ceph 14.2.x for Debian Buster (updated regularly), since Debian Buster by default includes only 12.x (Luminous).
## About The Author

View File

@ -12,6 +12,7 @@
+ [PVC client networks](#pvc-client-networks)
- [Bridged (unmanaged) Client Networks](#bridged--unmanaged--client-networks)
- [VXLAN (managed) Client Networks](#vxlan--managed--client-networks)
- [SR-IOV Client Networks](#sriov-client-networks)
- [Other Client Networks](#other-client-networks)
* [Node Layout: Considering how nodes are laid out](#node-layout--considering-how-nodes-are-laid-out)
+ [Node Functions: Coordinators versus Hypervisors](#node-functions--coordinators-versus-hypervisors)
@ -74,7 +75,7 @@ By default, the Java heap and stack sizes are set to 256MB and 512MB respectivel
### Operating System and Architecture
As an underlying OS, only Debian GNU/Linux 10.x "Buster" is supported by PVC. This is the operating system installed by the PVC [node installer](https://github.com/parallelvirtualcluster/pvc-installer) and expected by the PVC [Ansible configuration system](https://github.com/parallelvirtualcluster/pvc-ansible). Ubuntu or other Debian-derived distributions may work, but are not officially supported. PVC also makes use of a custom repository to provide the PVC software and an updated version of Ceph beyond what is available in the base operating system, and this is only compatible officially with Debian 10 "Buster". PVC will, in the future, upgrade to future versions of Debian based on their release schedule and testing; releases may be skipped for official support if required. As a general rule, using the current versions of the official node installer and Ansible repository is the preferred and only supported method for deploying PVC.
As an underlying OS, only Debian GNU/Linux 10.x "Buster" or 11.x "Bullseye" is supported by PVC. This is the operating system installed by the PVC [node installer](https://github.com/parallelvirtualcluster/pvc-installer) and expected by the PVC [Ansible configuration system](https://github.com/parallelvirtualcluster/pvc-ansible). Ubuntu or other Debian-derived distributions may work, but are not officially supported. PVC also makes use of a custom repository to provide the PVC software and (for Debian Buster) an updated version of Ceph beyond what is available in the base operating system, and this is only compatible officially with Debian 10 or 11. PVC will generally be upgraded regularly to support new Debian versions. As a rule, using the current versions of the official node installer and Ansible repository is the preferred and only supported method for deploying PVC.
Currently, only the `amd64` (Intel 64 or AMD64) architecture is officially supported by PVC. Given the cross-platform nature of Python and the various software components in Debian, it may work on `armhf` or `arm64` systems as well, however this has not been tested by the author and is not officially supported at this time.
@ -184,6 +185,26 @@ With this client network type, PVC is in full control of the network. No vLAN co
NOTE: These networks may introduce a bottleneck and tromboning if there is a large amount of external and/or inter-network traffic on the cluster. The administrator should consider this carefully when deciding whether to use managed or bridged networks and properly evaluate the inter-network traffic requirements.
#### SR-IOV Client Networks
The third type of client network is the SR-IOV network. SR-IOV (Single-Root I/O Virtualization) is a technique and feature enabled on modern high-performance NICs (for instance, those from Intel or nVidia) which allows a single physical Ethernet port (a "PF" in SR-IOV terminology) to be split, at a hardware level, into multiple virtual Ethernet ports ("VF"s), which can then be managed separately. Starting with version 0.9.21, PVC supports SR-IOV PF and VF configuration at the node level, and these VFs can be passed into VMs in two ways.
SR-IOV's main benefit is to offload bridging and network functions from the hypervisor layer, and direct them onto the hardware itself. This can increase network throughput in some situations, as well as provide near-complete isolation of guest networks from the hypervisors (in contrast with bridges which *can* expose client traffic to the hypervisors, and VXLANs which *do* expose client traffic to the hypervisors). For instance, a VF can have a vLAN specified, and the tagging/untagging of packets is then carried out at the hardware layer.
There are, however, caveats to working with SR-IOV. At the most basic level, the biggest difference with SR-IOV compared to the other two network types is that SR-IOV must be configured on a per-node basis. That is, each node must have SR-IOV explicitly enabled, its specific PF devices defined, and a set of VFs created at PVC startup. Generally, with identical PVC nodes, this will not be a problem but is something to consider, especially if the servers are mismatched in any way. It is thus also possible to set some nodes with SR-IOV functionality, and others without, though care must be taken in this situation to set node limits in the VM metadata of any VMs which use SR-IOV VFs to prevent failed migrations.
PFs are defined in the `pvcnoded.yml` configuration of each node, via the `sriov_device` list. Each PF can have an arbitrary number of VFs (`vfcount`) allocated, though each NIC vendor and model has specific limits. Once configured, particularly with Intel NICs, PFs (and specifically the `vfcount` attribute in the driver) are immutable and cannot be changed easily without completely flushing the node and rebooting it, so care should be taken to select the desired settings as early in the cluster configuration as possible.
Once created, VFs are also managed on a per-node basis. That is, each VF, on each host, even if they have the exact same device names, is managed separately. For instance, the VF `ens1f0v0` created from the PF `ens1f0` on "`hv1`" can have a different configuration from the identically-named VF `ens1f0v0` on "`hv2`". The administrator is responsible for ensuring consistency here, and for ensuring that devices do not overlap (e.g. assigning the same VF name to VMs on two separate nodes which might migrate to each other). PVC will, however, explicitly prevent two VMs from being assigned to the same VF on the same node, even if this may be technically possible in some cases.
When attaching VFs to VMs, there are two supported modes: `macvtap`, and `hostdev`.
`macvtap`, as the name suggests, uses the Linux `macvtap` driver to connect the VF to the VM. Once attached, the vNIC behaves just like a "bridged" network connection above, and like "bridged" connections, the "mode" of the NIC can be specified, defaulting to "virtio" but supporting various emulated devices instead. Note that in this mode, vLANs cannot be configured on the guest side; they must be specified in the VF configuration (`pvc network sriov vf set`) with one vLAN per VF. VMs with `macvtap` interfaces can be live migrated between nodes without issue, assuming there is a corresponding free VF on the destination node, and the SR-IOV functionality is transparent to the VM.
`hostdev` is a direct PCIe passthrough method. With a VF attached to a VM in `hostdev` mode, the virtual PCIe NIC device itself becomes hidden from the node, and is visible only to the guest, where it appears as a discrete PCIe device. In this mode, vLANs and other attributes can be set on the guest side at will, though setting vLANs and other properties in the VF configuration is still supported. The main caveat to this mode is that VMs with connected `hostdev` SR-IOV VFs *cannot be live migrated between nodes*. Only a `shutdown` migration is supported, and, like `macvtap`, an identical PCIe device at the same bus address must be present on the target node. To prevent unexpected failures, PVC will explicitly set the VM metadata for the "migration method" to "shutdown" the first time that a `hostdev` VF is attached to it; if this changes later, the administrator must change this back explicitly.
Generally speaking, SR-IOV connections are not recommended unless there is a good use case for them. On modern hardware, software bridges are extremely performant, and are much simpler to manage. The functionality is provided for those rare use cases where SR-IOV is absolutely required by the administrator, but care must be taken to understand all the requirements and caveats of SR-IOV before using it in production.
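For a concrete idea of how a VF's settings are changed in practice, the short sketch below drives the per-VF configuration through the HTTP API (the `PUT /api/v1/sriov/vf/{node}/{vf}` endpoint and its query parameters appear in the API specification later in this document). The API address, node name, and VF name are placeholders, and authentication is omitted for brevity.
# Sketch only: set the tagged vLAN and link state of one SR-IOV VF via the
# PVC HTTP API. The endpoint and parameters follow the API specification in
# this document; the base URL, node, and VF names are placeholders.
import requests

API = 'http://pvc.local:7370/api/v1'   # placeholder API address
node = 'hv1'                           # placeholder node name
vf = 'ens1f0v0'                        # placeholder VF name

resp = requests.put(
    f'{API}/sriov/vf/{node}/{vf}',
    params={
        'vlan_id': 100,        # tagged vLAN ID (0 disables tagging)
        'link_state': 'auto',  # one of auto, enable, disable
    },
)
print(resp.status_code, resp.json())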
#### Other Client Networks
Future PVC versions may support other client network types, such as direct-routing between VMs.
@ -235,10 +256,17 @@ When using geographic redundancy, there are several caveats to keep in mind:
* The number of sites and positioning of coordinators at those sites is important. A majority (at least 2 in a 3-coordinator cluster, or 3 in a 5-coordinator cluster) of coordinators must be able to reach each other in a failure scenario for the cluster as a whole to remain functional. Thus, configurations such as 2 + 1 or 3 + 2 splits across 2 sites do *not* provide full redundancy, and the whole cluster will be down if the majority site is down. It is thus recommended to always have an odd number of sites to match the odd number of coordinators, for instance a 1 + 1 + 1 or 2 + 2 + 1 configuration. Also note that all hypervisors must be able to reach the majority coordinator group or their storage will be impacted as well.
This diagram outlines the supported and unsupported/unreliable georedundant configurations for 3 nodes. Care must always be taken to ensure that the cluster can operate with the loss of any given georedundant site.
![georedundancy-caveats](/images/georedundancy-caveats.png)
*Above: Supported and unsupported/unreliable georedundant configurations*
* Even if the PVC software itself is in an unmanageable state, VMs will continue to run if at all possible. However, since the storage subsystem makes use of the same quorum, losing more than half of the nodes will very likely result in storage interruption as well, which will affect running VMs.
If these requirements cannot be fulfilled, it may be best to have separate PVC clusters at each site and handle service redundancy at a higher layer to avoid a major disruption.
## Example Configurations
This section provides diagrams of 3 possible node configurations. These diagrams can be extrapolated out to almost any possible configuration and number of nodes.

[Binary files not shown: several new documentation images added in this changeset, including docs/images/pvc_icon.png (40 KiB).]

View File

@ -1,23 +1,137 @@
# PVC - The Parallel Virtual Cluster system
<p align="center">
<img alt="Logo banner" src="https://git.bonifacelabs.ca/uploads/-/system/project/avatar/135/pvc_logo.png"/>
<img alt="Logo banner" src="images/pvc_logo_black.png"/>
<br/><br/>
<a href="https://github.com/parallelvirtualcluster/pvc"><img alt="License" src="https://img.shields.io/github/license/parallelvirtualcluster/pvc"/></a>
<a href="https://github.com/parallelvirtualcluster/pvc/releases"><img alt="Release" src="https://img.shields.io/github/release-pre/parallelvirtualcluster/pvc"/></a>
<a href="https://parallelvirtualcluster.readthedocs.io/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/parallelvirtualcluster/badge/?version=latest"/></a>
</p>
PVC is a KVM+Ceph+Zookeeper-based, Free Software, scalable, redundant, self-healing, and self-managing private cloud solution designed with administrator simplicity in mind. It is built from the ground-up to be redundant at the host layer, allowing the cluster to gracefully handle the loss of nodes or their components, both due to hardware failure or due to maintenance. It is able to scale from a minimum of 3 nodes up to 12 or more nodes, while retaining performance and flexibility, allowing the administrator to build a small cluster today and grow it as needed.
## What is PVC?
PVC is a virtual machine-based hyperconverged infrastructure (HCI) virtualization cluster solution that is fully Free Software, scalable, redundant, self-healing, self-managing, and designed for administrator simplicity. It is an alternative to other HCI solutions such as Harvester, Nutanix, and VMWare, as well as to other common virtualization stacks such as ProxMox and OpenStack.
PVC is a complete HCI solution, built from well-known and well-trusted Free Software tools, to assist an administrator in creating and managing a cluster of servers to run virtual machines, as well as self-managing several important aspects including storage failover, node failure and recovery, virtual machine failure and recovery, and network plumbing. It is designed to act consistently, reliably, and unobtrusively, letting the administrator concentrate on more important things.
PVC is highly scalable. From a minimum (production) node count of 3, up to 12 or more, and supporting many dozens of VMs, PVC scales along with your workload and requirements. Deploy a cluster once and grow it as your needs expand.
As a consequence of its features, PVC makes administrating very high-uptime VMs extremely easy, featuring VM live migration, built-in always-enabled shared storage with transparent multi-node replication, and consistent network plumbing throughout the cluster. Nodes can also be seamlessly removed from or added to service, with zero VM downtime, to facilitate maintenance, upgrades, or other work.
PVC also features an optional, fully customizable VM provisioning framework, designed to automate and simplify VM deployments using custom provisioning profiles, scripts, and CloudInit userdata API support.
Installation of PVC is accomplished by two main components: a [Node installer ISO](https://github.com/parallelvirtualcluster/pvc-installer) which creates on-demand installer ISOs, and an [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) to configure, bootstrap, and administrate the nodes. Once up, the cluster is managed via an HTTP REST API, accessible via a Python Click CLI client or WebUI.
Just give it physical servers, and it will run your VMs without you having to think about it, all in just an hour or two of setup time.
## What is it based on?
The core node and API daemons, as well as the CLI API client, are written in Python 3 and are fully Free Software (GNU GPL v3). In addition to these, PVC makes use of the following software tools to provide a holistic hyperconverged infrastructure solution:
* Debian GNU/Linux as the base OS.
* Linux KVM, QEMU, and Libvirt for VM management.
* Linux `ip`, FRRouting, NFTables, DNSMasq, and PowerDNS for network management.
* Ceph for storage management.
* Apache Zookeeper for the primary cluster state database.
* Patroni PostgreSQL manager for the secondary relational databases (DNS aggregation, Provisioner configuration).
The major goal of PVC is to be administrator friendly, providing the power of Enterprise-grade private clouds like OpenStack, Nutanix, and VMWare to homelabbers, SMBs, and small ISPs, without the cost or complexity. It believes in picking the best tool for a job and abstracting it behind the cluster as a whole, freeing the administrator from the boring and time-consuming task of selecting the best component, and letting them get on with the things that really matter. Administration can be done from a simple CLI or via a RESTful API capable of building full-featured web frontends or additional applications, taking a self-documenting approach to keep the administrator learning curve as low as possible. Setup is easy and straightforward with an [ISO-based node installer](https://github.com/parallelvirtualcluster/pvc-installer) and [Ansible role framework](https://github.com/parallelvirtualcluster/pvc-ansible) designed to get a cluster up and running as quickly as possible. Build your cloud in an hour, grow it as you need, and never worry about it: just add physical servers.
## Getting Started
To get started with PVC, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page for general information about the project, and the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) page for details on configuring your cluster.
To get started with PVC, please see the [About](https://parallelvirtualcluster.readthedocs.io/en/latest/about/) page for general information about the project, and the [Getting Started](https://parallelvirtualcluster.readthedocs.io/en/latest/getting-started/) page for details on configuring your first cluster.
## Changelog
#### v0.9.27
* [CLI Client] Fixes a bug with vm modify command when passed a file
#### v0.9.26
* [Node Daemon] Corrects some bad assumptions about fencing results during hardware failures
* [All] Implements VM tagging functionality
* [All] Implements Node log access via PVC functionality
#### v0.9.25
* [Node Daemon] Returns to Rados library calls for Ceph due to performance problems
* [Node Daemon] Adds a date output to keepalive messages
* [Daemons] Configures ZK connection logging only for persistent connections
* [API Provisioner] Add context manager-based chroot to Debootstrap example script
* [Node Daemon] Fixes a bug where shutdown daemon state was overwritten
#### v0.9.24
* [Node Daemon] Removes Rados module polling of Ceph cluster and returns to command-based polling for timeout purposes, and removes some flaky return statements
* [Node Daemon] Removes flaky Zookeeper connection renewals that caused problems
* [CLI Client] Allow raw lists of clusters from `pvc cluster list`
* [API Daemon] Fixes several issues when getting VM data without stats
* [API Daemon] Fixes issues with removing VMs while disks are still in use (failed provisioning, etc.)
#### v0.9.23
* [Daemons] Fixes a critical overwriting bug in zkhandler when schema paths are not yet valid
* [Node Daemon] Ensures the daemon mode is updated on every startup (fixes the side effect of the above bug in 0.9.22)
#### v0.9.22
* [API Daemon] Drastically improves performance when getting large lists (e.g. VMs)
* [Daemons] Adds profiler functions for use in debug mode
* [Daemons] Improves reliability of ZK locking
* [Daemons] Adds the new logo in ASCII form to the Daemon startup message
* [Node Daemon] Fixes bug where VMs would sometimes not stop
* [Node Daemon] Code cleanups in various classes
* [Node Daemon] Fixes a bug when reading node schema data
* [All] Adds node PVC version information to the list output
* [CLI Client] Improves the style and formatting of list output including a new header line
* [API Worker] Fixes a bug that prevented the storage benchmark job from running
#### v0.9.21
* [API Daemon] Ensures VMs stop before removing them
* [Node Daemon] Fixes a bug with VM shutdowns not timing out
* [Documentation] Adds information about georedundancy caveats
* [All] Adds support for SR-IOV NICs (hostdev and macvtap) and surrounding documentation
* [Node Daemon] Fixes a bug where shutdown aborted migrations unexpectedly
* [Node Daemon] Fixes a bug where the migration method was not updated realtime
* [Node Daemon] Adjusts the Patroni commands to remove reference to Zookeeper path
* [CLI Client] Adjusts several help messages and fixes some typos
* [CLI Client] Converts the CLI client to a proper Python module
* [API Daemon] Improves VM list performance
* [API Daemon] Adjusts VM list matching criteria (only matches against the UUID if it's a full UUID)
* [API Worker] Fixes incompatibility between Deb 10 and 11 in launching Celery worker
* [API Daemon] Corrects several bugs with initialization command
* [Documentation] Adds a shiny new logo and revamps introduction text
#### v0.9.20
* [Daemons] Implemented a Zookeeper schema handler and version 0 schema
* [Daemons] Completes major refactoring of codebase to make use of the schema handler
* [Daemons] Adds support for dynamic schema changes and "hot reloading" of pvcnoded processes
* [Daemons] Adds a functional testing script for verifying operation against a test cluster
* [Daemons, CLI] Fixes several minor bugs found by the above script
* [Daemons, CLI] Add support for Debian 11 "Bullseye"
#### v0.9.19
* [CLI] Corrects some flawed conditionals
* [API] Disables SQLAlchemy modification tracking functionality (not used by us)
* [Daemons] Implements new zkhandler module for improved reliability and reusability
* [Daemons] Refactors some code to use new zkhandler module
* [API, CLI] Adds support for "none" migration selector (uses cluster default instead)
* [Daemons] Moves some configuration keys to new /config tree
* [Node Daemon] Increases initial lock timeout for VM migrations to avoid out-of-sync potential
* [Provisioner] Support storing and using textual cluster network labels ("upstream", "storage", "cluster") in templates
* [API] Avoid duplicating existing node states
#### v0.9.18
* Adds VM rename functionality to API and CLI client
#### v0.9.17
* [CLI] Fixes bugs in log follow output
#### v0.9.16
* Improves some CLI help messages

View File

@ -451,6 +451,12 @@ pvc_nodes:
pvc_bridge_device: bondU
pvc_sriov_enable: True
pvc_sriov_device:
- phy: ens1f0
mtu: 9000
vfcount: 6
pvc_upstream_device: "{{ networks['upstream']['device'] }}"
pvc_upstream_mtu: "{{ networks['upstream']['mtu'] }}"
pvc_upstream_domain: "{{ networks['upstream']['domain'] }}"
@ -901,6 +907,18 @@ The IPMI password for the node management controller. Unless a per-host override
The device name of the underlying network interface to be used for "bridged"-type client networks. For each "bridged"-type network, an IEEE 802.3q vLAN and bridge will be created on top of this device to pass these networks. In most cases, using the reflexive `networks['cluster']['raw_device']` or `networks['upstream']['raw_device']` from the Base role is sufficient.
#### `pvc_sriov_enable`
* *optional*
Whether to enable or disable SR-IOV functionality.
#### `pvc_sriov_device`
* *optional*
A list of SR-IOV devices. See the Daemon manual for details.
#### `pvc_<network>_*`
The next set of entries is hard-coded to use the values from the global `networks` list. It should not need to be changed under most circumstances. Refer to the previous sections for specific notes about each entry.

View File

@ -146,6 +146,11 @@ pvc:
console_log_lines: 1000
networking:
bridge_device: ens4
sriov_enable: True
sriov_device:
- phy: ens1f0
mtu: 9000
vfcount: 7
upstream:
device: ens4
mtu: 1500
@ -422,6 +427,34 @@ How many lines of VM console logs to keep in the Zookeeper database for each VM.
The network interface device used to create Bridged client network vLANs on. For most clusters, should match the underlying device of the various static networks (e.g. `ens4` or `bond0`), though may also use a separate network interface.
#### `system` → `configuration` → `networking` → `sriov_enable`
* *optional*, defaults to `False`
* *requires* `functions` → `enable_networking`
Enables (or disables) SR-IOV functionality in PVC. If enabled, at least one `sriov_device` entry should be specified.
#### `system` → `configuration` → `networking` → `sriov_device`
* *optional*
* *requires* `functions` → `enable_networking`
Contains a list of SR-IOV PF (physical function) devices and their basic configuration. Each element contains the following entries:
##### `phy`
* *required*
The raw Linux network device with SR-IOV PF functionality.
##### `mtu`
The MTU of the PF device, set on daemon startup.
##### `vfcount`
The number of VF devices to create on this PF. VF devices are then managed via PVC on a per-node basis.
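To make the shape of these entries concrete, here is a minimal sketch of reading them the way a small validation script might. It assumes the `pvc` → `system` → `configuration` → `networking` nesting implied by the headings above and a standard PyYAML load; the file path and loader are illustrative, not the daemon's actual implementation.
# Sketch only: read the SR-IOV PF entries from a pvcnoded-style configuration
# file. Assumes the pvc -> system -> configuration -> networking nesting from
# the headings above; the path and PyYAML usage are illustrative.
import yaml

with open('/etc/pvc/pvcnoded.yml') as fh:
    config = yaml.safe_load(fh)

networking = config['pvc']['system']['configuration']['networking']
if networking.get('sriov_enable', False):
    for pf in networking.get('sriov_device', []):
        phy = pf['phy']            # required: raw PF network device
        mtu = pf.get('mtu')        # optional: MTU applied at daemon startup
        vfcount = pf['vfcount']    # number of VFs to create on this PF
        print(f'PF {phy}: mtu={mtu}, vfcount={vfcount}')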
#### `system` → `configuration` → `networking`
* *optional*

View File

@ -144,6 +144,19 @@
},
"type": "object"
},
"NodeLog": {
"properties": {
"data": {
"description": "The recent log text",
"type": "string"
},
"name": {
"description": "The name of the Node",
"type": "string"
}
},
"type": "object"
},
"VMLog": {
"properties": {
"data": {
@ -215,6 +228,23 @@
},
"type": "object"
},
"VMTags": {
"properties": {
"name": {
"description": "The name of the VM",
"type": "string"
},
"tags": {
"description": "The tag(s) of the VM",
"items": {
"id": "VMTag",
"type": "object"
},
"type": "array"
}
},
"type": "object"
},
"acl": {
"properties": {
"description": {
@ -464,6 +494,10 @@
"description": "The current operating system type",
"type": "string"
},
"pvc_version": {
"description": "The current running PVC node daemon version",
"type": "string"
},
"running_domains": {
"description": "The list of running domains (VMs) by UUID",
"type": "string"
@ -764,6 +798,99 @@
},
"type": "object"
},
"sriov_pf": {
"properties": {
"mtu": {
"description": "The MTU of the SR-IOV PF device",
"type": "string"
},
"phy": {
"description": "The name of the SR-IOV PF device",
"type": "string"
},
"vfs": {
"items": {
"description": "The PHY name of a VF of this PF",
"type": "string"
},
"type": "list"
}
},
"type": "object"
},
"sriov_vf": {
"properties": {
"config": {
"id": "sriov_vf_config",
"properties": {
"link_state": {
"description": "The current SR-IOV VF link state (either enabled, disabled, or auto)",
"type": "string"
},
"query_rss": {
"description": "Whether VF RSS querying is enabled or disabled",
"type": "boolean"
},
"spoof_check": {
"description": "Whether device spoof checking is enabled or disabled",
"type": "boolean"
},
"trust": {
"description": "Whether guest device trust is enabled or disabled",
"type": "boolean"
},
"tx_rate_max": {
"description": "The maximum TX rate of the SR-IOV VF device",
"type": "string"
},
"tx_rate_min": {
"description": "The minimum TX rate of the SR-IOV VF device",
"type": "string"
},
"vlan_id": {
"description": "The tagged vLAN ID of the SR-IOV VF device",
"type": "string"
},
"vlan_qos": {
"description": "The QOS group of the tagged vLAN",
"type": "string"
}
},
"type": "object"
},
"mac": {
"description": "The current MAC address of the VF device",
"type": "string"
},
"mtu": {
"description": "The current MTU of the VF device",
"type": "integer"
},
"pf": {
"description": "The name of the SR-IOV PF parent of this VF device",
"type": "string"
},
"phy": {
"description": "The name of the SR-IOV VF device",
"type": "string"
},
"usage": {
"id": "sriov_vf_usage",
"properties": {
"domain": {
"description": "The UUID of the domain the SR-IOV VF is currently used by",
"type": "boolean"
},
"used": {
"description": "Whether the SR-IOV VF is currently used by a VM or not",
"type": "boolean"
}
},
"type": "object"
}
},
"type": "object"
},
"storage-template": {
"properties": {
"disks": {
@ -1273,6 +1400,28 @@
"description": "The current state of the VM",
"type": "string"
},
"tags": {
"description": "The tag(s) of the VM",
"items": {
"id": "VMTag",
"properties": {
"name": {
"description": "The name of the tag",
"type": "string"
},
"protected": {
"description": "Whether the tag is protected or not",
"type": "boolean"
},
"type": {
"description": "The type of the tag (user, system)",
"type": "string"
}
},
"type": "object"
},
"type": "array"
},
"type": {
"description": "The type of the VM",
"type": "string"
@ -1459,8 +1608,15 @@
},
"/api/v1/initialize": {
"post": {
"description": "Note: Normally used only once during cluster bootstrap; checks for the existence of the \"/primary_node\" key before proceeding and returns 400 if found",
"description": "<br/>If the 'overwrite' option is not True, the cluster will return 400 if the `/config/primary_node` key is found. If 'overwrite' is True, the existing cluster<br/>data will be erased and new, empty data written in its place.<br/><br/>All node daemons should be stopped before running this command, and the API daemon started manually to avoid undefined behavior.",
"parameters": [
{
"description": "A flag to enable or disable (default) overwriting existing data",
"in": "query",
"name": "overwrite",
"required": false,
"type": "bool"
},
{
"description": "A confirmation string to ensure that the API consumer really means it",
"in": "query",
@ -2311,7 +2467,7 @@
"description": "",
"parameters": [
{
"description": "A search limit; fuzzy by default, use ^/$ to force exact matches",
"description": "A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches",
"in": "query",
"name": "limit",
"required": false,
@ -2522,6 +2678,38 @@
]
}
},
"/api/v1/node/{node}/log": {
"get": {
"description": "",
"parameters": [
{
"description": "The number of lines to retrieve",
"in": "query",
"name": "lines",
"required": false,
"type": "integer"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/NodeLog"
}
},
"404": {
"description": "Node not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Return the recent logs of {node}",
"tags": [
"node"
]
}
},
"/api/v1/provisioner/create": {
"post": {
"description": "Note: Starts a background job in the pvc-provisioner-worker Celery worker while returning a task ID; the task ID can be used to query the \"GET /provisioner/status/<task_id>\" endpoint for the job status",
@ -4453,6 +4641,181 @@
]
}
},
"/api/v1/sriov/pf": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/sriov_pf"
}
}
},
"summary": "Return a list of SR-IOV PFs on a given node",
"tags": [
"network / sriov"
]
}
},
"/api/v1/sriov/pf/{node}": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/sriov_pf"
}
}
},
"summary": "Return a list of SR-IOV PFs on node {node}",
"tags": [
"network / sriov"
]
}
},
"/api/v1/sriov/vf": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/sriov_vf"
}
}
},
"summary": "Return a list of SR-IOV VFs on a given node, optionally limited to those in the specified PF",
"tags": [
"network / sriov"
]
}
},
"/api/v1/sriov/vf/{node}": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/sriov_vf"
}
}
},
"summary": "Return a list of SR-IOV VFs on node {node}, optionally limited to those in the specified PF",
"tags": [
"network / sriov"
]
}
},
"/api/v1/sriov/vf/{node}/{vf}": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/sriov_vf"
}
},
"404": {
"description": "Not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Return information about {vf} on {node}",
"tags": [
"network / sriov"
]
},
"put": {
"description": "",
"parameters": [
{
"description": "The vLAN ID for vLAN tagging (0 is disabled)",
"in": "query",
"name": "vlan_id",
"required": false,
"type": "integer"
},
{
"description": "The vLAN QOS priority (0 is disabled)",
"in": "query",
"name": "vlan_qos",
"required": false,
"type": "integer"
},
{
"description": "The minimum TX rate (0 is disabled)",
"in": "query",
"name": "tx_rate_min",
"required": false,
"type": "integer"
},
{
"description": "The maximum TX rate (0 is disabled)",
"in": "query",
"name": "tx_rate_max",
"required": false,
"type": "integer"
},
{
"description": "The administrative link state",
"enum": [
"auto",
"enable",
"disable"
],
"in": "query",
"name": "link_state",
"required": false,
"type": "string"
},
{
"description": "Enable or disable spoof checking",
"in": "query",
"name": "spoof_check",
"required": false,
"type": "boolean"
},
{
"description": "Enable or disable VF user trust",
"in": "query",
"name": "trust",
"required": false,
"type": "boolean"
},
{
"description": "Enable or disable query RSS support",
"in": "query",
"name": "query_rss",
"required": false,
"type": "boolean"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/Message"
}
},
"400": {
"description": "Bad request",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Set the configuration of {vf} on {node}",
"tags": [
"network / sriov"
]
}
},
"/api/v1/status": {
"get": {
"description": "",
@ -5516,7 +5879,7 @@
"description": "",
"parameters": [
{
"description": "A name search limit; fuzzy by default, use ^/$ to force exact matches",
"description": "A search limit in the name, tags, or an exact UUID; fuzzy by default, use ^/$ to force exact matches",
"in": "query",
"name": "limit",
"required": false,
@ -5535,6 +5898,13 @@
"name": "state",
"required": false,
"type": "string"
},
{
"description": "Limit list to VMs with this tag",
"in": "query",
"name": "tag",
"required": false,
"type": "string"
}
],
"responses": {
@ -5610,6 +5980,26 @@
"name": "migration_method",
"required": false,
"type": "string"
},
{
"description": "The user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "user_tags",
"required": false,
"type": "array"
},
{
"description": "The protected user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "protected_tags",
"required": false,
"type": "array"
}
],
"responses": {
@ -5721,7 +6111,8 @@
"mem",
"vcpus",
"load",
"vms"
"vms",
"none (cluster default)"
],
"in": "query",
"name": "selector",
@ -5747,6 +6138,26 @@
"name": "migration_method",
"required": false,
"type": "string"
},
{
"description": "The user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "user_tags",
"required": false,
"type": "array"
},
{
"description": "The protected user tag(s) of the VM",
"in": "query",
"items": {
"type": "string"
},
"name": "protected_tags",
"required": false,
"type": "array"
}
],
"responses": {
@ -5871,7 +6282,7 @@
}
},
"404": {
"description": "Not found",
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
@ -5945,6 +6356,12 @@
"schema": {
"$ref": "#/definitions/Message"
}
},
"404": {
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Set the metadata of {vm}",
@ -6035,6 +6452,38 @@
]
}
},
"/api/v1/vm/{vm}/rename": {
"post": {
"description": "",
"parameters": [
{
"description": "The new name of the VM",
"in": "query",
"name": "new_name",
"required": true,
"type": "string"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/Message"
}
},
"400": {
"description": "Bad request",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Rename VM {vm}, and all connected disk volumes which include this name, to {new_name}",
"tags": [
"vm"
]
}
},
"/api/v1/vm/{vm}/state": {
"get": {
"description": "",
@ -6100,6 +6549,84 @@
"vm"
]
}
},
"/api/v1/vm/{vm}/tags": {
"get": {
"description": "",
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/VMTags"
}
},
"404": {
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Return the tags of {vm}",
"tags": [
"vm"
]
},
"post": {
"description": "",
"parameters": [
{
"description": "The action to perform with the tag",
"enum": [
"add",
"remove"
],
"in": "query",
"name": "action",
"required": true,
"type": "string"
},
{
"description": "The text value of the tag",
"in": "query",
"name": "tag",
"required": true,
"type": "string"
},
{
"default": false,
"description": "Set the protected state of the tag",
"in": "query",
"name": "protected",
"required": false,
"type": "boolean"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"$ref": "#/definitions/Message"
}
},
"400": {
"description": "Bad request",
"schema": {
"$ref": "#/definitions/Message"
}
},
"404": {
"description": "VM not found",
"schema": {
"$ref": "#/definitions/Message"
}
}
},
"summary": "Set the tags of {vm}",
"tags": [
"vm"
]
}
}
},
"swagger": "2.0"

7
gen-zk-migrations Executable file
View File

@ -0,0 +1,7 @@
#!/bin/bash
# Generate the Zookeeper migration files
pushd api-daemon
./pvcapid-manage-zk.py
popd

2
lint
View File

@ -7,7 +7,7 @@ fi
flake8 \
--ignore=E501 \
--exclude=api-daemon/migrations/versions,api-daemon/provisioner/examples
--exclude=debian,api-daemon/migrations/versions,api-daemon/provisioner/examples
ret=$?
if [[ $ret -eq 0 ]]; then
echo "No linting issues found!"

1
node-daemon/daemon_lib Symbolic link
View File

@ -0,0 +1 @@
../daemon-common

View File

@ -140,6 +140,8 @@ pvc:
file_logging: True
# stdout_logging: Enable or disable logging to stdout (i.e. journald)
stdout_logging: True
# zookeeper_logging: Enable or disable logging to Zookeeper (for `pvc node log` functionality)
zookeeper_logging: True
# log_colours: Enable or disable ANSI colours in log output
log_colours: True
# log_dates: Enable or disable date strings in log output
@ -152,11 +154,28 @@ pvc:
log_keepalive_storage_details: True
# console_log_lines: Number of console log lines to store in Zookeeper per VM
console_log_lines: 1000
# node_log_lines: Number of node log lines to store in Zookeeper per node
node_log_lines: 2000
# networking: PVC networking configuration
# OPTIONAL if enable_networking: False
networking:
# bridge_device: Underlying device to use for bridged vLAN networks; usually the device underlying <cluster>
# bridge_device: Underlying device to use for bridged vLAN networks; usually the device of <cluster>
bridge_device: ens4
# sriov_enable: Enable or disable (default if absent) SR-IOV network support
sriov_enable: False
# sriov_device: Underlying device(s) to use for SR-IOV networks; can be bridge_device or other NIC(s)
sriov_device:
# The physical device name
- phy: ens1f1
# The preferred MTU of the physical device; OPTIONAL - defaults to the interface default if unset
mtu: 9000
# The number of VFs to enable on this device
# NOTE: This defines the maximum number of VMs which can be provisioned on this physical device; VMs
# are allocated to these VFs manually by the administrator and thus all nodes should have the
# same number
# NOTE: This value cannot be changed at runtime on Intel(R) NICs; the node will need to be restarted
# if this value changes
vfcount: 8
# upstream: Upstream physical interface device
upstream:
# device: Upstream interface device name
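
The vfcount value above sets how many virtual functions are enabled on the physical device. As a point of reference only (the exact mechanism pvcnoded uses is not shown in this excerpt), the generic Linux interface that such a count maps to is the sriov_numvfs sysfs attribute:

# Illustrative only: the standard kernel knob corresponding to a vfcount of 8.
# The device name ens1f1 is taken from the sample configuration above.
def set_numvfs(device, count):
    with open('/sys/class/net/{}/device/sriov_numvfs'.format(device), 'w') as f:
        f.write(str(count))

set_numvfs('ens1f1', 8)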

View File

@ -23,20 +23,19 @@ import time
import json
import psutil
import pvcnoded.zkhandler as zkhandler
import pvcnoded.common as common
import daemon_lib.common as common
class CephOSDInstance(object):
def __init__(self, zk_conn, this_node, osd_id):
self.zk_conn = zk_conn
def __init__(self, zkhandler, this_node, osd_id):
self.zkhandler = zkhandler
self.this_node = this_node
self.osd_id = osd_id
self.node = None
self.size = None
self.stats = dict()
@self.zk_conn.DataWatch('/ceph/osds/{}/node'.format(self.osd_id))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('osd.node', self.osd_id))
def watch_osd_node(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -51,7 +50,7 @@ class CephOSDInstance(object):
if data and data != self.node:
self.node = data
@self.zk_conn.DataWatch('/ceph/osds/{}/stats'.format(self.osd_id))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('osd.stats', self.osd_id))
def watch_osd_stats(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -67,7 +66,7 @@ class CephOSDInstance(object):
self.stats = json.loads(data)
def add_osd(zk_conn, logger, node, device, weight):
def add_osd(zkhandler, logger, node, device, weight):
# We are ready to create a new OSD on this node
logger.out('Creating new OSD disk on block device {}'.format(device), state='i')
try:
@ -174,12 +173,12 @@ def add_osd(zk_conn, logger, node, device, weight):
# 7. Add the new OSD to the list
logger.out('Adding new OSD disk with ID {} to Zookeeper'.format(osd_id), state='i')
zkhandler.writedata(zk_conn, {
'/ceph/osds/{}'.format(osd_id): '',
'/ceph/osds/{}/node'.format(osd_id): node,
'/ceph/osds/{}/device'.format(osd_id): device,
'/ceph/osds/{}/stats'.format(osd_id): '{}'
})
zkhandler.write([
(('osd', osd_id), ''),
(('osd.node', osd_id), node),
(('osd.device', osd_id), device),
(('osd.stats', osd_id), '{}'),
])
# Log it
logger.out('Created new OSD disk with ID {}'.format(osd_id), state='o')
@ -190,7 +189,7 @@ def add_osd(zk_conn, logger, node, device, weight):
return False
def remove_osd(zk_conn, logger, osd_id, osd_obj):
def remove_osd(zkhandler, logger, osd_id, osd_obj):
logger.out('Removing OSD disk {}'.format(osd_id), state='i')
try:
# 1. Verify the OSD is present
@ -273,7 +272,7 @@ def remove_osd(zk_conn, logger, osd_id, osd_obj):
# 7. Delete OSD from ZK
logger.out('Deleting OSD disk with ID {} from Zookeeper'.format(osd_id), state='i')
zkhandler.deletekey(zk_conn, '/ceph/osds/{}'.format(osd_id))
zkhandler.delete(('osd', osd_id), recursive=True)
# Log it
logger.out('Removed OSD disk with ID {}'.format(osd_id), state='o')
@ -285,14 +284,14 @@ def remove_osd(zk_conn, logger, osd_id, osd_obj):
class CephPoolInstance(object):
def __init__(self, zk_conn, this_node, name):
self.zk_conn = zk_conn
def __init__(self, zkhandler, this_node, name):
self.zkhandler = zkhandler
self.this_node = this_node
self.name = name
self.pgs = ''
self.stats = dict()
@self.zk_conn.DataWatch('/ceph/pools/{}/pgs'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('pool.pgs', self.name))
def watch_pool_node(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -307,7 +306,7 @@ class CephPoolInstance(object):
if data and data != self.pgs:
self.pgs = data
@self.zk_conn.DataWatch('/ceph/pools/{}/stats'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('pool.stats', self.name))
def watch_pool_stats(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -324,14 +323,14 @@ class CephPoolInstance(object):
class CephVolumeInstance(object):
def __init__(self, zk_conn, this_node, pool, name):
self.zk_conn = zk_conn
def __init__(self, zkhandler, this_node, pool, name):
self.zkhandler = zkhandler
self.this_node = this_node
self.pool = pool
self.name = name
self.stats = dict()
@self.zk_conn.DataWatch('/ceph/volumes/{}/{}/stats'.format(self.pool, self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('volume.stats', f'{self.pool}/{self.name}'))
def watch_volume_stats(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -348,15 +347,15 @@ class CephVolumeInstance(object):
class CephSnapshotInstance(object):
def __init__(self, zk_conn, this_node, pool, volume, name):
self.zk_conn = zk_conn
def __init__(self, zkhandler, this_node, pool, volume, name):
self.zkhandler = zkhandler
self.this_node = this_node
self.pool = pool
self.volume = volume
self.name = name
self.stats = dict()
@self.zk_conn.DataWatch('/ceph/snapshots/{}/{}/{}/stats'.format(self.pool, self.volume, self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('snapshot.stats', f'{self.pool}/{self.volume}/{self.name}'))
def watch_snapshot_stats(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -374,7 +373,7 @@ class CephSnapshotInstance(object):
# Primary command function
# This command pipe is only used for OSD adds and removes
def run_command(zk_conn, logger, this_node, data, d_osd):
def run_command(zkhandler, logger, this_node, data, d_osd):
# Get the command and args
command, args = data.split()
@ -383,18 +382,22 @@ def run_command(zk_conn, logger, this_node, data, d_osd):
node, device, weight = args.split(',')
if node == this_node.name:
# Lock the command queue
zk_lock = zkhandler.writelock(zk_conn, '/cmd/ceph')
zk_lock = zkhandler.writelock('base.cmd.ceph')
with zk_lock:
# Add the OSD
result = add_osd(zk_conn, logger, node, device, weight)
result = add_osd(zkhandler, logger, node, device, weight)
# Command succeeded
if result:
# Update the command queue
zkhandler.writedata(zk_conn, {'/cmd/ceph': 'success-{}'.format(data)})
zkhandler.write([
('base.cmd.ceph', 'success-{}'.format(data))
])
# Command failed
else:
# Update the command queue
zkhandler.writedata(zk_conn, {'/cmd/ceph': 'failure-{}'.format(data)})
zkhandler.write([
('base.cmd.ceph', 'failure-{}'.format(data))
])
# Wait 1 second before we free the lock, to ensure the client hits the lock
time.sleep(1)
@ -405,17 +408,21 @@ def run_command(zk_conn, logger, this_node, data, d_osd):
# Verify osd_id is in the list
if d_osd[osd_id] and d_osd[osd_id].node == this_node.name:
# Lock the command queue
zk_lock = zkhandler.writelock(zk_conn, '/cmd/ceph')
zk_lock = zkhandler.writelock('base.cmd.ceph')
with zk_lock:
# Remove the OSD
result = remove_osd(zk_conn, logger, osd_id, d_osd[osd_id])
result = remove_osd(zkhandler, logger, osd_id, d_osd[osd_id])
# Command succeeded
if result:
# Update the command queue
zkhandler.writedata(zk_conn, {'/cmd/ceph': 'success-{}'.format(data)})
zkhandler.write([
('base.cmd.ceph', 'success-{}'.format(data))
])
# Command failed
else:
# Update the command queue
zkhandler.writedata(zk_conn, {'/cmd/ceph': 'failure-{}'.format(data)})
zkhandler.write([
('base.cmd.ceph', 'failure-{}'.format(data))
])
# Wait 1 second before we free the lock, to ensure the client hits the lock
time.sleep(1)
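
The recurring change in this file, and in the ones below, is the move from raw Zookeeper paths to the schema-aware zkhandler API: reads and writes take schema key tuples, writes are batched as lists of (key, value) pairs, and locks are taken on the same key names. A condensed sketch of the pattern, assuming an already-connected zkhandler and using example values:

# Sketch of the new zkhandler usage pattern seen throughout this changeset.
osd_id = '3'  # example value

# Read a single key via a schema tuple instead of a literal '/ceph/osds/3/node' path
node = zkhandler.read(('osd.node', osd_id))

# Batch several writes as ((key, args), value) pairs
zkhandler.write([
    (('osd.node', osd_id), node),
    (('osd.stats', osd_id), '{}'),
])

# Base (non-parameterised) keys are plain strings; locks use the same names
zk_lock = zkhandler.writelock('base.cmd.ceph')
with zk_lock:
    zkhandler.write([('base.cmd.ceph', 'success-example')])

# Recursive deletes also take schema tuples
zkhandler.delete(('osd', osd_id), recursive=True)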

View File

@ -26,13 +26,12 @@ import psycopg2
from threading import Thread, Event
import pvcnoded.common as common
import daemon_lib.common as common
class DNSAggregatorInstance(object):
# Initialization function
def __init__(self, zk_conn, config, logger):
self.zk_conn = zk_conn
def __init__(self, config, logger):
self.config = config
self.logger = logger
self.dns_networks = dict()
@ -481,6 +480,9 @@ class AXFRDaemonInstance(object):
except psycopg2.IntegrityError as e:
if self.config['debug']:
self.logger.out('Failed to add record due to {}: {}'.format(e, name), state='d', prefix='dns-aggregator')
except psycopg2.errors.InFailedSqlTransaction as e:
if self.config['debug']:
self.logger.out('Failed to add record due to {}: {}'.format(e, name), state='d', prefix='dns-aggregator')
if changed:
# Increase SOA serial

File diff suppressed because it is too large

View File

@ -36,8 +36,8 @@ class MetadataAPIInstance(object):
mdapi = flask.Flask(__name__)
# Initialization function
def __init__(self, zk_conn, config, logger):
self.zk_conn = zk_conn
def __init__(self, zkhandler, config, logger):
self.zkhandler = zkhandler
self.config = config
self.logger = logger
self.thread = None
@ -152,21 +152,24 @@ class MetadataAPIInstance(object):
cur.execute(query, args)
data_raw = cur.fetchone()
self.close_database(conn, cur)
data = data_raw.get('userdata', None)
return data
if data_raw is not None:
data = data_raw.get('userdata', None)
return data
else:
return None
# VM details function
def get_vm_details(self, source_address):
# Start connection to Zookeeper
_discard, networks = pvc_network.get_list(self.zk_conn, None)
_discard, networks = pvc_network.get_list(self.zkhandler, None)
# Figure out which server this is via the DHCP address
host_information = dict()
networks_managed = (x for x in networks if x.get('type') == 'managed')
for network in networks_managed:
network_leases = pvc_network.getNetworkDHCPLeases(self.zk_conn, network.get('vni'))
network_leases = pvc_network.getNetworkDHCPLeases(self.zkhandler, network.get('vni'))
for network_lease in network_leases:
information = pvc_network.getDHCPLeaseInformation(self.zk_conn, network.get('vni'), network_lease)
information = pvc_network.getDHCPLeaseInformation(self.zkhandler, network.get('vni'), network_lease)
try:
if information.get('ip4_address', None) == source_address:
host_information = information
@ -177,7 +180,7 @@ class MetadataAPIInstance(object):
client_macaddr = host_information.get('mac_address', None)
# Find the VM with that MAC address - we can't assume that the hostname is actually right
_discard, vm_list = pvc_vm.get_list(self.zk_conn, None, None, None)
_discard, vm_list = pvc_vm.get_list(self.zkhandler, None, None, None, None)
vm_details = dict()
for vm in vm_list:
try:

View File

@ -23,23 +23,22 @@ import time
from threading import Thread
import pvcnoded.zkhandler as zkhandler
import pvcnoded.common as common
import daemon_lib.common as common
class NodeInstance(object):
# Initialization function
def __init__(self, name, this_node, zk_conn, config, logger, d_node, d_network, d_domain, dns_aggregator, metadata_api):
def __init__(self, name, this_node, zkhandler, config, logger, d_node, d_network, d_domain, dns_aggregator, metadata_api):
# Passed-in variables on creation
self.name = name
self.this_node = this_node
self.zk_conn = zk_conn
self.zkhandler = zkhandler
self.config = config
self.logger = logger
# Which node is primary
self.primary_node = None
# States
self.daemon_mode = zkhandler.readdata(self.zk_conn, '/nodes/{}/daemonmode'.format(self.name))
self.daemon_mode = self.zkhandler.read(('node.mode', self.name))
self.daemon_state = 'stop'
self.router_state = 'client'
self.domain_state = 'ready'
@ -91,7 +90,7 @@ class NodeInstance(object):
self.flush_stopper = False
# Zookeeper handlers for changed states
@self.zk_conn.DataWatch('/nodes/{}/daemonstate'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.state.daemon', self.name))
def watch_node_daemonstate(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -106,7 +105,7 @@ class NodeInstance(object):
if data != self.daemon_state:
self.daemon_state = data
@self.zk_conn.DataWatch('/nodes/{}/routerstate'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.state.router', self.name))
def watch_node_routerstate(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -135,9 +134,11 @@ class NodeInstance(object):
transition_thread.start()
else:
# We did nothing, so just become secondary state
zkhandler.writedata(self.zk_conn, {'/nodes/{}/routerstate'.format(self.name): 'secondary'})
self.zkhandler.write([
(('node.state.router', self.name), 'secondary')
])
@self.zk_conn.DataWatch('/nodes/{}/domainstate'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.state.domain', self.name))
def watch_node_domainstate(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -170,7 +171,7 @@ class NodeInstance(object):
self.flush_thread = Thread(target=self.unflush, args=(), kwargs={})
self.flush_thread.start()
@self.zk_conn.DataWatch('/nodes/{}/memfree'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.memory.free', self.name))
def watch_node_memfree(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -185,7 +186,7 @@ class NodeInstance(object):
if data != self.memfree:
self.memfree = data
@self.zk_conn.DataWatch('/nodes/{}/memused'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.memory.used', self.name))
def watch_node_memused(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -200,7 +201,7 @@ class NodeInstance(object):
if data != self.memused:
self.memused = data
@self.zk_conn.DataWatch('/nodes/{}/memalloc'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.memory.allocated', self.name))
def watch_node_memalloc(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -215,7 +216,7 @@ class NodeInstance(object):
if data != self.memalloc:
self.memalloc = data
@self.zk_conn.DataWatch('/nodes/{}/vcpualloc'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.vcpu.allocated', self.name))
def watch_node_vcpualloc(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -230,7 +231,7 @@ class NodeInstance(object):
if data != self.vcpualloc:
self.vcpualloc = data
@self.zk_conn.DataWatch('/nodes/{}/runningdomains'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.running_domains', self.name))
def watch_node_runningdomains(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -245,7 +246,7 @@ class NodeInstance(object):
if data != self.domain_list:
self.domain_list = data
@self.zk_conn.DataWatch('/nodes/{}/domainscount'.format(self.name))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.count.provisioned_domains', self.name))
def watch_node_domainscount(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -323,26 +324,30 @@ class NodeInstance(object):
Acquire primary coordinator status from a peer node
"""
# Lock the primary node until transition is complete
primary_lock = zkhandler.exclusivelock(self.zk_conn, '/primary_node')
primary_lock = self.zkhandler.exclusivelock('base.config.primary_node')
primary_lock.acquire()
# Ensure our lock key is populated
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
# Synchronize nodes A (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase A', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase A', state='o')
time.sleep(1) # Time for reader to acquire the lock
self.logger.out('Releasing write lock for synchronization phase A', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase A', state='o')
time.sleep(0.1) # Time for new writer to acquire the lock
# Synchronize nodes B (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase B', state='i')
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase B', state='o')
@ -351,7 +356,7 @@ class NodeInstance(object):
self.logger.out('Released read lock for synchronization phase B', state='o')
# Synchronize nodes C (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase C', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase C', state='o')
@ -367,12 +372,14 @@ class NodeInstance(object):
)
common.createIPAddress(self.upstream_floatingipaddr, self.upstream_cidrnetmask, 'brupstream')
self.logger.out('Releasing write lock for synchronization phase C', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase C', state='o')
# Synchronize nodes D (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase D', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase D', state='o')
@ -397,12 +404,14 @@ class NodeInstance(object):
)
common.createIPAddress(self.storage_floatingipaddr, self.storage_cidrnetmask, 'brstorage')
self.logger.out('Releasing write lock for synchronization phase D', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase D', state='o')
# Synchronize nodes E (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase E', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase E', state='o')
@ -418,12 +427,14 @@ class NodeInstance(object):
)
common.createIPAddress('169.254.169.254', '32', 'lo')
self.logger.out('Releasing write lock for synchronization phase E', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase E', state='o')
# Synchronize nodes F (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase F', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase F', state='o')
@ -432,12 +443,14 @@ class NodeInstance(object):
for network in self.d_network:
self.d_network[network].createGateways()
self.logger.out('Releasing write lock for synchronization phase F', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase F', state='o')
# Synchronize nodes G (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase G', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase G', state='o')
@ -453,7 +466,6 @@ class NodeInstance(object):
"""
patronictl
-c /etc/patroni/config.yml
-d zookeeper://localhost:2181
switchover
--candidate {}
--force
@ -490,6 +502,7 @@ class NodeInstance(object):
# 6. Start client API (and provisioner worker)
if self.config['enable_api']:
self.logger.out('Starting PVC API client service', state='i')
common.run_os_command("systemctl enable pvcapid.service")
common.run_os_command("systemctl start pvcapid.service")
self.logger.out('Starting PVC Provisioner Worker service', state='i')
common.run_os_command("systemctl start pvcapid-worker.service")
@ -504,14 +517,18 @@ class NodeInstance(object):
else:
self.logger.out('Not starting DNS aggregator due to Patroni failures', state='e')
self.logger.out('Releasing write lock for synchronization phase G', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase G', state='o')
# Wait 2 seconds for everything to stabilize before we declare all-done
time.sleep(2)
primary_lock.release()
zkhandler.writedata(self.zk_conn, {'/nodes/{}/routerstate'.format(self.name): 'primary'})
self.zkhandler.write([
(('node.state.router', self.name), 'primary')
])
self.logger.out('Node {} transitioned to primary state'.format(self.name), state='o')
def become_secondary(self):
@ -521,7 +538,7 @@ class NodeInstance(object):
time.sleep(0.2) # Initial delay for the first writer to grab the lock
# Synchronize nodes A (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase A', state='i')
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase A', state='o')
@ -530,7 +547,7 @@ class NodeInstance(object):
self.logger.out('Released read lock for synchronization phase A', state='o')
# Synchronize nodes B (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.writelock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring write lock for synchronization phase B', state='i')
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase B', state='o')
@ -541,19 +558,22 @@ class NodeInstance(object):
for network in self.d_network:
self.d_network[network].stopDHCPServer()
self.logger.out('Releasing write lock for synchronization phase B', state='i')
zkhandler.writedata(self.zk_conn, {'/locks/primary_node': ''})
self.zkhandler.write([
('base.config.primary_node.sync_lock', '')
])
lock.release()
self.logger.out('Released write lock for synchronization phase B', state='o')
# 3. Stop client API
if self.config['enable_api']:
self.logger.out('Stopping PVC API client service', state='i')
common.run_os_command("systemctl stop pvcapid.service")
common.run_os_command("systemctl disable pvcapid.service")
# 4. Stop metadata API
self.metadata_api.stop()
time.sleep(0.1) # Time for new writer to acquire the lock
# Synchronize nodes C (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase C', state='i')
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase C', state='o')
@ -572,7 +592,7 @@ class NodeInstance(object):
self.logger.out('Released read lock for synchronization phase C', state='o')
# Synchronize nodes D (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase D', state='i')
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase D', state='o')
@ -600,7 +620,7 @@ class NodeInstance(object):
self.logger.out('Released read lock for synchronization phase D', state='o')
# Synchronize nodes E (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase E', state='i')
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase E', state='o')
@ -619,7 +639,7 @@ class NodeInstance(object):
self.logger.out('Released read lock for synchronization phase E', state='o')
# Synchronize nodes F (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase F', state='i')
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase F', state='o')
@ -631,7 +651,7 @@ class NodeInstance(object):
self.logger.out('Released read lock for synchronization phase F', state='o')
# Synchronize nodes G (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/primary_node')
lock = self.zkhandler.readlock('base.config.primary_node.sync_lock')
self.logger.out('Acquiring read lock for synchronization phase G', state='i')
try:
lock.acquire(timeout=60) # Don't wait forever and completely block us
@ -644,7 +664,9 @@ class NodeInstance(object):
# Wait 2 seconds for everything to stabilize before we declare all-done
time.sleep(2)
zkhandler.writedata(self.zk_conn, {'/nodes/{}/routerstate'.format(self.name): 'secondary'})
self.zkhandler.write([
(('node.state.router', self.name), 'secondary')
])
self.logger.out('Node {} transitioned to secondary state'.format(self.name), state='o')
# Flush all VMs on the host
@ -664,38 +686,42 @@ class NodeInstance(object):
self.logger.out('Selecting target to migrate VM "{}"'.format(dom_uuid), state='i')
# Don't replace the previous node if the VM is already migrated
if zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(dom_uuid)):
current_node = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(dom_uuid))
if self.zkhandler.read(('domain.last_node', dom_uuid)):
current_node = self.zkhandler.read(('domain.last_node', dom_uuid))
else:
current_node = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(dom_uuid))
current_node = self.zkhandler.read(('domain.node', dom_uuid))
target_node = common.findTargetNode(self.zk_conn, self.config, self.logger, dom_uuid)
target_node = common.findTargetNode(self.zkhandler, dom_uuid)
if target_node == current_node:
target_node = None
if target_node is None:
self.logger.out('Failed to find migration target for VM "{}"; shutting down and setting autostart flag'.format(dom_uuid), state='e')
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(dom_uuid): 'shutdown'})
zkhandler.writedata(self.zk_conn, {'/domains/{}/node_autostart'.format(dom_uuid): 'True'})
self.zkhandler.write([
(('domain.state', dom_uuid), 'shutdown'),
(('domain.meta.autostart', dom_uuid), 'True'),
])
else:
self.logger.out('Migrating VM "{}" to node "{}"'.format(dom_uuid, target_node), state='i')
zkhandler.writedata(self.zk_conn, {
'/domains/{}/state'.format(dom_uuid): 'migrate',
'/domains/{}/node'.format(dom_uuid): target_node,
'/domains/{}/lastnode'.format(dom_uuid): current_node
})
self.zkhandler.write([
(('domain.state', dom_uuid), 'migrate'),
(('domain.node', dom_uuid), target_node),
(('domain.last_node', dom_uuid), current_node),
])
# Wait for the VM to migrate so the next VM's free RAM count is accurate (they migrate serially anyway)
ticks = 0
while zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(dom_uuid)) in ['migrate', 'unmigrate', 'shutdown']:
while self.zkhandler.read(('domain.state', dom_uuid)) in ['migrate', 'unmigrate', 'shutdown']:
ticks += 1
if ticks > 600:
# Abort if we've waited 120 seconds; the VM is in a bad state, so just continue
break
time.sleep(0.2)
zkhandler.writedata(self.zk_conn, {'/nodes/{}/runningdomains'.format(self.name): ''})
zkhandler.writedata(self.zk_conn, {'/nodes/{}/domainstate'.format(self.name): 'flushed'})
self.zkhandler.write([
(('node.running_domains', self.name), ''),
(('node.state.domain', self.name), 'flushed'),
])
self.flush_thread = None
self.flush_stopper = False
return
@ -712,20 +738,20 @@ class NodeInstance(object):
return
# Handle autostarts
autostart = zkhandler.readdata(self.zk_conn, '/domains/{}/node_autostart'.format(dom_uuid))
node = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(dom_uuid))
autostart = self.zkhandler.read(('domain.meta.autostart', dom_uuid))
node = self.zkhandler.read(('domain.node', dom_uuid))
if autostart == 'True' and node == self.name:
self.logger.out('Starting autostart VM "{}"'.format(dom_uuid), state='i')
zkhandler.writedata(self.zk_conn, {
'/domains/{}/state'.format(dom_uuid): 'start',
'/domains/{}/node'.format(dom_uuid): self.name,
'/domains/{}/lastnode'.format(dom_uuid): '',
'/domains/{}/node_autostart'.format(dom_uuid): 'False'
})
self.zkhandler.write([
(('domain.state', dom_uuid), 'start'),
(('domain.node', dom_uuid), self.name),
(('domain.last_node', dom_uuid), ''),
(('domain.meta.autostart', dom_uuid), 'False'),
])
continue
try:
last_node = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(dom_uuid))
last_node = self.zkhandler.read(('domain.last_node', dom_uuid))
except Exception:
continue
@ -733,17 +759,19 @@ class NodeInstance(object):
continue
self.logger.out('Setting unmigration for VM "{}"'.format(dom_uuid), state='i')
zkhandler.writedata(self.zk_conn, {
'/domains/{}/state'.format(dom_uuid): 'migrate',
'/domains/{}/node'.format(dom_uuid): self.name,
'/domains/{}/lastnode'.format(dom_uuid): ''
})
self.zkhandler.write([
(('domain.state', dom_uuid), 'migrate'),
(('domain.node', dom_uuid), self.name),
(('domain.last_node', dom_uuid), ''),
])
# Wait for the VM to migrate back
while zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(dom_uuid)) in ['migrate', 'unmigrate', 'shutdown']:
while self.zkhandler.read(('domain.state', dom_uuid)) in ['migrate', 'unmigrate', 'shutdown']:
time.sleep(0.1)
zkhandler.writedata(self.zk_conn, {'/nodes/{}/domainstate'.format(self.name): 'ready'})
self.zkhandler.write([
(('node.state.domain', self.name), 'ready')
])
self.flush_thread = None
self.flush_stopper = False
return
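
The primary/secondary transition above relies on a fixed handoff protocol: both nodes walk through synchronization phases A to G on the shared base.config.primary_node.sync_lock key, one side holding a write lock while the other waits on a read lock, so each responsibility (floating IPs, the client and metadata APIs, the DNS aggregator, and so on) is released on the old primary before it is claimed on the new one. A stripped-down sketch of a single phase from the writer's side, using the same zkhandler lock calls as the code above (the sleep value and comments are illustrative):

# One synchronization phase as the writer; the reader side mirrors this with
# zkhandler.readlock() on the same key. Assumes a connected zkhandler.
import time

lock = zkhandler.writelock('base.config.primary_node.sync_lock')
lock.acquire()
# ... perform this phase's work, e.g. bring up a floating IP ...
zkhandler.write([
    ('base.config.primary_node.sync_lock', '')  # touch the key so the reader can proceed
])
lock.release()
time.sleep(0.1)  # brief pause so the next writer can grab the lock, as in the code above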

View File

@ -0,0 +1,210 @@
#!/usr/bin/env python3
# SRIOVVFInstance.py - Class implementing a PVC SR-IOV VF and run by pvcnoded
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import daemon_lib.common as common
def boolToOnOff(state):
if state and str(state) == 'True':
return 'on'
else:
return 'off'
class SRIOVVFInstance(object):
# Initialization function
def __init__(self, vf, zkhandler, config, logger, this_node):
self.vf = vf
self.zkhandler = zkhandler
self.config = config
self.logger = logger
self.this_node = this_node
self.myhostname = self.this_node.name
self.pf = self.zkhandler.read(('node.sriov.vf', self.myhostname, 'sriov_vf.pf', self.vf))
self.mtu = self.zkhandler.read(('node.sriov.vf', self.myhostname, 'sriov_vf.mtu', self.vf))
self.vfid = self.vf.replace('{}v'.format(self.pf), '')
self.logger.out('Setting MTU to {}'.format(self.mtu), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} mtu {}'.format(self.vf, self.mtu))
# These properties are set via the DataWatch functions, to ensure they are configured on the system
self.mac = None
self.vlan_id = None
self.vlan_qos = None
self.tx_rate_min = None
self.tx_rate_max = None
self.spoof_check = None
self.link_state = None
self.trust = None
self.query_rss = None
# Zookeeper handlers for changed configs
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.mac', self.vf))
def watch_vf_mac(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = '00:00:00:00:00:00'
if data != self.mac:
self.mac = data
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.vlan_id', self.vf))
def watch_vf_vlan_id(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = '0'
if data != self.vlan_id:
self.vlan_id = data
self.logger.out('Setting vLAN ID to {}'.format(self.vlan_id), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} vlan {} qos {}'.format(self.pf, self.vfid, self.vlan_id, self.vlan_qos))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.vlan_qos', self.vf))
def watch_vf_vlan_qos(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = '0'
if data != self.vlan_qos:
self.vlan_qos = data
self.logger.out('Setting vLAN QOS to {}'.format(self.vlan_qos), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} vlan {} qos {}'.format(self.pf, self.vfid, self.vlan_id, self.vlan_qos))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.tx_rate_min', self.vf))
def watch_vf_tx_rate_min(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = '0'
if data != self.tx_rate_min:
self.tx_rate_min = data
self.logger.out('Setting minimum TX rate to {}'.format(self.tx_rate_min), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} min_tx_rate {}'.format(self.pf, self.vfid, self.tx_rate_min))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.tx_rate_max', self.vf))
def watch_vf_tx_rate_max(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = '0'
if data != self.tx_rate_max:
self.tx_rate_max = data
self.logger.out('Setting maximum TX rate to {}'.format(self.tx_rate_max), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} max_tx_rate {}'.format(self.pf, self.vfid, self.tx_rate_max))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.spoof_check', self.vf))
def watch_vf_spoof_check(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = '0'
if data != self.spoof_check:
self.spoof_check = data
self.logger.out('Setting spoof checking {}'.format(boolToOnOff(self.spoof_check)), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} spoofchk {}'.format(self.pf, self.vfid, boolToOnOff(self.spoof_check)))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.link_state', self.vf))
def watch_vf_link_state(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = 'on'
if data != self.link_state:
self.link_state = data
self.logger.out('Setting link state to {}'.format(boolToOnOff(self.link_state)), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} state {}'.format(self.pf, self.vfid, self.link_state))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.trust', self.vf))
def watch_vf_trust(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = 'off'
if data != self.trust:
self.trust = data
self.logger.out('Setting trust mode {}'.format(boolToOnOff(self.trust)), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} trust {}'.format(self.pf, self.vfid, boolToOnOff(self.trust)))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('node.sriov.vf', self.myhostname) + self.zkhandler.schema.path('sriov_vf.config.query_rss', self.vf))
def watch_vf_query_rss(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
# because this class instance is about to be reaped in Daemon.py
return False
try:
data = data.decode('ascii')
except AttributeError:
data = 'off'
if data != self.query_rss:
self.query_rss = data
self.logger.out('Setting RSS query ability {}'.format(boolToOnOff(self.query_rss)), state='i', prefix='SR-IOV VF {}'.format(self.vf))
common.run_os_command('ip link set {} vf {} query_rss {}'.format(self.pf, self.vfid, boolToOnOff(self.query_rss)))
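
Every watcher above follows the same shape: a DataWatch on a schema path fires when the key changes, the new value is cached on the instance, and the matching ip link command is applied on the host. A VF property is therefore reconfigured simply by writing its key; the sketch below is hedged (the node and VF names are examples, and the key tuple form is inferred from the read calls at the top of this file):

# Writing the vlan_id key for a VF should cause watch_vf_vlan_id() on the owning
# node to run: ip link set <pf> vf <vfid> vlan <vlan_id> qos <vlan_qos>
# Example values only; assumes a connected zkhandler.
node = 'hv1'
vf = 'ens1f1v0'

zkhandler.write([
    (('node.sriov.vf', node, 'sriov_vf.config.vlan_id', vf), '100'),
])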

View File

@ -25,15 +25,13 @@ import time
from threading import Thread, Event
from collections import deque
import pvcnoded.zkhandler as zkhandler
class VMConsoleWatcherInstance(object):
# Initialization function
def __init__(self, domuuid, domname, zk_conn, config, logger, this_node):
def __init__(self, domuuid, domname, zkhandler, config, logger, this_node):
self.domuuid = domuuid
self.domname = domname
self.zk_conn = zk_conn
self.zkhandler = zkhandler
self.config = config
self.logfile = '{}/{}.log'.format(config['console_log_directory'], self.domname)
self.console_log_lines = config['console_log_lines']
@ -73,7 +71,7 @@ class VMConsoleWatcherInstance(object):
# Stop execution thread
def stop(self):
if self.thread and self.thread.isAlive():
if self.thread and self.thread.is_alive():
self.logger.out('Stopping VM log parser', state='i', prefix='Domain {}'.format(self.domuuid))
self.thread_stopper.set()
# Do one final flush
@ -93,7 +91,9 @@ class VMConsoleWatcherInstance(object):
self.fetch_lines()
# Update Zookeeper with the new loglines if they changed
if self.loglines != self.last_loglines:
zkhandler.writedata(self.zk_conn, {'/domains/{}/consolelog'.format(self.domuuid): self.loglines})
self.zkhandler.write([
(('domain.console.log', self.domuuid), self.loglines)
])
self.last_loglines = self.loglines
def fetch_lines(self):

View File

@ -28,18 +28,18 @@ from threading import Thread
from xml.etree import ElementTree
import pvcnoded.zkhandler as zkhandler
import pvcnoded.common as common
import daemon_lib.common as common
import pvcnoded.VMConsoleWatcherInstance as VMConsoleWatcherInstance
import daemon_lib.common as daemon_common
def flush_locks(zk_conn, logger, dom_uuid, this_node=None):
def flush_locks(zkhandler, logger, dom_uuid, this_node=None):
logger.out('Flushing RBD locks for VM "{}"'.format(dom_uuid), state='i')
# Get the list of RBD images
rbd_list = zkhandler.readdata(zk_conn, '/domains/{}/rbdlist'.format(dom_uuid)).split(',')
rbd_list = zkhandler.read(('domain.storage.volumes', dom_uuid)).split(',')
for rbd in rbd_list:
# Check if a lock exists
lock_list_retcode, lock_list_stdout, lock_list_stderr = common.run_os_command('rbd lock list --format json {}'.format(rbd))
@ -57,17 +57,21 @@ def flush_locks(zk_conn, logger, dom_uuid, this_node=None):
if lock_list:
# Loop through the locks
for lock in lock_list:
if this_node is not None and zkhandler.readdata(zk_conn, '/domains/{}/state'.format(dom_uuid)) != 'stop' and lock['address'].split(':')[0] != this_node.storage_ipaddr:
if this_node is not None and zkhandler.read(('domain.state', dom_uuid)) != 'stop' and lock['address'].split(':')[0] != this_node.storage_ipaddr:
logger.out('RBD lock does not belong to this host (lock owner: {}): freeing this lock would be unsafe, aborting'.format(lock['address'].split(':')[0]), state='e')
zkhandler.writedata(zk_conn, {'/domains/{}/state'.format(dom_uuid): 'fail'})
zkhandler.writedata(zk_conn, {'/domains/{}/failedreason'.format(dom_uuid): 'Could not safely free RBD lock {} ({}) on volume {}; stop VM and flush locks manually'.format(lock['id'], lock['address'], rbd)})
zkhandler.write([
(('domain.state', dom_uuid), 'fail'),
(('domain.failed_reason', dom_uuid), 'Could not safely free RBD lock {} ({}) on volume {}; stop VM and flush locks manually'.format(lock['id'], lock['address'], rbd)),
])
break
# Free the lock
lock_remove_retcode, lock_remove_stdout, lock_remove_stderr = common.run_os_command('rbd lock remove {} "{}" "{}"'.format(rbd, lock['id'], lock['locker']))
if lock_remove_retcode != 0:
logger.out('Failed to free RBD lock "{}" on volume "{}": {}'.format(lock['id'], rbd, lock_remove_stderr), state='e')
zkhandler.writedata(zk_conn, {'/domains/{}/state'.format(dom_uuid): 'fail'})
zkhandler.writedata(zk_conn, {'/domains/{}/failedreason'.format(dom_uuid): 'Could not free RBD lock {} ({}) on volume {}: {}'.format(lock['id'], lock['address'], rbd, lock_remove_stderr)})
zkhandler.write([
(('domain.state', dom_uuid), 'fail'),
(('domain.failed_reason', dom_uuid), 'Could not free RBD lock {} ({}) on volume {}: {}'.format(lock['id'], lock['address'], rbd, lock_remove_stderr)),
])
break
logger.out('Freed RBD lock "{}" on volume "{}"'.format(lock['id'], rbd), state='o')
@ -75,7 +79,7 @@ def flush_locks(zk_conn, logger, dom_uuid, this_node=None):
# Primary command function
def run_command(zk_conn, logger, this_node, data):
def run_command(zkhandler, logger, this_node, data):
# Get the command and args
command, args = data.split()
@ -86,45 +90,45 @@ def run_command(zk_conn, logger, this_node, data):
# Verify that the VM is set to run on this node
if this_node.d_domain[dom_uuid].getnode() == this_node.name:
# Lock the command queue
zk_lock = zkhandler.writelock(zk_conn, '/cmd/domains')
zk_lock = zkhandler.writelock('base.cmd.domain')
with zk_lock:
# Flush the lock
result = flush_locks(zk_conn, logger, dom_uuid, this_node)
result = flush_locks(zkhandler, logger, dom_uuid, this_node)
# Command succeeded
if result:
# Update the command queue
zkhandler.writedata(zk_conn, {'/cmd/domains': 'success-{}'.format(data)})
zkhandler.write([
('base.cmd.domain', 'success-{}'.format(data))
])
# Command failed
else:
# Update the command queue
zkhandler.writedata(zk_conn, {'/cmd/domains': 'failure-{}'.format(data)})
zkhandler.write([
('base.cmd.domain', 'failure-{}'.format(data))
])
# Wait 1 second before we free the lock, to ensure the client hits the lock
time.sleep(1)
class VMInstance(object):
# Initialization function
def __init__(self, domuuid, zk_conn, config, logger, this_node):
def __init__(self, domuuid, zkhandler, config, logger, this_node):
# Passed-in variables on creation
self.domuuid = domuuid
self.zk_conn = zk_conn
self.zkhandler = zkhandler
self.config = config
self.logger = logger
self.this_node = this_node
# Get data from zookeeper
self.domname = zkhandler.readdata(zk_conn, '/domains/{}'.format(domuuid))
self.state = zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(self.domuuid))
self.node = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
self.lastnode = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(self.domuuid))
self.last_currentnode = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
self.last_lastnode = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(self.domuuid))
self.domname = self.zkhandler.read(('domain', domuuid))
self.state = self.zkhandler.read(('domain.state', domuuid))
self.node = self.zkhandler.read(('domain.node', domuuid))
self.lastnode = self.zkhandler.read(('domain.last_node', domuuid))
self.last_currentnode = self.zkhandler.read(('domain.node', domuuid))
self.last_lastnode = self.zkhandler.read(('domain.last_node', domuuid))
try:
self.pinpolicy = zkhandler.readdata(self.zk_conn, '/domains/{}/pinpolicy'.format(self.domuuid))
except Exception:
self.pinpolicy = "none"
try:
self.migration_method = zkhandler.readdata(self.zk_conn, '/domains/{}/migration_method'.format(self.domuuid))
self.migration_method = self.zkhandler.read(('domain.meta.migrate_method', self.domuuid))
except Exception:
self.migration_method = 'none'
@ -140,10 +144,10 @@ class VMInstance(object):
self.dom = self.lookupByUUID(self.domuuid)
# Log watcher instance
self.console_log_instance = VMConsoleWatcherInstance.VMConsoleWatcherInstance(self.domuuid, self.domname, self.zk_conn, self.config, self.logger, self.this_node)
self.console_log_instance = VMConsoleWatcherInstance.VMConsoleWatcherInstance(self.domuuid, self.domname, self.zkhandler, self.config, self.logger, self.this_node)
# Watch for changes to the state field in Zookeeper
@self.zk_conn.DataWatch('/domains/{}/state'.format(self.domuuid))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('domain.state', self.domuuid))
def watch_state(data, stat, event=""):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -173,7 +177,7 @@ class VMInstance(object):
if self.dom is not None:
memory = int(self.dom.info()[2] / 1024)
else:
domain_information = daemon_common.getInformationFromXML(self.zk_conn, self.domuuid)
domain_information = daemon_common.getInformationFromXML(self.zkhandler, self.domuuid)
memory = int(domain_information['memory'])
except Exception:
memory = 0
@ -195,7 +199,9 @@ class VMInstance(object):
# Add the domain to the domain_list array
self.this_node.domain_list.append(self.domuuid)
# Push the change up to Zookeeper
zkhandler.writedata(self.zk_conn, {'/nodes/{}/runningdomains'.format(self.this_node.name): ' '.join(self.this_node.domain_list)})
self.zkhandler.write([
(('node.running_domains', self.this_node.name), ' '.join(self.this_node.domain_list))
])
except Exception as e:
self.logger.out('Error adding domain to list: {}'.format(e), state='e')
@ -205,7 +211,9 @@ class VMInstance(object):
# Remove the domain from the domain_list array
self.this_node.domain_list.remove(self.domuuid)
# Push the change up to Zookeeper
zkhandler.writedata(self.zk_conn, {'/nodes/{}/runningdomains'.format(self.this_node.name): ' '.join(self.this_node.domain_list)})
self.zkhandler.write([
(('node.running_domains', self.this_node.name), ' '.join(self.this_node.domain_list))
])
except Exception as e:
self.logger.out('Error removing domain from list: {}'.format(e), state='e')
@ -218,11 +226,17 @@ class VMInstance(object):
self.logger.out('Updating VNC data', state='i', prefix='Domain {}'.format(self.domuuid))
port = graphics.get('port', '')
listen = graphics.get('listen', '')
zkhandler.writedata(self.zk_conn, {'/domains/{}/vnc'.format(self.domuuid): '{}:{}'.format(listen, port)})
self.zkhandler.write([
(('domain.console.vnc', self.domuuid), '{}:{}'.format(listen, port))
])
else:
zkhandler.writedata(self.zk_conn, {'/domains/{}/vnc'.format(self.domuuid): ''})
self.zkhandler.write([
(('domain.console.vnc', self.domuuid), '')
])
else:
zkhandler.writedata(self.zk_conn, {'/domains/{}/vnc'.format(self.domuuid): ''})
self.zkhandler.write([
(('domain.console.vnc', self.domuuid), '')
])
# Start up the VM
def start_vm(self):
@ -251,8 +265,8 @@ class VMInstance(object):
if self.getdom() is None or self.getdom().state()[0] != libvirt.VIR_DOMAIN_RUNNING:
# Flush locks
self.logger.out('Flushing RBD locks', state='i', prefix='Domain {}'.format(self.domuuid))
flush_locks(self.zk_conn, self.logger, self.domuuid, self.this_node)
if zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(self.domuuid)) == 'fail':
flush_locks(self.zkhandler, self.logger, self.domuuid, self.this_node)
if self.zkhandler.read(('domain.state', self.domuuid)) == 'fail':
lv_conn.close()
self.dom = None
self.instart = False
@ -261,21 +275,27 @@ class VMInstance(object):
if curstate == libvirt.VIR_DOMAIN_RUNNING:
# If it is running just update the model
self.addDomainToList()
zkhandler.writedata(self.zk_conn, {'/domains/{}/failedreason'.format(self.domuuid): ''})
self.zkhandler.write([
(('domain.failed_reason', self.domuuid), '')
])
else:
# Or try to create it
try:
# Grab the domain information from Zookeeper
xmlconfig = zkhandler.readdata(self.zk_conn, '/domains/{}/xml'.format(self.domuuid))
xmlconfig = self.zkhandler.read(('domain.xml', self.domuuid))
dom = lv_conn.createXML(xmlconfig, 0)
self.addDomainToList()
self.logger.out('Successfully started VM', state='o', prefix='Domain {}'.format(self.domuuid))
self.dom = dom
zkhandler.writedata(self.zk_conn, {'/domains/{}/failedreason'.format(self.domuuid): ''})
self.zkhandler.write([
(('domain.failed_reason', self.domuuid), '')
])
except libvirt.libvirtError as e:
self.logger.out('Failed to create VM', state='e', prefix='Domain {}'.format(self.domuuid))
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'fail'})
zkhandler.writedata(self.zk_conn, {'/domains/{}/failedreason'.format(self.domuuid): str(e)})
self.zkhandler.write([
(('domain.state', self.domuuid), 'fail'),
(('domain.failed_reason', self.domuuid), str(e))
])
lv_conn.close()
self.dom = None
self.instart = False
@ -303,7 +323,9 @@ class VMInstance(object):
self.start_vm()
self.addDomainToList()
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'start'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start')
])
lv_conn.close()
self.inrestart = False
@ -313,6 +335,13 @@ class VMInstance(object):
self.instop = True
try:
self.dom.destroy()
time.sleep(0.2)
try:
if self.getdom().state()[0] == libvirt.VIR_DOMAIN_RUNNING:
# It didn't terminate, try again
self.dom.destroy()
except libvirt.libvirtError:
pass
except AttributeError:
self.logger.out('Failed to terminate VM', state='e', prefix='Domain {}'.format(self.domuuid))
self.removeDomainFromList()
@ -329,12 +358,21 @@ class VMInstance(object):
self.instop = True
try:
self.dom.destroy()
time.sleep(0.2)
try:
if self.getdom().state()[0] == libvirt.VIR_DOMAIN_RUNNING:
# It didn't terminate, try again
self.dom.destroy()
except libvirt.libvirtError:
pass
except AttributeError:
self.logger.out('Failed to stop VM', state='e', prefix='Domain {}'.format(self.domuuid))
self.removeDomainFromList()
if self.inrestart is False:
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'stop'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'stop')
])
self.logger.out('Successfully stopped VM', state='o', prefix='Domain {}'.format(self.domuuid))
self.dom = None
@ -355,8 +393,8 @@ class VMInstance(object):
time.sleep(1)
# Abort shutdown if the state changes to start
current_state = zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(self.domuuid))
if current_state not in ['shutdown', 'restart']:
current_state = self.zkhandler.read(('domain.state', self.domuuid))
if current_state not in ['shutdown', 'restart', 'migrate']:
self.logger.out('Aborting VM shutdown due to state change', state='i', prefix='Domain {}'.format(self.domuuid))
is_aborted = True
break
@ -368,7 +406,9 @@ class VMInstance(object):
if lvdomstate != libvirt.VIR_DOMAIN_RUNNING:
self.removeDomainFromList()
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'stop'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'stop')
])
self.logger.out('Successfully shutdown VM', state='o', prefix='Domain {}'.format(self.domuuid))
self.dom = None
# Stop the log watcher
@ -377,7 +417,9 @@ class VMInstance(object):
if tick >= self.config['vm_shutdown_timeout']:
self.logger.out('Shutdown timeout ({}s) expired, forcing off'.format(self.config['vm_shutdown_timeout']), state='e', prefix='Domain {}'.format(self.domuuid))
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'stop'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'stop')
])
break
self.inshutdown = False
@ -388,7 +430,9 @@ class VMInstance(object):
if self.inrestart:
# Wait to prevent race conditions
time.sleep(1)
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'start'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start')
])
# Migrate the VM to a target host
def migrate_vm(self, force_live=False, force_shutdown=False):
@ -405,24 +449,24 @@ class VMInstance(object):
self.logger.out('Migrating VM to node "{}"'.format(self.node), state='i', prefix='Domain {}'.format(self.domuuid))
# Used for sanity checking later
target_node = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
target_node = self.zkhandler.read(('domain.node', self.domuuid))
aborted = False
def abort_migrate(reason):
zkhandler.writedata(self.zk_conn, {
'/domains/{}/state'.format(self.domuuid): 'start',
'/domains/{}/node'.format(self.domuuid): self.this_node.name,
'/domains/{}/lastnode'.format(self.domuuid): self.last_lastnode
})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start'),
(('domain.node', self.domuuid), self.this_node.name),
(('domain.last_node', self.domuuid), self.last_lastnode)
])
migrate_lock_node.release()
migrate_lock_state.release()
self.inmigrate = False
self.logger.out('Aborted migration: {}'.format(reason), state='i', prefix='Domain {}'.format(self.domuuid))
# Acquire exclusive lock on the domain node key
migrate_lock_node = zkhandler.exclusivelock(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
migrate_lock_state = zkhandler.exclusivelock(self.zk_conn, '/domains/{}/state'.format(self.domuuid))
migrate_lock_node = self.zkhandler.exclusivelock(('domain.node', self.domuuid))
migrate_lock_state = self.zkhandler.exclusivelock(('domain.state', self.domuuid))
migrate_lock_node.acquire()
migrate_lock_state.acquire()
@ -434,14 +478,14 @@ class VMInstance(object):
return
# Synchronize nodes A (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.readlock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring read lock for synchronization phase A', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase A', state='o', prefix='Domain {}'.format(self.domuuid))
if zkhandler.readdata(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid)) == '':
if self.zkhandler.read(('domain.migrate.sync_lock', self.domuuid)) == '':
self.logger.out('Waiting for peer', state='i', prefix='Domain {}'.format(self.domuuid))
ticks = 0
while zkhandler.readdata(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid)) == '':
while self.zkhandler.read(('domain.migrate.sync_lock', self.domuuid)) == '':
time.sleep(0.1)
ticks += 1
if ticks > 300:
@ -457,11 +501,11 @@ class VMInstance(object):
return
# Synchronize nodes B (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.writelock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring write lock for synchronization phase B', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase B', state='o', prefix='Domain {}'.format(self.domuuid))
time.sleep(0.5) # Time fir reader to acquire the lock
time.sleep(0.5) # Time for reader to acquire the lock
def migrate_live():
self.logger.out('Setting up live migration', state='i', prefix='Domain {}'.format(self.domuuid))
@ -498,9 +542,7 @@ class VMInstance(object):
def migrate_shutdown():
self.logger.out('Shutting down VM for offline migration', state='i', prefix='Domain {}'.format(self.domuuid))
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'shutdown'})
while zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(self.domuuid)) != 'stop':
time.sleep(0.5)
self.shutdown_vm()
return True
do_migrate_shutdown = False
@ -545,11 +587,11 @@ class VMInstance(object):
return
# Synchronize nodes C (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.writelock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring write lock for synchronization phase C', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase C', state='o', prefix='Domain {}'.format(self.domuuid))
time.sleep(0.5) # Time fir reader to acquire the lock
time.sleep(0.5) # Time for reader to acquire the lock
if do_migrate_shutdown:
migrate_shutdown()
@ -559,21 +601,26 @@ class VMInstance(object):
self.logger.out('Released write lock for synchronization phase C', state='o', prefix='Domain {}'.format(self.domuuid))
# Synchronize nodes D (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.readlock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring read lock for synchronization phase D', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase D', state='o', prefix='Domain {}'.format(self.domuuid))
self.last_currentnode = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
self.last_lastnode = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(self.domuuid))
self.last_currentnode = self.zkhandler.read(('domain.node', self.domuuid))
self.last_lastnode = self.zkhandler.read(('domain.last_node', self.domuuid))
self.logger.out('Releasing read lock for synchronization phase D', state='i', prefix='Domain {}'.format(self.domuuid))
lock.release()
self.logger.out('Released read lock for synchronization phase D', state='o', prefix='Domain {}'.format(self.domuuid))
# Wait for the receive side to complete before we declare all-done and release locks
while zkhandler.readdata(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid)) != '':
time.sleep(0.5)
ticks = 0
while self.zkhandler.read(('domain.migrate.sync_lock', self.domuuid)) != '':
time.sleep(0.1)
ticks += 1
if ticks > 100:
self.logger.out('Sync lock clear exceeded 10s timeout, continuing', state='w', prefix='Domain {}'.format(self.domuuid))
break
migrate_lock_node.release()
migrate_lock_state.release()
@ -590,22 +637,27 @@ class VMInstance(object):
self.logger.out('Receiving VM migration from node "{}"'.format(self.node), state='i', prefix='Domain {}'.format(self.domuuid))
# Short delay to ensure sender is in sync
time.sleep(0.5)
# Ensure our lock key is populated
zkhandler.writedata(self.zk_conn, {'/locks/domain_migrate/{}'.format(self.domuuid): self.domuuid})
self.zkhandler.write([
(('domain.migrate.sync_lock', self.domuuid), self.domuuid)
])
# Synchronize nodes A (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.writelock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring write lock for synchronization phase A', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase A', state='o', prefix='Domain {}'.format(self.domuuid))
time.sleep(0.5) # Time fir reader to acquire the lock
time.sleep(1) # Time for reader to acquire the lock
self.logger.out('Releasing write lock for synchronization phase A', state='i', prefix='Domain {}'.format(self.domuuid))
lock.release()
self.logger.out('Released write lock for synchronization phase A', state='o', prefix='Domain {}'.format(self.domuuid))
time.sleep(0.1) # Time fir new writer to acquire the lock
time.sleep(0.1) # Time for new writer to acquire the lock
# Synchronize nodes B (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.readlock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring read lock for synchronization phase B', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase B', state='o', prefix='Domain {}'.format(self.domuuid))
@ -614,38 +666,44 @@ class VMInstance(object):
self.logger.out('Released read lock for synchronization phase B', state='o', prefix='Domain {}'.format(self.domuuid))
# Synchronize nodes C (I am reader)
lock = zkhandler.readlock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.readlock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring read lock for synchronization phase C', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired read lock for synchronization phase C', state='o', prefix='Domain {}'.format(self.domuuid))
# Set the updated data
self.last_currentnode = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
self.last_lastnode = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(self.domuuid))
self.last_currentnode = self.zkhandler.read(('domain.node', self.domuuid))
self.last_lastnode = self.zkhandler.read(('domain.last_node', self.domuuid))
self.logger.out('Releasing read lock for synchronization phase C', state='i', prefix='Domain {}'.format(self.domuuid))
lock.release()
self.logger.out('Released read lock for synchronization phase C', state='o', prefix='Domain {}'.format(self.domuuid))
# Synchronize nodes D (I am writer)
lock = zkhandler.writelock(self.zk_conn, '/locks/domain_migrate/{}'.format(self.domuuid))
lock = self.zkhandler.writelock(('domain.migrate.sync_lock', self.domuuid))
self.logger.out('Acquiring write lock for synchronization phase D', state='i', prefix='Domain {}'.format(self.domuuid))
lock.acquire()
self.logger.out('Acquired write lock for synchronization phase D', state='o', prefix='Domain {}'.format(self.domuuid))
time.sleep(0.5) # Time fir reader to acquire the lock
time.sleep(0.5) # Time for reader to acquire the lock
self.state = zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(self.domuuid))
self.state = self.zkhandler.read(('domain.state', self.domuuid))
self.dom = self.lookupByUUID(self.domuuid)
if self.dom:
lvdomstate = self.dom.state()[0]
if lvdomstate == libvirt.VIR_DOMAIN_RUNNING:
# VM has been received and started
self.addDomainToList()
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'start'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start')
])
self.logger.out('Successfully received migrated VM', state='o', prefix='Domain {}'.format(self.domuuid))
else:
# The receive somehow failed
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'fail'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'fail'),
(('domain.failed_reason', self.domuuid), 'Failed to receive migration')
])
self.logger.out('Failed to receive migrated VM', state='e', prefix='Domain {}'.format(self.domuuid))
else:
if self.node == self.this_node.name:
if self.state in ['start']:
@ -653,7 +711,9 @@ class VMInstance(object):
self.logger.out('Receive aborted via state change', state='w', prefix='Domain {}'.format(self.domuuid))
elif self.state in ['stop']:
# The send was shutdown-based
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'start'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start')
])
else:
# The send failed or was aborted
self.logger.out('Migrate aborted or failed; VM in state {}'.format(self.state), state='w', prefix='Domain {}'.format(self.domuuid))
@ -662,7 +722,9 @@ class VMInstance(object):
lock.release()
self.logger.out('Released write lock for synchronization phase D', state='o', prefix='Domain {}'.format(self.domuuid))
zkhandler.writedata(self.zk_conn, {'/locks/domain_migrate/{}'.format(self.domuuid): ''})
self.zkhandler.write([
(('domain.migrate.sync_lock', self.domuuid), '')
])
self.inreceive = False
return
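
The sending and receiving sides above coordinate through alternating read/write locks on the per-domain migrate sync_lock key across four phases; each writer holds its lock just long enough for the peer's read request to queue behind it. A stripped-down sketch of a single phase using kazoo's lock recipes (standalone illustration with placeholder host and path, not PVC's handler):

from kazoo.client import KazooClient
import time

zk = KazooClient(hosts='127.0.0.1:2181')  # placeholder host
zk.start()
sync_path = '/locks/domain_migrate/example'  # placeholder key

# Writer side of a phase: hold the lock long enough for the peer's
# read request to queue behind it, then release so the reader proceeds.
wlock = zk.WriteLock(sync_path, 'sender')
wlock.acquire()
time.sleep(0.5)  # time for the reader to queue its request
wlock.release()

# Reader side of the same phase (runs on the peer node): blocks until
# the writer releases, then performs its half of the phase.
rlock = zk.ReadLock(sync_path, 'receiver')
rlock.acquire()
rlock.release()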
@ -671,9 +733,10 @@ class VMInstance(object):
#
def manage_vm_state(self):
# Update the current values from zookeeper
self.state = zkhandler.readdata(self.zk_conn, '/domains/{}/state'.format(self.domuuid))
self.node = zkhandler.readdata(self.zk_conn, '/domains/{}/node'.format(self.domuuid))
self.lastnode = zkhandler.readdata(self.zk_conn, '/domains/{}/lastnode'.format(self.domuuid))
self.state = self.zkhandler.read(('domain.state', self.domuuid))
self.node = self.zkhandler.read(('domain.node', self.domuuid))
self.lastnode = self.zkhandler.read(('domain.last_node', self.domuuid))
self.migration_method = self.zkhandler.read(('domain.meta.migrate_method', self.domuuid))
# Check the current state of the VM
try:
@ -721,7 +784,9 @@ class VMInstance(object):
elif self.state == "migrate" or self.state == "migrate-live":
# Start the log watcher
self.console_log_instance.start()
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'start'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start')
])
# Add domain to running list
self.addDomainToList()
# VM should be restarted
@ -744,7 +809,9 @@ class VMInstance(object):
self.receive_migrate()
# VM should be restarted (i.e. started since it isn't running)
if self.state == "restart":
zkhandler.writedata(self.zk_conn, {'/domains/{}/state'.format(self.domuuid): 'start'})
self.zkhandler.write([
(('domain.state', self.domuuid), 'start')
])
# VM should be shut down; ensure it's gone from this node's domain_list
elif self.state == "shutdown":
self.removeDomainFromList()

View File

@ -24,15 +24,14 @@ import time
from textwrap import dedent
import pvcnoded.zkhandler as zkhandler
import pvcnoded.common as common
import daemon_lib.common as common
class VXNetworkInstance(object):
# Initialization function
def __init__(self, vni, zk_conn, config, logger, this_node, dns_aggregator):
def __init__(self, vni, zkhandler, config, logger, this_node, dns_aggregator):
self.vni = vni
self.zk_conn = zk_conn
self.zkhandler = zkhandler
self.config = config
self.logger = logger
self.this_node = this_node
@ -41,7 +40,7 @@ class VXNetworkInstance(object):
self.vni_mtu = config['vni_mtu']
self.bridge_dev = config['bridge_dev']
self.nettype = zkhandler.readdata(self.zk_conn, '/networks/{}/nettype'.format(self.vni))
self.nettype = self.zkhandler.read(('network.type', self.vni))
if self.nettype == 'bridged':
self.logger.out(
'Creating new bridged network',
@ -73,7 +72,7 @@ class VXNetworkInstance(object):
self.bridge_nic = 'vmbr{}'.format(self.vni)
# Zookeeper handlers for changed states
@self.zk_conn.DataWatch('/networks/{}'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network', self.vni))
def watch_network_description(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -92,16 +91,16 @@ class VXNetworkInstance(object):
self.description = None
self.domain = None
self.name_servers = None
self.ip6_gateway = zkhandler.readdata(self.zk_conn, '/networks/{}/ip6_gateway'.format(self.vni))
self.ip6_network = zkhandler.readdata(self.zk_conn, '/networks/{}/ip6_network'.format(self.vni))
self.ip6_cidrnetmask = zkhandler.readdata(self.zk_conn, '/networks/{}/ip6_network'.format(self.vni)).split('/')[-1]
self.dhcp6_flag = (zkhandler.readdata(self.zk_conn, '/networks/{}/dhcp6_flag'.format(self.vni)) == 'True')
self.ip4_gateway = zkhandler.readdata(self.zk_conn, '/networks/{}/ip4_gateway'.format(self.vni))
self.ip4_network = zkhandler.readdata(self.zk_conn, '/networks/{}/ip4_network'.format(self.vni))
self.ip4_cidrnetmask = zkhandler.readdata(self.zk_conn, '/networks/{}/ip4_network'.format(self.vni)).split('/')[-1]
self.dhcp4_flag = (zkhandler.readdata(self.zk_conn, '/networks/{}/dhcp4_flag'.format(self.vni)) == 'True')
self.dhcp4_start = (zkhandler.readdata(self.zk_conn, '/networks/{}/dhcp4_start'.format(self.vni)) == 'True')
self.dhcp4_end = (zkhandler.readdata(self.zk_conn, '/networks/{}/dhcp4_end'.format(self.vni)) == 'True')
self.ip6_gateway = self.zkhandler.read(('network.ip6.gateway', self.vni))
self.ip6_network = self.zkhandler.read(('network.ip6.network', self.vni))
self.ip6_cidrnetmask = self.zkhandler.read(('network.ip6.network', self.vni)).split('/')[-1]
self.dhcp6_flag = self.zkhandler.read(('network.ip6.dhcp', self.vni))
self.ip4_gateway = self.zkhandler.read(('network.ip4.gateway', self.vni))
self.ip4_network = self.zkhandler.read(('network.ip4.network', self.vni))
self.ip4_cidrnetmask = self.zkhandler.read(('network.ip4.network', self.vni)).split('/')[-1]
self.dhcp4_flag = self.zkhandler.read(('network.ip4.dhcp', self.vni))
self.dhcp4_start = self.zkhandler.read(('network.ip4.dhcp_start', self.vni))
self.dhcp4_end = self.zkhandler.read(('network.ip4.dhcp_end', self.vni))
self.vxlan_nic = 'vxlan{}'.format(self.vni)
self.bridge_nic = 'vmbr{}'.format(self.vni)
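
As the watcher decorators in this file show, the new handler exposes its schema object, so a dotted key plus an object ID resolves to a concrete path (for example self.zkhandler.schema.path('network.ip4.gateway', self.vni)). A hedged sketch of how read() might build on that resolution; the body below is an assumption for illustration, only schema.path() and zk_conn are taken from this diff:

# Hypothetical sketch of a schema-keyed read (not the actual ZKHandler source):
def read(self, key):
    # key is either a plain string such as 'base.config.primary_node'
    # or a (schema_key, object_id) tuple such as ('network.ip4.gateway', vni)
    if isinstance(key, tuple):
        path = self.schema.path(*key)
    else:
        path = self.schema.path(key)
    data, _stat = self.zk_conn.get(path)
    return data.decode('utf8')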
@ -158,11 +157,11 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
vxlannic=self.vxlan_nic,
)
self.firewall_rules_in = zkhandler.listchildren(self.zk_conn, '/networks/{}/firewall_rules/in'.format(self.vni))
self.firewall_rules_out = zkhandler.listchildren(self.zk_conn, '/networks/{}/firewall_rules/out'.format(self.vni))
self.firewall_rules_in = self.zkhandler.children(('network.rule.in', self.vni))
self.firewall_rules_out = self.zkhandler.children(('network.rule.out', self.vni))
# Zookeeper handlers for changed states
@self.zk_conn.DataWatch('/networks/{}'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network', self.vni))
def watch_network_description(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -176,7 +175,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/domain'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.domain', self.vni))
def watch_network_domain(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -193,7 +192,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/name_servers'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.nameservers', self.vni))
def watch_network_name_servers(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -210,7 +209,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/ip6_network'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip6.network', self.vni))
def watch_network_ip6_network(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -225,7 +224,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/ip6_gateway'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip6.gateway', self.vni))
def watch_network_gateway6(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -247,7 +246,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/dhcp6_flag'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip6.dhcp', self.vni))
def watch_network_dhcp6_status(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -261,7 +260,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
elif self.dhcp_server_daemon and not self.dhcp4_flag and self.this_node.router_state in ['primary', 'takeover']:
self.stopDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/ip4_network'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip4.network', self.vni))
def watch_network_ip4_network(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -276,7 +275,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/ip4_gateway'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip4.gateway', self.vni))
def watch_network_gateway4(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -298,7 +297,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/dhcp4_flag'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip4.dhcp', self.vni))
def watch_network_dhcp4_status(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -312,7 +311,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
elif self.dhcp_server_daemon and not self.dhcp6_flag and self.this_node.router_state in ['primary', 'takeover']:
self.stopDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/dhcp4_start'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip4.dhcp_start', self.vni))
def watch_network_dhcp4_start(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -325,7 +324,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.DataWatch('/networks/{}/dhcp4_end'.format(self.vni))
@self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip4.dhcp_end', self.vni))
def watch_network_dhcp4_end(data, stat, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -338,7 +337,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.ChildrenWatch('/networks/{}/dhcp4_reservations'.format(self.vni))
@self.zkhandler.zk_conn.ChildrenWatch(self.zkhandler.schema.path('network.reservation', self.vni))
def watch_network_dhcp_reservations(new_reservations, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -354,7 +353,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.stopDHCPServer()
self.startDHCPServer()
@self.zk_conn.ChildrenWatch('/networks/{}/firewall_rules/in'.format(self.vni))
@self.zkhandler.zk_conn.ChildrenWatch(self.zkhandler.schema.path('network.rule.in', self.vni))
def watch_network_firewall_rules_in(new_rules, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -366,7 +365,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
self.firewall_rules_in = new_rules
self.updateFirewallRules()
@self.zk_conn.ChildrenWatch('/networks/{}/firewall_rules/out'.format(self.vni))
@self.zkhandler.zk_conn.ChildrenWatch(self.zkhandler.schema.path('network.rule.out', self.vni))
def watch_network_firewall_rules_out(new_rules, event=''):
if event and event.type == 'DELETED':
# The key has been deleted after existing before; terminate this watcher
@ -389,13 +388,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
if reservation not in old_reservations_list:
# Add new reservation file
filename = '{}/{}'.format(self.dnsmasq_hostsdir, reservation)
ipaddr = zkhandler.readdata(
self.zk_conn,
'/networks/{}/dhcp4_reservations/{}/ipaddr'.format(
self.vni,
reservation
)
)
ipaddr = self.zkhandler.read(('network.reservation', self.vni, 'reservation.ip', reservation))
entry = '{},{}'.format(reservation, ipaddr)
# Write the entry
with open(filename, 'w') as outfile:
@ -426,10 +419,10 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
full_ordered_rules = []
for acl in self.firewall_rules_in:
order = zkhandler.readdata(self.zk_conn, '/networks/{}/firewall_rules/in/{}/order'.format(self.vni, acl))
order = self.zkhandler.read(('network.rule.in', self.vni, 'rule.order', acl))
ordered_acls_in[order] = acl
for acl in self.firewall_rules_out:
order = zkhandler.readdata(self.zk_conn, '/networks/{}/firewall_rules/out/{}/order'.format(self.vni, acl))
order = self.zkhandler.read(('network.rule.out', self.vni, 'rule.order', acl))
ordered_acls_out[order] = acl
for order in sorted(ordered_acls_in.keys()):
@ -440,7 +433,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
for direction in 'in', 'out':
for acl in sorted_acl_list[direction]:
rule_prefix = "add rule inet filter vxlan{}-{} counter".format(self.vni, direction)
rule_data = zkhandler.readdata(self.zk_conn, '/networks/{}/firewall_rules/{}/{}/rule'.format(self.vni, direction, acl))
rule_data = self.zkhandler.read((f'network.rule.{direction}', self.vni, 'rule.rule', acl))
rule = '{} {}'.format(rule_prefix, rule_data)
full_ordered_rules.append(rule)
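
Each ACL stores its nftables fragment under its rule key; the loop above prefixes it with the per-network chain and direction. With the example ACL used in test-cluster.sh below, the assembled line would be (hypothetical VNI):

# Illustrative assembly of one inbound ACL into an nftables rule:
vni = 10001
direction = 'in'
rule_data = 'ip daddr 10.0.0.0/8 counter'  # per-ACL 'rule' value read from Zookeeper
rule_prefix = "add rule inet filter vxlan{}-{} counter".format(vni, direction)
print('{} {}'.format(rule_prefix, rule_data))
# -> add rule inet filter vxlan10001-in counter ip daddr 10.0.0.0/8 counter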
@ -459,7 +452,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
# Reload firewall rules
nftables_base_filename = '{}/base.nft'.format(self.config['nft_dynamic_directory'])
common.reload_firewall_rules(self.logger, nftables_base_filename)
common.reload_firewall_rules(nftables_base_filename, logger=self.logger)
# Create bridged network configuration
def createNetworkBridged(self):
@ -805,7 +798,7 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
# Reload firewall rules
nftables_base_filename = '{}/base.nft'.format(self.config['nft_dynamic_directory'])
common.reload_firewall_rules(self.logger, nftables_base_filename)
common.reload_firewall_rules(nftables_base_filename, logger=self.logger)
def removeGateways(self):
if self.nettype == 'managed':

View File

@ -1,302 +0,0 @@
#!/usr/bin/env python3
# common.py - PVC daemon function library, common functions
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import subprocess
import signal
from threading import Thread
from shlex import split as shlex_split
import pvcnoded.zkhandler as zkhandler
class OSDaemon(object):
def __init__(self, command_string, environment, logfile):
command = shlex_split(command_string)
# Set stdout to be a logfile if set
if logfile:
stdout = open(logfile, 'a')
else:
stdout = subprocess.PIPE
# Invoke the process
self.proc = subprocess.Popen(
command,
env=environment,
stdout=stdout,
stderr=stdout,
)
# Signal the process
def signal(self, sent_signal):
signal_map = {
'hup': signal.SIGHUP,
'int': signal.SIGINT,
'term': signal.SIGTERM,
'kill': signal.SIGKILL
}
self.proc.send_signal(signal_map[sent_signal])
def run_os_daemon(command_string, environment=None, logfile=None):
daemon = OSDaemon(command_string, environment, logfile)
return daemon
# Run a oneshot command, optionally without blocking
def run_os_command(command_string, background=False, environment=None, timeout=None):
command = shlex_split(command_string)
if background:
def runcmd():
try:
subprocess.run(
command,
env=environment,
timeout=timeout,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
except subprocess.TimeoutExpired:
pass
thread = Thread(target=runcmd, args=())
thread.start()
return 0, None, None
else:
try:
command_output = subprocess.run(
command,
env=environment,
timeout=timeout,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
)
retcode = command_output.returncode
except subprocess.TimeoutExpired:
retcode = 128
except Exception:
retcode = 255
try:
stdout = command_output.stdout.decode('ascii')
except Exception:
stdout = ''
try:
stderr = command_output.stderr.decode('ascii')
except Exception:
stderr = ''
return retcode, stdout, stderr
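
For reference, the helper above (dropped here from pvcnoded.common in favor of the daemon_lib.common import seen earlier) returns a (retcode, stdout, stderr) tuple; a typical blocking call, with a placeholder command, looks like:

# Example call of run_os_command as defined above (placeholder command):
retcode, stdout, stderr = run_os_command('ip -br link show', timeout=5)
if retcode != 0:
    print('Command failed: {}'.format(stderr))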
# Reload the firewall rules of the system
def reload_firewall_rules(logger, rules_file):
logger.out('Reloading firewall configuration', state='o')
retcode, stdout, stderr = run_os_command('/usr/sbin/nft -f {}'.format(rules_file))
if retcode != 0:
logger.out('Failed to reload configuration: {}'.format(stderr), state='e')
# Create IP address
def createIPAddress(ipaddr, cidrnetmask, dev):
run_os_command(
'ip address add {}/{} dev {}'.format(
ipaddr,
cidrnetmask,
dev
)
)
run_os_command(
'arping -P -U -W 0.02 -c 2 -i {dev} -S {ip} {ip}'.format(
dev=dev,
ip=ipaddr
)
)
# Remove IP address
def removeIPAddress(ipaddr, cidrnetmask, dev):
run_os_command(
'ip address delete {}/{} dev {}'.format(
ipaddr,
cidrnetmask,
dev
)
)
#
# Find a migration target
#
def findTargetNode(zk_conn, config, logger, dom_uuid):
# Determine VM node limits; set config value if read fails
try:
node_limit = zkhandler.readdata(zk_conn, '/domains/{}/node_limit'.format(dom_uuid)).split(',')
if not any(node_limit):
node_limit = ''
except Exception:
node_limit = ''
zkhandler.writedata(zk_conn, {'/domains/{}/node_limit'.format(dom_uuid): ''})
# Determine VM search field
try:
search_field = zkhandler.readdata(zk_conn, '/domains/{}/node_selector'.format(dom_uuid))
except Exception:
search_field = None
# If our search field is invalid, use and set the default (for next time)
if search_field is None or search_field == 'None':
search_field = config['migration_target_selector']
zkhandler.writedata(zk_conn, {'/domains/{}/node_selector'.format(dom_uuid): config['migration_target_selector']})
if config['debug']:
logger.out('Migrating VM {} with selector {}'.format(dom_uuid, search_field), state='d', prefix='node-flush')
# Execute the search
if search_field == 'mem':
return findTargetNodeMem(zk_conn, config, logger, node_limit, dom_uuid)
if search_field == 'load':
return findTargetNodeLoad(zk_conn, config, logger, node_limit, dom_uuid)
if search_field == 'vcpus':
return findTargetNodeVCPUs(zk_conn, config, logger, node_limit, dom_uuid)
if search_field == 'vms':
return findTargetNodeVMs(zk_conn, config, logger, node_limit, dom_uuid)
# Nothing was found
return None
# Get the list of valid target nodes
def getNodes(zk_conn, node_limit, dom_uuid):
valid_node_list = []
full_node_list = zkhandler.listchildren(zk_conn, '/nodes')
current_node = zkhandler.readdata(zk_conn, '/domains/{}/node'.format(dom_uuid))
for node in full_node_list:
if node_limit and node not in node_limit:
continue
daemon_state = zkhandler.readdata(zk_conn, '/nodes/{}/daemonstate'.format(node))
domain_state = zkhandler.readdata(zk_conn, '/nodes/{}/domainstate'.format(node))
if node == current_node:
continue
if daemon_state != 'run' or domain_state != 'ready':
continue
valid_node_list.append(node)
return valid_node_list
# via free memory (relative to allocated memory)
def findTargetNodeMem(zk_conn, config, logger, node_limit, dom_uuid):
most_provfree = 0
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
if config['debug']:
logger.out('Found nodes: {}'.format(node_list), state='d', prefix='node-flush')
for node in node_list:
memprov = int(zkhandler.readdata(zk_conn, '/nodes/{}/memprov'.format(node)))
memused = int(zkhandler.readdata(zk_conn, '/nodes/{}/memused'.format(node)))
memfree = int(zkhandler.readdata(zk_conn, '/nodes/{}/memfree'.format(node)))
memtotal = memused + memfree
provfree = memtotal - memprov
if config['debug']:
logger.out('Evaluating node {} with {} provfree'.format(node, provfree), state='d', prefix='node-flush')
if provfree > most_provfree:
most_provfree = provfree
target_node = node
if config['debug']:
logger.out('Selected node {}'.format(target_node), state='d', prefix='node-flush')
return target_node
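
The 'mem' selector above chooses the node with the most memory left after subtracting what is already provisioned (provfree = memused + memfree - memprov). A small worked example with made-up figures, in MB:

# Node A: memused=36000, memfree=28000, memprov=48000 -> provfree = 64000 - 48000 = 16000
# Node B: memused=20000, memfree=44000, memprov=60000 -> provfree = 64000 - 60000 = 4000
# Node A offers the most unprovisioned memory and would be selected as the target.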
# via load average
def findTargetNodeLoad(zk_conn, config, logger, node_limit, dom_uuid):
least_load = 9999.0
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
if config['debug']:
logger.out('Found nodes: {}'.format(node_list), state='d', prefix='node-flush')
for node in node_list:
load = float(zkhandler.readdata(zk_conn, '/nodes/{}/cpuload'.format(node)))
if config['debug']:
logger.out('Evaluating node {} with load {}'.format(node, load), state='d', prefix='node-flush')
if load < least_load:
least_load = load
target_node = node
if config['debug']:
logger.out('Selected node {}'.format(target_node), state='d', prefix='node-flush')
return target_node
# via total vCPUs
def findTargetNodeVCPUs(zk_conn, config, logger, node_limit, dom_uuid):
least_vcpus = 9999
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
if config['debug']:
logger.out('Found nodes: {}'.format(node_list), state='d', prefix='node-flush')
for node in node_list:
vcpus = int(zkhandler.readdata(zk_conn, '/nodes/{}/vcpualloc'.format(node)))
if config['debug']:
logger.out('Evaluating node {} with vcpualloc {}'.format(node, vcpus), state='d', prefix='node-flush')
if vcpus < least_vcpus:
least_vcpus = vcpus
target_node = node
if config['debug']:
logger.out('Selected node {}'.format(target_node), state='d', prefix='node-flush')
return target_node
# via total VMs
def findTargetNodeVMs(zk_conn, config, logger, node_limit, dom_uuid):
least_vms = 9999
target_node = None
node_list = getNodes(zk_conn, node_limit, dom_uuid)
if config['debug']:
logger.out('Found nodes: {}'.format(node_list), state='d', prefix='node-flush')
for node in node_list:
vms = int(zkhandler.readdata(zk_conn, '/nodes/{}/domainscount'.format(node)))
if config['debug']:
logger.out('Evaluating node {} with VM count {}'.format(node, vms), state='d', prefix='node-flush')
if vms < least_vms:
least_vms = vms
target_node = node
if config['debug']:
logger.out('Selected node {}'.format(target_node), state='d', prefix='node-flush')
return target_node

View File

@ -21,15 +21,14 @@
import time
import pvcnoded.zkhandler as zkhandler
import pvcnoded.common as common
import daemon_lib.common as common
import pvcnoded.VMInstance as VMInstance
#
# Fence thread entry function
#
def fenceNode(node_name, zk_conn, config, logger):
def fenceNode(node_name, zkhandler, config, logger):
# We allow exactly 6 saving throws (30 seconds) for the host to come back online or we kill it
failcount_limit = 6
failcount = 0
@ -37,7 +36,7 @@ def fenceNode(node_name, zk_conn, config, logger):
# Wait 5 seconds
time.sleep(config['keepalive_interval'])
# Get the state
node_daemon_state = zkhandler.readdata(zk_conn, '/nodes/{}/daemonstate'.format(node_name))
node_daemon_state = zkhandler.read(('node.state.daemon', node_name))
# Is it still 'dead'
if node_daemon_state == 'dead':
failcount += 1
@ -50,9 +49,9 @@ def fenceNode(node_name, zk_conn, config, logger):
logger.out('Fencing node "{}" via IPMI reboot signal'.format(node_name), state='w')
# Get IPMI information
ipmi_hostname = zkhandler.readdata(zk_conn, '/nodes/{}/ipmihostname'.format(node_name))
ipmi_username = zkhandler.readdata(zk_conn, '/nodes/{}/ipmiusername'.format(node_name))
ipmi_password = zkhandler.readdata(zk_conn, '/nodes/{}/ipmipassword'.format(node_name))
ipmi_hostname = zkhandler.read(('node.ipmi.hostname', node_name))
ipmi_username = zkhandler.read(('node.ipmi.username', node_name))
ipmi_password = zkhandler.read(('node.ipmi.password', node_name))
# Shoot it in the head
fence_status = rebootViaIPMI(ipmi_hostname, ipmi_username, ipmi_password, logger)
@ -62,47 +61,53 @@ def fenceNode(node_name, zk_conn, config, logger):
# Force into secondary network state if needed
if node_name in config['coordinators']:
logger.out('Forcing secondary status for node "{}"'.format(node_name), state='i')
zkhandler.writedata(zk_conn, {'/nodes/{}/routerstate'.format(node_name): 'secondary'})
if zkhandler.readdata(zk_conn, '/primary_node') == node_name:
zkhandler.writedata(zk_conn, {'/primary_node': 'none'})
zkhandler.write([
(('node.state.router', node_name), 'secondary')
])
if zkhandler.read('base.config.primary_node') == node_name:
zkhandler.write([
('base.config.primary_node', 'none')
])
# If the fence succeeded and successful_fence is migrate
if fence_status and config['successful_fence'] == 'migrate':
migrateFromFencedNode(zk_conn, node_name, config, logger)
migrateFromFencedNode(zkhandler, node_name, config, logger)
# If the fence failed and failed_fence is migrate
if not fence_status and config['failed_fence'] == 'migrate' and config['suicide_intervals'] != '0':
migrateFromFencedNode(zk_conn, node_name, config, logger)
migrateFromFencedNode(zkhandler, node_name, config, logger)
# Migrate hosts away from a fenced node
def migrateFromFencedNode(zk_conn, node_name, config, logger):
def migrateFromFencedNode(zkhandler, node_name, config, logger):
logger.out('Migrating VMs from dead node "{}" to new hosts'.format(node_name), state='i')
# Get the list of VMs
dead_node_running_domains = zkhandler.readdata(zk_conn, '/nodes/{}/runningdomains'.format(node_name)).split()
dead_node_running_domains = zkhandler.read(('node.running_domains', node_name)).split()
# Set the node to a custom domainstate so we know what's happening
zkhandler.writedata(zk_conn, {'/nodes/{}/domainstate'.format(node_name): 'fence-flush'})
zkhandler.write([
(('node.state.domain', node_name), 'fence-flush')
])
# Migrate a VM after a flush
def fence_migrate_vm(dom_uuid):
VMInstance.flush_locks(zk_conn, logger, dom_uuid)
VMInstance.flush_locks(zkhandler, logger, dom_uuid)
target_node = common.findTargetNode(zk_conn, config, logger, dom_uuid)
target_node = common.findTargetNode(zkhandler, dom_uuid)
if target_node is not None:
logger.out('Migrating VM "{}" to node "{}"'.format(dom_uuid, target_node), state='i')
zkhandler.writedata(zk_conn, {
'/domains/{}/state'.format(dom_uuid): 'start',
'/domains/{}/node'.format(dom_uuid): target_node,
'/domains/{}/lastnode'.format(dom_uuid): node_name
})
zkhandler.write([
(('domain.state', dom_uuid), 'start'),
(('domain.node', dom_uuid), target_node),
(('domain.last_node', dom_uuid), node_name),
])
else:
logger.out('No target node found for VM "{}"; VM will autostart on next unflush/ready of current node'.format(dom_uuid), state='i')
zkhandler.writedata(zk_conn, {
'/domains/{}/state'.format(dom_uuid): 'stopped',
'/domains/{}/node_autostart'.format(dom_uuid): 'True'
zkhandler.write([
(('domain.state', dom_uuid), 'stopped'),
(('domain.meta.autostart', dom_uuid), 'True'),
])
# Loop through the VMs
@ -110,7 +115,9 @@ def migrateFromFencedNode(zk_conn, node_name, config, logger):
fence_migrate_vm(dom_uuid)
# Set node in flushed state for easy remigrating when it comes back
zkhandler.writedata(zk_conn, {'/nodes/{}/domainstate'.format(node_name): 'flushed'})
zkhandler.write([
(('node.state.domain', node_name), 'flushed')
])
#
@ -126,31 +133,46 @@ def rebootViaIPMI(ipmi_hostname, ipmi_user, ipmi_password, logger):
if ipmi_reset_retcode != 0:
logger.out('Failed to reboot dead node', state='e')
print(ipmi_reset_stderr)
return False
time.sleep(1)
# Power on the node (just in case it is offline)
ipmi_command_start = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power on'.format(
ipmi_hostname, ipmi_user, ipmi_password
)
ipmi_start_retcode, ipmi_start_stdout, ipmi_start_stderr = common.run_os_command(ipmi_command_start)
time.sleep(2)
# Ensure the node is powered on
# Check the chassis power state
logger.out('Checking power state of dead node', state='i')
ipmi_command_status = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power status'.format(
ipmi_hostname, ipmi_user, ipmi_password
)
ipmi_status_retcode, ipmi_status_stdout, ipmi_status_stderr = common.run_os_command(ipmi_command_status)
# Trigger a power start if needed
if ipmi_status_stdout != "Chassis Power is on":
ipmi_command_start = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power on'.format(
ipmi_hostname, ipmi_user, ipmi_password
)
ipmi_start_retcode, ipmi_start_stdout, ipmi_start_stderr = common.run_os_command(ipmi_command_start)
if ipmi_start_retcode != 0:
logger.out('Failed to start powered-off dead node', state='e')
print(ipmi_start_stderr)
if ipmi_reset_retcode == 0:
if ipmi_status_stdout == "Chassis Power is on":
# We successfully rebooted the node and it is powered on; this is a successful fence
logger.out('Successfully rebooted dead node', state='o')
return True
elif ipmi_status_stdout == "Chassis Power is off":
# We successfully rebooted the node but it is powered off; this might be expected or not, but the node is confirmed off so we can call it a successful fence
logger.out('Chassis power is in confirmed off state after successful IPMI reboot; proceeding with fence-flush', state='o')
return True
else:
# We successfully rebooted the node but it is in some unknown power state; since this might indicate a silent failure, we must call it a failed fence
logger.out('Chassis power is in an unknown state after successful IPMI reboot; not performing fence-flush', state='e')
return False
else:
if ipmi_status_stdout == "Chassis Power is off":
# We failed to reboot the node but it is powered off; it has probably suffered a serious hardware failure, but the node is confirmed off so we can call it a successful fence
logger.out('Chassis power is in confirmed off state after failed IPMI reboot; proceeding with fence-flush', state='o')
return True
else:
# We failed to reboot the node but it is in some unknown power state (including "on"); since this might indicate a silent failure, we must call it a failed fence
logger.out('Chassis power is not in confirmed off state after failed IPMI reboot; not performing fence-flush', state='e')
return False
# Declare success
logger.out('Successfully rebooted dead node', state='o')
return True
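
The branch ladder above encodes a simple decision matrix: the IPMI reset result crossed with the chassis power state. A condensed sketch of the same policy, for illustration only (not part of the module):

def fence_succeeded(reset_ok, power_state):
    if reset_ok:
        # A successful reboot fences the node unless the power state is unknown
        return power_state in ('Chassis Power is on', 'Chassis Power is off')
    # A failed reboot only counts as a fence if the node is confirmed off
    return power_state == 'Chassis Power is off'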
#

View File

@ -1,189 +0,0 @@
#!/usr/bin/env python3
# zkhandler.py - Secure versioned ZooKeeper updates
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2021 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import uuid
# Child list function
def listchildren(zk_conn, key):
try:
children = zk_conn.get_children(key)
return children
except Exception:
return None
# Key deletion function
def deletekey(zk_conn, key, recursive=True):
try:
zk_conn.delete(key, recursive=recursive)
return True
except Exception:
return False
# Data read function
def readdata(zk_conn, key):
try:
data_raw = zk_conn.get(key)
data = data_raw[0].decode('utf8')
return data
except Exception:
return None
# Data write function
def writedata(zk_conn, kv):
# Commit the transaction
try:
# Start up a transaction
zk_transaction = zk_conn.transaction()
# Proceed one KV pair at a time
for key in sorted(kv):
data = kv[key]
if not data:
data = ''
# Check if this key already exists or not
if not zk_conn.exists(key):
# We're creating a new key
zk_transaction.create(key, str(data).encode('utf8'))
else:
# We're updating a key with version validation
orig_data = zk_conn.get(key)
version = orig_data[1].version
# Set what we expect the new version to be
new_version = version + 1
# Update the data
zk_transaction.set_data(key, str(data).encode('utf8'))
# Set up the check
try:
zk_transaction.check(key, new_version)
except TypeError:
print('Zookeeper key "{}" does not match expected version'.format(key))
return False
zk_transaction.commit()
return True
except Exception:
return False
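
The removed writedata() above performs check-and-set updates inside a single transaction: each existing key gets a set_data plus a version check against the version read just beforehand, so a concurrent modification aborts the whole batch. A typical call took a path-to-value dict and returned a boolean (placeholder UUID and node name):

# Example usage of the old path-based writedata:
result = writedata(zk_conn, {
    '/domains/example-uuid/state': 'start',
    '/domains/example-uuid/node': 'hv1',
})
# result is True if the transaction committed, False otherwise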
# Key rename function
def renamekey(zk_conn, kv):
# This one is not transactional because, inexplicably, transactions don't
# support either the recursive delete or recursive create operations that
# we need. Why? No explanation in the docs that I can find.
try:
# Proceed one KV pair at a time
for key in sorted(kv):
old_name = key
new_name = kv[key]
old_data = zk_conn.get(old_name)[0]
child_keys = list()
# Find the children of old_name recursively
def get_children(key):
children = zk_conn.get_children(key)
if not children:
child_keys.append(key)
else:
for ckey in children:
get_children('{}/{}'.format(key, ckey))
get_children(old_name)
# Get the data out of each of the child keys
child_data = dict()
for ckey in child_keys:
child_data[ckey] = zk_conn.get(ckey)[0]
# Create the new parent key
zk_conn.create(new_name, old_data, makepath=True)
# For each child key, create the key and add the data
for ckey in child_keys:
new_ckey_name = ckey.replace(old_name, new_name)
zk_conn.create(new_ckey_name, child_data[ckey], makepath=True)
# Remove recursively the old key
zk_conn.delete(old_name, recursive=True)
return True
except Exception:
return False
# Write lock function
def writelock(zk_conn, key):
count = 1
while True:
try:
lock_id = str(uuid.uuid1())
lock = zk_conn.WriteLock('{}'.format(key), lock_id)
break
except Exception:
count += 1
if count > 5:
break
else:
continue
return lock
# Read lock function
def readlock(zk_conn, key):
count = 1
while True:
try:
lock_id = str(uuid.uuid1())
lock = zk_conn.ReadLock('{}'.format(key), lock_id)
break
except Exception:
count += 1
if count > 5:
break
else:
continue
return lock
# Exclusive lock function
def exclusivelock(zk_conn, key):
count = 1
while True:
try:
lock_id = str(uuid.uuid1())
lock = zk_conn.Lock('{}'.format(key), lock_id)
break
except Exception:
count += 1
if count > 5:
break
else:
continue
return lock
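
The three lock helpers above wrap kazoo's Lock, ReadLock and WriteLock recipes, retrying construction up to five times; callers acquire and release the returned object themselves, for example:

# Example usage of the removed lock helpers (placeholder key):
lock = writelock(zk_conn, '/locks/domain_migrate/example-uuid')
lock.acquire()
try:
    pass  # critical section: update the domain keys
finally:
    lock.release()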

File diff suppressed because it is too large

test-cluster.sh Executable file
View File

@ -0,0 +1,148 @@
#!/usr/bin/env bash
set -o errexit
if [[ -z ${1} ]]; then
echo "Please specify a cluster to run tests against."
exit 1
fi
test_cluster="${1}"
_pvc() {
echo "> pvc --cluster ${test_cluster} $@"
pvc --quiet --cluster ${test_cluster} "$@"
sleep 1
}
time_start=$(date +%s)
# Cluster tests
_pvc maintenance on
_pvc maintenance off
backup_tmp=$(mktemp)
_pvc task backup --file ${backup_tmp}
_pvc task restore --yes --file ${backup_tmp}
rm ${backup_tmp} || true
# Provisioner tests
_pvc provisioner profile list test
_pvc provisioner create --wait testX test
sleep 30
# VM tests
vm_tmp=$(mktemp)
_pvc vm dump testX --file ${vm_tmp}
_pvc vm shutdown --yes --wait testX
_pvc vm start testX
sleep 30
_pvc vm stop --yes testX
_pvc vm disable testX
_pvc vm undefine --yes testX
_pvc vm define --target hv3 --tag pvc-test ${vm_tmp}
_pvc vm start testX
sleep 30
_pvc vm restart --yes --wait testX
sleep 30
_pvc vm migrate --wait testX
sleep 5
_pvc vm unmigrate --wait testX
sleep 5
_pvc vm move --wait --target hv1 testX
sleep 5
_pvc vm meta testX --limit hv1 --selector vms --method live --profile test --no-autostart
_pvc vm tag add testX mytag
_pvc vm tag get testX
_pvc vm list --tag mytag
_pvc vm tag remove testX mytag
_pvc vm network get testX
_pvc vm vcpu set testX 4
_pvc vm vcpu get testX
_pvc vm memory set testX 4096
_pvc vm memory get testX
_pvc vm vcpu set testX 2
_pvc vm memory set testX 2048 --restart --yes
sleep 5
_pvc vm list testX
_pvc vm info --long testX
rm ${vm_tmp} || true
# Node tests
_pvc node primary --wait hv1
sleep 10
_pvc node secondary --wait hv1
sleep 10
_pvc node primary --wait hv1
sleep 10
_pvc node flush --wait hv1
_pvc node ready --wait hv1
_pvc node list hv1
_pvc node info hv1
# Network tests
_pvc network add 10001 --description testing --type managed --domain testing.local --ipnet 10.100.100.0/24 --gateway 10.100.100.1 --dhcp --dhcp-start 10.100.100.100 --dhcp-end 10.100.100.199
sleep 5
_pvc vm network add --restart --yes testX 10001
sleep 30
_pvc vm network remove --restart --yes testX 10001
sleep 5
_pvc network acl add 10001 --in --description test-acl --order 0 --rule "'ip daddr 10.0.0.0/8 counter'"
_pvc network acl list 10001
_pvc network acl remove --yes 10001 test-acl
_pvc network dhcp add 10001 10.100.100.200 test99 12:34:56:78:90:ab
_pvc network dhcp list 10001
_pvc network dhcp remove --yes 10001 12:34:56:78:90:ab
_pvc network modify --domain test10001.local 10001
_pvc network list
_pvc network info --long 10001
# Network-VM interaction tests
_pvc vm network add testX 10001 --model virtio --restart --yes
sleep 30
_pvc vm network get testX
_pvc vm network remove testX 10001 --restart --yes
sleep 5
_pvc network remove --yes 10001
# Storage tests
_pvc storage status
_pvc storage util
_pvc storage osd set noout
_pvc storage osd out 0
_pvc storage osd in 0
_pvc storage osd unset noout
_pvc storage osd list
_pvc storage pool add testing 64 --replcfg "copies=3,mincopies=2"
sleep 5
_pvc storage pool list
_pvc storage volume add testing testX 1G
_pvc storage volume resize testing testX 2G
_pvc storage volume rename testing testX testerX
_pvc storage volume clone testing testerX testerY
_pvc storage volume list --pool testing
_pvc storage volume snapshot add testing testerX asnapshotX
_pvc storage volume snapshot rename testing testerX asnapshotX asnapshotY
_pvc storage volume snapshot list
_pvc storage volume snapshot remove --yes testing testerX asnapshotY
# Storage-VM interaction tests
_pvc vm volume add testX --type rbd --disk-id sdh --bus scsi testing/testerY --restart --yes
sleep 30
_pvc vm volume get testX
_pvc vm volume remove testX testing/testerY --restart --yes
sleep 5
_pvc storage volume remove --yes testing testerY
_pvc storage volume remove --yes testing testerX
_pvc storage pool remove --yes testing
# Remove the VM
_pvc vm stop --yes testX
_pvc vm remove --yes testX
time_end=$(date +%s)
echo
echo "Completed PVC functionality tests against cluster ${test_cluster} in $(( ${time_end} - ${time_start} )) seconds."