Compare commits

...

48 Commits

SHA1 Message Date
5b4dd61754 Bump version to 0.9.80 2023-10-27 09:56:31 -04:00
2fccbcda89 Add enhancements to autobackup
1. Add a cron mode to avoid exit(1) during cronjobs/timers
2. Revamp the remote_mount settings into auto_mount
   This removes a lot of unnecessary complexity while giving the
   administrator more flexibility in what they want to execute to mount
   a filesystem and how. The naming reflects the goal but the possible
   commands are arbitrary.
2023-10-27 02:07:24 -04:00
6ad51ea4bb Handle store exceptions in cli() function
Avoids having an unsuppressable error message in some contexts, and
provides a cleaner module.
2023-10-26 23:30:22 -04:00
5954feaa31 Add autobackup functionality to CLI
Adds autobackup (integrated, managed VM backups with automatic remote
filesystem mounting, included backup expiry/removal and automatic
full/incremental selection, independent from the manual "pvc vm backup"
commands) to the CLI client.

This is a bit of a large command to handle only inside the CLI client,
but this was chosen as it's the only real place for it aside from an
external script.

There are several major restrictions on this command, mainly that it
must be run from the primary coordinator using the "local" connection,
and that it must be run as "root".

The command is designed to run in a cron/systemd timer installed by
pvc-ansible when the appropriate group_vars are enabled, and otherwise
not touched.
2023-10-26 21:25:23 -04:00
e63d8e59e9 Install sample configs to /usr/share/pvc instead
Also clean up the old versions in the postinst as they are obsolete and
not needed going forward. These only ever served as reference for a
manual installation which itself is long-obsoleted, and thus can be put
somewhere less "important".
2023-10-26 13:00:54 -04:00
82b0301c0e Improve audit log output
Show the full command path in the actual audit log message, but still
only show the command name in the prefix.
2023-10-25 09:48:48 -04:00
2ee2b2cb33 Avoid loading pkg_resources until needed
This import took forever (0.2s) and was used only for the version
command, so don't import it except where it's needed.
2023-10-25 01:51:08 -04:00
198d083ea6 Remove old CLI code 2023-10-25 01:40:26 -04:00
1306054a98 Readd images for README 2023-10-24 10:59:40 -04:00
221af3f241 Bump version to 0.9.79 2023-10-24 02:10:24 -04:00
35f80e544c Use more hierarchical backup path structure 2023-10-24 02:04:16 -04:00
83b937654c Avoid removing nonexistent snapshots
Store retain_snapshot in JSON and use that to check during delete.
2023-10-24 01:35:00 -04:00
714bde89e6 Fix incorrect variable ref 2023-10-24 01:25:01 -04:00
c87736eb0a Use consistent path name and format 2023-10-24 01:20:44 -04:00
63d0a85e29 Add backup deletion command 2023-10-24 01:18:27 -04:00
43e8cd3b07 Clarify restore help text 2023-10-24 00:32:53 -04:00
55ca131c2c Handle snapshots on restore and provide options
Also rename the retain option to remove superfluous plural.
2023-10-24 00:25:06 -04:00
0769f1ea52 Increase service start time to 10s 2023-10-23 22:24:03 -04:00
c858ae8fed Improve waiting in build-and-deploy 2023-10-23 22:23:48 -04:00
8d256a1737 Complete VM restore functionality 2023-10-23 22:23:17 -04:00
d3b3fdfc80 Revert "Export backup images to a tar archive"
This reverts commit 38abd078af.
2023-10-23 11:01:16 -04:00
f1b29ea94e Initial VM restore work 2023-10-23 11:00:54 -04:00
38abd078af Export backup images to a tar archive
This helps ensure an easier restore as the tar archive(s) can be sent
directly to the API via the normal process of image uploading, instead
of individual disks.
2023-10-23 09:56:50 -04:00
fabb97cf48 Only split a command_string if it's not a list 2023-10-23 09:50:58 -04:00
50aabde320 Ensure bond count is compared with actual qty 2023-10-22 02:28:04 -04:00
68124db323 Remove extra spaces 2023-10-17 13:01:38 -04:00
8921efd269 Fix incorrect tuple construct 2023-10-17 12:55:44 -04:00
3e259bd926 Add state confirmation to newline 2023-10-17 12:53:20 -04:00
3d12915989 Further improve return messages 2023-10-17 12:53:08 -04:00
67b0b19bca Use better time functionality 2023-10-17 12:39:37 -04:00
5d0c674d1d Add runtime and adjust ordering 2023-10-17 12:32:40 -04:00
f3bc4dee04 Fix ordering of empty line 2023-10-17 12:27:06 -04:00
f441b0d823 Improve missing parent message 2023-10-17 12:17:29 -04:00
fd2331faa6 Add waiting message during backup 2023-10-17 12:16:31 -04:00
a5d0f219e4 Improve return messages 2023-10-17 12:10:55 -04:00
0169510df0 Fix up datestring generation 2023-10-17 12:05:45 -04:00
a58c1d5a8c Fix bad snapshot removals 2023-10-17 12:02:24 -04:00
a8e4b01b67 Handle return data even better 2023-10-17 11:51:03 -04:00
45c4c86911 Handle extra return variable 2023-10-17 11:47:01 -04:00
6448b31d2c Improve VM list arguments
Use kwargs here instead of fixed args to allow default None values.
2023-10-17 11:01:38 -04:00
4fc9b15652 Fix bad function name 2023-10-17 10:56:32 -04:00
75b839692b Fix missing comma 2023-10-17 10:51:30 -04:00
751cfe0b29 Use consistent shebangs in scripts 2023-10-17 10:35:38 -04:00
b997c6f31e Add support for full VM backups
Adds support for exporting full VM backups, including configuration,
metainfo, and RBD disk images, with incremental support.
2023-10-17 10:15:06 -04:00
6e83300d78 Increase ipmi plugin timeout 2023-10-04 19:21:59 -04:00
522da3fd95 Adjust wording for volume create too 2023-10-03 09:42:23 -04:00
3a1bf0724e Mention file_size as bytes 2023-10-03 09:39:19 -04:00
ee494fb1c0 Adjust the help text of storage pools
Makes some places clearer, cleans up cruft, and adds references to the
main documentation as required.
2023-10-02 11:46:12 -04:00
52 changed files with 1626 additions and 16539 deletions

View File

@@ -1 +1 @@
-0.9.78
+0.9.80

View File

@@ -1,5 +1,23 @@
## PVC Changelog
+###### [v0.9.80](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.80)
+* [CLI] Improves CLI performance by not loading "pkg_resources" until needed
+* [CLI] Improves the output of the audit log (full command paths)
+* [Node Daemon/API Daemon] Moves the sample YAML configurations to /usr/share/pvc instead of /etc/pvc and cleans up the old locations automatically
+* [CLI] Adds VM autobackup functionality to automate VM backup/retention and scheduling
+* [CLI] Handles the internal store in a better way to ensure CLI can be used as a module properly
+###### [v0.9.79](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.79)
+**API Changes**: New endpoints /vm/{vm}/backup, /vm/{vm}/restore
+* [CLI Client] Fixes some storage pool help text messages
+* [Node Daemon] Increases the IPMI monitoring plugin timeout
+* [All] Adds support for VM backups, including creation, removal, and restore
+* [Repository] Fixes shebangs in scripts to be consistent
+* [Daemon Library] Improves the handling of VM list arguments (default None)
###### [v0.9.78](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.78)
* [API, Client CLI] Fixes several bugs around image uploads; adds a new query parameter for non-raw images
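
The "pkg_resources" bullet above refers to deferring a slow import out of CLI startup. As an illustrative sketch only (the function below is hypothetical, not the actual CLI code):

def get_version_string():
    # pkg_resources takes roughly 0.2s to import, so load it only inside
    # the one command that needs it rather than at module scope.
    from pkg_resources import get_distribution

    return get_distribution("pvc").version  # distribution name assumed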

View File

@@ -27,7 +27,7 @@ from ssl import SSLContext, TLSVersion
from distutils.util import strtobool as dustrtobool
# Daemon version
version = "0.9.78"
version = "0.9.80"
# API version
API_VERSION = 1.0

View File

@@ -2140,7 +2140,7 @@ class API_VM_Locks(Resource):
api.add_resource(API_VM_Locks, "/vm/<vm>/locks")


-# /vm/<vm</console
+# /vm/<vm>/console
class API_VM_Console(Resource):
    @RequestParser([{"name": "lines"}])
    @Authenticator
@@ -2293,6 +2293,202 @@ class API_VM_Device(Resource):
api.add_resource(API_VM_Device, "/vm/<vm>/device")


# /vm/<vm>/backup
class API_VM_Backup(Resource):
    @RequestParser(
        [
            {
                "name": "backup_path",
                "required": True,
                "helptext": "A local filesystem path on the primary coordinator must be specified",
            },
            {
                "name": "incremental_parent",
                "required": False,
            },
            {
                "name": "retain_snapshot",
                "required": False,
            },
        ]
    )
    @Authenticator
    def post(self, vm, reqargs):
        """
        Create a backup of {vm} and its volumes to a local primary coordinator filesystem path
        ---
        tags:
          - vm
        parameters:
          - in: query
            name: backup_path
            type: string
            required: true
            description: A local filesystem path on the primary coordinator to store the backup
          - in: query
            name: incremental_parent
            type: string
            required: false
            description: A previous backup datestamp to use as an incremental parent; if unspecified a full backup is taken
          - in: query
            name: retain_snapshot
            type: boolean
            required: false
            default: false
            description: Whether or not to retain this backup's volume snapshots to use as a future incremental parent; full backups only
        responses:
          200:
            description: OK
            schema:
              type: object
              id: Message
          400:
            description: Execution error
            schema:
              type: object
              id: Message
          404:
            description: Not found
            schema:
              type: object
              id: Message
        """
        backup_path = reqargs.get("backup_path", None)
        incremental_parent = reqargs.get("incremental_parent", None)
        retain_snapshot = bool(strtobool(reqargs.get("retain_snapshot", "false")))
        return api_helper.vm_backup(
            vm, backup_path, incremental_parent, retain_snapshot
        )

    @RequestParser(
        [
            {
                "name": "backup_path",
                "required": True,
                "helptext": "A local filesystem path on the primary coordinator must be specified",
            },
            {
                "name": "backup_datestring",
                "required": True,
                "helptext": "A backup datestring must be specified",
            },
        ]
    )
    @Authenticator
    def delete(self, vm, reqargs):
        """
        Remove a backup of {vm}, including snapshots, from a local primary coordinator filesystem path
        ---
        tags:
          - vm
        parameters:
          - in: query
            name: backup_path
            type: string
            required: true
            description: A local filesystem path on the primary coordinator where the backup is stored
          - in: query
            name: backup_datestring
            type: string
            required: true
            description: The backup datestring identifier (e.g. 20230102030405)
        responses:
          200:
            description: OK
            schema:
              type: object
              id: Message
          400:
            description: Execution error
            schema:
              type: object
              id: Message
          404:
            description: Not found
            schema:
              type: object
              id: Message
        """
        backup_path = reqargs.get("backup_path", None)
        backup_datestring = reqargs.get("backup_datestring", None)
        return api_helper.vm_remove_backup(vm, backup_path, backup_datestring)


api.add_resource(API_VM_Backup, "/vm/<vm>/backup")


# /vm/<vm>/restore
class API_VM_Restore(Resource):
    @RequestParser(
        [
            {
                "name": "backup_path",
                "required": True,
                "helptext": "A local filesystem path on the primary coordinator must be specified",
            },
            {
                "name": "backup_datestring",
                "required": True,
                "helptext": "A backup datestring must be specified",
            },
            {
                "name": "retain_snapshot",
                "required": False,
            },
        ]
    )
    @Authenticator
    def post(self, vm, reqargs):
        """
        Restore a backup of {vm} and its volumes from a local primary coordinator filesystem path
        ---
        tags:
          - vm
        parameters:
          - in: query
            name: backup_path
            type: string
            required: true
            description: A local filesystem path on the primary coordinator where the backup is stored
          - in: query
            name: backup_datestring
            type: string
            required: true
            description: The backup datestring identifier (e.g. 20230102030405)
          - in: query
            name: retain_snapshot
            type: boolean
            required: false
            default: true
            description: Whether or not to retain the (parent, if incremental) volume snapshot after restore
        responses:
          200:
            description: OK
            schema:
              type: object
              id: Message
          400:
            description: Execution error
            schema:
              type: object
              id: Message
          404:
            description: Not found
            schema:
              type: object
              id: Message
        """
        backup_path = reqargs.get("backup_path", None)
        backup_datestring = reqargs.get("backup_datestring", None)
        retain_snapshot = bool(strtobool(reqargs.get("retain_snapshot", "true")))
        return api_helper.vm_restore(
            vm, backup_path, backup_datestring, retain_snapshot
        )


api.add_resource(API_VM_Restore, "/vm/<vm>/restore")
##########################################################
# Client API - Network
##########################################################
@@ -4843,7 +5039,7 @@ class API_Storage_Ceph_Volume_Root(Resource):
            {
                "name": "size",
                "required": True,
-                "helptext": "A volume size in bytes (or with k/M/G/T suffix) must be specified.",
+                "helptext": "A volume size in bytes (B implied or with SI suffix k/M/G/T) must be specified.",
            },
        ]
    )
@@ -4869,7 +5065,7 @@ class API_Storage_Ceph_Volume_Root(Resource):
            name: size
            type: string
            required: true
-            description: The volume size in bytes (or with a metric suffix, i.e. k/M/G/T)
+            description: The volume size, in bytes (B implied) or with a single-character SI suffix (k/M/G/T)
        responses:
          200:
            description: OK
@@ -5122,7 +5318,7 @@ class API_Storage_Ceph_Volume_Element_Upload(Resource):
            name: file_size
            type: integer
            required: false
-            description: The size of the image file, if {image_format} is not "raw"
+            description: The size of the image file, in bytes, if {image_format} is not "raw"
        responses:
          200:
            description: OK

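As a usage illustration for the backup and restore endpoints added above: a client can drive them with plain HTTP calls. This is a sketch, assuming an API base URL and key (placeholders below); the X-Api-Key header matches what the CLI's call_api() helper sends, and the query parameters mirror the RequestParser definitions.

import requests

API = "http://pvc.local:7370/api/v1"  # assumed scheme/host/port/prefix
HEADERS = {"X-Api-Key": "0123456789abcdef"}  # placeholder API key

# Take a full backup of VM "web1", retaining its snapshots so a later
# incremental backup can use them as a parent
r = requests.post(
    f"{API}/vm/web1/backup",
    headers=HEADERS,
    params={"backup_path": "/srv/backups", "retain_snapshot": "true"},
)
print(r.status_code, r.json().get("message", ""))

# Restore that backup by its datestring identifier
r = requests.post(
    f"{API}/vm/web1/restore",
    headers=HEADERS,
    params={"backup_path": "/srv/backups", "backup_datestring": "20231027000000"},
)
print(r.status_code, r.json().get("message", ""))
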
View File

@@ -470,6 +470,88 @@ def vm_define(
    return output, retcode


@ZKConnection(config)
def vm_backup(
    zkhandler,
    domain,
    backup_path,
    incremental_parent=None,
    retain_snapshot=False,
):
    """
    Back up a VM to a local (primary coordinator) filesystem path.
    """
    retflag, retdata = pvc_vm.backup_vm(
        zkhandler,
        domain,
        backup_path,
        incremental_parent,
        retain_snapshot,
    )

    if retflag:
        retcode = 200
    else:
        retcode = 400

    output = {"message": retdata.replace('"', "'")}
    return output, retcode


@ZKConnection(config)
def vm_remove_backup(
    zkhandler,
    domain,
    source_path,
    datestring,
):
    """
    Remove a VM backup from snapshots and a local (primary coordinator) filesystem path.
    """
    retflag, retdata = pvc_vm.remove_backup(
        zkhandler,
        domain,
        source_path,
        datestring,
    )

    if retflag:
        retcode = 200
    else:
        retcode = 400

    output = {"message": retdata.replace('"', "'")}
    return output, retcode


@ZKConnection(config)
def vm_restore(
    zkhandler,
    domain,
    backup_path,
    datestring,
    retain_snapshot=False,
):
    """
    Restore a VM from a local (primary coordinator) filesystem path.
    """
    retflag, retdata = pvc_vm.restore_vm(
        zkhandler,
        domain,
        backup_path,
        datestring,
        retain_snapshot,
    )

    if retflag:
        retcode = 200
    else:
        retcode = 400

    output = {"message": retdata.replace('"', "'")}
    return output, retcode


@ZKConnection(config)
def vm_attach_device(zkhandler, vm, device_spec_xml):
    """

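The three new helpers above share an identical tail: map the (retflag, retdata) pair returned by the common library into a ({"message": ...}, HTTP-code) tuple. Purely as a sketch of that contract (this refactor does not exist in the codebase):

def to_api_response(retflag, retdata):
    # 200 on success, 400 on execution error, as in the helpers above;
    # double quotes are swapped for singles so the message embeds cleanly
    # in the JSON response body.
    retcode = 200 if retflag else 400
    return {"message": retdata.replace('"', "'")}, retcode
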
View File

@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
# A useful script for testing out changes to PVC by building the debs and deploying them out to a
# set of hosts automatically, including restarting the daemon (with a pause between) on the remote
@@ -36,34 +36,37 @@ echo "Preparing code (format and lint)..."
./lint || exit 1
# Build the packages
echo -n "Building packages... "
echo -n "Building packages..."
version="$( ./build-unstable-deb.sh 2>/dev/null )"
echo "done. Package version ${version}."
echo " done. Package version ${version}."
# Install the client(s) locally
echo -n "Installing client packages locally... "
echo -n "Installing client packages locally..."
$SUDO dpkg -i ../pvc-client*_${version}*.deb &>/dev/null
echo "done".
echo " done".
for HOST in ${HOSTS[@]}; do
    echo "> Deploying packages to host ${HOST}"
-    echo -n "Copying packages... "
+    echo -n "Copying packages..."
    ssh $HOST $SUDO rm -rf /tmp/pvc &>/dev/null
    ssh $HOST mkdir /tmp/pvc &>/dev/null
    scp ../pvc-*_${version}*.deb $HOST:/tmp/pvc/ &>/dev/null
-    echo "done."
-    echo -n "Installing packages... "
+    echo " done."
+    echo -n "Installing packages..."
    ssh $HOST $SUDO dpkg -i /tmp/pvc/{pvc-client-cli,pvc-daemon-common,pvc-daemon-api,pvc-daemon-node}*.deb &>/dev/null
    ssh $HOST rm -rf /tmp/pvc &>/dev/null
-    echo "done."
-    echo -n "Restarting PVC daemons... "
+    echo " done."
+    echo -n "Restarting PVC daemons..."
    ssh $HOST $SUDO systemctl restart pvcapid &>/dev/null
    ssh $HOST $SUDO systemctl restart pvcapid-worker &>/dev/null
    ssh $HOST $SUDO systemctl restart pvcnoded &>/dev/null
-    echo "done."
-    echo -n "Waiting 30s for host to stabilize... "
-    sleep 30
-    echo "done."
+    echo " done."
+    echo -n "Waiting for node daemon to be running..."
+    while [[ $( ssh $HOST "pvc -q node list -f json ${HOST%%.*} | jq -r '.[].daemon_state'" ) != "run" ]]; do
+        sleep 5
+        echo -n "."
+    done
+    echo " done."
done
if [[ -z ${KEEP_ARTIFACTS} ]]; then
    rm ../pvc*_${version}*

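The new loop in build-and-deploy polls the node's daemon_state rather than sleeping a fixed 30 seconds. A Python rendering of the same readiness check, for illustration (host and node names are placeholders; the JSON list shape follows the jq filter used above):

import json
import subprocess
import time


def wait_for_daemon(host, node):
    # Poll `pvc -q node list -f json <node>` over SSH every 5 seconds
    # until the node daemon reports the "run" state.
    while True:
        out = subprocess.run(
            ["ssh", host, f"pvc -q node list -f json {node}"],
            capture_output=True,
            text=True,
        ).stdout
        if json.loads(out)[0]["daemon_state"] == "run":
            return
        time.sleep(5)
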
View File

@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
pushd $( git rev-parse --show-toplevel ) &>/dev/null
ver="$( head -1 debian/changelog | awk -F'[()-]' '{ print $2 }' )"
git pull

View File

@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
set -o xtrace
exec 3>&1
exec 1>&2

View File

@@ -1,33 +0,0 @@
#!/usr/bin/env python3
# pvc.py - PVC client command-line interface (stub testing interface)
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import pvc.pvc
#
# Main entry point
#
def main():
return pvc.pvc.cli(obj={})
if __name__ == "__main__":
main()

View File

@@ -1,97 +0,0 @@
#!/usr/bin/env python3
# ansiprint.py - Printing function for formatted messages
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import datetime
# ANSI colours for output
def red():
return "\033[91m"
def blue():
return "\033[94m"
def cyan():
return "\033[96m"
def green():
return "\033[92m"
def yellow():
return "\033[93m"
def purple():
return "\033[95m"
def bold():
return "\033[1m"
def end():
return "\033[0m"
# Print function
def echo(message, prefix, state):
# Get the date
date = "{} - ".format(datetime.datetime.now().strftime("%Y/%m/%d %H:%M:%S.%f"))
endc = end()
# Continuation
if state == "c":
date = ""
colour = ""
prompt = " "
# OK
elif state == "o":
colour = green()
prompt = ">>> "
# Error
elif state == "e":
colour = red()
prompt = ">>> "
# Warning
elif state == "w":
colour = yellow()
prompt = ">>> "
# Tick
elif state == "t":
colour = purple()
prompt = ">>> "
# Information
elif state == "i":
colour = blue()
prompt = ">>> "
else:
colour = bold()
prompt = ">>> "
# Append space to prefix
if prefix != "":
prefix = prefix + " "
print(colour + prompt + endc + date + prefix + message)

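For reference, the removed echo() above took a message, an optional prefix, and a one-letter state code; typical (illustrative) call sites looked like this, using the import path seen in the sibling modules of this diff:

import pvc.lib.ansiprint as ansiprint

# "o" prints a green ">>>" prompt followed by a timestamp, prefix, message
ansiprint.echo("Backup completed successfully", "[backup]", "o")
# "c" is a continuation line: no timestamp or colour, indented to align
ansiprint.echo("details follow on this line", "", "c")
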
File diff suppressed because it is too large

View File

@@ -1,313 +0,0 @@
#!/usr/bin/env python3
# cluster.py - PVC CLI client function library, cluster management
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import json
import pvc.lib.ansiprint as ansiprint
from pvc.lib.common import call_api
def initialize(config, overwrite=False):
"""
Initialize the PVC cluster
API endpoint: GET /api/v1/initialize
API arguments: overwrite, yes-i-really-mean-it
API schema: {json_data_object}
"""
params = {"yes-i-really-mean-it": "yes", "overwrite": overwrite}
response = call_api(config, "post", "/initialize", params=params)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get("message", "")
def backup(config):
"""
Get a JSON backup of the cluster
API endpoint: GET /api/v1/backup
API arguments:
API schema: {json_data_object}
"""
response = call_api(config, "get", "/backup")
if response.status_code == 200:
return True, response.json()
else:
return False, response.json().get("message", "")
def restore(config, cluster_data):
"""
Restore a JSON backup to the cluster
API endpoint: POST /api/v1/restore
API arguments: yes-i-really-mean-it
API schema: {json_data_object}
"""
cluster_data_json = json.dumps(cluster_data)
params = {"yes-i-really-mean-it": "yes"}
data = {"cluster_data": cluster_data_json}
response = call_api(config, "post", "/restore", params=params, data=data)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get("message", "")
def maintenance_mode(config, state):
"""
Enable or disable PVC cluster maintenance mode
API endpoint: POST /api/v1/status
API arguments: {state}={state}
API schema: {json_data_object}
"""
params = {"state": state}
response = call_api(config, "post", "/status", params=params)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get("message", "")
def get_info(config):
"""
Get status of the PVC cluster
API endpoint: GET /api/v1/status
API arguments:
API schema: {json_data_object}
"""
response = call_api(config, "get", "/status")
if response.status_code == 200:
return True, response.json()
else:
return False, response.json().get("message", "")
def format_info(cluster_information, oformat):
if oformat == "json":
return json.dumps(cluster_information)
if oformat == "json-pretty":
return json.dumps(cluster_information, indent=4)
# Plain formatting, i.e. human-readable
if (
cluster_information.get("maintenance") == "true"
or cluster_information.get("cluster_health", {}).get("health", "N/A") == "N/A"
):
health_colour = ansiprint.blue()
elif cluster_information.get("cluster_health", {}).get("health", 100) > 90:
health_colour = ansiprint.green()
elif cluster_information.get("cluster_health", {}).get("health", 100) > 50:
health_colour = ansiprint.yellow()
else:
health_colour = ansiprint.red()
ainformation = []
ainformation.append(
"{}PVC cluster status:{}".format(ansiprint.bold(), ansiprint.end())
)
ainformation.append("")
health_text = (
f"{cluster_information.get('cluster_health', {}).get('health', 'N/A')}"
)
if health_text != "N/A":
health_text += "%"
if cluster_information.get("maintenance") == "true":
health_text += " (maintenance on)"
ainformation.append(
"{}Cluster health:{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
health_colour,
health_text,
ansiprint.end(),
)
)
if cluster_information.get("cluster_health", {}).get("messages"):
health_messages = "\n > ".join(
sorted(cluster_information["cluster_health"]["messages"])
)
ainformation.append(
"{}Health messages:{} > {}".format(
ansiprint.purple(),
ansiprint.end(),
health_messages,
)
)
else:
ainformation.append(
"{}Health messages:{} N/A".format(
ansiprint.purple(),
ansiprint.end(),
)
)
if oformat == "short":
return "\n".join(ainformation)
ainformation.append("")
ainformation.append(
"{}Primary node:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["primary_node"]
)
)
ainformation.append(
"{}PVC version:{} {}".format(
ansiprint.purple(),
ansiprint.end(),
cluster_information.get("pvc_version", "N/A"),
)
)
ainformation.append(
"{}Cluster upstream IP:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["upstream_ip"]
)
)
ainformation.append("")
ainformation.append(
"{}Total nodes:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["nodes"]["total"]
)
)
ainformation.append(
"{}Total VMs:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["vms"]["total"]
)
)
ainformation.append(
"{}Total networks:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["networks"]
)
)
ainformation.append(
"{}Total OSDs:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["osds"]["total"]
)
)
ainformation.append(
"{}Total pools:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["pools"]
)
)
ainformation.append(
"{}Total volumes:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["volumes"]
)
)
ainformation.append(
"{}Total snapshots:{} {}".format(
ansiprint.purple(), ansiprint.end(), cluster_information["snapshots"]
)
)
nodes_string = "{}Nodes:{} {}/{} {}ready,run{}".format(
ansiprint.purple(),
ansiprint.end(),
cluster_information["nodes"].get("run,ready", 0),
cluster_information["nodes"].get("total", 0),
ansiprint.green(),
ansiprint.end(),
)
for state, count in cluster_information["nodes"].items():
if state == "total" or state == "run,ready":
continue
nodes_string += " {}/{} {}{}{}".format(
count,
cluster_information["nodes"]["total"],
ansiprint.yellow(),
state,
ansiprint.end(),
)
ainformation.append("")
ainformation.append(nodes_string)
vms_string = "{}VMs:{} {}/{} {}start{}".format(
ansiprint.purple(),
ansiprint.end(),
cluster_information["vms"].get("start", 0),
cluster_information["vms"].get("total", 0),
ansiprint.green(),
ansiprint.end(),
)
for state, count in cluster_information["vms"].items():
if state == "total" or state == "start":
continue
if state in ["disable", "migrate", "unmigrate", "provision"]:
colour = ansiprint.blue()
else:
colour = ansiprint.yellow()
vms_string += " {}/{} {}{}{}".format(
count, cluster_information["vms"]["total"], colour, state, ansiprint.end()
)
ainformation.append("")
ainformation.append(vms_string)
if cluster_information["osds"]["total"] > 0:
osds_string = "{}Ceph OSDs:{} {}/{} {}up,in{}".format(
ansiprint.purple(),
ansiprint.end(),
cluster_information["osds"].get("up,in", 0),
cluster_information["osds"].get("total", 0),
ansiprint.green(),
ansiprint.end(),
)
for state, count in cluster_information["osds"].items():
if state == "total" or state == "up,in":
continue
osds_string += " {}/{} {}{}{}".format(
count,
cluster_information["osds"]["total"],
ansiprint.yellow(),
state,
ansiprint.end(),
)
ainformation.append("")
ainformation.append(osds_string)
ainformation.append("")
return "\n".join(ainformation)

View File

@@ -1,201 +0,0 @@
#!/usr/bin/env python3
# common.py - PVC CLI client function library, Common functions
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import os
import math
import time
import requests
import click
from urllib3 import disable_warnings
def format_bytes(size_bytes):
byte_unit_matrix = {
"B": 1,
"K": 1024,
"M": 1024 * 1024,
"G": 1024 * 1024 * 1024,
"T": 1024 * 1024 * 1024 * 1024,
"P": 1024 * 1024 * 1024 * 1024 * 1024,
}
human_bytes = "0B"
for unit in sorted(byte_unit_matrix, key=byte_unit_matrix.get):
formatted_bytes = int(math.ceil(size_bytes / byte_unit_matrix[unit]))
if formatted_bytes < 10000:
human_bytes = "{}{}".format(formatted_bytes, unit)
break
return human_bytes
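# For comparison with format_metric() below: format_bytes() scales by
# binary (1024-based) units while format_metric() uses decimal (1000-based)
# ones; both stop at the first unit, scanning upward, that brings the
# value under 10000. For example, format_bytes(123456789) returns "118M"
# while format_metric(123456789) returns "124M".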
def format_metric(integer):
integer_unit_matrix = {
"": 1,
"K": 1000,
"M": 1000 * 1000,
"B": 1000 * 1000 * 1000,
"T": 1000 * 1000 * 1000 * 1000,
"Q": 1000 * 1000 * 1000 * 1000 * 1000,
}
human_integer = "0"
for unit in sorted(integer_unit_matrix, key=integer_unit_matrix.get):
formatted_integer = int(math.ceil(integer / integer_unit_matrix[unit]))
if formatted_integer < 10000:
human_integer = "{}{}".format(formatted_integer, unit)
break
return human_integer
class UploadProgressBar(object):
def __init__(self, filename, end_message="", end_nl=True):
file_size = os.path.getsize(filename)
file_size_human = format_bytes(file_size)
click.echo("Uploading file (total size {})...".format(file_size_human))
self.length = file_size
self.time_last = int(round(time.time() * 1000)) - 1000
self.bytes_last = 0
self.bytes_diff = 0
self.is_end = False
self.end_message = end_message
self.end_nl = end_nl
if not self.end_nl:
self.end_suffix = " "
else:
self.end_suffix = ""
self.bar = click.progressbar(length=self.length, show_eta=True)
def update(self, monitor):
bytes_cur = monitor.bytes_read
self.bytes_diff += bytes_cur - self.bytes_last
if self.bytes_last == bytes_cur:
self.is_end = True
self.bytes_last = bytes_cur
time_cur = int(round(time.time() * 1000))
if (time_cur - 1000) > self.time_last:
self.time_last = time_cur
self.bar.update(self.bytes_diff)
self.bytes_diff = 0
if self.is_end:
self.bar.update(self.bytes_diff)
self.bytes_diff = 0
click.echo()
click.echo()
if self.end_message:
click.echo(self.end_message + self.end_suffix, nl=self.end_nl)
class ErrorResponse(requests.Response):
def __init__(self, json_data, status_code):
self.json_data = json_data
self.status_code = status_code
def json(self):
return self.json_data
def call_api(
config,
operation,
request_uri,
headers={},
params=None,
data=None,
files=None,
):
# Set the connect timeout to 2 seconds and the data timeout to an extremely long 48 hours
timeout = (2.05, 172800)
# Craft the URI
uri = "{}://{}{}{}".format(
config["api_scheme"], config["api_host"], config["api_prefix"], request_uri
)
# Craft the authentication header if required
if config["api_key"]:
headers["X-Api-Key"] = config["api_key"]
# Determine the request type and hit the API
disable_warnings()
try:
if operation == "get":
response = requests.get(
uri,
timeout=timeout,
headers=headers,
params=params,
data=data,
verify=config["verify_ssl"],
)
if operation == "post":
response = requests.post(
uri,
timeout=timeout,
headers=headers,
params=params,
data=data,
files=files,
verify=config["verify_ssl"],
)
if operation == "put":
response = requests.put(
uri,
timeout=timeout,
headers=headers,
params=params,
data=data,
files=files,
verify=config["verify_ssl"],
)
if operation == "patch":
response = requests.patch(
uri,
timeout=timeout,
headers=headers,
params=params,
data=data,
verify=config["verify_ssl"],
)
if operation == "delete":
response = requests.delete(
uri,
timeout=timeout,
headers=headers,
params=params,
data=data,
verify=config["verify_ssl"],
)
except Exception as e:
message = "Failed to connect to the API: {}".format(e)
response = ErrorResponse({"message": message}, 500)
# Display debug output
if config["debug"]:
click.echo("API endpoint: {}".format(uri), err=True)
click.echo("Response code: {}".format(response.status_code), err=True)
click.echo("Response headers: {}".format(response.headers), err=True)
click.echo(err=True)
# Return the response object
return response

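As a usage note for the removed call_api() above, a direct invocation looked like this (illustrative; the values are placeholders, but the config keys are exactly the ones the function reads):

from pvc.lib.common import call_api

config = {
    "api_scheme": "http",
    "api_host": "pvc.local:7370",  # placeholder host:port
    "api_prefix": "/api/v1",
    "api_key": "",  # empty string means no X-Api-Key header is sent
    "verify_ssl": False,
    "debug": False,
}

response = call_api(config, "get", "/status")
if response.status_code == 200:
    print(response.json())
else:
    print(response.json().get("message", ""))
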
File diff suppressed because it is too large

View File

@@ -1,709 +0,0 @@
#!/usr/bin/env python3
# node.py - PVC CLI client function library, node management
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import time
import pvc.lib.ansiprint as ansiprint
from pvc.lib.common import call_api
#
# Primary functions
#
def node_coordinator_state(config, node, action):
"""
Set node coordinator state (primary/secondary)
API endpoint: POST /api/v1/node/{node}/coordinator-state
API arguments: action={action}
API schema: {"message": "{data}"}
"""
params = {"state": action}
response = call_api(
config,
"post",
"/node/{node}/coordinator-state".format(node=node),
params=params,
)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get("message", "")
def node_domain_state(config, node, action, wait):
"""
Set node domain state (flush/ready)
API endpoint: POST /api/v1/node/{node}/domain-state
API arguments: action={action}, wait={wait}
API schema: {"message": "{data}"}
"""
params = {"state": action, "wait": str(wait).lower()}
response = call_api(
config, "post", "/node/{node}/domain-state".format(node=node), params=params
)
if response.status_code == 200:
retstatus = True
else:
retstatus = False
return retstatus, response.json().get("message", "")
def view_node_log(config, node, lines=100):
"""
Return node log lines from the API (and display them in a pager in the main CLI)
API endpoint: GET /node/{node}/log
API arguments: lines={lines}
API schema: {"name":"{node}","data":"{node_log}"}
"""
params = {"lines": lines}
response = call_api(
config, "get", "/node/{node}/log".format(node=node), params=params
)
if response.status_code != 200:
return False, response.json().get("message", "")
node_log = response.json()["data"]
# Shrink the log buffer to length lines
shrunk_log = node_log.split("\n")[-lines:]
loglines = "\n".join(shrunk_log)
return True, loglines
def follow_node_log(config, node, lines=10):
"""
Return and follow node log lines from the API
API endpoint: GET /node/{node}/log
API arguments: lines={lines}
API schema: {"name":"{nodename}","data":"{node_log}"}
"""
# We always grab 200 to match the follow call, but only _show_ `lines` number
params = {"lines": 200}
response = call_api(
config, "get", "/node/{node}/log".format(node=node), params=params
)
if response.status_code != 200:
return False, response.json().get("message", "")
# Shrink the log buffer to length lines
node_log = response.json()["data"]
shrunk_log = node_log.split("\n")[-int(lines) :]
loglines = "\n".join(shrunk_log)
# Print the initial data and begin following
print(loglines, end="")
print("\n", end="")
while True:
# Grab the next line set (200 is a reasonable number of lines per half-second; any more are skipped)
try:
params = {"lines": 200}
response = call_api(
config, "get", "/node/{node}/log".format(node=node), params=params
)
new_node_log = response.json()["data"]
except Exception:
break
# Split the new and old log strings into constituent lines
old_node_loglines = node_log.split("\n")
new_node_loglines = new_node_log.split("\n")
# Set the node log to the new log value for the next iteration
node_log = new_node_log
# Get the difference between the two sets of lines
old_node_loglines_set = set(old_node_loglines)
diff_node_loglines = [
x for x in new_node_loglines if x not in old_node_loglines_set
]
# If there's a difference, print it out
if len(diff_node_loglines) > 0:
print("\n".join(diff_node_loglines), end="")
print("\n", end="")
# Wait half a second
time.sleep(0.5)
return True, ""
def node_info(config, node):
"""
Get information about node
API endpoint: GET /api/v1/node/{node}
API arguments:
API schema: {json_data_object}
"""
response = call_api(config, "get", "/node/{node}".format(node=node))
if response.status_code == 200:
if isinstance(response.json(), list) and len(response.json()) != 1:
# No exact match, return not found
return False, "Node not found."
else:
# Return a single instance if the response is a list
if isinstance(response.json(), list):
return True, response.json()[0]
# This shouldn't happen, but is here just in case
else:
return True, response.json()
else:
return False, response.json().get("message", "")
def node_list(
config, limit, target_daemon_state, target_coordinator_state, target_domain_state
):
"""
Get list information about nodes (limited by {limit})
API endpoint: GET /api/v1/node
API arguments: limit={limit}
API schema: [{json_data_object},{json_data_object},etc.]
"""
params = dict()
if limit:
params["limit"] = limit
if target_daemon_state:
params["daemon_state"] = target_daemon_state
if target_coordinator_state:
params["coordinator_state"] = target_coordinator_state
if target_domain_state:
params["domain_state"] = target_domain_state
response = call_api(config, "get", "/node", params=params)
if response.status_code == 200:
return True, response.json()
else:
return False, response.json().get("message", "")
#
# Output display functions
#
def getOutputColours(node_information):
node_health = node_information.get("health", "N/A")
if isinstance(node_health, int):
if node_health <= 50:
health_colour = ansiprint.red()
elif node_health <= 90:
health_colour = ansiprint.yellow()
elif node_health <= 100:
health_colour = ansiprint.green()
else:
health_colour = ansiprint.blue()
else:
health_colour = ansiprint.blue()
if node_information["daemon_state"] == "run":
daemon_state_colour = ansiprint.green()
elif node_information["daemon_state"] == "stop":
daemon_state_colour = ansiprint.red()
elif node_information["daemon_state"] == "shutdown":
daemon_state_colour = ansiprint.yellow()
elif node_information["daemon_state"] == "init":
daemon_state_colour = ansiprint.yellow()
elif node_information["daemon_state"] == "dead":
daemon_state_colour = ansiprint.red() + ansiprint.bold()
else:
daemon_state_colour = ansiprint.blue()
if node_information["coordinator_state"] == "primary":
coordinator_state_colour = ansiprint.green()
elif node_information["coordinator_state"] == "secondary":
coordinator_state_colour = ansiprint.blue()
else:
coordinator_state_colour = ansiprint.cyan()
if node_information["domain_state"] == "ready":
domain_state_colour = ansiprint.green()
else:
domain_state_colour = ansiprint.blue()
if node_information["memory"]["allocated"] > node_information["memory"]["total"]:
mem_allocated_colour = ansiprint.yellow()
else:
mem_allocated_colour = ""
if node_information["memory"]["provisioned"] > node_information["memory"]["total"]:
mem_provisioned_colour = ansiprint.yellow()
else:
mem_provisioned_colour = ""
return (
health_colour,
daemon_state_colour,
coordinator_state_colour,
domain_state_colour,
mem_allocated_colour,
mem_provisioned_colour,
)
def format_info(node_information, long_output):
(
health_colour,
daemon_state_colour,
coordinator_state_colour,
domain_state_colour,
mem_allocated_colour,
mem_provisioned_colour,
) = getOutputColours(node_information)
# Format a nice output; do this line-by-line then concat the elements at the end
ainformation = []
# Basic information
ainformation.append(
"{}Name:{} {}".format(
ansiprint.purple(),
ansiprint.end(),
node_information["name"],
)
)
ainformation.append(
"{}PVC Version:{} {}".format(
ansiprint.purple(),
ansiprint.end(),
node_information["pvc_version"],
)
)
node_health = node_information.get("health", "N/A")
if isinstance(node_health, int):
node_health_text = f"{node_health}%"
else:
node_health_text = node_health
ainformation.append(
"{}Health:{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
health_colour,
node_health_text,
ansiprint.end(),
)
)
node_health_details = node_information.get("health_details", [])
if long_output:
node_health_messages = "\n ".join(
[f"{plugin['name']}: {plugin['message']}" for plugin in node_health_details]
)
else:
node_health_messages = "\n ".join(
[
f"{plugin['name']}: {plugin['message']}"
for plugin in node_health_details
if int(plugin.get("health_delta", 0)) > 0
]
)
if len(node_health_messages) > 0:
ainformation.append(
"{}Health Plugin Details:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_health_messages
)
)
ainformation.append("")
ainformation.append(
"{}Daemon State:{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
daemon_state_colour,
node_information["daemon_state"],
ansiprint.end(),
)
)
ainformation.append(
"{}Coordinator State:{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
coordinator_state_colour,
node_information["coordinator_state"],
ansiprint.end(),
)
)
ainformation.append(
"{}Domain State:{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
domain_state_colour,
node_information["domain_state"],
ansiprint.end(),
)
)
if long_output:
ainformation.append("")
ainformation.append(
"{}Architecture:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["arch"]
)
)
ainformation.append(
"{}Operating System:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["os"]
)
)
ainformation.append(
"{}Kernel Version:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["kernel"]
)
)
ainformation.append("")
ainformation.append(
"{}Active VM Count:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["domains_count"]
)
)
ainformation.append(
"{}Host CPUs:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["vcpu"]["total"]
)
)
ainformation.append(
"{}vCPUs:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["vcpu"]["allocated"]
)
)
ainformation.append(
"{}Load:{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["load"]
)
)
ainformation.append(
"{}Total RAM (MiB):{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["memory"]["total"]
)
)
ainformation.append(
"{}Used RAM (MiB):{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["memory"]["used"]
)
)
ainformation.append(
"{}Free RAM (MiB):{} {}".format(
ansiprint.purple(), ansiprint.end(), node_information["memory"]["free"]
)
)
ainformation.append(
"{}Allocated RAM (MiB):{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
mem_allocated_colour,
node_information["memory"]["allocated"],
ansiprint.end(),
)
)
ainformation.append(
"{}Provisioned RAM (MiB):{} {}{}{}".format(
ansiprint.purple(),
ansiprint.end(),
mem_provisioned_colour,
node_information["memory"]["provisioned"],
ansiprint.end(),
)
)
# Join it all together
ainformation.append("")
return "\n".join(ainformation)
def format_list(node_list, raw):
if raw:
ainformation = list()
for node in sorted(item["name"] for item in node_list):
ainformation.append(node)
return "\n".join(ainformation)
node_list_output = []
# Determine optimal column widths
node_name_length = 5
pvc_version_length = 8
health_length = 7
daemon_state_length = 7
coordinator_state_length = 12
domain_state_length = 7
domains_count_length = 4
cpu_count_length = 6
load_length = 5
mem_total_length = 6
mem_used_length = 5
mem_free_length = 5
mem_alloc_length = 6
mem_prov_length = 5
for node_information in node_list:
# node_name column
_node_name_length = len(node_information["name"]) + 1
if _node_name_length > node_name_length:
node_name_length = _node_name_length
# node_pvc_version column
_pvc_version_length = len(node_information.get("pvc_version", "N/A")) + 1
if _pvc_version_length > pvc_version_length:
pvc_version_length = _pvc_version_length
# node_health column
node_health = node_information.get("health", "N/A")
if isinstance(node_health, int):
node_health_text = f"{node_health}%"
else:
node_health_text = node_health
_health_length = len(node_health_text) + 1
if _health_length > health_length:
health_length = _health_length
# daemon_state column
_daemon_state_length = len(node_information["daemon_state"]) + 1
if _daemon_state_length > daemon_state_length:
daemon_state_length = _daemon_state_length
# coordinator_state column
_coordinator_state_length = len(node_information["coordinator_state"]) + 1
if _coordinator_state_length > coordinator_state_length:
coordinator_state_length = _coordinator_state_length
# domain_state column
_domain_state_length = len(node_information["domain_state"]) + 1
if _domain_state_length > domain_state_length:
domain_state_length = _domain_state_length
# domains_count column
_domains_count_length = len(str(node_information["domains_count"])) + 1
if _domains_count_length > domains_count_length:
domains_count_length = _domains_count_length
# cpu_count column
_cpu_count_length = len(str(node_information["cpu_count"])) + 1
if _cpu_count_length > cpu_count_length:
cpu_count_length = _cpu_count_length
# load column
_load_length = len(str(node_information["load"])) + 1
if _load_length > load_length:
load_length = _load_length
# mem_total column
_mem_total_length = len(str(node_information["memory"]["total"])) + 1
if _mem_total_length > mem_total_length:
mem_total_length = _mem_total_length
# mem_used column
_mem_used_length = len(str(node_information["memory"]["used"])) + 1
if _mem_used_length > mem_used_length:
mem_used_length = _mem_used_length
# mem_free column
_mem_free_length = len(str(node_information["memory"]["free"])) + 1
if _mem_free_length > mem_free_length:
mem_free_length = _mem_free_length
# mem_alloc column
_mem_alloc_length = len(str(node_information["memory"]["allocated"])) + 1
if _mem_alloc_length > mem_alloc_length:
mem_alloc_length = _mem_alloc_length
# mem_prov column
_mem_prov_length = len(str(node_information["memory"]["provisioned"])) + 1
if _mem_prov_length > mem_prov_length:
mem_prov_length = _mem_prov_length
# Format the string (header)
node_list_output.append(
"{bold}{node_header: <{node_header_length}} {state_header: <{state_header_length}} {resource_header: <{resource_header_length}} {memory_header: <{memory_header_length}}{end_bold}".format(
node_header_length=node_name_length
+ pvc_version_length
+ health_length
+ 2,
state_header_length=daemon_state_length
+ coordinator_state_length
+ domain_state_length
+ 2,
resource_header_length=domains_count_length
+ cpu_count_length
+ load_length
+ 2,
memory_header_length=mem_total_length
+ mem_used_length
+ mem_free_length
+ mem_alloc_length
+ mem_prov_length
+ 4,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
node_header="Nodes "
+ "".join(
[
"-"
for _ in range(
6, node_name_length + pvc_version_length + health_length + 1
)
]
),
state_header="States "
+ "".join(
[
"-"
for _ in range(
7,
daemon_state_length
+ coordinator_state_length
+ domain_state_length
+ 1,
)
]
),
resource_header="Resources "
+ "".join(
[
"-"
for _ in range(
10, domains_count_length + cpu_count_length + load_length + 1
)
]
),
memory_header="Memory (M) "
+ "".join(
[
"-"
for _ in range(
11,
mem_total_length
+ mem_used_length
+ mem_free_length
+ mem_alloc_length
+ mem_prov_length
+ 3,
)
]
),
)
)
node_list_output.append(
"{bold}{node_name: <{node_name_length}} {node_pvc_version: <{pvc_version_length}} {node_health: <{health_length}} \
{daemon_state_colour}{node_daemon_state: <{daemon_state_length}}{end_colour} {coordinator_state_colour}{node_coordinator_state: <{coordinator_state_length}}{end_colour} {domain_state_colour}{node_domain_state: <{domain_state_length}}{end_colour} \
{node_domains_count: <{domains_count_length}} {node_cpu_count: <{cpu_count_length}} {node_load: <{load_length}} \
{node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length}} {node_mem_free: <{mem_free_length}} {node_mem_allocated: <{mem_alloc_length}} {node_mem_provisioned: <{mem_prov_length}}{end_bold}".format(
node_name_length=node_name_length,
pvc_version_length=pvc_version_length,
health_length=health_length,
daemon_state_length=daemon_state_length,
coordinator_state_length=coordinator_state_length,
domain_state_length=domain_state_length,
domains_count_length=domains_count_length,
cpu_count_length=cpu_count_length,
load_length=load_length,
mem_total_length=mem_total_length,
mem_used_length=mem_used_length,
mem_free_length=mem_free_length,
mem_alloc_length=mem_alloc_length,
mem_prov_length=mem_prov_length,
bold=ansiprint.bold(),
end_bold=ansiprint.end(),
daemon_state_colour="",
coordinator_state_colour="",
domain_state_colour="",
end_colour="",
node_name="Name",
node_pvc_version="Version",
node_health="Health",
node_daemon_state="Daemon",
node_coordinator_state="Coordinator",
node_domain_state="Domain",
node_domains_count="VMs",
node_cpu_count="vCPUs",
node_load="Load",
node_mem_total="Total",
node_mem_used="Used",
node_mem_free="Free",
node_mem_allocated="Alloc",
node_mem_provisioned="Prov",
)
)
# Format the string (elements)
for node_information in sorted(node_list, key=lambda n: n["name"]):
(
health_colour,
daemon_state_colour,
coordinator_state_colour,
domain_state_colour,
mem_allocated_colour,
mem_provisioned_colour,
) = getOutputColours(node_information)
node_health = node_information.get("health", "N/A")
if isinstance(node_health, int):
node_health_text = f"{node_health}%"
else:
node_health_text = node_health
node_list_output.append(
"{bold}{node_name: <{node_name_length}} {node_pvc_version: <{pvc_version_length}} {health_colour}{node_health: <{health_length}}{end_colour} \
{daemon_state_colour}{node_daemon_state: <{daemon_state_length}}{end_colour} {coordinator_state_colour}{node_coordinator_state: <{coordinator_state_length}}{end_colour} {domain_state_colour}{node_domain_state: <{domain_state_length}}{end_colour} \
{node_domains_count: <{domains_count_length}} {node_cpu_count: <{cpu_count_length}} {node_load: <{load_length}} \
{node_mem_total: <{mem_total_length}} {node_mem_used: <{mem_used_length}} {node_mem_free: <{mem_free_length}} {mem_allocated_colour}{node_mem_allocated: <{mem_alloc_length}}{end_colour} {mem_provisioned_colour}{node_mem_provisioned: <{mem_prov_length}}{end_colour}{end_bold}".format(
node_name_length=node_name_length,
pvc_version_length=pvc_version_length,
health_length=health_length,
daemon_state_length=daemon_state_length,
coordinator_state_length=coordinator_state_length,
domain_state_length=domain_state_length,
domains_count_length=domains_count_length,
cpu_count_length=cpu_count_length,
load_length=load_length,
mem_total_length=mem_total_length,
mem_used_length=mem_used_length,
mem_free_length=mem_free_length,
mem_alloc_length=mem_alloc_length,
mem_prov_length=mem_prov_length,
bold="",
end_bold="",
health_colour=health_colour,
daemon_state_colour=daemon_state_colour,
coordinator_state_colour=coordinator_state_colour,
domain_state_colour=domain_state_colour,
mem_allocated_colour=mem_allocated_colour,
mem_provisioned_colour=mem_allocated_colour,
end_colour=ansiprint.end(),
node_name=node_information["name"],
node_pvc_version=node_information.get("pvc_version", "N/A"),
node_health=node_health_text,
node_daemon_state=node_information["daemon_state"],
node_coordinator_state=node_information["coordinator_state"],
node_domain_state=node_information["domain_state"],
node_domains_count=node_information["domains_count"],
node_cpu_count=node_information["vcpu"]["allocated"],
node_load=node_information["load"],
node_mem_total=node_information["memory"]["total"],
node_mem_used=node_information["memory"]["used"],
node_mem_free=node_information["memory"]["free"],
node_mem_allocated=node_information["memory"]["allocated"],
node_mem_provisioned=node_information["memory"]["provisioned"],
)
)
return "\n".join(node_list_output)

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@@ -1,102 +0,0 @@
#!/usr/bin/env python3
# zkhandler.py - Secure versioned ZooKeeper updates
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
import uuid
# Exists function
def exists(zk_conn, key):
stat = zk_conn.exists(key)
if stat:
return True
else:
return False
# Child list function
def listchildren(zk_conn, key):
children = zk_conn.get_children(key)
return children
# Delete key function
def deletekey(zk_conn, key, recursive=True):
zk_conn.delete(key, recursive=recursive)
# Data read function
def readdata(zk_conn, key):
data_raw = zk_conn.get(key)
data = data_raw[0].decode("utf8")
return data
# Data write function
def writedata(zk_conn, kv):
# Start up a transaction
zk_transaction = zk_conn.transaction()
# Proceed one KV pair at a time
for key in sorted(kv):
data = kv[key]
# Check if this key already exists or not
if not zk_conn.exists(key):
# We're creating a new key
zk_transaction.create(key, str(data).encode("utf8"))
else:
# We're updating a key with version validation
orig_data = zk_conn.get(key)
version = orig_data[1].version
# Set what we expect the new version to be
new_version = version + 1
# Update the data
zk_transaction.set_data(key, str(data).encode("utf8"))
# Set up the check
try:
zk_transaction.check(key, new_version)
except TypeError:
print('Zookeeper key "{}" does not match expected version'.format(key))
return False
# Commit the transaction
try:
zk_transaction.commit()
return True
except Exception:
return False
# Write lock function
def writelock(zk_conn, key):
lock_id = str(uuid.uuid1())
lock = zk_conn.WriteLock("{}".format(key), lock_id)
return lock
# Read lock function
def readlock(zk_conn, key):
lock_id = str(uuid.uuid1())
lock = zk_conn.ReadLock("{}".format(key), lock_id)
return lock

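Typical usage of the removed zkhandler wrappers above, with a kazoo client (illustrative; the host and key path are placeholders):

from kazoo.client import KazooClient

zk_conn = KazooClient(hosts="coordinator1:2181")
zk_conn.start()

# writedata() batches all updates into one transaction and version-checks
# any key that already exists; readdata() returns the decoded value
writedata(zk_conn, {"/config/example_key": "value"})
print(readdata(zk_conn, "/config/example_key"))

# The lock helpers return standard kazoo lock recipes
with writelock(zk_conn, "/config/example_key"):
    writedata(zk_conn, {"/config/example_key": "new_value"})

zk_conn.stop()
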
File diff suppressed because it is too large

View File

@@ -1,32 +0,0 @@
# PVC helper scripts
These helper scripts are included with the PVC client to aid administrators in some meta-functions.
The following scripts are provided for use:
## `migrate_vm`
Migrates a VM, with downtime, from one PVC cluster to another.
`migrate_vm <vm> <source_cluster> <destination_cluster>`
### Arguments
* `vm`: The virtual machine to migrate
* `source_cluster`: The source PVC cluster; must be a valid cluster to the local PVC client
* `destination_cluster`: The destination PVC cluster; must be a valid cluster to the local PVC client
## `import_vm`
Imports a VM from another platform into a PVC cluster.
## `export_vm`
Exports a (stopped) VM from a PVC cluster to another platform.
`export_vm <vm> <source_cluster>`
### Arguments
* `vm`: The virtual machine to export
* `source_cluster`: The source PVC cluster; must be a valid cluster to the local PVC client

View File

@@ -1,98 +0,0 @@
#!/usr/bin/env bash
# export_vm - Exports a VM from a PVC cluster to local files
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
set -o errexit
set -o pipefail
usage() {
echo -e "Export a VM from a PVC cluster to local files."
echo -e "Usage:"
echo -e " $0 <vm> <source_cluster> [<destination_directory>]"
echo -e ""
echo -e "Important information:"
echo -e " * The local user must have valid SSH access to the primary coordinator in the source_cluster."
echo -e " * The user on the cluster primary coordinator must have 'sudo' access."
echo -e " * If the VM is not in 'stop' state, it will be shut down."
echo -e " * Do not switch the cluster primary coordinator while the script is running."
echo -e " * Ensure you have enough space in <destination_directory> to store all VM disk images."
}
fail() {
echo -e "$@"
exit 1
}
# Arguments
if [[ -z ${1} || -z ${2} ]]; then
usage
exit 1
fi
source_vm="${1}"
source_cluster="${2}"
if [[ -n "${3}" ]]; then
destination_directory="${3}"
else
destination_directory="."
fi
# Verify the cluster is reachable
pvc -c ${source_cluster} status &>/dev/null || fail "Specified source_cluster is not accessible"
# Determine the connection IP
cluster_address="$( pvc cluster list 2>/dev/null | grep -i "^${source_cluster}" | awk '{ print $2 }' )"
# Attempt to connect to the cluster address
ssh ${cluster_address} which pvc &>/dev/null || fail "Could not SSH to source_cluster primary coordinator host"
# Verify that the VM exists
pvc -c ${source_cluster} vm info ${source_vm} &>/dev/null || fail "Specified VM is not present on the cluster"
echo "Verification complete."
# Shut down the VM
echo -n "Shutting down VM..."
set +o errexit
pvc -c ${source_cluster} vm shutdown ${source_vm} &>/dev/null
shutdown_success=$?
while ! pvc -c ${source_cluster} vm info ${source_vm} 2>/dev/null | grep '^State' | grep -q -E 'stop|disable'; do
sleep 1
echo -n "."
done
set -o errexit
echo " done."
# Dump the XML file
echo -n "Exporting VM configuration file... "
pvc -c ${source_cluster} vm dump ${source_vm} 1> ${destination_directory}/${source_vm}.xml 2>/dev/null
echo "done."
# Determine the list of volumes in this VM
volume_list="$( pvc -c ${source_cluster} vm info --long ${source_vm} 2>/dev/null | grep -w 'rbd' | awk '{ print $3 }' )"
for volume in ${volume_list}; do
volume_pool="$( awk -F '/' '{ print $1 }' <<<"${volume}" )"
volume_name="$( awk -F '/' '{ print $2 }' <<<"${volume}" )"
volume_size="$( pvc -c ${source_cluster} storage volume list -p ${volume_pool} ${volume_name} 2>/dev/null | grep "^${volume_name}" | awk '{ print $3 }' )"
echo -n "Exporting disk ${volume_name} (${volume_size})... "
ssh ${cluster_address} sudo rbd map ${volume_pool}/${volume_name} &>/dev/null || fail "Failed to map volume ${volume}"
ssh ${cluster_address} sudo dd if="/dev/rbd/${volume_pool}/${volume_name}" bs=1M 2>/dev/null | dd bs=1M of="${destination_directory}/${volume_name}.img" 2>/dev/null
ssh ${cluster_address} sudo rbd unmap ${volume_pool}/${volume_name} &>/dev/null || fail "Failed to unmap volume ${volume}"
echo "done."
done

View File

@@ -1,118 +0,0 @@
#!/usr/bin/env bash
# force_single_node - Manually promote a single coordinator node from a degraded cluster
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
set -o errexit
set -o pipefail
usage() {
echo -e "Manually promote a single coordinator node from a degraded cluster"
echo -e ""
echo -e "DANGER: This action will cause a permanent split-brain within the cluster"
echo -e " which will have to be corrected manually upon cluster restoration."
echo -e ""
echo -e "This script is primarily designed for small clusters in situations where 2"
echo -e "of the 3 coordinators have become unreachable or shut down. It will promote"
echo -e "the remaining lone_node to act as a standalone coordinator, allowing basic"
echo -e "cluster functionality to continue in a heavily degraded state until the"
echo -e "situation can be rectified. This should only be done in exceptional cases"
echo -e "as a disaster recovery mechanism when the remaining nodes will remain down"
echo -e "for a significant amount of time but some VMs are required to run. In general,"
echo -e "use of this script is not advisable."
echo -e ""
echo -e "Usage:"
echo -e " $0 <target_cluster> <lone_node>"
echo -e ""
echo -e "Important information:"
echo -e " * The lone_node must be a fully-qualified name that is directly reachable from"
echo -e " the local system via SSH."
echo -e " * The local user must have valid SSH access to the lone_node in the cluster."
echo -e " * The user on the cluster node must have 'sudo' access."
}
fail() {
echo -e "$@"
exit 1
}
# Arguments
if [[ -z ${1} || -z ${2} ]]; then
usage
exit 1
fi
target_cluster="${1}"
lone_node="${2}"
lone_node_shortname="${lone_node%%.*}"
# Attempt to connect to the node
ssh ${lone_node} which pvc &>/dev/null || fail "Could not SSH to the lone_node host"
echo "Verification complete."
echo -n "Allowing Ceph single-node operation... "
temp_monmap="$( ssh ${lone_node} mktemp )"
ssh ${lone_node} "sudo systemctl stop ceph-mon@${lone_node_shortname}" &>/dev/null
ssh ${lone_node} "sudo ceph-mon -i ${lone_node_shortname} --extract-monmap ${temp_monmap}" &>/dev/null
ssh ${lone_node} "sudo cp ${temp_monmap} /etc/ceph/monmap.orig" &>/dev/null
mon_list="$( ssh ${lone_node} strings ${temp_monmap} | sort | uniq )"
for mon in ${mon_list}; do
if [[ ${mon} == ${lone_node_shortname} ]]; then
continue
fi
ssh ${lone_node} "sudo monmaptool ${temp_monmap} --rm ${mon}" &>/dev/null
done
ssh ${lone_node} "sudo ceph-mon -i ${lone_node_shortname} --inject-monmap ${temp_monmap}" &>/dev/null
ssh ${lone_node} "sudo systemctl start ceph-mon@${lone_node_shortname}" &>/dev/null
sleep 5
ssh ${lone_node} "sudo ceph osd set noout" &>/dev/null
echo "done."
echo -e "Restoration steps:"
echo -e " sudo systemctl stop ceph-mon@${lone_node_shortname}"
echo -e " sudo ceph-mon -i ${lone_node_shortname} --inject-monmap /etc/ceph/monmap.orig"
echo -e " sudo systemctl start ceph-mon@${lone_node_shortname}"
echo -e " sudo ceph osd unset noout"
echo -n "Allowing Zookeeper single-node operation... "
temp_zoocfg="$( ssh ${lone_node} mktemp )"
ssh ${lone_node} "sudo systemctl stop zookeeper"
ssh ${lone_node} "sudo awk -F '=|:' -v lone_node=${lone_node_shortname} '{
if ( \$1 ~ /^server/ ) {
if ( \$2 == lone_node ) {
print \$0
} else {
print \"#\" \$0
}
} else {
print \$0
}
}' /etc/zookeeper/conf/zoo.cfg > ${temp_zoocfg}"
ssh ${lone_node} "sudo mv /etc/zookeeper/conf/zoo.cfg /etc/zookeeper/conf/zoo.cfg.orig"
ssh ${lone_node} "sudo mv ${temp_zoocfg} /etc/zookeeper/conf/zoo.cfg"
ssh ${lone_node} "sudo systemctl start zookeeper"
echo "done."
echo -e "Restoration steps:"
echo -e " sudo systemctl stop zookeeper"
echo -e " sudo mv /etc/zookeeper/conf/zoo.cfg.orig /etc/zookeeper/conf/zoo.cfg"
echo -e " sudo systemctl start zookeeper"
ssh ${lone_node} "sudo systemctl stop ceph-mon@${lone_node_shortname}"
echo ""
ssh ${lone_node} "sudo pvc status 2>/dev/null"

View File

@@ -1,80 +0,0 @@
#!/usr/bin/env bash
# import_vm - Imports a VM to a PVC cluster from local files
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
set -o errexit
set -o pipefail
usage() {
echo -e "Import a VM to a PVC cluster from local files."
echo -e "Usage:"
echo -e " $0 <destination_cluster> <destination_pool> <vm_configuration_file> <vm_disk_file_1> [<vm_disk_file_2>] [...]"
echo -e ""
echo -e "Important information:"
echo -e " * At least one disk must be specified; all disks that are present in vm_configuration_file"
echo -e " should be specified, though this is not strictly required."
echo -e " * Do not switch the cluster primary coordinator while the script is running."
echo -e " * Ensure you have enough space on the destination cluster to store all VM disks."
}
fail() {
echo -e "$@"
exit 1
}
# Arguments
if [[ -z ${1} || -z ${2} || -z ${3} || -z ${4} ]]; then
usage
exit 1
fi
destination_cluster="${1}"; shift
destination_pool="${1}"; shift
vm_config_file="${1}"; shift
vm_disk_files=( ${@} )
# Verify the cluster is reachable
pvc -c ${destination_cluster} status &>/dev/null || fail "Specified destination_cluster is not accessible"
# Determine the connection IP
cluster_address="$( pvc cluster list 2>/dev/null | grep -i "^${destination_cluster}" | awk '{ print $2 }' )"
echo "Verification complete."
# Determine information about the VM from the config file
parse_xml_field() {
field="${1}"
line="$( grep -F "<${field}>" ${vm_config_file} )"
awk -F '>|<' '{ print $3 }' <<<"${line}"
}
vm_name="$( parse_xml_field name )"
echo "Importing VM ${vm_name}..."
pvc -c ${destination_cluster} vm define ${vm_config_file} 2>/dev/null
# Create the disks on the cluster
for disk_file in ${vm_disk_files[@]}; do
disk_file_basename="$( basename ${disk_file} )"
disk_file_ext="${disk_file_basename##*.}"
disk_file_name="$( basename ${disk_file_basename} .${disk_file_ext} )"
disk_file_size="$( stat --format="%s" ${disk_file} )"
echo "Importing disk ${disk_file_name}... "
pvc -c ${destination_cluster} storage volume add ${destination_pool} ${disk_file_name} ${disk_file_size}B 2>/dev/null
pvc -c ${destination_cluster} storage volume upload ${destination_pool} ${disk_file_name} ${disk_file} 2>/dev/null
done

View File

@@ -1,115 +0,0 @@
#!/usr/bin/env bash
# migrate_vm - Exports a VM from a PVC cluster to another PVC cluster
# Part of the Parallel Virtual Cluster (PVC) system
#
# Copyright (C) 2018-2022 Joshua M. Boniface <joshua@boniface.me>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, version 3.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
###############################################################################
set -o errexit
set -o pipefail
usage() {
echo -e "Export a VM from a PVC cluster to another PVC cluster."
echo -e "Usage:"
echo -e " $0 <vm> <source_cluster> <destination_cluster> <destination_pool>"
echo -e ""
echo -e "Important information:"
echo -e " * The local user must have valid SSH access to the primary coordinator in the source_cluster."
echo -e " * The user on the cluster primary coordinator must have 'sudo' access."
echo -e " * If the VM is not in 'stop' state, it will be shut down."
echo -e " * Do not switch the cluster primary coordinator on either cluster while the script is running."
echo -e " * Ensure you have enough space on the target cluster to store all VM disks."
}
fail() {
echo -e "$@"
exit 1
}
# Arguments
if [[ -z ${1} || -z ${2} || -z ${3} || -z ${4} ]]; then
usage
exit 1
fi
source_vm="${1}"
source_cluster="${2}"
destination_cluster="${3}"
destination_pool="${4}"
# Verify each cluster is reachable
pvc -c ${source_cluster} status &>/dev/null || fail "Specified source_cluster is not accessible"
pvc -c ${destination_cluster} status &>/dev/null || fail "Specified destination_cluster is not accessible"
# Determine the connection IPs
source_cluster_address="$( pvc cluster list 2>/dev/null | grep -i "^${source_cluster}" | awk '{ print $2 }' )"
destination_cluster_address="$( pvc cluster list 2>/dev/null | grep -i "^${destination_cluster}" | awk '{ print $2 }' )"
# Attempt to connect to the cluster addresses
ssh ${source_cluster_address} which pvc &>/dev/null || fail "Could not SSH to source_cluster primary coordinator host"
ssh ${destination_cluster_address} which pvc &>/dev/null || fail "Could not SSH to destination_cluster primary coordinator host"
# Verify that the VM exists
pvc -c ${source_cluster} vm info ${source_vm} &>/dev/null || fail "Specified VM is not present on the source cluster"
echo "Verification complete."
# Shut down the VM
echo -n "Shutting down VM..."
set +o errexit
pvc -c ${source_cluster} vm shutdown ${source_vm} &>/dev/null
shutdown_success=$?
while ! pvc -c ${source_cluster} vm info ${source_vm} 2>/dev/null | grep '^State' | grep -q -E 'stop|disable'; do
sleep 1
echo -n "."
done
set -o errexit
echo " done."
tempfile="$( mktemp )"
# Dump the XML file
echo -n "Exporting VM configuration file from source cluster... "
pvc -c ${source_cluster} vm dump ${source_vm} 1> ${tempfile} 2>/dev/null
echo "done."
# Import the XML file
echo -n "Importing VM configuration file to destination cluster... "
pvc -c ${destination_cluster} vm define ${tempfile}
echo "done."
rm -f ${tempfile}
# Determine the list of volumes in this VM
volume_list="$( pvc -c ${source_cluster} vm info --long ${source_vm} 2>/dev/null | grep -w 'rbd' | awk '{ print $3 }' )"
# Parse and migrate each volume
for volume in ${volume_list}; do
volume_pool="$( awk -F '/' '{ print $1 }' <<<"${volume}" )"
volume_name="$( awk -F '/' '{ print $2 }' <<<"${volume}" )"
volume_size="$( pvc -c ${source_cluster} storage volume list -p ${volume_pool} ${volume_name} 2>/dev/null | grep "^${volume_name}" | awk '{ print $3 }' )"
echo "Transferring disk ${volume_name} (${volume_size})... "
pvc -c ${destination_cluster} storage volume add ${destination_pool} ${volume_name} ${volume_size} 2>/dev/null
ssh ${source_cluster_address} sudo rbd map ${volume_pool}/${volume_name} &>/dev/null || fail "Failed to map volume ${volume} on source cluster"
ssh ${destination_cluster_address} sudo rbd map ${volume_pool}/${volume_name} &>/dev/null || fail "Failed to map volume ${volume} on destination cluster"
ssh ${source_cluster_address} sudo dd if="/dev/rbd/${volume_pool}/${volume_name}" bs=1M 2>/dev/null | pv | ssh ${destination_cluster_address} sudo dd bs=1M of="/dev/rbd/${destination_pool}/${volume_name}" 2>/dev/null
ssh ${source_cluster_address} sudo rbd unmap ${volume_pool}/${volume_name} &>/dev/null || fail "Failed to unmap volume ${volume} on source cluster"
ssh ${destination_cluster_address} sudo rbd unmap ${volume_pool}/${volume_name} &>/dev/null || fail "Failed to unmap volume ${volume} on destination cluster"
done
if [[ ${shutdown_success} -eq 0 ]]; then
pvc -c ${destination_cluster} vm start ${source_vm}
fi

View File

@@ -1,20 +0,0 @@
from setuptools import setup
setup(
name="pvc",
version="0.9.63",
packages=["pvc", "pvc.lib"],
install_requires=[
"Click",
"PyYAML",
"lxml",
"colorama",
"requests",
"requests-toolbelt",
],
entry_points={
"console_scripts": [
"pvc = pvc.pvc:cli",
],
},
)

View File

@@ -0,0 +1,52 @@
---
# Root level configuration key
autobackup:
# Backup root path on the node, used as the remote mountpoint
# Must be an absolute path beginning with '/'
# If remote_mount is enabled, the remote mount will be mounted on this directory
# If remote_mount is enabled, it is recommended to use a path under `/tmp` for this
# If remote_mount is disabled, a real filesystem must be mounted here (PVC system volumes are small!)
backup_root_path: "/tmp/backups"
# Suffix to the backup root path, used to allow multiple PVC systems to write to a single root path
# Must begin with '/'; leave empty to use the backup root path directly
# Note that most remote mount options can fake this if needed, but it is provided to ensure local compatibility
backup_root_suffix: "/mycluster"
# VM tag(s) to back up
# Only VMs with at least one of the given tag(s) will be backed up; all others will be skipped
backup_tags:
- "backup"
- "mytag"
# Backup schedule: when and what format to take backups
backup_schedule:
full_interval: 7 # Number of total backups between full backups; others are incremental
# > If this number is 1, every backup will be a full backup and no incremental
# backups will be taken
# > If this number is 2, every second backup will be a full backup, etc.
full_retention: 2 # Keep this many full backups; the oldest will be deleted when a new one is
# taken, along with all child incremental backups of that backup
# > Should usually be at least 2 when using incrementals (full_interval > 1) to
# avoid there being too few backups after cleanup from a new full backup
# Automatic mount settings
# These settings permit running an arbitrary set of commands, ideally a "mount" command or similar, to
# ensure that a remote filesystem is mounted on the backup root path
# While the examples here show absolute paths, that is not required; they will run with the $PATH of the
# executing environment (either the "pvc" command on a CLI or a cron/systemd timer)
# A "{backup_root_path}" f-string/str.format type variable MAY be present in any cmds string to represent
# the above configured root backup path, which is interpolated at runtime
# If multiple commands are given, they will be executed in the order given; if no commands are given,
# nothing is executed, but the keys MUST be present
auto_mount:
enabled: no # Enable automatic mount/unmount support
# These commands are executed at the start of the backup run and should mount a filesystem
mount_cmds:
# This example shows an NFS mount leveraging the backup_root_path variable
- "/usr/sbin/mount.nfs -o nfsvers=3 10.0.0.10:/backups {backup_root_path}"
# These commands are executed at the end of the backup run and should unmount a filesystem
unmount_cmds:
# This example shows a generic umount leveraging the backup_root_path variable
- "/usr/bin/umount {backup_root_path}"
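As a sketch of how the {backup_root_path} placeholder in these command strings is interpolated at runtime (mirroring the str.format behavior described in the comments above; values assumed):

backup_root_path = "/tmp/backups"  # from backup_root_path above
mount_cmds = [
    "/usr/sbin/mount.nfs -o nfsvers=3 10.0.0.10:/backups {backup_root_path}",
]

resolved_cmds = []
for cmd in mount_cmds:
    if "{backup_root_path}" in cmd:
        # str.format-style interpolation of the configured root path
        cmd = cmd.format(backup_root_path=backup_root_path)
    resolved_cmds.append(cmd)

# resolved_cmds[0]:
#   "/usr/sbin/mount.nfs -o nfsvers=3 10.0.0.10:/backups /tmp/backups"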

View File

@@ -25,9 +25,8 @@ from functools import wraps
from json import dump as jdump
from json import dumps as jdumps
from json import loads as jloads
from os import environ, makedirs, path
from pkg_resources import get_distribution
from lxml.etree import fromstring, tostring
from os import environ, makedirs, path
from re import sub, match
from yaml import load as yload
from yaml import SafeLoader as SafeYAMLLoader
@@ -48,34 +47,17 @@ import click
###############################################################################
# Context and completion handler
# Context and completion handler, globals
###############################################################################
CONTEXT_SETTINGS = dict(help_option_names=["-h", "--help"], max_content_width=120)
CONTEXT_SETTINGS = dict(
help_option_names=["-h", "--help"], max_content_width=MAX_CONTENT_WIDTH
)
IS_COMPLETION = True if environ.get("_PVC_COMPLETE", "") == "complete" else False
CLI_CONFIG = dict()
if not IS_COMPLETION:
cli_client_dir = environ.get("PVC_CLIENT_DIR", None)
home_dir = environ.get("HOME", None)
if cli_client_dir:
store_path = cli_client_dir
elif home_dir:
store_path = f"{home_dir}/.config/pvc"
else:
print(
"WARNING: No client or home configuration directory found; using /tmp instead"
)
store_path = "/tmp/pvc"
if not path.isdir(store_path):
makedirs(store_path)
if not path.isfile(f"{store_path}/{DEFAULT_STORE_FILENAME}"):
update_store(store_path, {"local": DEFAULT_STORE_DATA})
###############################################################################
# Local helper functions
@@ -115,6 +97,8 @@ def version(ctx, param, value):
if not value or ctx.resilient_parsing:
return
from pkg_resources import get_distribution
version = get_distribution("pvc").version
echo(CLI_CONFIG, f"Parallel Virtual Cluster CLI client version {version}")
ctx.exit()
@@ -1590,6 +1574,214 @@ def cli_vm_flush_locks(domain):
finish(retcode, retmsg)
###############################################################################
# > pvc vm backup
###############################################################################
@click.group(
name="backup",
short_help="Manage backups for PVC VMs.",
context_settings=CONTEXT_SETTINGS,
)
def cli_vm_backup():
"""
Manage backups of VMs in a PVC cluster.
"""
pass
###############################################################################
# > pvc vm backup create
###############################################################################
@click.command(name="create", short_help="Create a backup of a virtual machine.")
@connection_req
@click.argument("domain")
@click.argument("backup_path")
@click.option(
"-i",
"--incremental",
"incremental_parent",
default=None,
help="Perform an incremental volume backup from this parent backup datestring.",
)
@click.option(
"-r",
"--retain-snapshot",
"retain_snapshot",
is_flag=True,
default=False,
help="Retain volume snapshot for future incremental use (full only).",
)
def cli_vm_backup_create(domain, backup_path, incremental_parent, retain_snapshot):
"""
Create a backup of virtual machine DOMAIN to BACKUP_PATH on the cluster primary coordinator. DOMAIN may be a UUID or name.
BACKUP_PATH must be a valid absolute directory path on the cluster "primary" coordinator (see "pvc node list") allowing writes from the API daemon (normally running as "root"). The BACKUP_PATH should be a large storage volume, ideally a remotely mounted filesystem (e.g. NFS, SSHFS, etc.) or non-Ceph-backed disk; PVC does not manage this path, so configuring and maintaining it is up to the administrator.
The backup will export the VM configuration, metainfo, and a point-in-time snapshot of all attached RBD volumes, using a datestring-formatted backup name (i.e. YYYYMMDDHHMMSS).
The virtual machine DOMAIN may be running; because snapshots are used, the backup should be crash-consistent, but it will be in an unclean state, which must be considered when restoring from backups.
Incremental backups are possible by specifying the "-i"/"--incremental" option along with a source backup datestring. The snapshots from that source backup must have been retained using the "-r"/"--retain-snapshot" option. Retaining snapshots of incremental backups is not supported, as incremental backups cannot be chained.
Full backup volume images are sparse-allocated; however, for safety, it is recommended to consider their maximum allocated size when allocating space for the BACKUP_PATH. Incremental volume images are generally small but depend entirely on the rate of data change in each volume.
"""
echo(
CLI_CONFIG,
f"Backing up VM '{domain}'... ",
newline=False,
)
retcode, retmsg = pvc.lib.vm.vm_backup(
CLI_CONFIG, domain, backup_path, incremental_parent, retain_snapshot
)
if retcode:
echo(CLI_CONFIG, "done.")
else:
echo(CLI_CONFIG, "failed.")
finish(retcode, retmsg)
###############################################################################
# > pvc vm backup restore
###############################################################################
@click.command(name="restore", short_help="Restore a backup of a virtual machine.")
@connection_req
@click.argument("domain")
@click.argument("backup_datestring")
@click.argument("backup_path")
@click.option(
"-r/-R",
"--retain-snapshot/--remove-snapshot",
"retain_snapshot",
is_flag=True,
default=True,
help="Retain or remove restored (parent, if incremental) snapshot.",
)
def cli_vm_backup_restore(domain, backup_datestring, backup_path, retain_snapshot):
"""
Restore the backup BACKUP_DATESTRING of virtual machine DOMAIN stored in BACKUP_PATH on the cluster primary coordinator. DOMAIN may be a UUID or name.
BACKUP_PATH must be a valid absolute directory path on the cluster "primary" coordinator (see "pvc node list") allowing reads from the API daemon (normally running as "root"). The BACKUP_PATH should be a large storage volume, ideally a remotely mounted filesystem (e.g. NFS, SSHFS, etc.) or non-Ceph-backed disk; PVC does not manage this path, so configuring and maintaining it is up to the administrator.
The restore will import the VM configuration, metainfo, and the point-in-time snapshot of all attached RBD volumes. Incremental backups will be automatically handled.
A VM named DOMAIN or with the same UUID must not exist; if a VM with the same name or UUID already exists, it must be removed, or renamed and then undefined (to preserve volumes), before restoring.
If the "-r"/"--retain-snapshot" option is specified (the default), for incremental restores, only the parent snapshot is kept; for full restores, the restored snapshot is kept. If the "-R"/"--remove-snapshot" option is specified, the imported snapshot is removed.
WARNING: The "-R"/"--remove-snapshot" option will invalidate any existing incremental backups based on the same incremental parent for the restored VM.
"""
echo(
CLI_CONFIG,
f"Restoring backup {backup_datestring} of VM '{domain}'... ",
newline=False,
)
retcode, retmsg = pvc.lib.vm.vm_restore(
CLI_CONFIG, domain, backup_path, backup_datestring, retain_snapshot
)
if retcode:
echo(CLI_CONFIG, "done.")
else:
echo(CLI_CONFIG, "failed.")
finish(retcode, retmsg)
###############################################################################
# > pvc vm backup remove
###############################################################################
@click.command(name="remove", short_help="Remove a backup of a virtual machine.")
@connection_req
@click.argument("domain")
@click.argument("backup_datestring")
@click.argument("backup_path")
def cli_vm_backup_remove(domain, backup_datestring, backup_path):
"""
Remove the backup BACKUP_DATESTRING, including snapshots, of virtual machine DOMAIN stored in BACKUP_PATH on the cluster primary coordinator. DOMAIN may be a UUID or name.
WARNING: Removing an incremental parent will invalidate any existing incremental backups based on that backup.
"""
echo(
CLI_CONFIG,
f"Removing backup {backup_datestring} of VM '{domain}'... ",
newline=False,
)
retcode, retmsg = pvc.lib.vm.vm_remove_backup(
CLI_CONFIG, domain, backup_path, backup_datestring
)
if retcode:
echo(CLI_CONFIG, "done.")
else:
echo(CLI_CONFIG, "failed.")
finish(retcode, retmsg)
###############################################################################
# > pvc vm autobackup
###############################################################################
@click.command(
name="autobackup", short_help="Perform automatic virtual machine backups."
)
@connection_req
@click.option(
"-f",
"--configuration",
"autobackup_cfgfile",
envvar="PVC_AUTOBACKUP_CFGFILE",
default=DEFAULT_AUTOBACKUP_FILENAME,
show_default=True,
help="Override default config file location.",
)
@click.option(
"--force-full",
"force_full_flag",
default=False,
is_flag=True,
help="Force all backups to be full backups this run.",
)
@click.option(
"--cron",
"cron_flag",
default=False,
is_flag=True,
help="Cron mode; don't error exit if this isn't the primary coordinator.",
)
def cli_vm_autobackup(autobackup_cfgfile, force_full_flag, cron_flag):
"""
Perform automated backups of VMs, with integrated cleanup and full/incremental scheduling.
This command enables automatic backup of PVC VMs at the block level, leveraging the various "pvc vm backup"
functions with an internal retention and cleanup system as well as determination of full vs. incremental
backups at different intervals. VMs are selected based on configured VM tags. The destination storage
may either be local, or provided by a remote filesystem which is automatically mounted and unmounted via
a set of configured commands before and after the backup run.
NOTE: This command performs its tasks in a local context. It MUST be run from the cluster's active primary
coordinator using the "local" connection only; if either is not correct, the command will error.
NOTE: This command should be run as the same user as the API daemon, usually "root" with "sudo -E" or in
a cronjob as "root", to ensure permissions are correct on the backup files. Failure to do so will still take
the backup, but the state update write will likely fail and the backup will become untracked. If the command
is not running as "root", it will prompt for confirmation, and this prompt cannot be bypassed.
This command should be run from cron or a timer at a regular interval (e.g. daily, hourly, etc.) which defines
how often backups are taken. Backup format (full/incremental) and retention is based only on the number of
recorded backups, not on the time interval between them. Backups taken manually outside of the "autobackup"
command are not counted towards the format or retention of autobackups.
The PVC_AUTOBACKUP_CFGFILE envvar or "-f"/"--configuration" option can be used to override the default
configuration file path if required by a particular run. For full details of the possible options, please
see the example configuration file at "/usr/share/pvc/autobackup.sample.yaml".
The "--force-full" option can be used to force all configured VMs to perform a "full" level backup this run,
which can help synchronize the backups of existing VMs with new ones.
"""
# All work here is done in the helper function for portability; we don't even use "finish"
vm_autobackup(CLI_CONFIG, autobackup_cfgfile, force_full_flag, cron_flag)
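To make the count-based scheduling concrete, here is a small sketch of the full-versus-incremental decision, mirroring the selection logic used internally (the configuration value is assumed):

full_interval = 7  # from backup_schedule.full_interval

def choose_backup_type(tracked_backups, force_full=False):
    """Decide the next backup type from the newest-first tracked list."""
    if force_full:
        return "forced-full"
    full_backups = [b for b in tracked_backups if b["type"] == "full"]
    if not full_backups:
        # The very first backup must be full to start the tree
        return "full"
    # How many backups have been taken since the most recent full one?
    last_full_idx = tracked_backups.index(full_backups[0])
    if last_full_idx >= full_interval - 1:
        return "full"
    return "incremental"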
###############################################################################
# > pvc vm tag
###############################################################################
@@ -3457,14 +3649,14 @@ def cli_storage_pool():
show_default=True,
required=False,
help="""
The replication configuration, specifying both a "copies" and "mincopies" value, separated by a comma, e.g. "copies=3,mincopies=2". The "copies" value specifies the total number of replicas and should not exceed the total number of nodes; the "mincopies" value specifies the minimum number of available copies to allow writes. For additional details please see the Cluster Architecture documentation.
The replication configuration, specifying both a "copies" and "mincopies" value, separated by a comma, e.g. "copies=3,mincopies=2". The "copies" value specifies the total number of replicas and the "mincopies" value specifies the minimum number of active replicas to allow I/O. For additional details please see the documentation.
""",
)
def cli_storage_pool_add(name, pgs, tier, replcfg):
"""
Add a new Ceph RBD pool with name NAME and PGS placement groups.
The placement group count must be a non-zero power of 2.
The placement group count must be a non-zero power of 2. Generally you should choose a PGS number such that there will be 50-150 PGs on each OSD in a single node (before replicas); 64, 128, or 256 are good values for small clusters (1-5 OSDs per node); higher values are recommended for higher node or OSD counts. For additional details please see the documentation.
"""
retcode, retmsg = pvc.lib.storage.ceph_pool_add(
@@ -3503,9 +3695,9 @@ def cli_storage_pool_set_pgs(name, pgs):
"""
Set the placement groups (PGs) count for the pool NAME to PGS.
The placement group count must be a non-zero power of 2.
The placement group count must be a non-zero power of 2. Generally you should choose a PGS number such that there will be 50-150 PGs on each OSD in a single node (before replicas); 64, 128, or 256 are good values for small clusters (1-5 OSDs per node); higher values are recommended for higher node or OSD counts. For additional details please see the documentation.
Placement group counts may be increased or decreased as required though frequent alteration is not recommended.
Placement group counts may be increased or decreased as required though frequent alteration is not recommended. Placement group alterations are intensive operations on the storage cluster.
"""
retcode, retmsg = pvc.lib.storage.ceph_pool_set_pgs(CLI_CONFIG, name, pgs)
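As a rough worked example of the 50-150 PGs-per-OSD guideline above (cluster shape assumed for illustration):

# Assumed example: 3 nodes, 2 OSDs per node, replication copies=3
nodes = 3
osds_per_node = 2
copies = 3
pgs = 128

# Each PG stores "copies" replicas, spread across the cluster's OSDs:
total_osds = nodes * osds_per_node
pgs_per_osd = pgs * copies / total_osds
print(pgs_per_osd)  # 64.0 -- within the suggested 50-150 range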
@@ -5614,6 +5806,29 @@ def cli(
"""
global CLI_CONFIG
CLI_CONFIG["quiet"] = _quiet
CLI_CONFIG["silent"] = _silent
cli_client_dir = environ.get("PVC_CLIENT_DIR", None)
home_dir = environ.get("HOME", None)
if cli_client_dir:
store_path = cli_client_dir
elif home_dir:
store_path = f"{home_dir}/.config/pvc"
else:
echo(
CLI_CONFIG,
"WARNING: No client or home configuration directory found; using /tmp instead",
stderr=True,
)
store_path = "/tmp/pvc"
if not path.isdir(store_path):
makedirs(store_path)
if not path.isfile(f"{store_path}/{DEFAULT_STORE_FILENAME}"):
update_store(store_path, {"local": DEFAULT_STORE_DATA})
store_data = get_store(store_path)
# If the connection isn't in the store, mark it bad but pass the value
@@ -5659,6 +5874,11 @@ cli_vm.add_command(cli_vm_move)
cli_vm.add_command(cli_vm_migrate)
cli_vm.add_command(cli_vm_unmigrate)
cli_vm.add_command(cli_vm_flush_locks)
cli_vm_backup.add_command(cli_vm_backup_create)
cli_vm_backup.add_command(cli_vm_backup_restore)
cli_vm_backup.add_command(cli_vm_backup_remove)
cli_vm.add_command(cli_vm_backup)
cli_vm.add_command(cli_vm_autobackup)
cli_vm_tag.add_command(cli_vm_tag_get)
cli_vm_tag.add_command(cli_vm_tag_add)
cli_vm_tag.add_command(cli_vm_tag_remove)

View File

@@ -20,25 +20,33 @@
###############################################################################
from click import echo as click_echo
from click import progressbar
from click import progressbar, confirm
from datetime import datetime
from distutils.util import strtobool
from getpass import getuser
from json import load as jload
from json import dump as jdump
from os import chmod, environ, getpid, path
from os import chmod, environ, getpid, path, makedirs
from re import findall
from socket import gethostname
from subprocess import run, PIPE
from sys import argv
from syslog import syslog, openlog, closelog, LOG_AUTH
from time import sleep
from yaml import load as yload
from yaml import BaseLoader
from yaml import BaseLoader, SafeLoader
import pvc.lib.provisioner
import pvc.lib.vm
import pvc.lib.node
DEFAULT_STORE_DATA = {"cfgfile": "/etc/pvc/pvcapid.yaml"}
DEFAULT_STORE_FILENAME = "pvc.json"
DEFAULT_API_PREFIX = "/api/v1"
DEFAULT_NODE_HOSTNAME = gethostname().split(".")[0]
DEFAULT_AUTOBACKUP_FILENAME = "/etc/pvc/autobackup.yaml"
MAX_CONTENT_WIDTH = 120
def echo(config, message, newline=True, stderr=False):
@@ -65,10 +73,9 @@ def audit():
"""
args = argv
args[0] = "pvc"
pid = getpid()
openlog(facility=LOG_AUTH, ident=f"{args[0]}[{pid}]")
openlog(facility=LOG_AUTH, ident=f"{args[0].split('/')[-1]}[{pid}]")
syslog(
f"""client audit: command "{' '.join(args)}" by user {environ.get('USER', None)}"""
)
@@ -239,3 +246,322 @@ def wait_for_provisioner(CLI_CONFIG, task_id):
retdata = task_status.get("state") + ": " + task_status.get("status")
return retdata
def get_autobackup_config(CLI_CONFIG, cfgfile):
try:
config = dict()
with open(cfgfile) as fh:
backup_config = yload(fh, Loader=SafeLoader)["autobackup"]
config["backup_root_path"] = backup_config["backup_root_path"]
config["backup_root_suffix"] = backup_config["backup_root_suffix"]
config["backup_tags"] = backup_config["backup_tags"]
config["backup_schedule"] = backup_config["backup_schedule"]
config["auto_mount_enabled"] = backup_config["auto_mount"]["enabled"]
if config["auto_mount_enabled"]:
config["mount_cmds"] = list()
_mount_cmds = backup_config["auto_mount"]["mount_cmds"]
for _mount_cmd in _mount_cmds:
if "{backup_root_path}" in _mount_cmd:
_mount_cmd = _mount_cmd.format(
backup_root_path=backup_config["backup_root_path"]
)
config["mount_cmds"].append(_mount_cmd)
config["unmount_cmds"] = list()
_unmount_cmds = backup_config["auto_mount"]["unmount_cmds"]
for _unmount_cmd in _unmount_cmds:
if "{backup_root_path}" in _unmount_cmd:
_unmount_cmd = _unmount_cmd.format(
backup_root_path=backup_config["backup_root_path"]
)
config["unmount_cmds"].append(_unmount_cmd)
except FileNotFoundError:
echo(CLI_CONFIG, "ERROR: Specified backup configuration does not exist!")
exit(1)
except KeyError as e:
echo(CLI_CONFIG, f"ERROR: Backup configuration is invalid: {e}")
exit(1)
return config
def vm_autobackup(
CLI_CONFIG,
autobackup_cfgfile=DEFAULT_AUTOBACKUP_FILENAME,
force_full_flag=False,
cron_flag=False,
):
"""
Perform automatic backups of VMs based on an external config file.
"""
# Validate that we are running on the current primary coordinator of the 'local' cluster connection
real_connection = CLI_CONFIG["connection"]
CLI_CONFIG["connection"] = "local"
retcode, retdata = pvc.lib.node.node_info(CLI_CONFIG, DEFAULT_NODE_HOSTNAME)
if not retcode or retdata.get("coordinator_state") != "primary":
if cron_flag:
echo(
CLI_CONFIG,
"Current host is not the primary coordinator of the local cluster; running in cron mode, so exiting cleanly.",
)
exit(0)
else:
echo(
CLI_CONFIG,
f"ERROR: Current host is not the primary coordinator of the local cluster; got connection '{real_connection}', host '{DEFAULT_NODE_HOSTNAME}'.",
)
echo(
CLI_CONFIG,
"Autobackup MUST be run from the cluster active primary coordinator using the 'local' connection. See '-h'/'--help' for details.",
)
exit(1)
# Ensure we're running as root, or show a warning & confirmation
if getuser() != "root":
confirm(
"WARNING: You are not running this command as 'root'. This command should be run under the same user as the API daemon, which is usually 'root'. Are you sure you want to continue?",
prompt_suffix=" ",
abort=True,
)
# Load our YAML config
autobackup_config = get_autobackup_config(CLI_CONFIG, autobackup_cfgfile)
# Get a list of all VMs on the cluster
# We don't do tag filtering here, because we could match an arbitrary number of tags; instead, we
# parse the list after
retcode, retdata = pvc.lib.vm.vm_list(CLI_CONFIG, None, None, None, None, None)
if not retcode:
echo(CLI_CONFIG, f"ERROR: Failed to fetch VM list: {retdata}")
exit(1)
cluster_vms = retdata
# Parse the list to match tags; too complex for list comprehension alas
backup_vms = list()
for vm in cluster_vms:
vm_tag_names = [t["name"] for t in vm["tags"]]
matching_tags = (
True
if len(
set(vm_tag_names).intersection(set(autobackup_config["backup_tags"]))
)
> 0
else False
)
if matching_tags:
backup_vms.append(vm["name"])
if len(backup_vms) < 1:
echo(CLI_CONFIG, "Found no suitable VMs for autobackup.")
exit(0)
# Pretty print the names of the VMs we'll back up (to stderr)
maxnamelen = max([len(n) for n in backup_vms]) + 2
cols = 1
while (cols * maxnamelen + maxnamelen + 2) <= MAX_CONTENT_WIDTH:
cols += 1
rows = len(backup_vms) // cols
vm_list_rows = list()
for row in range(0, rows + 1):
row_start = row * cols
row_end = (row * cols) + cols
row_str = ""
for x in range(row_start, row_end):
if x < len(backup_vms):
row_str += "{:<{}}".format(backup_vms[x], maxnamelen)
vm_list_rows.append(row_str)
echo(CLI_CONFIG, f"Found {len(backup_vms)} suitable VM(s) for autobackup.")
echo(CLI_CONFIG, "Full VM list:", stderr=True)
echo(CLI_CONFIG, " {}".format("\n ".join(vm_list_rows)), stderr=True)
echo(CLI_CONFIG, "", stderr=True)
if autobackup_config["auto_mount_enabled"]:
# Execute each mount_cmds command in sequence
for cmd in autobackup_config["mount_cmds"]:
echo(
CLI_CONFIG,
f"Executing mount command '{cmd.split()[0]}'... ",
newline=False,
)
tstart = datetime.now()
ret = run(
cmd.split(),
stdout=PIPE,
stderr=PIPE,
)
tend = datetime.now()
ttot = tend - tstart
if ret.returncode != 0:
echo(
CLI_CONFIG,
f"failed. [{ttot.seconds}s]",
)
echo(
CLI_CONFIG,
f"Exiting; command reports: {ret.stderr.decode().strip()}",
)
exit(1)
else:
echo(CLI_CONFIG, f"done. [{ttot.seconds}s]")
# For each VM, perform the backup
for vm in backup_vms:
backup_suffixed_path = f"{autobackup_config['backup_root_path']}{autobackup_config['backup_root_suffix']}"
if not path.exists(backup_suffixed_path):
makedirs(backup_suffixed_path)
backup_path = f"{backup_suffixed_path}/{vm}"
autobackup_state_file = f"{backup_path}/.autobackup.json"
if not path.exists(backup_path) or not path.exists(autobackup_state_file):
# No state file exists yet, so there are no tracked backups
state_data = dict()
tracked_backups = list()
else:
with open(autobackup_state_file) as fh:
state_data = jload(fh)
tracked_backups = state_data["tracked_backups"]
full_interval = autobackup_config["backup_schedule"]["full_interval"]
full_retention = autobackup_config["backup_schedule"]["full_retention"]
full_backups = [b for b in tracked_backups if b["type"] == "full"]
if len(full_backups) > 0:
last_full_backup = full_backups[0]
last_full_backup_idx = tracked_backups.index(last_full_backup)
if force_full_flag:
this_backup_type = "forced-full"
this_backup_incremental_parent = None
this_backup_retain_snapshot = True
elif last_full_backup_idx >= full_interval - 1:
this_backup_type = "full"
this_backup_incremental_parent = None
this_backup_retain_snapshot = True
else:
this_backup_type = "incremental"
this_backup_incremental_parent = last_full_backup["datestring"]
this_backup_retain_snapshot = False
else:
# The very first backup must be full to start the tree
this_backup_type = "full"
this_backup_incremental_parent = None
this_backup_retain_snapshot = True
# Perform the backup
echo(
CLI_CONFIG,
f"Backing up VM '{vm}' ({this_backup_type})... ",
newline=False,
)
tstart = datetime.now()
retcode, retdata = pvc.lib.vm.vm_backup(
CLI_CONFIG,
vm,
backup_suffixed_path,
incremental_parent=this_backup_incremental_parent,
retain_snapshot=this_backup_retain_snapshot,
)
tend = datetime.now()
ttot = tend - tstart
if not retcode:
echo(CLI_CONFIG, f"failed. [{ttot.seconds}s]")
echo(CLI_CONFIG, f"Skipping cleanups; command reports: {retdata}")
continue
else:
backup_datestring = findall(r"[0-9]{14}", retdata)[0]
echo(
CLI_CONFIG,
f"done. Backup '{backup_datestring}' created. [{ttot.seconds}s]",
)
# Read backup file to get details
backup_json_file = f"{backup_path}/{backup_datestring}/pvcbackup.json"
with open(backup_json_file) as fh:
backup_json = jload(fh)
backup = {
"datestring": backup_json["datestring"],
"type": backup_json["type"],
"parent": backup_json["incremental_parent"],
"retained_snapshot": backup_json["retained_snapshot"],
}
tracked_backups.insert(0, backup)
# Delete any full backups that are expired
marked_for_deletion = list()
found_full_count = 0
for backup in tracked_backups:
if backup["type"] == "full":
found_full_count += 1
if found_full_count > full_retention:
marked_for_deletion.append(backup)
# Delete any incremental backups that depend on marked parents
for backup in tracked_backups:
if backup["type"] == "incremental" and backup["parent"] in [
b["datestring"] for b in marked_for_deletion
]:
marked_for_deletion.append(backup)
# Execute deletes
for backup_to_delete in marked_for_deletion:
echo(
CLI_CONFIG,
f"Removing old VM '{vm}' backup '{backup_to_delete['datestring']}' ({backup_to_delete['type']})... ",
newline=False,
)
tstart = datetime.now()
retcode, retdata = pvc.lib.vm.vm_remove_backup(
CLI_CONFIG,
vm,
backup_suffixed_path,
backup_to_delete["datestring"],
)
tend = datetime.now()
ttot = tend - tstart
if not retcode:
echo(CLI_CONFIG, f"failed. [{ttot.seconds}s]")
echo(
CLI_CONFIG,
f"Skipping removal from tracked backups; command reports: {retdata}",
)
continue
else:
tracked_backups.remove(backup_to_delete)
echo(CLI_CONFIG, f"done. [{ttot.seconds}s]")
# Update tracked state information
state_data["tracked_backups"] = tracked_backups
with open(autobackup_state_file, "w") as fh:
jdump(state_data, fh)
if autobackup_config["auto_mount_enabled"]:
# Execute each unmount_cmds command in sequence
for cmd in autobackup_config["unmount_cmds"]:
echo(
CLI_CONFIG,
f"Executing unmount command '{cmd.split()[0]}'... ",
newline=False,
)
tstart = datetime.now()
ret = run(
cmd.split(),
stdout=PIPE,
stderr=PIPE,
)
tend = datetime.now()
ttot = tend - tstart
if ret.returncode != 0:
echo(
CLI_CONFIG,
f"failed. [{ttot.seconds}s]",
)
echo(
CLI_CONFIG,
f"Continuing; command reports: {ret.stderr.decode().strip()}",
)
else:
echo(CLI_CONFIG, f"done. [{ttot.seconds}s]")
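For reference, the per-VM state file written above (BACKUP_PATH/<vm>/.autobackup.json) ends up shaped roughly like this (illustrative values):

# Illustrative .autobackup.json contents; entries are newest-first
state_data = {
    "tracked_backups": [
        {
            "datestring": "20231027000000",
            "type": "incremental",
            "parent": "20231026000000",
            "retained_snapshot": False,
        },
        {
            "datestring": "20231026000000",
            "type": "full",
            "parent": None,
            "retained_snapshot": True,
        },
    ]
}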

View File

@@ -433,6 +433,70 @@ def vm_locks(config, vm):
return retstatus, response.json().get("message", "")
def vm_backup(config, vm, backup_path, incremental_parent=None, retain_snapshot=False):
"""
Create a backup of {vm} and its volumes to a local primary coordinator filesystem path
API endpoint: POST /vm/{vm}/backup
API arguments: backup_path={backup_path}, incremental_parent={incremental_parent}, retain_snapshot={retain_snapshot}
API schema: {"message":"{data}"}
"""
params = {
"backup_path": backup_path,
"incremental_parent": incremental_parent,
"retain_snapshot": retain_snapshot,
}
response = call_api(config, "post", "/vm/{vm}/backup".format(vm=vm), params=params)
if response.status_code != 200:
return False, response.json().get("message", "")
else:
return True, response.json().get("message", "")
def vm_remove_backup(config, vm, backup_path, backup_datestring):
"""
Remove a backup of {vm}, including snapshots, from a local primary coordinator filesystem path
API endpoint: DELETE /vm/{vm}/backup
API arguments: backup_path={backup_path}, backup_datestring={backup_datestring}
API schema: {"message":"{data}"}
"""
params = {
"backup_path": backup_path,
"backup_datestring": backup_datestring,
}
response = call_api(
config, "delete", "/vm/{vm}/backup".format(vm=vm), params=params
)
if response.status_code != 200:
return False, response.json().get("message", "")
else:
return True, response.json().get("message", "")
def vm_restore(config, vm, backup_path, backup_datestring, retain_snapshot=False):
"""
Restore a backup of {vm} and its volumes from a local primary coordinator filesystem path
API endpoint: POST /vm/{vm}/restore
API arguments: backup_path={backup_path}, backup_datestring={backup_datestring}, retain_snapshot={retain_snapshot}
API schema: {"message":"{data}"}
"""
params = {
"backup_path": backup_path,
"backup_datestring": backup_datestring,
"retain_snapshot": retain_snapshot,
}
response = call_api(config, "post", "/vm/{vm}/restore".format(vm=vm), params=params)
if response.status_code != 200:
return False, response.json().get("message", "")
else:
return True, response.json().get("message", "")
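A usage sketch of these three wrappers together (the config object and datestring values are assumed for illustration):

# Take a full backup, retaining the snapshot for later incrementals
retcode, msg = vm_backup(config, "myvm", "/mnt/backups", retain_snapshot=True)

# Take an incremental backup against that parent (datestring assumed)
retcode, msg = vm_backup(
    config, "myvm", "/mnt/backups", incremental_parent="20231026000000"
)

# Restore a given backup, then remove it from the backup store
retcode, msg = vm_restore(config, "myvm", "/mnt/backups", "20231026000000")
retcode, msg = vm_remove_backup(config, "myvm", "/mnt/backups", "20231026000000")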
def vm_vcpus_set(config, vm, vcpus, topology, restart):
"""
Set the vCPU count of the VM with topology

View File

@@ -2,7 +2,7 @@ from setuptools import setup
setup(
name="pvc",
version="0.9.78",
version="0.9.80",
packages=["pvc.cli", "pvc.lib"],
install_requires=[
"Click",

View File

@@ -1113,23 +1113,24 @@ def getCephSnapshots(zkhandler, pool, volume):
return snapshot_list
def add_snapshot(zkhandler, pool, volume, name):
def add_snapshot(zkhandler, pool, volume, name, zk_only=False):
if not verifyVolume(zkhandler, pool, volume):
return False, 'ERROR: No volume with name "{}" is present in pool "{}".'.format(
volume, pool
)
# 1. Create the snapshot
retcode, stdout, stderr = common.run_os_command(
"rbd snap create {}/{}@{}".format(pool, volume, name)
)
if retcode:
return (
False,
'ERROR: Failed to create RBD snapshot "{}" of volume "{}" in pool "{}": {}'.format(
name, volume, pool, stderr
),
if not zk_only:
retcode, stdout, stderr = common.run_os_command(
"rbd snap create {}/{}@{}".format(pool, volume, name)
)
if retcode:
return (
False,
'ERROR: Failed to create RBD snapshot "{}" of volume "{}" in pool "{}": {}'.format(
name, volume, pool, stderr
),
)
# 2. Add the snapshot to Zookeeper
zkhandler.write(

View File

@@ -146,7 +146,11 @@ def run_os_daemon(command_string, environment=None, logfile=None):
# Run a local OS command via shell
#
def run_os_command(command_string, background=False, environment=None, timeout=None):
command = shlex_split(command_string)
if not isinstance(command_string, list):
command = shlex_split(command_string)
else:
command = command_string
if background:
def runcmd():
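With this change, callers can pass the command either way; a small sketch of the distinction (the command shown is illustrative):

# String form: split with shlex, so shell-style quoting is honored
run_os_command('rbd snap create mypool/myvol@backup_20231026000000')

# List form: passed through as-is, useful when arguments are pre-tokenized
run_os_command(["rbd", "snap", "create", "mypool/myvol@backup_20231026000000"])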

View File

@@ -21,12 +21,18 @@
import time
import re
import os.path
import lxml.objectify
import lxml.etree
from distutils.util import strtobool
from uuid import UUID
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime
from distutils.util import strtobool
from json import dump as jdump
from json import load as jload
from shutil import rmtree
from socket import gethostname
from uuid import UUID
import daemon_lib.common as common
@@ -1175,13 +1181,15 @@ def get_info(zkhandler, domain):
return True, domain_information
def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
if node:
def get_list(
zkhandler, node=None, state=None, tag=None, limit=None, is_fuzzy=True, negate=False
):
if node is not None:
# Verify node is valid
if not common.verifyNode(zkhandler, node):
return False, 'Specified node "{}" is invalid.'.format(node)
if state:
if state is not None:
valid_states = [
"start",
"restart",
@@ -1200,7 +1208,7 @@ def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
full_vm_list.sort()
# Set our limit to a sensible regex
if limit:
if limit is not None:
# Check if the limit is a UUID
is_limit_uuid = False
try:
@@ -1229,7 +1237,7 @@ def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
is_state_match = False
# Check on limit
if limit:
if limit is not None:
# Try to match the limit against the UUID (if applicable) and name
try:
if is_limit_uuid and re.fullmatch(limit, vm):
@@ -1241,7 +1249,7 @@ def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
else:
is_limit_match = True
if tag:
if tag is not None:
vm_tags = zkhandler.children(("domain.meta.tags", vm))
if negate and tag not in vm_tags:
is_tag_match = True
@@ -1251,7 +1259,7 @@ def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
is_tag_match = True
# Check on node
if node:
if node is not None:
vm_node = zkhandler.read(("domain.node", vm))
if negate and vm_node != node:
is_node_match = True
@@ -1261,7 +1269,7 @@ def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
is_node_match = True
# Check on state
if state:
if state is not None:
vm_state = zkhandler.read(("domain.state", vm))
if negate and vm_state != state:
is_state_match = True
@@ -1297,3 +1305,541 @@ def get_list(zkhandler, node, state, tag, limit, is_fuzzy=True, negate=False):
pass
return True, sorted(vm_data_list, key=lambda d: d["name"])
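The shift from truthiness checks to explicit "is not None" checks above matters when a caller deliberately passes an empty value; a minimal illustration (names assumed):

limit = ""  # caller explicitly passed an empty limit pattern

if limit:              # old behavior: empty string treated as "no limit given"
    pass               # never reached for ""

if limit is not None:  # new behavior: only a missing argument is skipped
    pass               # reached for "", so the (empty) limit is applied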
def backup_vm(
zkhandler, domain, backup_path, incremental_parent=None, retain_snapshot=False
):
tstart = time.time()
# 0. Validations
# Disallow retaining snapshots with an incremental parent
if incremental_parent is not None and retain_snapshot:
return (
False,
"ERROR: Retaining snapshots of incremental backups is not supported!",
)
# Validate that VM exists in cluster
dom_uuid = getDomainUUID(zkhandler, domain)
if not dom_uuid:
return False, 'ERROR: Could not find VM "{}" in the cluster!'.format(domain)
# Validate that the target path is valid
if not re.match(r"^/", backup_path):
return (
False,
f"ERROR: Target path {backup_path} is not a valid absolute path on the primary coordinator!",
)
# Ensure that backup_path (on this node) exists
if not os.path.isdir(backup_path):
return False, f"ERROR: Target path {backup_path} does not exist!"
# 1. Get information about VM
vm_detail = get_list(zkhandler, limit=dom_uuid, is_fuzzy=False)[1][0]
if not isinstance(vm_detail, dict):
return False, f"ERROR: VM listing returned invalid data: {vm_detail}"
vm_volumes = list()
for disk in vm_detail["disks"]:
if disk["type"] != "rbd":
continue
pool, volume = disk["name"].split("/")
retcode, retdata = ceph.get_list_volume(zkhandler, pool, volume, is_fuzzy=False)
if not retcode or len(retdata) != 1:
if len(retdata) < 1:
retdata = "No volumes returned."
elif len(retdata) > 1:
retdata = "Multiple volumes returned."
return (
False,
f"ERROR: Failed to get volume details for {pool}/{volume}: {retdata}",
)
try:
size = retdata[0]["stats"]["size"]
except Exception as e:
return False, f"ERROR: Failed to get volume size for {pool}/{volume}: {e}"
vm_volumes.append((pool, volume, size))
# 2a. Validate that all volumes exist (they should, but just in case)
for pool, volume, _ in vm_volumes:
if not ceph.verifyVolume(zkhandler, pool, volume):
return (
False,
f"ERROR: VM defines a volume {pool}/{volume} which does not exist!",
)
# 2b. Validate that, if an incremental_parent is given, it is valid
# The incremental parent is just a datestring
if incremental_parent is not None:
for pool, volume, _ in vm_volumes:
if not ceph.verifySnapshot(
zkhandler, pool, volume, f"backup_{incremental_parent}"
):
return (
False,
f"ERROR: Incremental parent {incremental_parent} given, but no snapshots were found; cannot export an incremental backup.",
)
export_fileext = "rbddiff"
else:
export_fileext = "rbdimg"
# 2c. Validate that there's enough space on the target
# TODO
# 3. Set datestring in YYYYMMDDHHMMSS format
now = datetime.now()
datestring = now.strftime("%Y%m%d%H%M%S")
snapshot_name = f"backup_{datestring}"
# 4. Create destination directory
vm_target_root = f"{backup_path}/{domain}"
vm_target_backup = f"{backup_path}/{domain}/{datestring}/pvcdisks"
if not os.path.isdir(vm_target_backup):
try:
os.makedirs(vm_target_backup)
except Exception as e:
return False, f"ERROR: Failed to create backup directory: {e}"
# 5. Take a snapshot of each disk with the name @backup_{datestring}
is_snapshot_create_failed = False
which_snapshot_create_failed = list()
msg_snapshot_create_failed = list()
for pool, volume, _ in vm_volumes:
retcode, retmsg = ceph.add_snapshot(zkhandler, pool, volume, snapshot_name)
if not retcode:
is_snapshot_create_failed = True
which_snapshot_create_failed.append(f"{pool}/{volume}")
msg_snapshot_create_failed.append(retmsg)
if is_snapshot_create_failed:
for pool, volume, _ in vm_volumes:
if ceph.verifySnapshot(zkhandler, pool, volume, snapshot_name):
ceph.remove_snapshot(zkhandler, pool, volume, snapshot_name)
return (
False,
f'ERROR: Failed to create snapshot for volume(s) {", ".join(which_snapshot_create_failed)}: {", ".join(msg_snapshot_create_failed)}',
)
# 6. Dump snapshot to folder with `rbd export` (full) or `rbd export-diff` (incremental)
is_snapshot_export_failed = False
which_snapshot_export_failed = list()
msg_snapshot_export_failed = list()
for pool, volume, _ in vm_volumes:
if incremental_parent is not None:
incremental_parent_snapshot_name = f"backup_{incremental_parent}"
retcode, stdout, stderr = common.run_os_command(
f"rbd export-diff --from-snap {incremental_parent_snapshot_name} {pool}/{volume}@{snapshot_name} {vm_target_backup}/{pool}.{volume}.{export_fileext}"
)
if retcode:
is_snapshot_export_failed = True
which_snapshot_export_failed.append(f"{pool}/{volume}")
msg_snapshot_export_failed.append(stderr)
else:
retcode, stdout, stderr = common.run_os_command(
f"rbd export --export-format 2 {pool}/{volume}@{snapshot_name} {vm_target_backup}/{pool}.{volume}.{export_fileext}"
)
if retcode:
is_snapshot_export_failed = True
which_snapshot_export_failed.append(f"{pool}/{volume}")
msg_snapshot_export_failed.append(stderr)
if is_snapshot_export_failed:
for pool, volume, _ in vm_volumes:
if ceph.verifySnapshot(zkhandler, pool, volume, snapshot_name):
ceph.remove_snapshot(zkhandler, pool, volume, snapshot_name)
return (
False,
f'ERROR: Failed to export snapshot for volume(s) {", ".join(which_snapshot_export_failed)}: {", ".join(msg_snapshot_export_failed)}',
)
# 7. Create and dump VM backup information
backup_type = "incremental" if incremental_parent is not None else "full"
vm_backup = {
"type": backup_type,
"datestring": datestring,
"incremental_parent": incremental_parent,
"retained_snapshot": retain_snapshot,
"vm_detail": vm_detail,
"backup_files": [
(f"pvcdisks/{p}.{v}.{export_fileext}", s) for p, v, s in vm_volumes
],
}
with open(f"{vm_target_root}/{datestring}/pvcbackup.json", "w") as fh:
jdump(vm_backup, fh)
# 8. Remove snapshots if retain_snapshot is False
is_snapshot_remove_failed = False
which_snapshot_remove_failed = list()
msg_snapshot_remove_failed = list()
if not retain_snapshot:
for pool, volume, _ in vm_volumes:
if ceph.verifySnapshot(zkhandler, pool, volume, snapshot_name):
retcode, retmsg = ceph.remove_snapshot(
zkhandler, pool, volume, snapshot_name
)
if not retcode:
is_snapshot_remove_failed = True
which_snapshot_remove_failed.append(f"{pool}/{volume}")
msg_snapshot_remove_failed.append(retmsg)
tend = time.time()
ttot = round(tend - tstart, 2)
retlines = list()
if is_snapshot_remove_failed:
retlines.append(
f"WARNING: Failed to remove snapshot(s) as requested for volume(s) {', '.join(which_snapshot_remove_failed)}: {', '.join(msg_snapshot_remove_failed)}"
)
myhostname = gethostname().split(".")[0]
if retain_snapshot:
retlines.append(
f"Successfully backed up VM '{domain}' ({backup_type}@{datestring}, snapshots retained) to '{myhostname}:{backup_path}' in {ttot}s."
)
else:
retlines.append(
f"Successfully backed up VM '{domain}' ({backup_type}@{datestring}) to '{myhostname}:{backup_path}' in {ttot}s."
)
return True, "\n".join(retlines)
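Putting the steps above together, the on-disk layout produced for a VM ends up roughly like this (names assumed for illustration):

# Illustrative backup tree under BACKUP_PATH for VM "myvm":
#
#   /mnt/backups/myvm/
#       20231026000000/
#           pvcbackup.json                  # metadata written in step 7
#           pvcdisks/
#               vms.myvm_disk0.rbdimg       # full export (rbd export)
#       20231027000000/
#           pvcbackup.json
#           pvcdisks/
#               vms.myvm_disk0.rbddiff      # incremental (rbd export-diff)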
def remove_backup(zkhandler, domain, backup_path, datestring):
tstart = time.time()
# 0. Validation
# Validate that VM exists in cluster
dom_uuid = getDomainUUID(zkhandler, domain)
if not dom_uuid:
return False, 'ERROR: Could not find VM "{}" in the cluster!'.format(domain)
# Validate that the source path is valid
if not re.match(r"^/", backup_path):
return (
False,
f"ERROR: Source path {backup_path} is not a valid absolute path on the primary coordinator!",
)
# Ensure that backup_path (on this node) exists
if not os.path.isdir(backup_path):
return False, f"ERROR: Source path {backup_path} does not exist!"
# Ensure that domain path (on this node) exists
vm_backup_path = f"{backup_path}/{domain}"
if not os.path.isdir(vm_backup_path):
return False, f"ERROR: Source VM path {vm_backup_path} does not exist!"
# Ensure that the archives are present
backup_source_pvcbackup_file = f"{vm_backup_path}/{datestring}/pvcbackup.json"
if not os.path.isfile(backup_source_pvcbackup_file):
return False, "ERROR: The specified source backup files do not exist!"
backup_source_pvcdisks_path = f"{vm_backup_path}/{datestring}/pvcdisks"
if not os.path.isdir(backup_source_pvcdisks_path):
return False, "ERROR: The specified source backup files do not exist!"
# 1. Read the backup file and get VM details
try:
with open(backup_source_pvcbackup_file) as fh:
backup_source_details = jload(fh)
except Exception as e:
return False, f"ERROR: Failed to read source backup details: {e}"
# 2. Remove snapshots
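# Only backups taken with retain_snapshot left "backup_<datestring>" RBD
# snapshots behind, so consult the stored metadata before removing any.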
is_snapshot_remove_failed = False
which_snapshot_remove_failed = list()
msg_snapshot_remove_failed = list()
if backup_source_details["retained_snapshot"]:
for volume_file, _ in backup_source_details.get("backup_files"):
pool, volume, _ = volume_file.split("/")[-1].split(".")
snapshot = f"backup_{datestring}"
retcode, retmsg = ceph.remove_snapshot(zkhandler, pool, volume, snapshot)
if not retcode:
is_snapshot_remove_failed = True
which_snapshot_remove_failed.append(f"{pool}/{volume}")
msg_snapshot_remove_failed.append(retmsg)
# 3. Remove files
is_files_remove_failed = False
msg_files_remove_failed = None
try:
rmtree(f"{vm_backup_path}/{datestring}")
except Exception as e:
is_files_remove_failed = True
msg_files_remove_failed = e
tend = time.time()
ttot = round(tend - tstart, 2)
retlines = list()
if is_snapshot_remove_failed:
retlines.append(
f"WARNING: Failed to remove snapshot(s) as requested for volume(s) {', '.join(which_snapshot_remove_failed)}: {', '.join(msg_snapshot_remove_failed)}"
)
if is_files_remove_failed:
retlines.append(
f"WARNING: Failed to remove backup file(s) from {backup_path}: {msg_files_remove_failed}"
)
myhostname = gethostname().split(".")[0]
retlines.append(
f"Removed VM backup {datestring} for '{domain}' from '{myhostname}:{backup_path}' in {ttot}s."
)
return True, "\n".join(retlines)
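
# Restore a VM from a stored backup: redefine the VM from its saved metadata,
# re-import its RBD volumes (applying an incremental diff atop the parent's
# full images where applicable), and start it.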
def restore_vm(zkhandler, domain, backup_path, datestring, retain_snapshot=False):
tstart = time.time()
# 0. Validations
# Validate that VM does not exist in cluster
dom_uuid = getDomainUUID(zkhandler, domain)
if dom_uuid:
return (
False,
f'ERROR: VM "{domain}" already exists in the cluster! Remove or rename it before restoring a backup.',
)
# Validate that the source path is valid
if not re.match(r"^/", backup_path):
return (
False,
f"ERROR: Source path {backup_path} is not a valid absolute path on the primary coordinator!",
)
# Ensure that backup_path (on this node) exists
if not os.path.isdir(backup_path):
return False, f"ERROR: Source path {backup_path} does not exist!"
# Ensure that domain path (on this node) exists
vm_backup_path = f"{backup_path}/{domain}"
if not os.path.isdir(vm_backup_path):
return False, f"ERROR: Source VM path {vm_backup_path} does not exist!"
# Ensure that the archives are present
backup_source_pvcbackup_file = f"{vm_backup_path}/{datestring}/pvcbackup.json"
if not os.path.isfile(backup_source_pvcbackup_file):
return False, "ERROR: The specified source backup files do not exist!"
# 1. Read the backup file and get VM details
try:
with open(backup_source_pvcbackup_file) as fh:
backup_source_details = jload(fh)
except Exception as e:
return False, f"ERROR: Failed to read source backup details: {e}"
# Handle incrementals
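# An incremental backup contains only diffs, so the parent's pvcbackup.json is
# also required to locate the full images the diffs will be applied against.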
incremental_parent = backup_source_details.get("incremental_parent", None)
if incremental_parent is not None:
backup_source_parent_pvcbackup_file = (
f"{vm_backup_path}/{incremental_parent}/pvcbackup.json"
)
if not os.path.isfile(backup_source_parent_pvcbackup_file):
return (
False,
"ERROR: The specified backup is incremental but the required incremental parent source backup files do not exist!",
)
try:
with open(backup_source_parent_pvcbackup_file) as fh:
backup_source_parent_details = jload(fh)
except Exception as e:
return (
False,
f"ERROR: Failed to read source incremental parent backup details: {e}",
)
# 2. Import VM config and metadata in restore state
try:
retcode, retmsg = define_vm(
zkhandler,
backup_source_details["vm_detail"]["xml"],
backup_source_details["vm_detail"]["node"],
backup_source_details["vm_detail"]["node_limit"],
backup_source_details["vm_detail"]["node_selector"],
backup_source_details["vm_detail"]["node_autostart"],
backup_source_details["vm_detail"]["migration_method"],
backup_source_details["vm_detail"]["profile"],
backup_source_details["vm_detail"]["tags"],
"restore",
)
if not retcode:
return False, f"ERROR: Failed to define restored VM: {retmsg}"
except Exception as e:
return False, f"ERROR: Failed to parse VM backup details: {e}"
# 3. Import volumes
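# Incremental restore: import the parent's full image first, then apply this
# backup's diff on top of it. Full restore: import the image directly.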
is_snapshot_remove_failed = False
which_snapshot_remove_failed = list()
msg_snapshot_remove_failed = list()
if incremental_parent is not None:
for volume_file, volume_size in backup_source_details.get("backup_files"):
pool, volume, _ = volume_file.split("/")[-1].split(".")
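# Match this diff against the parent's corresponding full image by comparing
# base names with the extensions stripped (.rbdimg for full images, .rbddiff
# for incremental diffs, per the suffixes handled here).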
try:
parent_volume_file = [
f[0]
for f in backup_source_parent_details.get("backup_files")
if f[0].split("/")[-1].replace(".rbdimg", "")
== volume_file.split("/")[-1].replace(".rbddiff", "")
][0]
except Exception as e:
return (
False,
f"ERROR: Failed to find parent volume for volume {pool}/{volume}; backup may be corrupt or invalid: {e}",
)
# First we create the expected volume, then clean it up.
# This process is a bit of a hack: rbd import will not import over an existing
# volume, but PVC needs the volume registered in its metadata.
# Thus, create the RBD volume with ceph.add_volume using the backup's recorded
# size, then manually remove the RBD image itself (leaving the PVC metainfo).
retcode, retmsg = ceph.add_volume(zkhandler, pool, volume, volume_size)
if not retcode:
return False, f"ERROR: Failed to create restored volume: {retmsg}"
retcode, stdout, stderr = common.run_os_command(
f"rbd remove {pool}/{volume}"
)
if retcode:
return (
False,
f"ERROR: Failed to remove temporary RBD volume '{pool}/{volume}': {stderr}",
)
# Next we import the parent images
retcode, stdout, stderr = common.run_os_command(
f"rbd import --export-format 2 --dest-pool {pool} {backup_path}/{domain}/{incremental_parent}/{parent_volume_file} {volume}"
)
if retcode:
return (
False,
f"ERROR: Failed to import parent backup image {parent_volume_file}: {stderr}",
)
# Then we import the incremental diffs
retcode, stdout, stderr = common.run_os_command(
f"rbd import-diff {backup_path}/{domain}/{datestring}/{volume_file} {pool}/{volume}"
)
if retcode:
return (
False,
f"ERROR: Failed to import incremental backup image {volume_file}: {stderr}",
)
# Finally we remove the parent and child snapshots (no longer required)
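# (zk_only=True presumably registers the snapshot in PVC's Zookeeper metadata
# only; the RBD-level snapshot itself already exists inside the imported
# export-format 2 image.)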
if retain_snapshot:
retcode, retmsg = ceph.add_snapshot(
zkhandler,
pool,
volume,
f"backup_{incremental_parent}",
zk_only=True,
)
if not retcode:
return (
False,
f"ERROR: Failed to add imported image snapshot for {parent_volume_file}: {retmsg}",
)
else:
retcode, stdout, stderr = common.run_os_command(
f"rbd snap rm {pool}/{volume}@backup_{incremental_parent}"
)
if retcode:
is_snapshot_remove_failed = True
which_snapshot_remove_failed.append(f"{pool}/{volume}")
msg_snapshot_remove_failed.append(stderr)
retcode, stdout, stderr = common.run_os_command(
f"rbd snap rm {pool}/{volume}@backup_{datestring}"
)
if retcode:
is_snapshot_remove_failed = True
which_snapshot_remove_failed.append(f"{pool}/{volume}")
msg_snapshot_remove_failed.append(stderr)
else:
for volume_file, volume_size in backup_source_details.get("backup_files"):
pool, volume, _ = volume_file.split("/")[-1].split(".")
# First we create the expected volume, then clean it up.
# This process is a bit of a hack: rbd import will not import over an existing
# volume, but PVC needs the volume registered in its metadata.
# Thus, create the RBD volume with ceph.add_volume using the backup's recorded
# size, then manually remove the RBD image itself (leaving the PVC metainfo).
retcode, retmsg = ceph.add_volume(zkhandler, pool, volume, volume_size)
if not retcode:
return False, f"ERROR: Failed to create restored volume: {retmsg}"
retcode, stdout, stderr = common.run_os_command(
f"rbd remove {pool}/{volume}"
)
if retcode:
return (
False,
f"ERROR: Failed to remove temporary RBD volume '{pool}/{volume}': {stderr}",
)
# Then we perform the actual import
retcode, stdout, stderr = common.run_os_command(
f"rbd import --export-format 2 --dest-pool {pool} {backup_path}/{domain}/{datestring}/{volume_file} {volume}"
)
if retcode:
return (
False,
f"ERROR: Failed to import backup image {volume_file}: {stderr}",
)
# Finally we remove the source snapshot (no longer required)
if retain_snapshot:
retcode, retmsg = ceph.add_snapshot(
zkhandler,
pool,
volume,
f"backup_{datestring}",
zk_only=True,
)
if not retcode:
return (
False,
f"ERROR: Failed to add imported image snapshot for {volume_file}: {retmsg}",
)
else:
retcode, stdout, stderr = common.run_os_command(
f"rbd snap rm {pool}/{volume}@backup_{datestring}"
)
if retcode:
return (
False,
f"ERROR: Failed to remove imported image snapshot for {volume_file}: {stderr}",
)
# 4. Start VM
retcode, retmsg = start_vm(zkhandler, domain)
if not retcode:
return False, f"ERROR: Failed to start restored VM {domain}: {retmsg}"
tend = time.time()
ttot = round(tend - tstart, 2)
retlines = list()
if is_snapshot_remove_failed:
retlines.append(
f"WARNING: Failed to remove hanging snapshot(s) as requested for volume(s) {', '.join(which_snapshot_remove_failed)}: {', '.join(msg_snapshot_remove_failed)}"
)
myhostname = gethostname().split(".")[0]
retlines.append(
f"Successfully restored VM backup {datestring} for '{domain}' from '{myhostname}:{backup_path}' in {ttot}s."
)
return True, "\n".join(retlines)

debian/changelog

@@ -1,3 +1,25 @@
+pvc (0.9.80-0) unstable; urgency=high
+
+  * [CLI] Improves CLI performance by not loading "pkg_resources" until needed
+  * [CLI] Improves the output of the audit log (full command paths)
+  * [Node Daemon/API Daemon] Moves the sample YAML configurations to /usr/share/pvc instead of /etc/pvc and cleans up the old locations automatically
+  * [CLI] Adds VM autobackup functionality to automate VM backup/retention and scheduling
+  * [CLI] Handles the internal store in a better way to ensure CLI can be used as a module properly
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Fri, 27 Oct 2023 09:56:31 -0400
+
+pvc (0.9.79-0) unstable; urgency=high
+
+  **API Changes**: New endpoints /vm/{vm}/backup, /vm/{vm}/restore
+
+  * [CLI Client] Fixes some storage pool help text messages
+  * [Node Daemon] Increases the IPMI monitoring plugin timeout
+  * [All] Adds support for VM backups, including creation, removal, and restore
+  * [Repository] Fixes shebangs in scripts to be consistent
+  * [Daemon Library] Improves the handling of VM list arguments (default None)
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Tue, 24 Oct 2023 02:10:24 -0400
+
pvc (0.9.78-0) unstable; urgency=high

  * [API, Client CLI] Fixes several bugs around image uploads; adds a new query parameter for non-raw images


@@ -0,0 +1 @@
+client-cli/autobackup.sample.yaml usr/share/pvc


@@ -1,7 +1,7 @@
api-daemon/pvcapid.py usr/share/pvc
api-daemon/pvcapid-manage*.py usr/share/pvc
api-daemon/pvc-api-db-upgrade usr/share/pvc
-api-daemon/pvcapid.sample.yaml etc/pvc
+api-daemon/pvcapid.sample.yaml usr/share/pvc
api-daemon/pvcapid usr/share/pvc
api-daemon/pvcapid.service lib/systemd/system
api-daemon/pvcapid-worker.service lib/systemd/system


@@ -18,3 +18,6 @@ fi
if [ ! -f /etc/pvc/pvcapid.yaml ]; then
echo "NOTE: The PVC client API daemon (pvcapid.service) and the PVC provisioner worker daemon (pvcapid-worker.service) have not been started; create a config file at /etc/pvc/pvcapid.yaml, then run the database configuration (/usr/share/pvc/pvc-api-db-upgrade) and start them manually."
fi
+# Clean up any old sample configs
+rm /etc/pvc/pvcapid.sample.yaml || true


@@ -1,5 +1,5 @@
node-daemon/pvcnoded.py usr/share/pvc
-node-daemon/pvcnoded.sample.yaml etc/pvc
+node-daemon/pvcnoded.sample.yaml usr/share/pvc
node-daemon/pvcnoded usr/share/pvc
node-daemon/pvcnoded.service lib/systemd/system
node-daemon/pvc.target lib/systemd/system


@@ -14,3 +14,6 @@ if systemctl is-active --quiet pvcnoded.service; then
else
echo "NOTE: The PVC node daemon (pvcnoded.service) has not been started; create a config file at /etc/pvc/pvcnoded.yaml then start it."
fi
+# Clean up any old sample configs
+rm /etc/pvc/pvcnoded.sample.yaml || true

Binary file added (88 KiB)

Binary file added (41 KiB)

docs/images/pvc-nodelog.png: new binary file (300 KiB)

docs/images/pvc-nodes.png: new binary file (42 KiB)

Binary file added (49 KiB)


@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
# Generate the database migration files


@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
# Generate the Zookeeper migration files


@@ -76,7 +76,7 @@ class MonitoringPluginScript(MonitoringPlugin):
ipmi_password = self.config["ipmi_password"]
retcode, _, _ = run_os_command(
f"/usr/bin/ipmitool -I lanplus -H {ipmi_hostname} -U {ipmi_username} -P {ipmi_password} chassis power status",
-timeout=2
+timeout=5
)
if retcode > 0:


@@ -132,7 +132,7 @@ class MonitoringPluginScript(MonitoringPlugin):
for slave_interface in slave_interfaces:
if slave_interface[1] == 'up':
slave_interface_up_count += 1
-if slave_interface_up_count < 2:
+if slave_interface_up_count < len(slave_interfaces):
messages.append(f"{dev} DEGRADED with {slave_interface_up_count} active slaves")
health_delta += 10
else:


@@ -49,7 +49,7 @@ import re
import json
# Daemon version
version = "0.9.78"
version = "0.9.80"
##########################################################


@@ -77,5 +77,5 @@ def start_system_services(logger, config):
start_ceph_mon(logger, config)
start_ceph_mgr(logger, config)
logger.out("Waiting 3 seconds for daemons to start", state="s")
sleep(3)
logger.out("Waiting 10 seconds for daemons to start", state="s")
sleep(10)