Fix bad location of config sets

Also validate on failures
Bump version to 0.9.42
2021-10-12 17:23:04 -04:00 · 2021-10-12 17:11:03 -04:00 · 2021-10-12 15:25:42 -04:00 · 2021-10-12 14:21:52 -04:00 · 2021-10-12 12:24:03 -04:00 · 2021-10-12 11:04:27 -04:00
16 changed files with 159 additions and 124 deletions
--- a/.github/workflows/codeql-analysis.yml
+++ b/.github/workflows/codeql-analysis.yml
@@ -1,68 +0,0 @@
-# For most projects, this workflow file will not need changing; you simply need
-# to commit it to your repository.
-#
-# You may wish to alter this file to override the set of languages analyzed,
-# or to provide custom queries or build logic.
-#
-# ******** NOTE ********
-# We have attempted to detect the languages in your repository. Please check
-# the `language` matrix defined below to confirm you have the correct set of
-# supported CodeQL languages.
-# ******** NOTE ********
-
-name: "CodeQL"
-
-on:
-  push:
-    branches: [ master ]
-  pull_request:
-    # The branches below must be a subset of the branches above
-    branches: [ master ]
-  schedule:
-    - cron: '17 22 * * 2'
-
-jobs:
-  analyze:
-    name: Analyze
-    runs-on: ubuntu-latest
-
-    strategy:
-      fail-fast: false
-      matrix:
-        language: [ 'python' ]
-        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python' ]
-        # Learn more...
-        # https://docs.github.com/en/github/finding-security-vulnerabilities-and-errors-in-your-code/configuring-code-scanning#overriding-automatic-language-detection
-
-    steps:
-    - name: Checkout repository
-      uses: actions/checkout@v2
-
-    # Initializes the CodeQL tools for scanning.
-    - name: Initialize CodeQL
-      uses: github/codeql-action/init@v1
-      with:
-        languages: ${{ matrix.language }}
-        # If you wish to specify custom queries, you can do so here or in a config file.
-        # By default, queries listed here will override any specified in a config file.
-        # Prefix the list here with "+" to use these queries and those in the config file.
-        # queries: ./path/to/local/query, your-org/your-repo/queries@main
-
-    # Autobuild attempts to build any compiled languages  (C/C++, C#, or Java).
-    # If this step fails, then you should remove it and run the build manually (see below)
-    - name: Autobuild
-      uses: github/codeql-action/autobuild@v1
-
-    # ℹ️ Command-line programs to run using the OS shell.
-    # 📚 https://git.io/JvXDl
-
-    # ✏️ If the Autobuild fails above, remove it and uncomment the following three lines
-    #    and modify them (or add more) to build your code if your project
-    #    uses a compiled language
-
-    #- run: |
-    #   make bootstrap
-    #   make release
-
-    - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v1
--- a/.version
+++ b/.version
@@ -1 +1 @@
-0.9.41
+0.9.42
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,12 @@
 ## PVC Changelog

+###### [v0.9.42](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.42)
+
+  * [Documentation] Reworks and updates various documentation sections
+  * [Node Daemon] Adjusts the fencing process to use a power off rather than a power reset for maximum certainty
+  * [Node Daemon] Ensures that MTU values are validated during the first read too
+  * [Node Daemon] Corrects the loading of the bridge_mtu value to use the current active setting rather than a fixed default to prevent unintended surprises
+
 ###### [v0.9.41](https://github.com/parallelvirtualcluster/pvc/releases/tag/v0.9.41)

  * Fixes a bad conditional check in IPMI verification
--- a/api-daemon/pvcapid/Daemon.py
+++ b/api-daemon/pvcapid/Daemon.py
@@ -25,7 +25,7 @@ import yaml
 from distutils.util import strtobool as dustrtobool

 # Daemon version
-version = '0.9.41'
+version = '0.9.42'

 # API version
 API_VERSION = 1.0
--- a/build-and-deploy.sh
+++ b/build-and-deploy.sh
@@ -12,6 +12,18 @@ else
    SUDO="sudo"
 fi

+KEEP_ARTIFACTS=""
+if [[ -n ${1} ]]; then
+    for arg in "${@}"; do
+        case ${arg} in
+            -k|--keep)
+                KEEP_ARTIFACTS="y"
+                shift
+            ;;
+        esac
+    done
+fi
+
 echo -n "> Linting code for errors... "
 ./lint || exit

@@ -48,3 +60,6 @@ for HOST in ${HOSTS[@]}; do
    sleep 30
    echo "done."
 done
+if [[ -z ${KEEP_ARTIFACTS} ]]; then
+    rm ../pvc*_${version}*
+fi
--- a/client-cli/pvc/cli_lib/common.py
+++ b/client-cli/pvc/cli_lib/common.py
@@ -124,6 +124,9 @@ def call_api(config, operation, request_uri, headers={}, params=None, data=None,
        request_uri
    )

+    # Default timeout is 3 seconds
+    timeout = 3
+
    # Craft the authentication header if required
    if config['api_key']:
        headers['X-Api-Key'] = config['api_key']
@@ -134,6 +137,7 @@ def call_api(config, operation, request_uri, headers={}, params=None, data=None,
        if operation == 'get':
            response = requests.get(
                uri,
+                timeout=timeout,
                headers=headers,
                params=params,
                data=data,
@@ -142,6 +146,7 @@ def call_api(config, operation, request_uri, headers={}, params=None, data=None,
        if operation == 'post':
            response = requests.post(
                uri,
+                timeout=timeout,
                headers=headers,
                params=params,
                data=data,
@@ -151,6 +156,7 @@ def call_api(config, operation, request_uri, headers={}, params=None, data=None,
        if operation == 'put':
            response = requests.put(
                uri,
+                timeout=timeout,
                headers=headers,
                params=params,
                data=data,
@@ -160,6 +166,7 @@ def call_api(config, operation, request_uri, headers={}, params=None, data=None,
        if operation == 'patch':
            response = requests.patch(
                uri,
+                timeout=timeout,
                headers=headers,
                params=params,
                data=data,
@@ -168,6 +175,7 @@ def call_api(config, operation, request_uri, headers={}, params=None, data=None,
        if operation == 'delete':
            response = requests.delete(
                uri,
+                timeout=timeout,
                headers=headers,
                params=params,
                data=data,
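Review note: the hunks above thread the same `timeout` value through all five `requests` calls. The pattern can be sketched in isolation as below. This is only an illustration: `call_with_timeout`, `DEFAULT_TIMEOUT`, `FakeHTTP`, and the example URL are hypothetical names that do not exist in PVC, and `FakeHTTP` stands in for the `requests` module so the sketch runs without network access.

```python
# Hypothetical helper, not part of PVC: dispatches on the operation name and
# always forwards a timeout, mirroring what the hunks above do per branch.
DEFAULT_TIMEOUT = 3  # seconds, matching the default added to call_api


def call_with_timeout(http, operation, uri, timeout=DEFAULT_TIMEOUT, **kwargs):
    allowed = ('get', 'post', 'put', 'patch', 'delete')
    if operation not in allowed:
        raise ValueError(f'unsupported operation: {operation}')
    # With the real library, `http` would be the `requests` module itself.
    method = getattr(http, operation)
    return method(uri, timeout=timeout, **kwargs)


class FakeHTTP:
    """Stand-in for the requests module; records the arguments it receives."""

    def __init__(self):
        self.calls = []

    def _record(self, name):
        def method(uri, **kwargs):
            self.calls.append((name, uri, kwargs))
            return 'ok'
        return method

    def __getattr__(self, name):
        # Only reached for attributes that are not defined normally,
        # i.e. the HTTP verb names used by call_with_timeout.
        return self._record(name)


fake = FakeHTTP()
call_with_timeout(fake, 'get', 'http://example.invalid/node', params={'limit': 'pvchv1'})
print(fake.calls[0][2]['timeout'])
```

In `requests`, the `timeout` argument bounds the connect and read waits rather than the total transfer time, and expiry raises `requests.exceptions.Timeout`, so callers of `call_api` may want to handle that exception.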
--- a/client-cli/pvc/pvc.py
+++ b/client-cli/pvc/pvc.py
@@ -51,6 +51,18 @@ default_store_data = {
 }


+#
+# Version function
+#
+def print_version(ctx, param, value):
+    if not value or ctx.resilient_parsing:
+        return
+    from pkg_resources import get_distribution
+    version = get_distribution('pvc').version
+    click.echo(f'Parallel Virtual Cluster version {version}')
+    ctx.exit()
+
+
 #
 # Data store handling functions
 #
@@ -4793,6 +4805,10 @@ def task_init(confirm_flag, overwrite_flag):
    '-u', '--unsafe', '_unsafe', envvar='PVC_UNSAFE', is_flag=True, default=False,
    help='Allow unsafe operations without confirmation/"--yes" argument.'
 )
+@click.option(
+    '--version', is_flag=True, callback=print_version,
+    expose_value=False, is_eager=True
+)
 def cli(_cluster, _debug, _quiet, _unsafe):
    """
    Parallel Virtual Cluster CLI management tool
--- a/client-cli/setup.py
+++ b/client-cli/setup.py
@@ -2,7 +2,7 @@ from setuptools import setup

 setup(
    name='pvc',
-    version='0.9.41',
+    version='0.9.42',
    packages=['pvc', 'pvc.cli_lib'],
    install_requires=[
        'Click',
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,12 @@
+pvc (0.9.42-0) unstable; urgency=high
+
+  * [Documentation] Reworks and updates various documentation sections
+  * [Node Daemon] Adjusts the fencing process to use a power off rather than a power reset for maximum certainty
+  * [Node Daemon] Ensures that MTU values are validated during the first read too
+  * [Node Daemon] Corrects the loading of the bridge_mtu value to use the current active setting rather than a fixed default to prevent unintended surprises
+
+ -- Joshua M. Boniface <joshua@boniface.me>  Tue, 12 Oct 2021 13:48:19 -0400
+
 pvc (0.9.41-0) unstable; urgency=high

  * Fixes a bad conditional check in IPMI verification
--- a/docs/cluster-architecture.md
+++ b/docs/cluster-architecture.md
@@ -306,7 +306,7 @@ Self-management and self-healing are important components of PVC's design, and t

 To operate correctly, these functions require each node in the cluster to have a functional IPMI-over-IP setup with a configured user who is able to perform chassis power commands. This differs depending on the chassis manufacturer and model, and should be tested prior to deploying any production cluster. If IPMI is not configured correctly at node startup, the daemon will warn and disable automatic recovery of the node. The IPMI should be present in the Upstream system network (see [System Networks](#system-networks) above), or in another secured network which is reachable from the Upstream system network, whichever is more convenient for the layout of the networks.

-The general process is divided into 3 sections: detecting node failures, fencing nodes, and recovering from fenced nodes.
+The general process is divided into 3 sections: detecting node failures, fencing nodes, and recovering from fenced nodes. Note that this process only applies to nodes in the `run` "daemon state"; if a node daemon cleanly shuts down (for instance due to a service restart or administrative action), it will not be fenced.

 #### Detecting Failed Nodes

@@ -322,7 +322,7 @@ Once the cluster, and specifically one node in the cluster, has determined that

 During the `dead` process, the failed node has 6 chances, called "saving throws", at `keepalive_interval` second windows, to send another keepalive before it is fenced. This additional, fixed, delay helps ensure that the cluster will gracefully recover from intermittent network failures or loss of Zookeeper contact, by providing nodes up to another 6 keepalive intervals to save themselves once the fence timer actually begins. This brings the total time, with default options, of a node stopping contact to a node being fenced, to between 60 and 65 seconds. This duration is considered by the author an acceptable compromise between speedy recovery and avoiding false positives (and hence larger outages).

-Once a node has been marked `dead` and has failed its 6 "saving throws", the fence process triggers an IPMI chassis reset sequence. First, the node is issued the standard IPMI `chassis power reset` command to trigger a cold system reset. Next, it waits a fixed 1 second and then issues a `chassis power on` signal to ensure the node is powered on (just in case it had already shut itself off). The node then waits a fixed 2 seconds, and then checks the current `chassis power status`. Using the results of these 3 commands, PVC is then able to determine with near certainty whether the node has truly been forced offline or not, and it can proceed to the next step.
+Once a node has been marked `dead` and has failed its 6 "saving throws", the fence process triggers an IPMI chassis power-cycle sequence. First, the node is issued an IPMI `chassis power off` command to trigger a cold system shutdown. Next, the fence process waits a fixed 1 second, checks and logs the current `chassis power status`, and then issues a `chassis power on` signal to start up the node. Finally, it waits a fixed 2 seconds and checks the current `chassis power status` again. Using the results of these commands, PVC is then able to determine with near certainty whether the node has truly been forced offline or not, and it can proceed to the next step.

 #### Recovery from Node Fences

@@ -403,12 +403,12 @@ This section provides diagrams of 2 best-practice cluster configurations. These

 #### Small 3-node cluster

-![Small 3-node cluster](/images/pvc-3-node-cluster.png)
+[![Small 3-node cluster](/images/pvc-3-node-cluster.png)](/images/pvc-3-node-cluster.png)

 *Above: A diagram of a simple 3-node cluster with all nodes as coordinators. Dual 10 Gbps network interface per node, unified physical networking with collapsed cluster and storage networks.*

 #### Large 8-node cluster

-![Large 8-node cluster](/images/pvc-8-node-cluster.png)
+[![Large 8-node cluster](/images/pvc-8-node-cluster.png)](/images/pvc-8-node-cluster.png)

 *Above: A diagram of a large 8-node cluster with 3 coordinators and 5 hypervisors. Quad 10Gbps network interfaces per node, split physical networking into guest/cluster and storage networks.*
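Review note: the revised fencing paragraph earlier in this file's diff describes a fixed order of IPMI chassis power commands and waits. A minimal sketch of that ordering follows, assuming a hypothetical `ipmi` callable that runs one `chassis power` subcommand and returns its output. The function name `fence_sequence`, the result keys, and the fake output strings are illustrative only and are not taken from the PVC node daemon.

```python
import time


def fence_sequence(ipmi, sleep=time.sleep):
    """Run the power-off based fence order described in the documentation.

    `ipmi` is a hypothetical callable taking a chassis power subcommand
    ('off', 'on', 'status') and returning its textual result.
    """
    results = {}
    # Power the node off to force a cold shutdown.
    results['power_off'] = ipmi('off')
    # Fixed 1 second wait, then check (and, in the daemon, log) the power state.
    sleep(1)
    results['status_after_off'] = ipmi('status')
    # Power the node back on.
    results['power_on'] = ipmi('on')
    # Fixed 2 second wait, then the final status check.
    sleep(2)
    results['final_status'] = ipmi('status')
    return results


# Demonstration with a fake IPMI backend and no real sleeping.
issued = []


def fake_ipmi(subcommand):
    issued.append(subcommand)
    return f'fake result for {subcommand}'


outcome = fence_sequence(fake_ipmi, sleep=lambda seconds: None)
print(issued)
```

Passing `sleep` as a parameter is a design choice for testability only; how the daemon interprets the collected results to decide whether the fence succeeded is not shown in this diff and is left out here.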
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -6,41 +6,39 @@ This guide will walk you through setting up a simple 3-node PVC cluster from scr

 ### Part One - Preparing for bootstrap

-0. Read through the [Cluster Architecture documentation](/architecture/cluster). This documentation details the requirements and conventions of a PVC cluster, and is important to understand before proceeding.
+0. Read through the [Cluster Architecture documentation](/cluster-architecture). This documentation details the requirements and conventions of a PVC cluster, and is important to understand before proceeding.

-0. Download the latest copy of the [`pvc-installer`](https://github.com/parallelvirtualcluster/pvc-installer) and [`pvc-ansible`](https://github.com/parallelvirtualcluster/pvc-ansible) repositories to your local machine.
+0. Download the latest copy of the [`pvc-ansible`](https://github.com/parallelvirtualcluster/pvc-ansible) repository to your local machine.

-0. In `pvc-ansible`, create an initial `hosts` inventory, using `hosts.default` as a template. You can manage multiple PVC clusters ("sites") from the Ansible repository easily, however for simplicity you can use the simple name `cluster` for your initial site. Define the 3 hostnames you will use under the site group; usually the provided names of `pvchv1`, `pvchv2`, and `pvchv3` are sufficient, though you may use any hostname pattern you wish. It is *very important* that the names all contain a sequential number, however, as this is used by various components.
+0. Leverage the `create-local-repo.sh` script in the `pvc-ansible` directory to set up a local cluster configuration directory; follow the instructions the script provides, as all future steps will be done inside your new local configuration directory.

-0. In `pvc-ansible`, create an initial set of `group_vars`, using the `group_vars/default` as a template. Inside these group vars are two main files: `base.yml` and `pvc.yml`. These example files are well-documented; read them carefully and specify all required options before proceeding.
+0. Create an initial `hosts` inventory, using `hosts.default` in the `pvc-ansible` repo as a template. You can manage multiple PVC clusters ("sites") from the Ansible repository easily, however for simplicity you can use the simple name `cluster` for your initial site. Define the 3 hostnames you will use under the site group; usually the provided names of `pvchv1`, `pvchv2`, and `pvchv3` are sufficient, though you may use any hostname pattern you wish. It is *very important* that the names all contain a sequential number, however, as this is used by various components.

-    `base.yml` configures the `base` role and some common per-cluster configurations such as an upstream domain, a root password, and a set of administrative users, as well as and most importantly, the basic network configuration of the nodes. Make special note of the various items that must be generated such as passwords; these should all be cluster-unique.
+0. Create an initial set of `group_vars` for your cluster at `group_vars/<cluster>`, using the `group_vars/default` in the `pvc-ansible` repo as a template. Inside these group vars are two main files: `base.yml` and `pvc.yml`. These example files are well-documented; read them carefully and specify all required options before proceeding, and reference the [Ansible manual](/manuals/ansible) for more detailed descriptions of the options.    

-    `pvc.yml` configures the `pvc` role, including all the dependent software and PVC itself. Important to note is the `pvc_nodes` list, which contains a list of all the nodes as well as per-node configurations for each. All nodes, both coordinator and not, must be a part of this list.
+    * `base.yml` configures the `base` role and some common per-cluster configurations such as an upstream domain, a root password, a set of administrative users, various hardware configuration items, and, most importantly, the basic network configuration of the nodes. Make special note of the various items that must be generated such as passwords; these should all be cluster-unique.    

-0. Optionally though strongly recommended, move your new configurations out of the `pvc-ansible` repository. The `.gitignore` file will ignore all non-default data, so it is advisable to move these configurations to a separate, secure, repository or filestore, and symlink to them inside the `pvc-ansible` repository directories. The three important locations to symlink are:  
-    * `hosts`: The main Ansible inventory for the various clusters.
-    * `group_vars/<cluster_name>`: The `group_vars` for the various clusters.
-    * `files/<cluster_name>`: Static files, created during the bootstrap Ansible run, for the various clusters.
+    * `pvc.yml` configures the `pvc` role, including all the dependent software and PVC itself. Important to note is the `pvc_nodes` list, which contains a list of all the nodes as well as per-node configurations for each. All nodes must be a part of this list.    

-0. In `pvc-installer`, run the `buildiso.sh` script to generate an installer ISO. This script requires `debootstrap`, `isolinux`, and `xorriso` to function. The resulting file will, by default, be named `pvc-installer_<date>.iso` in the current directory. For additional options, use the `-h` flag to show help information for the script.
+0. In the `pvc-installer` directory, run the `buildiso.sh` script to generate an installer ISO. This script requires `debootstrap`, `isolinux`, and `xorriso` to function. The resulting file will, by default, be named `pvc-installer_<date>.iso` in the current directory. For additional options, use the `-h` flag to show help information for the script.

 ### Part Two - Preparing and installing the physical hosts

-0. Prepare 3 physical servers with IPMI. These physical servers should have at the least a system disk (a single disk, hardware RAID1, or similar), one or more data (Ceph OSD) disks, and networking/CPU/RAM suitable for the cluster needs. Connect their networking based on the configuration set in the `pvc-ansible` `group_vars/base.yml` file.
+0. Prepare 3 physical servers with IPMI. The servers should match the specifications and requirements outlined in the [Cluster Architecture documentation](/cluster-architecture). Connect their networking based on the configuration set in the `base.yml` group vars file for your cluster.

-0. Configure the IPMI user specified in the `pvc-ansible` `group_vars/base.yml` file with the required permissions; this user will be used to reboot the host if it fails, so it must be able to control system power state.
+0. Load the installer ISO generated in step 6 of the previous section onto a USB stick, or using IPMI virtual media, on the physical servers.

-0. Configure IPMI to enable IPMI over LAN. Use the default (all-zero) encryption key; this is needed for fencing to work. Verify that IPMI over LAN is operating by using the following command from a machine able to reach the IPMI interface:  
-    `/usr/bin/ipmitool -I lanplus -H <IPMI_host> -U <user> -P <password> chassis power status`
+0. Boot the physical servers off of the installer ISO. Use UEFI mode - if available - for maximum flexibility and longevity.

-0. Load the installer ISO generated in step 5 of the previous section onto a USB stick, or using IPMI virtual media, on the physical servers.
+0. Follow the prompts from the installer ISO. It will ask for a hostname, the system disk device to use, the initial network interface to configure as well as VLANs and either DHCP or static IP information, and finally either an HTTP URL containing an SSH `authorized_keys` to use for the `deploy` user, or a password for this user if key auth is unavailable.

-0. Boot the physical servers off of the installer ISO, in UEFI mode if available for maximum flexibility.
+0. Wait for the installer to complete. This may take several minutes.

-0. Follow the prompts from the installer ISO. It will ask for a hostname, the system disk device to use, the initial network interface to configure as well as either DHCP or static IP information, and finally either an HTTP URL containing an SSH `authorized_keys` to use for the `deploy` user, or a password for this user if key auth is unavailable.
+0. At the end of the install process, follow the prompts carefully; it is usually prudent to pre-seed the `/etc/network/interfaces` configuration based on your expected final physical network config (e.g. set up bonding, etc.) before proceeding, especially if you use DHCP, as the bonding configuration applied later could affect the address. The `chroot` is likely unneeded unless you have good reason to edit the system in this way.

-0. Wait for the installer to complete. It will provide some next steps at the end, and wait for the administrator to acknowledge via an "Enter" key-press. The node will now reboot into the base PVC system.
+0. Make note of the (temporary and insecure!) root password set by the installer; you may need it to troubleshoot the system if it does not come up properly. This will be overwritten later in the setup process.
+
+0. Press "Enter" to reboot the system and confirm it is reachable.

 0. Repeat the above steps for all 3 initial nodes. On boot, they will display their configured IP address to be used in the next steps.

@@ -48,42 +46,44 @@ This guide will walk you through setting up a simple 3-node PVC cluster from scr

 0. Make note of the IP addresses of all 3 initial nodes, and configure DNS, `/etc/hosts`, or Ansible `ansible_host=` hostvars to map these IP addresses to the hostnames set in the Ansible `hosts` and `group_vars` files.

-0. Verify connectivity from your administrative host to the 3 initial nodes, including SSH access. Accept their host keys as required before proceeding as Ansible does not like those prompts.
+0. Verify connectivity from your administrative host to the 3 initial nodes, including SSH access as the `deploy` user. Accept their host keys as required before proceeding as Ansible does not like those prompts. If you did not configure SSH key auth during the PVC installer process, configure it now, as it greatly simplifies Ansible configuration.

-0. Verify your `group_vars` setup from part one, as errors here may require a re-installation and restart of the bootstrap process.
+0. Verify your `group_vars` setup from part 1, as errors here may require a re-installation and restart of the bootstrap process.

-0. Perform the initial bootstrap. From the `pvc-ansible` repository directory, execute the following `ansible-playbook` command, replacing `<cluster_name>` with the Ansible group name from the `hosts` file. Make special note of the additional `bootstrap=yes` variable, which tells the playbook that this is an initial bootstrap run.  
+0. Perform the initial bootstrap. From your local configuration repository directory, execute the following `ansible-playbook` command, replacing `<cluster_name>` with the Ansible group name from the `hosts` file. Make special note of the additional `bootstrap=yes` variable, which tells the playbook that this is an initial bootstrap run.  
    `$ ansible-playbook -v -i hosts pvc.yml -l <cluster_name> -e bootstrap=yes`

-    **WARNING:** Never rerun this playbook with the `-e bootstrap=yes` option against an active cluster. This will have unintended, disastrous consequences.
+    **WARNING:** Never run this playbook with the `-e bootstrap=yes` option against an active, already-bootstrapped cluster. This will have **disastrous consequences** including the **loss of all data** in the Ceph system as well as any configured networks, VMs, etc.

-0. Wait for the Ansible playbook run to finish. Once completed, the cluster bootstrap will be finished, and all 3 nodes will have rebooted into a working PVC cluster.
+0. Wait for the Ansible playbook run to finish. Once completed, the cluster bootstrap will be finished, and all 3 nodes will have rebooted into a working PVC cluster. If any errors occur, carefully evaluate them and re-run the playbook (with `-e bootstrap=yes` - your cluster is not active yet!) as required.

-0. Install the CLI client on your administrative host, and add and verify connectivity to the cluster; this will also verify that the API is working. You will need to know the cluster upstream floating IP address here, and if you configured SSL or authentication for the API in your `group_vars`, adjust the first command as needed (see `pvc cluster add -h` for details).  
-    `$ pvc cluster add -a <upstream_floating_ip> mycluster`  
+0. Download and install the CLI client package (`pvc-client-cli.deb`) on your administrative host, and add and verify connectivity to the cluster; this will also verify that the API is working. You will need to know the cluster upstream floating IP address you configured in the `networks` section of the `base.yml` playbook, and if you configured SSL or authentication for the API in your `group_vars`, adjust the first command as needed (see `pvc cluster add -h` for details). A human-readable description can also be specified, which is useful if you manage multiple clusters and their names become unwieldy.  
+    `$ pvc cluster add -a <upstream_floating_ip> -d "My first PVC cluster" mycluster`  
    `$ pvc -c mycluster node list`

-    We can also set a default cluster by exporting the `PVC_CLUSTER` environment variable to avoid requiring `-c cluster` with every subsequent command:  
+    You can also set a default cluster by exporting the `PVC_CLUSTER` environment variable to avoid requiring `-c cluster` with every subsequent command:  
    `$ export PVC_CLUSTER="mycluster"`

+    **Note:** It is fully possible to administer the cluster from the nodes themselves via SSH should you so choose, to avoid requiring the PVC client on your local machine.
+
 ### Part Four - Configuring the Ceph storage cluster

-0. Determine the Ceph OSD block devices on each host, via an `ssh` shell. For instance, use `lsblk` or check `/dev/disk/by-path` to show the block devices by their physical SAS/SATA bus location, and obtain the relevant `/dev/sdX` name for each disk you wish to be a Ceph OSD on each host.
+0. Determine the Ceph OSD block devices on each host via an `ssh` shell. For instance, use `lsblk` or check `/dev/disk/by-path` to show the block devices by their physical SAS/SATA bus location, and obtain the relevant `/dev/sdX` name for each disk you wish to be a Ceph OSD on each host.

-0. Add each OSD device to each host. The general command is:  
+0. Configure an OSD device for each data disk in each host. The general command is:  
    `$ pvc storage osd add --weight <weight> <node> <device>`

-    For example, if each node has two data disks, as `/dev/sdb` and `/dev/sdc`, run the commands as follows:  
+    For example, if each node has two data disks, as `/dev/sdb` and `/dev/sdc`, run the commands as follows to add the first disk to each node, then the second disk to each node:  
    `$ pvc storage osd add --weight 1.0 pvchv1 /dev/sdb`  
-    `$ pvc storage osd add --weight 1.0 pvchv1 /dev/sdc`  
    `$ pvc storage osd add --weight 1.0 pvchv2 /dev/sdb`  
-    `$ pvc storage osd add --weight 1.0 pvchv2 /dev/sdc`   
    `$ pvc storage osd add --weight 1.0 pvchv3 /dev/sdb`  
+    `$ pvc storage osd add --weight 1.0 pvchv1 /dev/sdc`   
+    `$ pvc storage osd add --weight 1.0 pvchv2 /dev/sdc`  
    `$ pvc storage osd add --weight 1.0 pvchv3 /dev/sdc`   

-    **NOTE:** On the CLI, the `--weight` argument is optional, and defaults to `1.0`. In the API, it must be specified explicitly, but the CLI sets a default value. OSD weights determine the relative amount of data which can fit onto each OSD. Under normal circumstances, you would want all OSDs to be of identical size, and hence all should have the same weight. If your OSDs are instead different sizes, the weight should be proportional to the size, e.g. `1.0` for a 100GB disk, `2.0` for a 200GB disk, etc. For more details, see the Ceph documentation.
+    **NOTE:** On the CLI, the `--weight` argument is optional, and defaults to `1.0`. In the API, it must be specified explicitly, but the CLI sets a default value. OSD weights determine the relative amount of data which can fit onto each OSD. Under normal circumstances, you would want all OSDs to be of identical size, and hence all should have the same weight. If your OSDs are instead different sizes, the weight should be proportional to the size, e.g. `1.0` for a 100GB disk, `2.0` for a 200GB disk, etc. For more details, see the [Cluster Architecture](/cluster-architecture) and Ceph documentation.

-    **NOTE:** OSD commands wait for the action to complete on the node, and can take some time.
+    **NOTE:** OSD commands wait for the action to complete on the node, and can take some time (up to 30 seconds).

     **NOTE:** You can add OSDs in any order you wish, for instance you can add the first OSD to each node and then add the second to each node like the example, or you can add all of one node's OSDs together before moving on to the next node. This ordering does not affect the cluster in any way.

@@ -93,10 +93,14 @@ This guide will walk you through setting up a simple 3-node PVC cluster from scr
 0. Create an RBD pool to store VM images on. The general command is:  
    `$ pvc storage pool add <name> <placement_groups>`

-    For example, to create a pool named `vms` with 256 placement groups (a good default with 6 OSD disks), run the command as follows:  
-    `$ pvc storage pool add vms 256`
+    **NOTE:** Ceph placement groups are a complex topic; as a general rule it's easier to grow than shrink, so start small and grow as your cluster grows. The following are some good starting numbers for 3-node clusters, though the Ceph documentation and the [Ceph placement group calculator](https://ceph.com/pgcalc/) are advisable for anything more complex. There is a trade-off between CPU usage and the number of total PGs for all pools in the cluster, with more PGs meaning more CPU usage.    

-    **NOTE:** Ceph placement groups are a complex topic; as a general rule it's easier to grow than shrink, so start small and grow as your cluster grows. The general formula is to calculate the ideal number of PGs is `pgs * maxcopies / osds = ~250`, then round `pgs` down to the closest power of 2; generally, you want as close to 250 PGs per OSD as possible, but no more than 250. With 3-6 OSDs, 256 is a good number, and with 9+ OSDs, 512 is a good number. Ceph will error if the total number exceeds the limit. For more details see the Ceph documentation and the [placement group calculator](https://ceph.com/pgcalc/).
+    * 3 OSDs total: 128 PGs (1 pool) or 64 PGs (2 or more pools, each)    
+    * 6 OSDs total: 256 PGs (1 pool) or 128 PGs (2 or more pools, each)    
+    * 9+ OSDs total: 256 PGs    
+
+    For example, to create a pool named `vms` with 256 placement groups, run the command as follows:  
+    `$ pvc storage pool add vms 256`

    **NOTE:** As detailed in the [cluster architecture documentation](/cluster-architecture), you can also set a custom replica configuration for each pool if the default of 3 replica copies with 2 minimum copies is not acceptable. See `pvc storage pool add -h` or that document for full details.

@@ -105,7 +109,7 @@ This guide will walk you through setting up a simple 3-node PVC cluster from scr

 ### Part Five - Creating virtual networks

-0. Determine a domain name and IPv4, and/or IPv6 network for your first client network, and any other client networks you may wish to create. These networks should never overlap with the cluster networks. For full details on the client network types, see the [cluster architecture documentation](/cluster-architecture).
+0. Determine a domain name and IPv4, and/or IPv6 network for your first client network, and any other client networks you may wish to create. These networks must not overlap with the cluster networks. For full details on the client network types, see the [cluster architecture documentation](/cluster-architecture).

 0. Create the virtual network. There are many options here, so see `pvc network add -h` for details.  

--- a/docs/manuals/ansible.md
+++ b/docs/manuals/ansible.md
@ -105,6 +105,11 @@ Example configuration:
 cluster_group: mycluster
 timezone_location: Canada/Eastern
 local_domain: upstream.local
+recursive_dns_servers:
+  - 8.8.8.8
+  - 8.8.4.4
+recursive_dns_search_domains:
+  - "{{ local_domain }}"

 username_ipmi_host: "pvc"
 passwd_ipmi_host: "MyPassword2019"
@ -184,6 +189,18 @@ The TZ database format name of the local timezone, e.g. `America/Toronto` or `Ca

 The domain name of the PVC cluster nodes. This is the domain portion of the FQDN of each node, and should usually be the domain of the `upstream` network.

+#### `recursive_dns_servers`
+
+* *optional*
+
+A list of recursive DNS servers to be used by cluster nodes. Defaults to Google Public DNS if unspecified.
+
+#### `recursive_dns_search_domains`
+
+* *optional*
+
+A list of domain names (must explicitly include `local_domain` if desired) to be used for shortname DNS lookups.
+
 #### `username_ipmi_host`

 * *optional*
--- a/node-daemon/pvcnoded/Daemon.py
+++ b/node-daemon/pvcnoded/Daemon.py
@ -48,7 +48,7 @@ import re
 import json

 # Daemon version
-version = '0.9.41'
+version = '0.9.42'


 ##########################################################
--- a/node-daemon/pvcnoded/objects/VXNetworkInstance.py
+++ b/node-daemon/pvcnoded/objects/VXNetworkInstance.py
@ -80,6 +80,7 @@ class VXNetworkInstance(object):

        try:
            self.vx_mtu = self.zkhandler.read(('network.mtu', self.vni))
+            self.validateNetworkMTU()
        except Exception:
            self.vx_mtu = None

@ -110,7 +111,6 @@ class VXNetworkInstance(object):
                    self.updateNetworkMTU()
        except Exception:
            self.validateNetworkMTU()
-            self.updateNetworkMTU()

        self.createNetworkBridged()

@ -133,6 +133,7 @@ class VXNetworkInstance(object):

        try:
            self.vx_mtu = self.zkhandler.read(('network.mtu', self.vni))
+            self.validateNetworkMTU()
        except Exception:
            self.vx_mtu = None

@ -255,7 +256,6 @@ add rule inet filter forward ip6 saddr {netaddr6} counter jump {vxlannic}-out
                    self.updateNetworkMTU()
        except Exception:
            self.validateNetworkMTU()
-            self.updateNetworkMTU()

        @self.zkhandler.zk_conn.DataWatch(self.zkhandler.schema.path('network.ip6.network', self.vni))
        def watch_network_ip6_network(data, stat, event=''):
--- a/node-daemon/pvcnoded/util/config.py
+++ b/node-daemon/pvcnoded/util/config.py
@ -19,13 +19,17 @@
 #
 ###############################################################################

+import daemon_lib.common as common
+
 import os
 import subprocess
 import yaml
+
 from socket import gethostname
 from re import findall
 from psutil import cpu_count
 from ipaddress import ip_address, ip_network
+from json import loads


 class MalformedConfigurationError(Exception):
@ -287,11 +291,18 @@ def get_configuration():
            'upstream_mtu':         o_sysnetwork_upstream.get('mtu', None),
            'upstream_dev_ip':      o_sysnetwork_upstream.get('address', None),
            'bridge_dev':           o_sysnetworks.get('bridge_device', None),
-            'bridge_mtu':           o_sysnetworks.get('bridge_mtu', 1500),
+            'bridge_mtu':           o_sysnetworks.get('bridge_mtu', None),
            'enable_sriov':         o_sysnetworks.get('sriov_enable', False),
            'sriov_device':         o_sysnetworks.get('sriov_device', list())
        }

+        if config_networks['bridge_mtu'] is None:
+            # Read the current MTU of bridge_dev and set bridge_mtu to it; avoids weird resets
+            retcode, stdout, stderr = common.run_os_command(f"ip -json link show dev {config_networks['bridge_dev']}")
+            current_bridge_mtu = loads(stdout)[0]['mtu']
+            print(f"Config key bridge_mtu not explicitly set; using live MTU {current_bridge_mtu} from {config_networks['bridge_dev']}")
+            config_networks['bridge_mtu'] = current_bridge_mtu
+
        config = {**config, **config_networks}

        for network_type in ['cluster', 'storage', 'upstream']:
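
The `bridge_mtu` fallback in the hunk above parses the output of `ip -json link show dev <device>` and takes the `mtu` field of the first entry. The sketch below shows the same idea in isolation, under stated assumptions: it uses `subprocess.run` from the standard library in place of the project's `common.run_os_command`, the function names are hypothetical, and the explicit error on empty output is an addition that is not in the diff.

```python
import json
import subprocess


def parse_link_mtu(ip_json_output: str) -> int:
    """Extract the MTU from `ip -json link show dev <device>` output."""
    links = json.loads(ip_json_output)
    if not links:
        # Assumption: treat an empty result as an error instead of
        # letting an IndexError surface from links[0].
        raise ValueError("no link entries found in ip output")
    return int(links[0]["mtu"])


def get_live_mtu(device: str) -> int:
    """Read the current MTU of a network device from the running system."""
    result = subprocess.run(
        ["ip", "-json", "link", "show", "dev", device],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the device does not exist
    )
    return parse_link_mtu(result.stdout)
```

Keeping the parsing separate from the command invocation makes the fallback easy to exercise with canned JSON, without a real bridge device.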
--- a/node-daemon/pvcnoded/util/fencing.py
+++ b/node-daemon/pvcnoded/util/fencing.py
@ -133,23 +133,39 @@ def migrateFromFencedNode(zkhandler, node_name, config, logger):
 # Perform an IPMI fence
 #
 def reboot_via_ipmi(ipmi_hostname, ipmi_user, ipmi_password, logger):
-    # Forcibly reboot the node
-    ipmi_command_reset = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power reset'.format(
+    # Power off the node
+    logger.out('Sending power off to dead node', state='i')
+    ipmi_command_stop = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power off'.format(
        ipmi_hostname, ipmi_user, ipmi_password
    )
-    ipmi_reset_retcode, ipmi_reset_stdout, ipmi_reset_stderr = common.run_os_command(ipmi_command_reset)
+    ipmi_stop_retcode, ipmi_stop_stdout, ipmi_stop_stderr = common.run_os_command(ipmi_command_stop)

-    if ipmi_reset_retcode != 0:
-        logger.out(f'Failed to reboot dead node: {ipmi_reset_stderr}', state='e')
+    if ipmi_stop_retcode != 0:
+        logger.out(f'Failed to power off dead node: {ipmi_stop_stderr}', state='e')

    time.sleep(1)

-    # Power on the node (just in case it is offline)
+    # Check the chassis power state
+    logger.out('Checking power state of dead node', state='i')
+    ipmi_command_status = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power status'.format(
+        ipmi_hostname, ipmi_user, ipmi_password
+    )
+    ipmi_status_retcode, ipmi_status_stdout, ipmi_status_stderr = common.run_os_command(ipmi_command_status)
+    if ipmi_status_retcode == 0:
+        logger.out(f'Current chassis power state is: {ipmi_status_stdout.strip()}', state='i')
+    else:
+        logger.out('Current chassis power state is: Unknown', state='w')
+
+    # Power on the node
+    logger.out('Sending power on to dead node', state='i')
    ipmi_command_start = '/usr/bin/ipmitool -I lanplus -H {} -U {} -P {} chassis power on'.format(
        ipmi_hostname, ipmi_user, ipmi_password
    )
    ipmi_start_retcode, ipmi_start_stdout, ipmi_start_stderr = common.run_os_command(ipmi_command_start)

+    if ipmi_start_retcode != 0:
+        logger.out(f'Failed to power on dead node: {ipmi_start_stderr}', state='w')
+
    time.sleep(2)

    # Check the chassis power state
@ -159,7 +175,7 @@ def reboot_via_ipmi(ipmi_hostname, ipmi_user, ipmi_password, logger):
    )
    ipmi_status_retcode, ipmi_status_stdout, ipmi_status_stderr = common.run_os_command(ipmi_command_status)

-    if ipmi_reset_retcode == 0:
+    if ipmi_stop_retcode == 0:
        if ipmi_status_stdout.strip() == "Chassis Power is on":
            # We successfully rebooted the node and it is powered on; this is a successful fence
            logger.out('Successfully rebooted dead node', state='o')
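
The reworked fence sequence (power off, wait, power on, wait, check status) can be sketched with the IPMI call injected as a function, which makes the decision logic checkable without hardware. This is an illustration under assumptions: `fence_power_cycle` and the `run` callable are hypothetical names, only the success branch visible in the diff is modelled, and every other outcome is collapsed to `False` here, which may not match the branches of the real function that the diff does not show.

```python
import time
from typing import Callable, Tuple

# (return code, stdout, stderr), the same shape the diff unpacks
CommandResult = Tuple[int, str, str]


def fence_power_cycle(
    run: Callable[[str], CommandResult],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Power a node off and on again; return True if the fence succeeded.

    `run` takes a chassis power action ("off", "on" or "status") and
    returns the result of the corresponding IPMI command.
    """
    # Power off first, so the resulting power state is unambiguous
    stop_retcode, _, _ = run("off")
    sleep(1)

    # Power the node back on; in the diff a failure here is only a warning
    run("on")
    sleep(2)

    # The fence counts as successful only if the power off worked and
    # the chassis now reports that it is powered on
    status_retcode, status_stdout, _ = run("status")
    return (
        stop_retcode == 0
        and status_retcode == 0
        and status_stdout.strip() == "Chassis Power is on"
    )
```

In real use, `run` would wrap the `ipmitool ... chassis power <action>` command line built in the diff; passing a stub for `sleep` keeps tests fast.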
Author	SHA1	Message	Date
Joshua M. Boniface	55f397a347	Fix bad location of config sets	2021-10-12 17:23:04 -04:00
Joshua M. Boniface	dfebb2d3e5	Also validate on failures	2021-10-12 17:11:03 -04:00
Joshua M. Boniface	e88147db4a	Bump version to 0.9.42	2021-10-12 15:25:42 -04:00
Joshua M. Boniface	b8204d89ac	Go back to passing if exception Validation already happened and the set happens again later.	2021-10-12 14:21:52 -04:00
Joshua M. Boniface	fe73dfbdc9	Use current live value for bridge_mtu This will ensure that upgrading without the bridge_mtu config key set will keep things as they are.	2021-10-12 12:24:03 -04:00
Joshua M. Boniface	8f906c1f81	Use power off in fence instead of reset Use a power off (and then make the power on a requirement) during a node fence. Removes some potential ambiguity in the power state, since we will know for certain if it is off.	2021-10-12 11:04:27 -04:00
Joshua M. Boniface	2d9fb9688d	Validate network MTU after initial read	2021-10-12 10:53:17 -04:00
Joshua M. Boniface	fb84685c2a	Make cluster example images clickable	2021-10-12 03:15:04 -04:00
Joshua M. Boniface	032ba44d9c	Mention fencing only in run state	2021-10-12 03:05:01 -04:00
Joshua M. Boniface	b7761877e7	Adjust more wording and fix typos	2021-10-12 03:00:21 -04:00
Joshua M. Boniface	1fe07640b3	Adjust some wording	2021-10-12 02:54:16 -04:00
Joshua M. Boniface	b8d843ebe4	Remove codeql setup I don't use this for anything useful, so disable it since a run takes ages.	2021-10-12 02:51:19 -04:00
Joshua M. Boniface	95d983ddff	Fix formatting of subsection	2021-10-12 02:49:40 -04:00
Joshua M. Boniface	4c5da1b6a8	Add reference to Ansible manual	2021-10-12 02:48:47 -04:00
Joshua M. Boniface	be6b1e02e3	Fix spelling errors	2021-10-12 02:47:31 -04:00
Joshua M. Boniface	ec2a72ed4b	Fix link to cluster architecture docs	2021-10-12 02:41:22 -04:00
Joshua M. Boniface	b06e327add	Adjust getting started docs Update the docs with the current information on setting up a cluster, including simplifying the Ansible configuration to use the new create-local-repo.sh script, and simplifying some other sections.	2021-10-12 02:39:25 -04:00
Joshua M. Boniface	d1f32d2b9c	Default to removing build artifacts in b-a-d.sh	2021-10-11 16:41:00 -04:00
Joshua M. Boniface	3f78ca1cc9	Add explicit 3 second timeout to requests	2021-10-11 16:31:18 -04:00
Joshua M. Boniface	e866335918	Add version function support to CLI	2021-10-11 15:34:41 -04:00
Joshua M. Boniface	221494ed1b	Add new configs for Ansible	2021-10-11 14:44:18 -04:00
@ -1 +1 @@
-0.9.41
+0.9.42