Switch to new theme

This commit is contained in:
2024-05-24 13:25:55 -04:00
parent 6739be79d9
commit b583caa8b6
82 changed files with 413 additions and 221 deletions

content/en/pages/about.md

@@ -0,0 +1,10 @@
+++
Categories = []
Description = "Welcome to the new blog!"
Tags = []
date = "2016-08-21T23:37:49-04:00"
title = "Welcome"
+++
Welcome to my new blog, built with Hugo! Hopefully this one will stay online a little longer than my previous WordPress-based sites! On this page you will find posts about cool stuff I'm working on, things that interest me, and general tech news and my opinions thereof. I hope you enjoy your stay!

content/en/pages/cv.md

@@ -0,0 +1,75 @@
+++
Categories = []
Tags = []
date = "2016-08-21T23:37:49-04:00"
title = "CV"
+++
Joshua M. Boniface
joshua@boniface.me | 289-208-2830 | 1483 Epping Court, Burlington, ON L7M1P7
https://www.boniface.me | https://github.com/joshuaboniface | https://www.linkedin.com/in/joshuamboniface
## Profile
I am a driven, service-oriented individual with a strong knowledge of Linux administration and computer networking. I enjoy working with a strong team, and am independent and goal focused, always seeking out new knowledge to broaden my skill set and contribute to more efficient and well-operating systems. My deep knowledge of Linux systems and standard administration allows me to focus effectively on key problems and ensure continued successful operations. I am well-versed in scripting and orchestration in a modern DevOps framework and have experience running small- to mid-sized computing environments of up to several thousand VMs and terabytes of storage.
## Primary Skills
* Linux systems, in particular Debian GNU/Linux and Red Hat Enterprise Linux/CentOS environments, storage subsystems including Ceph object storage, DRBD, and ZFS/XFS/ext4 filesystems, and IP networking and routing.
* Linux applications including Apache and Nginx web stacks, Postfix/Dovecot/Courier email stacks, HAProxy, BIND9, PowerDNS, ISC-DHCP, OpenLDAP, RADIUS, and KVM/QEMU and Xen virtualization with Libvirt and Pacemaker.
* Extensive scripting and programming experience in BASH, Python, and other assorted languages.
* Orchestration and configuration management using Ansible, Puppet, and bcfg2, including custom roles and modules.
* Implementation and maintenance of monitoring for large environments with Prometheus/Grafana, Nagios, Icinga, CheckMK, TICK, and ELK stacks.
* Administration of MySQL, PostgreSQL and MongoDB databases, including deployment, tuning, and maintenance.
* Internetwork routing and advanced networking, including network design and capacity planning, troubleshooting, and maintenance.
* Customer service and writing, including technical and customer-focused writing and communication.
## Employment History
#### [Clearcable Networks](https://clearcable.ca) (Hamilton, ON) - Senior Systems Architect
* Oct 2018 - present
In the role of senior systems architect, I am charged with keeping various Clearcable SOE systems in full working order, performing R&D to advance the platform, and enabling DevOps culture and automation within the Systems and Software teams. My primary projects have included the implementation of my own [PVC hypervisor manager](https://github.com/parallelvirtualcluster) software as a base hypervisor for SOE, the expansion and unification of Ansible for management and orchestration of software deployment, and the evolution of various platform aspects to updated versions as needs change. As a deployment specialist, I've been responsible for the deployment of 20+ new PVC clusters for both new and upgraded customers, and 10+ brand-new customers including full Clearcable NOMS provisioning integration, in both a technical and project-managerial role.
#### [VM Farms](https://vmfarms.com) (Toronto, ON) - Linux DevOps Administrator, Operations Technical Lead
* Aug 2016 - Sep 2018
In the role of Linux System Administrator, I provided management of systems for web application developers within the framework of a DevOps operations service providing consulting advice and managed hosting. Using both an in-house Xen based cloud as well as various remote computing services including AWS, I helped ensure the continued operation of the platform as well as the day-to-day administration of customer systems. Utilizing configuration management with Puppet and Ansible, I provisioned and managed applications serving millions of users and using multiple web development stacks, HTTP servers, and database backends, in addition to various proxying, queueing and caching applications. During late 2017 and 2018 in the role of Technical Lead for Operations, I assisted in the training of new employees and daily technical decisions with a mind for best-practices, and interacted regularly and in-depth with customers using various communication tools.
#### [Clearcable Network Services](https://clearcable.ca) (Hamilton, ON) - Linux System Administrator
* Jan 2013 - Jul 2016
In the role of System Administrator, I used my skills with internetworking and Linux servers to ensure the proper operation of the Clearcable Networks Standard Operating Environment, a Debian GNU/Linux-based Internet Service Provider platform running on Cisco UCS hardware, and using Xen virtualization and bcfg2 configuration management. This included management of provider-grade services including DNS with BIND9; ISC DHCP in advanced configurations; Postfix/Courier email stacks; monitoring with Icinga and Munin; server hardware maintenance and support including full system rebuilds, hardware performance analysis and troubleshooting. I contributed advancements to the platform, assisting in its continued development and growth, including a major distribution upgrade project and implementation of live migration functionality to the platform using DRBD. During oncall rotation and daily support tasks, I also assisted in the deployment and maintenance of service provider networking, including routing and CMTS/ DOCSIS and VoIP access technologies to ensure the optimal operation of our client systems.
#### [HHS Population Health Research Institute](https://www.phri.ca) (Hamilton, ON) - Systems Administrator, co-op
* Sep 2010 - Aug 2011
In the role of Student (Co-Op) Administrator, I was responsible for day-to-day support of desktop systems and users, including deployment of new systems, hardware replacement/reimaging, and management of accounts with Active Directory. A long-term project during my tenure was an extensive documentation of the site datacenter including diagramming and inventory of the facility.
#### The Home Depot (Burlington, ON) - Special Services, Tool Rental, Electrical, Cashier
* Oct 2006 - Jan 2013
## Independent Projects
#### [Parallel Virtual Cluster (PVC)](https://github.com/parallelvirtualcluster)
Parallel Virtual Cluster (PVC) is a virtual machine-based hyperconverged infrastructure (HCI) virtualization cluster solution that is fully Free Software, scalable, redundant, self-healing, self-managing, and designed for administrator simplicity. I started the project in mid-2018 and continue maintaining and advancing it to the present.
#### [Jellyfin](https://jellyfin.org)
I am the project leader/coordinator and release manager for the Jellyfin project, the Free Software Media System that puts you in control of managing and streaming your media. I began the project with several other interested parties as a fork of the Emby media server in late 2018 and continue managing it to the present.
## Educational History
#### Mohawk College (Hamilton, ON) - Network Engineering and Security Analyst
* Sep 2008 - Dec 2012 (completed diploma)
#### Carleton University (Ottawa, ON) - Bachelor of Computer Science
* Sep 2007 - May 2008 (incomplete; two semesters)
## Personal Interests 
In addition to my work and experience above, I am interested in Fantasy and Science Fiction literature, repairing and building computers and electronics, gardening, various DIY projects, performing and composing music, astronomy and astrophysics, and science and technology papers and books.


@@ -0,0 +1,242 @@
+++
Categories = []
Tags = []
date = "2024-02-13T00:00:00-05:00"
title = "Hardware"
+++
I self-host this blog, do a lot of coding, and generally do "computer stuff" on a number of different systems. Here's what I use, current as of 2024-02-13.
## Client Devices
### Primary Laptop: Lenovo Thinkpad T495s
```
_,met$$$$$gg. joshua@dragonstorm
,g$$$$$$$$$$$$$$$P. ------------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 12 (bookworm) x86_64
,$$P' `$$$. Host: 20QJCTO1WW ThinkPad T495s
',$$P ,ggs. `$$b: Kernel: 6.1.0-17-amd64
`d$$' ,$P"' . $$$ Uptime: 17 days, 12 hours, 37 mins
$$P d$' , $$P Packages: 4925 (dpkg), 11 (flatpak)
$$: $$. - ,d$$' Shell: bash 5.2.15
$$; Y$b._ _,d$P' Resolution: 1920x1080
Y$$. `.`"Y$$$$P"' DE: GNOME 43.9
`$$b "-.__ WM: Mutter
`Y$$ WM Theme: Adwaita
`Y$$. Theme: Adwaita-dark [GTK2/3]
`$$b. Icons: Adwaita [GTK2/3]
`Y$$b. Terminal: tmux
`"Y$b._ CPU: AMD Ryzen 5 PRO 3500U w/ Radeon Vega Mobile Gfx (8) @ 2.100GHz
`""" GPU: AMD ATI Radeon Vega Series / Radeon Vega Mobile Series
Memory: 11907MiB / 13860MiB
NVMe: 1x XPG SX8200 Pro 1TB, ext4
SSD: N/A
HDD: N/A
```
### Work Laptop: Lenovo Thinkpad T480s
```
_,met$$$$$gg. joshua@dragoncable
,g$$$$$$$$$$$$$$$P. ------------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 12 (bookworm) x86_64
,$$P' `$$$. Host: 20L8S3WS01 ThinkPad T480s
',$$P ,ggs. `$$b: Kernel: 6.1.0-17-amd64
`d$$' ,$P"' . $$$ Uptime: 36 days, 6 hours, 1 min
$$P d$' , $$P Packages: 4308 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.2.15
$$; Y$b._ _,d$P' Resolution: 1920x1080
Y$$. `.`"Y$$$$P"' DE: GNOME 43.9
`$$b "-.__ WM: Mutter
`Y$$ WM Theme: Adwaita
`Y$$. Theme: Adwaita-dark [GTK2/3]
`$$b. Icons: Adwaita [GTK2/3]
`Y$$b. Terminal: tmux
`"Y$b._ CPU: Intel i5-8350U (8) @ 3.600GHz
`""" GPU: Intel UHD Graphics 620
Memory: 16906MiB / 23778MiB
NVMe: 1x SK Hynix Gold P31 1TB, ext4
SSD: N/A
HDD: N/A
```
### Smartphone: Samsung Galaxy S10e
```
-o o- u0_a674@dragonflight
+hydNNNNdyh+ --------------------
+mMMMMMMMMMMMMm+ OS: Android 12 aarch64
`dMMm:NMMMMMMN:mMMd` Host: Samsung SM-G970W
hMMMMMMMMMMMMMMMMMMh Kernel: 4.14.190-23725627-abG970WVLS8IWD1
.. yyyyyyyyyyyyyyyyyyyy .. Uptime: 5 days, 13 hours, 52 mins
.mMMm`MMMMMMMMMMMMMMMMMMMM`mMMm. Packages: 72 (dpkg), 1 (pkg)
:MMMM-MMMMMMMMMMMMMMMMMMMM-MMMM: Shell: bash 5.1.12
:MMMM-MMMMMMMMMMMMMMMMMMMM-MMMM: CPU: Qualcomm SM8150 (8) @ 1.785GHz
:MMMM-MMMMMMMMMMMMMMMMMMMM-MMMM: Memory: 3796MiB / 5466MiB
:MMMM-MMMMMMMMMMMMMMMMMMMM-MMMM: Storage: 128GB
-MMMM-MMMMMMMMMMMMMMMMMMMM-MMMM-
+yy+ MMMMMMMMMMMMMMMMMMMM +yy+
mMMMMMMMMMMMMMMMMMMm
`/++MMMMh++hMMMM++/`
MMMMo oMMMM
MMMMo oMMMM
oNMm- -mMNs
```
### Home Base (Headless server "Desktop-in-the-cloud"): Dell PowerEdge R630
```
_,met$$$$$gg. joshua@base
,g$$$$$$$$$$$$$$$P. -----------
,g$$P" """Y$$.". OS: Debian GNU/Linux 11 (bullseye) x86_64
,$$P' `$$$. Model: Dell PowerEdge R630
',$$P ,ggs. `$$b: Kernel: 5.10.0-26-amd64
`d$$' ,$P"' . $$$ Uptime: 110 days, 17 hours, 51 mins
$$P d$' , $$P Packages: 1591 (dpkg), 10 (flatpak)
$$: $$. - ,d$$' Shell: bash 5.1.4
$$; Y$b._ _,d$P' Resolution: 1024x768
Y$$. `.`"Y$$$$P"' Terminal: /dev/pts/2
`$$b "-.__ CPU: 2x Intel Xeon E5-2620 v4 (32) @ 3.000GHz
`Y$$ GPU: NVIDIA Tesla T4
`Y$$. GPU: NVIDIA Tesla T4
`$$b. Memory: 53073MiB / 64301MiB
`Y$$b. NVMe: 2x XPG GAMIX S70 BLADE 2TB, ZFS mirror
`"Y$b._ SSD: N/A
`""" HDD: N/A
```
## Servers
My server infrastructure is quite sprawling, but here's the short info. For more detail, please see my perpetually-"upcoming" blog post or [my rack tour videos on YouTube](https://www.youtube.com/playlist?list=PLNfKWbHAcA3PcEpFfS1GqFcs7EkKiBOQr).
### Routers: FreeBSD on Debian on SZBOX G30B Mini-PCs (x2)
```
``` ` joshua@dcrX
` `.....---.......--.``` -/ -----------
+o .--` /y:` +. OS: FreeBSD 13.2-RELEASE-p9 amd64
yo`:. :o `+- Uptime: 3 days, 15 hours, 11 mins
y/ -/` -o/ Packages: 185 (pkg)
.- ::/sy+:. Shell: bash 5.2.15
/ `-- / Terminal: /dev/pts/2
`: :` CPU: QEMU Virtual version (4) @ 1.996GHz
`: :` Memory: 3121MiB / 6102MiB
/ / NVMe: 1x QEMU 80GB, ZFS
.- -. SSD: N/A
-- -. HDD: N/A
`:` `:`
.-- `--.
.---.....----.
```
Which for compatibility reasons are VMs running on top of...
```
_,met$$$$$gg. joshua@dcrhvX
,g$$$$$$$$$$$$$$$P. -------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 12 (bookworm) x86_64
,$$P' `$$$. Host: SZBOX G30B TVI7309X B0
',$$P ,ggs. `$$b: Kernel: 6.1.0-17-amd64
`d$$' ,$P"' . $$$ Uptime: 9 days, 14 hours, 2 mins
$$P d$' , $$P Packages: 830 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.2.15
$$; Y$b._ _,d$P' Terminal: /dev/pts/1
Y$$. `.`"Y$$$$P"' CPU: Intel Celeron N5105 (4) @ 2.900GHz
`$$b "-.__ GPU: Intel JasperLake [UHD Graphics]
`Y$$ Memory: 6522MiB / 7783MiB
`Y$$. NVMe: 1x Generic 128GB, ext4
`$$b. SSD: N/A
`Y$$b. HDD: N/A
`"Y$b._
`"""
```
### Primary Hypervisor Cluster: Dell PowerEdge R630 (x3)
```
_,met$$$$$gg. joshua@hvX.p
,g$$$$$$$$$$$$$$$P. ------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 11 (bullseye) x86_64
,$$P' `$$$. Host: Dell PowerEdge R630
',$$P ,ggs. `$$b: Kernel: 5.10.0-27-amd64
`d$$' ,$P"' . $$$ Uptime: 13 days, 19 hours, 13 mins
$$P d$' , $$P Packages: 886 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.1.4
$$; Y$b._ _,d$P' Resolution: 1024x768
Y$$. `.`"Y$$$$P"' Terminal: /dev/pts/19
`$$b "-.__ CPU: 2x Intel Xeon E5-2683 v4 (64) @ 3.000GHz
`Y$$ GPU: 0b:00.0 Matrox Electronics Systems Ltd. G200eR2
`Y$$. Memory: 113190MiB / 515876MiB
`$$b. NVMe: N/A
`Y$$b. SSD: 2x Intel DC S3700 200GB, RAID-1/ext4; 2x Intel DC S3700 800GB
`"Y$b._ HDD: N/A
`"""
```
### Testing Hypervisor Cluster: Dell PowerEdge R430 (x3)
```
_,met$$$$$gg. joshua@hvX.t
,g$$$$$$$$$$$$$$$P. ------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 12 (bookworm) x86_64
,$$P' `$$$. Host: Dell PowerEdge R430
',$$P ,ggs. `$$b: Kernel: 6.1.0-17-amd64
`d$$' ,$P"' . $$$ Uptime: 25 mins
$$P d$' , $$P Packages: 931 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.2.15
$$; Y$b._ _,d$P' Resolution: 1024x768
Y$$. `.`"Y$$$$P"' Terminal: /dev/pts/2
`$$b "-.__ CPU: Intel Xeon E5-2603 v3 (6) @ 1.600GHz
`Y$$ GPU: 0a:00.0 Matrox Electronics Systems Ltd. G200eR2
`Y$$. Memory: 4070MiB / 31873MiB
`$$b. NVMe: N/A
`Y$$b. SSD: 1x Intel DC S3610 200GB, ext4; 1x Samsung PM883 480GB
`"Y$b._ HDD: N/A
`"""
```
### Ceph Storage Cluster: Dell PowerEdge R720xd (x3)
```
_,met$$$$$gg. joshua@cephX.c
,g$$$$$$$$$$$$$$$P. --------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 11 (bullseye) x86_64
,$$P' `$$$. Host: Dell PowerEdge R720xd
',$$P ,ggs. `$$b: Kernel: 5.10.0-26-amd64
`d$$' ,$P"' . $$$ Uptime: 116 days, 19 hours, 33 mins
$$P d$' , $$P Packages: 636 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.1.4
$$; Y$b._ _,d$P' Resolution: 1024x768
Y$$. `.`"Y$$$$P"' Terminal: /dev/pts/0
`$$b "-.__ CPU: Intel Xeon E5-2697 v2 (24) @ 3.500GHz
`Y$$ GPU: 0b:00.0 Matrox Electronics Systems Ltd. G200eR2
`Y$$. Memory: 39374MiB / 64233MiB
`$$b. NVMe: N/A
`Y$$b. SSD: 2x Intel DC S3700 200GB, RAID-1/ext4
`"Y$b._ HDD: 3x Western Digital Red 14TB; 6x Western Digital Red 8TB
`"""
```
### Backup Server: Whitebox 2U
```
_,met$$$$$gg. joshua@backup
,g$$$$$$$$$$$$$$$P. -------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 11 (bullseye) x86_64
,$$P' `$$$. Host: Whitebox (Supermicro X10SRL-F)
',$$P ,ggs. `$$b: Kernel: 5.10.0-26-amd64
`d$$' ,$P"' . $$$ Uptime: 116 days, 19 hours, 40 mins
$$P d$' , $$P Packages: 790 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.1.4
$$; Y$b._ _,d$P' Resolution: 1024x768
Y$$. `.`"Y$$$$P"' Terminal: /dev/pts/0
`$$b "-.__ CPU: Intel Xeon E5-2620 v3 (12) @ 3.200GHz
`Y$$ GPU: 09:00.0 ASPEED Technology, Inc. ASPEED Graphics Family
`Y$$. Memory: 22450MiB / 31984MiB
`$$b. NVMe: N/A
`Y$$b. SSD: 1x Intel DC S3700 800GB, ext4
`"Y$b._ HDD: 4x Western Digital Red 8TB, ZFS RAID-Z; 1x Western Digital USB3.0 8TB, ZFS
`"""
```

content/en/pages/legal.md

@@ -0,0 +1,30 @@
+++
Categories = []
Tags = []
date = "2016-08-21T23:37:49-04:00"
title = "Legal"
+++
Copyright ©2018-2024 Joshua M. Boniface (except where otherwise noted)
All content released under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License
You are free to:
* Share - copy and redistribute the material in any medium or format
* Adapt - remix, transform, and build upon the material for any purpose, even commercially.
* The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
* Attribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
* ShareAlike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
* No additional restrictions - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
* You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
* No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
See https://creativecommons.org/licenses/by-sa/4.0/legalcode for full details



@@ -0,0 +1,145 @@
---
title: "Build a Raspberry Pi BMC"
description: ""
date: 2017-02-10
tags:
- DIY
- Development
- Systems Administration
---
**NOTICE:** This project has long since been obsoleted. I never did complete it, and ended up just buying some IPMI-capable motherboards. I would recommend the various Pi-KVM solutions now available as much better, more robust replacements for this project.
IPMI BMCs are pretty ubiquitous in the datacenter and enterprise computing, because in a warehouse full of computers, finding and walking up to one just to reset it or check its console is quite daunting. The same goes for a home server: it may just be in my basement, but in a closed-up rack it becomes a huge hassle to manage a machine without IPMI. I try to get it on every motherboard I buy, but currently my Ceph nodes are running with motherboards that lack built-in IPMI. After an incident with one machine while I was on vacation, I finally decided that they needed remote management, and I decided to make my own BMC for them rather than waste money on replacement (IPMI-capable) motherboards.
## Enter the Raspberry Pi
If you don't know what it is, the [Raspberry Pi](https://www.raspberrypi.org) is a small single-board computer featuring an ARM SoC, Ethernet, USB, video, audio, and most importantly, GPIO (General-Purpose Input and Output) pins; it is powered over MicroUSB and runs the Debian distribution '[Raspbian](https://www.raspberrypi.org/downloads/raspbian/)'. The GPIO pins allow one to control or read information from various devices, including a serial console interface, with a simple utility or Python library. These features make the Raspberry Pi a perfect fit for a BMC, and really not that far from the "real" BMCs found in most server-grade motherboards.
*(Pictured: A Raspberry Pi 1 model B)*
![Raspberry Pi](rpi-1b.jpg)
## The hardware
### Power - thanks ATX!
One of the main ideas behind a BMC is that it should be accessible even if the host system is off. So we need some sort of power independent of the host state. The first thought when using a Raspberry Pi as a BMC would be to power it from some sort of external USB power brick, but that's messy: another cable running out of the case that must be dealt with, and another power brick. Luckily for us however, the ATX power supply standard has a solution!
The purple wire on a standard 24-pin ATX connector provides a standby +5V power supply, rated in the spec for up to 50mA but in reality today often supporting close to 1A. This is more than enough to run the motherboard standby power as well as a Raspberry Pi. This has the added benefit of working just like a real BMC: when the system is unplugged, the BMC also turns off, and turns back on as soon as power is reapplied. With that in mind, I made a simple "adapter" out of a piece of solid-core CAT5 cable, carefully encased in hot glue, which is inserted directly into the ATX motherboard connector, with the other end attached to a standard MicroUSB cable. The result is consistent, reliable, in-case power for the BMC without any trickery, and a cable tie keeps it locked in place.
[Editor's note: WARNING - This may burn down your computer - I will eventually make proper connectors, but I'm not letting that hold up the project!]
*(Pictured: The power interface to the motherboard, attached to a MicroUSB cable)*
[Picture - power stuff]
### GPIO to rule them all
One of a BMC's main functions is controlling and determining the power state of the system. This basic functionality allows you to, for instance, hard-reset a crashed machine, or start it back up after a power failure. No more running for the box to press the power button!
The Raspberry Pi's GPIO pins provide a nice simple method for interfacing with the raw system management headers on any motherboard. For determining the power state, a GPIO pin is connected with a small resistor directly to the Power LED connector on the motherboard. When this pin is read high, we know the system is online; when it's low, we know it's offline. Probably the simplest circuit in the project!
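Reading that state back on the Pi is trivial with the WiringPi `gpio` utility the post relies on. As a sketch (the pin number follows the pin map further down and is otherwise an assumption):

```shell
#!/bin/bash
# Sketch of the power-state check using the WiringPi `gpio` utility.
# The pin number (2, per the pin map later in this post) is
# illustrative - adjust to match your wiring.

POWER_STATE_PIN=2

power_state() {
    # `gpio read` prints 1 when the pin is high (Power LED lit), 0 when low
    if [ "$(gpio read "$POWER_STATE_PIN")" -eq 1 ]; then
        echo on
    else
        echo off
    fi
}
```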
The power and reset switches are a little more complex. While you could wire the GPIO directly to the switch headers, this will not work as you would expect, and could in fact risk blowing out your motherboard by sending 3.3V into the switch headers. The solution is to use a transistor, and from my basic understanding almost any would do: connect the base pin of the transistor to your GPIO pin, and the collector and emitter pins to the switch header (emitter to ground). The result is an electrically controlled switch, which turns on when the GPIO is set high for a second, and turns off again when the GPIO is set low. You can now safely control the power and reset switches with your Raspberry Pi.
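The resulting "button press" can be sketched in shell like so; the pin numbers (reset = 0, power = 1, matching the pin map later in the post) and the one-second hold are assumptions:

```shell
#!/bin/bash
# Sketch of an electrically-actuated button press. Driving the GPIO
# high turns the transistor on, shorting the switch header just as a
# physical press would. Pin numbers are illustrative.

press_button() {
    local pin="$1"
    gpio mode "$pin" out
    gpio write "$pin" 1    # "press": transistor conducts
    sleep 1                # hold for about a second
    gpio write "$pin" 0    # "release"
}

# Usage: press_button 1   # tap the power switch
#        press_button 0   # tap the reset switch
```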
*(Pictured: the GPIO layout for a first-generation model-B Raspberry Pi)*
![GPIO pinout](rpi-1b-gpio.png)
### Serial - USB or TTL?
Another critical function of a BMC is to manage the host system without requiring a connected monitor and keyboard. Luckily just about every motherboard, and especially "server-grade" motherboards that otherwise lack a BMC, still features one or two DB9 COM ports and console redirection. With some BIOS configuration, console redirection sends the console output through the serial port, giving us an old-school VTY terminal: exactly what we need for console access. However, getting that console into the Raspberry Pi is a bit tricky.
The most common way to do this is to use a USB-to-serial adapter, but like the original power suggestion, it's messy in terms of cabling, requiring a header, crossover cable, and then the adapter itself. Enter the TTL serial port on the Raspberry Pi. Built into [GPIO pins 14 and 15](http://codeandlife.com/2012/07/01/raspberry-pi-serial-console-with-max3232cpe/), the Raspberry Pi features a TTL 3.3V serial interface. Normally you would use this to get a console into the Raspberry Pi itself, but with a simple flip of the TXD and RXD lines and some [reconfiguration of the Raspbian image](http://www.hobbytronics.co.uk/raspberry-pi-serial-port), we can use this interface to communicate with the host system instead.
This method does require a special chip called the MAX3232, which converts between the TTL serial of the Raspberry Pi and the RS232 serial of the motherboard. Luckily, the converter chips can be had for about $1.50 for 5 on eBay from the right Chinese sellers, including the capacitors you need to make the circuit work. And with some jumpers, we can connect the chip directly to the motherboard COM2 header; no messy crossover cables or USB-to-serial adapters! The resulting device is `/dev/ttyAMA0` in Raspbian and works flawlessly with your terminal emulator of choice; I'm using `screen` to allow persistence and clean disconnection without terminating the serial session (more on that later!).
The one downside of this method is the lack of proper VGA graphics support. Your OS needs to be text-based with support for console redirection, such as any Unix-like OS or possibly Windows Core (I don't know), and it needs to be configured with this support to get anywhere past the BIOS. As I'm using Debian on all my systems this is perfectly fine by me, and I include the tweaks in my system images, but keep this in mind when setting up your BMC.
*(Pictured: the MAX3232 signal converter board)*
![MAX3232 Serial boards](max3232-boards.jpg)
### Cabling it up
The actual cabling is kept simple using a solderable breadboard. The board features a total of 26 header pins for female-to-female jumper cables, along with the two transistors, two resistors, and the MAX3232 daughter board. This unit keeps all the cabling neat and consistent between all three of my systems, and makes documenting the connections a breeze!
The board layout is straightforward and rendered here in ASCII for convenience, where each line (a connection), character (Transistor, Resistor), or number (header pin) represents one hole on the breadboard. The transistor leads are labeled Emitter, Base, and Collector.
```
[01][02][03][04][05][06][07][08][09][10]
| | | | | | | | | |
[11][12][13][14] | | . . . .
| | | | | | |---|---|---|
E \ / C E \ / C | | | |
T T | | | MAX |
B / /---/ B | | | 3232 |
| | | | | BOARD |
| | [15][16] | | | |
| | | | | | |---|---|---|
| | . | . | . . . .
| | R | R | | | | |
[17][18][19][20][21][22][23][24][25][26]
Motherboard connectors:
01: Reset ground 02: Reset signal
03: Power ground 04: Power signal
05: Power LED ground 06: Power LED signal
07: RS232 RX 08: RS232 TX
09: RS232 3.3V 10: RS232 ground
Chassis connectors:
11: Reset switch ground 12: Reset switch signal
13: Power switch ground 14: Power switch signal
15: Chassis LED ground 16: Chassis LED signal
Raspberry Pi connectors:
17: Reset GPIO [0] 18: Power GPIO [1]
19: GPIO ground 20: Power LED GPIO [3]
21: GPIO ground 22: Power State GPIO [2]
23: TTL ground 24: TTL 3.3V
25: TTL TX 26: TTL RX
```
The finished product is a small board that keeps all the cabling neat and tidy in the case, and is easily mounted. While my soldering job is atrocious, they work!
*(Pictured: The finished breadboard layout)*
![The finished breadboard](breadboard-layout.jpg)
*(Pictured: the cabling of the Raspberry Pi BMC)*
![The finished product](finished-product.jpg)
## The software
Most traditional BMCs are managed via a (usually-terrible) Web UI, which is often prone to breakage of various kinds. And often these Web UIs are incredibly complex, featuring dozens of menus and gigantic Flash monstrosities just to view their state. By using the Raspberry Pi, a seeming limitation - SSH-only access - becomes a benefit: it becomes trivial to connect to the BMC from a terminal, check the power state or serial console, turn on/off/reset a host, and then disconnect. No crummy Flash web page, Java plugins, or slow point-and-click menu system!
The software side started out as a basic Raspbian system; however, I wanted to make it a little more "BMC-like": stripped-down and easy to use with a small set of commands. I started by writing a simple "shell" emulator in BASH, using a constant loop and hostname prompt along with `stty` to keep it focused and running. Each command triggers a function which performs its specific job and then returns to the "shell". While `bash` itself is also available, it should rarely be needed.
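The heart of that emulator is a read-and-dispatch loop. A minimal sketch (the real `bmc.sh` does considerably more; the command names, screen session name, and GPIO pin numbers here are illustrative):

```shell
#!/bin/bash
# Minimal sketch of the BMC "shell" loop. Command names, the screen
# session name, and the pin numbers (state = 2, power = 1, reset = 0,
# per the pin map above) are illustrative assumptions.

bmc_shell() {
    local host
    host="$(cat /etc/bmchost 2>/dev/null || echo unknown)"
    while read -r -p "[BMC ${host}] > " cmd; do
        case "$cmd" in
            state)   [ "$(gpio read 2)" -eq 1 ] && echo "Host is on" || echo "Host is off" ;;
            power)   gpio write 1 1; sleep 1; gpio write 1 0 ;;
            reset)   gpio write 0 1; sleep 1; gpio write 0 0 ;;
            console) screen -x console ;;
            exit)    break ;;
            *)       echo "Commands: state power reset console exit" ;;
        esac
    done
}
```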
The programs doing the heavy lifting are a combination of `screen`, to view the host system serial console, and the `gpio` utility by WiringPi (Debian package `wiringpi`). The `screen` session is configured to start automatically at BMC boot via `rc.local`, to ensure all serial output is captured and stored for later analysis, even from cold boot - a major problem with the few SSH-based BMCs I've tried! The `gpio` program makes writing and reading the GPIO pins simple and easy, returning 0 or 1 for the low/high states and easily writing states. By writing the BMC shell in `bash`, I was able to get all the flexibility I wanted without any programming overhead; the whole thing is under 200 lines including all the functions and a little prep work ensuring the required packages are installed. The whole code of the `bmc.sh` utility script can be found [on my GitHub](https://github.com/joshuaboniface/rpibmc), and is fairly self-explanatory.
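The `screen` side can be sketched like this; the session name, baud rate, and the `rc.local` start point are assumptions, not necessarily what the real setup uses:

```shell
#!/bin/bash
# Sketch of the persistent serial console. Session name and baud rate
# are assumptions; the session is started once at boot (e.g. from
# rc.local) so output is captured from cold boot onward.

start_console() {
    # -dmS: start detached under a known session name
    # -L:   log all output (to screenlog.0) for later analysis
    screen -dmS console -L /dev/ttyAMA0 115200
}

attach_console() {
    # -x attaches shared, so detaching (Ctrl-a d) or a dropped SSH
    # connection never terminates the serial session itself
    screen -x console
}
```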
The BMC is accessible via a user called `bmc`, which has the `bmc.sh` script as its login shell, as well as the groups required to run the various functions (`sudo` and `gpio` specifically). The `/etc/sudoers` file has also been edited to allow sudo without a password, which is used within the `bmc.sh` script. The new user can be created and configured on a fresh install of Raspbian using these commands, assuming `bmc.sh` is in `/bin`:
```
# Allow the sudo group passwordless sudo (used within bmc.sh)
sudo sed -i 's/%sudo\tALL=(ALL:ALL) ALL/%sudo\tALL=(ALL:ALL) NOPASSWD: ALL/' /etc/sudoers
# Create the bmc user with bmc.sh as its login shell and the needed groups
sudo useradd -g sudo -G gpio -s /bin/bmc.sh -d /home/bmc -m bmc
# Set its initial password
sudo chpasswd <<<"bmc:password"
```
Finally, we're able to set the host system's name (for display when logging in) via the file `/etc/bmchost`, which makes deploying an image and setting the name trivial. Log in as the `bmc` user via SSH, and observe:
*(Pictured: an example session with `bmc.sh`)*
![Shell example](bmcshell-sample.png)
## Conclusion
I hope you've found this post interesting and useful - if you have some IPMI-less systems you want to manage remotely, I definitely recommend you try it out. So far my testing has been extremely positive; for under $40 (if you can find the Raspberry Pi for the right price; a first generation is more than powerful enough), you can build yourself out-of-band BMC management for any motherboard you can find. And the remote management makes even the most irritating host mess-ups ("Oh, I broke my udev network rules again!?") trivial to recover from. A small price to pay for the peace of mind of being able to manage your system from almost anywhere, even a cruise ship!
*(Pictured: what you might have to do on a cruise ship without a BMC!)*
![No BMC fail](txt-from-a-ship.png)
If you have any questions or comments, shoot me an e-mail, or find me on various social media!



@@ -0,0 +1,472 @@
---
title: "Automating your garden hose for fun and profit"
description: "Building a custom self-hosted MQTT water shutoff and controlling it with HomeAssistant"
date: 2018-09-28
tags:
- DIY
- Home Automation
---
I love gardening - over the last couple of years it's become a great summer pastime for me. And after a backyard revamp, I'm planning a massive flower garden to create my own little oasis.
One of the parts of gardening that I'm not the biggest fan of is watering. Don't get me wrong - it's relaxing. But in small doses. And a big garden demands a lot of watering on a set schedule, which is often hard to do. The first part of the solution was to lay down some soaker hose throughout the garden, which works wonderfully: no more having to point a hose at each part of the garden for several minutes. Once the whole thing is finalized I'll install some permanent watering solutions, but for now this does the job.
But turning it on and off at the tap was hardly satisfying. I've already got HomeAssistant set up to automate various lights around my house and implement [voice control](https://www.boniface.me/post/self-hosted-voice-control/), so why not build something to automate watering the garden?
The first thing I did was check for any premade solutions. And while there are plenty of controllers out there, it doesn't look like any of them support HomeAssistant. So I endeavoured to build my own instead - and the controller can be used in both a temporary setup and a permanent one.
### Getting some parts
I knew I'd need a couple things for this setup to work:
* A powered water valve/solenoid of some kind.
* A controller board.
* 24/7 off-grid power.
* Connectivity to HomeAssistant.
The first item was actually the hardest to figure out, but I did find the following on Amazon: [Water Solenoid Valve](https://www.amazon.ca/gp/product/B00K0TKJCU/ref=oh_aui_detailpage_o01_s00?ie=UTF8&psc=1). It's a little pricey, but it was the cheapest one I could find.
The remaining items were fairly standard: a standard Arduino-compatible 12-volt relay board to control the valves, a NodeMCU for control, allowing me to use MQTT to connect to HomeAssistant, and a 4.7Ah 12-volt lead-acid battery and solar panel to allow continuous off-grid operation. I also needed a 12-volt to 5-volt converter to power the NodeMCU (and relays) off the 12-volt battery, which was fairly easy to come by.
### Building the water distributor
The solenoid has 1/2" diameter threads, so that also meant I needed some adapters from the standard hose ends to match up with the valve. Luckily, my local Home Depot had (almost) all the parts I needed. The resulting loop is laid out as follows:
```
_______________
| ___ ___ |
| | | | | |
| | | | | |
|V| | | |V|
| | | | | |
| | | | | |
B S A
```
Where S is the incoming water source, V is a solenoid valve, and A/B are the outputs to hoses.
I did have to improvise a bit, since 1/2" threaded female T connectors weren't available to me, but 1/2" 90° male-to-female elbows and hose-reducing pieces were. Putting it all together worked perfectly with no leaks, allowing the next stage to begin.
### Setting up the circuitry
The circuit itself is fairly simple. The NodeMCU digital 1 and 2 pins connect to the two relay control inputs, with 12-volt lines running from the battery into each relay and out to the solenoid valves, wired through the normally-open ("signal applies power") poles of the relays. Connecting it all to the battery and solar panel gave the full circuit.
Power-on testing was a great success, and the next step was programming the NodeMCU.
### NodeMCU programming
I originally wrote the NodeMCU code for a door lock device that never panned out, but I was able to quickly repurpose it for this task. The main input comes in via MQTT, with one topic per relay, and output goes out on a similar set of topics to return state.
The full code is as follows and can be easily scaled to many more valves if needed. The onboard LED is used as a quick debug for operations.
`HoseController.ino`
```
#include <PubSubClient.h>
#include <ESP8266WiFi.h>
/*
* Garden hose WiFi control module
*/
// D0 - 16
#define BUILTIN_LED 16
// D1 - 5
#define RELAY_A_PIN 5
// D2 - 4
#define RELAY_B_PIN 4
// D3 - 0
// D5 - 14
// D6 - 12
// Control constants
const char* mqtt_server = "my.mqtt.server";
const int mqtt_port = 1883;
const char* mqtt_user = "myuser";
const char* mqtt_password = "mypass";
const char mqtt_control_topic[16] = "hose-control";
const char mqtt_state_topic[16] = "hose-state";
const char* wifi_ssid = "myssid";
const char* wifi_psk = "mypsk";
char mqtt_topic[32];
char topic_A[32];
char topic_B[32];
int getRelayPin(char* relay_string) {
char relay = relay_string[0];
int pin;
switch (relay) {
case 'a':
pin = RELAY_A_PIN;
break;
case 'b':
pin = RELAY_B_PIN;
break;
default:
pin = BUILTIN_LED;
break;
}
return pin;
}
WiFiClient espClient;
PubSubClient client(espClient);
long lastMsg = 0;
char msg[50];
int value = 0;
int state = 0; // 0 = off, 1 = on (repurposed from the door lock code)
int last_state = 0; // 0 = off, 1 = on
void setup() {
pinMode(RELAY_A_PIN, OUTPUT); // RELAY A pin
pinMode(RELAY_B_PIN, OUTPUT); // RELAY B pin
pinMode(BUILTIN_LED, OUTPUT); // LED pin
digitalWrite(RELAY_A_PIN, HIGH); // Turn off relay
digitalWrite(RELAY_B_PIN, HIGH); // Turn off relay
digitalWrite(BUILTIN_LED, LOW); // Turn on LED
Serial.begin(9600); // Start serial console
setup_wifi(); // Connect to WiFi
client.setServer(mqtt_server, mqtt_port); // Connect to MQTT broker
client.setCallback(callback);
}
void relay_on(char* relay) {
int pin = getRelayPin(relay);
digitalWrite(pin, LOW);
}
void relay_off(char* relay) {
int pin = getRelayPin(relay);
digitalWrite(pin, HIGH);
}
void setup_wifi() {
delay(10);
// We start by connecting to a WiFi network
Serial.println();
Serial.print("Connecting to ");
Serial.println(wifi_ssid);
WiFi.begin(wifi_ssid, wifi_psk);
while (WiFi.status() != WL_CONNECTED) {
digitalWrite(BUILTIN_LED, HIGH); // Turn off LED
delay(250);
digitalWrite(BUILTIN_LED, LOW); // Turn on LED
delay(250);
}
Serial.println("");
Serial.println("WiFi connected");
Serial.print("IP address: ");
Serial.println(WiFi.localIP());
}
void callback(char* topic, byte* payload, unsigned int length) {
Serial.print("Message arrived [");
Serial.print(topic);
Serial.print("] ");
String command;
for (int i = 0; i < length; i++) {
command.concat((char)payload[i]);
}
Serial.print(command);
Serial.println();
// Get the specific topic
String relay_str = getValue(topic, '/', 1);
char relay[8];
relay_str.toCharArray(relay, 8);
strcpy(mqtt_topic, mqtt_state_topic);
strcat(mqtt_topic, "/");
strcat(mqtt_topic, relay);
// Blink LED for debugging
digitalWrite(BUILTIN_LED, HIGH); // Turn off LED
delay(250);
digitalWrite(BUILTIN_LED, LOW); // Turn on LED
// Either enable or disable the relay
if ( command == "on" ) {
relay_on(relay);
Serial.println(String(relay) + ": ON");
client.publish(mqtt_topic, "on");
} else {
relay_off(relay);
Serial.println(String(relay) + ": OFF");
client.publish(mqtt_topic, "off");
}
}
void reconnect() {
// Loop until we're reconnected
while (!client.connected()) {
Serial.print("Attempting MQTT connection...");
// Attempt to connect
if (client.connect("hose", mqtt_user, mqtt_password)) {
Serial.println("connected");
digitalWrite(BUILTIN_LED, HIGH); // Turn off LED
// ... and resubscribe
strcpy(topic_A, mqtt_control_topic);
strcat(topic_A, "/");
strcat(topic_A, "a");
strcpy(topic_B, mqtt_control_topic);
strcat(topic_B, "/");
strcat(topic_B, "b");
client.subscribe(topic_A);
client.subscribe(topic_B);
} else {
Serial.print("failed, rc=");
Serial.print(client.state());
Serial.println(" try again in 4 seconds");
// Wait 4 seconds before retrying
digitalWrite(BUILTIN_LED, HIGH); // Turn off LED
delay(1000);
digitalWrite(BUILTIN_LED, LOW); // Turn on LED
delay(1000);
digitalWrite(BUILTIN_LED, HIGH); // Turn off LED
delay(1000);
digitalWrite(BUILTIN_LED, LOW); // Turn on LED
delay(1000);
}
}
}
void loop() {
if (!client.connected()) {
reconnect();
digitalWrite(BUILTIN_LED, LOW); // Turn on LED
}
client.loop();
delay(1000);
}
String getValue(String data, char separator, int index)
{
int found = 0;
int strIndex[] = { 0, -1 };
int maxIndex = data.length() - 1;
for (int i = 0; i <= maxIndex && found <= index; i++) {
if (data.charAt(i) == separator || i == maxIndex) {
found++;
strIndex[0] = strIndex[1] + 1;
strIndex[1] = (i == maxIndex) ? i+1 : i;
}
}
String string = found > index ? data.substring(strIndex[0], strIndex[1]) : "";
return string;
}
```
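Before wiring anything up, it helps to sanity-check the topic scheme from the desk. The snippet below is a hypothetical Python model (not part of the firmware) of how the `callback` above maps a control message to a relay and its state topic:

```python
def handle_message(topic, payload):
    """Mimic the firmware: 'hose-control/<relay>' plus a payload of 'on'
    (or anything else, meaning off) maps to a relay name, the matching
    state topic, and the new state that gets published back."""
    parts = topic.split("/")
    relay = parts[1] if len(parts) > 1 else ""
    new_state = "on" if payload == "on" else "off"
    return relay, "hose-state/" + relay, new_state

print(handle_message("hose-control/a", "on"))  # → ('a', 'hose-state/a', 'on')
```

From any machine with the Mosquitto clients installed, something like `mosquitto_pub -h my.mqtt.server -t hose-control/a -m on` should then click the relay, with the echoed state visible via `mosquitto_sub -t 'hose-state/#'`.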
### Controlling the hose with HomeAssistant
With the outdoor box portion completed and responding as expected to MQTT messages, the next step was configuring HomeAssistant to talk to it. I wasted a bunch of time trying to build a custom UI before realizing that HomeAssistant has an awesome feature: the [MQTT switch](https://www.home-assistant.io/components/switch.mqtt/) component, which makes it easy to add a working switch UI element for each hose. Here is the configuration:
`configuration.yaml`
```
switch:
- platform: mqtt
name: "Hose A"
state_topic: "hose-state/a"
command_topic: "hose-control/a"
payload_on: "on"
payload_off: "off"
state_on: "on"
state_off: "off"
optimistic: false
qos: 0
retain: true
- platform: mqtt
name: "Hose B"
state_topic: "hose-state/b"
command_topic: "hose-control/b"
payload_on: "on"
payload_off: "off"
state_on: "on"
state_off: "off"
optimistic: false
qos: 0
retain: true
```
### Has it rained recently?
The next step in the automation was to set up a timer to turn the water on and off for me automatically. The easiest solution is simply to run it every night, but that's not ideal if we've had rain recently! I thought about a number of ways to get this information into HomeAssistant, and came up with the following:
* The built-in `yr` component provides excellent weather information from [yr.no](http://www.yr.no), including exposing a number of optional conditions. I enabled it with all of them:
`configuration.yaml`
```
sensor:
# Weather for Burlington (a.k.a. home)
- name: Burlington
platform: yr
monitored_conditions:
- temperature
- symbol
- precipitation
- windSpeed
- pressure
- windDirection
- humidity
- fog
- cloudiness
- lowClouds
- mediumClouds
- highClouds
- dewpointTemperature
```
* From this basic weather information, I built up a set of statistics sensors to obtain 24-hour and 48-hour information about precipitation:
`configuration.yaml`
```
sensor:
# Statistics sensors for precipitation history means
- name: "tfh precipitation stats"
platform: statistics
entity_id: sensor.burlington_precipitation
max_age:
hours: 24
- name: "feh precipitation stats"
platform: statistics
entity_id: sensor.burlington_precipitation
max_age:
hours: 48
```
* From the statistics, I built a pair of template sensors to display the maximum amount of rain forecast over the last 24- and 48-hour periods, effectively telling me "has it rained at all, a little, or a lot?" Note that I use the names `tfh` and `feh` because the constructs `24h` and `48h` break the sensor template!
`configuration.yaml`
```
sensor:
# Template sensors to display the max
- name: "Precipitation history"
platform: template
sensors:
24h_precipitation_history:
friendly_name: "24h precipitation history"
unit_of_measurement: "mm"
entity_id: sensor.tfh_precipitation_stats_mean
value_template: >-
{% if states.sensor.tfh_precipitation_stats_mean.attributes.max_value <= 0.1 %}
0.0
{% elif states.sensor.tfh_precipitation_stats_mean.attributes.max_value < 0.5 %}
<0.5
{% elif states.sensor.tfh_precipitation_stats_mean.attributes.max_value >= 0.5 %}
>0.5
{% endif %}
48h_precipitation_history:
friendly_name: "48h precipitation history"
unit_of_measurement: "mm"
entity_id: sensor.feh_precipitation_stats_mean
value_template: >-
{% if states.sensor.feh_precipitation_stats_mean.attributes.max_value <= 0.1 %}
0.0
{% elif states.sensor.feh_precipitation_stats_mean.attributes.max_value < 0.5 %}
<0.5
{% elif states.sensor.feh_precipitation_stats_mean.attributes.max_value >= 0.5 %}
>0.5
{% endif %}
```
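For clarity, the bucketing these templates implement can be sketched in plain Python (purely illustrative; the real logic lives in the Jinja templates above):

```python
def precipitation_bucket(max_value_mm):
    """Collapse a precipitation maximum (mm) into the three display
    buckets: effectively none, a light shower, or real rain."""
    if max_value_mm <= 0.1:
        return "0.0"
    elif max_value_mm < 0.5:
        return "<0.5"
    else:
        return ">0.5"
```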
* These histories are used in a set of automations that turn on the hose at 2:00AM if it hasn't rained today, and keep it on for a set amount of time based on the previous days' precipitation:
`automations/hose_control.yaml`
```
---
#
# Hose control structures
#
# Basic idea:
# If it rained <0.5mm in the last 48h, run for 1h
# If it rained >0.5mm in the last 48h, but 0.0mm in the last 24h, run for 30m
# If it rained <0.5mm in the last 24h, run for 15m
# If it rained >0.5mm in the last 24h, don't run tonight
# Turn on the hose at 02:00 if there's been <0.5mm of rain in the last 24h
- alias: 'Hose - 02:00 Timer - turn on'
trigger:
platform: time
at: '02:00:00'
condition:
condition: or
conditions:
- condition: state
entity_id: sensor.24h_precipitation_history
state: '<0.5'
- condition: state
entity_id: sensor.24h_precipitation_history
state: '0.0'
action:
service: homeassistant.turn_on
entity_id: switch.hose_a
# Turn off the hose at 02:15 if there's been <0.5mm but >0.0mm of rain in the last 24h
- alias: 'Hose - 02:15 Timer - turn off (<0.5mm/24h)'
trigger:
platform: time
at: '02:15:00'
condition:
condition: and
conditions:
- condition: state
entity_id: switch.hose_a
state: 'on'
- condition: state
entity_id: sensor.24h_precipitation_history
state: '<0.5'
action:
service: homeassistant.turn_off
entity_id: switch.hose_a
# Turn off the hose at 02:30 if there's been >0.5mm in the last 48h but 0.0 in the last 24h
- alias: 'Hose - 02:30 Timer - turn off (>0.5mm/48h + 0.0mm/24h)'
trigger:
platform: time
at: '02:30:00'
condition:
condition: and
conditions:
- condition: state
entity_id: switch.hose_a
state: 'on'
- condition: state
entity_id: sensor.24h_precipitation_history
state: '0.0'
- condition: state
entity_id: sensor.48h_precipitation_history
state: '>0.5'
action:
service: homeassistant.turn_off
entity_id: switch.hose_a
# Turn off the hose at 03:00 otherwise
- alias: 'Hose - 03:00 Timer - turn off'
trigger:
platform: time
at: '03:00:00'
condition:
condition: state
entity_id: switch.hose_a
state: 'on'
action:
service: homeassistant.turn_off
entity_id: switch.hose_a
```
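Taken together, the four automations implement a small decision table. A hypothetical Python rendering of it (using the same sensor state strings) makes the schedule easy to verify:

```python
def watering_minutes(rain_24h, rain_48h):
    """How long the 02:00 run lasts, given the 24h/48h history sensors."""
    if rain_24h == ">0.5":
        return 0    # real rain today: skip tonight
    if rain_24h == "<0.5":
        return 15   # light shower today: 02:00-02:15
    if rain_48h == ">0.5":
        return 30   # dry today, real rain yesterday: 02:00-02:30
    return 60       # dry spell: full 02:00-03:00 run
```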
Tada! Automated watering based on the rainfall!
### Conclusion
This was a fun little staycation project with plenty of room for expansion. Next year, once the garden is arranged, I'll probably start work on a larger, multi-zone version to better support the huge garden. But for today, I love knowing that my hose will turn itself on and water the garden every night if it needs it, with no involvement from me! I hope you find this useful, and of course I'm open to suggestions for improvement or questions - just send me an email!
### Errata
*2018-10-03*: I realized that 4.0mm of rain was quite a high automation threshold. Even over a few days of light rain, the sensor never reported above 1.0mm in any single ~2h interval, let alone 4.0mm. So 0.5mm seems like a much better value than 4.0mm for the "it's just lightly showered"/"it's actually rained" distinction I was going for. The code and descriptions above have been updated. One could also modify the templates to add up the 24h/48h totals and base the conditions on those, which is a little clunkier but would be more accurate if that matters in your use case.

---
title: "Building LibreOffice Online for Debian"
description: "How to build LibreofficeOnline against stock LibreOffice on Debian Stretch"
date: 2017-07-07
tags:
- Debian
- Development
---
DISCLAIMER: I never did proceed with this project beyond building the packages. I can offer no helpful support regarding getting it running.
LibreOffice Online is a very cool project by the fine people at [Collabora](https://www.collaboraoffice.com/code/) to bring the LibreOffice core functionality into a web browser. In effect, it's a Free Software version of the Google Docs suite of productivity tools, allowing one or many people to edit and save documents in a browser.
The software builds on the LibreOffice core code, and currently is only distributed as a Docker image, obtainable from the link above. However, my BLSE2 platform does not support Docker now, or at any time in the foreseeable future [aside: maybe I'll write my reasoning out, one day], and is based entirely around Debian and the idea of "use the package manager as much as possible". So I set out to build everything myself.
This however was no small task - there's precious little usable information in any one place on how to do this, especially not from the official Collabora site, which just wants you to use the Docker image [re-aside: yea, that's one reason I hate Docker...]. Luckily, a GitHub user by the name of `m-jowett` has written up his process in a log [over at his GitHub page](https://gist.github.com/m-jowett/0f28bff952737f210574fc3b2efaa01a) [`m-jowett/log.md`, should this link eventually rot]. His guide is extremely rough, however, uses Ubuntu, and builds for integration with another project featuring OwnCloud. It nevertheless gave me most of what I needed to get going, including help integrating the final package with my OwnCloud instance(s).
As mentioned briefly above, my philosophy in BLSE2 is "use a package" - this is a core feature of Debian, and one of the most solid examples of quality forethought in design in the Free Software world. By separating applications from their libraries, you keep security updates easy and with minimal administrative work. As such, I always choose to build a package if possible, and luckily with LibreOffice Online I can. And it's right there in the repo! A huge win in my mind, especially considering my initial fear of a program distributed as a Docker Image [re-re-aside: poor dependency life-cycle management and monolithic software bundles - another reason I hate Docker; but I digress]. As this is a brand-new project and I'm a keen `dist-upgrade`er, I've gone with the brand-new Stretch (9.0) release in the `amd64` arch - you should probably be running the same, but 32-bit will work too.
```
$ cat /etc/debian_version
9.0
$ uname -a
Linux libreoffice-online 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux
```
So without further ado, here's how to build LibreOffice Online on Debian Stretch!
## Installing the [Easy] Dependencies
The `m-jowett` guide lists a couple of dependencies: `libpng12-dev`, `libcap-dev`, `libtool`, `m4`, and `automake`; I'll add in `fakeroot`, `debhelper`, `dh-systemd`, and the `build-essential` metapackage to build the Debian packages, as well as `unixodbc-dev`, which is required by the POCO build process below. The only one to cause problems in Stretch is `libpng12-dev`: we need the `libpng-dev` package instead, which installs `libpng16-dev`. The version bump doesn't seem to affect anything negatively, however. And of course, we need the full `libreoffice` suite and its build-deps installed, plus `python-polib`, `nodejs-legacy`, and `node-jake` to grab some modules during the build, as well as `libghc-zlib-bindings-dev` and `libghc-zlib-dev`, which pull in `ghc`.
```
$ sudo apt install libpng-dev libcap-dev libtool m4 automake fakeroot debhelper dh-systemd build-essential unixodbc-dev libreoffice python-polib nodejs-legacy node-jake libghc-zlib-bindings-dev libghc-zlib-dev
$ sudo apt build-dep libreoffice
```
## Building POCO
The one difficult dependency is `libpoco-dev`. LibreOffice Online makes extensive use of JSON processing via `libpoco`. However, there's a problem: the author of the original JSON library decided it was a bright idea to troll(?) the Free Software community and added a problematic line to his otherwise-MIT-licensed code: "The Software shall be used for Good, not Evil." As a result, that JSON code doesn't meet the [Debian Free Software Guidelines and isn't present in any Debian packages](https://wiki.debian.org/qa.debian.org/jsonevil), and while Debian Stretch contains `libpoco` of the version we require, the extremely-critical-to-LOOL JSON parsing library is disabled in the code. So, we have to build the `libpoco` packages ourselves with JSON support, and then install them as dependencies.
First, we start with a bare Stretch machine with the above dependencies installed and several GB of free space, and create a directory to contain our `libpoco` build. We then download both the Debian source package and a clone of the Git repository at the same version branch.
```
$ mkdir ~/libpoco
$ cd ~/libpoco/
$ apt-get source libpoco-dev
$ git clone --recursive --depth 1 --branch poco-1.7.6-release https://github.com/pocoproject/poco.git
$ ls
poco poco-1.7.6+dfsg1 poco_1.7.6+dfsg1-5.debian.tar.xz poco_1.7.6+dfsg1-5.dsc poco_1.7.6+dfsg1.orig.tar.bz2
```
Note the `+dfsg` tag on the source packages - this indicates that the Debian maintainer modified the package to comply with the DFSG. Luckily, all they did was set some package rules to avoid building the JSON component, then removed the relevant source. By cloning the official repo, we can combine the `debian/` folder from the source package, with modifications, with the upstream source in order to build the JSON libraries. Just "don't use it for evil"!
First we create our "source" `tar` archive for the package build process (the name is explained below), then copy over, clean, and commit the stock `debian/` folder - I recommend doing this for all your custom packaging as it makes retracing your changes far easier!
```
$ tar -cvJf poco_1.7.6-lool.orig.tar.xz poco/
$ cp -a poco-1.7.6+dfsg1/debian poco/
$ cd poco/
$ git checkout -b debian
$ dh_clean
$ git add debian/
$ git commit -m "Initial debian folder from Stretch source package"
```
Now we can begin modifying the Debian package rules. This is fairly straightforward even without Debian packaging experience. I'll indicate the file edits with `vim <filename>`; you can replace `vim` with the editor of your choice. The output that follows each is a basic `git`-style diff of the changes, as generated when the changes are committed to the custom branch.
The first target is the `changelog` file, to tell it we have a new version. Note that the version string (`lool-1`) is chosen specifically because it is "higher" than the official package's `+dfsg1` string. You can validate this yourself using `dpkg --compare-versions`. This ensures that our custom packages will supersede the official ones, should you commit them to a custom repo and upgrade, though in this guide we install them locally with `dpkg`. Note that the formatting of this file must match exactly, including every space and the full date, but feel free to edit the note and name/email as you desire - this is irrelevant unless you intend to distribute the packages.
```
$ vim debian/changelog
@@ -1,3 +1,9 @@
+poco (1.7.6-lool-1) stable; urgency=medium
+
+ * A custom build of 1.7.6 including non-DFSG JSON libraries for LibreOffice Online
+
+ -- Your Name <you@example.com> Tue, 06 Jul 2017 23:47:21 -0400
+
poco (1.7.6+dfsg1-5) unstable; urgency=medium
* Add missing dependencies (Closes: #861682)
```
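If you want to convince yourself that `lool-1` really does supersede `+dfsg1`, the key is dpkg's character ordering: `~` sorts before everything, letters sort before non-letters, and the rest follow ASCII order (so `+` sorts before `-`). The following is a simplified Python sketch of the upstream-version comparison only - it ignores epochs and Debian revisions, so it's illustrative, not a replacement for `dpkg --compare-versions`:

```python
def _order(c):
    # dpkg's modified character ordering: '~' first, then end-of-string,
    # then letters, then all other characters shifted above the letters.
    if c == "~":
        return -1
    if c == "":
        return 0
    if c.isalpha():
        return ord(c)
    return ord(c) + 256

def compare_upstream(a, b):
    """Negative if a < b, zero if equal, positive if a > b."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit runs character by character
        while (i < len(a) and not a[i].isdigit()) or \
              (j < len(b) and not b[j].isdigit()):
            ca = a[i] if i < len(a) and not a[i].isdigit() else ""
            cb = b[j] if j < len(b) and not b[j].isdigit() else ""
            if _order(ca) != _order(cb):
                return _order(ca) - _order(cb)
            if ca:
                i += 1
            if cb:
                j += 1
        # Then compare the digit runs numerically
        na = nb = 0
        while i < len(a) and a[i].isdigit():
            na = na * 10 + int(a[i])
            i += 1
        while j < len(b) and b[j].isdigit():
            nb = nb * 10 + int(b[j])
            j += 1
        if na != nb:
            return na - nb
    return 0
```

On a Debian system, `dpkg --compare-versions 1.7.6-lool-1 gt 1.7.6+dfsg1-5 && echo higher` should confirm the same result against the real algorithm.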
The next file to edit is the `control` file. In here, we add an entry for the `libpocojson46` package that will also be built with `libpoco-dev`. This edit should not require any changes from what is presented here.
```
$ vim debian/control
@@ -228,3 +228,19 @@ Description: C++ Portable Components (POCO) Zip library
consistent and easy to maintain.
.
This package provides the POCO Zip library.
+
+Package: libpocojson46
+Architecture: any
+Depends: libpocofoundation46 (= ${binary:Version}), ${shlibs:Depends}, ${misc:Depends}
+Description: C++ Portable Components (POCO) JSON library
+ The POCO C++ Libraries are a collection of open source C++ class libraries
+ that simplify and accelerate the development of network-centric, portable
+ applications in C++. The libraries integrate perfectly with the C++ Standard
+ Library and fill many of the functional gaps left open by it.
+ .
+ POCO is built strictly using standard ANSI/ISO C++, including the standard
+ library. The contributors attempt to find a good balance between using advanced
+ C++ features and keeping the classes comprehensible and the code clean,
+ consistent and easy to maintain.
+ .
+ This package provides the POCO JSON library.
```
Next we edit the `rules` file to remove the exclusion of the JSON component.
```
$ vim debian/rules
@@ -4,7 +4,7 @@ DPKG_EXPORT_BUILDFLAGS = 1
export DEB_LDFLAGS_MAINT_APPEND = -Wl,--as-needed
include /usr/share/dpkg/buildflags.mk
-CONFFLAGS = --prefix=/usr --no-samples --no-tests --unbundled --everything --omit=JSON --cflags="-DPOCO_UTIL_NO_JSONCONFIGURATION" --odbc-lib=/usr/lib/$(DEB_HOST_MULTIARCH)/
+CONFFLAGS = --prefix=/usr --no-samples --no-tests --unbundled --everything --odbc-lib=/usr/lib/$(DEB_HOST_MULTIARCH)/
# Disable parallel build on armel and mipsel
ifneq (,$(filter $(DEB_BUILD_ARCH),armel mipsel))
```
Now we remove the patches that disabled JSON support in the package. First we remove the patch, then we remove the `series` entry for it.
```
$ rm debian/patches/0006-Disable-JSON-targets-in-Makefiles.patch
$ vim debian/patches/series
@@ -6,7 +6,6 @@ no-link-dl-rt.patch #could be removed now
# patches for build/rules/*
no-debug-build.patch
no-strip-release-build.patch
-0006-Disable-JSON-targets-in-Makefiles.patch
# upstream patches
0007-Add-patch-for-OpenSSL-1.1.0.patch
```
Finally we remove the old lintian overrides (no longer needed) and create the `install` file for our new `libpocojson46` package.
```
$ rm debian/source.lintian-overrides
$ vim debian/libpocojson46.install
@@ -0,0 +1 @@
+usr/lib/libPocoJSON.so.*
```
We can now build the package; I use `-j4` on my quad-vCPU build machine but you should adjust this based on your core count for best performance. Note also that we're using `sudo` here; for whatever reason, trying to compile `libpoco` without root causes a spew of error messages like `object 'libfakeroot-sysv.so' from LD_PRELOAD cannot be preloaded`, on both my testing machines.
```
$ sudo dpkg-buildpackage -us -uc -j4
[lots of output]
dpkg-buildpackage: info: full upload (original source is included)
```
That last line indicates that the build succeeded; above it, we see a long list of generated packages in the parent directory. Move up a directory, remove the unneeded `dbgsym` packages, and install all the rest. Note the `sudo` commands again (due to the permissions of `dpkg-buildpackage`).
```
$ cd ..
$ sudo rm *-dbgsym_*.deb
$ sudo dpkg -i *.deb
```
We now have a working set of JSON-enabled POCO libraries, and can begin building LibreOffice Online itself!
## Building LibreOffice Online
Once the dependencies are in place, building the LibreOffice Online package itself is actually fairly straightforward - the repo contains a working `debian` folder, though it too requires some tweaking to build properly.
Begin by making a new directory and cloning the git repo; I'm using version 2.1.2, the latest stable release at the time of writing. Note that because the LibreOffice Online developers use tags, we have to actually `cd` into the checkout and `git checkout` the right version _before_ we proceed. Then, as we did for POCO, make a `tar` archive containing the source for the package build to use, before we start editing anything.
```
$ mkdir ~/loolwsd
$ cd ~/loolwsd/
$ git clone https://github.com/LibreOffice/online.git
$ cd online/
$ git checkout -b debian tags/2.1.2
$ cd ..
$ tar -cvJf loolwsd_2.1.2.orig.tar.xz online/
$ cd online
```
We now need to do some editing of the Debian control files, similar to POCO. First we add a `changelog` entry (as is customary).
```
$ vim debian/changelog
@@ -1,3 +1,9 @@
+loolwsd (2.1.2-7) stable; urgency=medium
+
+ * Custom build of 2.1.2 for Debian Stretch
+
+ -- Your Name <you@example.com> Tue, 06 Jul 2017 23:47:21 -0400
+
loolwsd (2.1.2-6) unstable; urgency=medium
* see the git log: http://col.la/cool21
```
Next we edit the `control` file. By default, LibreOffice Online depends on the `collaboraofficebasis5.3` suite, however we can override that and allow it to run against the stock Stretch `libreoffice` package. While the diff is long, simply search for the first instance of `collabora` on the `Depends` line and delete everything after it, as well as the entry for `libssl1.0.0`, which is obsolete in Stretch and whose replacement is present by default anyway.
```
$ vim debian/control
@@ -8,7 +8,7 @@ Standards-Version: 3.9.7
Package: loolwsd
Section: web
Architecture: any
-Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, fontconfig, libsm6, libssl1.0.0, libodbc1, libxinerama1, libcairo2, libgl1-mesa-glx, libcups2, libdbus-glib-1-2, cpio, collaboraofficebasis5.3-calc (>= 5.3.10.15), collaboraofficebasis5.3-core (>= 5.3.10.15), collaboraofficebasis5.3-graphicfilter (>= 5.3.10.15), collaboraofficebasis5.3-images (>= 5.3.10.15), collaboraofficebasis5.3-impress (>= 5.3.10.15), collaboraofficebasis5.3-ooofonts (>= 5.3.10.15), collaboraofficebasis5.3-writer (>= 5.3.10.15), collaboraoffice5.3 (>= 5.3.10.15), collaboraoffice5.3-ure (>= 5.3.10.15), collaboraofficebasis5.3-en-us (>= 5.3.10.15), collaboraofficebasis5.3-en-us-calc (>= 5.3.10.15), collaboraofficebasis5.3-en-us-res (>= 5.3.10.15), collaboraofficebasis5.3-noto-fonts (>= 5.3.10.15), collaboraofficebasis5.3-draw (>= 5.3.10.15), collaboraofficebasis5.3-extension-pdf-import (>= 5.3.10.15)
+Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, fontconfig, libsm6, libodbc1, libxinerama1, libcairo2, libgl1-mesa-glx, libcups2, libdbus-glib-1-2, cpio, libreoffice
Description: LibreOffice Online WebSocket Daemon
LOOLWSD is a daemon that talks to web browser clients and provides LibreOffice
services.
```
Next we edit the `rules` file to add a few missing things. First, we add some additional configuration flags: one to disable SSL (use a forward proxy instead; this also prevents build errors) and one to specify the library directory, which the build process needs in order to find the `libreoffice` libraries. We also add a call to `autogen.sh` in the configuration step (inexplicably missing in this version), as well as an override of the auto-build step (to allow `-jX` flags to work).
```
@@ -5,7 +5,7 @@ DPKG_EXPORT_BUILDFLAGS = 1
include /usr/share/dpkg/default.mk
-CONFFLAGS = --enable-silent-rules --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-lokit-path=`pwd`/bundled/include $(CONFIG_OPTIONS)
+CONFFLAGS = --enable-silent-rules --disable-ssl --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-lokit-path=`pwd`/bundled/include --libdir=/usr/lib/x86_64-linux-gnu $(CONFIG_OPTIONS)
# Avoid setcap when doing "make", when building for packaging
# the setcap is done at installation time
@@ -16,10 +16,14 @@ export BUILDING_FROM_RPMBUILD=yes
dh $@ --with=systemd
override_dh_auto_configure:
+ ./autogen.sh
./configure $(CONFFLAGS)
override_dh_auto_test:
# do not test
+override_dh_auto_build:
+ dh_auto_build --parallel $(MAKEARGS)
+
override_dh_installinit:
# no init.d scripts here, assume systemd
```
Now we have to write some patches. `quilt` is a phenomenal tool for this, and I recommend reading up on it, but for simplicity I'll provide the patch files as-is. First create a `debian/patches` folder and create a `series` file in it.
```
$ mkdir debian/patches
$ vim debian/patches/series
@@ -0,0 +1,2 @@
+enable-ssl-always
+fix-libpath
```
Next we create the two patch files. The first enables SSL support unconditionally - this seems to contradict the change above that disables SSL, but without it the build process fails. The second fixes up the location of the `libreoffice` libraries so the `amd64`-only build can find where they live in Stretch `amd64`.
_Note: to avoid showing a diff of a diff, the following two entries are the verbatim contents of the file_
```
$ vim debian/patches/enable-ssl-always
Description: Enable SSL always
Ensure that SSL is always enabled during build
.
loolwsd (2.1.2-7) stable; urgency=medium
.
* Custom build of 2.1.2 for Debian Stretch
Author: Your Name <you@example.com>
---
The information above should follow the Patch Tagging Guidelines, please
checkout http://dep.debian.net/deps/dep3/ to learn about the format. Here
are templates for supplementary fields that you might want to add:
Origin: <vendor|upstream|other>, <url of original patch>
Bug: <url in upstream bugtracker>
Bug-Debian: https://bugs.debian.org/<bugnumber>
Bug-Ubuntu: https://launchpad.net/bugs/<bugnumber>
Forwarded: <no|not-needed|url proving that it has been forwarded>
Reviewed-By: <name and email of someone who approved the patch>
Last-Update: 2017-06-23
--- loolwsd-2.1.2.orig/Makefile.am
+++ loolwsd-2.1.2/Makefile.am
@@ -38,9 +38,9 @@ endif
AM_LDFLAGS = -pthread -Wl,-E,-rpath,/snap/loolwsd/current/usr/lib $(ZLIB_LIBS)
-if ENABLE_SSL
+#if ENABLE_SSL
AM_LDFLAGS += -lssl -lcrypto
-endif
+#endif
loolwsd_fuzzer_CPPFLAGS = -DKIT_IN_PROCESS=1 -DFUZZER=1 -DTDOC=\"$(abs_top_srcdir)/test/data\" $(AM_CPPFLAGS)
```
```
$ vim debian/patches/fix-libpath
Description: Fix the LibreOffice library path
Ensure that the build can find the LibreOffice files
.
loolwsd (2.1.2-7) stable; urgency=medium
.
* Custom build of 2.1.2 for Debian Stretch
Author: Your Name <you@example.com>
---
The information above should follow the Patch Tagging Guidelines, please
checkout http://dep.debian.net/deps/dep3/ to learn about the format. Here
are templates for supplementary fields that you might want to add:
Origin: <vendor|upstream|other>, <url of original patch>
Bug: <url in upstream bugtracker>
Bug-Debian: https://bugs.debian.org/<bugnumber>
Bug-Ubuntu: https://launchpad.net/bugs/<bugnumber>
Forwarded: <no|not-needed|url proving that it has been forwarded>
Reviewed-By: <name and email of someone who approved the patch>
Last-Update: 2017-06-23
--- loolwsd-2.1.2.orig/configure.ac
+++ loolwsd-2.1.2/configure.ac
@@ -166,7 +166,7 @@ AS_IF([test -n "$with_lokit_path"],
[CPPFLAGS="$CPPFLAGS -I${with_lokit_path}"])
lokit_msg="$with_lokit_path"
-LO_PATH="/usr/lib64/libreoffice"
+LO_PATH="/usr/lib/libreoffice"
JAIL_PATH=not-set
SYSTEMPLATE_PATH=not-set
have_lo_path=false
```
One final tweak to perform is to edit the systemd unit file to point to the `libreoffice` templates directory rather than the Collabora one, and change the Red Hat-specific `EnvironmentFile` directive to the Debian one (which isn't installed by default but can be used if needed).
```
$ vim debian/loolwsd.service
@@ -3,8 +3,8 @@ Description=LibreOffice Online WebSocket Daemon
After=network.target
[Service]
-EnvironmentFile=-/etc/sysconfig/loolwsd
-ExecStart=/usr/bin/loolwsd --version --o:sys_template_path=/opt/lool/systemplate --o:lo_template_path=/opt/collaboraoffice5.3 --o:child_root_path=/opt/lool/child-roots --o:file_server_root_path=/usr/share/loolwsd
+EnvironmentFile=-/etc/default/loolwsd
+ExecStart=/usr/bin/loolwsd --version --o:sys_template_path=/opt/lool/systemplate --o:lo_template_path=/usr/lib/libreoffice --o:child_root_path=/opt/lool/child-roots --o:file_server_root_path=/usr/share/loolwsd
User=lool
KillMode=control-group
Restart=always
```
Now we install some required `npm` dependencies. The `nodejs-legacy` package will provide our `npm`; don't install the `npm` package itself, as this will cause dependency hell. I do this in my home directory to avoid putting cruft into the source directories.
```
$ cd ~
$ npm install uglify-js exorcist d3 evol-colorpicker bootstrap eslint browserify-css
```
Finally, we can build the LibreOffice Online package.
```
$ cd ~/loolwsd/online/
$ sudo dpkg-buildpackage -us -uc -j4
[lots of output]
dpkg-buildpackage: info: full upload (original source is included)
$ cd ..
```
Install the resulting `deb` file and we're set - LibreOffice Online, in a Debian package. To install it on another machine, all we need are the packages generated by this guide (`libpoco-dev` and friends, and `loolwsd`).
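With the build done, installation is a single command. As a sketch, the exact filename depends on the version and architecture you built; here, the `2.1.2-7` `amd64` build from this guide:
```
$ sudo dpkg --install loolwsd_2.1.2-7_amd64.deb
```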
### Bonus - Changing the LibreOffice Online directory
By default, the LibreOffice Online package installs the main components of the `loolwsd` service under `/opt/lool`. I'm not a fan of putting anything under `/opt` however, and in BLSE2 everything that is per-server and not automated via configuration management goes under `/srv`. If you also desire this, it's very straightforward to edit the Debian configuration to support installing to an arbitrary target directory before building the package.
```
$ cd ~/loolwsd/online/
$ vim debian/loolwsd.postinst.in
@@ -7,24 +7,24 @@ case "$1" in
setcap cap_fowner,cap_mknod,cap_sys_chroot=ep /usr/bin/loolforkit || true
setcap cap_sys_admin=ep /usr/bin/loolmount || true
- adduser --quiet --system --group --home /opt/lool lool
+ adduser --quiet --system --group --home /srv/lool lool
mkdir -p /var/cache/loolwsd && chown lool: /var/cache/loolwsd
rm -rf /var/cache/loolwsd/*
chown lool: /etc/loolwsd/loolwsd.xml
chmod 640 /etc/loolwsd/loolwsd.xml
# We assume that the LibreOffice to be used is built TDF-style
- # and installs in @LO_PATH@, and that /opt/lool is
+ # and installs in @LO_PATH@, and that /srv/lool is
# on the same file system
- rm -rf /opt/lool
- mkdir -p /opt/lool/child-roots
- chown lool: /opt/lool
- chown lool: /opt/lool/child-roots
+ rm -rf /srv/lool
+ mkdir -p /srv/lool/child-roots
+ chown lool: /srv/lool
+ chown lool: /srv/lool/child-roots
fc-cache @LO_PATH@/share/fonts/truetype
- su lool --shell=/bin/sh -c "loolwsd-systemplate-setup /opt/lool/systemplate @LO_PATH@ >/dev/null 2>&1"
+ su lool --shell=/bin/sh -c "loolwsd-systemplate-setup /srv/lool/systemplate @LO_PATH@ >/dev/null 2>&1"
;;
esac
$ vim debian/loolwsd.service
@@ -4,7 +4,7 @@ After=network.target
[Service]
EnvironmentFile=-/etc/default/loolwsd
-ExecStart=/usr/bin/loolwsd --version --o:sys_template_path=/opt/lool/systemplate --o:lo_template_path=/usr/lib/libreoffice --o:child_root_path=/opt/lool/child-roots --o:file_server_root_path=/usr/share/loolwsd
+ExecStart=/usr/bin/loolwsd --version --o:sys_template_path=/srv/lool/systemplate --o:lo_template_path=/usr/lib/libreoffice --o:child_root_path=/srv/lool/child-roots --o:file_server_root_path=/usr/share/loolwsd
User=lool
KillMode=control-group
Restart=always
```
---
I hope this helps you avoid many hours of headache! I'll document the configuration and integration of LibreOffice Online in another post. Happy building!

+++
date = "2021-04-03T00:00:00-04:00"
tags = ["diy","homelab","buildlog"]
title = "A Custom Monitored PDU"
description = "Building a custom power monitoring PDU for fun and profit"
type = "post"
weight = 1
draft = true
+++
As a veteran homelabber, one thing that always comes up is power usage. Electricity is, unfortunately, not free, even if I do get half of it from "too cheap to meter" nuclear power here in Ontario. And servers can use a lot of it. Some systems provide on-demand power monitoring via their IPMI/BMC interfaces, but not all do. And figuring out how much power the other, non-intelligent, systems use can be a hassle.
The most obvious solution is what is normally called a "per-port monitored PDU". Typically used by colocation providers and large enterprises, these power distribution units (PDUs) allow the administrator to see the actual power usage out of each individual port at a given time, both for billing and monitoring purposes. They're the perfect solution to the problem of not knowing how much power you are using.
But these PDUs are not cheap. New, the cheapest ones I've been able to find run over $1400 USD, and they're gigantic 5-foot monsters designed for full-sized 42U+ racks. Add in dual circuits/UPSes, and the cost doubles. There has to be a better way.
Well, there is. With a decent amount of electrical know-how, some programming, 3D printing, and a lot of patience, I've been able to build myself several custom PDUs. Read on to know how!
**DISCLAIMER/WARNING:** While I am not a professional licensed electrician, I've spent a large portion of my life working with A/C electricity, pretty much from the time I could walk and hold a screwdriver, including a year as an Electrical associate at The Home Depot, as well as numerous home projects and two previous PDUs. I know what I'm doing. **Working with mains electricity is very dangerous, especially if you do not know what you're doing.** This post is provided as a curiosity for most, and a build log/guide only for those who are well-versed in working with this sort of thing. **Do not try this at home. I will not provide advice or guidance on any aspect of any similar project(s) outside of the scope of this document. Contact an electrician if in doubt.**
## PDU 1.0 and 2.0 - Hall Effect sensors
My first two forays into the custom PDU project were simple devices that used the ACS714 [Hall effect](https://en.wikipedia.org/wiki/Hall_effect) current sensors alone. These units were built out of plastic wall boxes, with the sensors held in series with the hot lines connecting to each plug.
The first iteration worked well, but was quite small, with only 16 outlets (4 boxes), which I quickly outgrew.
![PDU 1.0](/images/pdu/1.0/finished.jpg)
The second iteration was a fair bit larger, with 28 outlets (7 boxes), which is still more than enough for my rack even now.
![PDU 2.0](/images/pdu/2.0/finished.jpg)
This design had a lot of downsides however:
1. In terms of monitoring, only getting the current was problematic. Current, measured in amperes, is only one part of the energy equation, and voltage and power factor are other important components which the ACS714 sensor does not provide. I was able to hack together a solution in my monitoring software by using the output voltage readings from my UPSes, but this wasn't satisfactory to me, especially given the slow speed of readings and the inaccuracy relative to a live device reading.
2. The sensors were, in my experience, quite unreliable. They were nearly impossible to calibrate and would sometimes report wildly inaccurate values due to interference. This was especially pronounced with low loads, since I needed to use 20A sensors for safety which have a correspondingly low threshold. Under 0.1A (about 12W) they were completely useless, and under 0.5A (about 60W) they were often +/- 15-20% out from a Kill-A-Watt's readings. Only at very high current values (>1.0A) were they accurate, and then only to about 1 decimal place, a fairly rough value.
3. The physical design of the PDU was cumbersome. Each box had to be wired in a very tight space with very tight tolerances on wire length, leading to many a scraped and cut finger. This was fine at the start, but connect 8 of these boxes together and the unit became cumbersome to work with. Maintenance was also a hassle for this reason. If a sensor died, which thankfully has not happened, replacing it would be a massive chore. And due to the through runs of the power busses, made out of normal 14-2 Romex wire, the boxes were permanently attached to each other, making disassembly tricky at best.
In setting out to design version 3 of the PDU, I wanted to solve all 3 issues, making something more robust and easier to service and maintain, as well as more accurate.
## PDU 3.0: Physical design
Solving issue 3 turned out to be fairly easy - the solution was a 3D Printer, specifically my new Ender 3 v2. Instead of using pre-made 3-gang plastic wall boxes, I could design individual "modules" one at a time, print their components, and then assemble them together using some sort of quick-release between them.
I began with a plan to create a CAD design of a full box, but ultimately this ended up going nowhere, not least due to my lack of experience (and patience!) with 3D modeling software. Instead, I was able to find two smaller components with which I could build out larger boxes: a 2-gang wallplate, and a 120mm "honeycomb" fan grill. These two components could easily be combined with a bit of superglue and electrical tape to form a 2-outlet cubic box which would hold all the wiring, the sensors, and the plugs, while providing ample room to work, as well as an open visual appearance to allow easy inspection of the internal components.
![Faceplate and bus sidepiece](/images/pdu/3.0/faceplate.png)
To connect the electrical portion of the modules together, I avoided the 1.0 and 2.0 method of using marette connectors to join multiple leads, and instead went with a busbar concept. I was able to find two designs of busbar: a dual-bus, 4-post version which would be used for the hot and neutral leads, and a raised 6-post version for the ground leads. This would greatly simplify the assembly by allowing me to use Y-connectors and securely screw down the leads, keeping everything very neat.
![Hot/neutral and ground busbars](/images/pdu/3.0/busbars.png)
These busbars were then mounted on the bus sidepiece pictured above, to give a secure base to work from. This piece alone was enough to assemble the core electrical components easily, with plenty of working room.
![Mounted busbars and outlets](/images/pdu/3.0/mounted-busbars-and-outlets.png)
Connecting the leads was then a trivial exercise of cutting exact-length pieces of 14-gauge wire, stripping the right amount off each end, and bending them into position. Each sensor - we'll cover these in the next section - required 4 leads: two hot, in and out, and two neutral, in and out (though bridged internally), so all 4 leads met up in a level location towards the back of the module.
![Finished leads](/images/pdu/3.0/finished-leads.png)
The final step was a method to connect the modules to each other. Rather than fixed through-wiring, I settled on a relatively recent innovation, which is very common in Europe but virtually unknown in North America: clamp-down connectors. These are absolutely fantastic for this purpose, able to handle a full 32A while being trivial to connect and disconnect. And it turned out that the clearances between modules were perfect for them. Thus, I could easily connect and disconnect modules during assembly or maintenance.
![Connecting modules](/images/pdu/3.0/connecting-modules.png)
And *voilà*, a finished module, ready to accept the four sensor modules. I could then attach the other side plate and the back when I was ready to connect the modules and microcontroller.
![Finished module](/images/pdu/3.0/finished-module.png)
The input section proved to be equally simple to assemble. I was able to find a set of combination IEC-13 input, fuse, and switch units from Amazon fairly easily. While technically rated only for 10A, I feel comfortable with this since each circuit should only ever be expected to run just under 10A load, and this is a derated value. I simply replaced the stock fuses with a 15A fast-blow variety for safety, and assembled a final segment to hold them. Input can now be handled by a simple IEC cord, along with switching, power control, and an indicator light, protected by a fuse should anything ever short out.
![Input module](/images/pdu/3.0/input-module.png)
While I ultimately did need to permanently attach the modules physically to keep the movement from fatiguing the wires, breaking this connection in the future would simply be a matter of cutting some cable ties and breaking some glue bonds to replace the module. I could also extend the unit arbitrarily, with another module or two should I eventually need more than 32 ports.
The last step was assembling a cage for the sensor modules. One "flaw" in the design of the sensors I will mention below is that they float at line level, so an external enclosure was absolutely critical to avoid accidentally touching the modules.
## PDU 3.0: Sensors
As previously mentioned, the original PDU designs used a Hall effect sensor, which only reported current. But for version 3.0, I wanted something more feature-rich and robust. In scouring the Internet for a solution, I came across the perfect device: the HLW8012.
![HLW8012 module from ElectroDragon](/images/pdu/3.0/hlw8012.png)
This module is able to output PWM signals on two interface pins that correspond to the active power (in Watts), and, via a select pin, the voltage or current passing through the sensor. Though there are some flaws with the design, specifically that the voltage is read from the neutral side and thus a large voltage drop in the load will provide bogus readings, and the fact that these units float at line level due to their design, these seemed like the perfect fit.
I was able to find a blog post by Xose Pérez, titled [The HLW8012 IC in the new Sonoff POW](https://tinkerman.cat/post/hlw8012-ic-new-sonoff-pow/) which went over the basics of how this sensor worked and how it can be used to make a DIY single-plug power monitor, similar to a Kill-A-Watt. He [even wrote an Arduino library for it](https://github.com/xoseperez/hlw8012)!
Armed with this knowledge, I ordered 48 (and later, due to an error, another 24) of the sensors from a Chinese seller called [ElectroDragon](https://www.electrodragon.com/product/energy-meter-hlw8012-breakout-board/) and got to work assembling the last component.
![HLW8012 modules installed](/images/pdu/3.0/hlw8012-in-module.png)
During my preliminary testing with an Arduino Uno, however, I found some issues with Xose's library - it was clearly not designed to work with more than one module at a time, so I had to come up with another solution. I also ran into an issue that plagued the 2.0 design: how to collect dozens of wires from dozens of sensors back to a central microcontroller to actually power and read data from them.
## PDU 3.0: Microcontrollers
My first idea was to use the Arduino Mega. This is a monstrous Arduino microcontroller with 54 digital I/Os, more than enough to handle 16 or 18 sensors per circuit, which was my goal - to match or slightly exceed the 32-port 2.0 design. But it had some fatal flaws: first, it is an 8-bit microcontroller, which makes dealing with relatively large integer and floating point values very cumbersome; second, the microcontroller CPU is very slow; and third, and most important to my physical design, I would have to run a total of 20 wires from each module back up to the central Arduino board at the top, an exercise in frustration.
I continued to look for a good solution, when finally a discussion with a friend led to an excellent discovery: the STM32 "Black Pill" microcontroller. Forget an 8-bit 16MHz 54-I/O Arduino board; this tiny monster can do 16 I/Os with a 32-bit 100MHz ARM-based core, which is more than enough to read sensors on the order of microseconds. And it has a USB-C output!
![STM32 "Black Pill" microcontroller](/images/pdu/3.0/stm32-black-pill.png)
But one microcontroller wouldn't cut it; I'd have a total of 3 leads (plus power) from each sensor, and due to the "float at line voltage" design of the sensors, I needed separate microcontrollers for each circuit lest they short. So I would need more than just one. Thus I came up with an even better solution: why not use one STM32 for "each" module (really, one for each "side" of two modules)? This had two benefits: first, I could keep the "each module is separate" mentality through to the compute side; and second, the USB connections would make it trivial to run back to the central controller, a Raspberry Pi - I would just need a USB hub, a USB isolator, and some relatively short USB cables, instead of dozens and dozens of discrete leads. And the distance between each sensor and the microcontroller was small enough to use normal 6" jumper wires. A match made in heaven!
![STM32 attached to the module](/images/pdu/3.0/stm32-attached.png)
## PDU 3.0: Programming
With all the physical assembly completed and the units ready for live testing, I was able to get started with the programming aspect.
The first task was to calibrate the sensors. Looking through Xose's library code, I was able to determine that each sensor would need 3 multiplier values (one each for wattage, voltage, and amperage), which was then divided by the resulting PWM value to produce a usable reading. But how to find those multipliers?
To solve this problem, I used one of the spare STM32s to build a sensor calibrator. The idea is quite simple: I would connect the sensor to the calibrator, attach a known resistive load (a 40W and a 60W lightbulb in parallel, totaling ~100W of load), and connect everything to a no-name Kill-A-Watt to give me a reference. I could then enter the reference value the Kill-A-Watt showed, and let the calibrator read the module and calculate the correct multiplier.
This process took a lot of iteration to get right, and in the end I settled on code that would run a large number of scans trying to determine the exact value that matched my input values. But it was worth the time, and the results turned out to be perfect - I was able to use the calibrator on each sensor to determine what their multiplier should be, and then store this for later use inside the live code on each microcontroller, giving me nearly to-the-watt accuracy.
```C++
[calibrator code]
```
![Calibrator output](/images/pdu/3.0/calibrator-output.png)
With the calibration values in hand, I turned to writing code to handle the actual module microcontrollers. The code here ended up being extremely simple once I had the calibration: simply poll the PWM of each sensor in turn, calculate the output, then print it to the serial console in JSON format for reading by the Raspberry Pi. Note the `struct` for the sensor modules, which contains the individual multipliers found during the calibration step for each given module, as well as the identifier of each plug.
```C++
[microcontroller code]
```
The final step was to write the controller software for the Raspberry Pi side. This turned out to be a bit complex, due to needing to read from a total of 10 microcontrollers in quick succession.
[TBD]
```Python
[pdusensord code]
```
## Putting everything together
With all the programming and module assembly done, I could begin assembling the final PDU. Here is the final result:
![Final PDU](/images/pdu/3.0/final.png)
I spent a few days load testing it with a resistive heater in my garage to be sure everything was in safe working order, and it passed all my tests. I then let it run with 3 servers attached for 3 full months to do a final burn-in, occasionally switching which outlets they were attached to, without incident.
The last step was final installation into my rack:
![PDU installed](/images/pdu/3.0/installed.png)
And the readings by my monitoring software are exactly what I wanted - accurate to the Watt, at least in theory:
![PDU monitoring](/images/pdu/3.0/monitoring.png)
I hope you found this interesting!

---
title: "Building a Debian Package 101"
description: "It's not as confusing or complicated as you think"
date: 2022-12-02
tags:
- Debian
- Development
---
One of the most oft-repeated reasons I've heard for software not being packaged for Debian and its derivatives is that Debian packaging is complicated. Now, the thing is, it can be. If you look at [the manual](https://www.debian.org/doc/manuals/maint-guide/index.en.html) or a reasonably complicated program from the Debian repositories, it sure seems like it is. But I'm here today to show you that it can be easy with the right guide!
My target audience for this post is anyone who has software they want to build, but who currently thinks that making a `.deb` is too complex, difficult, or not worth the effort. Hopefully, by the end of this post, you'll understand exactly how to do it and be able to implement your own Debian package in under 30 minutes.
If that sounds good, read on!
For simplicity's sake, I assume you're doing all this on a Debian system, or one of its derivatives like Ubuntu. Note that things like cross-architecture building are well outside our scope here, though such things are possible. Your package will match the system you build it on, so if you want an Ubuntu 22.04 package, be sure to build it on an Ubuntu 22.04 system, etc.
## Prerequisites
Before starting, you'll need a few dependencies. First and foremost is anything you need to actually build your program; for a lot of things that's `build-essential` plus a few supplemental libraries, but it could include anything else.
Keep track of what build dependencies you need, because we'll need that list later on when creating the `control` file.
Next, install the `dpkg-dev`, `debhelper`, and `devscripts` packages, which provide the main Debian packaging tools and some helper programs. You might also want `quilt` if you plan to make package-specific patches to the code, but I won't cover `quilt` here.
## The Basics: Creating your initial `debian/` folder
Start with your source code in a directory, Git repo, etc. To start, you'll want all your code at the root level, so that you can build it right from there. This helps keep the complexity down.
Our first step is to build a basic, boilerplate `debian/` folder, which is a sub-directory at the root of the source code repository that provides the Debian packaging instructions. So run `mkdir debian` and continue.
Within that `debian` folder are a few key files that every build needs. I'll go through each one in turn, explaining what it does and how to write one. At the end, you'll be able to run `dpkg-buildpackage` to get your binary package.
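As a roadmap, here is the layout of the files we'll create below (a complete package will also need at least a `rules` script and a `changelog`, which are beyond this introduction):
```
debian/
├── compat
├── control
├── copyright
└── source/
    ├── format
    └── options
```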
## Boilerplate files (`compat`, `source/format`, and `source/options`)
These files define some basic configuration for the build system. Given how simple and boilerplate they are, I've collected all 3 under this heading.
`compat` defines the Debian packaging compatibility version, i.e. what version of `debhelper` the package supports. What version you support depends on how old the releases of Debian you want to support are, but `8` or `9` are good baselines.
The next two entries are under the sub-directory `source` within the `debian` folder.
`source/format` defines the package layout format. There are two main formats: `1.0` and `3.0`. Within `3.0`, `3.0 (quilt)` is the most common. Which you choose depends on how advanced you want to go here, but we'll stick with `1.0` as it's the simplest.
`source/options` defines some additional options that will be passed to `dpkg-source` when it builds your package. There are two main categories of entry here that I've used in my packages, though many more exist:
* `tar-ignore='<pattern>'`: One or more entries defining file patterns (glob-style) to ignore when creating the source tar archive. It's usually good practice to ignore things like `.git*`, `*.deb`, and any temporary files or directories your build might produce.
* `extend-diff-ignore='<pattern>'`: One or more entries defining file patterns (Perl regular expressions) to ignore when creating the diff of your source code. Generally you want to ignore any binary files in your source tree.
A good, safe default would be something like:
```
tar-ignore='*.deb'
tar-ignore='.git*'
extend-diff-ignore='.git*'
```
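Since all three boilerplate files are tiny, they can be created in one go from the shell - a sketch, assuming the compat level `9` and `1.0` source format suggested above:

```shell
# Create the debian/ boilerplate: compat level, source format,
# and dpkg-source options (the safe default patterns from above)
mkdir -p debian/source
echo "9" > debian/compat
echo "1.0" > debian/source/format
cat > debian/source/options <<'EOF'
tar-ignore='*.deb'
tar-ignore='.git*'
extend-diff-ignore='.git*'
EOF
```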
## The `copyright` file
The `copyright` file defines the copyright information for your package. Usually, for simple programs, this will just match your project's license.
The file is structured as follows:
```
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0
```
This line defines the copyright format. The 1.0 format specified here is usually sufficient. The URL doubles as a link to the full manual for the `copyright` file format, so for more advanced situations it is worth the read.
```
Upstream-Name: mypackage
```
This line specifies the upstream name of the program. It should match your program's name and the name of the source package. If you're building someone else's program and want to change the name to "Debianize" it, this would be the original name.
```
Source: https://github.com/aperson/myproject
```
This line provides a link to your source. It can be any URL you want, but you should provide something here.
Next is a newline, followed by one or more blocks:
```
Files: *
```
This line defines what file(s) this copyright entry applies to. For a simple project all under one license, this can just be `*`; more advanced copyright situations (e.g. submodules, libraries) might require separate sections for each. The `*` block should always be the last block; that is, define any more specific blocks first. If not `*`, this should be the relative path to the file(s) within the source repository.
```
Copyright: 2022 A. Person <aperson@email.tld>
```
This line defines the copyright year and the name of the copyright owner (including an email address in angle brackets). This is probably you, unless you're packaging up someone else's code. While this email doesn't have to be valid, it should be, in case a user wants to reach you with a copyright question; it will be shown in the information about the package.
```
License: GPL-3
```
This line, and subsequent lines prefixed with a single space, define the actual license of the files. The license name should be one of those found under `/usr/share/common-licenses` (e.g. `GPL-3`, or `Apache-2.0`). The subsequent lines should include the short version of the license text, i.e. what you would put at the top of your source files (not the full license text). Within this block, paragraph breaks should be delineated with `.` characters. At the bottom it's usually best to reference the aforementioned directory as a source of license text as these contain the full version of each license.
### A complete example
Here is a complete example of a `copyright` file for a GPL v3 program:
```
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: myprogram
Source: https://github.com/aperson/myproject
Files: *
Copyright: 2022 A. Person <aperson@email.tld>
License: GPL-3
This package is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, version 3.
.
This package is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>
.
On Debian systems, the complete text of the GNU General
Public License version 3 can be found in "/usr/share/common-licenses/GPL-3".
```
### Separate 'debian' copyright
Sometimes, it might be ideal to use a different license for the actual `debian` folder versus the original program source code. Why you might want to do this is up to you, but to do so, simply create a `Files: debian/*` block before the `Files: *` block to define your alternate packaging license.
## The `control` file
Now we're getting into the meat of the package. The `control` file defines your package, both the source component and the binary component(s). There are many available options here, but I'll provide only the most basic ones needed to build a functional package.
### The "source" package section
These entries define the source package information. The entries are structured as follows:
```
Source: myprogram
```
This line defines the name of the source package, and will usually match the name of the program.
```
Section: misc
```
This line defines the section of the repository that your application goes into. What you put here is pretty arbitrary unless you want your package to be included in the official Debian repositories, so go with `misc`.
```
Priority: optional
```
This line defines the priority of the package. Like the above entry, this only really matters if you're making an official package, so go with `optional`.
```
Maintainer: A. Person <aperson@email.tld>
```
This line defines who maintains the package (and thus, who an end user should reach out to if help is needed). This uses the same format as the person entry from `copyright` above, and this format will be used again later as well.
```
Build-Depends: debhelper (>= 8),
libssl-dev,
somebuilddep
```
These lines define any build dependencies your package requires, i.e. what you installed in the very first section. You can safely exclude `dpkg-dev` and `build-essential` (both are implied), but include here any specific development libraries, additional programs, etc. that you might need to build the program. Note too the first line, which should usually be `debhelper` at `>=` the version you specified in `compat` above. This entry also demonstrates how to require specific version(s) of dependencies: `>=`/`<=` (greater than/less than or equal) are the most common, to specify minimum dependency versions, though other comparisons are possible in more advanced cases.
Entries in this list can be placed on one line, comma separated, or on separate lines as shown here. The final entry should not have a comma after it.
```
Standards-Version: 3.9.4
```
The version of the package standards that the package uses. I usually use `3.9.4` as a baseline for my own packages; the latest version as of writing is `4.6.1`.
```
Homepage: https://myproject.org
```
This line defines a URL to the homepage of your project.
### The "binary" package section(s)
These entries define the output binary package information. There should be one block for each binary package you produce from the single source package, though for a simple project there is a 1-to-1 relationship here. The entries are structured as follows:
```
Package: myprogram
```
This line defines the name of the package, usually the name of your program.
```
Architecture: any
```
This line defines the architecture that the package will support. For simple packaging, this should be `any` (the program can be built against any architecture that Debian supports) or `all` (for native cross-platform packages like Python code or documentation).
```
Depends: mypackagedependency (>= 1.0),
         someotherdependency,
         afinaldependency
Recommends: asoftdependency
```
These lines define any package relationships that the final package will have, formatted like the `Build-Depends` entry in the "source" section above. These are optional: if your program doesn't depend on any other (binary) packages at runtime, just leave them out, but usually you'll depend on *something*.
The `Depends` entries are strict: the package will refuse to install if any of these are missing (when using `dpkg --install`), and will pull them in automatically when using the package manager (e.g. `apt install`). Use this for any hard dependencies the program has.
The `Recommends` entries are malleable: the package will still install if these are missing, but this relationship exists to define anything that might be "nice to have" alongside your program. By default, `apt` *et al* will not install recommended packages, but will show them when installing the package.
```
Description: The one-line description of your program for 'apt search'
 Some additional lines that will describe the program in more depth.
 .
 You may have multiple paragraphs here with . delimiters.
```
These lines provide a description of your package so users know what they're installing. The first line (along with the `Description:` label) is a short version that will be shown as output when running `apt search` and the like. Any additional lines, each beginning with a single space, provide more detail for use with `apt info`.
### A complete example
Here is a complete example of a basic `control` file for a simple program:
```
Source: myprogram
Section: misc
Priority: optional
Maintainer: A. Person <aperson@email.tld>
Build-Depends: debhelper (>= 8),
               libssl-dev,
               somebuilddep
Standards-Version: 3.9.4
Homepage: https://myproject.org

Package: myprogram
Architecture: any
Depends: mypackagedependency (>= 1.0),
         someotherdependency,
         afinaldependency
Recommends: asoftdependency
Description: The one-line description of your program for 'apt search'
 Some additional lines that will describe the program in more depth.
 .
 You may have multiple paragraphs here with . delimiters.
```
## The `changelog` file
The changelog file defines the current, and any past, versions of your package, along with a (generally brief) changelog, as the name implies.
This file is important when releasing new versions: whatever entry is at the top of this file is the "current version" of your program, and will determine the version of the output package. Thus you will have to add a new entry to the top of this file each time you release a new version of your package.
It is required to have at least one entry here (to define the current version of the package), but also good practice to keep older versions in descending order for as long as feasible, so people can compare what changed between various versions of your program.
The entries are structured as follows:
```
mypackage (1.0-1) unstable; urgency=medium
```
The first line defines the values for the changelog entry, and is in a very specific format.
First is the name of the program, which *must* match the `Source:` entry in the `control` file.
Next is the version of the package enclosed in parentheses. This should be the real version of the program that you are building. The first part (before the `-`) defines the "upstream" version, so in this case, we're building version `1.0` of the program, corresponding to a hypothetical Git tag of `v1.0`. The second part (after the `-`) is the version of the *package*. This can be used to define multiple versions of the package that use the same underlying upstream version; unless you're doing complicated stuff involving delegating packaging, just set this to `-1` or `-0`, or leave it out altogether.
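Debian version strings sort according to their own comparison rules, so if you're ever unsure how two versions relate, `dpkg` itself can tell you (a quick aside; this assumes you're on a Debian-based system):

```
$ dpkg --compare-versions 1.0-1 lt 1.1-1 && echo "1.0-1 is older"
1.0-1 is older
```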
Next is the code-name of the release of the package; just set this to `unstable`. Note the semicolon after this.
Finally is the "urgency" of the package. This is used by `apt` to determine how "important" the update is, but can be pretty arbitrary. I usually use `urgency=medium` as a safe default.
```
  * Here is a changelog entry
  * Here is another changelog entry
```
The next section, separated from the first line by an extra newline, contains individual changelog entries. You must provide at least one explaining what's changed, but you can specify several as shown here. Each entry must be prefixed by two spaces then an asterisk (`*`) character before starting the entry. Standard formatting is to capitalize the first letter, keep it short and sweet, and end without a full stop (`.`); if you're using Git and [are writing good Git commit messages](https://cbea.ms/git-commit/), you can just use your Git commit titles here! What you put in each line is up to you, and you can include any metadata or information you might want. Finally note the trailing newline before the final line.
```
 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500
```
The final line of the changelog entry specifies who wrote the entry, again in a very specific format. The line begins with a single space followed by two dashes (`--`) and another space, followed by the author in the standard name + email format (I did say it would come up again!), then two spaces, and finally an RFC Email date (i.e. the output of `date --rfc-email`) defining when the entry was written.
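You don't need to hand-assemble that date; on GNU coreutils, `date -R` (equivalent to `date --rfc-email`) emits it in exactly this format:

```shell
# Print the current time in RFC Email format for the changelog trailer line
date -R
```

Paste the output directly after the two spaces following the email address.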
### A complete example
Here is a complete example of a single `changelog` file entry for version `1.0` of our simple program:
```
mypackage (1.0-1) unstable; urgency=medium

  * Here is a changelog entry
  * Here is another changelog entry

 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500
```
If we were to add version `1.1` of the program in the future, we would add it to the top, and the file would thus look like this (note the extra line between entries):
```
mypackage (1.1-1) unstable; urgency=medium

  * This is a newer version after fixing a bug (GitHub #123)

 -- A. Person <aperson@email.tld>  Fri, 03 Dec 2022 18:28:01 -0500

mypackage (1.0-1) unstable; urgency=medium

  * Here is a changelog entry
  * Here is another changelog entry

 -- A. Person <aperson@email.tld>  Fri, 02 Dec 2022 14:28:01 -0500
```
### The `dch` helper program
The `devscripts` package provides a helper program named `dch` to assist in automating changelog entries. In my experience, you have to change so much of the generated content (or set so many environment variables) that it's rarely worthwhile, but it is something to consider if you do a lot of packaging.
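For what it's worth, a typical `dch` workflow looks something like this (the version and message here are placeholders):

```
# Add a new 1.1-1 entry to the top of debian/changelog
dch --newversion 1.1-1 "Fix a bug in the widget handler"
# Finalize the top entry (fills in the author/date trailer line)
dch --release ""
```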
## The `rules` file
The `rules` file is a `make` script that defines how to build your package. This is the part that usually trips a lot of people up, because this file can get very complicated. However, for most simple programs using standard build tools, `dh` - the Debian build helper - automates a lot of the grunt work for you, and this file can thus be very simple.
The file is structured as follows; note that this is `make` format, so indentations *must* be a tab (`\t`) character, *not* spaces, and the file *must* be executable to work:
```
#!/usr/bin/make -f
```
The first line is a shebang line defining that this is a `make` script with the `-f` option.
```
export DH_VERBOSE = 1
```
This line sets verbosity when building the package, useful for troubleshooting.
```
MY_FILE := binary.out
```
This line defines a variable that can be used later in the script. I show this example here only to specify the format (note the `:=`); a simple program likely won't need any variables.
```
%:
	dh $@
```
This section defines the basic rules for the build. The `%:` heading is "any stage"; there are about two dozen stages in a normal package build that can be defined, and `%` is the "wildcard" for all of them.
Next, the tab-indented line(s) specify the commands that run during this stage. Note that each line here is executed in its own shell context, so if you were to e.g. `cd`, that change would be lost on the next line. In this basic example, though, all we do is pass all of the arguments for the stage on to the `dh` program.
And that's it! Really! If your program uses `./configure && make && make install` style installation, or `cmake`, or is a properly-formatted Python module, or really any "standard" build type, this is all you need to do. `dh` takes care of it all, automatically determining how to build the program, putting it in the right places, and giving you a package out the other side.
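If you're curious exactly what `dh` will do for your package, you can ask it to print the helper commands for a given target without running them (assuming `debhelper` is installed):

```
# Show the commands 'dh' would run for the binary target, without executing them
dh binary --no-act
```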
### Overriding build stages
Now, of course, you can do some more advanced things in this file as well. Any stage can be overridden by using an `override_dh_<stage>` section, which will replace the normal behavior for that stage with whatever you specify. For example, let's say that `make clean` doesn't actually clean up all of our artifacts, so we want to define some custom cleanup that will happen as well. We can override the default `dh_auto_clean` step with the following to achieve this:
```
override_dh_auto_clean:
	rm -f artifacts/out/$(MY_FILE)
	dh_auto_clean
```
Note here that we also use the variable we defined above as an example; variable references in `make` are conventionally wrapped in parentheses (i.e. `$(VAR)`) rather than the curly braces (i.e. `${VAR}`) you'd use in Bash.
Another common example is overriding `dh_auto_configure` to run a `./configure` script with special options. For example:
```
override_dh_auto_configure:
	./configure --my-option-1 --my-option-2 \
	    --newlined-option
```
Note that this example doesn't hand control back to the default helper, so the stock configure behavior will not run. You can use this for completely manual control of a build stage if appropriate.
You have a lot of flexibility here, which is why `rules` files seem so complex. But don't be scared: start simple, see if it works, and only override if you find you *really* need it.
### Handling the pesky shell context
As mentioned above, each line runs in its own shell context. This is mostly relevant if you're moving around directories. So for example, this is *not* valid:
```
override_dh_auto_clean:
	cd artifacts/out/
	rm -f $(MY_FILE)
	cd ../..
	dh_auto_clean
```
Because that first `cd artifacts/out/` runs in its own shell context, the next line (`rm -f $(MY_FILE)`), in another context, is actually relative to the base directory, *not* the `artifacts/out/` directory! You can work around this by putting both commands on one line with a command separator (e.g. `&&` or `;`) like so:
```
override_dh_auto_clean:
	cd artifacts/out/ && rm -f $(MY_FILE)
	dh_auto_clean
```
And since context is discarded, you don't even need to worry about the `cd ../..` part; you will always be back at the root of the repository on the next line.
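As an aside, GNU make's `.ONESHELL:` special target makes every line of a recipe run in a single shell, so a `cd` would persist between lines. I mention it only for completeness: it applies to the *whole* file (affecting every recipe), so the one-line `&&` form above remains the usual idiom in `rules` files. A sketch:

```
.ONESHELL:
override_dh_auto_clean:
	cd artifacts/out/
	rm -f $(MY_FILE)
	dh_auto_clean
```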
### In-built variables
One final note is a special variable that can be used, `$(CURDIR)`. This variable is a full path to the current directory (usually the root of the repository) and can be used for commands that need a full path, for example:
```
override_dh_auto_clean:
	cd $(CURDIR)/artifacts/out/ && rm -f $(MY_FILE)
	dh_auto_clean
```
There are several other in-built variables that you can use as well, but for simplicity, I won't cover them here.
### A complete example
Here is a complete example of the basic `rules` file, with some comments:
```
#!/usr/bin/make -f

# Be verbose during the build
export DH_VERBOSE = 1

# This variable contains a pesky file that 'make clean' won't remove
MY_FILE := binary.out

# Main debhelper entry
%:
	dh $@

# Override dh_auto_clean to clean up MY_FILE
override_dh_auto_clean:
	cd $(CURDIR)/artifacts/out/ && rm -f $(MY_FILE)
	dh_auto_clean
```
## Installing files manually with `install`
Sometimes, and in fact quite often, you will have some static files that need to be manually installed into the package, i.e. files that your build process doesn't take care of automatically; for example, a systemd service unit file called `myprogram.service`.
These custom files can be defined in the `install` file, which tells the package build to add the files to the resulting package after the build is completed.
Each line in the file is structured as a source and then a destination (either a directory or filename), just like a `cp` or `mv` command.
The source is always relative to the root of the repository, while the destination is always relative to `/` on the target system. So using our `myprogram.service` example, we might put that file in `debian/conf/` and then have an entry in `install` like so:
```
debian/conf/myprogram.service lib/systemd/system/
```
This will ensure that `myprogram.service` is installed to `/lib/systemd/system/myprogram.service`. Note that the destination is treated as a directory to install *into*: `dh_install` copies files into place but cannot rename them (for that you'd need a helper like `dh-exec`), so the trailing `/` is optional, but adding it makes the intent clear.
### `install` shenanigans: a build-less package
This file also allows shenanigans if you want to create a "source" package that doesn't actually do any "building", just moves files around. The `rules` file can stay as the minimal stub shown earlier; with no upstream build system to detect, the `dh_auto_*` build steps simply do nothing:
```
#!/usr/bin/make -f

%:
	dh $@
```
And then use `install` to just copy a bunch of files into place:
```
src/myprogram usr/bin/
```
This can be useful for things like pure documentation or a collection of scripts that are entirely static.
### An `install` per package
While not explicitly covered here, `control` lets you make multiple binary packages out of one source package. It can thus be useful to have separate `install` lists for each binary package. To do this, you simply start the filename with the name of the binary package (i.e. what is defined in `Package:` in the `control` file) followed by `.install`. For example, you could have `mypackage.install` and `mypackage-docs.install` which install different sets of files.
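For instance (with hypothetical file names), `debian/mypackage.install` might contain:

```
src/myprogram usr/bin/
```

while `debian/mypackage-docs.install` lists only the documentation:

```
docs/manual.html usr/share/doc/mypackage/
```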
## The `conffiles` file
Sometimes, you might have configuration files shipped with your program that you want users to be able to edit themselves, and that won't be (automatically) overwritten by a new version of your package. You can handle this with the `conffiles` file.
By default Debian will treat any file under `/etc` as a `conffile`, so you don't need to explicitly define these. Thus, if your program follows the Linux filesystem hierarchy standard, you don't need this file.
However, if you have configuration files elsewhere on the system, you should define them in this file, one file per line.
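For example, a `conffiles` file marking a (hypothetical) configuration file that lives outside `/etc` would be simply:

```
/opt/myprogram/settings.conf
```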
The `conffiles` of a program are treated specially during a package removal. `apt remove` will not remove them by default, in order to preserve the configuration of a package; you must use `apt purge` to remove any defined `conffiles`, so keep this in mind if you want to define them.
## Controlling installation and removal with maintainer scripts
When your package is installed on a user system, it can often be useful to do "things" to the system. A canonical example would be creating a service user and enabling our example `myprogram.service` unit on install, then disabling the service and deleting the user when the package is removed.
There are 4 types of maintainer scripts that can be specified. Each script is a `/bin/sh` script (starting with a `#!/bin/sh` shebang) which can then do arbitrary things to the system. They do not need to be executable in the source repository, but will be once installed by the package.
Each script should enable `set -e` (errexit); any failure of any step is then a fatal error, and will terminate the configuration (and, for `pre` scripts, the remaining installation) of the package, so be careful to explicitly "catch" errors with `||` as needed. Note too that the scripts run as `root`, so be very careful here!
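The "catch" pattern looks like this (a standalone sketch, not taken from this post's scripts):

```shell
#!/bin/sh
set -e
# Under errexit, an unguarded failing command would abort the whole script;
# appending '|| true' (or an explicit fallback) lets execution continue
rmdir /this-directory-does-not-exist 2>/dev/null || true
echo "still running"
```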
* `preinst` runs during package installation, before the actual files of the program are installed. You can use it to check the sanity of the system or other similar tasks, though this file is likely the least-used.
* `postinst` runs during package installation, after the actual files of the program are installed. This is the most common maintainer script, often used to configure services, add users, `chown` directories, etc.
* `prerm` runs during package removal, before the actual files of the program are removed. This is the second most common maintainer script, often used to de-configure services, remove users, remove created directories, etc.
* `postrm` runs during package removal, after the actual files of the program are removed. Some tasks in `prerm` could likely also go in `postrm`, but where you put tasks depends on the specifics of your program and what the script is doing, e.g. stop services in `prerm` but remove directories in `postrm`.
In very simple programs, you might not need any of these scripts, or might only need one or two of them. For our example we'll only need `postinst` and `prerm` to handle our service and user.
Thus we would have a `postinst` as follows:
```
#!/bin/sh
set -e
# Create the service user (if not already present from a previous install),
# with home /var/lib/myprogram and shell /usr/sbin/nologin to prevent login
if ! id myprogram >/dev/null 2>&1; then
    useradd \
      --no-user-group \
      --create-home \
      --home-dir /var/lib/myprogram \
      --shell /usr/sbin/nologin \
      --gid daemon \
      --system \
      myprogram
fi
# Enable and start the service
systemctl enable --now myprogram.service
# Explicitly exit 0
exit 0
```
And a `prerm` as follows:
```
#!/bin/sh
set -e
# Disable and stop the service
systemctl disable --now myprogram.service
# Remove the user ("catch" a failure so it isn't fatal)
userdel myprogram || true
# Clean up the data directory (don't worry about program files, 'dpkg' handles that!)
rm -rf /var/lib/myprogram
# Explicitly exit 0
exit 0
```
### Maintainer scripts per package
Like the `install` file above, these maintainer scripts can be defined per-binary-package, using the same `<package name>.<script>` format, if your package requires it.
### Don't do sketchy things in maintainer scripts!
Finally I want to point out to not do sketchy things in maintainer scripts. 2 years ago, the Raspberry Pi Foundation [abused their maintainer scripts in a critical package](https://github.com/RPi-Distro/raspberrypi-sys-mods/commit/655cad5aee6457b94fc2336b1ff3c1104ccb4351) [to install a completely unrelated repository for Microsoft VS Code](https://www.reddit.com/r/linux/comments/lbu0t1/microsoft_repo_installed_on_all_raspberry_pis/) [without any obvious traces in the usual Debian places](https://hothardware.com/news/raspberry-pi-microsoft-repository-phones-home-added-pi-os) (i.e. anywhere visible with `dpkg -L`/`apt-file search`/etc.)
DO NOT do this, EVER. Maintainer scripts are NOT for adding files to the system; that's what `install` and the build process are for, which allow the files installed by packages to be tracked by the `dpkg` system. You could perhaps make a case for modifying files in maintainer scripts, but adding new files or trying to do anything tricky is verboten, and certainly do not do what the RPF did. Abuse of maintainer scripts like this not only destroys user trust, but it actively hides changes to the system from the package manager, and prevents those changes from being managed and modified in the future by new package versions. It's a horrible practice all around. Use maintainer scripts only to do the bare minimum tasks needed to ensure your package will work and to clean up after it, nothing more.
## Building your package
Now that you've prepared your `debian` folder and package configuration, it's time to actually build your new package! In the root of your source repository, run the following command:
```
dpkg-buildpackage
```
This will build the package for you. You should get 5 files out of the build, one level higher than your current directory (i.e. at `../`):
* `mypackage_1.0-1_amd64.deb`: The actual binary package. The version and architecture are auto-populated based on the build.
* `mypackage_1.0-1_amd64.buildinfo`: A file containing information on the build, including checksums, dependencies, environment, etc.
* `mypackage_1.0-1_amd64.changes`: A file containing information about the package including changelog, checksums, and the description.
* `mypackage_1.0-1.dsc`: The Debian source package information.
* `mypackage_1.0-1.tar.gz`: An archive of the source for use with the `.dsc` file.
You can then install your `.deb` or add it to a repository manager like `reprepro`.
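To install the result locally, pointing `apt` at the file (note the leading `./`, which tells it this is a local path rather than a repository package name) will also resolve the declared dependencies:

```
sudo apt install ./mypackage_1.0-1_amd64.deb
```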
If something went wrong, that's OK! It's common to hit errors the first time you try to build a package: mistakes in `rules`, parts that don't build right, typos, and so on. Luckily, `dpkg-buildpackage` is very verbose and shows, in real time, all the build steps as they occur. Pay close attention to what failed, tweak your scripts or configuration to match, and try again. Assuming your `dh_auto_clean` is actually cleaning everything up properly, it's safe to re-run the build as many times as needed to get it working - and if the clean step isn't doing its job, the command will complain and tell you about it, so you'll get plenty of feedback to adjust your `rules`.
Happy building!

---
title: "Fixing a Pesky Trackpoint"
description: "Stop your mouse moving randomly on a Thinkpad while preserving the buttons"
date: 2023-01-27
tags:
- DIY
- Technology
---
Today's post is a fairly short one. I've used Thinkpads for quite a while, first a T450s, then a T495s. I'm a huge fan of them, even the current generations. One thing I especially like is the button layout: because of the trackpoint (a.k.a. the "nub" mouse pointer), I get an extra set of physical buttons above my trackpad, including a middle mouse button. I find these buttons absolutely invaluable to my minute-to-minute usage of my laptop.
The problem began when I had to replace my T495s keyboard due to a fault. I needed a replacement quick, so official Lenovo parts were out. I ended up settling on a relatively cheap Amazon replacement from an off-brand. While the keyboard itself was relatively fine, I was almost immediately struck by a major problem, and one that seems to plague many Thinkpad users: the mouse would move by itself due to the faulty sensor in the trackpoint. This is often called "trackpoint drift".
This is an extremely annoying condition, since it not only moves the mouse in an unwanted way, but can often completely override the trackpad input. So I wanted to find a solution to stop this. Luckily for me, I don't actually use the trackpoint for its mouse movement functions at all, so my first thought turned to disabling it entirely.
The problem is that, of course, this thing is just a mouse. If you turn off the trackpoint, you also turn off its buttons. So that option was completely out.
Next, I did some searching on ways to disable just the mouse functionality while retaining the buttons. This is much harder than you might think (or, is exactly as hard as you may think, depending on perspective).
Luckily, I was able to stumble upon [a random Arch Linux forums thread](https://bbs.archlinux.org/viewtopic.php?id=252636) where someone posted a hacky (yet elegant) solution to this. Specifically, post #6 from the user "k395" mentions a solution he came up with that leverages the `evtest` command (Debian package `evtest`) to capture the mouse events from the device, then uses a Perl wrapper around the `xdotool` command (Debian package `xdotool`) to manually generate the appropriate mouse button events. What a solution!
```shell
sudo evtest --grab /dev/input/event21 | perl -ne 'system("xdotool mouse".($2?"down ":"up ").($1-271)) if /Event:.*code (.*) \(BTN.* value (.)/'
```
*"k395"'s one-liner solution*
I had to do a bit of modification here though. First of all, I needed to determine exactly which `/dev/input/event` node was the one for my trackpoint. Luckily, running `evtest` with no arguments enters an interactive mode that lets you see what each event node maps to. Unfortunately I haven't found a way to get this programmatically, but these seem to be stable across reboots, so simply grabbing the correct value is sufficient for me. In my case, the node is `/dev/input/event6` for the `Elantech TrackPoint`.
```shell
$ sudo evtest
No device specified, trying to scan all of /dev/input/event*
Available devices:
/dev/input/event0: AT Translated Set 2 keyboard
/dev/input/event1: Power Button
/dev/input/event2: Lid Switch
/dev/input/event3: Video Bus
/dev/input/event4: Sleep Button
/dev/input/event5: Power Button
/dev/input/event6: ETPS/2 Elantech TrackPoint
/dev/input/event7: ETPS/2 Elantech Touchpad
/dev/input/event8: PC Speaker
/dev/input/event9: ThinkPad Extra Buttons
/dev/input/event10: HD-Audio Generic HDMI/DP,pcm=3
/dev/input/event11: HD-Audio Generic HDMI/DP,pcm=7
/dev/input/event12: HD-Audio Generic HDMI/DP,pcm=8
/dev/input/event13: Integrated Camera: Integrated C
/dev/input/event14: Integrated Camera: Integrated I
/dev/input/event15: HDA Digital PCBeep
/dev/input/event16: HD-Audio Generic Mic
/dev/input/event17: HD-Audio Generic Headphone
Select the device event number [0-17]: ^C
```
*Getting the list of event inputs for my system*
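As a possible non-interactive alternative (a sketch based on my device's name above; I haven't verified it across models), the `Handlers=` lines in `/proc/bus/input/devices` map device names to event nodes, so something like this should extract the node number:

```shell
grep -A 4 'Elantech TrackPoint' /proc/bus/input/devices | grep -o 'event[0-9]*'
```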
But that wasn't the only issue. Unfortunately this basic implementation lacks support for the *middle* mouse button, and given that I use it quite extensively (for Linux middle-button quickpaste, closing tabs in Firefox, etc.), I needed that functionality.
This prompted me to rewrite the Perl-based one-liner into a slightly easier-to-read Python version, implementing middle-button support as well. I also added a bit of debouncing to avoid very rapid presses resulting in two `xdotool` events in very rapid succession. I then put everything together into a script which I called `disable-trackpoint`.
```bash
#!/bin/bash
set -o xtrace

# We assume display 0 on the laptop
export DISPLAY=:0

# Event ID found via evtest
event_id=6

# Grab events and pipe to our xdotool parser
/usr/bin/evtest --grab /dev/input/event${event_id} | /usr/bin/python3 -c '
from os import system
from sys import stdin
from re import search, sub

last_time = 0.0
for line in map(str.rstrip, stdin):
    if search(r"^Event", line) and search(r"EV_KEY", line):
        event = line.split()
        time = float(event[2].strip(","))
        button = sub(r"\W+", "", event[8])
        action = event[10]
        # Debounce button presses to 0.05 seconds
        if action == "1":
            if (time - 0.05) < last_time:
                continue
            last_time = time
        # Action 1 is a "down" (press), 0 is an "up" (release)
        if action == "1":
            cmd = "mousedown"
        else:
            cmd = "mouseup"
        # Buttons map this way: left mouse is 1, middle is 2, right is 3
        if button == "BTN_LEFT":
            btn = "1"
        elif button == "BTN_MIDDLE":
            btn = "2"
        elif button == "BTN_RIGHT":
            btn = "3"
        else:
            continue
        # Run xdotool with cmd and btn
        system(f"/usr/bin/xdotool {cmd} {btn}")
'
```
*The disable-trackpoint script*
There is one major downside here: this script does not function properly under Wayland. It seems to work but button presses are mapped to incorrect windows. You must use Xorg for this to work. For me that's not a huge deal as I've never really found much benefit to one over the other, but it's worth noting if you try this yourself.
Finally, I set this script to run automatically in a systemd unit file which will start it on boot and ensure it keeps trying to start until the display is initialized.
```systemd
[Unit]
Description = Fix trackpoint problems by disabling it
Wants = multi-user.target
[Service]
Type = simple
ExecStart = /usr/local/sbin/disable-trackpoint
Restart = on-failure
RestartSec = 5
StartLimitInterval = 5
StartLimitBurst = 99
[Install]
WantedBy = multi-user.target
```
*The disable-trackpoint service unit (at `/etc/systemd/system/disable-trackpoint.service`)*
One `systemctl enable` later, and there we go: I have a disabled trackpoint but with enabled buttons, and can finally stop chasing my mouse cursor across the screen! While this is certainly a dirty hack, spawning a lot of processes, it does seem to work for me and hopefully someone else too.

---
title: "Gamifying My Drumming, or: Rock Band 3 with an Alesis Strike Pro"
description: "How I connected my electronic drums to a PS3 to play Rock Band 3, with full hi-hat support"
date: 2023-05-09
tags:
- DIY
- Technology
- Music
---
## The Backstory
I've been a drummer for over 2 decades, since I was 14. As part of my youthful musical exploration journey, I got a very old basic drumkit from my grandfather, who was a big band drummer during his heyday, and I spent some time learning swing beats and trying to emulate some of the drummers I had come to idolize (mostly, Neil Peart). In Grade 10 I auditioned for our high school jazz band on guitar, and while I wasn't successful there, almost on a whim the band director suggested I try the drums too, since he knew from our Grade 9 class that I had excellent rhythm. I pulled out those swing beats my grandfather had taught me, and in contrast to the dudes coming in and playing punk rock rhythms, he picked me handily.
With my new-found purpose in drumming, I got myself (read: my parents got me) a basic Tama drumkit that I used for the next 10 years, slowly adding pieces and cymbals (mostly cymbals), including my favourite addition: a DW double kick pedal I got for my 21st birthday. As much as I loved this kit, as anyone who's ever lived with (or near) a drummer knows, it's a loud hobby. As I moved forward in my career in my mid-to-late 20's, I found I had less and less time to bang out a drum session during reasonable hours, so eventually to save space I put away the kit.
A few years later I had moved on to a new job, working a 13:00-21:00 shift and getting home around 22:45 every night. After having stuck to the bass and keys for a few years at that point, I thought seriously - using my newfound pay raise - about getting an electronic drumset, so I could play at any time and expand my soundscapes with near infinite variety using samples. I eventually settled on the Alesis Strike Pro, which really is a fantastic kit for the cost. I won't go too deep into the features or specs here, but suffice to say that aside from a few quirks, I still love the thing over 5 years later!
## The Problem
But, the problem is, not drumming for nearly 5 years took an absolute toll on my skills. Before, I could easily play for an hour or more solid, doing back-to-back-to-back Rush, Dream Theater, Porcupine Tree, and other demanding songs, with nary a care in the world. But after the break, my endurance was absolutely shot - sometimes I could barely finish a song before my arms would "give out", and my general skill had also taken a dive, especially around double-kick and fast fills.
There was also an issue with what I'll call "active feedback" on the Strike. With a real drumkit, I was able to put in earbuds for the music I was listening to, and the sheer sound of the drums would permeate me physically and around the earbuds, and I ended up with a perfect "mix" most of the time without being overly loud in my ears. But with the Strike, being purely electronic audio, it was very hard to find a mix that worked without either blowing out my eardrums or it being very hard to hear myself.
In short, I was stuck in a rut and a catch-22: to improve, I needed the motivation to really play, but I had little motivation to play because I had lost so much of that skill and it was so hard to hear when I played. So, for most of the 5 years I've had the Alesis Strike Pro, it sat idle, barely being played and collecting a nice layer of dust that is *very* hard to remove from the rubber cymbals.
As further motivation, the COVID-19 pandemic did an absolute number on me personally and health-wise. 2+ years of working from home with very little activity gave me a whole host of health problems, including a significant weight increase, high blood pressure, and anxiety that revolves around (causing and caused by) "left side numbness" as I put it. By early 2022 I was a complete wreck, suffered a serious mental health crisis that still hasn't fully resolved over a year later, and I really needed a way to get myself back into some semblance of shape.
## The Solution: Gamify Drumming
Drumming is *excellent* exercise. Even if you're just playing basic beats, you're doing cardio, strength training on at least 2, maybe 3 or even 4, limbs, and you can work up a sweat fairly easily. So I knew what I needed to do: I needed to actually *play* my drums. But how?
Well, there was another thing I did a lot of in my early-to-mid 20's: Rock Band! I love this game, even today. Playing Rock Band drums was pretty close to a real drumkit in terms of workout, and was always fun for me, even solo. I still had all my gear kicking around, and a few months ago my good friend suggested we bust out the game for a small gathering of friends, which was a smashing success. But the fake little plastic drumkit has a lot of pain points: the kick pedal has no rebound and is flimsy and easily broken, the positions of the drums are wacky, and hitting them hard (as I'm wont to do) kills them very quickly. This really got me thinking: could I play Rock Band with my Strike? It would solve all the problems, in both directions: I'd get *visual* feedback for my playing, no audio mixing issues, and I could work up to harder songs and longer sessions over time, while also letting me play the game on a real kit with real positioning and playing feedback (proper kick rebound, drums that didn't feel like a rubber mat, etc.). So I went looking for how to do this.
It turns out I'm definitely nowhere near the first, and I definitely won't be the last. So, the rest of this blog post will detail my setup, how I got it working, the parts I used, and the challenges I've faced with the Strike, with an eye towards helping others do this as well.
## Part Zero: The PlayStation 3
I've never been much of a gamer, even going back to my childhood. We had a complete mishmash of consoles over the years (Sega Saturn, then XBox, then PS3), and for the PS3 I think the most we ever owned was 3 games: LittleBigPlanet 1 & 2 and Rock Band 3. But my PS3 still worked perfectly, and as luck would have it, it was an early Slim model that was fully compatible with modding/jail-breaking so I could use RB3 Deluxe and custom songs. I won't detail that part in this blog post, but I was able to jailbreak, reload, and put custom songs onto my PS3 in about an afternoon's worth of work.
Then, as luck would have it, my sister told me that she actually had an old spare PS3 from an ex who had abandoned it with her before they broke up. Even better, it was an original fat model, though alas not one with native PS2 support and NAND flash. Further, her fiancé had an older 42" Samsung TV I could use. In my head I jumped right to the idea of having a dedicated PS3 and TV for my drum area so I wouldn't have to move anything around: I could just sit down and play! I got all the pieces, fully cleaned and re-pasted the PS3, set up the TV, repositioned my drums, and got to work with the setup.
## Part One: Rock Band 3 Pro
The first step of this is Rock Band Pro mode. Introduced in Rock Band 3, pro mode is designed to give a fully "authentic experience" in playing the instruments. Pro guitar/bass feature a full-fret plastic guitar (versus just 5 buttons), pro keys has you play actual notes over a 2 octave range, and pro drums adds support for 3 cymbals in addition to the "toms"/pads.
As part of Pro mode, Mad Catz made the "Rock Band Pro MIDI adapter", which is pretty much exactly what it sounds like: you can input MIDI from an instrument into it, and it will "convert" it into the signals that Rock Band's Pro mode can handle. It works with virtually anything as long as you send the right MIDI notes, and doing Pro mode with electronic drums is of course one of the supported options.
I got the Wii version of the adapter because, according to much of the community, it was the best option: widely available, easily modified (by removing a resistor) to support the PS3, and as little as a third of the price of the PS3 version, the only limitation being a lack of shoulder buttons. So I ordered one from Amazon along with a MIDI cable and hooked it up. I then reconfigured the Alesis head module to output, for each drum and cymbal, the specific MIDI notes that the adapter was looking for (documented in the manual).
And it worked!
## Part Two: The Hi-Hat
But, there were a couple issues I had. Well, OK, really just one issue that wasn't "me": the Hi-Hat.
See, Rock Band 3 Pro mode for the drums comes with a "Hi-Hat Pedal" mode. They would sell you a second pedal you could use as either a second kick drum or the "hi-hat pedal", and there is an option in the game to turn this mode on. Problem is, this mode doesn't do *jack*. It does nothing to affect the gameplay or the charts whatsoever.
Now why might that be a problem? Well, as a convention, Rock Band Pro drum charts use the Yellow Cymbal (hereafter YC) for a closed hi-hat note, and sometimes a secondary or tertiary crash. Then they use the Blue Cymbal (BC) for a ride note, a secondary or tertiary crash, and, most importantly, for an open hi-hat note. But only sometimes.
What this means is that a fairly common drum pattern of closed and open notes looks something like this (YC, BC, Ki[ck], Sn[are]):
```
YC BC YC BC YC BC YC ...
Ki Sn Ki Sn ...
```
On a real drumkit, this is played with the hi-hat pedal opening and closing the hat on each 8th note. No moving between arm positions, just use of the left foot. It's so basic it's often one of the first "exotic" beats that people learn. Further, there are often little open hi-hat splashes that occur inside other beats, which are trivial on a real kit.
But because of the complete lack of functionality of the "Hi-hat pedal mode", you're forced to play this in Rock Band on two different cymbals. Now, on the tiny plastic Rock Band kit this is annoying but manageable, since your arm only has to move about 10-20 degrees and about 12-18 inches. But on a real kit, this is functionally impossible to do accurately and comfortably at any amount of speed. It's a deal-breaker, and I really wanted to solve it.
Now, many *low-end* drumkits have a workaround for this. You could simply have the drum head unit send a different note, specifically a YC note, for the hi-hat when it's closed, and a BC note for when it's open. Great!
But, the problem is, that's only for *low-end* kits. See, that method of doing things isn't flexible enough for doing "real things" with DAWs or sequencers or what have you. On higher-end kits - and the Alesis Strike Pro is a very high-end kit - what is done instead is to send a single "note" for the hi-hat hit itself, and then also send MIDI CC#4 (Control Change #4) events containing the position of the pedal. This gives audio equipment maximum flexibility in handling the signal. And you can't switch off this mode on the drum kit itself; you're stuck with this mode of operation.
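To make that concrete, here's roughly what the two schemes look like on the wire. The note and CC values here are illustrative only, not the Strike's actual mapping:

```
Low-end kit:                      High-end kit (like the Strike):
  Note On 22  (closed hi-hat)       CC#4 127    (pedal fully pressed)
  Note On 26  (open hi-hat)         Note On 22  (hi-hat hit: closed)
                                    CC#4 10     (pedal released)
                                    Note On 22  (same note, now "open")
```

In the low-end scheme the "openness" is baked into the note number; in the high-end scheme the receiver has to combine each hit with the most recent CC#4 value to know what was actually played.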
Now here's the craziest thing: Rock Band 3 can accept that CC#4 signal! In fact, it's what the Hi-Hat pedal in the game is mapped to! But because the feature isn't actually implemented, it's useless outside of the "freestyle" mode or fills (which I turn off myself); it doesn't affect the chart or rewrite the notes for you as it "should" (at least, as I think it should have...). A dead end, so I thought.
## Part Three: An Arduino and a MIDI Shield
I asked around a few places about this, and I got answers ranging from "why would you want that?" (clearly not a drummer) to "it's impossible". But I knew it wasn't. The solution seemed fairly obvious to me: if I could somehow read the MIDI signal as it was coming out of the drumkit, and, using the CC#4 signal as a guide, rewrite the Hi-Hat note based on the pedal position, I could *fake* a simple on/off hi-hat signal mode.
Enter the [SparkFun MIDI Shield for the Arduino](https://www.sparkfun.com/products/12898)! This shield gives you MIDI-IN and MIDI-OUT ports and interfaces with the internal serial bus of the Arduino. And [there is a fully feature-complete library for it as well](https://github.com/FortySevenEffects/arduino_midi_library), allowing one to easily build MIDI functionality in the Arduino IDE.
All the pieces came together for me: I just had to write a bit of conditional code that would read in the MIDI CC#4 events, track the hi-hat "openness" state, handle notes asynchronously, and rewrite the YC signal to be a BC signal whenever the hi-hat was opened.
As part of this, I leveraged the Pro MIDI adapter's ability to map multiple notes to each game note, so I would send note 22 for the Hi-Hat yellow cymbal, but send note 26 for the Crash yellow cymbal, so they could operate independently. I'd then rewrite only note 22 events based on the pedal, so the crash would always work while the hi-hat would change in response to the pedal.
I also wanted the ability to turn this remapping on and off. Very quickly I noticed some songs where the charting was such that all hi-hat notes, even "open" ones, were mapped to the yellow cymbal, especially during double kick drum passages. Thus for those songs I'd want the ability to turn remapping off so I could still play naturally without needing to adjust my hi-hat physically, then swap back for other songs easily. Luckily the MIDI Shield also comes with push-buttons, so I attached one of them to the board, and added this to the code. I used the red and green LEDs on the board to indicate what mode it's in as well, so I have clear visual feedback of when the remapping is on (green) or off (red).
At this point I had things working enough to do a quick demo video, [visible on my YouTube channel here](https://www.youtube.com/watch?v=ocAzJ67x4Z0). While my videography failed quite spectacularly after about 3.5 minutes, it gets the point across: this solution was a success!
## Part Four: Building the Final Form
One thing I didn't like about the build at this point was the Arduino Uno I was using as the microcontroller. It's hard to explain or really quantify, but the responsiveness of the kit just felt "sluggish" to me, and I attributed this to the performance of the Arduino in reading the MIDI events and then writing them back out. There were also weird issues with a floating input picking up stray capacitance, causing the mode to flip back and forth constantly at times, which became really annoying.
So I decided to replace the Arduino with another microcontroller, the much faster STM32 "Blackpill", of which I had over a dozen lying around from another (failed) project. The Blackpill had a number of benefits: it was much faster (a much higher-clocked 32-bit ARM core versus the Uno's 8-bit AVR), it was much smaller and could thus fit into a smaller area, it had 2 serial UARTs so I could actually *debug* the thing with serial prints, and finally it could be powered over USB-C (of which I had many more long cables than I had USB-A to USB-B).
I soldered everything together with my (brand new) [Pine64 Pinecil](https://pine64.com/product/pinecil-smart-mini-portable-soldering-iron/) - which, as an aside, is the best soldering iron I've ever used - using 22-gauge Ethernet wire I had lying around, and covered it in tape to protect it and the leads as well as eliminate any stray bridging from touching it. I then made a quick little wire mount, soldered onto the potentiometer anchor points of the MIDI Shield, and used this to attach it to the top of my speaker controller for a clean, easily accessible setup.
The debugging part came in real handy as I worked to calibrate exactly what the threshold between open and closed should be, and also helped me greatly simplify and optimize the code I had originally written, for maximum speed. At this point I started playing regularly on it, and after nearly 2 weeks I've been playing at least once a day, sometimes even more.
## Part Five: Pictures and Code!
What would this post be without some pictures?
![Wiring of the Blackpill and MIDI Shield](blackpill-hat-wiring.jpg)
Here is a quick WIP shot of the wiring for the Blackpill and the MIDI Shield. You can see the power along the left and the various signal lines to the shield across the center. A2 and A3 are the second serial UART on the Blackpill; A5 is the button for mode control; and A6 and A7 are the LEDs for status indication. Not shown is the aforementioned heavy wire mount, which was soldered to the mechanical anchor points at the top of the board in this image. The boards are attached together with relatively thick double-sided tape to keep them solidly together while insulating them from each other.
![MIDI Rewriter module in situ, front](midi-rewriter-front.jpg)
![MIDI Rewriter module in situ, back](midi-rewriter-back.jpg)
Here are two images, front and back, of the MIDI Rewriter module in its final position with all connections. From the front, the MIDI-IN from the drum head is on the right, while the MIDI-OUT to the Pro Adapter is on the left. USB power is visible on the back, and all the cables are neatly organized using small cable ties. The two USB cables (USB-C power for the Blackpill and USB signal for the Pro Adapter) are routed over to a USB hub by the PS3 along the drum frame.
![Pro Adapter/Controller](pro-controller.jpg)
Here is a shot of me holding the Pro Controller. The cables are neatly routed to provide me plenty of slack to hold the controller if needed, and the MIDI cable acts as a loop to hook onto the golden-coloured 3D-printed hook attached to the side of the drum module. Also (slightly) visible underneath the drum module are my headphones that I use during "quiet hours", on another golden 3D-printed hook. This keeps everything together and nicely out of the way while I'm playing while still being accessible instantly.
![Whole Setup](whole-setup.jpg)
And here's the entire kit setup, with the TV, speakers and PS3 (just behind the uncovered speaker) visible. The USB hub is attached to the desk just behind the Hi-Hat cymbals. The speakers are in Stereo 2x mode, with both the pair on the desk as well as a pair on the floor on either side of me (right one visible). I used coloured electrical tape to add little colour accents for the cymbals to help establish my muscle memory for the game, which took a solid week to get used to (versus the original Rock Band drums), but now I just like how it looks. The fact that the Strike Pro has 3 crashes worked out wonderfully here as I'm able to have both the normal Green Cymbal crash, along with separate "crash" versions of the Yellow and Blue cymbals for when I feel that playing authentically requires them. For toms, the rack toms are mapped as you would expect (smallest is yellow, next is blue), and the "floor" toms both are technically mapped to green but I only use the first, with the second acting as a convenient table for the remote and vocal controller. Bonus: my best result yet for Time and Motion by Rush ([a custom chart by ejthedj on C3](https://db.c3universe.com/song/time-and-motion-16247))!
Finally, [the code for the Blackpill version of the Rewriter module is available on my GitHub](https://gist.github.com/joshuaboniface/660ab942198909e4f136f66a4065a691) for anyone interested in implementing their own.
## Part Six: Demo Video!
Here's a demo of me playing a song with the finalized version of everything. The hi-hat action still isn't *perfect* but it's more than good enough to "feel good" to play.
{{< youtube cA7e7zTVD7E >}}
## Part Seven: Conclusions and Next Steps
All in all I'd consider this project a resounding success. First and most importantly, it's got me wanting to drum, often 2 or 3 times a day in short 15-30 minute increments. It took a little while to get used to this layout and I'm slowly building back my endurance, but the goal of making me *want* to drum by gamifying it has definitely worked well. Second it was a very fun project, letting me use all my biggest skillsets to achieve a goal I wanted for a long time, which was deeply satisfying.
For next steps, I have a few ideas. First of course I plan to continue tweaking the hi-hat settings mercilessly until I get them *just* right, but I fear I'm limited as much by the Strike Pro (it's always been flaky on open/closed hi-hat work) and my own skill as I am by the controller, but I'll try. I also plan to implement a foot pedal switch to swap modes, along with better debouncing code, so that I could theoretically switch the Rewriter module on and off mid-song if I wanted to, though so far I haven't found much of a need for this.
I hope you find this post interesting, useful, and perhaps inspiring! Please send me an email if you have any feedback!

+++
class = "post"
date = "2024-02-17T00:00:00-05:00"
tags = ["philosophy", "floss"]
title = "My Opinions on Free and Libre Open Source Software"
description = "Because trying to write them as replies never works"
type = "post"
weight = 1
draft = true
+++
## Why Write This?
Over the years, I've been engaged in many arguments and debates about the nature of open source, especially *vis-a-vis* funding open source. Invariably, my position is apparently unclear to others in the debate, forcing me to expend literally thousands of words clarifying minutiae and defeating strawmen.
As a leader of two projects that are inherently governed by my philosophy on Free and Libre Open Source Software (hereafter, "FLOSS"), I feel it's important to get my actual opinions out in the open, in as direct and clear a manner as I possibly can. Hence, this blog post.
## Part One: What I Believe FLOSS "means" a.k.a. "The Philosophy of FLOSS"
"FLOSS" is a term I use very specifically, because it is the term that Richard Stallman, founder of the Free Software Foundation (FSF) and author of the GNU General Public License (GPL), suggests we use.
In terms of general philosophy, I agree with Mr. Stallman on a great number of points, though I do disagree on some.
To me, "FLOSS" is about two key things, which together make up an ethos and philosophy of software development.
### FLOSS is about ensuring users have rights
This part is pretty self-explanatory, because it's what's covered explicitly in every conception of FLOSS, from the FSF's definition, to the Open Source Initiative (OSI) definition, to the Debian Free Software Guidelines (DFSG).
Personally, I adhere to the FSF and GPL's 4 freedoms, and thus I reject - for myself - non-copyleft licenses.
> “Free software” means software that respects users' freedom and community. Roughly, it means that the users have the freedom to run, copy, distribute, study, change and improve the software. Thus, “free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.” We sometimes call it “libre software,” borrowing the French or Spanish word for “free” as in freedom, to show we do not mean the software is gratis.
> You may have paid money to get copies of a free program, or you may have obtained copies at no charge. But regardless of how you got your copies, you always have the freedom to copy and change the software, even to sell copies.
Now, as I'll discuss below, I have some disagreements with this definition when we begin to talk about "price". But those first two sentences are what's important here.
### FLOSS is a statement of altruism
This is the part that I think, if not makes me unique, at least makes me different than most people who write and release "open source" or other FLOSS software.
I believe that FLOSS software is a statement of altruism. It is about giving something to the world, to humanity, and to the computing community.
On its face, this doesn't seem radical, but it is, and it almost completely informs my opinions on monetization and distribution that I'll discuss below. So it's a very important point to take in: to adhere to "FLOSS philosophy" means, to me, to have altruistic motives and actions.
## Part Two: Monetizing FLOSS done Wrong with "Open-core"
With my definition of "FLOSS Philosophy" out of the way, let's discuss monetization, starting with things I see as counter to said philosophy and thus intellectually dishonest or worse.
This blog post originally started as a treatise on Open-Core software, but without the philosophical underpinning, I found it very hard to explain why I felt the way I did about it.
For anyone unaware of what this term means, "open-core" software is software that is *nominally* FLOSS, but which hides some subset of actual code features behind a proprietary license and other restrictions. For a hypothetical example, consider a grocery list software program. If the program itself is free and open source, but the ability to, say, create lists longer than 50 entries or to create lists of electronics instead of groceries, is a proprietary, paid extension, this is "open-core" software.
Open-core is one of the most pervasive FLOSS monetization options. Countless pieces of software, from GitLab to CheckMK to MongoDB, are "open-core".
And I think this model is scummy, because it fundamentally violates the second part of the philosophy. How?
1. "Open-core" software is not about altruism. Sure, it may *seem* that way because *part* of it is FLOSS. But the other part is not, and thus the *complete* software solution is not FLOSS.
2. "Open-core" software is, almost *invariably*, marketed as FLOSS, because the social clout of FLOSS brings in contributors and users, building an "ecosystem" that is then monetized when...
3. The line drawn in every piece of "open-core" software is arbitrary. Why 50 grocery items, and not 100? Why just groceries but not electronics? Why is the line drawn there, and not somewhere else? The very existence of such a line is arbitrary, as is its positioning. Thus, the software *as a whole* is not FLOSS because of arbitrary limits on its usage.
Now, some may argue that feature X is "only for enterprises and they should pay" or something similar. This is nonsense. It is not up to the *author* to decide that; it's up to the *user*. And by presenting an arbitrary line, the philosophical idea of altruism goes out the window. There is nothing altruistic about proprietary software, and "open-core" software is just proprietary software with FLOSS marketing.
There is one last part of "open-core" software that I find particularly egregious. By its nature, "open-core" software is contrary to a volunteer ethos and community-driven development. Consider the grocery example above and a new contributor called Jon. Jon wants to add support for listing clothing in addition to grocery items. He wants to extend this "FLOSS" software. Will his contribution even be accepted? After all, the "FLOSS" part is just for *groceries*, and electronics are hidden behind the paywall. Will Jon's merge request languish forever, be ignored, or be outright deleted? And if it's merged into the "FLOSS" portion of the software, the line becomes even more arbitrary.
## Part Three: Monetizing FLOSS done Wrong with "CLAs"
Contributor License Agreements or CLAs are incredibly common in "corporate" FLOSS software. They're usually marketed as "protecting" the community, when in fact they do anything but. The software license protects the community; the CLA allows the company to steal contributions at an arbitrary future date by changing the license at will.
I think it should be pretty obvious to anyone who adheres to the philosophy above why this too is scummy. Contributors make contributions under a particular license, and if that license is changed in the future, particularly to a proprietary license, those contributions are stolen from the world at large and divorced from the altruistic intentions of the contributor.
Now, not every project with a CLA will necessarily switch licenses in the future. The issue with CLAs is that they give the controlling interests the *option* to do so. And for how long can those interests be trusted, especially from a profit-driven corporate entity?
## Part Four: Monetizing FLOSS done Right with Employer-sponsored FLOSS
## Part Five: My Thoughts on the Future of FLOSS

+++
class = "post"
date = "2019-11-14T19:00:00-05:00"
tags = ["politics","union","administrator","developer"]
title = "On Knowledge Workers and Unions"
description = "or, On unions for developers and aministrators from a Marxist perspective"
type = "post"
draft = true
weight = 1
+++
It's been quite a while since I made a post on here, and this one is not about tech itself, but about my politically-influenced opinions on unions and my industry of DevOps. This post is heavily influenced by my own political views. I'm a Marxist - I subscribe to his Labour Theory of Value, his idea of Dialectical and Historical Materialism, and his ideas on the Class Relations of workers (proletarians) and owners (bourgeoisie), along with a few more obscure classes. I'll try to avoid filling this post with excessive leftist jargon in the hopes of not requiring much or any previous knowledge of leftism, but some may still slip through and I'll try to define it in context. But ultimately I hope this post will inspire some alternative thoughts about unionization and professional relations within our industry, to our benefit. I think labour relations are something the "IT industry" needs to think about broadly as we expand in size, both to avoid selling ourselves short (literally) and to avoid repeating the mistakes of the past. I'm sure a lot of this could also be applicable to other fields, but I'm focusing heavily on my own field here: I will use "knowledge workers" to represent us and a few similar fields more generally, and "IT industry" to refer to DevOps (and its two child fields, software development and systems administration) more specifically.
# On Knowledge Workers and Unions
I just finished reading [Erik Dietrich's fantastic post](https://daedtech.com/the-beggar-ceo-and-sucker-culture/) on what he calls the Beggar CEO and Sucker Culture. It draws on a previous post of his, [Defining the Corporate Hierarchy](https://daedtech.com/defining-the-corporate-hierarchy/), itself inspired by [a post by Venkatesh Rao](https://www.ribbonfarm.com/2009/10/07/the-gervais-principle-or-the-office-according-to-the-office/) and the original cartoon by Hugh MacLeod. If you haven't read these posts, I'd definitely recommend them - the post by Rao in particular is more a series than a post, and is incredibly long, but it is also incredibly important and has definitely influenced a lot about how I think of corporate culture.
In this article, Dietrich talks about a Dear Abby-style article in which (briefly) a CEO complains that her workers leave after their 9-5 and won't work long hours "like [she] does". And I agree with a lot of Dietrich's points here. But he is careful not to make this "political". He, for his own reasons, keeps the discussion purely in terms of existing Neoliberal Capitalism (the dominant form of Capitalism in the Western core since the early 1980s, focused on tax cuts for the rich and "austerity", public service cuts, for the working classes) without any class analysis. One commenter noticed this, the aptly-named "Unionist":
>Unionist: More than 2,000 words and not one of them is “union”. That's the real problem.
The resulting comment chain was a fairly expected debate between a few pro-union (and a few seemingly leftist) commenters, some anti-unionist knowledge workers, and the author himself. One section of the chain struck me in particular:
>Eric Dietrich: I don't think that collective bargaining is the cure for what ails a high-demand, knowledge work field. Reason being, I don't think we need to accept the subordination that entails — I see the demand for and cost of software creating a future where we engage in a professional services model, like doctors and lawyers. Those professions don't need to unionize because they control the game. So should we.
>Eric Boersma: Respectfully, I think this is a place where you're letting your own skill set cloud your judgement of what's possible. Everyone cannot just be consultants, dispensing code where it is valuable and proper, because that's not a model that fits the needs of many or most programmers or businesses. Most programmers are quite bad at selling their own skill sets. Most poorly estimate what they're capable of providing — many to the positive, some to the negative. Consultation very rarely provides space for effective on-the-job training, putting a higher workload onto individual employees to continue to grow their skills in useful directions throughout their career. Additionally, it's telling that your two examples, lawyers and doctors, have significant and difficult profession entrance exams which gate people who are not capable of doing the job effectively from being able to claim the title as well as grueling early-career workloads.
>The vast majority of doctors do not work for themselves. The vast majority of lawyers do not work for themselves. The vast majority of developers will never work for themselves; all three of those groups can use the power that collective bargaining provides to effectively improve their work conditions. Swearing off unionization as a means of professional advancement is like becoming a programmer and swearing that you're never going to use TCP/IP because you've heard bad things about it. You're taking tools out of your toolbox before ever giving them a shot, and ignoring them even when they're clearly the best tool for the job you're trying to accomplish.
Dietrich's opinion reflects what I see as a trend in the IT industry away from meta-analysis of our own employment through a leftist lens. I think this is a very flawed, but common, understanding, and he uses an also-common-and-flawed comparison to two other well-discussed knowledge worker careers: doctors and lawyers. Boersma points out several quick examples of these flaws, but I think this deserves a more in-depth breakdown, because the issue of unions in the IT industry, and of knowledge workers more broadly, is woefully under-discussed and becomes more relevant with every new member joining the field.
## On "Knowledge Workers" - Doctors and Lawyers, a brief history
The first place to start would be the comparison made between IT workers and two other extremely-well-cited knowledge worker fields: doctors and lawyers. It is extremely common to mention these two careers when discussing wages, compensation, and other employment-related matters. I hope for their sake that readers are generally aware of these two professions, if not the specifics of each, so I'll avoid discussing them in detail. But the comparisons involved when discussing these two professions almost invariably come down to a few common elements that both professions share, at least in the Anglosphere (the "English-speaking West" of the US, UK, and the nations of the British Commonwealth, including my own Canada):
1. Doctors and Lawyers are generally considered "high class" jobs worthy of aspiration.
1. Doctors and Lawyers, especially senior members (10+ years of experience), are generally very well-paid, making "upper-middle-class wages".
1. Doctors and Lawyers generally work very long hours, often 60+ per week.
1. Doctors and Lawyers are both extremely well-educated, requiring many years of schooling and practical "grunt-work" experience, as well as professional certification.
If you're an IT professional, or an Engineer, or an Architect, or a member of one of several other knowledge fields, looking at this list you'll probably notice that these traits are also generally shared by us. And this is not something I see as a flaw in Dietrich's argument. It's absolutely true that these professions, collectively, are something entirely different from what many would call "blue-collar" labour, the proletarians of Marxist thought. And through the development of leftist thought, a name was coined to describe these workers: the "Professional and Managerial Class" (PMC). Ordinary Western non-leftist thought commonly calls this the "Middle Class", but that is a term devoid of meaningful analysis and hence I will not use it. I also exclude the "managerial" element of this class from my discussion here, partly due to the influence of Rao's opinions on corporate culture in the 21st century, and also because lumping them together would hinder the analysis.
As the world has moved further into the 21st century, knowledge workers have come to dominate discussions of the future of labour. This is, after all, what someone implies when they say, usually to a worker whose job has been automated by machinery, "go back to school to get an education and a 'better job'". Education is an important component of PMC careers. But usually when this is said, the speaker is *not* implying that the target should take up one of these two specific jobs. Why is that?
First and foremost, being a "high class" job really means little on its own; under capitalism it usually just means "very well-paid". So why are these two careers so well paid?
The most common answer draws on the other two points: "they work very long hours and deserve the high pay", and "they had to spend a lot of time (and, absent publicly funded post-secondary education, money) training for the job". But this doesn't tell the whole story.
The important thing is their *union*.
But, you may ask, what union? I don't mean "unions" as is traditionally thought of them here. What I mean is their professional standards organizations.
Doctors have medical schools and the AMA, ACP, etc. Lawyers have law schools and the "bar" of their jurisdiction. Professional Engineers (P.Eng.) have engineering schools and their local licensing bodies, in Ontario the PEO. A similar story is true for almost every other PMC industry (except, of course, the managers).
These professions all have bodies that, while not focusing primarily, or even at all, on collective bargaining or the proletarian-versus-bourgeoisie element of employment, include an element of certification for the profession. In order to legally call yourself a Lawyer, you must pass the bar. Or pass a medical school exam, followed by a residency. Or write a professional engineering certification. What these bodies do is ensure that these knowledge workers form an insular society, gatekept by the existing members of the organization in order to ensure a minimum bar of knowledge before a new member can work professionally.
This, I think, is what fundamentally separates PMC careers from "blue collar" careers like trades or service work, despite trades in particular superficially resembling this. Every one of these careers has a bar that must be crossed.
## On Traditional Unions - the why and how
With the PMC out of the way, we can discuss the main point of unions: their protection of workers from exploitation. All the things usually associated with unions - collective bargaining for better compensation, workplace standards, protection of members from dismissal - tie back into this point; they protect workers from exploitation by the ownership class.
The history of unions is long and depressing. Born out of the conditions of coal mines, steel foundries, and Dickensian sweatshops, they sought to organize workers together to fight against their exploitation. They were brutally and violently suppressed time and again, but kept fighting until they won what are commonly considered the hallmarks of modern employment: 8(-ish) hour days, fair pay, weekends, workplace health and safety regulations, and a plethora of other things modern workers take for granted.
But Western unions lost much of their power throughout the 20th century, as the protections they won became the norm, and as more dangerous, laborious, and low-skilled work was exported to poorer parts of the world. This culminated, I think, in one of the major blows against unions in modern history: the 1981 Air Traffic Controllers strike, where Ronald Reagan called the ATC union's bluff, fired the entire union's membership, and replaced them.
One of the biggest sources of union power has always been the idea that firing the entire union membership is impossible - if not because they would be hard to replace, then because they could literally occupy the factory, mine, or workplace and prevent others from working. But this was impossible in 1981. How do even 11,000 people occupy hundreds of airports, guarded by militarized law-enforcement officers behind razor-wire fences? It was an exceptional situation for sure, but the ripple effects were wide-reaching.
Union membership has been declining steadily since the 1980s, especially as neoliberalism became the norm. And unions have since developed a very unsavoury reputation: that they "keep bad employees employed", that they simply suck money from members to fund a union elite (a Labour Aristocracy), and that they're generally useless.
But of course, this has never been true, and similar things have been said of unions since their earliest days. The fact is that unions have been a force for more good than bad, and fixing these problems with unions is a matter of building class consciousness and solidarity, which is not insurmountable considering where unions started. Unions can be powerful if they're well organized, and this is visible even after four decades of attack.
## On the IT Industry's Lack of Organization
Despite not being "unionized" in the usual sense, knowledge workers are still organized by their certification bodies. And "blue-collar" workers form traditional unions, which ensure all members are treated fairly.
But the IT industry lacks both of these. In fact, for the most part it lacks *any* coherent organization at all. And this is indeed a problem.
First, the IT industry as a whole does not have any sort of professional certification body or specific schooling. There may be optional certifications, computer science or engineering courses, and a few professional *societies* like the IETF or the League of Professional System Administrators, but these are not binding organizations. Indeed, a large part of the appeal of the IT industry, especially software development, is that it requires no formal training or education to become a member, allowing the self-taught to compete with even their most well-educated colleagues.
Second, despite appearing very much like a traditional trade, the IT industry has almost no collective representation. Indeed, due to the siloed nature of the field (as is, for instance, Engineering more broadly), individual members of the industry may not see much at all in common with one another, with new subfields being created every day. This makes traditional unionization more difficult as well. After all, how many articles have been written about DevOps and how it breaks down the walls between System Administrators and Developers? Just tackling this one bridge, entirely within the professional sphere itself, has been a huge challenge. But efforts like this also help us.
Ultimately, there are two sources of these issues that I can see:
1. Generally, IT industry professionals are not responsible for hiring their own, with some exceptions.
1. The culture of the IT industry, stretching back as far as the 1980s, has always favoured brash individualism over collective solidarity.
Each of these deserves some discussion, but both fundamentally shape the lack of organization within the IT industry, and both help identify the things we need to combat to improve it.
## On Hiring in IT - buzzwords, HR, and Startups
* common thing I see people complain about - HR hiring
* buzzwords abound, filter by keywords
* testing of employees is hard
* finding good workers takes time
* professional certifications are worthless
## On the Toxic Culture of IT
* High focus on individualism, low empathy
* Rockstars, 10x
* Nerd in-group (reference Dietrich article)
* The good - FOSS
* The good - DevOps, building connections
## Why bother with organization?
* good for people
* good for tech! (DevOps, etc.)
## The ideal IT union
* describe the ideal IT union

---
title: "Open Core Is Cancer"
date: 2019-08-16T09:53:17-04:00
draft: true
---
# Open-Core software is cancer on the FOSS world
In December 2018, I started the Jellyfin project with a few other contributors. The saga is fairly long and complex, but the short version is: "FOSS"-branded software closes off portions of its code, "FOSS"-branded software does scummy things, "FOSS"-branded software goes closed-source, the FOSS community forks the project, the original authors call the FOSS community "moochers", "pirates", etc., and the community fork sees massive support and encouragement regardless. Clearly people like truly FOSS software.
Now, almost two years later, this saga has forced me to see something that I cannot unsee: this pattern is everywhere in the "FOSS" software world. It permeates projects, from large Enterprise-backed projects all the way down to small, single-developer ones. This mentality and software monetization paradigm is a cancer on the FOSS world. It divides and drags down members of the community. It stifles contributors and users. And it is only getting worse.
In this post I hope to record my thoughts on this trend, as a sort-of manifesto for my ideas in developing FOSS software, such as my [PVC](https://github.com/parallelvirtualcluster) hypervisor manager project, and of course [Jellyfin](https://jellyfin.media).
## What is "Open-Core" software?
Before I can explain why I think Open-Core software is cancerous, it's helpful to have a definition for it. I use the following:
> Open-Core software is software that consists of at least two versions.
> One version is fully FOSS-licensed, distributed, and maintained as FOSS software. It accepts contributions from community members, it is released under a FOSS license, and is generally Gratis. It is often called a "community edition", "free edition", or something of the sort.
> The other version(s) are NOT FOSS-licensed. They are almost invariably fully closed-source, and are only available for a fee in binary format. They are often called the "enterprise edition", "paid edition", etc.
> Most critically for the definition, and what separates "Open-Core" from two closely related projects that are simply "FOSS" and "closed-source", is that *the non-FOSS components extend the functionality of the FOSS component*. Or, put another way, the non-FOSS components offer *features and functionality* that are explicitly missing from the FOSS component.
With this definition out of the way, we can begin to talk about the motivations for "Open-Core" software, and why this is destructive to the ethos of FOSS.
## Why would anyone choose "Open-Core" as their software model?
Money.
It's about the money.
No, really. It is widely accepted that FOSS software has a "monetization problem". That is, it's hard for programmers to get paid to write software which is given away 100% freely. And even I can concede that this view has merit. Despite being an avowed Communist, I can understand completely that in a Capitalist economic system, payment for programmer time is important.
However, this sort of monetization is also the *easiest* form. It's absolutely trivial to hide functionality behind a paywall. After all, the code is the same. A developer simply has to not release part of the code as FOSS and suddenly, if that feature is in high demand, they have a guaranteed revenue stream.
This is not the only method to ensure developer time is compensated, nor even is the requirement for compensation guaranteed. The Debian project is able to create a fully-functional (and in my opinion one of the best) operating systems in the world entirely through volunteer effort. Other projects are able to fund developer teams only through donations. These methods are not impossible. But they require work beyond just writing the code. Volunteer-only projects require non-monetary compensation to be worthwhile to the contributors (often in the form of self-motivated goals, and the sense of community the projects provide, important human motivators). And donation-based projects require active effort to ensure a steady stream of donations. But again, this is not impossible, and there are countless examples of 100% FOSS software which is able to succeed with these methods.
There is one more method: support. This is the model of RedHat, the most successful FOSS-based company in history. Provide the product for free, and support it for a fee. Clearly this method also works. But it requires even more effort and process than the donation-based approach. And unfortunately, projects that already use this method can quickly succumb to a desire to boost their monetization by going "open-core"; even RedHat has. The reason is quite simple: you need talent to monetize support, talent that is often very quick to leave for greener pastures. And you have to be able to support complex issues with the software in order to justify the costs. It is a balancing act, and one that is not always successful. But it certainly can be, at least until the Capitalist profit motive overrides moral considerations.
## Open-Core, by its very definition, disrespects the ideals of FOSS software
This is a pretty spicy take. And of course, I'm not qualified to speak for every FOSS developer, community member, or Richard Stallman out there. But let me break it down.
FOSS software, and specifically copyleft (GNU GPL) software, is built on the 4 key freedoms laid out by the GNU project:
* The freedom to run the program as you wish, for any purpose (freedom 0).
* The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
* The freedom to redistribute copies so you can help others (freedom 2).
* The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.
In the strictest sense, open-core seems to abide by the letter of these laws. After all, one could easily argue that "any given feature in the closed edition is simply not FOSS, but the rest of the software is". But I think this fundamentally misses the *point* of the four freedoms. They speak towards an ethos of collaboration, of sharing the software fully in order to grant each user those freedoms.
And this is where I think open-core is insidious. If a user of a piece of software is only able to achieve 80% (for example) of the functionality of the software, under traditional FOSS software it behooves them to implement that remaining 20%. Hopefully, with the support of the community, those improvements can then be shared back with the rest of the community, both to build on and to improve further. But if that remaining 20% is locked behind the non-FOSS portion of an open-core project, a conflict of interest arises. Community members may want to implement that functionality for themselves, but attempts to contribute it back upstream will inevitably be rejected - the group running the software does not want that 20% being freely available under the FOSS license. This has a chilling effect on the community: seemingly arbitrary pieces of functionality can, at any time, be stripped out in favour of the non-FOSS component. And this in turn leads contributors to second-guess features - does X fall under the banner of Y, which is a non-FOSS component? Am I even privy to the existence of the functionality? Will my contribution be rejected without explanation and possibly even deleted? Will I be threatened with "infringement" because I accidentally reimplemented non-FOSS code? This is also a clear conflict with the ideals behind the four freedoms. By causing developers to question the nature of their contributions and where they fit into the grand monetization scheme of the project, the second, third, and fourth freedoms really are compromised.
## Gratis users are left in the dark
So, the "Open-Core" model is harmful to developer effort and the spirit of cooperation in FOSS software communities. But it also harms users as well.
Motivations are often unclear and hidden. I'll admit, I've seen examples of Open-Core software that truly does respect its FOSS and Gratis sides. It releases all the important features for free, and reserves a select few for the paid version. However, how can any user ever know where the line is drawn? Perhaps a feature that seems "high-end" to the developers is a critical feature for a very small operation. Or perhaps, in the inverse, a very heavy user has no use for any of the paid features. No one wins in this scenario.
"Open-Core" ultimately results in keeping users, and specifically the Gratis users, in the dark, always second-guessing what might happen next release. Always second-guessing whether a "bug" is legitimate or a scummy attempt to force users to the non-FOSS versions. And this harms FOSS in general - after all, the project is branded FOSS and has a FOSS community, but if users see, for example, a big performance bug, and this bug is not present in the non-FOSS versions, it raises the questions "is this a priority to fix?" and "is this intentional?". Both of which harm the developers, as well as the FOSS community more broadly.
Further, it's often hard to tell whether projects are taking a "release-later" "open-core" position. With this sort of position, the project releases a feature as part of the non-FOSS product today, but intends to eventually release it as part of the FOSS project. But of course, if they announced they were doing this, it would cannibalize their attempt to monetize. So this fact is hidden from end users until the day the feature is released. Now, I do think that this method of "Open-Core" is probably the least offensive, since the features do eventually make it to the FOSS community. But it is also predicated on the goodwill of those behind the project, and can change at any time. All of this leaves users in the dark.
## Monetization drives decisions
Beyond just leaving users in the dark, with "Open-Core" software the attempt to monetize will, almost inevitably, change the motivations of the project for the worse.
In a truly FOSS project, 100% committed to the ideals of the 4 freedoms, every decision, every feature, exists to make the project better. It has to - otherwise, why would anyone contribute it? Perhaps it only scratches the itch of a single member of the project, but that's still a motivation to improve the project as a whole. And since every change is open to everyone, it can be adequately critiqued and tweaked to be the best it can be.
But when monetization is the first goal and "Open-Core" is the solution, everything flips on its head. Suddenly the "FOSS" aspect is secondary to the moneymaking, and the ideological compromises stack up, one after another, until there is very little left of the freedoms and the project ultimately decides to go closed-source.
## Under Open-Core, FOSS is just marketing
Ultimately, under an "Open-Core" model, FOSS just becomes a marketing tool. A way to "hook" users who care about the freedoms of FOSS software into using a product, only to effectively extort them for payment to get 100% of the functionality. This is harmful.
## There must be a better way
I don't have a complete, or even a good, answer to the problem of FOSS monetization. Some models work for some companies or individuals, others don't, and [some people go nuclear](https://lwn.net/Articles/880809/). But as I've stated above, I think "Open-Core" is one of the worst ways to proceed, harming both developers and users in the process. It is an affront to the 4 freedoms in spirit, it stifles innovation, and it should be stopped.

---
title: "Patroni and HAProxy Agent Checks"
description: "Using HAProxy agent checks to clean up your Patroni load balancing"
date: 2018-09-17
tags:
- Development
- Systems Administration
---
[Patroni](https://github.com/zalando/patroni) is a wonderful piece of technology. In short, it [allows an administrator to configure a self-healing and self-managing replicated PostgreSQL cluster](https://patroni.readthedocs.io/en/latest/), and [quite simply at that](https://www.opsdash.com/blog/postgres-getting-started-patroni.html). With Patroni, gone are the days of having to manage your PostgreSQL replication manually, worrying about failover and failback during an outage or maintenance. Having a tool like this was paramount to supporting PostgreSQL in my own cluster, and after a lot of headaches with [repmgr](https://repmgr.org/) finding Patroni was a dream come true. If you haven't heard of it before, definitely check it out!
Once you have a working Patroni cluster, managing client access to it becomes the next major step. And probably the easiest (and, in their docs, recommended) method to do so is using HAProxy. With its integrated health checking and simple load balancing, an HAProxy-fronted Patroni cluster provides the maximum flexibility for the administrator while seamlessly handling failovers.
### The problem - Do you like `DOWN` hosts?
However, the [official HAProxy configuration template](https://github.com/zalando/patroni/blob/master/extras/confd/templates/haproxy.tmpl) has a problem: in a read-write backend, you want your non-`master` hosts to be inaccessible to clients, to prevent write attempts against a read-only replica. But this configuration results in the `replica` hosts being marked `DOWN` in HAProxy.
Now, some people might ask "well, why is that a big deal?" And they may be right. However, as soon as you start trying to monitor your HAProxy backends via an external monitoring tool, you see the problem: "CRITICAL" alerts during normal operation! After all, a `DOWN` host is considered a _problem_ in 99.9% of HAProxy use cases. But with Patroni, it's expected behaviour, which is not ideal.
So what can we do?
### HAProxy's `agent-check` directive
HAProxy, since at least version 1.5, supports [a feature called `agent-check`](https://cbonte.github.io/haproxy-dconv/1.5/configuration.html#5.2-agent-check). In short, this "enable[s] an auxiliary agent check which is run independently of a regular health check". The `agent-check` will connect to a specific port on either the backend host or another target, and will modify the backend status based on the response, which must be one of the common HAProxy keywords (e.g. `MAINT` or `READY`).
So how does this help us? Well, if we had some way to obtain Patroni's `master`/`replica` status for each host, we could, instead of having the `replica` machines marked `DOWN`, put them into `MAINT` mode instead. This provides cleanliness for monitoring purposes while still letting us use the typical Patroni HAProxy configuration, with just minimal modifications to the HAProxy configuration and an additional daemon deployed on the Patroni hosts.
### The Code - Python 3 daemon
The following piece of code is a Python 3 daemon I wrote that uses the `socket` and `requests` libraries (the latter requires the `python3-requests` package on Debian, or `requests` via `pip3`) to:
1. Listen for the agent check on port `5555`.
2. In response to a request, query Patroni's local API to determine that host's `role`.
3. Return `MAINT` or `READY` to HAProxy based on the role.
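
The decision in step 3 amounts to a one-line mapping from Patroni's reported role to an HAProxy agent keyword. As a sketch (the function name is my own, and `'null'` stands in for an unreachable API):

```
def role_to_keyword(role):
    # Only the master should receive read-write traffic; replicas (and
    # an unreachable API, reported here as 'null') are put into MAINT
    # rather than DOWN, which keeps monitoring tools quiet
    return 'READY' if role == 'master' else 'MAINT'
```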
Here is the code - I'm sure it can be improved significantly but it works for me!
```
#!/usr/bin/env python3
# Simple agent check for HAProxy to determine Patroni master/replica status

import socket
import requests

# External port to listen on and report status to HAProxy
listen_port = 5555

# Location of the Patroni API
data_target = 'http://localhost:8008'

# Get the current role from Patroni's API
def getstate():
    try:
        r = requests.get(data_target, timeout=2)
        return r.json()['role']
    except (requests.RequestException, ValueError, KeyError):
        # An unreachable or malformed API means we can't trust this host
        return 'null'

# Set up a listen socket on listen_port; SO_REUSEADDR lets the daemon
# rebind immediately after a restart instead of failing with
# "Address already in use"
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('', listen_port))
sock.listen(1)

# Loop waiting for client requests in blocking mode
while True:
    conn, addr = sock.accept()
    state = getstate()
    # Set our response based on the state; only `master` should be
    # READY in read-write mode
    if state == 'master':
        data = b'READY\n'
    else:
        data = b'MAINT\n'
    # Send the data to the client; HAProxy opens a fresh connection
    # for every check, so always close it afterwards
    try:
        conn.sendall(data)
    except OSError as e:
        print(e)
    finally:
        conn.close()
```
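
To spot-check the daemon by hand, it's enough to do what HAProxy does on each agent-check interval: open a TCP connection, read one line, and close. A minimal sketch (`query_agent` and its defaults are my own, matching the port used above):

```
import socket

def query_agent(host='localhost', port=5555):
    # Connect to the agent-check port, read the one-line status
    # keyword (e.g. READY or MAINT), and return it without the newline
    with socket.create_connection((host, port), timeout=2) as s:
        return s.recv(64).decode().strip()
```

Run against the current master this should return `READY`, and `MAINT` against any replica.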
### Running the daemon with systemd
Running the above Python code is really simple with systemd. I use the following unit file, assuming the above code is located at `/usr/local/bin/patroni-check`.
```
# Patroni agent check systemd unit file
[Unit]
Description=HAProxy agent check for Patroni status
After=syslog.target network.target patroni.service
[Service]
Type=simple
User=postgres
Group=postgres
StartLimitInterval=15
ExecStart=/usr/local/bin/patroni-check
KillMode=process
TimeoutSec=30
Restart=on-failure
[Install]
WantedBy=multi-user.target
```
This is a really straightforward unit with one deviation: `StartLimitInterval=15` is used to prevent the daemon from restarting immediately on failure. In my experience (probably a n00b error), Python doesn't properly clean up the socket immediately, leading to the daemon blowing through its ~5 restart attempts in under a second and failing every time with an "Address already in use" error. This interval gives the socket some breathing room to free up. And luckily, HAProxy won't change the state if the agent check becomes unreachable, so this should be safe.
### Enable it in HAProxy
Now finally, configure your HAProxy backend to use the agent check. Here's my (live) config for a read-write backend:
```
backend mast-pgX_psql_readwrite
mode tcp
option tcpka
option httpchk OPTIONS /master
http-check expect status 200
server mast-pg1 mast-pg1:5432 resolvers nsX resolve-prefer ipv4 maxconn 100 check agent-check agent-port 5555 inter 1s fall 2 rise 2 on-marked-down shutdown-sessions port 8008
server mast-pg2 mast-pg2:5432 resolvers nsX resolve-prefer ipv4 maxconn 100 check agent-check agent-port 5555 inter 1s fall 2 rise 2 on-marked-down shutdown-sessions port 8008
server mast-pg3 mast-pg3:5432 resolvers nsX resolve-prefer ipv4 maxconn 100 check agent-check agent-port 5555 inter 1s fall 2 rise 2 on-marked-down shutdown-sessions port 8008
```
And here it is in action:
![HATop output](haproxy-psql-backend.png)
### Conclusion
I hope that this provides some help to those who want to use Patroni fronted by HAProxy but don't want `DOWN` backends all the time! And of course, I'm open to suggestions for improvement or questions - just send me an email!

---
title: "Problems in FLOSS Projects #1 - Feature: Burden or Boon?"
description: "Why it's hard to prioritize and balance advanced features with ease of use"
date: 2020-04-18
tags:
- FLOSS
- Development
---
## Welcome
Welcome to part one of a series I've decided to create on this blog, called "Problems in FLOSS Projects". I intend this series to be a number of relatively short posts, each one investigating an interesting topic or issue I've seen or faced, mostly within the Jellyfin project, but also in FLOSS self-hosted software in general. Some are technical, some are managerial, some are neither, but all are things that I've thought about and want to write down for the benefit of others who may not have thought of these ideas, or who have but didn't have a name for them, or even for whom these are constant struggles and who would like validation that they're not alone. I don't intend these posts to have a concrete point or thesis, but just to share a musing or observation, interspersed with my own advice, for others to see. Some of these could have been thought-of-but-unwritten responses to Reddit posts, extensions of chat discussions, or general observations I made - really anything that didn't "fit" another medium. And some that did, and that I'm just lifting here for posterity.
As you can tell, historically I haven't updated this blog much, but I hope that by creating a little series like this without a formal goal, it will encourage me to write more. I hope you enjoy.
## Who uses your software?
Generally speaking, it seems that administrators of self-hosted software tend to fall into two main camps: Users and Admins. Now, yes, of course anyone running self-hosted software is generally an "admin", but I mean something specific with these terms, so bear with me. When I use the words "User" and "Admin" capitalized later in this piece, I mean these two definitions.
Users want software that "works", that is to say, fulfills a function. They want self-contained software that can be installed quickly and easily, and that will provide them some functionality. Then, they use the software, often only tweaking or updating it when they feel like it, when they hit a bug, or when a really exciting new feature comes along. Users will generally tolerate small bugs, but big bugs, especially big breaking bugs, tend to turn them right off of a piece of software.
Admins want software that is "interesting", that is to say, provides them joy in and of itself, either to set up, to learn, to discover functionality, or just in general, to tinker. They will definitely use the software as a User does, but for them the joy is in the journey, not the destination, and they'll probably run a nightly or upgrade every release within hours just for the fun of being on the bleeding edge. Admins also tend to tolerate much more and much bigger bugs than Users will, some even reaching true "Hacker" level and helping write the software.
Now, if you've read those two descriptions and thought "Well, sometimes I'm a User, and sometimes I'm an Admin, what a false dichotomy!", you would be right. In fact, for a lot of the things I self-host myself, I certainly fall more into "User" than "Admin". I want it to just sit there and work so I never have to think about it. Of course, with other things, like Jellyfin, I'm an Admin (when my users won't notice). And while you would probably say that almost every first-time self-hoster falls squarely into the User category, for those of us with more experience, professional or otherwise, the line absolutely blurs significantly.
But the real point of these descriptions is less about individuals as a whole, and more about the conceptual space each type represents, like the comparison between Introversion and Extroversion. Every user of your software is going to fall somewhere in these two camps - practical vs. interesting, stable vs. bleeding-edge - though not always in the same place for every piece of software, or even on different days with the same software.
However, we can use these two models to help us figure out a deeper issue: is a feature going to be a Burden, or a Boon?
## Burden or Boon?
Every feature you add to a project/product is going to add more conceptual baggage to the software. A really interesting portrayal of this can be found in UX designer and musician/composer Tantacrul's "Music Software & Bad Interface Design" series [[1]](https://www.youtube.com/watch?v=dKx1wnXClcI). In this particular episode, speaking about Sibelius and its Ribbon design, he discusses at length the usefulness of a (good) Ribbon UI in helping solve a major problem that Microsoft originally, and Sibelius later, found with their software: there were just too many features for anyone to find them all. The Ribbon UI was a solution to this; we can debate at length about how successful it was, but it illustrates a real problem with software design.
When you add features, those features have to both serve a purpose, and be accessible to your users.
This seems obvious. After all, why would anyone add a feature if it served no purpose and wasn't accessible to the user?
The problem with self-hosted server software is that *you have to take into account your user concepts when asking this question*. For instance, let's take a very real feature request for Jellyfin: support for SSO.
SSO, or Single Sign-On, is functionality that allows an end user to authenticate once to a single captive-portal-like page, and then be seamlessly logged in to multiple services. Think of how a random site can log you in with your Facebook account: a small popup takes you to a Facebook page where you confirm access, and then you're promptly logged in. This is SSO.
SSO is a really useful feature for a lot of software, especially as a self-hosting admin who runs many services. You can automate authentication and hand it off to another piece of software. You can centralize authentication, as well as ensure that one badly-behaved application cannot leak credentials.
But here's the rub: SSO is an *Admin* feature. The majority, perhaps even the vast majority, of people running self-hosted software don't care about SSO functionality. They don't know the internals of HTTP basic authentication versus application authentication, they don't know what LDAP is, and they certainly don't know what Kubernetes is. These are all extremely advanced features. Most self-hosted admins are, especially at the start, *Users*. Not only is this feature useless to them, but if your only authentication is via SSO plugins, you're going to lose them very quickly.
You must think about whether a feature you add is a Boon to both Users and Admins, or a Burden to one group or the other. And it cuts both ways. For instance, a very simple authentication scheme might be all that a User wants. But now the Admin is frustrated, because their more advanced use-case necessitates some extensions to that authentication (for instance, SSO, or more commonly just plain LDAP support), and the software doesn't provide it.
## The sliding scale of User-Friendly
This is a topic I could (and maybe will) write a whole blog post on, but "user friendly", when describing software, is almost always a misnomer. User-friendliness is a sliding scale. Something that is friendly for a beginner can easily become tedious for an advanced user, and something that an advanced user sees as conceptually basic can be an unimaginable learning curve for a new user. This is something every piece of software must face.
But the difference with self-hosted software is that you don't just have "users" in the traditional sense; you also have the two conceptual spaces for the administrator as well. If the software has a huge learning curve during initial setup, it will attract Admins but not Users, and if your software is super easy to set up but lacks any advanced features, you will attract plenty of Users but Admins will quickly become frustrated (or, hopefully, help you hack, if you're open to it).
So what is the point of all this? Besides me putting to words these thoughts?
Well, my advice to any aspiring manager/designer/creator of a piece of self-hosted software is this: make sure you think about these things, especially when designing new features. Who does your feature help, Users or Admins? Who is your feature a burden on, Users or Admins, or neither? By asking yourself these questions, and understanding the two types of self-hosted administrators, you can help design software that everyone will like. And when adding features to your project, thinking about which group the feature is targeted at can help you decide how to integrate the feature, whether or not to make it mandatory, and how and where to document it.
I've seen many cases where these concerns aren't thought of, and the result is almost always subpar software. Either Users are unhappy, or Admins are unhappy, and the software suffers for it. Users build community, Admins push the envelope. And worst of all, if both groups are unhappy, no one will use your software at all! Sometimes that's fine - it scratched your itch. But making something that many people use, something useful to more than just the author, is generally the goal of FLOSS, and taking these issues into account will help you realize that goal. Because we all want FLOSS and self-hosting to take over the world.

---
title: "Problems in FLOSS Projects #2 - Support Vampires"
description: "How to spot and deal with people draining your community's life-force"
date: 2020-05-31
tags:
- FLOSS
- Development
---
## Welcome
Welcome to part two of my "Problems in FLOSS Projects" series. In this entry, we talk about "Support Vampires" a.k.a. "Help Vampires", how to spot them, what to do about them, and how to avoid being one. I hope you enjoy.
## Support Vampires
In FLOSS communities, there is a kind of user dreaded by everyone who's spent significant time in chat rooms or forums. They appear out of nowhere, and at first seem like simple, new users. But as time goes on, you notice something strange happen. This person posts more and more, but is always asking the same questions. They don't respond well to criticism, warranted or not. And sooner or later, no one wants to deal with them - or, worn down, with anyone else. You've come across a support vampire, and they're sucking your community dry.
The term "Support Vampire" is my own, though a similar term, "Help Vampire", is used in [this wonderful post by Amy Hoy](http://slash7.com/2006/12/22/vampires/). Her post does an amazing job at explaining, in a somewhat snarky way, what they are and how to deal with them, but I think this is a subject so common, but also so unknown-in-name, that it deserves another post nearly 15 years later.
To really discuss support vampires, we must first define what they are. The term, at least in my conception, derives from that of the "psychic vampire". Put simply, a psychic vampire is a person who, through your interaction with them, drains you of your mental energy and desire to help. The quintessential example is the toxic friend who demands endless time, understanding, and support from you, while providing none back, thus leaving you in a state of near total mental apathy, unable to devote mental energy to anything else. For introverts, very common in FLOSS and Internet communities in general, this is especially damaging, as our wells of interpersonal interaction energy (to borrow the phrase from the Autistic community, "spoons") are even more limited.
A support vampire is, fundamentally, the same thing, applied to a support community. They are a user, or sometimes contributor, whose interactions leave community members, and in serious cases the entire community, drained of all energy and desire to help. This not only hurts the individual members, but also the community as a whole, making help even harder to find in the future, even for those who will give back. Below, we will investigate what makes a support vampire and how to spot them, how a community can deal with them, and finally how to correct one's own ways upon realizing one is being vampiric.
## Identifying Vampires
Generally speaking, in their first interactions, support vampires look like any other (new) user. They often join the community looking for help with the software or for general guidance. But what really distinguishes a help vampire from a normal user worthy of help, is in how they respond to the interaction, and their future actions.
Amy Hoy's guide offers a quick checklist, which I will quote her verbatim, with my own thoughts interspersed.
> Does he ask the same, tired questions others ask (at a rate of once or more per minute)?
The required time-frame may be a little hyperbolic, but one of the most common traits of support vampires is their *unwillingness to accept answers given*, thus resulting in the same questions being repeated over and over. A new user who spams the same question a few times to no answer is more likely a troll or just generally lacking in netiquette, but what really distinguishes a support vampire is this behaviour occurring over longer timescales. For instance, a user asking about feature Y, getting a reasonable answer, then a week or two later asking the same question again. The user is almost certainly fishing for a different, "better" answer, and this is one of the most definitive and useful early-warning signs.
> Does he clearly lack the ability or inclination to ask the almighty Google?
Now, as a veteran of the Internet, I'll be the first to admit that Google-Fu is a skill, and often turns up little to nothing of value, especially for more obscure problems. But the first step of any user should be to check the available information. Asking questions that are readily available in an FAQ or (for forums) in a stickied thread is a sure sign that things will quickly go downhill and that a vampire is about to strike.
> Does he refuse to take the time to ask coherent, specific questions?
Asking incoherent questions, especially combined with the first point (asking the same question over and over in varying ways) is a common support vampire trope. General questions that require 50-page answers are not helpful to anyone else. Asking "how does X work" or "what does X do" are time-sinks for those responding. And on a deeper level, having to dig specifics out of the support vampire is precisely the sort of energy draining that makes support vampires so toxic - no one wants to spend 20 minutes trying to pry the *real* question out of someone asking for help, since 100% of that time is wasted.
> Does he think helping him must be the high point of your day?
A key trait of the support vampire is entitlement. Demanding answers, threatening, spamming until they're answered, are all toxic behaviours that will quickly strip a community of its patience. While an entitled user is not automatically a support vampire, the Venn Diagram of the two is nearly a perfect circle - the lack of respect for others inherent in entitlement leads directly to a lack of concern for the consequences of their actions on the community and its members.
> Does he get offensive, as if you need to prove to him why he should use Ruby on Rails?
I think this one speaks for itself - anyone demanding you spend time proving why your project needs them as a user can be met with one answer: it doesn't. Being indignant and offensive about it is entitlement, which, as we saw before, just leads to a support vampire.
> Is he obviously just waiting for some poor, well-intentioned person to do all his thinking for him?
> Can you tell he really isnt interested in having his question answered, so much as getting someone else to do his work?
The next two, together, say almost the same thing and speak of the same problem: a support vampire wants *someone else* to think for them. This is, I think, the reason they are so willing to do the various other things above. They want an easy answer, a cut-and-paste solution, and when they don't get it, they turn to entitlement ("well why should I use this?" "I demand answers!"), vague questions ("I don't understand the details, just give me the answers!"), and spammy responses ("If I don't get my easy answer I'll just ask until I do!"). This is the root of the problem, in my view. And it is simultaneously the easiest and hardest thing to fix.
## For communities and members: I have a vampire, what do I do?
It's unfortunate, but support vampires are common in any online community devoted to helping users with a project or solution. As mentioned above, this is almost always born out of laziness and a desire to avoid thinking about their questions. There are a couple things a community can do to help fix the situation.
First, do not take shit. Always remember: as a contributor to, or user of, a FLOSS project, *you do not owe anyone your time or effort*. This is an absolute. If a user is abusive, call them on it and ask them to stop. You can do this politely in an otherwise-helpful response, and in my experience this is an invaluable part of helping guide a vampire past their support-sucking predilections. If someone doesn't know what they're doing is wrong, they can't ever be expected to change it.
Second, as Amy calls it, "Cease Enabling Behaviour". Don't just paste an answer or respond with snark. Enforce autonomy by providing resources, not direct answers. Foster thinking by responding with questions rather than answers. Reward self-help and helping others by actively and openly appreciating those who do both. And finally, be friendly, and thus encourage the reformed vampire to help out too.
Third, provide resources for the project as a whole. From experience, a project with little or no documentation is also the most likely to become quickly infested by help vampires. Not because of the support vampires themselves, but because it is impossible for them to find information themselves. Create guides, tutorials, and FAQ lists, especially if you keep getting the same questions over and over from *different* people. When a support vampire arrives, you can point them at the documentation instead of wasting your time. If they help themselves, that's the first step to reformation.
Fourth, do not hesitate to weed out a hopeless case. If a user is informed their behaviour is wrong, is given the chance to reform, and fails to do so over and over, let them know they are no longer welcome. No one likes banning a user, but if the choice is between one hopeless help vampire and your entire community, the choice should be obvious.
## Oh no, I'm the vampire, what do I do?
If you're reading this, and think you might be a help vampire (especially if someone sent you a link to this post), the first thing to do is stop what you're doing, be it spamming, being belligerent, demanding special attention, or otherwise feeling entitled. You are not special or entitled. No one owes you anything. The people you are interacting with want to help you, but if you make that impossible, you are more likely to get cold shoulders than the praise and thoughtless walkthroughs you seek.
If you have not yet, go and read [asking questions the smart way](http://www.catb.org/~esr/faqs/smart-questions.html) by Eric S. Raymond. His personality aside, this document is an absolutely invaluable resource to understanding the people you will be interacting with, and how to not only get their attention and get a (useful) answer, but also how to ensure you are not a burden on the community.
Show humility and a desire to learn. Everyone is a newbie at some point. The key to getting past that is to recognize that you must put in the effort to learn, and that you are not entitled to easy answers or simple fixes. Read the documentation, read the forum history, and above all, be respectful to those you ask for help. Especially in FOSS, you are not paying for support - the people helping you are willing volunteers, and if you abuse that, they will show you the door.

---
title: "Problems in FLOSS Projects #3 - The Development Bystander Problem"
description: "The paradoxical link between user count and new developers"
date: 2022-12-07
tags:
- FLOSS
- Development
---
## Welcome
Welcome to part three of my "Problems in FLOSS Projects" series. Better late than never. In this entry, we talk about a paradoxical problem that I've observed within Jellyfin and several other large projects. I hope you enjoy.
## The Paradox
There is a seeming paradox in FLOSS projects when it comes to developer engagement versus user count. It can be summed up pretty simply:
> The larger a FLOSS project gets, in terms of user base, the less likely it is that new developers come onboard, causing the project to stagnate.
I call this a paradox because this is not what most new projects *think* will happen.
The conventional wisdom is that more users is always a *good* thing, because more users will bring a larger community which will in turn bring on more developers to aid the project.
But in my experience, this does not hold, and in fact, the opposite happens. The more users the project gets, the *fewer* new developers the project takes on.
Why? I have some ideas.
## The Bystander Effect and the Tragedy of the Commons
These are two somewhat related but distinct sociological phenomena that have been well described and documented by sociologists.
The [Bystander Effect](https://en.wikipedia.org/wiki/Bystander_effect) is, to quote Wikipedia:
> a social psychological theory that states that individuals are less likely to offer help to a victim when there are other people present
The Bystander Effect is most commonly described in terms of victim situations, for instance, a robbery. If many people see a person being robbed, the more people there are witnessing the crime, the less likely any single one is to take action and help the victim.
In short, the more people that there are around, the less likely any *single* person is to render assistance. The thinking is, in its simplest form, "someone else will help".
The [Tragedy of the Commons](https://en.wikipedia.org/wiki/Tragedy_of_the_commons) is a similar idea, to quote Wikipedia:
> a situation in which individual users, who have open access to a resource unhampered by shared social structures or formal rules that govern access and use, act independently according to their own self-interest and, contrary to the common good of all users, cause depletion of the resource through their uncoordinated action in case there are too many users related to the available resources
While the Bystander Effect is usually used for social situations, the Tragedy of the Commons is most often described in terms of environmental situations. A canonical example is that of a number of woodcutters: each woodcutter has an individual incentive to cut down as many trees as they possibly can to increase their individual wealth. However taken together, all woodcutters are likely to completely exhaust the forest. Despite this being something that negatively affects all woodcutters, because of the incentives involved, each woodcutter is incentivized to hurt everyone - including themselves - in the long-term to benefit themselves short-term, because the long-term consequences are abstracted among "the commons" of all woodcutters.
I posit that both effects are at play as FLOSS projects grow, to the detriment of the project at large and individual contributors and users over time.
## A Disclaimer
I had some backlash to the last post for my generalizations, so I want to make a quick disclaimer here.
Despite anything I say below, I am *not* saying that projects shouldn't grow, or that "users are bad", or anything of that sort. I am simply describing a problem I observe that I believe project managers, contributors, and users should be aware of.
I also don't have an easy solution. Like many sociological phenomena, these are complex, and there is no one-size-fits-all solution or magic bullet that will solve them. I simply hope that by bringing attention to this, that it can help projects understand what is happening and work to develop their own solutions that suit their project.
## How the Bystander Effect affects projects
The Bystander Effect is the easier of the two to map onto FLOSS projects.
As a project grows and more users join, there *is* a much higher likelihood of any individual user being a developer who is capable, and even willing, to contribute to the project. This does, of course, depend on the nature of the project, as projects that focus primarily on end-users are likely to have a lower "developer pool" than those that target developers (e.g. libraries and such), but there are still likely many capable developers in the userbase of every project.
The Bystander Effect comes in because of the size of the userbase. Each individual developer is more likely to see both the size of the project, the number of other developers (active or not), and the number of users, and conclude that "someone else is working on that". In contrast, when a project is still new and small, a developer is much less likely to conclude this, and is thus more likely to make a contribution to fix a bug or implement a feature that they want.
We have seen this happen with Jellyfin in near real-time. Earlier on in the project, when issues came up, we were very likely to get a quick implementation of a fix by a new contributor; many of these contributors ended up joining the project team and helping long-term. However, as time has gone on and the userbase has grown exponentially, this has become rarer and rarer. I'm certain these developers are out there, but for some reason they are not making the jump from user to contributor, and I believe this "FLOSS Bystander Effect" is the major reason why.
## How the Tragedy of the Commons affects projects
The Tragedy of the Commons has a slightly more tenuous connection to FLOSS projects, but I think it is still relevant here, though causing a slightly different result than the Bystander Effect.
This issue is more on the side of users who are also developers. Each individual user-developer has their own ideas about what the project should do. This is especially pronounced in very large, sprawling projects with many features, like Jellyfin. For each feature the developer may want, they have an incentive to implement it. And the user-developer *does* move past the Bystander Effect and start working on a feature or bugfix. They might even finish it. The tragedy of the commons comes into play when said feature is shared with the project at large.
Depending on the "cost" of the feature in many terms - developer effort, review effort, migration paths, etc. - the project as a whole might not be interested in the feature, either in general or in its specific implementation. This results in a major discouragement to the user-developer-turned-new-contributor, as their pull request languishes in the purgatory of "waiting on review", with no one willing to dismiss it nor to actively accept it.
Now this may seem like a stretch, but there is one specific area where I think this particular phenomenon is decidedly at play: that of code quality.
In the Jellyfin project, we have a relatively small number of core "server" developers, the "server" being the backend API/database that the various clients (including the main "web" client) interface with. These developers have spent many years "cleaning up" the codebase, trying to bring in good development practices and clear, maintainable code.
In this situation, "the commons" is the codebase as a whole.
The tragedy comes because there is a conflicting incentive for the user-developer-turned-new-contributor. Without saying anything about their actual skill level, it is a lot of work for them to (a) get up to speed on the codebase, *and* (b) learn the rules and processes of the project, *and* (c) ensure that their code follows the often-institutionalized-but-undocumented practices that the current contributors have developed, all on top of the work of actually implementing the feature. Meanwhile, the user-developer has an incentive to implement the feature they want quickly, so they can use it, without too much concern for how it fits into the larger codebase (the "commons" of the project).
This can have two effects, both of which are alienating. First, if the project "doesn't care" about "the commons", you can end up with a situation where cleanup work is discarded and the codebase becomes messy as new contributions add code-paths that do not align with the structure, alienating those developers doing the cleanup. Or, second, if the project *does* "care" about "the commons", this can alienate the new user-developer, who sees the constant requests for rewrites as "nitpicking" their contributions, making them at best less likely to see their contribution through, and at worst causing them to fork and implement things their way.
Hopefully the analogy to the original "environmental" version of the Tragedy of the Commons makes more sense with that example.
Another example is that of feature creep. Here "the commons" is the conciseness and focus of the project, the few features that the (current) core contributors see as the most important. The tragedy comes about as users request more and more features and, potentially, new user-developers attempt to implement them, resulting in either a sprawling codebase that becomes harder and harder to maintain, or further alienation vis-a-vis "this feature isn't wanted".
I'm sure there are plenty more as well that maintainers of large, long-lived projects can think of, but these are the two major ones that I have observed, and neither is limited to Jellyfin specifically, but to many other large FLOSS projects.
## What is a project to do?
As mentioned in the disclaimer, I unfortunately do not have a good solution here. The problem of "new blood" in the project is one we've been struggling with in Jellyfin for over a year now. We've thrown around some ideas, but ultimately none have been particularly satisfactory.
Ultimately, I think the best solution to both is simply education. Knowing, from both sides, that these are potential pitfalls can go a long way to a shared understanding between users (and especially, developer-users) and core contributors/maintainers about how best to approach things like work-sharing, reviews, new features, and code quality. Once there is shared understanding, this hopefully makes finding solutions tailored to each specific project easier on everyone.

---
title: "Adventures in Ceph tuning, part 2"
description: "A follow-up to my analysis of Ceph system tuning for Hyperconverged Infrastructure"
date: 2022-11-12
tags:
- PVC
- Development
- Systems Administration
---
Last year, [I made a post](https://www.boniface.me/pvc-ceph-tuning-adventures/) about Ceph storage tuning with [my Hyperconverged Infrastructure (HCI) project PVC](https://github.com/parallelvirtualcluster/pvc), with some interesting results. At the time, I outlined how two of the nodes were a newer, more robust server configuration, but I was still stuck with one old node which was potentially throwing off my performance results and analysis. Now, I have finally acquired a 3rd server matching the spec of the other 2, bringing all 3 of my hypervisor nodes into perfect balance. Also, earlier in the year, I upgraded the CPUs of the nodes to the Intel E5-2683 V4, which provides double the cores, threads, and L3 cache of the previous 8-core E5-2620 V4's, helping further boost performance.
![Perfect Balance](perfect-balance.png)
With my configuration now standardized across all the nodes, I can finally revisit the performance analysis from that post and make some more useful conclusions, without mismatched CPUs getting in the way.
If you haven't read that previous post, I recommend doing so now to get the context, as I will be jumping right in to the updated testing and results.
## The Cluster Specs
All 3 nodes in the cluster now have the following specifications:
| **Part** &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; | **node1 + node2 + node3** |
| :-------------------------------------------------------------- | :------------------------ |
| Chassis | Dell R430 |
| CPU | 1x [Intel E5-2683 v4](https://ark.intel.com/content/www/us/en/ark/products/91766/intel-xeon-processor-e52683-v4-40m-cache-2-10-ghz.html) (16 core, 32 thread, 2.1 GHz base, 3.0 GHz maximum boost) |
| Memory | 128 GB DDR4 (4x 32 GB) |
| OSD DB/WAL (NVMe) | 1x Intel DC P4801X 100 GB |
| OSD Data (SATA) | 2x Intel DC S3700 800 GB |
| Networking | 2x 10GbE in 802.3ad (LACP) bond |
## The OSD Database Device
As part of that original round of testing, I compared various configurations, including no WAL, with WAL, and various CPU set configurations with no WAL. After that testing was completed, the slight gains of the WAL prompted me to leave that configuration in place for production going forward, and I don't see much reason to remove it for further testing, due to the clear benefit (even if slight) that it gave to write performance with my fairly-slow-by-modern-standards SATA data SSDs. Thus, this follow-up post will focus exclusively on the CPU set configurations with the upgraded and balanced CPUs.
## The Fatal Flaw of my Previous Tests and Updated CPU Set Configuration
The CPU-limited tests as outlined in the original post were fatally flawed. While I was indeed able to use CPU sets with the `cset` tool to limit the OSD processes to specific cores, and this appeared to work, the problem was that I wasn't limiting *anything else* to the non-OSD CPU cores. Thus, the OSDs were likely being thrashed by the various other processes in addition to being limited to specific CPUs. This might explain some of the strange anomalies in performance that are visible in those tests.
To counteract this, I created a fresh, from-scratch CPU tuning mechanism for the PVC Ansible deployment scheme. With this new mechanism, CPUs are limited with the systemd `AllowedCPUs` and `CPUAffinity` flags, which are set on the various specific systemd slices that the system uses to organize processes, including a custom OSD slice. This ensures that the limit happens in both directions and everything is forced into its own proper CPU set.
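As an illustration of the mechanism, a minimal systemd drop-in along these lines could pin the OSD slice to its cores and hyperthreads. This is a sketch only; the exact file paths and values that the PVC Ansible roles generate are assumptions on my part:

```ini
# /etc/systemd/system/osd.slice.d/cpuset.conf (hypothetical path)
# Pin everything in this slice to physical cores 2-5 and their
# hyperthread siblings 18-21; all other cores become off-limits.
[Slice]
AllowedCPUs=2-5,18-21
```

Matching drop-ins on `system.slice` and `machine.slice` with complementary `AllowedCPUs` values are what enforce the limit "in both directions".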
In addition to separating the OSDs and VMs, a third CPU set is also added strictly for system processes. This is capped at 2 cores (plus hyperthreads) for all testing here, and the FIO processes are also limited to this CPU set.
Thus, the final layout of CPU core sets on all 3 nodes looks like this:
| **Slice/Processes** &emsp; | **Allowed CPUs + Hyperthreads** &emsp; | **Notes** |
| :------------------------- | :------------------------------------- | :-------- |
| system.slice | 0-1, 16-17 | All system processes, databases, etc. and non-OSD Ceph processes. |
| user.slice | 0-1, 16-17 | All user sessions, including FIO tasks. |
| osd.slice | 2-**?**, 18-**?** | All OSDs; how many (the ?) depends on test to find optimal number. |
| machine.slice| **?**-15, **?**-31 | All VMs; how many (the ?) depends on test as above. |
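The **?** placeholders above can be filled mechanically. As a small illustrative sketch (the helper name is my own invention, and it assumes a single-socket 16-core/32-thread CPU where the hyperthread sibling of core N is N+16), here is how the `AllowedCPUs` strings for each slice could be derived:

```python
# Sketch: derive systemd-style AllowedCPUs strings for each slice.
# Assumes hyperthread sibling of core N is N+16 on a 16c/32t CPU.

def cpu_set(cores, total_cores=16):
    """Return an AllowedCPUs string covering the given physical cores
    plus their hyperthread siblings, collapsed into ranges."""
    threads = sorted(cores + [c + total_cores for c in cores])
    ranges, start = [], threads[0]
    for prev, cur in zip(threads, threads[1:] + [None]):
        if cur != prev + 1:
            # End of a consecutive run; emit it as "a-b" (or "a" if single)
            ranges.append(f"{start}-{prev}" if start != prev else f"{start}")
            start = cur
    return ",".join(ranges)

system_cores = [0, 1]
osd_cores = [2, 3, 4, 5]  # the "4 OSD cores" (2+4+10) test case
vm_cores = [c for c in range(16) if c not in system_cores + osd_cores]

print(cpu_set(system_cores))  # 0-1,16-17
print(cpu_set(osd_cores))     # 2-5,18-21
print(cpu_set(vm_cores))      # 6-15,22-31
```

With `osd_cores = [2, 3, 4, 5]` this reproduces the 2+4+10 layout tested below; changing the list to 2 or 6 cores yields the other configurations.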
The previous tests were also, as mentioned above, significantly limited by the low CPU core counts of the old processors, and during those tests the node running the tests was flushed of VMs; the new CPUs allow both flaws to be corrected, and all 3 nodes will have VMs present, with the primary node running the OSD tests in the `user.slice` CPU set.
During all tests, node1 was both the testing node, as well as primary PVC coordinator (adding a slightly higher process burden to it).
## Test Outline and Hypothesis
To determine both whether limiting OSD CPUs, and if so, how many, is worthwhile, a set of 4 total tests was run.
* The first test is without any CPU limits, i.e. all cores can be used by all processes.
* The second test is with 2 total CPU cores dedicated to OSDs (1 "per" OSD).
* The third test is with 4 total CPU cores dedicated to OSDs (2 "per" OSD).
* The fourth test is with 6 total CPU cores dedicated to OSDs (3 "per" OSD).
The results are displayed as an average of 3 tests with each configuration, and include the 60-second post-test load average of all 3 nodes in addition to the raw test result to help identify trends in CPU utilization.
I would expect, with Ceph's CPU-bound nature, that each increase in the number of CPU cores dedicated to OSDs will increase performance. The two open questions are thus:
* Is doing no limit at all (pure scheduler allocations) better than any fixed limits?
* Is one of the numbers above optimal (no obvious performance hit, with diminishing returns thereafter)?
## Test Results
### Sequential Bandwidth Read & Write
Sequential bandwidth tests tend to be "ideal situation" tests, not necessarily applicable to VM workloads except in very particular circumstances. However, they can be useful for seeing the absolute maximum raw throughput performance that can be attained by the storage subsystem.
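The exact benchmark invocation isn't shown in this post; a representative FIO command matching the stated parameters (4M block size, queue depth 64) might look like the following, with a hypothetical RBD-backed test device path:

```shell
# Sequential read bandwidth test: 4M blocks, iodepth 64
# (device path and job name are hypothetical)
fio --name=seq-read --filename=/dev/rbd0 \
    --ioengine=libaio --direct=1 \
    --rw=read --bs=4M --iodepth=64 \
    --runtime=60 --time_based --group_reporting
# For the write test, swap --rw=read for --rw=write
```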
![Sequential Read Bandwidth (MB/s, 4M block size, 64 queue depth)](seq-read.png)
Sequential read shows a significant spike with the all-cores configuration, then a much more consistent performance curve in the limited configurations. There is a significant difference in performance between the configurations, with a margin of just over 450 MB/s between the best (all-cores) and worst (2+2+12) configurations.
The most interesting point to note is that the all-cores configuration has significantly higher sequential read performance than any limited configuration, and in such a way that even following the pattern of the limited configurations we would not reach this high performance even with all 16 cores dedicated. The Linux scheduler must be working some magic to ensure the raw data can transfer very quickly in the all-cores configuration.
System load also follows an interesting trend. The highest load on nodes 1 and 2, and the lowest on node 3, was with the all-cores configuration, indicating that Ceph OSDs will indeed use many CPU cores to spread read load around. The overall load became much more balanced with the 2+4+10 and 2+6+8 configurations.
This is overall an interesting result and, as will be shown below, the outlier in terms of all-core configuration performance. It does not adhere to the hypothesis, and provides a "yes" answer for the first question (thus negating the second).
![Sequential Write Bandwidth (MB/s, 4M block size, 64 queue depth)](seq-write.png)
Sequential write shows a much more consistent result in line with the hypothesis above, and providing a clear "no" answer for the first question and a fairly clear point of diminishing returns for the second. The overall margin between the configurations is minimal, with just 17 MB/s of performance difference between the best (2+6+8) and worst (2+2+12) configurations.
There is a clear drop going from the all-cores configuration to the 2+2+12 configuration, however performance immediately spikes to even higher levels with the 2+4+10 and 2+6+8 configurations, with those only showing a 1 MB/s difference between them. This points towards the 2+4+10 configuration as an optimal one for sequential write performance, as it leaves more cores for VMs and shows that OSD processes seem to use at most about 2 cores each for sequential write operations. The performance spread does however limit the applicability of this test to much higher-throughput devices (i.e. NVMe SSDs), leaving the question still somewhat open.
System load also follows a general upwards trend, indicating better overall CPU utilization.
### Random IOPS Read & Write
Random IO tests tend to better reflect the realities of VM clusters, and thus are likely the most applicable to PVC.
![Random Read IOs (IOPS, 4k block size, 64 queue depth)](rand-read.png)
Random read, like sequential write above, shows a fairly consistent upward trend in line with the original hypothesis, as well as clear answers to the two questions ("no", and "any limit"). The spread here is quite significant, with the difference between the best (2+6+8) and worst (all-cores) configurations being over 4100 IOs per second; this can be quite significant when speaking of many dozens of VMs doing random data operations in parallel.
This test shows the all-cores configuration as the clear loser, with a very significant performance benefit to even the most modest (2+2+12) limited configuration. Beyond that, the difference between 2 OSD cores and 6 OSD cores is a relatively small 643 IOs per second; still significant, but not nearly as much as the nearly 3500 IOs per second uplift between the all-cores and 2+2+12 configurations.
This test definitely points towards a trade-off between VM CPU allocations and maximum read performance, but also seems to indicate that, unlike sequential reads, Ceph does far better with just a few dedicated cores versus many shared cores when performing random reads.
System load follows a similar result to the sequential read tests, with more significant load on the testing node for the all-core and 2+2+12 configurations, before balancing out more in the 2+6+8 configuration.
![Random Write IOs (IOPS, 4k block size, 64 queue depth)](rand-write.png)
Random write again continues a general trend in line with the hypothesis and providing nearly the same answers as the sequential write tests, with a similar precipitous drop for the 2+2+12 configuration versus the all-core configuration, before rebounding and increasing with the 2+4+10 and 2+6+8 configurations. The overall margin is a very significant 7832 IOs per second between the worst (2+2+12) and best (2+6+8) tests, more than double the performance.
This test definitely shows that Ceph random writes can consume many CPU cores per OSD process, and that providing more, dedicated cores can provide significant uplift in random write performance. Thus, like random reads, there is a definite trade-off between the CPU and storage performance requirements of VMs, so a balance must be struck. With regards to the second question, this test does show less clear diminishing returns as the number of dedicated cores increases, potentially indicating that it can scale almost indefinitely.
System load shows an interesting trend compared to the other tests. Overall, the load remains in a fairly consistent spread between all 3 nodes, though with a closing gap by the 2+6+8 configuration. Of note is that the overall load drops significantly on all nodes for the 2+2+12 configuration, showing quite clearly that the OSD processes are starved for CPU resources during those tests and explaining the overall poor performance there.
### 95th Percentile Latency Read & Write
Latency tests show the "best case" scenario for the time an individual operation takes to complete. A lower latency means the system can service I/O requests far quicker. With Ceph, due to the inter-node replication, latency will always be based primarily on network latency, though there are some gains to be had.
These tests are based on the 95th percentile latency numbers; thus, these are the times in which 95% of operations will have completed, ignoring the outlying 5%. Though not shown here, the actual FIO test results show a fairly consistent spread up until the 99.9th percentile, so this number was chosen as a "good average" for everyday performance.
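As a concrete illustration of what a 95th-percentile number means, here is a small sketch (not FIO's actual implementation, which uses histogram bins internally) computing a nearest-rank p95 from a list of latency samples:

```python
import math

def p95(samples):
    """Nearest-rank 95th percentile: the value at or below which
    95% of the samples fall."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# 100 operations with latencies 1..100 us: 95% complete within 95 us
latencies = list(range(1, 101))
print(p95(latencies))  # -> 95
```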
![Read Latency (μs, 4k block size, 1 queue depth)](latency-read.png)
Read latency shows a consistent downwards trend like most of the tests so far, with a relatively large drop from the all-cores configuration to the 2+2+12 limited configuration, followed by steady decreases through each subsequent increase in cores. This does seem to indicate a clear benefit towards limiting CPUs, though like the random read tests, the point of diminishing returns comes fairly quickly.
System load also follows another hockey-stick-converging pattern, showing that CPU utilization is definitely correlated with the lower latency as the number of dedicated cores increases.
![Write Latency (μs, 4k block size, 1 queue depth)](latency-write.png)
Write latency shows another result consistent with the other write tests, where the 2+2+12 configuration fares (slightly) worse than the all-cores configuration before rebounding. Here the latency difference becomes significant, with the spread of 252 μs being enough to become noticeable in high-performance applications. There is also no clear point of diminishing returns, just like the other write tests.
System load follows a very curious curve, with node1 load dropping off and levelling out with the 2+4+10 and 2+6+8 configurations, while the other nodes continue to increase. I'm not sure exactly what to make of this result, but the overall performance trend does seem to indicate that, like other write tests, more cores dedicated to the OSDs results in higher utilization and performance.
## Conclusions
With a validated testing methodology, I believe we can draw some clear takeaways from this testing.
First, our original hypothesis that "more cores means better performance" certainly holds. Ceph is absolutely CPU-bound, and better (newer) CPUs at higher frequencies with more cores are always a benefit to a hyperconverged cluster system like PVC. It also clearly shows that Ceph OSD processes are not single-threaded in the latest versions, and that they can utilize many cores to benefit performance.
Second, our first unanswered question, "is a limit worthwhile over no limit", seems to be a definitive "yes" in all except for one case: sequential reads. Only in that situation was the all-cores configuration able to beat all other configurations. However, given that sequential read performance is, generally, a purely artificial benchmark, and also not particularly applicable to the PVC workload, I would say that it is definitely the case that a dedicated set of CPUs for the OSDs is a good best-practice to follow, as the results from all other tests do show a clear benefit.
Third, our second unanswered question, "how many dedicated cores should OSDs be given and what are the diminishing return points", is less clear cut. From the testing results, it is clear that only 1 core per OSD process is definitely too few, as this configuration always performed the worst out of the 3 tested. Beyond that, the workload on the cluster, and the balance between cores-for-VMs and storage performance, become more important. It is clear from the results that a read-heavy cluster would benefit most from a 2-cores-per-OSD configuration, as beyond that the returns seem to diminish quickly. For write-heavy clusters, though, more cores seem to provide an obvious benefit scaling up at least past our 2+6+8 configuration, and thus such clusters should be built with as many cores as possible and then with at least 3-4 (or more) cores dedicated to each OSD process.
The overall takeaway is thus: I will begin implementing some sort of CPU core affinity configuration on all new PVC clusters, and retrofit one befitting the required performance to all existing clusters; the benefits clearly outweigh the drawbacks.
The next post in this series will look at the same performance evaluation but with NVMe SSDs, and with several even-higher OSD allocations on some newer AMD Epyc-based machines. Stay tuned for more!

---
title: "Adventures in Ceph tuning, part 3"
description: "A second follow-up to my analysis of Ceph system tuning for Hyperconverged Infrastructure"
date: 2023-07-29
tags:
- PVC
- Development
- Systems Administration
---
In 2021, [I made a post](https://www.boniface.me/pvc-ceph-tuning-adventures/) about Ceph storage tuning with [my Hyperconverged Infrastructure (HCI) project PVC](https://github.com/parallelvirtualcluster/pvc), and in 2022 [I wrote a follow-up](https://www.boniface.me/pvc-ceph-tuning-adventures-part-2/) clarifying the test methodology with an upgraded hardware specification.
At the end of that second part, I said:
> The next post in this series will look at the same performance evaluation but with NVMe SSDs, and with several even-higher OSD allocations on some newer AMD Epyc-based machines.
Well, here is that next part!
Like part 2, I'll jump right into the cluster specifications, changes to the tests, and results. If you haven't read parts 1 and 2 yet, I suggest you do so now to get the proper context before proceeding.
## The Cluster Specs (only better)
Parts 1 and 2 used my own home server setup, based on Dell R430 servers with Broadwell-era Intel Xeon CPUs, for analysis. But being my homelab, I'm quite limited in what hardware I have access to: namely, several-generations-old servers and SATA SSDs, so despite producing some very interesting results, those tests were constrained by the hardware. I wanted results with more modern hardware (newer CPUs and NVMe SSDs), and luckily I was able to test on just such a cluster, thanks to my employer deploying my PVC solution to our customers on brand-new hardware.
Like my home cluster, these clusters use 3 nodes, with the following specifications:
| **Part** &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; | **node1 + node2 + node3** |
| :-------------------------------------------------------------- | :------------------------ |
| Chassis | Dell R6515 |
| CPU | 1x [AMD 7543P](https://www.amd.com/en/products/cpu/amd-epyc-7543p) (32 core, 64 thread, 2.8 GHz base, 3.7 GHz maximum boost) |
| Memory | 128 GB DDR4 (8x 16 GB) |
| OSD DB/WAL | N/A |
| OSD Data | 1x Dell PE8010 3.84TB U.2 NVMe SSD |
| Networking | 2x BCM57416 10GbE in 802.3ad (LACP) bond |
## The OSD Database Device
Because the main data disks in these servers are already NVMe, they do not feature an OSD DB/WAL device. As this made only a slight difference with the much slower SATA SSDs, I do not consider this important to use with ultra-fast NVMe OSDs.
## CPU Set Layout
These tests use the same methodology as the tests in part 2, with one minor change: instead of dedicating only 2 CPU cores (plus corresponding threads) to the system, here I dedicated 4 CPU cores. This was done simply due to the larger core count, as these are the counts I would run in production on these nodes.
Thus, the layout of CPU core sets on all 3 nodes looks like this:
| **Slice/Processes** &emsp; | **Allowed CPUs + Hyperthreads** &emsp; | **Notes** |
| :------------------------- | :------------------------------------- | :-------- |
| system.slice | 0-3, 32-35 | All system processes, databases, etc. and non-OSD Ceph processes. |
| user.slice | 0-3, 32-35 | All user sessions, including FIO tasks. |
| osd.slice | 4-**?**, 32-**?** | All OSDs; how many (the ?) depends on test to find optimal number. |
| machine.slice| **?**-31, **?**-63 | All VMs; how many (the ?) depends on test as above. |
Due to an oversight, the primary PVC coordinator node actually flipped from node1 in tests 1 and 4-6 to node2 in tests 2 and 3; however, as the results show, this did not seem to make any appreciable difference in the CPU loads, and thus I think it can be ignored. This is expected, because the PVC processes running on the primary coordinator are not particularly intensive (maybe 1-2% of one core of CPU utilization).
## Test Outline and Hypothesis
This test used nearly the same test outline as the previous post, only with two additional tests:
* The first test is without any CPU limits, i.e. all cores can be used by all processes.
* The second test is with 1 total CPU core dedicated to OSDs (1 "per" OSD).
* The third test is with 2 total CPU cores dedicated to OSDs (2 "per" OSD).
* The fourth test is with 3 total CPU cores dedicated to OSDs (3 "per" OSD).
* The fifth (new) test is with 4 total CPU cores dedicated to OSDs (4 "per" OSD).
* The sixth (new) test is with 8 total CPU cores dedicated to OSDs (8 "per" OSD).
Since there are half as many OSDs in these nodes (i.e. only 1 each), the OSD CPU count was scaled down to match the "per" numbers with the previous post.
The fifth test was added for more details on the scaling between configurations, and the sixth was added later due to an interesting observation in the results from the first 5 tests, specifically around random write performance, that will be discussed below.
The results are still displayed as an average of 3 tests with each configuration, and include the 60-second post-test load average of all 3 nodes in addition to the raw test result to help identify trends in CPU utilization.
Similarly, our hypothesis - that more dedicated OSD CPUs is better - and open questions remain the same:
* Is doing no limit at all (pure scheduler allocations) better than any fixed limits?
* Is one of the numbers above optimal (no obvious performance hit, and diminishing returns thereafter)?
## Test Results
### Sequential Bandwidth Read & Write
Sequential bandwidth tests tend to be "ideal situation" tests, not necessarily applicable to VM workloads except in very particular circumstances. However, they can be useful for seeing the absolute maximum raw throughput performance that can be attained by the storage subsystem.
![Sequential Read Bandwidth (MB/s, 4M block size, 64 queue depth)](seq-read.png)
Sequential read shows a significant difference with the NVMe SSDs and newer CPUs versus the SATA SSDs in the previous post, beyond just the near doubling of speed thanks to the higher performance of the NVMe drives. In that post, no-limit sequential read was by far the highest, and this was an outlier result.
This test instead shows a result much more in line with expectations: no-limit performance is significantly lower than the dedicated limits, by a relatively large 13% margin.
The best result was with the 4+1+27 configuration, with a decreasing stair-step pattern to the 4+4+24 configuration. However, all the limit tests were within 1% of each other, which I would consider the margin of error.
Thus, this test upholds the hypothesis: a limit is a good thing to avoid scheduler overhead, though there is no clear winner in terms of the number of dedicated OSD CPUs.
CPU load does show an interesting drop with the 4+3+25 configuration before jumping back up in the 4+4+24 configuration, however all nodes track each other, and the node with the widest swing (node3) was not a coordinator in any of the tests, so this is likely due to the VMs rather than the OSD processes.
![Sequential Write Bandwidth (MB/s, 4M block size, 64 queue depth)](seq-write.png)
Sequential write shows a similar stair-step pattern, though more pronounced. The no-limit performance is actually the second-best here, which is an interesting result, though again all results are within roughly 2% of each other, nearly within the margin of error.
The highest performance came from the 4+2+26 configuration, though interestingly the 4+3+25 configuration performed the worst. Since these are all within a reasonable margin of error, I think we can conclude that for sequential writes, there is no conclusive benefit to a CPU limit.
System load follows the same trend as the sequential reads, with a drop-off for each test until bottoming out with the 4+3+25 configuration, before rebounding slightly for the 4+4+24 configuration. I'm not sure at this point whether these load numbers are showing anything at all, but it is still interesting to see.
Finally, in watching the live results, there was full saturation of the 10GbE NIC during this test:
![Sequential Write Network Bandwidth](seq-write-network.png)
This is completely expected, since our configuration uses a `copies=3` replication mode, so we should expect about 50% of the performance of the sequential reads, since every write is replicated over the network twice. It definitely proves that our limitation here is not the drives but the network, but also shows that this is not completely linear, since instead of 50% we're actually seeing about 70% of the maximum network bandwidth in actual performance.
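The back-of-envelope math can be sketched as follows. This is a deliberately naive model (the 1250 MB/s figure is the theoretical 10GbE line rate, an assumption, and the observed ~70% shows real behaviour deviates from it):

```python
def expected_write_bw(nic_bw_mbs: float, copies: int) -> float:
    """Naive ceiling: each client write must be forwarded to
    (copies - 1) replicas over the same NIC, so usable write
    bandwidth is roughly halved at copies=3."""
    return nic_bw_mbs / (copies - 1)

# 10GbE line rate is roughly 1250 MB/s
print(expected_write_bw(1250, 3))  # -> 625.0 MB/s ceiling under this model
```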
### Random IOPS Read & Write
Random IO tests tend to better reflect the realities of VM clusters, and thus are likely the most applicable to PVC.
![Random Read IOs (IOPS, 4k block size, 64 queue depth)](rand-read.png)
Random read shows a similar trend to sequential reads, and one completely in line with our hypothesis. There is definitely a more pronounced trend here though, with a clear increase in performance of about 8% between the worst (4+1+27) and best (4+8+20) results.
However this test shows yet another stair-step pattern where the 4+2+26 configuration outpaced the 4+3+25 configuration. I suspect this might be due to the on-package NUMA domains and chiplet architecture of the Epyc chips, whereby the 3rd core has to traverse a higher-latency interconnect and thus hurts performance when going from 2 to 3 dedicated CPUs, though more in-depth testing would be needed to definitively confirm this.
System load continues to show almost no correlation at all with performance, and thus can be ignored.
![Random Write IOs (IOPS, 4k block size, 64 queue depth)](rand-write.png)
Random writes bring back the strange anomaly that we saw with sequential reads in the previous post. Namely, that for some reason, the no-limit configuration performs significantly better than all limits. After that, the performance seems to scale roughly linearly with each increase in CPU core count, exactly as was seen with the SATA SSDs in the previous post.
One possible explanation is again the NUMA domains within the CPU package. The Linux kernel is aware of these limitations, and thus could potentially be assigning CPU resources to optimize performance, especially for the CPU-to-NIC pipeline. Again this would need some more thorough, in-depth testing to confirm, but it is my hunch that this is occurring.
The system load here shows another possible explanation for the anomalous results, though. Random writes seem to hit the CPU much harder than the other tests, and the baseline load of all nodes with the no-limit configuration is about 8, which would indicate that the OSD processes want about 8 CPU cores per OSD here. Adding in the 4+8+20 configuration, we can see that its performance is definitely higher than all the other limit configurations, but still less than the no-limit configuration, so this doesn't seem to be the *only* explanation. The scaling also appears to be non-linear, since doubling the cores only brought us about half-way up to the no-limit performance, again pointing towards the NUMA limit and giving us a pretty conclusive "yes" answer to our first main question.
For write-heavy workloads, this is a very important takeaway. This test clearly shows that the no-limit configuration is ideal for random writes on NVMe drives, as the Linux scheduler seems better able to distribute the load among many cores. I'd be interested to see how this is affected by many CPU-heavy noisy-neighbour VMs, but testing this is extremely difficult and thus is not in scope for this series.
### 95th Percentile Latency Read & Write
Latency tests show the "best case" scenario for the time an individual operation takes to complete. A lower latency means the system can service I/O requests far quicker. With Ceph, due to the inter-node replication, latency will always be based primarily on network latency, though there are some gains to be had.
These tests are based on the 95th percentile latency numbers; thus, these are the times in which 95% of operations will have completed, ignoring the outlying 5%. Though not shown here, the actual FIO test results show a fairly consistent spread up until the 99.9th percentile, so this number was chosen as a "good average" for everyday performance.
![Read Latency (μs, 4k block size, 1 queue depth)](latency-read.png)
Read latency shows a consistent downwards trend throughout the configurations, though with the 4+4+24 and 4+8+20 results being outliers. However, the latency here is very good, only 1/4 of the latency of the SATA SSDs in the previous post, and the results are all so low that they are not likely to be particularly impactful. We're really pushing raw network latency and packet processing overheads with these results.
![Write Latency (μs, 4k block size, 1 queue depth)](latency-write.png)
Write latency also shows a major improvement over SATA SSDs, being only 1/5 of those results. It also, like the read latency, shows a fairly limited spread in results, though with a similar uptick from 4+3+25 to 4+4+24 to 4+8+20. Like read latency, I don't believe these numbers are significant enough to show a major benefit to the CPU limits.
## Conclusions
Our results with NVMe drives show some interesting differences from SATA SSDs. For sequential reads, the outlier result of the SATA drives is eliminated, but it is replaced with an outlier result for random writes, likely one of the most important metrics when talking of VM workloads. In addition, the better CPUs are also likely impacting the results, and the limitations of the 10GbE networking really come into play here: I expect we might see some differences if we were running on a much faster network interconnect.
Based primarily on that one result, I think we can safely conclude that while there are some minor gains to be made with sequential read performance and some more major gains with random read performance, overall a CPU limit on the Ceph OSD processes does not seem to be worth the trade-offs for NVMe SSDs, at least in write-heavy workloads. If your workload is extremely random-read-heavy, then a limit might be beneficial, but if it is more write-heavy, CPU limits seem to hurt more than help. This is in contrast to SATA SSDs on the older processors where there were clear benefits to the CPU limit.
The final part of this series will investigate the results if we put multiple OSDs on one NVMe drive and then rerun these same tests. Stay tuned for that in the next few months!

---
title: "Adventures in Ceph tuning"
description: "An analysis of Ceph system tuning for Hyperconverged Infrastructure"
date: 2021-10-01
tags:
- PVC
- Development
- Systems Administration
---
In early 2018, I started work on [my Hyperconverged Infrastructure (HCI) project PVC](https://github.com/parallelvirtualcluster/pvc). Very quickly, I decided to use Ceph as the storage backend, for a number of reasons, including its built-in host-level redundancy, self-managing and self-healing functionality, and general good performance. With PVC now being used in numerous production clusters, I decided to tackle optimization. This turned out to be a bit of a rabbit hole, which I will detail below. Happy reading.
## Ceph: A Primer
Ceph is a distributed, replicated, self-managing, self-healing object store, which exposes 3 primary interfaces: a raw object store, a block device emulator, and a POSIX filesystem. Under the hood, at least in recent releases, it makes use of a custom block storage system called Bluestore which entirely removes a filesystem and OS tuning from the equation. Millions of words have been written about Ceph, its interfaces, and Bluestore elsewhere, so I won't bore you with rehashed eulogies of its benefits here.
In the typical PVC use-case, we have 3 nodes, each running the Ceph monitor and manager, as well as 2 to 4 OSDs (Object Storage Daemons, what Ceph calls its disks and their management processes). It's a fairly basic Ceph configuration, and I use exactly one feature on top: the block device emulator, RBD (RADOS Block Device), to provide virtual machine disk images to KVM.
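For context, exposing a VM disk via RBD looks something like this (the pool and image names are hypothetical, and PVC automates these steps through its own tooling):

```shell
# Create a 20 GiB RBD image in a 'vms' pool for a VM disk
rbd create --size 20G vms/myvm_disk0
# Libvirt/KVM then attaches it via the rbd protocol, e.g. in the domain XML:
#   <source protocol='rbd' name='vms/myvm_disk0'/>
```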
The main problem comes when Ceph is placed under heavy load. It is very CPU-bound, especially when writing random data, and further the replication scheme means that it is also network- and disk- bound in some cases. But primarily, the CPU speed (both in frequency and IPC) is the limiting factor.
After having one cluster placed under extreme load by a client application PostgreSQL database, I began looking into additional tuning, in order to squeeze every bit of performance I could out of the storage layer. The disks we are using are nothing special: fairly standard SATA SSDs with relatively low performance and endurance, but with upgrade costs being a concern, and the monitoring graphs showing plenty of raw disk performance on the table, I turned my attention to the Ceph layer, with very interesting results.
## Ceph Tuning: A Dead End
The first thought was, of course, to tune the Ceph parameters themselves. Unfortunately for me, or, perhaps, fortunately for everyone, there isn't much to tune here. Using the Nautilus release (14.x) with the Bluestore backing store, most of the defaults seem to be extremely optimal. In fact, despite finding some Red Hat blogs to the contrary, I found that almost nothing I could change would make any appreciable difference to the performance of the Ceph cluster. I had to go deeper.
## The Ceph OSD Database and WAL
With Ceph Bluestore, there are 3 main components of an OSD: the main data block device, the database block device, and the write-ahead log (WAL). In the most basic configuration, all 3 are placed on the same disk. However, Ceph provides the option to move the database (and WAL, if it is large enough) onto a separate block device. It isn't correct to call this a "cache", except in a general, technically-incorrect sense: the database houses mostly metadata about the objects stored on the OSD, and the WAL handles sequential write journaling and can thus be thought of as similar to a RAID controller write cache, but not precisely the same. In this configuration, one can leverage a very fast device - for example, an Intel Optane SSD - to handle metadata and WAL operations for a relatively "slow" SSD block device, and thus in theory increase performance.
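Splitting the DB (and WAL) onto a separate device is done at OSD creation time. A sketch of the general form, with hypothetical device paths (PVC wraps this in its own tooling):

```shell
# Create a Bluestore OSD with data on a SATA SSD and the DB (plus WAL,
# which colocates with the DB when no separate WAL device is given)
# on a fast NVMe partition -- device paths are hypothetical
ceph-volume lvm create --bluestore \
    --data /dev/sdb \
    --block.db /dev/nvme0n1p1
```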
## Turbo-charging My Cluster with Intel Optane SSDs
I decided to test this out myself first by purchasing a set of 3 [Intel Optane DC P4801X 100GB M.2-form-factor SSDs](https://ark.intel.com/content/www/us/en/ark/products/149367/intel-optane-ssd-dc-p4801x-series-100gb-m-2-110mm-pcie-x4-3d-xpoint.html). I was able to obtain these drives, brand new, for the bargain-basement price of $80 CAD each, less than 1/4 of their MSRP on release in 2019. I guess there isn't much market for these very small but very fast drives out there. I used a set of PCIe HHHL to M.2 adapter cards to install the SSDs into my servers, and I was quickly able to validate their near-unfathomable performance. At anything less than a 256-depth queue with 8 workers, doing 4k random read and write tests, I was more limited by the CPU usage of the `fio` process than I was by the SSDs, and during these tests I was even able to exceed the official rated specifications - in both IOPS and raw bandwidth, but not latency - by as much as 10%.
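A representative invocation of the test described (4k random I/O, 8 workers at queue depth 256, directly against the raw device) might look like this; the device path and job name are hypothetical, not the exact command used:

```shell
# 4k random read, 8 jobs x iodepth 256, against the Optane device directly
fio --name=optane-randread --filename=/dev/nvme1n1 \
    --ioengine=libaio --direct=1 \
    --rw=randread --bs=4k --iodepth=256 --numjobs=8 \
    --runtime=60 --time_based --group_reporting
```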
Emboldened by the sheer performance of the drives, I quickly implemented OSD DB offloading in PVC, added the Optane SSDs to my existing pair-per-node of [Intel DC S3700 800GB SSD](https://ark.intel.com/content/www/us/en/ark/products/71916/intel-ssd-dc-s3700-series-800gb-2-5in-sata-6gbs-25nm-mlc.html) OSDs, and began benchmarking.
## The Bane of Hyperconverged Architectures: Sharing Resources
I quickly noticed a slight problem, however. My home cluster, which was doing these tests, is a bit of a hodge-podge of server equipment, and runs a fair number (68 at the time of testing) of virtual machines across its 3 nodes. The hardware breakdown is as follows:
| **Part** &emsp;&emsp;&emsp; | **node1** &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; | **node2 + node3** |
| :-------- | :------- | :------------- |
| Chassis | HP Proliant DL-360 G6 | Dell R430 |
| CPU | 2x [Intel Xeon E5649](https://ark.intel.com/content/www/us/en/ark/products/52581/intel-xeon-processor-e5649-12m-cache-2-53-ghz-5-86-gt-s-intel-qpi.html) | 1x [Intel Xeon E5-2620 v4](https://ark.intel.com/content/www/us/en/ark/products/92986/intel-xeon-processor-e52620-v4-20m-cache-2-10-ghz.html) |
| Memory | 144 GB DDR3 (18x 8 GB) | 128 GB DDR4 (4x 32 GB) |
The VMs themselves also range from basically-idle to very CPU-intensive, with a wide range of vCPU allocations. I quickly realized that there might be another tuning aspect to consider: CPU (and NUMA, for `node1`) pinning.
I decided to try implementing a basic CPU pinning scheme with the `cpuset` Linux utility. This tool allows the administrator to create static `cset`s, which are logical groups assigned to specific CPUs, and then place processes - either at runtime or at process start - into these `cset`s. So, in addition to testing the Optane drives, I decided to also test Optane-less configurations whereby specific numbers of cores (and their corresponding hyperthreads) were dedicated to the Ceph OSDs, instead of all CPUs being shared among the OSDs, VMs, and PVC host daemons.
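As a sketch of the approach - the core numbers below are hypothetical, for an imaginary 12-core/24-thread node, and not my exact layout:

```shell
# Sketch: create one cset for the OSDs (cores 0-3 plus their hyperthread
# siblings 12-15) and one for everything else (VMs, host daemons).
cset set --cpu=0-3,12-15 --set=osd
cset set --cpu=4-11,16-23 --set=vms

# Move the running OSD processes into the OSD cset; pgrep -d, produces the
# comma-separated PID list that cset expects.
cset proc --move --pid=$(pgrep -d, ceph-osd) --toset=osd
```

Requires root and the `cpuset` package; the `cset`s do not persist across reboots unless recreated at boot.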
Ultimately, the disparate configurations here do present potential problems in interpreting the results; however, within this particular cluster the comparisons are valid, and I do hope to repeat these tests (and update this post) in the future when I'm able to simplify and unify the server configurations.
## Test Explanation
The benchmarks themselves were run with the system in production, running the full set of VMs. This was done both for practical reasons and to simulate a real-world scenario with numerous noisy neighbours. While this might affect any single test run, I ran 3 tests of each configuration and staggered them over time to minimize the impact of bursty VM effects. Further, the `cpuset` tuning would be fairly moot without additional real load on the nodes, and thus I believe this to be a worthwhile compromise. A future addition to the results might be to run a similar set of tests against an empty cluster; if and when I am able to do so, I will add the results to this post.
The tests were run with PVC's in-built benchmark system, which creates a new, dedicated Ceph RBD volume and then runs the `fio` tests against it directly using the `rbd` engine. To ensure `fio` itself was not limited by noisy neighbours, the node running the tests was flushed of VMs.
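A standalone invocation of this kind of test looks roughly like the following - the pool and volume names are examples, and this requires `fio` built with RBD support plus a client keyring with access to the pool; it is not the literal command PVC runs:

```shell
# Sketch: a 4M sequential read test at queue depth 64 against a dedicated RBD
# volume, using fio's native rbd engine (no kernel mapping required).
fio --name=seq-read-4M \
    --ioengine=rbd \
    --pool=testpool \
    --rbdname=benchmark-volume \
    --clientname=admin \
    --rw=read --bs=4M --iodepth=64 --numjobs=1 \
    --direct=1 --runtime=60 --time_based
```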
For the 3 `cpuset` tests, the relevant `cset` configuration was applied to all 3 nodes, regardless of the number of VMs or the load within them, with the `fio` process placed inside the "VM" `cset`. Thus the CPUs set aside for the OSDs were completely dedicated to them.
## The Benchmark Results in 6 Graphs
The results were fairly interesting, to say the least. First, I'll present the 6 key indicator graphs I obtained from the benchmark data, and then run through what they mean. Within each graph, the 5 tests are as follows:
* No-O, No-C: No Optane DB/WAL drive, no `cpuset` tuning.
* O, No-C: Optane DB/WAL drive, no `cpuset` tuning.
* No-O, C=2: No Optane, `cpuset` OSD group with 2 CPU cores (+ hyperthreads, on `node1` within CPU0 NUMA domain)
* No-O, C=4: No Optane, `cpuset` OSD group with 4 CPU cores (+ hyperthreads, on `node1` within CPU0 NUMA domain)
* No-O, C=6: No Optane, `cpuset` OSD group with 6 CPU cores (+ hyperthreads, on `node1` within CPU0 NUMA domain)
It's worth noting that the 5th test left just 2 CPU cores (+ hyperthreads) to run VMs on `node2` and `node3` - the performance inside them was definitely sub-optimal!
Each test, in each configuration mode, was run 3 times, with the results presented here being an average of the results of the 3 tests.
#### Test Suite 1: Sequential Read/Write Bandwidth, 4M block size, 64-depth queue
These two tests measure raw sequential throughput at a very large block size and relatively high queue depth.
![Sequential Read Bandwidth, 4M block size, 64 queue depth](seq-bw-4m-read.png)
![Sequential Write Bandwidth, 4M block size, 64 queue depth](seq-bw-4m-write.png)
#### Test Suite 2: Random Read/Write IOPS, 4k block size, 64-depth queue
These two tests measure IOPS performance at a very small block size and relatively high queue depth.
![Random Read IOPS, 4k block size, 64 queue depth](random-iops-4k-read.png)
![Random Write IOPS, 4k block size, 64 queue depth](random-iops-4k-write.png)
#### Test Suite 3: Random Read/Write Latency, 4k block size, 1-depth queue
These two tests measure average request latency at a very small block size and single queue depth.
![Random Read Latency, 4k block size, 1 queue depth](random-latency-4k-1q-read.png)
![Random Write Latency, 4k block size, 1 queue depth](random-latency-4k-1q-write.png)
## Benchmark Analysis
### Sequential Performance
For reads, the performance is nearly identical, and almost within margin-of-error, for the first 3 data points. The Optane drive did not seem to make any difference to sequential read performance, which would be expected since the roughly 1GB of metadata per OSD can easily be cached in the OSD's 4GB of allowed RAM. However when using 6 CPU cores (theoretically, 3 per OSD), the read performance drops by a fairly significant margin. I don't have any explanation for this drop.
For writes, the performance shows some very noteworthy results. The Optane drive makes a noticeable, though not dramatic, difference in the write performance, likely due to the WAL. A larger drive, and thus larger WAL, might make an even more significant improvement. The `cpuset` tuning, for the 2- and 4-CPU `cset`s, seems to make no difference over no limiting; however once the limit was raised to 6 CPU cores, write performance did increase somewhat, though not as noticeably as with the Optane cache.
The two main takeaways from these tests seem to be that (a) Optane database/WAL drives do have a noticeable effect on write performance; and (b) that dedicating 3 (or potentially more) CPU cores per OSD *increases* write performance while *decreasing* read performance. The increase in write performance would seem to indicate a CPU bottleneck is occurring with the lower CPU counts (or when contending with VM/`fio` processes), but this does not match the results of the read tests, which in the same situation should increase as well. One possible explanation might lie in the Ceph monitor processes, which direct clients to data objects on OSDs and were in the "VM" `cset`, but in no test did I see the `ceph-mon` process become a significant CPU user. Perhaps more research into the inner workings of Ceph OSDs and CRUSH maps will reveal the source of this apparent contradiction, but at this time I can not explain it.
### Random I/O Performance
Random I/O performance seems to show very similar things to sequential I/O performance, though with its own interesting caveats.
For reads, it continues to be clear that the Optane drive does not make any noticeable difference to the performance. The `cpuset` results, however, are far more interesting. When limiting to 2 or 4 CPUs, the random read performance increases by over 40%; however, like the sequential read test, limiting to 6 CPUs results in a marked drop, though still higher than the baseline.
For writes, the story is even more interesting. Random writes are, by far, in my experience, the most CPU-demanding Ceph I/O operation, and the results demonstrate this. Like sequential writes, the Optane drive produces a noticeable, though again not dramatic, increase in write IOPS. The more interesting story comes with the `cpuset` tuning. Limiting the OSDs to just 2 CPUs, the write IOPS are nearly halved compared to the baseline. Increasing this to 4 CPUs returns the write performance to the baseline, while increasing it again to 6 increases the performance yet again, surpassing the non-`cset` Optane performance, though the increase from 4 to 6 is not as dramatic as from 2 to 4, which definitely points towards a plateau at 8 or 10 cores for 2 OSDs.
One interesting takeaway from this result is the breaking of my assumption that Ceph OSDs were primarily single-threaded applications limited by raw single-core performance. They are clearly not, and one OSD will consume resources from many CPU cores. When building an HCI cluster, this becomes a much more important consideration, making very high-core-count CPUs, even slightly slower ones, a more attractive option. More testing to evaluate the differences that clock speed and IPC can make across multiple CPU frequencies and generations would be very useful in further narrowing down the optimal performance range, though this is currently outside of my personal capabilities.
### Random I/O Latency
The final test concerns latency, using a queue depth of 1 so that request latency dominates the measurement. Unlike the other two suites, these results do line up with my intuitive expectations.
For reads, the Optane drive does drop the latency slightly, likely due to the lower latency of metadata reads from the DB volume. Though the result looks pronounced in the graph due to the scale, it really only amounts to a 2% performance difference. The `cpuset` results show a further latency drop of roughly another 2-3%, which does seem to indicate that CPU contention affects the latency of read operations as well, though not as dramatically as it affects raw performance.
For writes, the Optane drive is a clear winner, reducing the average latency by almost 15%. This, combined with the `cpuset` results showing a steady, if minimal, reduction in latency as more CPU cores are dedicated to the OSDs, definitely points towards the CPU-sensitive nature of Ceph latency, since clearly the software component accounts for over 130x the latency that the drive does (~15 microseconds for a direct write versus ~1980 microseconds for a Ceph write).
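The ~130x figure comes directly from the two latencies quoted above:

```shell
# Ratio of the average Ceph write latency (~1980 us) to a direct Optane
# write (~15 us); integer division is exact here.
echo $((1980 / 15))   # prints 132
```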
## Overall Conclusions and Takeaways
Going into this project, I had hoped that both the Optane drive and the `cpuset` core dedication would make profound, dramatic, and consistent differences to the Ceph performance. However, the results instead show, like much in the realm of computer storage, trade-offs and caveats. As takeaways from the project, I have the following 4 main thoughts:
1. For write-heavy workloads, especially random writes, an Optane DB/WAL device can make a not-insignificant difference in overall performance. However, the money spent on an Optane drive might better be spent elsewhere...
2. CPU is, as always with Ceph, king. The more CPU cores you can get in your machine, and the faster those CPU cores are, the better, even ignoring the VM side of the equation. Going forward I will definitely be allocating more than my original 1 CPU core per OSD assumption into my overall CPU core count calculations, with 4 cores per OSD being a good baseline.
3. While I have not been able to definitively test and validate it myself, it seems that `cpuset` options are, at best, only worthwhile in very read-heavy use-cases and in cases where VMs are extremely noisy neighbours and there are insufficient physical CPU cores to satiate them. While there is a marked increase in random I/O performance, the baseline write performance matching the 4-core limit seems to show that the effect would be minimized the more cores there are for both workloads to use, and the seemingly-dramatic read improvement might be due to the age of some of the CPUs in my particular cluster. More investigation is definitely warranted.
4. While it was not exactly tested here, memory performance would certainly make a difference to read performance. Like with CPUs, I expect that read rates would be much higher if all nodes were using the latest DDR4 memory.
Hopefully this analysis of my recent Ceph tuning adventures was worthwhile, and that you learned something. And of course, I definitely welcome any comments, suggestions, or corrections!

+++
class = "post"
date = "2020-05-31T00:00:00-04:00"
tags = ["systems administration", "development", "matrix"]
title = "Building a scalable, redundant Matrix Homeserver"
description = "Deploy an advanced, highly-scalable Matrix instance with split-workers and backends from scratch"
type = "post"
weight = 1
draft = true
+++
## What is Matrix?
Matrix is, fundamentally, a combination of the best parts of IRC, XMPP, and Slack-like communication platforms (Discord, Mattermost, Rocketchat, etc.) built to modern standards. In the Matrix ecosystem, users can run their own server instances, called "homeservers", which then federate amongst themselves to create a "fediverse". It is thus fully distributed, allowing users to communicate with each other on their own terms, while providing all the features one would expect of a global chat system, such as large public rooms, as well as standard features of more modern platforms, like small private groups, direct messages, file uploads, and advanced integration and moderation features, such as bots. The reference homeserver application is called "Synapse", written in Python 3, and released under an Apache 2.0 license.
In this guide, I seek to provide a document detailing the full steps to deploy a highly-available, redundant, multi-worker Matrix instance, with a fully redundant PostgreSQL database and LDAP authentication and 3PID backend. For those of you who just want to run a quick-and-easy Matrix instance with few advanced features, this guide is probably not for you, and there are numerous guides out there for setting up basic Matrix Synapse instances instead.
Most of the concepts in this guide, as well as most of the configuration files given, can be adapted to a single-host but still split-worker instance instead, should the configuration below be deemed too complicated or excessive for your usecase. Be sure to carefully read this document and the Matrix documentation if you wish to do so, though most sections can be adapted verbatim.
## The problem with Synapse
The main issue with Synapse in its default configuration, as documented by the Matrix project themselves, is that it is single-threaded and non-redundant. Since a lot of actions inside Synapse require significant CPU resources, especially those related to federation, this can be a significant bottleneck. This is especially true in very large rooms, where there are potentially hundreds of joined users on multiple homeservers that all must be communicated to. Without tweaking, this can manifest as posts to large rooms taking an extraordinarily long time, upwards of 10 seconds, to send, as well as problems joining very large rooms for the first time (significant delays, timeouts, join failures, etc.).
Unfortunately, most homeserver users aren't running their instance on the fastest possible CPU, thus, the only solution to improve performance in this area is to somehow allow the Synapse process to use multiple threads. Luckily for us, Matrix Synapse, since about version 1.10, supports this via workers. Workers allow one to split various functions out of the main Synapse process, which then allows multi-threaded operation and thus, increased performance.
The configuration of workers [is discussed in the Synapse documentation](https://github.com/matrix-org/synapse/blob/master/docs/workers.md), however a number of details are glossed over or not mentioned completely. Thus, this blog post will outline some of the specific details involved in tuning workers for maximum performance.
## Step 1 - Prerequisites and planning
The system outlined in this guide is designed to provide a very scalable and redundant Matrix experience. To this end, the entire system is split up into multiple hosts. In most cases, these should be Virtual Machines running on at least 2 hypervisors for redundancy at the lower layers, though this is outside of the scope of this guide. For our purposes, we will assume that the VMs discussed below are already installed, configured, and operating.
The configuration outlines here makes use of a total of 14 VMs, with 6 distinct roles. Within each role, either 2 or 3 individual VMs are configured to provide redundancy. The roles can be roughly divided into two categories, frontends that expose services to users, and backends that expose databases to the frontend instances.
The full VM list, with an example naming convention where X is the host "ID" (e.g. 1, 2, etc.), is as follows:
| Quantity | Name | Description |
| --- | --- | --- |
| 2 | flbX | Frontend load balancers running HAProxy, handling incoming requests from clients and federated servers. |
| 2 | rwX | Riot Web instances under Nginx. |
| 3 | hsX | Matrix Synapse homeserver instances running the various workers. |
| 2 | blbX | Backend load balancers running HAProxy, handling database requests from the homeserver instances. |
| 3 | mpgX | PostgreSQL database instances running Patroni with Zookeeper. |
| 2 | mldX | OpenLDAP instances. |
While this setup may seem like overkill, it is, aside from the homeserver instances, the minimum configuration possible while still providing full redundancy. If redundancy is not desired, a smaller configuration, down to as little as one host, is possible, though this is not detailed below.
In addition to these 14 VMs, some sort of shared storage must be provided for the sharing of media files (e.g. uploaded files) between the homeservers. For the purpose of this guide, we assume that this is an NFS export at `/srv/matrix` from a system called `nas`. The configuration of redundant, shared storage is outside of the scope of this guide, and thus we will not discuss this beyond this paragraph, though prospective administrators of highly-available Matrix instances should consider this as well.
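As one possible sketch of that shared-media setup - the export path comes from this guide's example, while the mount point and NFS options are assumptions (the Debian `matrix-synapse` package defaults its media store under `/var/lib/matrix-synapse`):

```shell
# Sketch: on each hsX host, mount the shared NFS export over the Synapse
# media store so all homeserver instances see the same uploaded files.
echo "nas:/srv/matrix /var/lib/matrix-synapse/media nfs rw,hard,vers=4 0 0" \
    | sudo tee -a /etc/fstab
sudo mount /var/lib/matrix-synapse/media
```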
All the VMs mentioned above should be running the same operating system. I recommend Debian 10.X (Buster) here, both because it is the distribution I run myself, and also because it provides nearly all the required packages with minimal fuss. If you wish to use another distribution, you must adapt the commands and examples below to fit. Additionally, this guide expects that you are running the Systemd init system. This is not the place for continuing the seemingly-endless init system debate, but some advanced features of Systemd (such as template units) are used below and in the official Matrix documentation, so we expect this is the init system you are running, and you are on your own if you choose to use an alternative.
For networking purposes, it is sufficient to place all the above servers in a single RFC1918 network. Outbound NAT should be configured to allow all hosts to reach the internet, and a small number of ports should be permitted through a firewall towards the external load balancer VIP (virtual IP address). The following is an example IP configuration in the network `10.0.0.0/24` that can be used for this guide, though you may of course choose a different subnet and host IP allocation scheme if you wish. All these names should resolve in DNS, or be configured in `/etc/hosts` on all machines.
| IP address | Hostname | Description |
| --- | --- | --- |
| 10.0.0.1 | gw | NAT gateway and firewall, upstream router. |
| 10.0.0.2 | blbvip | Floating VIP for blbX instances. |
| 10.0.0.3 | blb1 | blbX host 1. |
| 10.0.0.4 | blb2 | blbX host 2. |
| 10.0.0.5 | mpg1 | mpgX host 1. |
| 10.0.0.6 | mpg2 | mpgX host 2. |
| 10.0.0.7 | mpg3 | mpgX host 3. |
| 10.0.0.8 | mld1 | mldX host 1. |
| 10.0.0.9 | mld2 | mldX host 2. |
| 10.0.0.10 | flbvip | Floating VIP for flbX instances. |
| 10.0.0.11 | flb1 | flbX host 1. |
| 10.0.0.12 | flb2 | flbX host 2. |
| 10.0.0.13 | rw1 | rwX host 1. |
| 10.0.0.14 | rw2 | rwX host 2. |
| 10.0.0.15 | hs1 | hsX host 1. |
| 10.0.0.16 | hs2 | hsX host 2. |
| 10.0.0.17 | hs3 | hsX host 3. |
## Step 2 - Installing and configuring OpenLDAP instances
[OpenLDAP](https://www.openldap.org/) is a common LDAP server, which provides centralized user administration as well as the configuration of additional details in a user directory. Installing and configuring OpenLDAP is beyond the scope of this guide, though the Matrix Homeserver configurations below assume that this is operating and that all Matrix users are stored in the LDAP database. In our example configuration, there are 2 OpenLDAP instances running with replication (`syncrepl`) between them, which are then load-balanced in a multi-master fashion. Since no services below here will be performing writes to this database, this is fine. The administrator is expected to configure some sort of user management layer of their choosing (e.g. scripts, or a web-based frontend) for managing users, resetting passwords, etc.
While this short section may seem like a cop-out, this is an extensive topic with many potential caveats, and should thus have its own (future) post on this blog. Until then, I trust that the administrator is able to look up and configure this themselves. I include these references only to help guide the administrator towards full-stack redundancy and to explain why there are LDAP sections in the backend load balancer configurations.
## Step 3 - Installing and configuring Patroni instances
[Patroni](https://github.com/zalando/patroni) is a service manager for PostgreSQL which provides automated failover and replication support for a PostgreSQL database. Like OpenLDAP above, the configuration of Patroni is beyond the scope of this guide, and the configurations below assume that this is operating and already configured. In our example configuration, there are 3 Patroni instances, which is the minimum required for quorum among the members. As above, I do plan to document this in a future post, but until then, I recommend the administrator reference the Patroni documentation as well as [this other post on my blog](https://www.boniface.me/post/patroni-and-haproxy-agent-checks/) for details on setting up the Patroni instances.
## Step 4 - Installing and configuring backend load balancers
While I do not go into details in the previous two steps, this section details how to make use of a redundant pair of HAProxy instances to expose the redundant databases mentioned above to the Homeserver instances.
In order to provide a single entrypoint to the load balancers, the administrator should first install and configure Keepalived. The following `/etc/keepalived/keepalived.conf` configuration will set up the `blbvip` floating IP address between the two instances, while providing checking of the HAProxy instance health. This configuration below can be used on both proxy hosts, and inline comments provide additional clarification and information as well as indicating any changes required between the hosts.
```
# Global configuration options.
global_defs {
# Use a dedicated IPv4 multicast group; adjust the last octet if this conflicts within your network.
vrrp_mcast_group4 224.0.0.21
# Use VRRP version 3 in strict mode and with no iptables configuration.
vrrp_version 3
vrrp_strict
vrrp_iptables
}
# HAProxy check script, to ensure that this host will not become PRIMARY if HAProxy is not active.
vrrp_script chk {
script "/usr/bin/haproxyctl show info"
interval 5
rise 2
fall 2
}
# Primary IPv4 VIP configuration.
vrrp_instance VIP_4 {
# Initial state, MASTER on both hosts to ensure that at least one host becomes active immediately on boot.
state MASTER
# Interface to place the VIP on; this is optional though still recommended on single-NIC machines; replace "ens2" with your actual NIC name.
interface ens2
# A dedicated, unique virtual router ID for this cluster; adjust this if required.
virtual_router_id 21
# The priority. Set to 200 for the primary (first) server, and to 100 for the secondary (second) server.
priority 200
# The (list of) virtual IP address(es) with CIDR subnet mask for the "blbvip" host.
virtual_ipaddress {
10.0.0.2/24
}
# Use the HAProxy check script for this VIP.
track_script {
chk
}
}
```
Once the above configuration is installed at `/etc/keepalived/keepalived.conf`, restart the Keepalived service with `sudo systemctl restart keepalived` on each host. You should see the VIP become active on the first host.
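A quick way to confirm which host holds the VIP - the interface name `ens2` and the address come from the example configuration above, so adjust to match yours:

```shell
# Sketch: the VIP should appear on exactly one of the two load balancers
# at a time; this prints the address only on the active host.
ip -4 addr show dev ens2 | grep 10.0.0.2
```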
The HAProxy configuration below can be used verbatim on both proxy hosts, and inline comments provide additional clarification and information to avoid breaking up the configuration snippet. This configuration makes use of an advanced feature for the Patroni hosts [which is detailed in another post on this blog](https://www.boniface.me/post/patroni-and-haproxy-agent-checks/), to ensure that only the active Patroni node is sent traffic and to avoid the other two database hosts from reporting `DOWN` state all the time.
```
# Global settings - tune HAProxy for optimal performance, administration, and security.
global
# Send logs to the "local6" service on the local host, via an rsyslog UDP listener. Enable debug logging to log individual connections.
log ip6-localhost:514 local6 debug
log-send-hostname
chroot /var/lib/haproxy
pidfile /run/haproxy/haproxy.pid
    # Use multi-threaded support (available with HAProxy 1.8+) for optimal performance in high-load situations. Adjust `nbthread` as needed for your host's core count (half the core count is a good baseline).
nbproc 1
nbthread 2
# Provide a stats socket for `hatop`
stats socket /var/lib/haproxy/admin.sock mode 660 level admin process 1
stats timeout 30s
# Run in daemon mode as the `haproxy` user
daemon
user haproxy
group haproxy
# Set the global connection limit to 10000; this is certainly overkill but avoids needing to tweak this for larger instances.
maxconn 10000
# Default settings - provide some default settings that are applicable to (most) of the listeners and backends below.
defaults
log global
timeout connect 30s
timeout client 15m
timeout server 15m
log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq %bi:%bp"
# Statistics listener with authentication - provides stats for the HAProxy instance via a WebUI (optional)
userlist admin
# WARNING - CHANGE ME TO A REAL PASSWORD OR A SHA512-hashed PASSWORD (with `password` instead of `insecure-password`). IF YOU USE `insecure-password`, MAKE SURE THIS CONFIGURATION IS NOT WORLD-READABLE.
user admin insecure-password P4ssw0rd
listen stats
bind :::5555 v4v6
mode http
stats enable
stats uri /
stats hide-version
stats refresh 10s
stats show-node
stats show-legends
acl is_admin http_auth(admin)
http-request auth realm "Admin access required" if !is_admin
# Stick-tables peers configuration
peers keepalived-pair
peer blb1 10.0.0.3:1023
    peer blb2 10.0.0.4:1023
# LDAP frontend
frontend ldap
bind :::389 v4v6
maxconn 1000
mode tcp
option tcpka
default_backend ldap
# PostgreSQL frontend
frontend pgsql
bind :::5432 v4v6
maxconn 1000
mode tcp
option tcpka
default_backend pgsql
# LDAP backend
backend ldap
mode tcp
option tcpka
balance leastconn
server mld1 10.0.0.8:389 check inter 2000 maxconn 64
server mld2 10.0.0.9:389 check inter 2000 maxconn 64
# PostgreSQL backend using agent check
backend pgsql
mode tcp
option tcpka
option httpchk OPTIONS /master
http-check expect status 200
server mpg1 10.0.0.5:5432 maxconn 1000 check agent-check agent-port 5555 inter 1s fall 2 rise 2 on-marked-down shutdown-sessions port 8008
server mpg2 10.0.0.6:5432 maxconn 1000 check agent-check agent-port 5555 inter 1s fall 2 rise 2 on-marked-down shutdown-sessions port 8008
server mpg3 10.0.0.7:5432 maxconn 1000 check agent-check agent-port 5555 inter 1s fall 2 rise 2 on-marked-down shutdown-sessions port 8008
```
Once the above configurations are installed on each server, restart the HAProxy service with `sudo systemctl restart haproxy`. Use `sudo hatop -s /var/lib/haproxy/admin.sock` to view the status of the backends, and continue once all are running correctly.
## Step 5 - Install and configure Synapse instances
The core homeserver processes should be configured on all homeserver machines. There are numerous options in `homeserver.yaml`, most of which are beyond the scope of this guide; the steps below focus on the worker-related configuration.
## Step 6 - Configure systemd units
The easiest way to set up workers is to use a template unit file with a series of individual worker configurations. A series of unit files are [provided within the Synapse documentation](https://github.com/matrix-org/synapse/tree/master/docs/systemd-with-workers), which can be used to set up template-based workers.
I decided to modify these somewhat, by replacing the configuration directory at `/etc/matrix-synapse/workers` with `/etc/matrix-synapse/worker.d`, but this is just a personal preference. If you're using official Debian packages (as I am), you will also need to adjust the path to the Python binary. I also adjust the description to be a little more consistent. The resulting template worker unit looks like this:
```
[Unit]
Description = Synapse Matrix worker %i
PartOf = matrix-synapse.target
[Service]
Type = notify
NotifyAccess = main
User = matrix-synapse
WorkingDirectory = /var/lib/matrix-synapse
EnvironmentFile = /etc/default/matrix-synapse
ExecStart = /usr/bin/python3 -m synapse.app.generic_worker --config-path=/etc/matrix-synapse/homeserver.yaml --config-path=/etc/matrix-synapse/conf.d/ --config-path=/etc/matrix-synapse/worker.d/%i.yaml
ExecReload = /bin/kill -HUP $MAINPID
Restart = on-failure
RestartSec = 3
SyslogIdentifier = matrix-synapse-%i
[Install]
WantedBy = matrix-synapse.target
```
There is also a generic target unit that should be installed to provide a unified management point for both the primary Synapse process as well as the workers. After some similar tweaks, including adjusting the After condition to use `network-online.target` instead of `network.target`, the resulting file looks like this:
```
[Unit]
Description = Synapse Matrix homeserver target
After = network-online.target
[Install]
WantedBy = multi-user.target
```
Install both of these units, as `matrix-synapse-worker@.service` and `matrix-synapse.target` respectively, to `/etc/systemd/system`, and run `sudo systemctl daemon-reload`.
Once the unit files are prepared, you can begin building each individual worker configuration.
## Step 7 - Configure the individual workers
Each worker is configured via an individual YAML configuration file, with our units under `/etc/matrix-synapse/worker.d`. By design, each worker makes use of `homeserver.yaml` for all global configuration values, then the individual worker configurations override specific settings for the particular worker. The [Synapse documentation on workers](https://github.com/matrix-org/synapse/blob/master/docs/workers.md) provides a good starting point, but some sections are vague, and thus this guide hopes to provide more detailed instructions and explanations.
Each worker is given a specific section below, which includes the full YAML configuration I use, as well as any notes about the configuration that are worth mentioning. They are provided in alphabetical order, rather than the order provided in the documentation above, for clarity.
For any worker which responds to REST, a port must be selected for the worker to listen on. The main homeserver runs by default on port 8008, and I have `ma1sd` running on port 8090, so I chose ports from 8091 to 8097 for the various REST workers in order to keep them in a consistent range.
Finally, the main homeserver must be configured with both TCP and HTTP replication listeners, to provide communication between the workers and the main process. For this I use the ports provided by the Matrix documentation above, 9092 and 9093, with the following configuration in the main `homeserver.yaml` `listeners` section:
```
listeners:
- port: 8008
tls: false
bind_addresses:
- '::'
type: http
x_forwarded: true
resources:
- names: [client, webclient]
compress: true
- port: 9092
bind_addresses:
- '::'
type: replication
x_forwarded: true
- port: 9093
bind_addresses:
- '::'
type: http
x_forwarded: true
resources:
- names: [replication]
```
There are a couple of adjustments here from the default configuration. First, the `federation` resource has been removed from the primary listener, since this is implemented as a worker below. TLS is disabled here, and `x_forwarded: true` is added to all 3 frontends, since this is handled by a reverse proxy, as discussed later in this guide. All three listeners use a global IPv6+IPv4 bind address of `::` so they will be accessible by other machines on the network, which is important for the final, multi-host setup. As noted in the Matrix documentation, *ensure that the replication ports are not publicly accessible*, since they are unauthenticated and unencrypted; I run these servers on an RFC1918 private network behind a firewall so this is secure, but you will need to provide some sort of firewall if your Synapse instance is directly available on the public Internet.
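A quick sanity check for that warning - the hostname placeholder below is, of course, an example you must substitute:

```shell
# Sketch: confirm the replication listeners are up on the homeserver host...
ss -tlnp | grep -E ':(9092|9093)'

# ...and, from a machine OUTSIDE your private network, confirm they are NOT
# reachable (this should time out or be refused):
nc -vz -w 3 your-public-address 9093
```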
The configurations below show a hostname, `mlbvip`, for all instances of `worker_replication_host`. This will be explained and discussed further in the reverse proxy section. If you are only interested in running a "single-server" instance, you may use `localhost`, `127.0.0.1`, or `::1` here instead, as these ports will not be managed by the reverse proxy in such a setup.
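Every worker file below repeats the same four replication lines. As a small illustrative sketch (the `worker_stanza` helper is hypothetical, not part of Synapse), the shared stanza can be generated per worker like so:

```
# Print the replication stanza shared by every worker config in this guide.
# "mlbvip" is the reverse proxy VIP hostname; use localhost on a single server.
worker_stanza() {
    app="$1"
    cat <<EOF
---
worker_app: synapse.app.${app}
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
EOF
}
worker_stanza appservice
```

Redirecting the output (e.g. `worker_stanza appservice > /etc/matrix-synapse/worker.d/appservice.yaml`) and then appending any `worker_listeners` section gives each file below.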
#### `appservice` worker (`/etc/matrix-synapse/worker.d/appservice.yaml`)
The `appservice` worker does not service REST endpoints, and thus has a minimal configuration.
```
---
worker_app: synapse.app.appservice
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@appservice.service`. It will be started later in the process.
#### `client_reader` worker (`/etc/matrix-synapse/worker.d/client_reader.yaml`)
The `client_reader` worker services REST endpoints, and thus has a listener section, with port 8091 chosen.
```
---
worker_app: synapse.app.client_reader
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_listeners:
- type: http
port: 8091
resources:
- names:
- client
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@client_reader.service`. It will be started later in the process.
#### `event_creator` worker (`/etc/matrix-synapse/worker.d/event_creator.yaml`)
The `event_creator` worker services REST endpoints, and thus has a listener section, with port 8092 chosen.
```
---
worker_app: synapse.app.event_creator
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_listeners:
- type: http
port: 8092
resources:
- names:
- client
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@event_creator.service`. It will be started later in the process.
#### `federation_reader` worker (`/etc/matrix-synapse/worker.d/federation_reader.yaml`)
The `federation_reader` worker services REST endpoints, and thus has a listener section, with port 8093 chosen. Note that this worker, in addition to a `client` resource, also provides a `federation` resource.
```
---
worker_app: synapse.app.federation_reader
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_listeners:
- type: http
port: 8093
resources:
- names:
- client
- federation
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@federation_reader.service`. It will be started later in the process.
#### `federation_sender` worker (`/etc/matrix-synapse/worker.d/federation_sender.yaml`)
The `federation_sender` worker does not service REST endpoints, and thus has a minimal configuration.
```
---
worker_app: synapse.app.federation_sender
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@federation_sender.service`. It will be started later in the process.
#### `frontend_proxy` worker (`/etc/matrix-synapse/worker.d/frontend_proxy.yaml`)
The `frontend_proxy` worker services REST endpoints, and thus has a listener section, with port 8094 chosen. This worker has an additional configuration parameter, `worker_main_http_uri`, which allows the worker to direct requests back to the primary Synapse instance. Like the `worker_replication_host` value, this uses `mlbvip` in this example; for "single-server" instances it *must* be replaced with `localhost`, `127.0.0.1`, or `::1`, as this port will not be managed by the reverse proxy in such a setup.
```
---
worker_app: synapse.app.frontend_proxy
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_main_http_uri: http://mlbvip:8008
worker_listeners:
- type: http
port: 8094
resources:
- names:
- client
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@frontend_proxy.service`. It will be started later in the process.
#### `media_repository` worker (`/etc/matrix-synapse/worker.d/media_repository.yaml`)
The `media_repository` worker services REST endpoints, and thus has a listener section, with port 8095 chosen. Note that this worker, in addition to a `client` resource, also provides a `media` resource.
```
---
worker_app: synapse.app.media_repository
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_listeners:
- type: http
port: 8095
resources:
- names:
- client
- media
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@media_repository.service`. It will be started later in the process.
#### `pusher` worker (`/etc/matrix-synapse/worker.d/pusher.yaml`)
The `pusher` worker does not service REST endpoints, and thus has a minimal configuration.
```
---
worker_app: synapse.app.pusher
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@pusher.service`. It will be started later in the process.
#### `synchrotron` worker (`/etc/matrix-synapse/worker.d/synchrotron.yaml`)
The `synchrotron` worker services REST endpoints, and thus has a listener section, with port 8096 chosen.
```
---
worker_app: synapse.app.synchrotron
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_listeners:
- type: http
port: 8096
resources:
- names:
- client
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@synchrotron.service`. It will be started later in the process.
#### `user_dir` worker (`/etc/matrix-synapse/worker.d/user_dir.yaml`)
The `user_dir` worker services REST endpoints, and thus has a listener section, with port 8097 chosen.
```
---
worker_app: synapse.app.user_dir
worker_replication_host: mlbvip
worker_replication_port: 9092
worker_replication_http_port: 9093
worker_listeners:
- type: http
port: 8097
resources:
- names:
- client
```
Once the configuration is in place, enable the worker by running `sudo systemctl enable matrix-synapse-worker@user_dir.service`. It will be started later in the process.
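With all ten worker files in place, the per-worker `systemctl enable` commands can be generated in one loop. This sketch only prints the commands for review; pipe the output to `sudo sh` (or run them directly) once you are satisfied:

```
# The ten workers configured above.
WORKERS="appservice client_reader event_creator federation_reader federation_sender frontend_proxy media_repository pusher synchrotron user_dir"
# Print one enable command per worker for review before running.
for worker in $WORKERS; do
    printf 'systemctl enable matrix-synapse-worker@%s.service\n' "$worker"
done
```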
## Step 4 - Riot instance
Riot Web is the reference frontend for Matrix instances, allowing a user to access Matrix via a web browser. Riot is an optional, but recommended, component of your homeserver.
## Step 5 - ma1sd instance
ma1sd is an optional component for Matrix, providing 3PID (e.g. email, phone number) lookup services for Matrix users. I use ma1sd with my Matrix instance for two main reasons: first, to map nice-looking user data such as full names to my Matrix users, and second, as a RESTful authentication provider to interface Matrix with my LDAP instance. For this guide, I assume that you already have an LDAP instance set up and that you are using it in this manner too.
## Step 6 - Reverse proxy
For this guide, HAProxy was selected as the reverse proxy. This is mostly due to my familiarity with it, but also for its more advanced functionality and, in my opinion, nicer configuration syntax. This section provides configuration for a "load-balanced", multi-server instance with two additional worker servers and separate proxy servers; a single-server instance with basic split workers can be made by removing the additional servers. This will allow the homeserver to grow to many dozens or even hundreds of users. In this setup, the load balancer is separated onto a dedicated pair of servers, with a `keepalived` VIP (virtual IP address) shared between them. The name `mlbvip` should resolve to this IP, and all previous worker configurations should use this `mlbvip` hostname as the connection target for the replication directives. Both a reasonable `keepalived` configuration for the VIP and the HAProxy configuration are provided.
The two proxy hosts can be named as desired, in my case using the names `mlb1` and `mlb2`. These names must resolve in DNS, or be specified in `/etc/hosts` on both servers.
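For example, with the VIP and host addresses used elsewhere in this guide (the specific addresses are placeholders; substitute your own), the `/etc/hosts` entries on each proxy host might look like:

```
10.0.0.10    mlbvip
10.0.0.11    mlb1.i.bonilan.net    mlb1
10.0.0.12    mlb2.i.bonilan.net    mlb2
```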
The Keepalived configuration below can be used on both proxy hosts, and inline comments provide additional clarification and information as well as indicating any changes required between the hosts. The VIP should be selected from the free IPs of your server subnet.
```
# Global configuration options.
global_defs {
# Use a dedicated IPv4 multicast group; adjust the last octet if this conflicts within your network.
vrrp_mcast_group4 224.0.0.21
# Use VRRP version 3 in strict mode and with no iptables configuration.
vrrp_version 3
vrrp_strict
vrrp_iptables
}
# HAProxy check script, to ensure that this host will not become PRIMARY if HAProxy is not active.
vrrp_script chk {
script "/usr/bin/haproxyctl show info"
interval 5
rise 2
fall 2
}
# Primary IPv4 VIP configuration.
vrrp_instance VIP_4 {
# Initial state, MASTER on both hosts to ensure that at least one host becomes active immediately on boot.
state MASTER
# Interface to place the VIP on; this is optional though still recommended on single-NIC machines; replace "ens2" with your actual NIC name.
interface ens2
# A dedicated, unique virtual router ID for this cluster; adjust this if required.
virtual_router_id 21
# The priority. Set to 200 for the primary (first) server, and to 100 for the secondary (second) server.
priority 200
# The (list of) virtual IP address(es) with CIDR subnet mask.
virtual_ipaddress {
10.0.0.10/24
}
# Use the HAProxy check script for this VIP.
track_script {
chk
}
}
```
Once the above configuration is installed at `/etc/keepalived/keepalived.conf`, restart the Keepalived service with `sudo systemctl restart keepalived` on each host. You should see the VIP become active on the first host.
The HAProxy configuration below can be used verbatim on both proxy hosts; inline comments provide additional clarification and information to avoid breaking up the configuration snippet. In this example we use `peer` configuration to enable the use of `stick-table` directives, which ensure that individual user sessions are synchronized between the HAProxy instances during failovers; note that with this setting, HAProxy will not start if the load balancers' hostnames do not resolve. Some additional, advanced features are used in several ACLs to ensure that, for instance, specific users and rooms are always directed to the same workers where possible, as required by the individual workers per [the Matrix documentation](https://github.com/matrix-org/synapse/blob/master/docs/workers.md).
```
global
# Send logs to the "local6" service on the local host, via an rsyslog UDP listener. Enable debug logging to log individual connections.
log ip6-localhost:514 local6 debug
log-send-hostname
chroot /var/lib/haproxy
pidfile /run/haproxy/haproxy.pid
# Use multi-threaded support (available with HAProxy 1.8+) for optimal performance in high-load situations. Adjust `nbthread` as needed for your host's core count (2-4 is optimal).
nbproc 1
nbthread 4
# Provide a stats socket for `hatop`
stats socket /var/lib/haproxy/admin.sock mode 660 level admin process 1
stats timeout 30s
# Run in daemon mode as the `haproxy` user
daemon
user haproxy
group haproxy
# Set the global connection limit to 10000; this is certainly overkill but avoids needing to tweak this for larger instances.
maxconn 10000
# Set default SSL configurations, including a modern highly-secure configuration requiring TLS1.2 client support.
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
tune.ssl.default-dh-param 2048
ssl-default-bind-ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384
ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
ssl-default-server-ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384
ssl-default-server-options ssl-min-ver TLSv1.2 no-tls-tickets
defaults
log global
option http-keep-alive
option forwardfor except 127.0.0.0/8
option redispatch
option dontlognull
option splice-auto
option log-health-checks
default-server init-addr libc,last,none
timeout client 30s
timeout connect 30s
timeout server 300s
timeout tunnel 3600s
timeout http-keep-alive 60s
timeout http-request 30s
timeout queue 60s
timeout tarpit 60s
peers keepalived-pair
# Peers for site bl0
peer mlb1.i.bonilan.net mlb1.i.bonilan.net:1023
peer mlb2.i.bonilan.net mlb2.i.bonilan.net:1023
resolvers nsX
nameserver ns1 10.101.0.61:53
nameserver ns2 10.101.0.62:53
userlist admin
user admin password MySuperSecretPassword123
listen stats
bind :::5555 v4v6
mode http
stats enable
stats uri /
stats hide-version
stats refresh 10s
stats show-node
stats show-legends
acl is_admin http_auth(admin)
http-request auth realm "Admin access" if !is_admin
frontend http
bind :::80 v4v6
mode http
option httplog
acl url_letsencrypt path_beg /.well-known/acme-challenge/
use_backend letsencrypt if url_letsencrypt
redirect scheme https if !url_letsencrypt !{ ssl_fc }
frontend https
bind :::443 v4v6 ssl crt /etc/ssl/letsencrypt/ alpn h2,http/1.1
bind :::8448 v4v6 ssl crt /etc/ssl/letsencrypt/ alpn h2,http/1.1
mode http
option httplog
capture request header Host len 64
http-request set-header X-Forwarded-Proto https
http-request add-header X-Forwarded-Host %[req.hdr(host)]
http-request add-header X-Forwarded-Server %[req.hdr(host)]
http-request add-header X-Forwarded-Port %[dst_port]
# Method ACLs
acl http_method_get method GET
# Domain ACLs
acl host_matrix hdr_dom(host) im.bonifacelabs.ca
acl host_element hdr_dom(host) chat.bonifacelabs.ca
# URL ACLs
# Sync requests
acl url_workerX_stick-auth path_reg ^/_matrix/client/(r0|v3)/sync$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3)/events$
acl url_workerX_stick-auth path_reg ^/_matrix/client/(api/v1|r0|v3)/initialSync$
acl url_workerX_stick-auth path_reg ^/_matrix/client/(api/v1|r0|v3)/rooms/[^/]+/initialSync$
# Federation requests
acl url_workerX_generic path_reg ^/_matrix/federation/v1/event/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/state/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/state_ids/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/backfill/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/get_missing_events/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/publicRooms
acl url_workerX_generic path_reg ^/_matrix/federation/v1/query/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/make_join/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/make_leave/
acl url_workerX_generic path_reg ^/_matrix/federation/(v1|v2)/send_join/
acl url_workerX_generic path_reg ^/_matrix/federation/(v1|v2)/send_leave/
acl url_workerX_generic path_reg ^/_matrix/federation/(v1|v2)/invite/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/event_auth/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/exchange_third_party_invite/
acl url_workerX_generic path_reg ^/_matrix/federation/v1/user/devices/
acl url_workerX_generic path_reg ^/_matrix/key/v2/query
acl url_workerX_generic path_reg ^/_matrix/federation/v1/hierarchy/
# Inbound federation transaction request
acl url_workerX_stick-src path_reg ^/_matrix/federation/v1/send/
# Client API requests
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/createRoom$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/publicRooms$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/joined_members$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/context/.*$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/members$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/state$
acl url_workerX_generic path_reg ^/_matrix/client/v1/rooms/.*/hierarchy$
acl url_workerX_generic path_reg ^/_matrix/client/unstable/org.matrix.msc2716/rooms/.*/batch_send$
acl url_workerX_generic path_reg ^/_matrix/client/unstable/im.nheko.summary/rooms/.*/summary$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/account/3pid$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/account/whoami$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/devices$
acl url_workerX_generic path_reg ^/_matrix/client/versions$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/voip/turnServer$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/event/
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/joined_rooms$
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/search$
# Encryption requests
# Note that ^/_matrix/client/(r0|v3|unstable)/keys/upload/ requires `worker_main_http_uri`
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/keys/query$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/keys/changes$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/keys/claim$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/room_keys/
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/keys/upload/
# Registration/login requests
acl url_workerX_generic path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/login$
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/register$
acl url_workerX_generic path_reg ^/_matrix/client/v1/register/m.login.registration_token/validity$
# Event sending requests
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/redact
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/send
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/state/
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/(join|invite|leave|ban|unban|kick)$
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/join/
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/profile/
# User directory search requests
acl url_workerX_generic path_reg ^/_matrix/client/(r0|v3|unstable)/user_directory/search$
# Pagination requests
acl url_workerX_stick-path path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/messages$
# Push rules (GET-only)
acl url_push-rules path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/pushrules/
# Directory worker endpoints
acl url_directory-worker path_reg ^/_matrix/client/(r0|v3|unstable)/user_directory/search$
# Event persister endpoints
acl url_stream-worker path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/rooms/.*/typing
acl url_stream-worker path_reg ^/_matrix/client/(r0|v3|unstable)/sendToDevice/
acl url_stream-worker path_reg ^/_matrix/client/(r0|v3|unstable)/.*/tags
acl url_stream-worker path_reg ^/_matrix/client/(r0|v3|unstable)/.*/account_data
acl url_stream-worker path_reg ^/_matrix/client/(r0|v3|unstable)/rooms/.*/receipt
acl url_stream-worker path_reg ^/_matrix/client/(r0|v3|unstable)/rooms/.*/read_markers
acl url_stream-worker path_reg ^/_matrix/client/(api/v1|r0|v3|unstable)/presence/
# Backend directors
use_backend synapseX_worker_generic if host_matrix url_workerX_generic
use_backend synapseX_worker_generic if host_matrix url_push-rules http_method_get
use_backend synapseX_worker_stick-auth if host_matrix url_workerX_stick-auth
use_backend synapseX_worker_stick-src if host_matrix url_workerX_stick-src
use_backend synapseX_worker_stick-path if host_matrix url_workerX_stick-path
use_backend synapse0_directory_worker if host_matrix url_directory-worker
use_backend synapse0_stream_worker if host_matrix url_stream-worker
# Master workers (single-instance) - Federation media repository requests
acl url_mediarepository path_reg ^/_matrix/media/
acl url_mediarepository path_reg ^/_synapse/admin/v1/purge_media_cache$
acl url_mediarepository path_reg ^/_synapse/admin/v1/room/.*/media.*$
acl url_mediarepository path_reg ^/_synapse/admin/v1/user/.*/media.*$
acl url_mediarepository path_reg ^/_synapse/admin/v1/media/.*$
acl url_mediarepository path_reg ^/_synapse/admin/v1/quarantine_media/.*$
acl url_mediarepository path_reg ^/_synapse/admin/v1/users/.*/media$
use_backend synapse0_media_repository if host_matrix url_mediarepository
# MXISD/MA1SD worker
acl url_ma1sd path_reg ^/_matrix/client/(api/v1|r0|unstable)/user_directory
acl url_ma1sd path_reg ^/_matrix/client/(api/v1|r0|unstable)/login
acl url_ma1sd path_reg ^/_matrix/identity
use_backend synapse0_ma1sd if host_matrix url_ma1sd
# Webhook service
acl url_webhook path_reg ^/webhook
use_backend synapse0_webhook if host_matrix url_webhook
# .well-known configs
acl url_wellknown path_reg ^/.well-known/matrix
use_backend elementX_http if host_matrix url_wellknown
# Catch-all for Matrix and Element
use_backend synapse0_master if host_matrix
use_backend elementX_http if host_element
# Default to Riot
default_backend elementX_http
frontend ma1sd_http
bind :::8090 v4v6
mode http
option httplog
use_backend synapse0_ma1sd
backend letsencrypt
mode http
server elbvip.i.bonilan.net elbvip.i.bonilan.net:80 resolvers nsX resolve-prefer ipv4
backend elementX_http
mode http
balance leastconn
option httpchk GET /index.html
# Force users (by source IP) to visit the same backend server
stick-table type ipv6 size 5000k peers keepalived-pair expire 72h
stick on src
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server element1 element1.i.bonilan.net:80 resolvers nsX resolve-prefer ipv4 check inter 5000 cookie element1.i.bonilan.net
server element2 element2.i.bonilan.net:80 resolvers nsX resolve-prefer ipv4 check inter 5000 cookie element2.i.bonilan.net
backend synapse0_master
mode http
balance roundrobin
option httpchk
retries 0
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse0.i.bonilan.net synapse0.i.bonilan.net:8008 resolvers nsX resolve-prefer ipv4 check inter 5000 backup
backend synapse0_directory_worker
mode http
balance roundrobin
option httpchk
retries 0
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse0.i.bonilan.net synapse0.i.bonilan.net:8033 resolvers nsX resolve-prefer ipv4 check inter 5000 backup
backend synapse0_stream_worker
mode http
balance roundrobin
option httpchk
retries 0
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse0.i.bonilan.net synapse0.i.bonilan.net:8035 resolvers nsX resolve-prefer ipv4 check inter 5000 backup
backend synapse0_media_repository
mode http
balance roundrobin
option httpchk
retries 0
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse0.i.bonilan.net synapse0.i.bonilan.net:8095 resolvers nsX resolve-prefer ipv4 check inter 5000 backup
backend synapse0_ma1sd
mode http
balance roundrobin
option httpchk
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse0.i.bonilan.net synapse0.i.bonilan.net:8090 resolvers nsX resolve-prefer ipv4 check inter 5000
backend synapse0_webhook
mode http
balance roundrobin
option httpchk GET /
server synapse0.i.bonilan.net synapse0.i.bonilan.net:4785 resolvers nsX resolve-prefer ipv4 check inter 5000 backup
backend synapseX_worker_generic
mode http
balance roundrobin
option httpchk
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse1.i.bonilan.net synapse1.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
server synapse2.i.bonilan.net synapse2.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
backend synapseX_worker_stick-auth
mode http
balance roundrobin
option httpchk
# Force users (by Authorization header) to visit the same backend server
stick-table type string len 1024 size 5000k peers keepalived-pair expire 72h
stick on hdr(Authorization)
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse1.i.bonilan.net synapse1.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
server synapse2.i.bonilan.net synapse2.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
backend synapseX_worker_stick-path
mode http
balance roundrobin
option httpchk
# Force users to visit the same backend server
stick-table type string len 1024 size 5000k peers keepalived-pair expire 72h
stick on path,word(5,/) if { path_reg ^/_matrix/client/(r0|unstable)/rooms }
stick on path,word(6,/) if { path_reg ^/_matrix/client/api/v1/rooms }
stick on path
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse1.i.bonilan.net synapse1.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
server synapse2.i.bonilan.net synapse2.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
backend synapseX_worker_stick-src
mode http
balance roundrobin
option httpchk
# Force users (by source IP) to visit the same backend server
stick-table type ipv6 size 5000k peers keepalived-pair expire 72h
stick on src
errorfile 500 /etc/haproxy/sorryserver.http
errorfile 502 /etc/haproxy/sorryserver.http
errorfile 503 /etc/haproxy/sorryserver.http
errorfile 504 /etc/haproxy/sorryserver.http
server synapse1.i.bonilan.net synapse1.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
server synapse2.i.bonilan.net synapse2.i.bonilan.net:8030 resolvers nsX resolve-prefer ipv4 check inter 5000
```
Once the above configurations are installed on each server, restart the HAProxy service with `sudo systemctl restart haproxy`. You will now have access to the various endpoints on ports 443 and 8448 with a redirection from port 80 to port 443 to enforce SSL from clients.
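For reference, the `url_wellknown` ACL in the `https` frontend routes `/.well-known/matrix` requests to the Element hosts, which must serve the delegation files themselves. A minimal sketch of `/.well-known/matrix/server` (using this guide's example domain and the 443 federation binding; substitute your own values):

```
{
  "m.server": "im.bonifacelabs.ca:443"
}
```

A corresponding `/.well-known/matrix/client` file would point `m.homeserver`'s `base_url` at `https://im.bonifacelabs.ca` for client auto-discovery.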
## Final steps
Now that your proxy is running, test connectivity to your servers. For Riot, visit the bare VIP IP or the Riot subdomain. For Matrix, visit the Matrix subdomain. In both cases, ensure that the page loads properly. Finally, use the [Matrix Homeserver Federation Tester](https://federationtester.matrix.org/) to verify that federation is correctly configured for your homeserver.
Congratulations, you now have a fully-configured, multi-worker and, if configured, load-balanced Matrix instance capable of handling many dozens or hundreds of users with optimal performance!
If you have any feedback about this post, including corrections, please contact me - you can find me in the [`#synapse:matrix.org`](https://matrix.to/#/!mjbDjyNsRXndKLkHIe:matrix.org) Matrix room, or via email!

---
title: "Self-Hosted Voice Control (for Paranoids)"
description: "Building a self-hosted voice interface for HomeAssistant"
date: 2018-03-12
tags:
- DIY
- Technology
---
#### _Building a self-hosted voice interface for HomeAssistant_
Voice control is both a new and a quite old piece of the home automation puzzle. As far back as the 1960s, science fiction depicted seamless voice control of computers, culminating in what is, to me, one of Star Trek's most endearing lines: "Computer, lights", followed by the satisfying brightness of hands-free lighting!
In the last few years, real-life technology has finally progressed to the point that this is truly possible. While there have been many attempts over the years, the fact is that reliable voice recognition requires massive quantities of computing power, machine learning, and sample data. It's something that truly requires "the cloud" to be workable. But with the rise of Google and Amazon voice appliances, the privacy implications of this have come into play. As a now-widely-circulated comic puts it, 30 years ago people were concerned about police wiretaps - now, they say "Wiretap, order me some duct tape"! This is compounded by the proprietary nature of these appliances. Sure, the company may _say_ that it doesn't listen to you all the time, but without visibility into the hardware and software, how much can we really trust it?
Luckily, the free software community has a couple of answers. And today, it's possible to build your own appliance! It still uses the Google/Amazon/Microsoft speech-to-text facilities, but by controlling the hardware and software, you can be sure that the device is only listening to you when you tell it to! Hopefully one day projects like Sphinx and Kaldi will be up to the task, but for now we're stuck using the cloud players, for better or worse.
## Hardware - The Raspberry Pi and ReSpeaker
The Raspberry Pi has pretty much become the go-to device for building small self-hosted appliance solutions. From wildlife cameras to [a server BMC](/post/a-raspberry-pi-bmc), the Raspberry Pi provides a fantastic base system for just about any small computing project you could want to build. This project uses the Raspberry Pi 3 model B, mainly because it was the most readily available new model, and because of the computing requirements of the software we will be using - the original Raspberry Pi doesn't have enough computing power, and the Raspberry Pi 2 has some software consistency issues.
The second main component of this project is the [Seeed Studio ReSpeaker (4-mic version)](http://wiki.seeed.cc/ReSpeaker_4_Mic_Array_for_Raspberry_Pi/). The ReSpeaker provides an array of 4 microphones, one on each corner of the square board, in addition to a ring of LEDs, giving a visual appearance similar to the Google and Amazon appliances. By integrating tightly with the Raspberry Pi, you can build a very compact unit that can be placed almost anywhere and with only a single incoming cord for power, assuming WiFi is in use.
### Parts list
* 1x Raspberry Pi 3 (or newer)
* 1x SD Card for Raspberry Pi (8+ GB)
* 1x Power cord for Raspberry Pi
* 1x ReSpeaker 4-mic hat
### Assembly
Assembly of the unit is very straightforward. The ReSpeaker attaches to the main Raspberry Pi GPIO pins and sits above the board, as seen in the picture on the site linked above. Once it is attached, the Raspberry Pi is ready to be installed and configured.
## Software - Kalliope, ReSpeaker, and Raspbian
To start, this post doesn't document my HomeAssistant configuration - to do so would require its own post entirely! What is important for our purposes though is that my HomeAssistant interface is exposing multiple API endpoints, one for each room, that handle the various lighting events that happen there. You can use this method for communicating almost anything to HomeAssistant via voice control.
For example, the following endpoint + data combination triggers a "lights-on" event for my bedroom:
```
curl -H 'X-HA-Access: MySuperSecretPassword' -H 'Content-Type: application/json' -X POST -d '{ "state": "on" }' https://myhomeassistantdomain.net:8123/api/events/bedroomlights
```
With the HomeAssistant side set up, we can begin configuring the Raspberry Pi.
### Kalliope
[Kalliope](https://github.com/kalliope-project/kalliope) is a free software (MIT-licensed) project to provide an always-on voice assistant. It is written in Python and features a very modular structure and extremely flexible configuration options. Unlike commercial options, though, you can inspect the code and confirm that it indeed does not report everything you say to some Internet service. It uses the Snowboy library to listen for a trigger word, and can then customize its behaviour based on the phrase received from your choice of speech-to-text provider (Google, Amazon, etc.). And since Snowboy runs locally, data is only sent to the cloud once the device is awoken by the trigger word.
I start with the [official Kalliope image](https://github.com/kalliope-project/kalliope/blob/master/Docs/installation/raspbian.md). The reason for this is twofold: first, the image provides a conveniently-configured system without having to manually `pip install` Kalliope, which even on a Raspberry Pi 3 takes upwards of an hour. Second, and more importantly, Snowboy appears to be broken on the latest Raspbian releases; it is impossible to compile properly, so the `pip install` can fail in obscure ways, usually after you've already been compiling for an hour. Using their pre-built image, and then upgrading it to the latest Raspbian, bypasses both problems and lets you get right to work.
Once you've written the Kalliope image to your SD card, boot it up, and then perform an upgrade to Raspbian Stretch (the image is Jessie):
```
pi@kalliope:~$ sudo find /etc/apt -name "*.list" -exec sed -i 's/jessie/stretch/g' {} \;
pi@kalliope:~$ sudo apt update && sudo apt upgrade -y
pi@kalliope:~$ sudo reboot
```
Once this finishes, you'll be booted into your Raspbian Stretch system complete with Kalliope installed. I cover the configuration in a later section.
### ReSpeaker Audio
The [ReSpeaker library](https://github.com/respeaker/seeed-voicecard) provides the drivers and utilities for using the ReSpeaker hat with Raspbian. Note however that this library won't work on Raspbian Jessie, only Stretch, which is why we have to upgrade the Kalliope image first. Once the upgrade is finished, clone this repository into a local directory and follow the instructions provided. Verify that the driver is working by checking `arecord -L` and looking for ReSpeaker entries, then configure the volume of the microphones using `alsamixer`. I find that a gain of 90 with a volume of 75 works well, whereas 100/100 results in nothing but noise. Your mileage may vary, so do some test recordings and verify as recommended in the library README.
One downside: the ReSpeaker technically supports directional audio (like, e.g., the Alexa, using the microphone closest to you for optimal pickup), but this project doesn't support it at the moment, because I'm using PulseAudio to handle the incoming audio rather than interfacing with the ReSpeaker directly - directional support would have to be built into Kalliope itself. It does work, but you don't get the directional listening that you might expect from reading the ReSpeaker page!
### ReSpeaker LEDs
The LED portion of the ReSpeaker requires a little more work. The [examples library for the 4-mic hat](https://github.com/respeaker/4mics_hat) provides all the basic tools needed to get the LEDs working, including several samples based on Google and Amazon device patterns. In my case, I went for a very simple LED feedback design: the LEDs turn on blue while listening, then quickly turn either green on a successful command, or red on a failed command, giving some sort of user feedback without having to listen to the unit try and talk!
To do this, I created a simple Python "daemon" running under Systemd to listen for commands on a FIFO pipe and perform the required action, as well as a helper client utility to trigger the pipe. The code for both can be found [on my GitHub](https://github.com/joshuaboniface/respeaker-led) for convenience. One interesting feature of this configuration is the Systemd unit file: it performs a git pull inside the service directory (i.e. the repo directory) to ensure the service is automatically up-to-date when it is started. I do the same thing in my Kalliope unit file for its configuration.
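The core of that daemon is just a blocking read on a named pipe. Here is a minimal sketch of the pattern; the path and command names are hypothetical placeholders, and the real code in the repository above additionally drives the LEDs and handles Systemd integration:

```python
import os
import tempfile

# Hypothetical pipe path; the real daemon uses its own location.
PIPE_PATH = os.path.join(tempfile.gettempdir(), "respeaker-led.fifo")

def send_command(command, pipe_path=PIPE_PATH):
    """Client side: write one command line into the pipe."""
    with open(pipe_path, "w") as pipe:
        pipe.write(command + "\n")

def handle_commands(dispatch, pipe_path=PIPE_PATH):
    """Daemon side: block on the pipe and run the handler for each command.

    Exits at EOF (when all writers close); a long-running daemon would
    reopen the pipe in an outer loop.
    """
    if not os.path.exists(pipe_path):
        os.mkfifo(pipe_path)
    with open(pipe_path) as pipe:  # blocks until a writer connects
        for line in pipe:
            handler = dispatch.get(line.strip())
            if handler:
                handler()
```

The dispatch dictionary maps command strings (e.g. `leds_blue`) to functions that set the LED ring state.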
### Kalliope configuration
The next step is to actually configure Kalliope. The examples are a good starting point, but integrating everything together is a bit more work. Below is a sample of the `brain.yml` configuration for my instance, showing how it integrates the ReSpeaker LEDs directly, as well as posting to the HomeAssistant URL.
```
# Default/built-in orders
- name: "order-not-found-synapse"
  signals: []
  neurons:
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_red
    - shell:
        cmd: /bin/sleep 0.5
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_off
    - shell:
        cmd: /bin/sleep 0.2

- name: "on-triggered-synapse"
  signals: []
  neurons:
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_blue

- name: "on-start-synapse"
  signals: []
  neurons:
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_off
    - shell:
        cmd: /bin/sleep 0.1

# Custom orders
- name: "order-lights-on"
  signals:
    - order:
        text: "lights"
        matching-type: "normal"
    - order:
        text: "lights on"
        matching-type: "normal"
    - order:
        text: "turn on lights"
        matching-type: "normal"
    - order:
        text: "full brightness"
        matching-type: "normal"
    - order:
        text: "all lights on"
        matching-type: "normal"
  neurons:
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_green
    - uri:
        url: "https://myhomeassistantdomain.net:8123/api/events/bedroomlights"
        headers:
          x-ha-access: MySuperSecretPassword
          Content-Type: application/json
        method: POST
        data: "{ \"state\": \"on\" }"
    - shell:
        cmd: /bin/sleep 0.4
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_off
    - shell:
        cmd: /bin/sleep 0.2
```
Using this configuration as a jumping-off point, you can add many other orders, and by including the various shell commands in each you can ensure that the LED ring shows the status of every task. So far, the only downside I've found with Kalliope is that single-word triggers are generally unreliable - the device doesn't realize it should stop listening - so try to keep orders to two or more words.
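For example, a mirrored "lights off" synapse follows the same pattern. This is an illustrative sketch using the same placeholder endpoint and password, not my exact configuration:

```yaml
- name: "order-lights-off"
  signals:
    - order:
        text: "lights off"
        matching-type: "normal"
    - order:
        text: "turn off lights"
        matching-type: "normal"
  neurons:
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_green
    - uri:
        url: "https://myhomeassistantdomain.net:8123/api/events/bedroomlights"
        headers:
          x-ha-access: MySuperSecretPassword
          Content-Type: application/json
        method: POST
        data: "{ \"state\": \"off\" }"
    - shell:
        cmd: /bin/sleep 0.4
    - shell:
        cmd: /usr/bin/env python /srv/respeaker-led/trigger.py leds_off
```

On the Home Assistant side, the `bedroomlights` event handler then decides what "off" means for the room.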
I use a custom Systemd unit to ensure everything is started correctly, including output buffering, and, as mentioned above, it keeps the configuration repository up-to-date with the origin, making on-the-fly configuration updates to multiple devices quick and painless.
```
# Kalliope service unit file
[Unit]
Description = Kalliope voice assistant
After = network-online.target
[Service]
Type = simple
WorkingDirectory = /srv/kalliope-config
User = kalliope
ExecStartPre = /usr/bin/pulseaudio --daemon
ExecStartPre = /usr/bin/ssh-agent bash -c 'ssh-add /srv/git-deploy.key; git pull; exit 0'
ExecStart = /usr/bin/stdbuf -oL /usr/local/bin/kalliope start
[Install]
WantedBy = multi-user.target
```
Install and enable the systemd unit file using a full path; this is a relatively unknown feature of systemctl that comes in handy here:
```
pi@kalliope:~$ sudo systemctl enable /srv/kalliope-config/kalliope.service
pi@kalliope:~$ sudo systemctl start kalliope.service
```
## The Next Steps
With all of this assembled, you can test out the system and make sure it's doing what you want. Here's a sample video of my unit in action. I will probably be building a few more (and getting a few more WeMo switches and dimmers) soon!
{{< youtube Q_02nEdvsic >}}
Thank you for checking out this project, and feel free to send me any feedback! Hopefully this helps someone else build up their voice-controlled home automation system!

+++
date = "2022-11-01T00:00:00-05:00"
tags = ["systems administration", "pvc","ceph","homelab","servers","networking"]
title = "State of the Servers 2022"
description = "A complete writeup of my homeproduction system as of end-2022, its 10th anniversary"
type = "post"
weight = 1
draft = true
+++
My home lab/production datacentre is my main hobby. While I have others, over the past 10 years I've definitely spent more time on it than anything else. From humble beginnings I've built a system to rival many small-to-medium enterprises (SMEs) or ISPs in my basement, providing me highly redundant and stable services both for my Internet presence, and for learning.
While I've written about parts of the setup in the past, I don't think I've ever done a complete and thorough writeup of every piece of the system, what went into my choices (spoiler: mostly cost) and design, and how it all fits together. This post is my attempt to rectify that.
For most of its first 8 years, the system was constantly changing, even month-to-month, as I obtained new parts, tinkered away, and just generally broke things for fun. But over COVID, working from home with money tight, and with my growth into a senior systems architect, things really stabilized, and the system has only changed in a few minor ways since 2020. I have big plans for 2023, but for right now things have been stable for long enough to really dig into all the parts, as well as hint at my future plans.
So if you dare, please join me on a virtual tour of my "homeproduction" system, the monster in my basement.
## Part One: A Brief History
My homelab journey started over 16 years ago while still in high school. At the time I was a serious computer enthusiast, and had more than enough spare parts to build a few home servers. Between then and finishing my college program (Network Engineering and Security Analyst at Mohawk College in Hamilton, Ontario) in late 2012, I went through a variety of setups that were almost exclusively based on single servers with storage, some sort of hypervisor, and just for tinkering and media storage.
When I started my career in earnest in January 2013, I finally had the disposable income to buy my first real server: a used Dell C6100 with 4 blade nodes. This system formed the basis of my lab for the next 6 years, and is still running today in a colo providing live functions for me.
My first few iterations tended to focus on a pair of Xen servers for virtualization and a separate ZFS server for storage, while also going through various combinations of routers, including Mikrotiks, trying to find something that would solve my endless WiFi issues. At this time I was running at most a dozen or so VMs with some core functionality for Internet presence, but nothing too fancy - it was primarily a learning tool. At one point I also tried a dual-primary DRBD setup for VM disks, but this went about as well as you might expect (not well at all), so I went back to a ZFS array for ZVOLs. I was also using `bcfg2` for configuration management. Basically, I had rebuilt from the ground up everything I used and deployed at work, and doing so gave me seriously in-depth knowledge of tools that were crucial to my later role.
![Early Homelab Rack #1](/images/state-of-the-servers-2022/early-rack1.png)
Around this time I was also finally stabilizing on a pretty consistent set of systems, and a rumored change to Google's terms for hosted domains prompted me to move one of my first major production services into my home system: email. I can safely say that, having now run email at home for 7 years, it works plenty fine if you take the proper care.
In early 2016 I discovered two critical new things for the management of my systems: Ansible and Ceph. At first, I was using Ansible mostly for ad-hoc tasks, but I quickly started putting together a set of roles to replace bcfg2 as my primary configuration management tool. While declarative configuration management is nice and all, I liked the flexibility of a more procedural, imperative system, especially when creating new roles, and it gave me a lot of power to automate complex program deployments that were impossible in bcfg2. By the end of the year I had fully moved over to Ansible for configuration management. I also started using `git` to track my configuration around this time, so this is the earliest period I still have records of, though I might wish to forget it...
![Writing good Git Commits](/images/state-of-the-servers-2022/joshfailsatgit.jpg)
Ceph was the real game-changer for me though. For most of the previous 2 years I had been immensely frustrated with my storage host being a single point of failure in my network: if it needed a kernel update, *everything* had to go down. I had looked into some of the more esoteric "enterprise" solutions like multipath SAS and redundant disk arrays, but cost, space, and power requirements kept me from going that route. Then a good friend introduced me to Ceph, which he had been playing with at his workplace. Suddenly I could take 3 generic servers (which he, newly married, was happy to provide for wife-acceptance-factor reasons) and build a redundant and scalable storage cluster that could tolerate single-host failures. Ceph was a lot more primitive then than it is today, forcing some uncommon solutions - using ZFS as the underlying filestore, for instance, to ensure corrupt data wouldn't be replicated. But that same cluster still serves me now after many years of tweaking and adjusting, having grown from a dozen 3TB drives to over 20 8TB and 14TB drives and 168TB of raw space. Getting actual file storage on Ceph was hard back then, due to the immaturity of CephFS, and my hack solution - an XFS array in-VM striped across a dozen 1TB volumes - was fraught with issues, but it worked well enough for long enough for CephFS to mature and for me to move to it for bulk data storage.
The next major change to my hypervisor stack came in mid-2016. In addition to a job change that introduced me to a lot of new technologies, at that point I was really feeling limited by Xen's interface and lack of any batteries-included tooling, so I looked into alternatives. I considered ProxMox, but was not at all impressed with its performance, reliability, or featureset; that opinion has not changed since. So I decided on one of my first daring plans: to switch from Xen to KVM+Libvirt, and use Corosync and Pacemaker to manage my VMs, with shared storage provided by the Ceph cluster.
By the end of 2016 I had also finally solved my WiFi problem, using a nice bonus to purchase a pair of Ubiquiti UAP-LR's which were, in addition to their strong signal, capable of proper roaming, finally allowing me to actually cover my entire house with usable WiFi. And a year later I upgraded this to a pair of UAP-AC Pro's for even better speed, keeping one of the UAP-LR's as a separate home automation network. I also moved my routing from the previous Mikrotik Routerboards I was using to pfSense, and bought myself a 10-Gigabit switch to upgrade the connectivity of all of the servers, which overnight nearly doubled the performance of my storage cluster. I also purchased several more servers around this time, first to experiment with, and then to replace my now-aging C6100.
2017 was a year of home automation and routing. I purchased my first set of Belkin WeMo switches, finally set up HomeAssistant, and got to work automating many of my lights, including a [custom voice controller system](https://www.boniface.me/self-hosted-voice-control/). Early in the year I also decided to abandon my long-serving ISP-provided static IP block and move to a new setup. While I liked the services and control it gave me, being DSL on 50 year old lines, the actual Internet performance was abysmal, and I wanted WAN redundancy. So I set up a remote dedicated server with a static IP block routed to it, then piped this back to my home using OpenVPN tunnels load-balanced over my now-redundant DSL and Cable Internet connections, providing both resiliency and a more "official" online presence. Later in the year, after discussing with a few coworkers, I invested in a proper colocation, abandoned the dedicated server, and used my now-freed (if frustrating) C6100 as a redundant pair of remote routers, with a pfSense pair on the home side.
In early 2018, the major drawbacks of Corosync and Pacemaker were rearing their ugly heads more and more often: any attempt to restart the service would trash my VM cluster, which had grown to about 30 VMs running many more services by this point. ProxMox still sort of sucked, and OpenStack was nigh-incomprehensible to a single mere wizard like myself. What I wanted was Nutanix, but even most SMEs can't afford that. So, I started building my own, named [PVC or Parallel Virtual Cluster](https://docs.parallelvirtualcluster.org). It wasn't that ambitious at first: I just wanted a replacement for Corosync+Pacemaker which would actually preserve state properly, using Zookeeper as the state management backend. Over time I slowly added more functionality to it, and a major breakthrough came in late 2018 when I left the "new" job and returned to my "old" job, bringing this project with me, and impressing my managers with its potential to replace their aging Xen-based platform (on which I based my original homelab design, ironically enough; student became teacher). By early 2020 I had it deployed in production at 2 ISPs, and today have it deployed at 9, plus two in-house clusters, with several more on the way. I discuss PVC in more detail later.
In late 2018, I finally grew fed up with pfSense. The short of it is, nothing config-wise in pfSense is static: events like "the WAN 1 interface went down" would trigger PHP scripts which would regenerate and reload dozens of services, meaning that trivialities like WAN failovers would take up to 30 seconds. Frustrated, I decided to abandon pfSense entirely and replaced my routers with custom-built FreeBSD boxes in line with the remote routers at the colocation. This setup proved invaluable going forward: 1-second failure detection and seamless failover have been instrumental in keeping a 99+% uptime on my *home* system.
2019 was a fairly quiet year, with some minor upgrades here and there, with the occasional server replacement to help keep power usage down. And by early 2020, most of the current system had fallen into place. While the number of VMs fluctuates month to month still, the core set is about 40 that are always running, across 3 hypervisor hosts running PVC, with bulk data on the 3 Ceph nodes. 2 routers on each side provided redundant connectivity, and after an unfortunate complication with my TPIA cable provider, in 2022 I moved to a business-class Gigabit cable connection, and added a Fixed-Wireless connection in addition to the existing DSL, bringing me to 3 Internet connections. There's no kill like overkill.
## Part Two: The Rack
![Rack Front](/images/state-of-the-servers-2022/rack-front.png)
![Rack Inside](/images/state-of-the-servers-2022/rack-inside.png)
The rack itself is built primarily of 2x4's and wood paneling. Originally, as seen above, I had used Lack tables, but due to the heat output I wanted to contain the heat and try to vent it somewhere useful, or at least not as obtrusive. This went through several iterations, and after scouring for enclosed racks around me to no avail, in ~2017 I took what is now a common refrain for me and built my own.
The primary construction uses 6 ~6-foot 2x4 risers, which are connected at the top and bottom by horizontal 2x4's to form a stable frame. Heavy castors are mounted on the bottom below each riser to allow for (relatively) easy movement of the rack around the room as needed, for instance for maintenance or enhancements. The bottom front section also features further horizontal 2x4's to form a base for the heavy UPSes discussed in the next section.
The actual servers sit on pieces of angle iron cut to approximately 3 feet, which bridge the first and second sets of risers on each side and are secured by 2 heavy screws and washers on each riser. This provides extremely stable support for even the heaviest servers I have, and allows for fairly easy maintenance without having to deal with traditional rails and their mounting points.
The outside of the entire rack is covered by thin veneer paneling to trap heat inside in a controlled way. On the left side, the back section forms a door which can be opened to provide access to the backs of the servers, cabling, and power connections.
I've gone through several airflow configurations to try to keep both the rack itself, and the room it's in, cooler. First, I attempted to directly exhaust the hot air out the adjoining window, but this was too prone to seasonal temperature variation to be useful. I then attempted to route the heat out the side of the rack to the front where it could be cooled by an air conditioner, but this proved ineffective as well. Finally, I moved to a simple, physics-powered solution whereby the top 6 inches of the rack is used to direct hot air, via a set of 4 fans in 2 pairs, towards the top front of the rack and out into the room; this solution works very well to keep the inside temperature of the rack to a relatively reasonable 35 degrees Celsius.
## Part Three: Cooling and Power
Continuing on from the rack exhaust, cooling inside the room is provided by a standalone portable 12000BTU air conditioner from Toshiba.
## Part Four: The Internet Connections and Colocation
## Part Five: The Network
## Part Six: The Bulk Storage Cluster and Backups
## Part Seven: The Hypervisor Cluster
## Part Eight: The "Infrastructure" VMs
## Part Nine: The Matrix/Synapse VMs
## Part Ten: The Other VMs
## Part Eleven: Diagrams!
## Part Twelve: A Video Tour
## Part Thirteen: The Future of the System

---
title: "The SuperSensor: Your all-in-one Home Assistant satellite"
description: "My own take on the multi-function Home Assistant sensor and voice hub"
date: 2024-04-23
tags:
- DIY
- Home Automation
---
## The Motivations
I've been interested in voice-based home automation for many years now; in fact, it was [one of the first posts on this blog](/posts/self-hosted-voice-control/). For many years now I've used it to control the lights in my bedroom, for two major reasons: first, the lights I was using were not hardwired, and I had many of them, and thus many little switches on cords; second, I wanted to be able to switch things on and off from wherever I was - be it my bed, my couch, or just outside the room as I was leaving - without having to fiddle with all those switches.
So I went with [Home Assistant](https://www.home-assistant.io/), then and now the *de facto* FLOSS standard for home automation. I bought a few smart plugs, from various manufacturers throughout the years (currently settled very nicely on [Athom ESPHome-based ones](https://www.athom.tech/)). And I set up [Kalliope](https://github.com/kalliope-project/kalliope) on a Raspberry Pi to do it. And it did work wonderfully for a very long time, with some fits and starts at times.
In the middle of 2023 though, I wanted to expand things a bit. Over the years I had added more smart switches in two other main zones: my garage, where they control an infrared heater, several fans, and several lights; and in my basement, where they control again several fans and lights. So I went looking to buy a few more ReSpeakers and Raspberry Pi's to build out a few more voice control nodes.
But I ran into two major problems. First, the original ReSpeaker units I was using had been discontinued, and while there was a nice replacement sized for a Raspberry Pi Zero, these were quite expensive. And second, coming out of the pandemic, Raspberry Pi availability was basically nonexistent. I specifically wanted the aforementioned Zero units, the Zero 2 W to be precise, and on the one occasion in all of 2023 that I could get one (in October, several months after I started this project!), I could only get *one*. It seemed my plans had to wait...
But then I checked the Home Assistant blog and boy was I in for a treat!
## The Year of the Voice
The Home Assistant project dubbed 2023 "The Year of the Voice". Their goal was to make voice control the primary focus of their development efforts for the year, and by June 2023 they had made excellent progress, landing one of the most critical features: wake word support. I was immediately intrigued and set about trying to set up an ESP32-based satellite device for Home Assistant. Later in 2023, the support got even better. There are still some flaws - at least, I think they're flaws, though others in the project and community disagree - but it works pretty damn well with a bit of tweaking and hackery.
After some experimenting, I had a working prototype streaming voice to my Home Assistant instance. And then I got an idea.
I had recently seen a Reddit post about the [Everything Presence One](https://github.com/EverythingSmartHome/everything-presence-one), and I thought, why not try making my own, but combining it with this voice control aspect? Thus, the idea for the SuperSensor was born.
## Part One: Finding The Parts
Based on the EP1, I wanted the SuperSensor to have the following parts:
1. A microphone. This is pretty much a given for the voice control portion, but I had to find a good option. Luckily support for the INMP441 MEMS microphone is pretty good in ESPHome, and these are what I had used in my tests, so it seemed like the most obvious solution. [Amazon search link](https://www.amazon.ca/s?k=INMP441), since the vendors change fast and both of my orders were different from each other and from the currently available stock. About $3.00 CAD each.
2. LEDs for visual feedback. I'm not a big fan of voice systems that "talk back", and I really liked the ReSpeaker's LED-based feedback mechanism. So I wanted to use LEDs for that purpose here. After a few prototypes, I settled on a design with two RGB common-cathode LEDs which could be driven directly from the GPIOs of the ESP32. Initially I tried a transistor power delivery system, but due to some missing knowledge on my part and having the wrong kind of transistor, it didn't work, and I found that the ESP's GPIO pins could easily drive both LEDs without issue, so I just went with that. [Amazon search link](https://www.amazon.ca/s?k=5mm+RGB+LED+common+cathode), same situation as the INMP441's. About $0.14 CAD each. I also added a ~300Ω resistor to limit current; I had these lying around, but add ~$0.10 CAD each if you need to buy them.
3. A radar sensor. Millimetre-wave radar is absolutely critical for a presence sensor like this, because it allows continuous tracking of occupants even when they aren't moving. I tried 5 different models, but settled on the HLK LD2410C, since it had the best combination of features, configurability, and ESPHome support. [AliExpress link](https://www.aliexpress.com/item/1005006000579211.html), though ensure you select the "LD2410C-P" out of the options. About $5.27 CAD each (including shipping).
4. A PIR sensor. PIR comes in very handy for detecting quick motion, and as a backup (in both directions) for the radar for initial presence detection. I tried a few options, but the SR602 ended up being the cheapest, the nicest form factor, and the best performing of the options I tried. [AliExpress link](https://www.aliexpress.com/item/1005001572550300.html), though I did replace the Fresnel cover with one from [a different sensor](https://www.aliexpress.com/item/1005004518651850.html) that looked nicer and seemed to perform a little better, since I had them from my testing. About $2.37 CAD each, plus $0.90 CAD each for 10 of the alternates for their covers (including shipping).
5. A light sensor. While the LD2410C does have "light detection" capabilities, it doesn't actually expose this in a useful way. I wanted a separate light sensor for two main reasons: first, it allows more granular control (e.g. trigger only with presence + lights) for things that are not themselves lights, or for lights in a room with natural light; and second, it can be useful to know how bright the room is for other automations. I originally went with the VEML7700 sensor, but at the time its ESPHome support was subpar, requiring an external custom module that was deprecated. While it is a bit more pricey, I wanted a reliable unit with good ESPHome support, so I went instead with the TSL2591 for the final version. One upside here is that it is able to track infrared light separately from visible light, and though I haven't found an actual use for this yet myself, it might come in handy for someone. [AliExpress link](https://www.aliexpress.com/item/1005005514391429.html). About $6.02 CAD each (including shipping).
6. A temperature sensor. I looked at, and ended up buying, a few different options here, all based on the standard Bosch BM sensors: the BMP280, BME280, and the BME680. The first was an error, as I definitely wanted the humidity option as well, so I purchased several BME280's, before finally deciding that the VOC/air quality detection of the BME680 was worth it. [AliExpress link](https://www.aliexpress.com/item/4000818429803.html). About $6.81 CAD each (including shipping).
7. An ESP32. I also went through several different ESP units before deciding on a particular one with a slim profile and integrated antenna trace, as this made the sensor more compact, and had a USB-C connector for better mechanical and electrical support. [AliExpress link](https://www.aliexpress.com/item/1005006019875837.html) though ensure you select the "HW-395" out of the options. About $10.18 CAD each (including shipping).
The last piece was some female pin receptors to hold the ESP32 boards. I did this to allow a quick swap-out should that ever be needed, and these cost about $15.99 CAD for 10 40-pin strips ([Amazon link](https://www.amazon.ca/gp/product/B08CMNRXJ1)), with each sensor using one strip, or about $1.60 CAD each.
All together, to build one sensor the non-board parts cost about $36.39 CAD including shipping from AliExpress (but not Amazon, because I have Prime). While I could probably improve on some of the parts, I felt this was a good balance and in practice this combination does work quite well with no complaints across 6 sensors.
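As a quick sanity check on driving the LEDs directly from GPIO (item 2 above), Ohm's law gives the per-channel currents. The forward voltages below are typical datasheet values I'm assuming, not measurements from these exact LEDs:

```python
# Ohm's law check for driving the RGB LEDs straight from a 3.3 V GPIO
# through the ~300 ohm series resistor (forward voltages are assumed
# typical values for 5mm RGB LEDs, not measured).
def led_current_ma(v_supply, v_forward, r_ohms):
    return (v_supply - v_forward) / r_ohms * 1000

red_ma = led_current_ma(3.3, 2.0, 300)   # red channel, ~2.0 V drop
blue_ma = led_current_ma(3.3, 3.0, 300)  # blue/green channels, ~3.0 V drop
# Roughly 4.3 mA and 1.0 mA per channel: comfortably within what an
# ESP32 GPIO can source, which is why no transistor driver was needed.
```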
## Part Two: Designing a Board
Of course, just stringing the parts together wouldn't look very nice or fit very well on a wall, so I designed a custom PCB to integrate all the components using the EasyEDA online builder. I went through two prototype designs before settling on a final design that I'm quite happy with. This design sits in a horizontal orientation with the power cord exiting the right side (when facing the unit), which suited all my planned installation points nicely. This could also be used as a guide to create a vertical or flipped horizontal orientation should one so choose. The design is open source (GPLv3 along with the following ESPHome code) and is available in [the Git repository](https://github.com/joshuaboniface/supersensor/tree/master/board) in both DXF and EasyEDA JSON formats. Note that all pictures are of my second prototype and the final design does fix a few flaws I had with it such as removing some unneeded transistor points, dropping the second LED resistor, and improving a couple trace paths.
The final result with black silkscreening looks fantastic to me, and I love the way the different colours of the individual components are highlighted, as well as hiding the ESP32 behind the unit. In fact, I liked this appearance so much that I abandoned my original plan of designing a 3D-printed case for them, in favour of the bare board. While this might not suit everyone, I like the "DIY PCB" look in my home, and building a case shouldn't be too hard based on the board dimensions.
I was able to order 10 boards for $23.05 CAD shipped, or $2.31 each, bringing the parts total for each SuperSensor to $38.70.
Assembly was a bit more work, having to solder the various sensors and pin strips to the boards, and took about 30 minutes per sensor once I got going. The result was a sleek unit that could be placed fairly inconspicuously in corners of rooms, and I then used extra-long USB3 cables and stainless steel strapping to hold them up, for a definite DIY look.
## Part Three: The ESPHome Configuration
Lastly, I had to write [the ESPHome configuration to make everything work](https://github.com/joshuaboniface/supersensor). Mostly, this just involved exposing the individual sensor components, but as I built it I wanted to make this more universal. To that end, I added several configuration options and a full "presence" system.
The voice component is pretty straightforward to get working with the sensor: as long as you have a Home Assistant Assist pipeline set up with speech-to-text and wake word engines, it should be plug-and-play.
You can select how long a PIR detection is "held" for. This is needed because, unlike most DIY PIR sensors, the SR602 has a fixed ~3 second hold time. This wasn't long enough for me in some situations, so I added a latch timer that outputs a second template sensor value based on the PIR state and the hold time: as long as the PIR fires or continues firing within this window, the resulting sensor remains active; once the timer expires with no further PIR detection, it becomes inactive. The hold time can be set between 0 (effectively the native ~3 seconds) and 60 seconds, with a default of 15 seconds, a good value for most uses.
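In ESPHome terms, this latch behaviour can be sketched with a `delayed_off` filter on the PIR input. The pin and ids here are illustrative, not the actual SuperSensor configuration (which exposes the hold time as an adjustable setting rather than a fixed value):

```yaml
binary_sensor:
  - platform: gpio
    pin: GPIO16                # assumed SR602 output pin
    id: pir_motion
    name: "PIR Motion (held)"
    device_class: motion
    filters:
      # Keep the sensor "on" for 15 s after the last PIR pulse;
      # any new pulse inside that window restarts the timer.
      - delayed_off: 15s
```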
You can control at what threshold light level "presence" is detected. This is useful to configure the light presence option below based on the actual conditions of the room, the light that falls on the sensor, etc. You can set this anywhere between 0 (always on) and 200 lux, in 5 lux increments, with a default of 30 lux (a decent brightness from a room light).
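As a minimal sketch, that threshold maps onto ESPHome's `analog_threshold` binary sensor platform. The `sensor_id` and the fixed 30 lux value are illustrative assumptions; the real firmware exposes the threshold as an adjustable option:

```yaml
binary_sensor:
  - platform: analog_threshold
    name: "Light Presence"
    sensor_id: light_level     # assumed id of the lux sensor
    threshold:
      upper: 30.0              # presence when the room is above 30 lux
      lower: 27.0              # a little hysteresis to avoid flapping
```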
You can disable voice control. This basically defeats the purpose of my design over something like the EP1, but could be useful in some cases (having multiple sensors in one room for example).
Finally, the big feature: a multi-factor presence detection system. Each major sensor (PIR, Radar, and Light) produces, based on the options above, its own "presence" detection signal. You can then select, separately for inbound (set) and outbound (cleared) presence, which combination of these sensors drives an overall presence sensor that can be used in automations.
I go through all the options in the README of the Git repository linked above, but I wanted to highlight the two configurations that I've been using to great success.
### Inbound PIR + Radar + Light, Outbound Radar + Light
I use this configuration for my garage heater. Obviously, a heater turning on when no one is present is both a waste of energy and a potential fire hazard, so this was the original motivation for making this configurable: I wanted something extremely *safe* that would only trigger when someone was actually there. This option combines all 3 sensors on the inbound, meaning that someone must first turn on the light (light level), walk in front of the sensor (PIR), and finally, to rule out PIR false triggers, trip the Radar sensor. On the outbound, *either* the light being turned off (light level) or the Radar no longer detecting will clear presence, and thus turn off the heater. So far this has worked flawlessly in both directions.
### Inbound PIR + Radar, Outbound Radar
This is the configuration I use for most of my other sensors, i.e. any that trigger lights on or off. The logic is fairly simple: both PIR and Radar must detect presence for the lights to turn on, preventing false positives, while only the Radar is required to maintain presence.
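As a rough sketch, this asymmetric in/out logic can be expressed as a self-referencing ESPHome template binary sensor. The ids here are illustrative assumptions; the actual firmware selects the sensor combinations at runtime rather than hard-coding them:

```yaml
binary_sensor:
  - platform: template
    name: "Occupancy"
    id: occupancy
    lambda: |-
      if (id(occupancy).state) {
        // Already occupied: radar alone is enough to hold presence.
        return id(radar_presence).state;
      }
      // Not occupied yet: require both PIR and radar to set presence,
      // which filters out PIR false positives.
      return id(pir_held).state && id(radar_presence).state;
```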
## Part Four: The Pictures
What would a DIY post be without pictures? Here's a few!
Here's an overall shot showing both a completed unit and the breakout of all the parts.
![Parts](parts.jpg)
Here is the blank PCB, from both the front and back. As mentioned above this is a prototype board, so while there are some differences from the final PCB, the overall layout is correct.
![Blank PCB Front](pcb-front.jpg)
![Blank PCB Back](pcb-back.jpg)
As part of testing all the sensors, I made a socketed version. While this makes the unit extremely thick, it might be a good idea to build one of these to test all your sensors before proceeding with the meticulous soldering of all the components to the boards, because de-soldering them later is basically impossible (godspeed to the BME680 sensor that did not make it).
![Socketed PCB](socketed-pcb.jpg)
Here is a completed board, from the front, back, back without the ESP32 installed, and short side. The ESP32 is socketed on the final boards, both to provide good airflow and to allow quick swapping of the "brains" of individual units if needed, but all the sensors are soldered directly to the board to keep the profile low.
![Completed Board Front](front.jpg)
![Completed Board Back](back.jpg)
![Completed Board Back w/o ESP](back-no-esp.jpg)
![Completed Board Side](side.jpg)
Here is one of the boards in its final mounted location, angled to provide perfect coverage of my garage. Due to where it sits, I had to bodge a makeshift antenna extension onto this one to get a decent WiFi connection, but it works well and this hasn't been needed for any of my other ones.
![Mounted Board](mounted.jpg)
Here is all the information and configuration the SuperSensor provides in Home Assistant.
![Home Assistant Dashboard](dashboard.png)
That is quite a lot of information, so in my actual dashboards I usually only show the most relevant parts for that particular use-case, like this one for my garage sensor.
![Room Dashboard](room.png)
Finally, here is a video demonstration of the voice control in action. This shows the LED feedback colours for listening (blue), processing (cyan), and both positive (green) and negative (red) responses in lieu of a voice response.
{{< youtube Uv4u0GrktBM >}}
## Final Thoughts
All in all, I'm very happy with how the SuperSensor turned out. I'm currently using 5 of them in my house (one in my garage, one in my bedroom, and three in my basement), with another 5 either partially- or fully-built and ready to go when I find locations for them (my outside gazebo seating area being an obvious next choice when the weather improves).
As mentioned above, I've [open sourced both the PCB design and the ESPHome configuration](https://github.com/joshuaboniface/supersensor) under the GNU GPLv3, as well as provided the full parts list with links, so if this design interests you, you can build one (or several, minimum PCB orders and all) yourself for under $40.00 CAD.
I've also ensured that the ESPHome configuration is properly packaged: you can flash a SuperSensor via USB (`esphome run supersensor.yaml` from the repository), after which it behaves like most ESPHome-based products: you connect to the WiFi AP the device broadcasts to do the initial WiFi configuration, and from there you can adopt it into the ESPHome module of Home Assistant to manage, update, and reconfigure it as needed. It will automatically pull ESPHome updates and any new configuration changes I make, should I find bugs or add new features.
Hopefully this helps you on your home automation journey!
