As the lazy proactive engineer that I am, I tend to advocate for self-service tooling and the empowerment of decision-makers. The idea of continuous integration and continuous deployment (CI/CD) can be scary for some shops.
Usually the main concern is code that “goes for a ride” with a deployment before anyone wants it live. What happens when code isn’t quite ready for prime time? Maybe product and marketing haven’t announced a new feature, or maybe a dependency isn’t quite ready yet. Well, part of the solution involves feature flagging: the idea that code can be deployed, then activated at a later point in time. This decouples deployment from functionality, making the change in behavior a configuration item.
In the past, I have seen a lot of home-grown solutions that tried to solve this problem. Today, there is a growing number of solutions, both commercial and open-source, that fill this space.
My main recommendation is to put the tooling in the hands of the decision-maker and free up your engineering team to spend their time on more productive activities. Perhaps the decision-maker is a product owner?
A follow-up recommendation is to consider the cost of a feature flag. Generally speaking, a flag’s state is stored in a database. Is your code optimized to check the feature flag only when necessary, or does a single page load trigger many external calls? Is the datastore right-sized? Does the network have enough throughput? These implementation details should be considered before jumping head-first into feature flags.
During my morning scroll, I stumbled upon an open-source offering called Flagr. I like that there is an HTTP API for integration into a diverse set of tooling. I like that there is a GUI that can enable non-programmers to take feature roll-out into their own hands.
I have not used Flagr yet, so I don’t have a full understanding of this tool’s shortcomings. I do encourage you, before you start writing your own flagging system, to take a survey of existing offerings that can guide you into a pattern you don’t have to dream up yourself.
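To give a flavor of the HTTP API, here is a rough sketch of what evaluating a flag might look like; the port, flag ID, and entity fields are illustrative assumptions on my part, so check the Flagr documentation for the exact request schema:

# Hypothetical evaluation call -- flagID, entityID, and host:port are placeholders.
curl -s -X POST http://localhost:18000/api/v1/evaluation \
  -H 'Content-Type: application/json' \
  -d '{"flagID": 1, "entityID": "user-123", "entityContext": {"plan": "beta"}}'

The response tells you which variant (if any) the entity was assigned. Your application can cache that answer for the life of a request, which speaks to the cost concern above about a single page load triggering many external calls.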
I pretty much live in a vim, tmux, and bash world for most of my interaction outside of the web browser. Here is a snippet of my .bashrc file that saves me some time.
export VAULT_ADDR="https://your-vault-server-domain"
export NOMAD_ADDR="https://your-nomad-server-domain"
export CONSUL_HTTP_ADDR="https://your-consul-server-domain"
alias vl='vault login -method=oidc'
alias nl='export NOMAD_TOKEN=$(vault read -field=secret_id nomad/creds/developer)'
alias cl='export CONSUL_HTTP_TOKEN=$(vault read -field=token consul/creds/developer)'
My workflow is:
vl – Log into Vault
nl – Retrieve and set Nomad credentials
cl – Retrieve and set Consul credentials
I decided to keep these as separate aliases to only retrieve the tokens I need. I generally use Vault and Nomad the most. My tokens expire after 1 hour, so this saves me quite a bit of typing.
My latest project involves running an emulated IBM mainframe on a virtual machine scheduled by Nomad. I came across a situation where my TSO session became unresponsive, and I was unable to break out of the error with F13 or LOGOFF from my 3270 terminal.
If you have access to the operator’s console, you can CANCEL the session by issuing /C U=<USER_ID>. This will terminate the session and allow you to log in again through your terminal.
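For example, assuming the hung session belongs to HERC01 (one of the stock TK4- user IDs, used here purely as an illustration), the console command would be:

/C U=HERC01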
When you download a VM image straight from a repository, it is pretty vanilla – there isn’t much configured outside of the defaults.
This guide will create a slim shim that runs when your VM starts up to inject your SSH keys. When you start your VM, you will specify the base image as a disk AND the preseed, which is configured to run at startup via Cloud-Init.
I am running on an Ubuntu 18.04 workstation, but the process should be the same in newer versions and other distributions.
Install Required Packages
We will use a utility called cloud-localds, which is included in the cloud-image-utils package, along with QEMU to run our VM.
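On an Ubuntu host, installing both should look like this (package names assumed for 18.04 and nearby releases):

sudo apt-get update
sudo apt-get install -y cloud-image-utils qemu-kvm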
We will use a CentOS 8 image on an Ubuntu host to wrap our heads around the separate machines. Plus CentOS is a fine server image!
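The image used later in this post can be fetched from the CentOS cloud image repository; the exact URL below is an assumption based on the image name, so adjust it if the mirror layout has changed:

wget https://cloud.centos.org/centos/8/x86_64/images/CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2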
Create your pre-seed
Create a file called cloud-init.cfg and copy the following contents into the file:
#cloud-config
users:
  - name: centos
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: users, admin
    home: /home/centos
    shell: /bin/bash
    lock_passwd: false
    ssh-authorized-keys:
      - <your ssh public key>
ssh_pwauth: false
disable_root: false
chpasswd:
  list: |
    centos:linux
  expire: False
packages:
  - qemu-guest-agent
# written to /var/log/cloud-init-output.log
final_message: "The system is finally up, after $UPTIME seconds"
Hint: Remember the “final_message” statement.
Replace <your ssh public key> with the contents of your public key. This is likely in ~/.ssh/id_rsa.pub. It should be a single line; just paste it right on the line. It should look something like ssh-rsa AAAA...rest-of-key... user@workstation.
Save this file and then execute the following command.
cloud-localds cloud-init.img cloud-init.cfg
This creates a disk image that contains cloud-init data that will configure your virtual machine upon boot.
Store the disk images somewhere!
In order for your VM to run, it must have access to the disk images we just created and downloaded on our workstation.
Since we don’t know where Nomad will place the job within the cluster, we need to put these images into storage that is accessible by all hosts.
In my lab, I run MinIO, an S3-compatible object store. For your purposes you may consider S3, HTTP, or even Git. We will take advantage of Nomad’s artifact stanza to download the image to the appropriate spot on the host system for the virtual machine.
For the sake of this post, assume that you have the files accessible over HTTP.
In my own use, I used a Presigned URL that allowed Nomad to retrieve the objects out of Minio.
I uploaded cloud-init.img to object storage as well as CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2.
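If you also use MinIO, generating those presigned URLs with the mc client looks roughly like this (the homelab alias and vm-images bucket are placeholders for your own setup):

# Upload the images, then presign download URLs valid for 24 hours.
mc cp cloud-init.img homelab/vm-images/
mc cp CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2 homelab/vm-images/
mc share download --expire 24h homelab/vm-images/cloud-init.img
mc share download --expire 24h homelab/vm-images/CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2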
In the example below, replace the artifact source with a real destination.
Create your Nomad Job
Copy and paste this into a file called blog-test-centos8-vm.nomad. Feel free to adjust CPU, memory, or other settings if you know what you’re doing.
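Here is a sketch of the job specification, written against the pre-1.0 Nomad syntax in use at the time; treat it as a starting point, and note that the artifact source URLs are placeholders you must change:

job "blog-test-centos8" {
  datacenters = ["dc1"]
  type        = "service"

  group "blog-test-centos8" {
    count = 1

    task "tk4-mainframe" {
      driver = "qemu"

      config {
        # Base disk image for the VM, downloaded by the artifact stanza below.
        image_path  = "local/CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2"
        accelerator = "kvm"

        # Attach the Cloud-Init preseed as a second drive.
        args = [
          "-drive", "file=local/cloud-init.img,format=raw",
        ]

        # Expose the guest's SSH port.
        port_map {
          ssh = 22
        }
      }

      # Placeholder URLs -- point these at storage your Nomad workers can reach.
      artifact {
        source = "https://example.com/images/CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2"
      }

      artifact {
        source = "https://example.com/images/cloud-init.img"
      }

      resources {
        cpu    = 500
        memory = 1024

        network {
          # Let Nomad pick an ephemeral host port for SSH into the guest.
          port "ssh" {}
        }
      }
    }
  }
}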
Please remember that the artifact sources need to be real locations that your Nomad workers have access to.
This job specification is pretty straightforward, but I’ll call out the points of interest.
task.config.image_path: This should be the base image for your virtual machine.
task.config.args: We leverage the arguments to specify a second image, our Cloud Init preseed. When the VM starts, it will run cloud-init using the data on this image.
task.config.port_map: Without setting up consoles, the only way to access the virtual machine will be through SSH. We will expose this port.
resources.network.port.ssh: Since we are running this on bare metal, the host already allocated port 22 to its SSH daemon. We will let Nomad handle the allocation of an ephemeral port for SSH access to the guest VM. Don’t worry, we can find this port later.
artifact(s): there are two artifact stanzas that source the VM images: the preseed and the base image. These are downloaded before the VM runs.
Run the Nomad Job
Alright, now that we have the job specification written, you can plan and run it. (The -check-index value below comes from the output of nomad plan.)
nomad plan blog-test-centos8-vm.nomad
nomad job run -check-index 290127 blog-test-centos8-vm.nomad
If all is successful, the allocation should start. You can hop into the Nomad UI, click on the allocated task, and view the log output.
System log output highlighting the final_message stanza from our preseed cloud-init.img.
Under the hood, Nomad executes something like the following command on the worker (reconstructed here for illustration; the exact flags will vary). Since Nomad already adds the base image, the trick is to use the args to specify the Cloud-Init preseed image.
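# Approximate invocation -- flags vary with Nomad version and job settings.
qemu-system-x86_64 \
  -machine type=pc,accel=kvm \
  -name blog-test-centos8 \
  -m 1024M \
  -drive file=local/CentOS-8-GenericCloud-8.1.1911-20200113.3.x86_64.qcow2 \
  -nographic \
  -netdev user,id=user.0,hostfwd=tcp::20176-:22 \
  -device virtio-net,netdev=user.0 \
  -drive file=local/cloud-init.img,format=raw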
Alright, now that the instance is running, it’s sitting there waiting for us to log in. If you were to SSH to the IP address directly, you’d be logging into the HOST and not the guest VM. We told Nomad to allocate an ephemeral port; to find out which port to connect to, run the following commands.
$ nomad status blog-test-centos8
ID = blog-test-centos8
Name = blog-test-centos8
Submit Date = 2020-10-27T11:22:29-05:00
Type = service
Priority = 50
Datacenters = dc1
Namespace = default
Status = running
Periodic = false
Parameterized = false
Summary
Task Group Queued Starting Running Failed Complete Lost
blog-test-centos8 0 0 1 1 1 0
Allocations
ID Node ID Task Group Version Desired Status Created Modified
a6587dec 6ea587a2 blog-test-centos8 1 run running 2h51m ago 2h49m ago
Then, to see the specific allocation information, use your own allocation ID from the output above.
$ nomad status a6587dec
ID = a6587dec-f180-8839-c09f-a15a2ce28ce6
Eval ID = 508e0ae2
Name = blog-test-centos8.blog-test-centos8[0]
Node ID = 6ea587a2
Node Name = nuc2
Job ID = blog-test-centos8
Job Version = 1
Client Status = running
Client Description = Tasks are running
Desired Status = run
Desired Description = <none>
Created = 2h59m ago
Modified = 2h57m ago
Task "tk4-mainframe" is "running"
Task Resources
CPU Memory Disk Addresses
78/500 MHz 830 MiB/1.0 GiB 1.0 GiB ssh: 192.168.100.38:20176
Task Events:
Started At = 2020-10-27T18:14:35Z
Finished At = N/A
Total Restarts = 0
Last Restart = N/A
Recent Events:
Time Type Description
2020-10-27T13:14:35-05:00 Started Task started by client
2020-10-27T13:14:27-05:00 Downloading Artifacts Client is downloading artifacts
2020-10-27T13:14:27-05:00 Task Setup Building Task Directory
2020-10-27T13:12:58-05:00 Received Task received by client
In this particular case we have ssh: 192.168.100.38:20176, meaning we can SSH using something like this:
ssh centos@192.168.100.38 -p 20176
Next Steps
You have used some tooling built around Cloud Init to generate a preseed image that can bootstrap your virtual machine image. You used Nomad to create a VM with those images. This is a pretty rudimentary setup, and you can do a lot more.
Head on over to the Cloud Init documentation! You can do a ton with Cloud Init to have your virtual machine come online with the correct configuration.
Sometimes when you’re hacking around (possibly with HashiCorp Waypoint) you will find yourself with a large number of dead jobs. Here’s a small snippet to clean those jobs up. It relies on pattern matching, so it will clean up any dead job whose ID matches the pattern you give it.
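A minimal sketch, assuming your Waypoint-created jobs share a hypothetical waypoint- prefix (adjust the pattern to your own naming) and that nomad job status prints the job ID in the first column and the status in the fourth:

# Stop and purge every dead job whose ID starts with "waypoint-".
nomad job status | awk 'NR > 1 && $4 == "dead" {print $1}' | grep '^waypoint-' | \
  while read -r job; do
    nomad job stop -purge "$job"
  done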