OpenTechShed

Blog of Things

Category: cloud-init

Troubleshooting cloud-init

In a recent blog post, I explained how to set up a modern-day engineer’s command center on Ubuntu using cloud-init. While writing and testing the cloud-init configuration, I came across a few issues that prompted me to write a post on troubleshooting cloud-init problems. This blog post covers:

  1. Use of syntax checkers to validate cloud-init configuration.
  2. Log files generated by cloud-init.
  3. Sample errors

Validating cloud-init configuration

cloud-init configuration files are written in YAML, a human-readable data format that is supported by most programming languages and is commonly used for configuration files.

Running the cloud-init configuration through a YAML validator before booting a server is the easiest way to avoid YAML syntax problems. I use the online validator at http://yamllint.com/, which is free and simple to use: paste the configuration into the form and click Go. The site then reports any errors and highlights the line numbers on which they were found.

The image below shows an example from the yamllint website in which line 23 is indented with a single leading space instead of two.

yaml-validator
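The same check can also be run locally. The tracebacks later in this post come from a Python YAML parser, so a minimal sketch using the third-party PyYAML package (an assumption: it must be installed, for example with pip) can report the first syntax error and where it occurs. The function name find_yaml_error and the sample snippets are made up for illustration.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)


def find_yaml_error(text):
    """Return None if text parses as YAML, else a short description of the first error."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        mark = getattr(exc, "problem_mark", None)
        problem = getattr(exc, "problem", None) or str(exc)
        if mark is None:
            return problem
        # PyYAML counts lines and columns from zero; report them from one.
        return f"line {mark.line + 1}, column {mark.column + 1}: {problem}"
    return None


# Hypothetical snippets: one valid, one with a mis-indented list entry,
# one with a bare pipe character, and the pipe quoted.
good = "#cloud-config\nruncmd:\n  - [ echo, one ]\n  - [ echo, two ]\n"
bad_indent = "#cloud-config\nruncmd:\n  - [ echo, one ]\n   - [ echo, two ]\n"
bad_pipe = "#cloud-config\nruncmd:\n  - [ echo, hello, |, tee, /tmp/out ]\n"
quoted_pipe = "#cloud-config\nruncmd:\n  - [ echo, hello, '|', tee, /tmp/out ]\n"

for name, text in [("good", good), ("bad_indent", bad_indent),
                   ("bad_pipe", bad_pipe), ("quoted_pipe", quoted_pipe)]:
    print(name, "->", find_yaml_error(text))
```

This only checks that the text is valid YAML. It does not check that the keys are ones cloud-init understands.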

Log files generated by cloud-init

Log files are my best friends when it comes to troubleshooting. cloud-init does a fantastic job here by providing two detailed and helpful log files.

/var/log/cloud-init.log

The /var/log/cloud-init.log file records the complete cloud-init run and is the best place to look for YAML syntax errors.

/var/log/cloud-init-output.log

The output from all cloud-init stages is logged to /var/log/cloud-init-output.log. This log contains both error and success messages. If a download failed or the package manager could not install a package, you will see the messages here.
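Both logs can be long, so it helps to pull out only the lines that look like problems. The following is a rough sketch rather than an official tool: the function names are made up, the patterns it searches for (a bracketed WARNING, ERROR or CRITICAL level, or a Python Traceback) are assumptions about what is worth flagging, and the sample lines are invented for illustration.

```python
import re

# Assumed markers of trouble: a bracketed log level or a Python traceback.
PROBLEM = re.compile(r"\[(WARNING|ERROR|CRITICAL)\]|Traceback")


def problem_lines(lines):
    """Return (line_number, text) pairs for lines that match PROBLEM."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PROBLEM.search(line)
    ]


def scan_log(path="/var/log/cloud-init.log"):
    """Scan a log file on disk; reading it may require root privileges."""
    with open(path, errors="replace") as handle:
        return problem_lines(handle)


# Invented sample lines, only to show the filtering.
sample = [
    "2018-06-01 10:00:00,000 - util.py[DEBUG]: Running command ['ls']\n",
    "2018-06-01 10:00:01,000 - util.py[WARNING]: Failed loading yaml blob\n",
    "Traceback (most recent call last):\n",
]

for number, text in problem_lines(sample):
    print(number, text)
```

On a server, calling scan_log() with either log path returns the matching lines together with their line numbers, which makes it easy to jump to the right place in the full log.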

Sample Errors

Let’s look at a few sample errors.

Error 1 – Watch for spacing and syntax
yaml.parser.ParserError: while parsing a block collection
in "<unicode string>", line 21, column 3:
- [pip, install, -U, pip, ansibl ...
^
expected <block end>, but found '<block sequence start>'
in "<unicode string>", line 26, column 4:
- [curl, -o, /tmp/gclisdk.tar.gz ...
^

The error shown above was caused by inconsistent indentation: the entry on line 26 starts in column 4, while the other entries in the list that begins on line 21 start in column 3. All entries of a YAML list must be indented by the same number of spaces.

Error 2 – Watch out for special characters
yaml.scanner.ScannerError: while scanning for the next token
found character '|' that cannot start any token
in "<unicode string>", line 27, column 95:
... epos/azure-cli/ $AZ_REPO main", |, tee, /etc/apt/sources.list.d/ 
                                    ^

Special characters such as | need to be enclosed in quotes. Had the | character been written as '|', the parser would not have raised this error.

Error 3 – Watch out for missing quotes
"Please check http://pyyaml.org/wiki/YAMLColonInFlowContext for details.")
yaml.scanner.ScannerError: while scanning a plain scalar
in "<unicode string>", line 28, column 17:
- [ curl, -L, https://packages.microsoft.com/k ...
                   ^
found unexpected ':'
in "<unicode string>", line 28, column 22:
- [ curl, -L, https://packages.microsoft.com/keys/m ...

When using URLs, it is always safest to enclose them in double quotes. For example, use

"https://packages.microsoft.com"

instead of

https://packages.microsoft.com

Error 4 – Misconfigured options in commands
deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main | tee /etc/apt/sources.list.d/azure-cli.list
curl: option -: is unknown
curl: try 'curl --help' or 'curl --manual' for more information

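In this case the YAML was valid but the commands did not behave as intended: the | and tee were printed as plain text rather than run as a pipe, and curl was handed an option it did not understand. The best way to avoid this is to try the commands manually on your server first and only then use them in the cloud-init configuration.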
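The first line of the Error 4 output shows the | being printed instead of acting as a pipe. cloud-init runs a runcmd entry written as a list directly, without a shell, so shell syntax such as | is just another argument, whereas an entry written as a single string is interpreted by sh. The Python sketch below is only an analogy built on subprocess, not cloud-init code. It assumes that echo and tr are available on the system, and the helper names are made up.

```python
import subprocess


def run_list(args):
    """Run a command given as a list of arguments; no shell is involved."""
    done = subprocess.run(args, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def run_string(command):
    """Run a command given as one string; the shell interprets pipes and quotes."""
    done = subprocess.run(command, shell=True, capture_output=True,
                          text=True, check=True)
    return done.stdout.strip()


# List form: "|" reaches echo as an ordinary argument and is printed back.
list_output = run_list(["echo", "deb", "main", "|", "tr", "a-z", "A-Z"])

# String form: the shell builds a real pipeline, so tr upper-cases the text.
string_output = run_string("echo deb main | tr a-z A-Z")

print(list_output)
print(string_output)
```

In cloud-init terms, a command that needs a pipe or a redirect should be written as a single quoted string rather than as a list.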

Happy troubleshooting!

Modern Day Engineer’s Control Center

Network engineers, system administrators, and DevOps engineers rely on a range of open source software to do their day-to-day jobs: Ansible for server or network configuration, Terraform to deploy and maintain infrastructure, Git for version control, or the CLIs provided by cloud infrastructure providers. Even with small teams, one of the biggest challenges that I have seen is making sure that these systems can be easily set up on day 1.

I touched on cloud-init briefly in an earlier post, but that relied on downloading a bash script and executing it. This blog post covers the commonly used software in more detail and shows how to build a Linux-based system automatically. It assumes that the system is being built on OpenStack or any other cloud provider infrastructure such as AWS, GCP, Azure, or Digital Ocean and uses Ubuntu 16.04 as the operating system.

Note:

  • There are other ways such as using Vagrant and Docker containers for this.
  • This solution will not scale well for larger teams as security, etc. is not accounted for.
  • Automatic updates are not captured in this blog post.

The cloud-init configuration used in this blog post is available on GitHub. It has been tested on the following cloud providers and Ubuntu versions.

Digital Ocean – Ubuntu 16.04
Digital Ocean – Ubuntu 18.04

This blog post is divided into two sections.

  1. Software components – provides an introduction to the software that will be installed.
  2. Cloud-init – We look at cloud-init and describe how it is used to configure the server after first boot.

Software components

  • Python – Python is a general-purpose programming language that has become a widely adopted choice for automation. Other languages, such as Ruby, are not covered in this blog post.
  • python-pip – PIP is used to install Python modules and software components.
  • Terraform – Terraform provides an easy way to automate infrastructure setup.
  • Ansible – I personally prefer Ansible over other solutions such as Chef and Puppet because of its clientless architecture, which relies on SSH to configure servers and network devices.
  • Git – Git is a version control system that can be either hosted locally or remotely and allows collaboration and software sync.
  • CLI from cloud providers
    • OpenStackClient – OpenStackClient provides CLI based access to manage the OpenStack infrastructure.
    • Azure CLI v2.0 – Used to manage resources on Azure.
    • Google Cloud SDK – Google Cloud SDK to manage resources on Google Cloud Platform.
    • Amazon Web Service CLI – Used to manage resources on Amazon Web Services.
    • doctl – Used to manage resources on Digital Ocean.

cloud-init

The cloud-init website provides a neat single-line description:

The standard for customizing cloud instances

cloud-init works with almost all modern Linux operating systems and allows an end user to specify different aspects of the server instance by using the user data file. cloud-init then reads the user data during bootup and configures the server automatically. This process allows instances to be identical clones of one another.

Head over to GitHub to download the cloud-init script. Please note that the script needs some changes before use; the required changes are highlighted in the descriptions below. Using cloud-init, we will do the following on our server.

  1. Set up a user that will be able to log in to the server using SSH keys.
  2. Perform a software upgrade on the server.
  3. Install the software from the software component list above.
  4. After installation has completed, post to a URL, sending data about the instance. This could be used for inventory tracking, etc.

I do my best to explain the script below; the cloud-init docs provide more detail.

cloud-init runs as root so you don’t need to provide sudo in your commands.

User Setup

The first part of the script configures a user and sets up SSH keys for that user. You can add multiple users if required.

#cloud-config
users:
  - name: add-user-here
    groups: sudo
    shell: /bin/bash
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh-authorized-keys:
      - add-key-here

The block shown above configures a user, adds it to the sudo group, sets bash as the login shell, updates the sudoers configuration so that the user is not asked for a password when running sudo (for example, “sudo -s”), and adds the user’s public key to the .ssh/authorized_keys file so that key-based, passwordless login is set up.

Package Upgrade

A single line will update the available packages list and will also upgrade the server.

package_upgrade: true

Software installation

Additional packages can be installed during first boot using the configuration block shown below.

packages:
  - python
  - python-pip
  - git
  - zip
  - tcpdump

As shown above, python, python-pip, git, zip, and tcpdump will automatically be installed. If you have requirements for other packages, add them to the list above.

In order to install the packages that are listed in the software components section, we will be using the runcmd feature that cloud-init provides. This allows us to use standard Linux commands to complete installation of the software.

The configuration shown in the block below will

  • upgrade pip
  • install
    • ansible
    • python-openstackclient
    • awscli

runcmd:
  - [ pip, install, -U, pip, ansible, python-openstackclient, awscli ]

The configuration below downloads Terraform 0.11.7 into the /tmp directory and then unzips the archive into /usr/local/bin. I recommend checking https://terraform.io for the latest Terraform release.

  - [ curl, -o, /tmp/terraform.zip, "https://releases.hashicorp.com/terraform/0.11.7/terraform_0.11.7_linux_amd64.zip" ]
  - [ unzip, -d, /usr/local/bin/, /tmp/terraform.zip ]

Installation of Digital Ocean CLI (doctl) is similar to that of Terraform.

  - [ curl, -L, -o, /tmp/doctl.tar.gz, "https://github.com/digitalocean/doctl/releases/download/v1.8.0/doctl-1.8.0-linux-amd64.tar.gz" ]
  - [ tar, -C, /usr/local/bin, -zxvf, /tmp/doctl.tar.gz ]
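Because the version number appears twice in each download URL, it is easy to update one occurrence and forget the other. The sketch below builds the URLs from a single version string and renders the matching runcmd entry with every argument double-quoted. The URL patterns are inferred only from the two URLs shown above and may not hold for other releases, and all function names are made up for illustration.

```python
import json


def terraform_zip_url(version, os_name="linux", arch="amd64"):
    """Build a Terraform download URL following the pattern used in this post."""
    return (f"https://releases.hashicorp.com/terraform/{version}/"
            f"terraform_{version}_{os_name}_{arch}.zip")


def doctl_tarball_url(version, os_name="linux", arch="amd64"):
    """Build a doctl download URL following the pattern used in this post."""
    return (f"https://github.com/digitalocean/doctl/releases/download/"
            f"v{version}/doctl-{version}-{os_name}-{arch}.tar.gz")


def runcmd_entry(*args):
    """Render one runcmd list entry; a JSON array is also a valid YAML list."""
    return "  - " + json.dumps(list(args))


print(runcmd_entry("curl", "-o", "/tmp/terraform.zip", terraform_zip_url("0.11.7")))
print(runcmd_entry("curl", "-L", "-o", "/tmp/doctl.tar.gz", doctl_tarball_url("1.8.0")))
```

Quoting every argument is more than strictly necessary, but it sidesteps the quoting problems that an unquoted URL or special character can cause in a YAML list.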

The Google Cloud SDK and the Azure CLI are installed in a similar way to each other. In order to show more of what cloud-init can do, the block below sets up the apt repositories and installs both the Google Cloud SDK and the Azure CLI using apt.

  
  - CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)"
  - 'echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list'
  - 'curl "https://packages.cloud.google.com/apt/doc/apt-key.gpg" | sudo apt-key add -'
  - AZ_REPO=$(lsb_release -cs)
  - 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | tee /etc/apt/sources.list.d/azure-cli.list'
  - 'apt-key adv --keyserver packages.microsoft.com --recv-keys 52E16F86FEE04B979B07E28DB02C46DF417A0893'
  - 'apt-get update && apt-get -y install apt-transport-https google-cloud-sdk azure-cli'

Phone home

Using the phone_home feature, cloud-init can be configured to send an HTTP POST message to a web server.

phone_home:
  url: https://webhook.site/0d1e9a5f-1559-4197-b12f-2f68146eb5fd/$INSTANCE_ID/
  post: all
  tries: 2

In the example above, we will send an HTTP POST to a Webhook Tester endpoint so that you can view what the HTTP POST from cloud-init looks like. In the real world, you would send a POST message to your own web server that could do additional post-processing.

I have this part of the cloud-init configuration commented out in the GitHub hosted file.

HTTP POST Output

Below is a sample of the data sent by the HTTP POST. The output has been trimmed for readability.

{
  "instance_id": "91472478",
  "hostname": "ubuntu-s-1vcpu-1gb-nyc3-01",
  "pub_key_dsa": "ssh-dss+AAAAB3Nz<trim>Y%3D+root%40ubuntu-s-1vcpu-1gb-nyc3-01%0A",
  "pub_key_ecdsa": "ecdsa-sha2-nistp256+AAAAE2<trim>YQ%3D+root%40ubuntu-s-1vcpu-1gb-nyc3-01%0A",
  "pub_key_rsa": "ssh-rsa+AAAAB<trim>1h+root%40ubuntu-s-1vcpu-1gb-nyc3-01%0A",
  "fqdn": "ubuntu-s-1vcpu-1gb-nyc3-01\r\n"
}
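The values in the sample above are still URL-encoded (for example %40 for @ and %0A for a newline). On your own web server, the POST body would have to be decoded before use. The following is a minimal sketch, assuming the body arrives as a standard form-encoded string. The function name and the sample body are made up, using only untrimmed values from the output above.

```python
from urllib.parse import parse_qs


def parse_phone_home(body):
    """Decode a form-encoded POST body into a flat dict of stripped strings."""
    fields = parse_qs(body, keep_blank_values=True)
    # parse_qs returns a list per key; keep the first value and trim newlines.
    return {key: values[0].strip() for key, values in fields.items()}


# Hypothetical body rebuilt from a few fields of the sample output.
sample_body = (
    "instance_id=91472478"
    "&hostname=ubuntu-s-1vcpu-1gb-nyc3-01"
    "&fqdn=ubuntu-s-1vcpu-1gb-nyc3-01%0D%0A"
)

instance = parse_phone_home(sample_body)
print(instance)
```

A real receiver would wrap this in an HTTP handler and store the result, for example in an inventory database.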