OpenTechShed

Blog of Things

Troubleshooting cloud-init

In a recent blog post, I explained how to set up a modern-day engineer's command center on Ubuntu using cloud-init. While writing and testing the cloud-init configuration, I ran into a few issues that warranted a post on how to troubleshoot cloud-init problems. This blog post describes:

  1. Use of syntax checkers to validate cloud-init configuration.
  2. Log files generated by cloud-init.
  3. Sample errors.

Validating cloud-init configuration

cloud-init configuration files are written in YAML. YAML is a human-friendly data format supported by many programming languages and is commonly used for configuration files.

Validating the syntax of your cloud-init configuration with an online YAML validator is one of the best ways to avoid YAML syntax issues. I use http://yamllint.com/. It's free and easy to use: paste the configuration into the form and click Go. Once the input has been processed, the site shows any errors and highlights the line numbers on which they were found.
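
If you prefer to validate locally instead of pasting configuration into a website, the same check can be run from the shell. A minimal sketch, assuming Python and pip are available (user-data.yaml stands in for whatever file holds your cloud-init configuration):

# Install the yamllint command-line tool and point it at the user-data file
pip install yamllint
yamllint user-data.yaml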

The image below shows an example from the yamllint website where line 23 has a single leading space instead of the expected two.

[Image: yamllint output highlighting the indentation error on line 23]

cloud-init log files

Log files are my best friends when it comes to troubleshooting. cloud-init does a fantastic job here, providing two detailed and quite helpful log files.

/var/log/cloud-init.log

The /var/log/cloud-init.log file contains details on the complete cloud-init process and is an ideal location to view any YAML syntax related issues.

/var/log/cloud-init-output.log

The output from all cloud-init stages is logged to /var/log/cloud-init-output.log. This log contains both errors and success messages. If a download didn't work or the package manager didn't install a package, you will see the messages here.
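
A couple of shell one-liners go a long way when digging through these files. For example:

# Look for warnings, errors, and Python tracebacks in the main log
sudo grep -iE 'warn|error|traceback' /var/log/cloud-init.log

# Follow the output of boot-time commands as they run
sudo tail -f /var/log/cloud-init-output.log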

Sample Errors

Let's look at a few errors in this section of the blog post.

Error 1 – Watch for spacing and syntax
yaml.parser.ParserError: while parsing a block collection
in "<unicode string>", line 21, column 3:
- [pip, install, -U, pip, ansibl ...
^
expected <block end>, but found '<block sequence start>'
in "<unicode string>", line 26, column 4:
- [curl, -o, /tmp/gclisdk.tar.gz ...
^

The error shown above was caused by inconsistent indentation: one runcmd entry started in column 3 (two leading spaces) while the next started in column 4. YAML requires sibling list items to be indented identically, so every entry under a key should be aligned at the same column (two spaces is the usual convention).
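
As a quick illustration (the URL below is just a placeholder), the same two entries aligned at two spaces parse cleanly; assuming python3 with PyYAML is installed, you can confirm it locally:

# Write a small snippet with both runcmd entries aligned identically, then parse it
cat > /tmp/runcmd-snippet.yaml <<'EOF'
runcmd:
  - [ pip, install, -U, pip, ansible ]
  - [ curl, -o, /tmp/gclisdk.tar.gz, "https://example.com/gclisdk.tar.gz" ]
EOF
python3 -c 'import yaml; yaml.safe_load(open("/tmp/runcmd-snippet.yaml")); print("syntax OK")'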

Error 2 – Watch out for special characters
yaml.scanner.ScannerError: while scanning for the next token
found character '|' that cannot start any token
in "<unicode string>", line 27, column 95:
... epos/azure-cli/ $AZ_REPO main", |, tee, /etc/apt/sources.list.d/ 
                                    ^

Special characters need to be quoted. If the | character had been written as '|', YAML would have parsed the line without complaint. For a command that actually needs a shell pipe, it is simpler to write the whole entry as a single quoted string (as the Azure CLI repository line is written later on this page), since cloud-init passes string entries to a shell.

Error 3 – Watch out for missing quotes
"Please check http://pyyaml.org/wiki/YAMLColonInFlowContext for details.")
yaml.scanner.ScannerError: while scanning a plain scalar
in "<unicode string>", line 28, column 17:
- [ curl, -L, https://packages.microsoft.com/k ...
                   ^
found unexpected ':'
in "<unicode string>", line 28, column 22:
- [ curl, -L, https://packages.microsoft.com/keys/m ...

When using URLs, it’s always safe to have them enclosed in double quotes. E.g.:

"https://packages.microsoft.com"

instead of

https://packages.microsoft.com

Error 4 – Misconfigured options in commands
deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main | tee /etc/apt/sources.list.d/azure-cli.list
curl: option -: is unknown
curl: try 'curl --help' or 'curl --manual' for more information

The best way to avoid errors like this is to try the commands manually on your server first and only then add them to the cloud-init configuration.
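
For example, before adding a download step to runcmd, run it once interactively and check the exit status (the URL here is the Terraform release used elsewhere on this blog):

# Run the exact command by hand and verify it succeeded
curl -L -o /tmp/terraform.zip "https://releases.hashicorp.com/terraform/0.11.7/terraform_0.11.7_linux_amd64.zip"
echo $?   # 0 means the download worked; anything else needs fixing before it goes into runcmd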

Happy troubleshooting!

Modern Day Engineer’s Control Center

Network engineers, system administrators, and DevOps engineers have to use various open source software to do their day-to-day jobs. This could be Ansible for server or network configuration, Terraform to deploy and maintain infrastructure, Git for version control, or the CLIs provided by cloud infrastructure providers. Even with small teams, one of the biggest challenges I have seen is making sure that these systems can be set up easily on day one.

I touched on cloud-init briefly in an earlier post, but that relied on downloading a bash script and executing it. This blog post covers the commonly used software and how to build a Linux-based system automatically. It assumes that the system is being built on OpenStack or another cloud provider such as AWS, GCP, Azure, or Digital Ocean, and uses Ubuntu 16.04 as the operating system.

Note:

  • There are other ways such as using Vagrant and Docker containers for this.
  • This solution will not scale well for larger teams as security, etc. is not accounted for.
  • Automatic updates are not captured in this blog post.

The cloud-init configuration used in this blog post is available on GitHub. It has been tested on the following cloud providers and Ubuntu versions.

Digital Ocean – Ubuntu 16.04
Digital Ocean – Ubuntu 18.04

This blog post is divided into two sections.

  1. Software components – an introduction to the software that will be installed.
  2. cloud-init – how cloud-init is used to configure the server after first boot.

Software components

  • Python – Python is a general-purpose programming language that has become a widely adopted choice for automation. Other languages, such as Ruby, are not covered in this blog post.
  • python-pip – PIP is used to install Python modules and software components.
  • Terraform – Terraform provides an easy way to automate infrastructure setup.
  • Ansible – I personally prefer Ansible over other solutions such as Chef and Puppet because of its agentless architecture, which relies on SSH to configure servers and network devices.
  • Git – Git is a version control system that can be either hosted locally or remotely and allows collaboration and software sync.
  • CLI from cloud providers
    • OpenStackClient – OpenStackClient provides CLI based access to manage the OpenStack infrastructure.
    • Azure CLI v2.0 – Used to manage resources on Azure.
    • Google Cloud SDK – Google Cloud SDK to manage resources on Google Cloud Platform.
    • Amazon Web Service CLI – Used to manage resources on Amazon Web Services.
    • doctl – Used to manage resources on Digital Ocean.

cloud-init

The cloud-init website provides a neat one-line description:

The standard for customizing cloud instances

cloud-init works with almost all modern Linux distributions and allows an end user to specify different aspects of the server instance in a user data file. cloud-init reads the user data during boot and configures the server automatically. This process allows instances to be identical clones of each other.

Head over to GitHub to download the cloud-init script. Please note that changes are required to the cloud-init script; the required changes are highlighted in the description below. Using cloud-init, we will be doing the following on our server.

  1. Set up a user that will be able to log in to the server using SSH keys.
  2. Perform a software upgrade on the server.
  3. Install the software from the software component list above.
  4. After installation has completed, post data about the instance to a URL. This could be used for inventory tracking, etc.

While I try to do my best in explaining the script, you can read the cloud-init docs that provide more details.

cloud-init runs as root, so you don't need to prefix your commands with sudo.

User Setup

The first part of the script configures a user and sets up SSH keys for that user. You can add multiple users if required.

#cloud-config
users:
  - name: add-user-here
    groups: sudo
    shell: /bin/bash
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh-authorized-keys:
      - add-key-here

The block shown above will configure a user, add it to the sudo group, assign bash as the shell, update the sudoers file so the user won't need to enter a password when running "sudo -s", and add the user's public key to .ssh/authorized_keys so that passwordless authentication is set up.
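
Once the instance is up, a quick way to confirm the user block worked (add-user-here and the IP below are placeholders for your own values):

# SSH in with the private key matching the public key listed under ssh-authorized-keys
ssh add-user-here@203.0.113.10

# On the server: confirm passwordless sudo was granted
sudo -n true && echo "passwordless sudo OK"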

Package Upgrade

A single line updates the list of available packages and upgrades the server.

package_upgrade: true

Software installation

Additional packages can be installed during first boot using the configuration block shown below.

packages:
  - python
  - python-pip
  - git
  - zip
  - tcpdump

As shown above, python, python-pip, git, zip, and tcpdump will automatically be installed. If you have requirements for other packages, add them to the list above.
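
After first boot, a quick check that the packages actually landed might look like this:

# Installed packages show up with an "ii" status in dpkg
dpkg -l python python-pip git zip tcpdump | grep ^ii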

In order to install the packages that are listed in the software components section, we will be using the runcmd feature that cloud-init provides. This allows us to use standard Linux commands to complete installation of the software.

The configuration shown in the block below will

  • upgrade pip
  • install
    • ansible
    • python-openstackclient
    • awscli

runcmd:
  - [ pip, install, -U, pip, ansible, python-openstackclient, awscli ]

The configuration below will download Terraform 0.11.7 into the /tmp directory and then unzip the contents to /usr/local/bin. I recommend checking https://terraform.io for the latest Terraform release.

  - [ curl, -o, /tmp/terraform.zip, "https://releases.hashicorp.com/terraform/0.11.7/terraform_0.11.7_linux_amd64.zip" ]
  - [ unzip, -d, /usr/local/bin/, /tmp/terraform.zip ]

Installation of Digital Ocean CLI (doctl) is similar to that of Terraform.

  - [ curl, -L, -o, /tmp/doctl.tar.gz, "https://github.com/digitalocean/doctl/releases/download/v1.8.0/doctl-1.8.0-linux-amd64.tar.gz" ]
  - [ tar, -C, /usr/local/bin, -zxvf, /tmp/doctl.tar.gz ]

Installation of the Google Cloud SDK and the Azure CLI is similar. In order to show more of cloud-init's features, the configuration below sets up the apt repositories and installs both the Google Cloud SDK and the Azure CLI using apt.

  
  - CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)"
  - 'echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list'
  - 'curl "https://packages.cloud.google.com/apt/doc/apt-key.gpg" | sudo apt-key add -'
  - AZ_REPO=$(lsb_release -cs)
  - 'echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $AZ_REPO main" | tee /etc/apt/sources.list.d/azure-cli.list'
  - 'apt-key adv --keyserver packages.microsoft.com --recv-keys 52E16F86FEE04B979B07E28DB02C46DF417A0893'
  - 'apt-get update && apt-get -y install apt-transport-https azure-cli'
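
When the instance comes up, a quick smoke test of the installed tooling might look like this (each command simply prints a version):

ansible --version
terraform version
doctl version
openstack --version
aws --version
az --version
gcloud version
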
Phone home

Using the phone_home feature, cloud-init can be configured to send an HTTP POST message to a web server.

phone_home:
  url: https://webhook.site/0d1e9a5f-1559-4197-b12f-2f68146eb5fd/$INSTANCE_ID/
  post: all
  tries: 2

In the example above, we will send an HTTP POST to a Webhook Tester endpoint so that you can view what the HTTP POST from cloud-init looks like. In the real world, you would send a POST message to your own web server that could do additional post-processing.

I have this part of the cloud-init configuration commented out in the GitHub hosted file.

HTTP POST Output

Below is a sample output from the HTTP POST. For readability, the output has been trimmed.

{
  "instance_id": "91472478",
  "hostname": "ubuntu-s-1vcpu-1gb-nyc3-01",
  "pub_key_dsa": "ssh-dss+AAAAB3Nz<trim>Y%3D+root%40ubuntu-s-1vcpu-1gb-nyc3-01%0A",
  "pub_key_ecdsa": "ecdsa-sha2-nistp256+AAAAE2<trim>YQ%3D+root%40ubuntu-s-1vcpu-1gb-nyc3-01%0A",
  "pub_key_rsa": "ssh-rsa+AAAAB<trim>1h+root%40ubuntu-s-1vcpu-1gb-nyc3-01%0A",
  "fqdn": "ubuntu-s-1vcpu-1gb-nyc3-01\r\n"
}

bash and mailgun for ad-hoc reporting

Mailgun provides an easy way to send out emails that can be integrated into your scripts, which in turn helps with automated reporting. Oftentimes we have to run ad-hoc reports that are short-lived (hopefully) but need to be automated so that nobody has to go through the pain of running them manually. Working with large datasets, this becomes a requirement for me every now and then. Most of the time it is someone requesting information that is not exposed through a web interface or API. While exposing the information through a web interface or API is definitely a good long-term solution, in order to be agile, I use Mailgun and a simple shell script to automate the reporting.

In this blog post, we will look at a script that will

  1. Connect to a PostgreSQL database
  2. Execute a select query
  3. Export the results to a CSV file
  4. Email the CSV file to the specified list of email addresses.

You can download the script from GitHub. The script has comments that will guide you through the configuration process. It also has an example of executing the script every Friday at 4 am using cron. If you have written bash scripts in the past, feel free to skip the rest of the article.
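
For reference, a crontab entry for that schedule would look something like this (the path is just an example; point it at wherever you saved the script):

# Run the report every Friday at 04:00
0 4 * * 5 /home/user/psqlReport.sh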

Script Requirements

Following are the requirements to execute the script.

  1. Mailgun Account – Head over to https://www.mailgun.com
  2. curl
  3. psql
  4. Optional – .pgpass setup if using PostgreSQL password authentication (a minimal example follows this list) – see https://blog.sleeplessbeastie.eu/2014/03/23/how-to-non-interactively-provide-password-for-the-postgresql-interactive-terminal/ for details
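
The .pgpass file is a one-line-per-server credentials file that lets psql authenticate without prompting. A minimal sketch, reusing the sample values from later in this post (the password is obviously a placeholder, and 5432 assumes the default PostgreSQL port):

# Store the credentials psql should use (format: host:port:database:user:password)
echo '192.168.1.2:5432:dbname:pgsqluser:replace-with-password' > ~/.pgpass

# psql ignores .pgpass unless it is readable only by its owner
chmod 600 ~/.pgpass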

The sections below describe the variables and commands used within the script.

Email variables

In this section we look at the various email related variables that can be configured in the script.

# Add a comma-separated list of email addresses
to='user1@example.com'

The to variable allows you to configure multiple email addresses to which the email will be delivered.

# The from email address
from='user2@example.com'

The from variable is the email address from which the email will originate. Ideally, this should be a mailbox that you or someone else reads, so there is a follow-up point if someone in the to field responds.

# Subject
subject='My adhoc report'

subject variable allows you to define the subject of the email that is being sent.

#Any text that should be part of the email
text='Please find attached the adhoc report.'

text variable can be used to add text to the body of the email. This can potentially be enhanced by adding information from the SQL query.

#The report file - This should be a friendly name
# of the csv file that is generated by the SQL query
reportFile='givemeafriendlyname.csv'

reportFile is the name of the file that will be sent with the email. The information from SQL query is stored in CSV format in this file and is sent as an attachment.

PostgreSQL Variables

In this section we look at the PostgreSQL variables.

#PostgreSQL Database Server IP/Hostname
pserver='192.168.1.2'

pserver variable holds the IP address of the database server. A hostname can be used as well, as long as DNS or the hosts file is configured properly.

#PostgreSQL Database Name
pdb='dbname'

pdb variable holds the name of the database.

#PostgreSQL Database Server Username
puser='pgsqluser'

puser variable holds the name of the user that will connect to the PostgreSQL database. You might see an error similar to the one below if the user specified is incorrect.

psql: FATAL:  role "postgres1" does not exist

#SQL query to execute; replace with what makes sense to you.
pquery='COPY (SELECT ts AS Timestamp, orderid AS Order, qty AS Quantity, sale AS Sale FROM orderlog LIMIT 10) TO STDOUT WITH CSV HEADER'

pquery holds the SQL statement that the script will execute. In this one, we are retrieving timestamp, orderid, quantity, and sale information from a table called orderlog and the output is being sent to standard output with the CSV headers. This allows us to build a CSV file that can be read easily using an application such as Microsoft Excel.
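
Before scheduling the script, it's worth running the query once by hand to make sure the CSV looks right; assuming the same variables are set in your shell, something like:

# Reuse the script's variables and eyeball the first few CSV rows
psql -U "$puser" -h "$pserver" -d "$pdb" -c "$pquery" | head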

Mailgun Variables

This section describes the Mailgun variables. In order to get the values, log in to the Mailgun dashboard, click on Domains, and then select the domain that you want to use for the script. All the information is in the "Domain Information" section.

# Mailgun API key
mapikey='key-REPLACE-WITH-YOUR-OWN-KEY'

mapikey variable holds the API key. It is used for authentication as part of the curl command, which is described in the next section of this blog post.

# Mailgun domain - Replace with your domain name
mdomain='mailgun.example.com'

mdomain variable holds the domain name that you configured in mailgun.

# Mailgun URL
murl="https://api.mailgun.net/v3/$mdomain/messages"

murl variable is computed from mdomain. It is the URL that curl will send a POST message to in order to send the email. This variable doesn't need to be changed.

PostgreSQL and curl commands

The meat of the script consists of two commands.

# Report has header as per the query.
output=$(psql -U $puser -h $pserver -d $pdb -c "$pquery" > $reportFile)

Execute the psql command and store the output to the report file.

# Send mail via mailgun
sendmail=$(curl -s --user api:$mapikey $murl -F from="$from" -F to="$to" -F subject="$subject" -F text="$text" -F h:Reply-To="$from" -F attachment="@$reportFile")

Execute the curl command and POST to Mailgun. There are three main curl options being used in the script.

  • -s – makes curl silent so it doesn't show progress or errors.
  • --user – sends api as the username and $mapikey as the password for server authentication.
  • -F – lets curl emulate filling in form fields. In this script the form fields are from, to, subject, text, Reply-To, and attachment. The Mailgun documentation has details on all the fields that can be sent using HTTP POST.

Troubleshooting

One of the easiest ways to troubleshoot bash scripts is to run them with the -x option.

bash -x psqlReport.sh

will print details on script execution and will allow you to debug issues.
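
If the email never shows up, the response Mailgun sent back is already captured in the sendmail variable, so printing it at the end of the script is an easy first check:

# Print the raw API response; an error message here usually explains what went wrong
echo "Mailgun response: $sendmail"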

Potential enhancements

A few potential enhancements I would like to make revolve around a subroutine that gets triggered if there is an error during the psql query or the Mailgun API call. In case of an error:

  1. Make the script more robust, e.g. still try to send an email if the psql query fails.
  2. Store the error in a log file.
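
A minimal sketch of what that could look like, reusing the variables already defined in the script (the log path is just an example):

# Run the query; if psql fails, note it in a log file and adjust the email body
if ! psql -U "$puser" -h "$pserver" -d "$pdb" -c "$pquery" > "$reportFile"; then
    echo "$(date) psql query failed" >> /var/log/adhoc-report.log
    text='The ad-hoc report query failed; see /var/log/adhoc-report.log for details.'
fi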

Hope this helps in your quest to generate ad-hoc reports!

Export logs from FortiGate to FTP server

This is a quick post on how to export logs from a FortiGate to an FTP server. On FortiGates running FortiOS 5.6 and above, one can easily transfer all logs from memory to an FTP server.

How about secure copy, Fortinet?

This is useful if you don't have access to a syslog server or would like to review the logs on a server rather than in FortiOS. Use the command below to transfer all the logs.

execute backup memory alllogs ftp 192.168.1.2 ftpuser ftppassword 
tlog memory log is empty.
Please wait...

Connect to ftp server 192.168.1.2 ...
Sent log file _elog.mlog to ftp server as mem_elog_FGTSERIALNUMBER_root_20180418_214705_mlog OK.
vlog memory log is empty.
wlog memory log is empty.
alog memory log is empty.
slog memory log is empty.
mlog memory log is empty.
plog memory log is empty.
dlog memory log is empty.
rlog memory log is empty.
flog memory log is empty.
olog memory log is empty.

For devices with a disk, replace memory with disk.

execute backup disk alllogs ftp

For FortiGates with a disk, you can specify the type of logs you want to export to FTP server as well.

E.g.:

execute backup disk log ftp

where the log type could be one of the following:

traffic, event, virus, webfilter, ips, emailfilter, anomaly, voip, dlp, app-ctrl, waf, dns

FortiGate Virtual Machine Config Drive – Missing Pieces

With FortiOS version 5.4.1 and above, Fortinet added support for initial configuration of a FortiGate virtual machine by attaching a cloud-init config drive. When the FortiGate VM powers up for the first time, it will automatically read the data from the config drive and apply both license and configuration to the FortiGate. This is an excellent way to automate deployments of FortiGate virtual machines in production or lab environments. You can read more about the config drive support and how to use one at http://cookbook.fortinet.com/config-drive-esx-vcenter-vmware-5-4/. In this blog post, I will try to capture some of the missing pieces and also provide pointers on how to troubleshoot.

View Full Post

Installing VMware vSphere SDK for Perl v6.5 on Ubuntu 14.04

VMware has a guide available at https://pubs.vmware.com/vsphere-65/topic/com.vmware.ICbase/PDF/vsphere-perl-sdk-65-installation-guide.pdf which, if followed carefully, will let you install the vSphere SDK for Perl without any issues. I, unfortunately, didn't follow the guide properly and ran into some issues, which I have documented here. This blog post captures:

  1. How to install vSphere SDK for Perl on Ubuntu 14.04
  2. Issues encountered

View Full Post

FortiOS 5.4 automatically repeat commands using auto-script

FortiOS 5.4 introduced a long-awaited feature called auto-script. Head over to http://help.fortinet.com/fos50hlp/54/index.htm and then "5.4 What's New" if you are interested in learning more. For those of us who have worked on Cisco routers and used aliases or the EEM feature, auto-script sits somewhere in between the two. It allows commands to be executed either once or periodically, and I see it as a great addition to the feature set, especially when it comes to collecting lots of information quickly. This blog post captures:

  1. How to configure auto-script feature
  2. How to execute a script
  3. How to view the results
  4. How to upload results to an FTP server
  5. Maximum limit
  6. Few features that I would like to see in future FortiOS releases

View Full Post

Using esxcli to add port groups and vlans in bulk

Introduction

esxcli is a command line tool that can be used to manage a VMware ESXi host. In my opinion, it's a good way to learn more about the inner workings of ESXi, and it can be used in scripts for automating tasks. In this blog post, I will show you how to use esxcli to add port groups and VLANs to vSwitch0 of an ESXi host.

View Full Post

Testing DSCP using ping tcpdump and tshark

Introduction and Setup

If you came here via a search engine, chances are that you're looking for a quick and dirty way of testing DSCP on your network. Differentiated Services, described in RFC 2474 and RFC 2475, provides a way to mark, prioritize, and police IP flows based on various attributes. This allows network operators to maintain different levels of QoS on their networks.

This post captures details on how to generate traffic from a client with different DSCP fields set and verify that they are received on the server side.

View Full Post

bash function/alias for ssh connectivity

In the home or work lab, I often have to connect to various devices that are either temporary or don't support SSH keys. In my home lab, I typically set up all the lab equipment with a standard username and password, which allows me to connect to them quickly. As almost all devices these days support SSH, I set up a bash function that acts as an alias, allowing me to quickly connect to a device over SSH from either my Mac or Linux desktop.

View Full Post
