Release Day: Automating the Linux Distro (featuring Ubuntu 20.04)

Hi there! I’m Brad, a Systems Engineer at Linode. Today marks an eventful day in the Linux world with the release of Ubuntu 20.04, the next LTS (long-term support) version of one of the most popular distributions. As with Ubuntu 18.04 two years ago, we expect this to become our most widely deployed image, and it is already available to all Linode customers.

Same-day distro availability has become the norm here at Linode, thanks to the automation that we’ve developed around building, testing, and deploying new images. But just what is an image (or a template as they are sometimes called), and what goes into making them? To answer that question, we’ll start by looking at how OS installations typically work.

Installing an OS

You may be familiar with the process of manually installing an operating system on a home computer or a server. It usually involves downloading an installation ISO, burning it to a CD or USB drive, and booting the system from that. From there you typically run through a series of menus (or sometimes manual commands) which allow you to control various aspects of the installation, such as which hard drive to install to, what software/packages you’d like to include, and perhaps some customization such as setting a username and password. When everything is done, you’ve got a (hopefully) fully-working OS that you can boot into and start using.

This process works well enough for individual users performing one-off installations, but if you need to install more than a small handful of systems, then the manual installation process is simply not feasible. This is where images come in.

What’s an Image?

An image is essentially a pre-installed OS that you can deploy to as many systems as you need. It works by performing a normal installation once and then making a copy of that installation that you can later “paste” onto another system. There are a variety of ways and formats in which you can store images for later use, but for the most part they are exact byte-for-byte copies of the original installation.

Now we only need to perform a manual installation once, and we can reuse it everywhere else. But we can still do better. Linode supports a wide variety of distributions, ranging from the usual suspects (Debian, Ubuntu, CentOS, openSUSE) to some that you might not find on other providers (Alpine, Arch, and even Gentoo). Each one has its own release schedule and support lifetime. Performing manual installations for all of our supported distros (even just once) and verifying that the resulting images work and don’t contain any mistakes would take an enormous amount of time and be very error-prone. So instead we’ve chosen to automate the image building process itself, with the help of a wonderful tool called Packer, made by HashiCorp.

Automating the Build

Even though every Linux distribution shares a common base (namely, the Linux kernel), they are all very different from one another, including the way in which they are installed. Some distros use command-line instructions, some use menu interfaces, and some include ways of automatically navigating and supplying answers to those menus. Fortunately, Packer is a very versatile tool and can handle all of these use-cases.

The first thing we do is instruct Packer to create a virtual machine (or VM) that resembles a Linode as closely as possible. This means emulating the same “hardware” and features that we use for actual Linodes. This way, the installation will be performed in an environment that closely resembles the final runtime environment. Imagine installing an OS to a USB drive and then using that drive to boot a completely different computer. If you’re lucky then things might just work, but more often than not some hardware device won’t be detected or maybe it won’t boot at all. By having an installation environment that matches the actual running environment, we eliminate these problems.

Once the VM is created, Packer boots it from the distro’s installation ISO, which it fetches from a specified URL. The process from here varies widely between distros. For command-driven distros such as Arch, we feed the VM a bash script that performs a basic installation. For menu-driven distros such as Ubuntu, we use the distribution’s preferred method of supplying answers to the installer (typically either Preseed on Debian-like distros or Kickstart on RHEL-like distros). In addition to our VM, Packer also creates a tiny HTTP server which allows any needed files to be transferred into the VM.

All of this is controlled via a JSON file which defines the settings and build options that Packer will use. To initiate a build, we simply need to run (for example): packer build ubuntu-20.04.json.
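To give a feel for what such a file might contain, here is a minimal sketch in Python that generates a JSON template for Packer’s QEMU builder. It is an illustration only: the article does not say which builder or settings Linode uses, and the ISO URL, checksum, password, and boot command below are placeholders rather than real values.

```python
import json


def make_template(name, iso_url, iso_checksum):
    """Build a dict in the shape of a JSON Packer template using the QEMU builder."""
    builder = {
        "type": "qemu",
        "vm_name": f"{name}.img",
        "output_directory": f"output-{name}",
        "format": "raw",            # plain disk image rather than qcow2
        "headless": True,
        "iso_url": iso_url,         # where Packer fetches the installer ISO
        "iso_checksum": iso_checksum,
        "http_directory": "http",   # directory served by Packer's built-in HTTP server
        # Illustrative only: real boot commands are distro-specific.
        "boot_command": [
            "<esc><wait>",
            "auto url=http://{{ .HTTPIP }}:{{ .HTTPPort }}/preseed.cfg<enter>",
        ],
        "ssh_username": "root",
        "ssh_password": "placeholder-password",
        "shutdown_command": "shutdown -P now",
    }
    return {"builders": [builder]}


if __name__ == "__main__":
    template = make_template(
        "ubuntu-20.04",
        "https://example.invalid/ubuntu-20.04.iso",  # placeholder URL
        "sha256:<checksum goes here>",               # placeholder checksum
    )
    print(json.dumps(template, indent=2))
```

Saving the printed JSON as ubuntu-20.04.json would give a file of the kind passed to packer build, although a real template needs a working boot command and answer file for the distro in question.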

Customization

For the most part, we perform installations that are as vanilla as possible. This means installing whatever packages the given distro considers to be “default” (sometimes referred to as “base” or “standard”). In addition to these default packages, we also install a small handful of what we call “support” packages: basic utilities such as iotop, mtr, and sysstat which can help debug any issues that might arise. As a result, the Linode support team can also reasonably assume these tools are installed while assisting customers.

After the installation is finished but before the VM is shut down, we make a few final customizations to ensure proper functionality with all of the features of the Linode platform. For instance, we ensure the bootloader is configured with the correct settings for LISH (our tool for out-of-band console access). In general though, we try to keep things as close to their defaults as we can. This way, a user who prefers a specific distro will get what they are familiar with and it won’t feel like driving someone else’s car.

Packaging

After the installation and configuration are finished, Packer shuts down the VM and exports its disk image to a file. It may sound like we’re done, but there are still a few steps left. The hard drive of a typical computer starts with a partition table (either MBR, or GPT on newer systems). Linodes are a bit unusual in that the GRUB bootloader itself lives on our hosts rather than on each Linode (the configuration is still read from the Linode, however). This means that we can strip the partition table completely, leaving us with just a single partition.

To accomplish this, we run fdisk -l disk.img on the disk image to determine where the partition starts and ends, and what the block size is. We then use dd if=disk.img of=part.img bs=### skip=### count=### to “forklift” the partition out using the starting offset from our previous command. More specifically, each “###” in that command gets replaced with output from our earlier fdisk command.
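The substitution can be sketched in Python as follows. This is a rough illustration, not our actual tooling: the sample fdisk output is made up, the function name is hypothetical, and the parser only handles the first partition row of an image named disk.img.

```python
import re

# Made-up sample of `fdisk -l disk.img` output, used only for illustration.
SAMPLE_FDISK_OUTPUT = """\
Disk disk.img: 2 GiB, 2147483648 bytes, 4194304 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x12345678

Device     Boot Start     End Sectors Size Id Type
disk.img1  *     2048 4194303 4192256   2G 83 Linux
"""


def dd_command(fdisk_output, image="disk.img", out="part.img"):
    """Return the dd argument list that copies the first partition out of image."""
    units = re.search(r"^Units: .* = (\d+) bytes", fdisk_output, re.MULTILINE)
    if units is None:
        raise ValueError("could not find the block size in the fdisk output")
    block_size = int(units.group(1))

    for line in fdisk_output.splitlines():
        if not line.startswith(image):
            continue  # only partition rows start with the image name
        # Drop the device name and the optional "*" boot flag.
        fields = [f for f in line.split()[1:] if f != "*"]
        start, end = int(fields[0]), int(fields[1])
        return [
            "dd", f"if={image}", f"of={out}",
            f"bs={block_size}", f"skip={start}", f"count={end - start + 1}",
        ]
    raise ValueError("no partition row found in the fdisk output")


if __name__ == "__main__":
    print(" ".join(dd_command(SAMPLE_FDISK_OUTPUT)))
```

For the sample above this prints dd if=disk.img of=part.img bs=512 skip=2048 count=4192256: the block size comes from the Units line, skip is the partition’s start sector, and count is the number of sectors from start to end inclusive.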

Just like on a new computer, most of the space on the drive will be empty, and this will be reflected in our disk image. It would be silly to copy all these “empty” bytes around, so the next thing we do is deflate the image (later, when you deploy an image to your Linode, we re-inflate it to fill the available space on your instance). Since all of our images use ext4 filesystems, we can run resize2fs -M part.img, which will automatically shrink our image down to its smallest possible size by removing the empty space. Finally, to ensure the integrity of the resulting image, we perform a final fsck on it before compressing it with gzip.
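The order of these steps can be sketched as a small Python wrapper. Treat it as an assumption-laden outline rather than our real pipeline: the function names are hypothetical, e2fsck stands in for the generic fsck mentioned above, and the extra check before resizing is added because resize2fs normally expects a freshly checked filesystem.

```python
import subprocess


def packaging_commands(part="part.img"):
    """Return the shrink, verify, and compress commands in the order they run."""
    return [
        ["e2fsck", "-f", "-y", part],  # assumption: pre-check so resize2fs will proceed
        ["resize2fs", "-M", part],     # shrink the filesystem to its minimum size
        ["e2fsck", "-f", "-n", part],  # final read-only integrity check
        ["gzip", part],                # compress; produces part.img.gz
    ]


def run_all(commands, dry_run=True):
    """Print each command, or execute it and stop at the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on any non-zero exit status.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(packaging_commands())  # dry run: only prints the commands
```

The dry run is the default so the sketch can be read and executed without touching a real image; passing dry_run=False requires the e2fsprogs tools and gzip to be installed.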

Testing

After the image is built and prepped, the next step in the process is to make sure it actually works. Our newly minted image gets deployed to a testing environment, where a bunch of Linode instances get provisioned from the image in a number of different configurations. We have developed an automated test suite that checks things such as network connectivity and a working package manager, as well as various features of the Linode platform such as Backups and disk resizing; we throw the book at these instances. If any check fails, the process is aborted immediately and the build fails with details about which check failed and why.
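The fail-fast behavior can be sketched in a few lines of Python. Everything here is hypothetical: the check functions, the dictionary standing in for a test instance, and the CheckFailed exception are stand-ins, not our actual test suite.

```python
class CheckFailed(Exception):
    """Raised as soon as one check fails, carrying the check name and reason."""


def check_network(instance):
    # Each check returns (passed, reason shown if it failed).
    return instance.get("ping_ok", False), "no reply to ping"


def check_package_manager(instance):
    return instance.get("pkg_update_ok", False), "package index update failed"


# Adding a new check means appending one (name, function) pair to this list.
CHECKS = [
    ("network connectivity", check_network),
    ("package manager", check_package_manager),
]


def run_checks(instance, checks=CHECKS):
    """Run checks in order; abort on the first failure with its details."""
    for name, check in checks:
        passed, reason = check(instance)
        if not passed:
            raise CheckFailed(f"{name}: {reason}")
    return len(checks)


if __name__ == "__main__":
    healthy = {"ping_ok": True, "pkg_update_ok": True}
    print(f"{run_checks(healthy)} checks passed")
```

Keeping the checks in a plain list is one simple way to make new checks cheap to add, which is the property described above.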

Building and testing in an automated way like this allows for rapid development cycles that in turn allow us to release better images, faster. We have structured our testing process such that adding new checks is trivial. If a customer ever reports an issue with one of our images, we can swiftly release a fix and add another item to our growing list of checks, almost like an immune system!

The Same, but Different

Mass-deploying systems from a common image is great, but what about the things that are unique to a specific instance, such as the root password or networking configuration? Out of the box our images are configured to use DHCP, which will result in your system automatically receiving its assigned IP address and a unique hostname from our DHCP servers. However, we also provide a variety of “helpers” such as Network Helper (which will automatically configure static networking on your Linode), and our root password reset tool (which sets your initial root password and which you can also use in emergencies if you need to reset it). These tools allow for instance-specific information to be applied to your Linode on top of the base image.

Of course, not every distro handles these tasks in the same way, so our tooling needs to be aware of how to do these things on all of our supported distros. New major versions of a distro will typically require some updates to these systems in order to get things fully working. For example, Debian traditionally configures networking in /etc/network/interfaces, whereas CentOS places network configurations in /etc/sysconfig/network-scripts. Fortunately, most distros provide beta releases ahead of time, which gives us plenty of time to make these changes and ensure everything is ready for launch day.
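One simple way to picture this distro awareness is a lookup table keyed by distro, sketched below in Python. The table and function names are hypothetical and only the two locations mentioned above are filled in; real tooling has to handle many more distros and versions.

```python
# Only the two locations named in the text; anything else is deliberately absent.
NETWORK_CONFIG_LOCATIONS = {
    "debian": "/etc/network/interfaces",
    "centos": "/etc/sysconfig/network-scripts",
}


def network_config_location(distro):
    """Return where the given distro traditionally keeps its network configuration."""
    key = distro.strip().lower()
    if key not in NETWORK_CONFIG_LOCATIONS:
        # An unknown distro is a signal that the tooling needs an update.
        raise LookupError(f"no network configuration handler for distro {distro!r}")
    return NETWORK_CONFIG_LOCATIONS[key]


if __name__ == "__main__":
    for name in ("Debian", "CentOS"):
        print(name, "->", network_config_location(name))
```

Raising an error for an unknown distro, instead of guessing a default, mirrors the point that a new major release usually needs explicit support before launch day.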

Conclusion

As you can see, a lot goes into supporting a new distro, so what’s the real benefit of automating this process? Well, years ago, before we had the process we have today, a typical distro release (from building to testing to availability) would take an entire day at best, but usually several days, and many people would be involved. By comparison, today’s release of Ubuntu 20.04 required only 5 lines of code changes and took less than an hour from start to finish.

This approach to building images saves us a lot of time and hassle, and results in consistent builds that are thoroughly tested and constantly improving. If you have any suggestions or things you’d like to see, let us know! I hang out on IRC as blaboon on OFTC and lblaboon on Freenode.

If you’re interested in trying out Packer for yourself, it has great documentation. We also have our own Linode builder for Packer, which you can use to create your own customized images on our platform. Further documentation for using the Linode builder for Packer can be found in our Guides & Tutorials library.



Comments (4)

  1. A.

    This is brilliant and very timely (for me), as I happened to spend some of last weekend “reverse engineering” the build process (and discovered the GRUB config quirks) while trying to test migrating a Linode VM over to an on-premises (lab) VM Host and vice versa. Maybe this would be a good write-up, if you are searching for tutorial/blog topics. Many thanks for publishing this!

  2. Nathan Melehan

    Hey A. –

    That sounds like a cool topic! If you’re interested, we actually have a paid freelance contributor program for our documentation library called Write For Linode. You can learn more about the program and apply to it here: https://www.linode.com/lp/write-for-linode/

    • A

      Hi Nathan
      Thanks for this, it looks very tempting. I started to work on the process for the first migration but I was stopped by Grub2. As described in the Packaging section above, the stripped-out boot partition stops me from booting the image in my VM Host, and I haven’t been able to boot the image dd’d out of Linode. If I create a vanilla VM image with the same Ubuntu version as in my Linode VM, the (virtual) disk is partitioned with two partitions: sda1 hosting grub (I assume this is what you strip out in the build process) and sda2, which is “/”. The image “exported” out of Linode, on the other hand, has only a single partition as described above. Is there any documentation describing how to undo the stripping out of sda1, or how to insert a new boot (sda1) partition back into the image? Many thanks, A

      • A

        Ok, I managed to get through booting my local clone in the end and it turns out that the startup process didn’t crash (as I thought), it was just slow*. It is because (unsurprisingly) it has a hard-coded static IP address which now lives on the wrong network, so I had to wait until the network config time-out* during the boot-up process.

        That’s easy enough, I’ll just change the config in /etc/network/interfaces (the default/standard place in Ubuntu, and as also mentioned in this article). Looking at the “interfaces” file, it is blank with a note that things have moved on (since I last had to deal with Linux networking) and it is now handled by something called “netplan”. [grrrrr….]

        No matter, it can’t be that hard, let’s see what /etc/netplan says. Well, the yaml file says my ethernet interface should be called `enp0s3` and should be getting the address from dhcp. But my actual interface name is `eth0` and has a static address. Where is this coming from?! [&$#^%#]

        Time to “brute force” search the entire config.
        `find /etc/ -exec grep '12.34.56.78' {} \; 2>/dev/null` results in 3 hits, one is by an application so there are only two potential places to change, it might be easy enough to check**:

        `grep -Rn '12.34.56.78' * 2>/dev/null`

        systemd/network/05-eth0.network
        systemd/network/.05-eth0.network

        It was auto-generated by Linode’s Network Helper (as described in the article) which is not available in my lab, unsurprisingly, so let’s just copy back the original config:

        `cp 05-eth0.network 05-eth0.network.bak`
        `cp .05-eth0.network 05-eth0.network`
        `shutdown -r now`

        Bingo!!! The VM came up with a new dynamic IP address and I can ping and nslookup to my heart’s content!!! I haven’t checked if the actual web application survived the extraction, and what else may have broken in the process.

        *and probably a lot more linode specific configs that are inaccessible to my local clone.
        **The actual IP address is not this

        Lessons learned: It is a lot more complicated to migrate out of Linode to a local VM or a different cloud, and it requires a lot of effort to fix/adapt the extracted application and the OS; it would be a lot faster just to build the lab up from the ground up. It might be a simpler process to move the other way around (i.e. develop in my own lab and pull the result into Linode) but I wouldn’t hold my breath trying.

        Not sure if my experience warrants an article beyond this (chain of) comment(s). But it was a great weekend project and an inspirational learning exercise, many thanks, Linode!
