Introduction to Chef
In previous posts we discussed the benefits of automated environment provisioning and virtualization, and how to get started with Vagrant. We also mentioned how this topic relates deeply to the technoculture of DevOps, which focuses heavily on automation, reusability and iterative improvement with regard to building and releasing software.
So let’s get started with Chef, which helps you turn environment provisioning into an automated process, making it more reliable and less error-prone while saving loads of time and headaches.
Chef is “Infrastructure as Code”
Chef is a systems (and cloud) infrastructure automation framework that makes it easier to configure servers & deploy applications to any machine, be it physical, virtual, or cloud-based, regardless of the size of the infrastructure. Cool, right?
Chef operates using cookbooks, which we can think of as ‘self-contained scenarios’, that are written in Ruby and managed as source code. Each cookbook contains recipes that describe what a part of your infrastructure should look like. When you configure a new server, you tell Chef which cookbooks and recipes should be applied, and then chef-client configures the node on its own, resulting in a fully automated infrastructure.
Chef is a logical sequel to the “infrastructure as code” concept of the DevOps methodology. This new infrastructure treatment requires you to change the way you design, build and manage your systems. So when you need your infrastructure to be split into smaller, modular services and then tied together automatically, your old ssh shell scripts won’t help you anymore. Instead, you should use system integration frameworks, like Chef, to handle all of this for you.
Note: We use Chef at ZeroTurnaround, but other frameworks, such as Puppet, Ansible, CFEngine and Fabric are alternatives to consider. In fact, this dive into Vagrant, Chef and automated provisioning could be considered a self-serving investigation to test out other tools for ourselves, muuhahaha!
Using Chef, you configure your servers not by running commands but by writing code, so you get all the benefits of storing it in SCM and testing it. That gives you a history of changes and more confidence when running scripts on the nodes, because you have already tested them to ensure nothing will break.
Chef consists of the chef-server, one or more workstations and all the nodes that Chef configures and manages. The chef-server is the place where all your cookbooks, data bags and other information concerning your infrastructure live. It also holds a node list with attributes such as operating system info, IP address, and a list of applied recipes. Nodes use the chef-server to:
- Download cookbooks and recipes
- Make queries concerning other nodes in your Chef cluster (Example: a new node looks for the IP address of a load balancer node).
The workstation here is a developer’s machine, which you need in order to write new recipes, manage them in SCM, upload them to chef-server and interact with your nodes. We recommend that your workstations have Knife configured; Knife is a command-line tool developers use to interact with chef-server and bootstrap new nodes. We go over that below.
Let us start with installing chef-server. The latest version (11.0.x) of chef-server supports only Ubuntu and Red Hat Enterprise Linux on 64-bit architecture. The process is fairly well documented, and you need only to download the package from the site.
The next step is configuring your workstation. As soon as it is done, you are ready to start writing recipes and configuring your infrastructure with Chef.
Note: You can also install the older version (10.x) of chef-server, which is compatible with more operating systems and does not require 64-bit architecture. Workstation configuration with 10.x is similar to the latest version, so you can just download the right version of the installer that corresponds to your chef-server version.
So now we are ready to get acquainted with Chef’s fundamental unit of configuration: the cookbook. A cookbook defines a particular scenario, such as installing and configuring Apache, and contains (almost) all the components required to do this, among them:
- Attributes – values that are set on nodes
- Definitions – reusable collections of resources
- Files – can be copied to the destination node
- Libraries – class or module definitions that extend Chef and/or provide helpers to Ruby code
- Recipes – the code that will be run by chef-client. They specify which resources will be managed and in which order you want them to occur.
- Templates – similar to Files, but additionally contain placeholders in them to insert values.
- Metadata – holds info about the cookbook, including dependencies, version constraints, supported platforms, and so on.
- Custom resources and providers (see below)
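To make the Templates entry above more concrete, here is a minimal sketch of how a file with placeholders turns into a finished config file. Chef templates are written in ERB, so the sketch uses Ruby's standard ERB library directly; the attribute names, the values and the `render` helper are assumptions made up for illustration and are not part of Chef.

```ruby
require 'erb'

# Hypothetical attribute values; in a real cookbook these would come from
# the node's attributes rather than a local hash.
attributes = { 'port' => 8080, 'server_name' => 'example.local' }

# A template is a file with ERB placeholders (<%= ... %>) in it.
template_source = <<~TEMPLATE
  Listen <%= attrs['port'] %>
  ServerName <%= attrs['server_name'] %>
TEMPLATE

# Hypothetical helper: render the template with the attribute values filled in.
def render(source, attrs)
  ERB.new(source).result_with_hash(attrs: attrs)
end

rendered = render(template_source, attributes)
puts rendered
```

Changing a value in the hash and rendering again gives a different file from the same template, which is the point of keeping Templates separate from plain Files.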
Resources and Providers
A resource is a discrete chunk of system configuration. It can be a package, a service, a user, a file or a directory, and so on. Resources tell Chef which provider to use to manage them, e.g. for installing a package or creating a file or directory. A resource is a cross-platform abstraction: the same resource declaration runs different commands on different platforms, but the end result is the same. For example:
package 'tar' do
  action :install
end
Results in running:
apt-get install tar   # on Ubuntu and Debian
yum install tar       # on Fedora and Red Hat
port install tar      # on Mac OS X
The result is the same on all platforms: the tar package is installed. Chef provides many resources out of the box, but if you need something additional or non-standard, you can always create one yourself using Lightweight Resources and Providers (LWRP). That’s a more advanced topic and we will not cover it here.
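The idea of one declaration mapping to different commands can be sketched in a few lines of plain Ruby. This is a toy model only and does not reflect how Chef is implemented internally: the `PROVIDER_COMMANDS` table, the platform keys and the `install_command` helper are all assumptions invented for this illustration, using the three commands listed above.

```ruby
# Toy model only: this is NOT Chef's real internals. It just illustrates one
# declaration being turned into a platform-specific command.
PROVIDER_COMMANDS = {
  'ubuntu'   => 'apt-get install %s',
  'debian'   => 'apt-get install %s',
  'fedora'   => 'yum install %s',
  'redhat'   => 'yum install %s',
  'mac_os_x' => 'port install %s'
}.freeze

# Return the command a provider for the given platform would run.
def install_command(package_name, platform)
  template = PROVIDER_COMMANDS.fetch(platform) do
    raise ArgumentError, "no provider for platform: #{platform}"
  end
  format(template, package_name)
end

# The caller states what it wants (the tar package) and never names a command.
%w[ubuntu fedora mac_os_x].each do |platform|
  puts "#{platform}: #{install_command('tar', platform)}"
end
```

The caller only says which package it wants; picking the right command is left to the lookup, which is roughly the division of labour between a resource and its provider.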
Another thing worth mentioning is data bags. We mentioned earlier that almost all information is held in cookbooks; the leftovers are contained in data bags. A data bag is a JSON-formatted data file that can be stored on chef-server. An important feature is that it can be encrypted, so a data bag is a good way to store sensitive information, such as passwords or private keys.
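As a rough illustration of what a data bag item looks like, the sketch below parses a small JSON document with Ruby's standard JSON library. The item name and every field other than "id" are invented for this example, and it reads the JSON from a string rather than from chef-server, so treat it as a picture of the data format, not of Chef's API.

```ruby
require 'json'

# Hypothetical data bag item. The "id" field identifies the item; the other
# keys and values are invented for illustration. Real secrets would belong
# in an encrypted data bag, not in plain text like this.
item_json = <<~JSON
  {
    "id": "deploy",
    "username": "deployer",
    "shell": "/bin/bash"
  }
JSON

# Once parsed, the item behaves like an ordinary Ruby hash.
item = JSON.parse(item_json)
puts "#{item['id']}: user=#{item['username']} shell=#{item['shell']}"
```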
The last but not least important piece of Chef infrastructure is Knife. It’s a command-line tool that allows developers to interact with chef-server from their workstations. Knife uses the same API as chef-client does. The tool is very powerful and allows developers to manage practically everything, including:
- Cookbooks and recipes
- Data bags, including encrypted ones
- Environments (e.g. test, staging, production)
- Cloud resources (e.g. Amazon EC2), including provisioning
Knife can also be used to make search requests to chef-server concerning all of the above. It can also connect over ssh and run commands on several nodes in parallel. The following command, for example, will restart the apache2 service on all nodes that have apache2::server in their run list.
knife ssh "recipes:apache2\:\:server" "sudo service apache2 restart"
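The query string in that command is easier to read once you see how it relates to a run-list entry. The plain-Ruby sketch below splits an entry such as recipe[apache2::server] into cookbook and recipe names and rebuilds the escaped query. Both helper names are hypothetical and exist only for this illustration; the escaping simply mirrors the command above, on the assumption that the colons in the recipe name must not be read as the separator between the field name and its value.

```ruby
# Hypothetical helper: split a run-list entry such as
# "recipe[apache2::server]" into its cookbook and recipe names. An entry
# without "::" refers to the cookbook's default recipe.
def parse_run_list_item(item)
  match = item.match(/\Arecipe\[([^\]:]+)(?:::([^\]]+))?\]\z/)
  raise ArgumentError, "not a recipe entry: #{item}" unless match

  { cookbook: match[1], recipe: match[2] || 'default' }
end

# Hypothetical helper: build the query string seen in the knife command,
# with the "::" inside the recipe name escaped.
def recipe_query(cookbook, recipe)
  "recipes:#{cookbook}\\:\\:#{recipe}"
end

entry = parse_run_list_item('recipe[apache2::server]')
puts recipe_query(entry[:cookbook], entry[:recipe])
```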
At first glance, Chef may seem to be a rather complex tool, but don’t let this scare you away from using it and reaping some of the benefits of a DevOps orientation. As soon as you get acquainted with Chef’s basics, the rest will be rather simple.
We covered some of the basics here, which is just about enough to get you started looking into new ways of configuring your infrastructure by writing code. But we have more to discuss:
- Lightweight Resources and Providers (LWRP)
- Anatomy of chef-run (not as simple as it seems)
- Testing Chef cookbooks and recipes.