From Imperative to Declarative System Configuration with Puppet

After my impromptu presentation about configuration management with Puppet at BarCampGR a few weeks ago, several people mentioned that they had tried to use Puppet before, but couldn’t figure out how to make it do anything in the first place.

I’d like to clear up some of that uncertainty if I can, so here is an example of the simplest thing that could possibly work. This is not an example of how best to organize your code or write expressively, but it will show how you might start transitioning from imperative to declarative thinking through the use of Puppet’s Exec resource type.

Concepts

Declarative vs. Imperative Programming

Puppet’s standard DSL [1] uses a declarative programming style that is often unfamiliar to newcomers, even if they are experienced programmers in other domains. Most commonly used programming languages are examples of imperative programming, in which the programmer must describe a specific algorithm or process. Declarative programming instead focuses on describing the particular state or goal to be achieved. I’ll illustrate the difference with an example in natural language:

Make Me a Sandwich! (Imperative)
Spread peanut butter on one slice of bread. Set this slice of bread on a plate, face-up. Spread jelly on another slice of bread. Place this second slice of bread on top of the first, face-down. Bring me the sandwich.

The Sandwich I Desire. (Declarative)
There should be a sandwich on a plate in front of me in five minutes’ time. It should have only peanut butter and jelly between the two slices of bread.

Declarative programming is a more natural fit for managing system configuration. We want to talk about whether or not MySQL is installed on this machine or Apache on that machine, not whether yum install mysql-server has been run here or apt-get install apache2 there. It allows us to express intent more clearly in the code. It is also less tedious to write and can even be more portable to different platforms. (See Luke Kanies’ blog post [1] for more advantages specific to the Puppet DSL.)
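
As a minimal sketch of how that intent might be written in Puppet’s DSL (the package name is taken from the yum example above and may differ on other platforms):

```puppet
# Declarative: state that the package should be present and let Puppet
# decide whether anything needs to be done, and how to do it.
package{"mysql-server":
  ensure => installed,
}
```

Nothing here says which command to run. Puppet picks a package provider appropriate to the platform and does nothing if the package is already installed.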

Puppet’s Resource Types

I won’t go into detail here, but Puppet uses an abstraction layer to manage what it calls “resources.” These are anything from users, to packages, to services, to files, and even commands to be executed (like the Exec resource type we’ll be starting with in this example). For a complete list of the available resource types (and all of their parameters), see the Puppet Type Reference documentation [2].
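
To give a rough feel for the syntax, here is a sketch of three resource declarations. Every declaration has a type, a title, and a list of attributes. The user name, service name, and file content below are hypothetical and chosen only for illustration:

```puppet
# General shape: type{"title": attribute => value, ...}

# A user account that should exist ("deploy" is a hypothetical name).
user{"deploy":
  ensure => present,
  shell  => "/bin/bash",
}

# A service that should be running now and enabled at boot.
service{"sshd":
  ensure => running,
  enable => true,
}

# A regular file with fixed content.
file{"/etc/motd":
  ensure  => file,
  content => "Managed by Puppet\n",
}
```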

Environment Setup

For this example, I will assume we’re working with a minimal install of CentOS 6.3 (for example, this Vagrant box provided by Opscode, the makers of Chef, the ‘other’ popular configuration management tool). I’ll assume that we’ve already installed RVM, Ruby, and the puppet gem using a bootstrapping script like the one from Justin Kulesza’s recent blog post. Here is a Vagrantfile to do just that:

# -*- mode: ruby -*-
# vi: set ft=ruby :
 
Vagrant::Config.run do |config|
  config.vm.box = "centos-6.3"
  config.vm.box_url = "https://opscode-vm.s3.amazonaws.com/vagrant/boxes/opscode-centos-6.3.box"
 
  # Execute bootstrap.sh script (from a Gist) to install RVM, Ruby, Puppet, etc.
  # Read the Gist: https://gist.github.com/3615875
  config.vm.provision :shell, :inline => "curl -s -L https://raw.github.com/gist/3615875/bootstrap.sh > ~/bootstrap.sh"
  config.vm.provision :shell, :inline => "bash ~/bootstrap.sh"
 
end

What we’d like to do is install and configure tmux and Matt Furden’s wemux script for managing shared tmux sessions.

Installing Tmux Using Puppet Execs

We’ll start with installing tmux. (Note again: we’re deliberately abusing the Exec resource – I’ll come back to this when we refactor.)

Let’s put all of our Puppet code in a single file, site.pp, in /etc/puppet.

sudo mkdir -p /etc/puppet && sudo vim /etc/puppet/site.pp

Inside site.pp we’ll add our first resource, an Exec to install the EPEL repository:

exec{"install-epel":
  command => "/bin/rpm -i http://linux.mirrors.es.net/fedora-epel/6/i386/epel-release-6-7.noarch.rpm",
}

Now we can apply this manifest:

[vagrant@localhost ~]$ rvmsudo puppet apply /etc/puppet/site.pp
/usr/local/rvm/rubies/ruby-1.9.3-p125/lib/ruby/site_ruby/1.9.1/rubygems/custom_require.rb:36:in `require': iconv will be deprecated in the future, use String#encode instead.
notice: /Stage[main]//Exec[install-epel]/returns: executed successfully
notice: Finished catalog run in 5.52 seconds
[vagrant@localhost ~]$

Great! We’ve installed the EPEL repository we need! Unfortunately, if we try to apply this manifest again…

[vagrant@localhost ~]$ rvmsudo puppet apply /etc/puppet/site.pp
/usr/local/rvm/rubies/ruby-1.9.3-p125/lib/ruby/site_ruby/1.9.1/rubygems/custom_require.rb:36:in `require': iconv will be deprecated in the future, use String#encode instead.
err: /Stage[main]//Exec[install-epel]/returns: change from notrun to 0 failed: /bin/rpm -i http://linux.mirrors.es.net/fedora-epel/6/i386/epel-release-6-7.noarch.rpm returned 1 instead of one of [0] at /etc/puppet/site.pp:3
notice: Finished catalog run in 5.49 seconds
[vagrant@localhost ~]$

…Puppet will re-run the exact same command and this time it will fail (return with a non-zero exit code) because the repository is already installed. We need a check to make sure that our manifest remains idempotent – that is, that running it more than once will result in the same system state as a single run [3]. As you might guess, Puppet has many ways of doing this. We’ll go with what’s most straightforward for our present case:

exec{"install-epel":
  command => "/bin/rpm -i http://linux.mirrors.es.net/fedora-epel/6/i386/epel-release-6-7.noarch.rpm",
  creates => "/etc/yum.repos.d/epel.repo",
}

Adding the creates attribute to our Exec resource ensures that the command will only be run if no file exists at the path provided to creates. Installing the EPEL package creates at least one file: /etc/yum.repos.d/epel.repo. As long as that file is present, this Exec will not be run again. If it is missing, then the EPEL repo has probably been uninstalled and our Exec will reinstall it.
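
One of the other ways is the Exec type’s unless attribute, which runs a check command instead of looking for a file. The following is only a sketch of that alternative, assuming the package installed by the RPM is named epel-release:

```puppet
exec{"install-epel":
  command => "/bin/rpm -i http://linux.mirrors.es.net/fedora-epel/6/i386/epel-release-6-7.noarch.rpm",
  # Skip the command when this check exits 0, i.e. when rpm reports
  # that the epel-release package is already installed.
  unless  => "/bin/rpm -q epel-release",
}
```

This asks the package database directly rather than inferring the answer from a file on disk, at the cost of running an extra command on every catalog run.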

Next we’ll install tmux:

exec{"install-tmux":
  command => "/usr/bin/yum install -y tmux",
  creates => "/usr/bin/tmux",
}

Now when we apply our manifest we should see something like:

[vagrant@localhost ~]$ rvmsudo puppet apply /etc/puppet/site.pp
/usr/local/rvm/rubies/ruby-1.9.3-p125/lib/ruby/site_ruby/1.9.1/rubygems/custom_require.rb:36:in `require': iconv will be deprecated in the future, use String#encode instead.
notice: /Stage[main]//Exec[install-tmux]/returns: executed successfully
notice: Finished catalog run in 18.55 seconds
[vagrant@localhost ~]$

But what if we were to start from scratch with a new base VM? Can we use our site.pp manifest to get back to the state we’re in now? You might think we have all of the information needed to apply this manifest to an identical base VM and reach the same state, but you’d be wrong. The order of application for Puppet resources is only deterministic where it is explicitly defined. As our manifest is currently written, we have not stated whether tmux or the EPEL repo needs to be installed first. Sometimes it might work just fine, but sometimes the "install-tmux" Exec will be applied first, and the entire catalog run will fail.

This is a big gotcha with Puppet, but it’s easy to fix. We’ll simply add a requirement to the "install-tmux" Exec:

exec{"install-tmux":
  command => "/usr/bin/yum install -y tmux",
  creates => "/usr/bin/tmux",
  require => Exec["install-epel"],
}

Now Puppet will only apply the "install-tmux" resource if the "install-epel" resource has already been successfully applied. If "install-epel" needs to be run (i.e. /etc/yum.repos.d/epel.repo doesn’t exist), then Puppet will first run that Exec. If "install-epel" fails, Puppet won’t even try to run "install-tmux" – instead it will display an error that "install-tmux" was not run due to failed dependencies.
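
The same ordering can also be stated outside the resources with a chaining arrow between resource references. This sketch is an alternative spelling of the dependency above, not a replacement for it:

```puppet
exec{"install-epel":
  command => "/bin/rpm -i http://linux.mirrors.es.net/fedora-epel/6/i386/epel-release-6-7.noarch.rpm",
  creates => "/etc/yum.repos.d/epel.repo",
}

exec{"install-tmux":
  command => "/usr/bin/yum install -y tmux",
  creates => "/usr/bin/tmux",
}

# Apply install-epel before install-tmux; equivalent to giving
# install-tmux the attribute: require => Exec["install-epel"]
Exec["install-epel"] -> Exec["install-tmux"]
```

Which form reads better is mostly a matter of taste. The require attribute keeps the dependency next to the resource that needs it, while the arrow keeps all ordering in one place.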

Installing Wemux Using Puppet Execs

Now we’re ready to follow the steps in the wemux README to create some more Execs:

exec{"clone-wemux-repo":
  command => "/usr/bin/git clone git://github.com/zolrath/wemux.git /usr/local/share/wemux",
  creates => "/usr/local/share/wemux",
}

# The symlink and the config copy both depend on the cloned repository,
# so the ordering is stated explicitly.
exec{"symlink-wemux-into-path":
  command => "/bin/ln -s /usr/local/share/wemux/wemux /usr/local/bin/wemux",
  creates => "/usr/local/bin/wemux",
  require => Exec["clone-wemux-repo"],
}

exec{"cp-wemux-conf":
  command => "/bin/cp /usr/local/share/wemux/wemux.conf.example /usr/local/etc/wemux.conf",
  creates => "/usr/local/etc/wemux.conf",
  require => Exec["clone-wemux-repo"],
}

But what about that last part?

Then set a user to be a wemux host by adding their username to the host_list in /usr/local/etc/wemux.conf
vim /usr/local/etc/wemux.conf
host_list=(foobar)

We could use sed and create an extra file to mark that the job is done…

exec{"configure-wemux":
  command => "/bin/sed -i -e 's/change_this/vagrant/g' /usr/local/etc/wemux.conf && touch /etc/wemux-configured",
  creates => "/etc/wemux-configured",
  require => Exec["cp-wemux-conf"],
}

This will get the job done, but it’s definitely not pretty.
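
If the marker file bothers you, one possible variation is to let the Exec check the config file itself with onlyif. This is only a sketch, and it assumes the copied example config still contains the change_this placeholder that the sed command replaces:

```puppet
exec{"configure-wemux":
  command => "/bin/sed -i -e 's/change_this/vagrant/g' /usr/local/etc/wemux.conf",
  # Run only while the placeholder is still present in the file;
  # grep -q exits 0 on a match and non-zero otherwise.
  onlyif  => "/bin/grep -q change_this /usr/local/etc/wemux.conf",
  require => Exec["cp-wemux-conf"],
}
```

It avoids the extra file, but it is still an imperative edit dressed up as a resource.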

Hopefully you can see that we’re running against the grain by forcing our Puppet manifests to act as imperative code rather than declarations about the desired state of our system. Puppet gives us much better tools to work with than Execs if we’re willing to think about things a little bit differently.

Refactoring More Declaratively

Here’s what a first pass at refactoring our manifests might look like:

We’ll start by creating a ‘wemux’ module [4]:

/etc/puppet/modules/wemux/manifests/init.pp:

class wemux($wemux_hosts = 'foobar'){
  package{"epel-release":
    provider => rpm,
    source => "http://linux.mirrors.es.net/fedora-epel/6/i386/epel-release-6-7.noarch.rpm",
    ensure => installed,
  }
  package{"tmux":
    ensure => installed,
    require => Package["epel-release"],
  }
  exec{"wemux-clone":
    command => "/usr/bin/git clone git://github.com/zolrath/wemux.git /usr/local/share/wemux",
    creates => "/usr/local/share/wemux",
  }
  file{"/usr/local/bin/wemux":
    ensure => link,
    target => "/usr/local/share/wemux/wemux",
    require => Exec["wemux-clone"],
  }
  file{"/usr/local/etc/wemux.conf":
    ensure => present,
    content => template("wemux/wemux.conf.erb"),
  }
}

/etc/puppet/modules/wemux/templates/wemux.conf.erb:

(This is a templatized version of the wemux config file we were copying and editing before)

...
host_list=(<%= @wemux_hosts %>)
...

Then in our site.pp we can include the wemux class to pull in all of the resources described in our module…

/etc/puppet/site.pp:

class{"wemux":
  wemux_hosts => "vagrant"
}

Here I’ve moved most of the code into a self-contained wemux module. In this case our module consists of a single parameterized class containing all of the necessary resources to install and configure wemux. It is included in site.pp where a value for the wemux_hosts class parameter is also provided.

I have also used a couple of new resource types here: files and packages. Both of these are much better suited to the task at hand. You’ll notice that the File resource type also lets us use an ERB template for the configuration file. I simply modified the example configuration file and added it to our wemux module as a template. Then I used Puppet’s template() function to provide a value for the File resource’s content attribute.
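
The remaining Exec still depends on something the module does not declare: the git command itself. A possible variation of the "wemux-clone" resource, assuming the package is simply named git on this platform, would manage that too:

```puppet
# Assumption: the distribution's package for Git is called "git".
package{"git":
  ensure => installed,
}

# Replaces the wemux-clone Exec inside the wemux class.
exec{"wemux-clone":
  command => "/usr/bin/git clone git://github.com/zolrath/wemux.git /usr/local/share/wemux",
  creates => "/usr/local/share/wemux",
  require => Package["git"],
}
```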

Summary

There’s a lot more we could do to improve things even further. For example, we could extend Puppet with a custom type for our git repository rather than using an Exec to clone from GitHub. I’ll leave that as an exercise for the reader (hint: the vcsrepo module on the Puppet Forge), but this is a good start.
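
For the curious, here is a rough sketch of what the clone might look like with the vcsrepo type from the puppetlabs/vcsrepo module. It assumes that module has been installed into /etc/puppet/modules and that git is already available on the machine:

```puppet
# Would replace the wemux-clone Exec inside the wemux class.
vcsrepo{"/usr/local/share/wemux":
  ensure   => present,
  provider => git,
  source   => "git://github.com/zolrath/wemux.git",
}
```

Anything that required Exec["wemux-clone"] would then need to require Vcsrepo["/usr/local/share/wemux"] instead.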

We’ve now moved from specifying how to install and configure tmux and wemux to specifying what we want the state of our system to be. Our code is more readable, better expresses our intent, and will be easier to maintain. We’re working with the declarative Puppet DSL now, not against it.

Footnotes

[1] Why Puppet has its own configuration language

[2] Puppet Type Reference Documentation

[3] Idempotence Is Not a Medical Condition

[4] Puppet Learning: Modules and Classes

Additional Resources

Learning Puppet

Pro Puppet

Puppet Style Guide

Mike English (28 Posts)

Professional Problem Solver

3 Comments

  1. Posted September 13, 2012 at 5:09 pm

    No need to write a native resource type for Git.

    http://forge.puppetlabs.com/puppetlabs/vcsrepo

    VCSrepo is a resource type for version control with providers that support SVN and Git.

    • Mike English
      Posted September 13, 2012 at 5:29 pm

      Teyo,

      Thanks for the tip. I actually linked to the VCSrepo module in the hint. ;-)

      -Mike

  2. Matt Williams
    Posted October 10, 2012 at 5:49 am

    I found something interesting yesterday — on centos 6 minimal install, without epel the rvm install fails (despite the yaml distribution being installed as a tar ball). However, after I added epel and installed yaml from epel, it worked.

    Thank you for an excellent article.
