Creating a Custom Yum Repository for Puppet

Mike English and I have been preparing to deploy Puppet to help manage our infrastructure. Puppet is a Ruby-based server automation tool for *nix systems that uses a declarative language to express a system's configuration. The Puppet master distributes a catalog of configuration information to Puppet agents, which then make whatever changes are needed on the managed systems to reach the desired final configuration.

Puppet's configuration language provides various resource types, each representing a different aspect of a system that Puppet can manage. For instance, Puppet has a resource type named “package” which manages installed applications and libraries. On Debian- or RedHat-based distributions, which use apt or yum respectively, Puppet can automate the retrieval and installation of packages by driving the native package management system.

For example, with Puppet, I could specify the following in a Puppet manifest in order to install the Apache web server on a RedHat-based system:

package { "apache":
  name   => "httpd",
  ensure => present,
}

When a Puppet agent encounters this declaration, it will attempt to retrieve and install the Apache HTTP server from the system’s available repositories utilizing the package management system.

This works great for installing required applications and libraries on a given system to prepare it for its role. For example, a typical RedHat-based web server would probably need packages declared for httpd, openssl, and php.
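As a rough sketch of what that might look like, the three packages mentioned above could be declared in a single resource by passing an array of titles. The package names are the ones from the example; whether they match the names in your distribution's repositories is an assumption.

```puppet
# Sketch only: assumes the packages are named httpd, openssl, and php
# in the repositories available to the managed system.
package { ['httpd', 'openssl', 'php']:
  ensure => present,
}
```

Puppet treats each title in the array as its own package resource, so other resources can still refer to them individually (for example, Package['httpd']).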

However, what if you needed an application or library that wasn’t available in the system’s configured repositories, or even in the other repositories commonly used with that package management system?

You could always build from source, but having Puppet build and install applications and libraries from source would require some rather fragile coupling between a given package's build process and the manifests for a desired system configuration.

A better idea is to create a custom package repository so that Puppet can use its “package” resource type and its integration with the underlying package management system.

For RedHat-based systems which use yum for package management, this process is very easy.

RedHat-based systems have a utility named createrepo which creates the metadata needed to host a yum repository. The utility examines the RPMs in a given directory and generates XML metadata which yum queries to determine the available packages and their characteristics. Once yum is made aware of the custom repository, it can run the necessary queries, and packages can be retrieved and installed as desired.

There are three primary steps to creating a custom yum repository:

  1. Obtaining the RPMs to be hosted in the repository.
  2. Creating the repository metadata.
  3. Making the RPMs and repository metadata accessible to other systems, such as via Apache HTTP Server.

A very basic repository could be constructed on a RedHat-based system by doing the following:

  1. Ensure createrepo and httpd are installed, that the system has a valid hostname, and that port 80 (http) is open.
  2. Obtain the necessary RPMs and place them in /var/www/html/repository.
  3. Run createrepo /var/www/html/repository.
  4. Ensure that httpd is running with the document root at /var/www/html (the default)
  5. Change the ownership of /var/www/html/repository recursively to ensure that httpd can read all of the RPMs and metadata (chown -R apache:apache /var/www/html/repository works).
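  6. On systems that will use this custom repository, place a text file with a .repo extension (for example, myrepo.repo) in /etc/yum.repos.d with contents along these lines, replacing <repository-host> with the host name of the repository server:

    [myrepo]
    name = My Repo
    baseurl = http://<repository-host>/repository
    enabled = 1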

    (This could easily be accomplished with a Puppet file resource whose content points to the custom repository.)

  7. On a system that will use this custom repository, search for one of the packages you hosted in it (for example, with yum search). You should be able to locate it.
  8. Use the Puppet “package” resource type as normal; the custom repository should be used when installing new packages.
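The steps above could themselves be expressed in Puppet. What follows is a minimal sketch under several assumptions, not a definitive implementation: the class names custom_repo_server and custom_repo_client, the host name repo.example.com, the file name myrepo.repo, and the package name mypackage are all hypothetical placeholders. It also assumes the RPMs have already been copied into the repository directory by some other means.

```puppet
# Hypothetical class for the machine that hosts the repository.
class custom_repo_server {
  package { ['createrepo', 'httpd']:
    ensure => present,
  }

  # Keep the repository directory readable by Apache (step 5).
  file { '/var/www/html/repository':
    ensure  => directory,
    owner   => 'apache',
    group   => 'apache',
    recurse => true,
    require => Package['httpd'],
  }

  # Generate the metadata (step 3). The "creates" guard means this
  # runs only when no metadata exists yet.
  exec { 'createrepo /var/www/html/repository':
    path    => ['/usr/bin', '/bin'],
    creates => '/var/www/html/repository/repodata/repomd.xml',
    require => [Package['createrepo'], File['/var/www/html/repository']],
  }

  service { 'httpd':
    ensure  => running,
    enable  => true,
    require => Package['httpd'],
  }
}

# Hypothetical class for machines that consume the repository.
class custom_repo_client {
  # Step 6: repo.example.com stands in for the real repository host.
  file { '/etc/yum.repos.d/myrepo.repo':
    ensure  => file,
    owner   => 'root',
    group   => 'root',
    mode    => '0644',
    content => "[myrepo]\nname = My Repo\nbaseurl = http://repo.example.com/repository\nenabled = 1\n",
  }

  # Step 8: mypackage stands in for an RPM hosted in the repository.
  package { 'mypackage':
    ensure  => present,
    require => File['/etc/yum.repos.d/myrepo.repo'],
  }
}
```

Because of the "creates" guard, this sketch generates the metadata only once; createrepo has to be run again whenever RPMs are added to the directory, or yum will not see them. If yum.conf on the client enables gpgcheck globally, unsigned RPMs would also need gpgcheck = 0 in the repository file.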

Clearly this is a much more elegant solution than either trying to force Puppet to build and install packages from source, or copying RPMs to each system in a cluster individually and installing them with a set of commands. It is quite easy to find suitable RPMs and place them in your custom repository or, if no such RPM exists, to create your own.