Configuring a Laptop with Ansible, Part One

Setting up a new laptop can be disorienting. It’s easy to forget all the configuration tweaks that accumulate over time. With just a little work upfront, a configuration management system can turn those tweaks into executable documentation, making it easy to reproduce a heavily customized setup on other laptops down the line.

While there are several configuration management systems, this post will focus on Ansible, because it’s easy to use locally. Chef and Puppet generally use a central server to track configurations, with clients on the host computers pulling changes; Ansible instead connects to hosts via SSH and pushes changes. (There are projects that adapt Chef or Puppet for local use, such as configuring laptops, so people already familiar with those may prefer to use Boxen, Kitchenplan, or Sprout instead.)

Part one of this post will provide some background on configuration management tools and specifics about Ansible usage, and part two will walk through an example laptop configuration.

Ansible Installation

Ansible can be installed via pip (a packaging tool for Python), or via Homebrew, apt-get, or other OS-specific tooling. Ansible itself depends only on SSH and Python.

Inventory / Hosts

Before Ansible can configure hosts, it needs to know their names. It installs a default inventory file in a place like /etc/ansible/hosts or /usr/local/etc/ansible/hosts.

This file can be as simple as:

localhost

or can set groups of hosts:

# ungrouped
localhost

[webservers]
annie

[dbservers]
troy
abed

# union group combining child groups
[servers:children]
webservers
dbservers

I keep an inventory file in my ansible_config git repo, along with a script that symlinks the default ansible inventory path to it. (While this may not be a good approach for a large organization, it works very well for a personal setup.)

Ansible Modules

The Ansible configuration specifies what the hosts’ states should be, and modules work out which updates are necessary to get there. This means most modules are idempotent: once a configuration is up to date, running them again changes nothing. (There are a few exceptions, such as the raw and command modules, which run arbitrary commands.)

Modules can be run directly from the command line with the ansible command. For example, the ping module checks which hosts in the inventory are accessible:

# 'all' is a default group of all hosts
$ ansible -m ping all

annie | FAILED => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
localhost | success >> {
    "changed": false, 
    "ping": "pong"
}

troy | success >> {
    "changed": false, 
    "ping": "pong"
}

abed | success >> {
    "changed": false, 
    "ping": "pong"
}

The host “annie” could not be reached. The other hosts responded, and no state changed.

A custom inventory file can be specified with -i:

$ ansible -i hosts -m ping all

Another useful module for debugging is setup, which returns all the “facts” (detected host info) that can be used during setup:

$ ansible -m setup abed

abed | success >> {
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [
            "192.168.23.138"
        ], 
        "ansible_architecture": "armv7l", 
        "ansible_distribution": "Debian", 
        ... lots more info ...
    },
    "changed": false
}

Most modules require a few options, such as names of packages to install, or paths to files or templates. Since abed is running Debian Linux, the following command uses the apt module to install cowsay via apt-get, Debian’s package tool:

# -s: use sudo, -K: prompt for sudo password, -a: arguments to module
$ ansible -K -s -m apt -a "name=cowsay state=present" abed
sudo password: (enters password)

abed | success >> {
    "changed": false
}

Nothing changed because cowsay was already present. Removing it and then reinstalling it reports a change each time:

$ ansible -K -s -m apt -a "name=cowsay state=absent" abed
sudo password:  (enters password) 
abed | success >> {
    "changed": true, 
    "stderr": "", 
    "stdout": "Reading package lists...\n [apt-get output...]\r\n"
}

$ ansible -K -s -m apt -a "name=cowsay state=present" abed
sudo password:  (enters password)
abed | success >> {
    "changed": true, 
    "stderr": "", 
    "stdout": "Reading package lists...\n [apt-get output...]\r\n"
}

Other useful modules include file, copy, template, synchronize (rsync), lineinfile, command, git, script, etc. To list available Ansible modules, use ansible-doc --list, or check the official documentation’s “Modules by Category” page. ansible-doc MODULE will also print the options and example uses for a specific module.

Ansible Playbooks

Of course, running everything line by line isn’t much of an improvement. Ansible can also run “playbooks”: YAML files that group several tasks (calls to modules with their arguments). Playbooks can include variables and groups of other .yml files. Tasks in a playbook run sequentially, and each task can update multiple hosts in parallel.

When getting started, I found the YAML syntax rules about which structures can nest or include other structures confusing. Once I had a few tasks working, though, most of my setup followed the same general patterns, and it became much less of a problem. (The second part of this post will have more examples.) Also, the ansible-playbook command has a --syntax-check option, which checks the structure of the playbook.

Here is a simple playbook, showing multiple tasks, an include, and the overall structure.

$ cat dbservers.yml 
---
- hosts: dbservers
  tasks:
  - name: install cowsay
    sudo: yes
    when: ansible_os_family == "Debian"
    # the following is equivalent to `apt name=cowsay state=present`
    # but reads better with modules that have lots of options
    apt:
      name: cowsay
      state: present

  # Another task, indented at the same level with a new "- name:" field.
  - name: install figlet
    sudo: yes
    when: ansible_os_family == "Debian"
    apt:
      name: figlet
      state: present

  # Lists can eliminate some repetition
  - name: install a list of packages
    sudo: yes
    when: ansible_os_family == "Debian"
    apt:
      # {{item}} is a variable, used by with_items below.
      name: "{{item}}"
      state: present
    with_items:
      - cowsay
      - figlet

  # Other files can be included
  - include: install_keys.yml

A playbook can be a single flat file, but this one includes a second file to show how the structure flattens. Note that the contents of an included file are treated as if they were already inside the ‘hosts’ and ‘tasks’ levels of the file including them, so the included file contains only a list of tasks.

$ cat install_keys.yml 
---
- name: install SSH key
  sudo: yes
  authorized_key:
    key: "ssh-rsa [...]"
    user: "{{ansible_user_id}}"
    state: present

- name: install another SSH key
  sudo: yes
  authorized_key:
    key: "ssh-rsa [...]"
    user: "{{ansible_user_id}}"
    state: present

To run the playbook, use the ansible-playbook command:

$ ansible-playbook -i hosts -K dbservers.yml 

Individual tasks can specify whether they use sudo, but the sudo password is usually supplied once for the whole playbook run with -K. ansible-playbook has many options, but a few are especially useful: --check reports which tasks would make changes without actually updating anything; --list-tasks and --list-hosts print details about what the playbook will do; and --start-at-task NAME resumes a playbook from the task named NAME. (This can save time if the playbook stops with an error along the way.)

Playbooks can also define handlers, additional tasks to trigger if a task caused a change. For example, services should usually be restarted if their config file has been updated.
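As a rough sketch of how a handler fits into a playbook, the play below copies a config file and restarts the service only when the copy actually changed something. The file path files/sshd_config and the handler name are hypothetical, chosen just for illustration, and the play assumes a Debian host where the SSH service is named ssh. It uses the same sudo: yes style as the other examples in this post; newer Ansible releases spell this become: yes.

```yaml
---
- hosts: dbservers
  # applies to every task and handler in this play
  sudo: yes
  tasks:
  - name: install sshd config
    copy:
      # hypothetical file kept next to the playbook
      src: files/sshd_config
      dest: /etc/ssh/sshd_config
    # queue the handler below, but only if this task reports "changed"
    notify: restart ssh

  handlers:
  # the name must match the string given to notify
  - name: restart ssh
    service:
      name: ssh
      state: restarted
```

A notified handler runs once, after the tasks in the play have finished, even if several tasks notify it.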

Roles

Just including .yml files from playbooks can eventually get messy. If files or tasks are referenced by several playbooks, it can be neater to group them into roles:

---
- hosts: laptop
  roles:
    - common
    - git_homedir
    # roles can also be used conditionally
    - { role: x11, when: ansible_distribution != "MacOSX" }
    - { role: erlang, when: install_erlang is defined }

Then ansible-playbook looks for a corresponding role directory, roles/ROLE_NAME, and includes roles/ROLE_NAME/tasks/main.yml. If the role’s tasks use files, templates, handlers, etc., it will check in appropriate subdirectories within the role first. Roles give better modularity to common configurations, similar to Chef cookbooks.
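To make the lookup concrete, here is a minimal sketch of what roles/erlang/tasks/main.yml might contain for the erlang role listed above. The template name erlang_rc.j2 and its destination are hypothetical, and the tasks assume a Debian-family host.

```yaml
---
# roles/erlang/tasks/main.yml (hypothetical contents)
# Like an included file, a role's main.yml is just a list of tasks.
- name: install erlang
  sudo: yes
  when: ansible_os_family == "Debian"
  apt:
    name: erlang
    state: present

- name: install erlang startup file
  template:
    # a relative src is looked up in roles/erlang/templates/ first
    src: erlang_rc.j2
    # ansible_user_dir is a fact reported by the setup module
    dest: "{{ansible_user_dir}}/.erlang"
```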

Host and Group Variables

Along with the inventory, ansible-playbook checks for host and group-specific options. While these can be included in the hosts file itself, it’s cleaner to put them in host_vars/HOSTNAME and group_vars/GROUPNAME. These directories go in the same place as the inventory file.

For example, the file host_vars/cynar for my laptop looks like this:

---
install_git_homedir: true
install_android_dev: true
install_erlang: true
install_ocaml: true
install_electronics: true
...

and those variables drive the ‘when’ conditionals in my roles.
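For illustration, a task guarded by one of these variables could look like the sketch below. The task itself is hypothetical and assumes a Debian-family host; only the variable name comes from the host_vars file above.

```yaml
---
- name: install ocaml
  sudo: yes
  apt:
    name: ocaml
    state: present
  # skipped on hosts that do not define the variable, or that set it to false
  when: install_ocaml is defined and install_ocaml
```

Checking both that the variable is defined and that it is true lets hosts without a host_vars file skip the task instead of failing on an undefined variable.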

In my next post, I will cover an example configuration.