Tools for Testing Command Line Interfaces

I’ve been working on a command line utility for building and deploying Craft CMS websites. To prevent regressions, I’ve converted my manual test plans into a small suite of automated system tests. In the process, I’ve added some new tools to my testing toolbox.


Bats

The Bash Automated Testing System (Bats) is a TAP-compliant (Test Anything Protocol) testing framework for Bash created by Sam Stephenson. I first became aware of Bats when I switched from using RVM to using rbenv to manage local installations of several versions of Ruby. The test suite for rbenv uses Bats.

A while later, I had the chance to use it along with Test Kitchen to test some of our Chef cookbooks. When I started looking at the manual validation I was doing for my craftutil utility, I realized that many of the cases could be easily described and automated by Bats tests.
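As an aside on the "TAP-compliant" part: TAP is a line-oriented report format with a plan line (1..N) followed by one "ok" or "not ok" line per test. The sketch below hand-rolls that format in plain Bash purely to show what such output looks like. It is not how Bats is implemented, and the names tap_run, test_true, and test_false are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Bats): run each named function as a
# test case and report the results in TAP format.
tap_run() {
  local n=0 t
  echo "1..$#"                      # the TAP plan: how many tests will run
  for t in "$@"; do
    n=$((n + 1))
    if "$t" >/dev/null 2>&1; then
      echo "ok $n $t"
    else
      echo "not ok $n $t"
    fi
  done
}

# Two illustrative test cases: one passes, one fails.
test_true()  { true; }
test_false() { false; }

tap_run test_true test_false
# prints:
# 1..2
# ok 1 test_true
# not ok 2 test_false
```

Because the format is this simple, any TAP consumer can summarize results from a Bats run without knowing anything about Bash.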

Here’s an example Bats test:

@test "craftutil build should compile SCSS to CSS" {
  # Setup
  mkdir -p dev/assets/css
  mkdir -p public
  echo '$text-color: #0000FF;
  nav {
  p { color: $text-color;}
  a { color: $text-color; }
  }' > dev/assets/css/style.scss

  # Execution
  run ${CRAFTUTIL} build
  [ "$status" -eq 0 ]

  # Validation
  run cat public/assets/css/style.css
  [ "$status" -eq 0 ]
  [ "$output" = "nav p{color:blue}nav a{color:blue}" ]
}

Using Bats, I was able to automate a lot of these sorts of tests, but then I hit a part of my interface that Bats couldn’t easily test.
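For readers new to Bats, the run helper in the test above is what populates $status and $output. The following is only a rough plain-Bash approximation of that behavior, with the made-up name run_sketch; the real helper does more (for example, it also fills a $lines array).

```shell
#!/usr/bin/env bash
# Rough approximation of what Bats' `run` provides: execute a command,
# capture its combined stdout and stderr in $output and its exit code in
# $status, without aborting the caller when the command fails.
run_sketch() {
  output=$("$@" 2>&1) && status=0 || status=$?
}

run_sketch sh -c 'echo hi; exit 3'
echo "status=$status output=$output"
# prints: status=3 output=hi
```

Capturing the exit code instead of letting it propagate is what allows a test to assert on failures as well as successes.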


Expect

I reached the limits of Bats’ utility when I wanted to test some of the interactive craftutil interfaces. For example, craftutil also has a watch subcommand that works much like build except that it will rebuild assets when changes are detected in the dev/ directory. Two things made this difficult to test with Bats:

  1. craftutil watch runs in the foreground until it receives SIGINT (when the user presses Ctrl-C).
  2. craftutil watch runs in the foreground and responds to events (files being updated) that happen in another process.

It turns out there’s a classic Tcl program readily available for most UNIX systems that solves both of these problems: Expect. With Expect, it’s possible to spawn processes and describe complex interactions with those processes programmatically.


Expect is powerful, but Tcl can be daunting. I put off learning Expect for longer than I needed because I didn’t know where to start. The secret to quickly creating useful Expect scripts is a tool called autoexpect. It lets you record manual interactions with your programs as a working Expect script.

In some cases, this may be all you need. More often, I use these autoexpect recordings as a means of quickly generating the more tedious sections of a script. After creating a script with autoexpect, it is not hard to clean it up and add your own logic.

Spawn, Send, and Expect

Three commands form the basis for most Expect scripts:

  1. spawn spawns a new process (e.g. a shell or the program you want to interact with).
  2. send sends input to that process.
  3. expect waits for output from that process that matches certain criteria (e.g. a simple string match or a regular expression), and then reacts (e.g. by dispatching another send of input).
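To make the spawn/send/expect rhythm concrete without any Tcl, here is a loose analogue using a Bash coprocess. This swaps in a different technique (Bash's coproc) and is not Expect itself: there is no pseudo-terminal and no pattern language, and the GREETER loop is a made-up stand-in for an interactive program.

```shell
#!/usr/bin/env bash
# "spawn": start a stand-in interactive program as a Bash coprocess.
coproc GREETER {
  while IFS= read -r name; do
    echo "hello, $name"
  done
}
greeter_pid=$GREETER_PID   # remember the PID; Bash unsets it on exit

# "send": write a line to the coprocess's standard input.
printf '%s\n' "world" >&"${GREETER[1]}"

# "expect": wait up to 5 seconds for a reply and check that it matches.
if IFS= read -r -t 5 reply <&"${GREETER[0]}" && [ "$reply" = "hello, world" ]; then
  echo "matched: $reply"
else
  echo "no match" >&2
fi

# Closing the write end lets the loop see end-of-file and exit.
eval "exec ${GREETER[1]}>&-"
wait "$greeter_pid"
```

Expect adds what this sketch lacks: a real terminal for the child process, glob and regular-expression matching, and several alternative patterns per expect call.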

Simple Expect Example

The utility of Expect can be demonstrated with just these three commands:

spawn ftp
expect "username:"
send "anonymous\r"
expect "password:"
send "[email protected]\r"
expect "ftp>"
send "cd /pub/linux/kernel/v4.x\r"
expect "ftp>"
send "get linux-4.3.3.tar.gz\r"
expect "ftp>"
send "bye\r"

Expect with Multiple Processes

Expect is not limited to interacting with just one process at a time. Working with multiple processes is as simple as calling spawn multiple times and saving the spawn_ids of each. By setting spawn_id back to one of these saved values, we can switch which process send and expect interact with.


Here’s a slightly more complex Expect script adapted from an autoexpect recording:

#!/usr/local/bin/expect -f

set timeout 30
set prompt "$ "

spawn $env(SHELL)
set craftutil_watch_spawn_id $spawn_id

expect $prompt
send -- "craftutil watch\r"

expect "*Finished '*watch*' after *\r"

sleep 3
spawn $env(SHELL)
set second_shell_spawn_id $spawn_id

expect $prompt
send -- "touch dev/assets/css/style.scss\r"

expect $prompt

set timeout 20
set spawn_id $craftutil_watch_spawn_id

expect {
  "*Finished '*styles*' after *\r" { }
  timeout { puts "Timed out after $timeout seconds"; exit 1 }
}

# Send Ctrl-C (SIGINT) to stop craftutil watch and return to the shell prompt
send -- "\003"
expect $prompt
send -- "exit\r"
expect eof
puts "passed craftutil watch tests"
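The pattern-or-timeout branch is the heart of that script. For comparison, the same idea can be approximated in plain shell by polling a log file until a pattern appears or a deadline passes. This sketch assumes the watched program's output has been redirected to a file, and the function name wait_for_pattern is hypothetical. Unlike Expect, it cannot send keystrokes to the program.

```shell
#!/usr/bin/env bash
# wait_for_pattern FILE PATTERN TIMEOUT_SECONDS
# Poll FILE until a line matches PATTERN (a grep basic regular expression)
# or the timeout elapses. Returns 0 on a match and 1 on timeout.
wait_for_pattern() {
  local file=$1 pattern=$2 timeout=$3
  local deadline=$((SECONDS + timeout))
  while [ "$SECONDS" -lt "$deadline" ]; do
    if grep -q -- "$pattern" "$file" 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  echo "Timed out after $timeout seconds" >&2
  return 1
}

# Usage: simulate a watcher that logs a rebuild one second from now.
log=$(mktemp)
( sleep 1; echo "Finished 'styles' after 12 ms" >> "$log" ) &
wait_for_pattern "$log" "Finished 'styles'" 5 && echo "saw rebuild"
rm -f "$log"
```

Polling is cruder than Expect's event-driven matching, but it shows why an explicit timeout matters: without one, a missed rebuild would hang the test run indefinitely.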

Further Reading on Expect

Expect has much more functionality than what I’ve covered here, and my style is still that of a novice. (Expect experts, feel free to leave critiques in the comments). Besides the documentation included with Expect, another good source of information is Exploring Expect, an O’Reilly book by Expect’s author, Don Libes.


Tush

Bats and Expect are useful and powerful tools, but at times I’ve wanted a more concise, elegant syntax for demonstrating the intended behavior of a program. Many years ago, Darius Bacon wrote a tool of that sort called Tush. I don’t remember exactly when I first encountered Tush, but I suspect I stumbled across it after following up on something Scott Vokes shared with me. Regardless, Tush’s simple syntax and lightweight implementation left an impression on me, and I tracked it down again recently when I started looking at tools like Bats and Expect.

Implemented with just a handful of small awk scripts, Tush can be used to check or generate transcript-like examples suitable for embedding in plain text documentation or standing on their own.

For example:

$ echo Hello world
| Hello world

In the above example, the line starting with $ denotes a command to run, and the line starting with | denotes the expected output. The tush-check command can verify that the expected output corresponds with the actual result of running the command. Alternatively, tush-bless can be used to update lines starting with | to match the actual resulting output.

For example:

$ echo Hello Dave
| Hello world

…when run through tush-bless would be updated to read:

$ echo Hello Dave
| Hello Dave
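The "bless" idea is simple enough to sketch in a few lines of Bash. The function below is a hypothetical simplification, not Tush's actual awk implementation: it rewrites only the | lines, passes every other line through unchanged, and assumes each marker is followed by a single space.

```shell
#!/usr/bin/env bash
# bless_transcript FILE
# Print FILE with every "| " line replaced by the real standard output of
# the preceding "$ " command. Simplified sketch; stdout only.
bless_transcript() {
  local file=$1 line cmd
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      '$ '*)
        printf '%s\n' "$line"
        cmd=${line#'$ '}
        # Re-run the command and prefix each line of its output with "| ".
        # stdin comes from /dev/null so the command cannot eat the transcript.
        bash -c "$cmd" 2>/dev/null </dev/null | sed 's/^/| /'
        ;;
      '| '*) ;;                        # drop the stale expected output
      *) printf '%s\n' "$line" ;;      # keep everything else as-is
    esac
  done < "$file"
}

transcript=$(mktemp)
printf '%s\n' '$ echo Hello Dave' '| Hello world' > "$transcript"
bless_transcript "$transcript"
# prints:
# $ echo Hello Dave
# | Hello Dave
rm -f "$transcript"
```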

Besides checking output on standard output, Tush can also check standard error and return values. Lines starting with @ represent output on standard error. Lines starting with ? represent expected return values.

For example:

$ cat nonesuch
@ cat: nonesuch: No such file or directory
? 1
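Putting the four markers together, a checker only has to run each command and compare three things: standard output, standard error, and the exit status. The sketch below is a hypothetical stand-in for tush-check written in Bash rather than awk. It assumes each marker is followed by a space and that a command with no ? line is expected to exit with status 0; both are assumptions of this sketch, not documented Tush behavior.

```shell
#!/usr/bin/env bash
# verify CMD WANT_OUT WANT_ERR WANT_CODE
# Run CMD and succeed only if stdout, stderr, and exit status all match.
verify() {
  local cmd=$1 want_out=$2 want_err=$3 want_code=$4
  local errfile got_out got_err got_code
  errfile=$(mktemp)
  got_out=$(bash -c "$cmd" 2>"$errfile" </dev/null) && got_code=0 || got_code=$?
  got_err=$(cat "$errfile")
  rm -f "$errfile"
  [ "$got_out" = "$want_out" ] && [ "$got_err" = "$want_err" ] &&
    [ "$got_code" -eq "$want_code" ]
}

# check_transcript FILE
# Returns 0 if every command in FILE behaves as its transcript says.
check_transcript() {
  local file=$1 cmd="" out="" err="" code=0 failures=0 line
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      '$ '*)
        # A new command starts: check the previous one first.
        if [ -n "$cmd" ] && ! verify "$cmd" "$out" "$err" "$code"; then
          echo "FAIL: $cmd"; failures=$((failures + 1))
        fi
        cmd=${line#'$ '}; out=""; err=""; code=0 ;;
      '| '*) out+=${out:+$'\n'}${line#'| '} ;;   # expected stdout line
      '@ '*) err+=${err:+$'\n'}${line#'@ '} ;;   # expected stderr line
      '? '*) code=${line#'? '} ;;                # expected exit status
    esac
  done < "$file"
  # Check the final command in the file.
  if [ -n "$cmd" ] && ! verify "$cmd" "$out" "$err" "$code"; then
    echo "FAIL: $cmd"; failures=$((failures + 1))
  fi
  [ "$failures" -eq 0 ]
}
```

Calling check_transcript on a file containing the examples above would print a FAIL line for each command whose output, error text, or status differs from the transcript, and return a non-zero status if there were any.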

I find Tush’s minimalist, literate approach very attractive. If Tush (or something like it) could be extended to cover at least a few of the affordances Expect provides for interactivity and job control, I could see myself making very heavy use of it. I’ll leave the implementation of such a thing as an exercise for a wise and generous reader.

Other Tools

There are a lot of tools available for testing command line applications, and I would be remiss if I didn’t at least mention a couple more of my favorites.


Empty

Empty is a stand-alone C implementation of some of the key features of Expect. It can be useful for embedding co-process interactivity in shell scripts or even just as a simple way to create a pseudo-terminal (pty) to run a process in when a real interactive terminal is not available (for example: in a build step of a continuous integration process running on a remote build agent).
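The reason a pty matters is that many programs check whether they are attached to a terminal and change their behavior when they are not. A minimal way to see that check from the shell, using the standard -t test and a made-up function name:

```shell
#!/usr/bin/env bash
# Report whether standard input is attached to a terminal.
describe_stdin() {
  if [ -t 0 ]; then
    echo "terminal"
  else
    echo "not a terminal"
  fi
}

# With stdin redirected, as is typical on a build agent, there is no terminal.
describe_stdin </dev/null
# prints: not a terminal
```

Running a program under a pty makes checks like this one succeed, so the program behaves as it would for an interactive user.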


Aruba

Aruba is a Ruby gem that extends Cucumber (or RSpec, or minitest) with better support for testing command line applications. While Aruba is written in Ruby, the programs under test can be written in any language so long as they provide a command line interface. If you already have a Ruby project with a lot of Cucumber features, Aruba could be a good fit for testing command line interfaces.
