Recently, I had to deal with a command line process that was occasionally hanging during my project’s continuous integration test suite. I decided to write a wrapper script that would watch the output of the wrapped process. If it didn’t see a particular bit of output after some period of time, it would kill the process and try again. To limit the dependencies needed in the CI environment, I decided to write this wrapper script in Bash.
The biggest question I had was how to execute a process while monitoring its output, and to do all of this with a timeout. The answer turned out to be the Expect command line utility. In this post, I’m going to walk through an example of using expect to set a timeout while watching the output of a background process.
Random Words
For the purposes of this post, I’m going to start with a utility that emits random words (to stdout) until it’s killed. The heart of this script was found in a Generating Random Words blog post.
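#!/bin/bash
# Count the lines (one word per line) in the system dictionary.
n=$(wc -l < /usr/share/dict/words)
# Print the word on a randomly chosen line, forever.
# jot ships with BSD and macOS; it picks one random integer between 1 and $n.
while true
do
  head -n "$(jot -r 1 1 $n)" /usr/share/dict/words | tail -n 1
done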
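The jot utility used above ships with BSD and macOS but is often missing on Linux. As a minimal sketch for such systems, the same idea can be written with Bash's built-in $RANDOM. The random_word function name and its file argument are hypothetical names introduced here for illustration and are not part of the original script.

```shell
#!/bin/bash
# random_word FILE: print one randomly chosen line from FILE.
# (Hypothetical helper; assumes FILE has one word per line.)
random_word() {
  local file=$1 n line
  # Arithmetic expansion strips the padding some wc implementations add.
  n=$(( $(wc -l < "$file") ))
  [ "$n" -gt 0 ] || return 1
  # Combine two 15-bit $RANDOM draws so lists longer than 32767 lines are covered.
  line=$(( (RANDOM * 32768 + RANDOM) % n + 1 ))
  # Print that line and stop reading the file.
  sed -n "${line}{p;q;}" "$file"
}
```

Calling it in a loop, for example while true; do random_word /usr/share/dict/words; done, should behave much like the jot version.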
Find a Match
Now, let’s say I want to run the random word script until it emits a word that contains a specified substring. For example, to look for a word that contains “cat,” you could do the following:
random-words | grep -m 1 "cat"
Expect with a Timeout
That works great except for two issues. First, the pipeline doesn’t actually exit when a matching word is found: grep quits, but the random-words loop keeps running. Second, it could take a very long time to find a match, and I only want it to look for a certain amount of time. That’s where Expect comes in. Expect is described on its home page as
… a tool for automating interactive applications such as telnet, ftp, passwd, fsck, rlogin, tip, etc.
I’m not dealing with an interactive application, but Expect can still be useful because of its timeout feature.
The basic technique of using Expect with a timeout was found in this StackExchange answer. But instead of just setting a timeout for how long the process can run, I want to wait for specific output and only time out if that output is not found.
#!/bin/bash
TIMEOUT=15
SUBSTRING=$1
COMMAND="random-words"
expect -c "set echo \"-noecho\"; set timeout $TIMEOUT; spawn -noecho $COMMAND; expect \"$SUBSTRING\" { exit 0 } timeout { puts \"\nTimed out!\"; exit 1 }"
This script will exit with code 1 if no match is found within 15 seconds (specified by the TIMEOUT variable), or exit with code 0 if a match is found. The substring to look for is expected to be passed in as the first argument to the script.
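Because the expect script above is built inside a double-quoted Bash string, a substring containing quotes, brackets, or dollar signs would be interpreted by Tcl rather than matched. One way around that, sketched here under the assumption that expect is installed, is to pass the values through environment variables and read them from Tcl's env array. The wait_for_output function and the WFO_* variable names are hypothetical. Unlike the script above, this sketch matches the substring literally (-ex), hides the spawned command's output (log_user 0), and returns 3 if the command ends without a match.

```shell
#!/bin/bash
# wait_for_output TIMEOUT SUBSTRING COMMAND (hypothetical helper)
#   Runs COMMAND via /bin/sh under expect and watches its output.
#   Returns 0 if SUBSTRING appears, 1 on timeout, 3 if COMMAND ends first.
wait_for_output() {
  # The single-quoted Tcl script is never touched by Bash; the values
  # reach it through the environment instead of string interpolation.
  WFO_TIMEOUT="$1" WFO_SUBSTRING="$2" WFO_COMMAND="$3" expect -c '
    log_user 0
    set timeout $env(WFO_TIMEOUT)
    spawn -noecho /bin/sh -c $env(WFO_COMMAND)
    expect -ex $env(WFO_SUBSTRING) { exit 0 } timeout { exit 1 } eof { exit 3 }
  '
}
```

For example, wait_for_output 15 cat random-words would wait up to 15 seconds for a word containing "cat" and report the result through its exit status.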
This approach works great if you want the process being monitored to exit either when the expected substring is found or when the timeout is hit. But to fix the problem I was running into in CI, with the long-running process that would occasionally hang, I wanted the process to run to completion once the expected output was found.
Expect an EOF
I’m going to preface this by saying that I’m no Expect expert. I’d never heard of it prior to stumbling across it trying to solve this problem, meaning it’s quite possible there’s a better way to accomplish what I’m trying to do using Expect than what I’m about to show you.
That being said, I’m going to implement a solution by redirecting the output of the random-words script to a file (and do that in the background), and then use grep to look for a matching word in the output file. I’ll use Expect to run the grep and look for EOF (from the grep) to signal success.
Because the script is now kicking off the random-words script in the background, it needs to worry about killing that process when it’s done, or if the wrapper script is killed, which adds some complexity to the script.
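As a rough sketch of that cleanup logic, a trap on EXIT can cover both the normal path and the interrupted path. The helper names start_monitored and stop_monitored are hypothetical, and using mktemp for the output file is a variation of mine on the idea, not something the original script does.

```shell
#!/bin/bash
# start_monitored CMD [ARGS...]: run CMD in the background, capturing its
# output in a fresh temporary file. (Hypothetical helper names.)
start_monitored() {
  output_file=$(mktemp "${TMPDIR:-/tmp}/monitor-output.XXXXXX") || return 1
  "$@" > "$output_file" 2>&1 &
  monitored_pid=$!
}

# stop_monitored: kill the background process if it is still running,
# reap it, and remove the temporary file. Safe to call more than once.
stop_monitored() {
  if [ -n "$monitored_pid" ]; then
    kill "$monitored_pid" 2>/dev/null
    wait "$monitored_pid" 2>/dev/null
    monitored_pid=
  fi
  if [ -n "$output_file" ]; then
    rm -f "$output_file"
    output_file=
  fi
}

# Run the cleanup however the script ends. Ctrl-C and TERM are turned into
# a normal exit so the EXIT trap runs for them as well.
trap stop_monitored EXIT
trap 'exit 130' INT
trap 'exit 143' TERM
```

With this in place, a wrapper could call start_monitored random-words, inspect "$output_file" however it likes, and rely on the traps to stop the process and delete the file.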
Here’s the final script all put together. It will look for a word that contains the substring passed in on the command line and will print only the matching word if one is found.
#!/bin/bash
if [ $# -ne 1 ]; then
  script=$(basename "$0")
  echo "Usage: $script <substring>"
  echo " e.g. $script foo"
  exit 2
fi
# Kill any background jobs if the wrapper is interrupted.
# http://stackoverflow.com/a/11697822/4592309
function killstuff {
  jobs -p | xargs kill > /dev/null 2>&1
}
trap killstuff SIGINT
PROCESS_TO_MONITOR=random-words
OUTPUT_FILE=/tmp/monitor-output.log
SUBSTRING=$1
TIMEOUT=15
# Run the monitored process in the background, capturing its output.
$PROCESS_TO_MONITOR > "$OUTPUT_FILE" &
pid=$!
# grep exits on the first match, which ends the pipeline and gives Expect an EOF.
TAIL_AND_GREP="tail -n 5000 -f $OUTPUT_FILE | grep -m 1 '$SUBSTRING' > /dev/null"
COMMAND="/bin/sh -c \"$TAIL_AND_GREP\""
expect -c "set echo \"-noecho\"; set timeout $TIMEOUT; spawn -noecho $COMMAND; expect timeout { exit 1 } eof { exit 2 }"
TIMEOUT_EXIT_CODE=$?
# Translate Expect's exit code: 1 means timeout, 2 means a match was found.
exit_code=10
if [ "$TIMEOUT_EXIT_CODE" -eq 1 ]; then
  echo "Did not find '$SUBSTRING' in $TIMEOUT seconds."
  exit_code=1
elif [ "$TIMEOUT_EXIT_CODE" -eq 2 ]; then
  exit_code=0
  grep -m 1 "$SUBSTRING" "$OUTPUT_FILE"
fi
# Stop the background process, reap it quietly, and clean up.
kill $pid > /dev/null 2>&1
wait $pid 2>/dev/null
rm -f "$OUTPUT_FILE"
exit $exit_code
This script uses the ability of Expect to look for EOF and then exits with a specific code (two in this case) to signal that it found a match and did not time out. The background process continues to run after the expect command completes. I manually kill the background process at the end of the script, but in real-life usage, the script would probably call wait $pid instead to let the background process finish.
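That variation might look something like the following sketch. The finish_or_kill helper and its yes/no argument are hypothetical; the idea is simply to wait for the background process and pass along its exit status when a match was seen, and to kill it otherwise.

```shell
#!/bin/bash
# finish_or_kill PID MATCHED (hypothetical helper)
#   MATCHED=yes: let the background process run to completion and return
#                its exit status.
#   otherwise:   kill it, reap it, and return 1.
# PID must be a child of the current shell, e.g. taken from $!.
finish_or_kill() {
  local pid=$1 matched=$2
  if [ "$matched" = yes ]; then
    wait "$pid"
  else
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null
    return 1
  fi
}
```

In the script above, a call like this could stand in for the unconditional kill $pid, so a CI step that produced the expected output still gets to finish and report its own result.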
Conclusion
From my experience, Expect appears to be a powerful command line tool. I’m just beginning to scratch the surface as far as understanding what it’s capable of doing, but using its timeout capability has allowed me to do something in a shell script I wasn’t even sure was possible prior to finding it.
Could you have used bash’s builtin “read” command?