
String Tricks that Bash Knows

I use the terminal for a lot of my work, so when I need to process output from other tools, I have a lot of options. I started out using shell scripts, but eventually moved on to scripting languages — first with Perl, then Ruby, with occasional Python. Lately I’ve been getting familiar with my shell (bash) again, finding new ways to stretch its usefulness as a tool on its own, without pulling in typical auxiliary support from grep, sed, or awk.

Bash is a big program, and the bulky man page can hide some gems. One area where it’s quite capable, which I only recently dug into, is string handling. Except it’s not really called that, so you might not have noticed it. As James Coglan rightfully puts it:

In this case, what I’m going to show is a couple of useful bits from the “Parameter Expansion” section of the bash man page. And though I otherwise enjoy the stress reduction of telling my shell to man bash, I actually learned these tricks by reading the Advanced Bash-Scripting Guide, which usually calls this “Parameter Substitution”. It even elaborates on my point about knowing what to look for:

Bash supports a surprising number of string manipulation operations. Unfortunately, these tools lack a unified focus.

The ABSG is a great guide, with detailed examples (like pattern matching) and a handy table for the (reasonably-named) “String Operations”. Let’s look at a few of them.

Remove Shortest or Longest Match from Beginning or End

My first examples of interesting string tools in bash are # and %. They’re both for removing parts of a string, and they can be used singly or doubly:

  • # removes the shortest match from the beginning
  • ## removes the longest match from the beginning
  • % removes the shortest match from the end
  • %% removes the longest match from the end

I’ll paraphrase some examples from another rich bash guide to effect some useful path munging:

var=/Users/karlin/git/langton_loops/index.html.erb
echo ${var}         # => /Users/karlin/git/langton_loops/index.html.erb
echo ${var#*.}      # => html.erb
echo ${var##*.}     # => erb
echo ${var%/*.*}    # => /Users/karlin/git/langton_loops
 
file=${var##/*/}    # => index.html.erb
echo ${file%.*}     # => index.html
echo ${file%%.*}    # => index
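The same four operators can stand in for basename and dirname when you need to split a path into pieces. Here is a minimal sketch of that idea; the path and the variable names (dir, base, stem, ext) are my own made-up examples, not something from either guide:

```bash
#!/usr/bin/env bash
# Hypothetical path, used only for illustration.
path=/tmp/reports/summary.tar.gz

dir=${path%/*}      # shortest "/*" from the end: drop the file name
base=${path##*/}    # longest "*/" from the beginning: drop the directories
stem=${base%%.*}    # longest ".*" from the end: drop every extension
ext=${base#*.}      # shortest "*." from the beginning: keep every extension

echo "$dir"     # => /tmp/reports
echo "$base"    # => summary.tar.gz
echo "$stem"    # => summary
echo "$ext"     # => tar.gz
```

Swapping %% for % on the stem line would keep summary.tar instead, which is the whole shortest-versus-longest distinction in one character.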

Since these symbols say little about the behavior they trigger, the same guide points out a helpful mnemonic:

The # key is on the left side of the $ key and operates from the left, while % is to the right

I remember that by visualizing this shift-laden dance across the second row:

# ## $ %% %

Replace First or All Substring Matches

Rather than just removing parts of the string, you can also tell bash to replace them by adding slashes to your expansion. One slash replaces the first match, and two slashes replace every match:

var='1,2 3,4 5,6'
# bash 5.2+ treats a bare & in the replacement as "the matched text", so escape it
echo ${var/,/\&}   # => 1&2 3,4 5,6
echo ${var//,/\&}  # => 1&2 3&4 5&6

This has single-handedly reduced my use of both sed and perl on the command line, where those tools seemed to spend most of their time replacing bits of strings for me.
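A typical job I would once have handed to sed is tidying up a file name. The sketch below uses a made-up file name and my own variable names to show the first-match and all-matches forms side by side:

```bash
#!/usr/bin/env bash
# Hypothetical file name, used only for illustration.
name='my summer photos.tar.gz'

first=${name/ /_}    # one slash: replace only the first space
all=${name// /_}     # two slashes: replace every space

echo "$first"   # => my_summer photos.tar.gz
echo "$all"     # => my_summer_photos.tar.gz
```

Quoting the variables in the echo lines matters here, since the first result still contains a space.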

But… why?

I recently wanted a quick way to get the IP of a Mac in a Ruby script. I wrote this:

puts %x{ifconfig en0 inet}[/\sinet ([^ ]*)/,1]

Yeah! Sure, that works. But it’s not really that far from this:

ip=$(ifconfig en0 inet); ip=${ip##*inet }; echo ${ip%% *}
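If you want to reuse that idea, the two expansions fit in a small function. This is only a sketch: first_inet is a name I made up, and the sample text is invented to resemble ifconfig output so the example runs without a network interface:

```bash
#!/usr/bin/env bash
# first_inet (hypothetical helper): print the word that follows the last "inet ".
first_inet() {
  local text=$1
  text=${text##*inet }           # drop everything through the last "inet "
  printf '%s\n' "${text%% *}"    # drop everything from the first space onward
}

# Invented sample in the general shape of `ifconfig en0 inet` output.
sample=$'en0: flags=8863 mtu 1500\n\tinet 192.168.1.10 netmask 0xffffff00 broadcast 192.168.1.255'

first_inet "$sample"    # => 192.168.1.10
```

On a real machine the call would be first_inet "$(ifconfig en0 inet)". Note that if the text contains no "inet " at all, the first expansion removes nothing and the function just prints the first word of its input, so a real script would want to check for that.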

In my opinion, knowing many ways of wrangling strings means I’ll be more likely to choose the right one depending on context.