Bash String Manipulation in Shell Scripts


I use the terminal for a lot of my work, so when I need to process output from other tools, I have a lot of options. I started out using shell scripts, but eventually moved on to scripting languages — first with Perl, then Ruby, with occasional Python. Lately I’ve been getting familiar with my shell (bash) again, finding new ways to stretch its usefulness as a tool on its own, without pulling in typical auxiliary support from grep, sed, or awk.

Bash is a big program, and the bulky man page can hide some gems. One area where it’s quite capable, which I only recently dug into, is string handling. Except it’s not really called that, so you might not have noticed it. As James Coglan rightfully puts it:

The thing about “rtfm” is that bash stuff is ungooglable unless you know the jargon for the thing you’re trying to do.

— jcoglan.txt (@jcoglan) January 26, 2014

In this case, what I’m going to show is a couple of useful bits from the “Parameter Expansion” section of the bash man page. And though I otherwise enjoy the stress reduction of telling my shell to man bash, I actually learned these tricks by reading the Advanced Bash-Scripting Guide, which usually calls this “Parameter Substitution”. It even elaborates my point about knowing what to look for:

Bash supports a surprising number of string manipulation operations. Unfortunately, these tools lack a unified focus.

The ABSG is a great guide, with detailed examples (like pattern matching) and a handy table for the (reasonably-named) “String Operations”. Let’s look at a few of them.

Remove Shortest or Longest Match from Beginning or End

My first examples of interesting string tools in bash are # and %. Both remove the part of a string that matches a pattern, and each can be used singly or doubled:

# removes the shortest match from the beginning
## removes the longest match from the beginning
% removes the shortest match from the end
%% removes the longest match from the end

I’ll paraphrase some examples from another rich bash guide to effect some useful path munging:

var=/Users/karlin/git/langton_loops/index.html.erb
echo ${var}         # => /Users/karlin/git/langton_loops/index.html.erb
echo ${var#*.}      # => html.erb
echo ${var##*.}     # => erb
echo ${var%/*.*}    # => /Users/karlin/git/langton_loops

file=${var##/*/}    # => index.html.erb
echo ${file%.*}     # => index.html
echo ${file%%.*}    # => index
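
As a further sketch of the same operators, here is roughly how the pieces you might otherwise get from dirname and basename could be pulled out with expansions alone. The function names path_dir, path_base, and path_ext are hypothetical, and the sketch assumes a path that contains at least one slash, has no trailing slash, and has a dot in its file name:

```bash
#!/usr/bin/env bash
# Hypothetical helpers; they assume a path with a slash and no trailing slash.
path_dir()  { printf '%s\n' "${1%/*}"; }    # drop the shortest "/..." from the end
path_base() { printf '%s\n' "${1##*/}"; }   # drop the longest ".../" from the start
path_ext()  { local base=${1##*/}; printf '%s\n' "${base##*.}"; }  # text after the last dot

p=/Users/karlin/git/langton_loops/index.html.erb
path_dir "$p"    # => /Users/karlin/git/langton_loops
path_base "$p"   # => index.html.erb
path_ext "$p"    # => erb
```

Quoting the expansions keeps paths that contain spaces in one piece.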

Seeing how these are pitiful symbols for their associated behavior, the same guide points out a helpful mnemonic:

The # key is on the left side of the $ key and operates from the left, while % is to right

I remember that by visualizing this shift-laden dance across the second row:

# ## $ %% %

Replace First or All Substring Matches

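Rather than just removing parts of a string, you can also have bash replace them. Slashes in the expansion separate the pattern from its replacement: one leading slash replaces the first match, and two replace every match: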

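var='1,2 3,4 5,6'
# The & is escaped because bash 5.2 and later treat an unquoted & in the replacement as "the matched text".
echo ${var/,/\&}    # => 1&2 3,4 5,6
echo ${var//,/\&}   # => 1&2 3&4 5&6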
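
The replace-all form is also handy for reshaping delimited lists. Here is a minimal sketch, using a made-up colon-separated value in a hypothetical variable named dirs:

```bash
#!/usr/bin/env bash
dirs='/usr/local/bin:/usr/bin:/bin'   # made-up sample value

lines=${dirs//:/$'\n'}                # double slash: every ":" becomes a newline
printf '%s\n' "$lines"

first_only=${dirs/:/ }                # single slash: only the first ":" changes
printf '%s\n' "$first_only"           # => /usr/local/bin /usr/bin:/bin
```

The first printf puts each directory on its own line, which is often all I wanted from a quick sed or tr call.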

This alone has reduced my use of both sed and perl on the command line, where those tools seemed to spend most of their time replacing bits of strings for me.

But… why?

I recently wanted a quick way to get the IP of a Mac in a Ruby script. I wrote this:

puts %x{ifconfig en0 inet}[/\sinet ([^ ]*)/,1]

Yeah! Sure, that works. But it’s not really that far from this:

ip=$(ifconfig en0 inet); ip=${ip##*inet }; echo ${ip%% *}
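
To see what each expansion contributes, here is the same idea broken into steps and run against a made-up sample string instead of live ifconfig output. The sample text and address are assumptions for illustration, and real output varies by system:

```bash
#!/usr/bin/env bash
# Made-up sample resembling `ifconfig en0 inet` output on a Mac; real output varies.
sample='en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255'

ip=${sample##*inet }   # drop everything through the last "inet "
ip=${ip%% *}           # drop everything from the first space onward
echo "$ip"             # => 192.168.1.23
```

If the text contains no "inet " at all, the first expansion leaves it unchanged, so a real script would want to check for that case.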

In my opinion, knowing many ways of wrangling strings means I’ll be more likely to choose the right one depending on context.

Conversation

Mike Hall says:

February 16, 2014

This is great. Shell scripting is a skill that I think most developers give short shrift but is incredibly valuable. I wasn’t even aware of these string manipulation operators and thanks for the links to resources as well. Fun article.

Karlin Fox says:

February 17, 2014

Thanks Mike, I’m glad it was helpful!

I totally agree that shell scripting skills are valuable. Even if you are getting shell-like work done in a favorite scripting language, you’d usually end up configuring environments and executing the process through your shell. Knowing the basics of string ops and for loops opens up a whole category of tasks to quick completion in the shell, without firing up an editor at all.

eric stewart says:

January 21, 2015

Great blog, I have a question on spooling duplicate strings. Example of EDI file below I am interested in spooling out for analysis line 2 & 3 that are duplicates. For purposes of test lets call the file name test.txt. Could you please assist with the approach I should use..

N1*12324*01212015
REF*59*abcdefg
REF*59*abcdefg
XYZ*IL*3407

REF*59

Eddie 7 says:

March 16, 2016

Perl has an EDI library.
Been using it for 10 years+
