
Table Relationship Diagrams with Graphviz

I recently found myself looking for a modern tool to diagram relationships between tables of data. When I came up short, I used a very old tool instead.

Specifically, I was documenting how a software system accepts data from a third party, mapping fields from external records to an internal message type.

Basically, I wanted to draw arrows between rows of tables. But I was very particular about how!

Requirements

The diagrams I was producing describe software. Like software, they had to be easy to change. In particular, I sought the following properties:

  1. Arrows attach to rows of the table.
    I do not want to manually position the ends of arrows, ever.
  2. Table entries can be edited and reordered without breaking links.
  3. It works with a text-based source format.
    I intended to check these files into source control, so they needed to diff and merge nicely.

Do any tools come to mind? I searched around for a while before dusting off an old one:

Graphviz

Graphviz is a collection of utilities and libraries for generating diagrams from a language called DOT. Even if you haven’t used it directly, you’ve probably seen Graphviz output before, perhaps in an academic paper or a database entity relationship diagram.

I was familiar with its flowchart-like diagrams of bubbles, boxes, and diamonds, but I thought I’d check to see if it could be coaxed into producing connected tables. I fiddled with a few examples from the gallery, read through some documentation, and finally found just the ticket.

HTML-ish

A relatively recent addition to Graphviz (circa 2003!) is the “HTML-Like Label.” This carefully named feature lets you specify node appearance using familiar HTML <table> syntax, with the addition of named ports where you can connect arrows.

The documentation scared me a little when it warned that the syntax is “not really HTML,” but it made up for that by providing a formal grammar of exactly what is accepted.
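The core of the idea fits in a few lines. As a minimal sketch (the node names Src and Dst and the port names id and key are made up for illustration, not taken from my real diagrams), here are two one-row tables with a named port each, joined by a single edge:

```dot
digraph {
    node [shape=plain];
    rankdir=LR;

    // Hypothetical nodes: each label is an HTML-like table with one named port
    Src [label=<
    <table border="0" cellborder="1" cellspacing="0">
      <tr><td port="id">id</td></tr>
    </table>>];

    Dst [label=<
    <table border="0" cellborder="1" cellspacing="0">
      <tr><td port="key">key</td></tr>
    </table>>];

    // node:port addresses a cell by its port name, not by its position
    Src:id -> Dst:key;
}
```

The edge statement is the only place the two tables meet, and it refers to cells purely by name.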

Here’s the source for the image at the top of the post:

digraph {
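    // pad adds a margin around the drawing; nodesep and ranksep set the
    // spacing between nodes within a rank and between ranks
    graph [pad="0.5", nodesep="0.5", ranksep="2"];
    // shape=plain draws no outline of its own, so each node is just its table
    node [shape=plain];
    // Lay ranks out left to right, so the inputs sit to the left of the output
    rankdir=LR;
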
Foo [label=<
<table border="0" cellborder="1" cellspacing="0">
  <tr><td><i>Input Foo</i></td></tr>
  <tr><td port="1">one</td></tr>
  <tr><td port="2">two</td></tr>
  <tr><td port="3">three</td></tr>
  <tr><td port="4">four</td></tr>
  <tr><td port="5">five</td></tr>
  <tr><td port="6">six</td></tr>
</table>>];


Bar [label=<
<table border="0" cellborder="1" cellspacing="0">
  <tr><td><i>Input Bar</i></td></tr>
  <tr><td port="7">seven</td></tr>
  <tr><td port="8">eight</td></tr>
  <tr><td port="9">nine</td></tr>
  <tr><td port="10">ten</td></tr>
</table>>];


Baz [label=<
<table border="0" cellborder="1" cellspacing="0">
  <tr><td><i>Output Baz</i></td></tr>
  <tr><td port="a">alpha</td></tr>
  <tr><td port="b">bravo</td></tr>
  <tr><td port="c">charlie</td></tr>
  <tr><td port="d">delta</td></tr>
  <tr><td port="e">echo</td></tr>
  <tr><td port="f">foxtrot</td></tr>
</table>>];

Foo:2 -> Baz:a;
Foo:3 -> Baz:e;
Foo:6 -> Baz:b;
Bar:7 -> Baz:d;
Bar:9 -> Baz:f;
}

It’s not beautiful, but it sure beats dragging endpoints around with a mouse.
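Because edges name ports rather than positions, rows can be shuffled without touching the edge list. Here is a trimmed-down sketch of that, reusing the Foo and Baz names from the source above but keeping only two rows of each, with the Baz rows deliberately listed in the opposite order. The :e and :w suffixes are DOT's optional compass points. They are not used in my original diagram, and I include them only as one way to pin arrow ends to the east or west side of a cell if the default attachment looks off.

```dot
digraph {
    graph [pad="0.5", nodesep="0.5", ranksep="2"];
    node [shape=plain];
    rankdir=LR;

    Foo [label=<
    <table border="0" cellborder="1" cellspacing="0">
      <tr><td><i>Input Foo</i></td></tr>
      <tr><td port="2">two</td></tr>
      <tr><td port="6">six</td></tr>
    </table>>];

    // Same port names as before, but bravo is now listed ahead of alpha
    Baz [label=<
    <table border="0" cellborder="1" cellspacing="0">
      <tr><td><i>Output Baz</i></td></tr>
      <tr><td port="b">bravo</td></tr>
      <tr><td port="a">alpha</td></tr>
    </table>>];

    // Same endpoints as in the full diagram; only the compass points are new:
    // :e leaves from the east side of a cell, :w arrives at the west side
    Foo:2:e -> Baz:a:w;
    Foo:6:e -> Baz:b:w;
}
```

Swapping the two Baz rows back changes where the arrows land, but not a single edge statement.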

Tooling

The primary tool in the Graphviz suite is dot, which reads a source file and emits one of several output types. I’m producing SVG like this:

dot input.gv -Tsvg -o out.svg

Next, wanting to iterate quickly, I set up a watcher to automatically recompile when the input file changes. (Thanks, Alex, for introducing me to entr):

ls *.gv | entr dot /_ -Tsvg -o out.svg

None of my SVG-viewing applications pick up the file change automatically, but it’s not too bad to tab over and refresh. One last stop before I start writing: syntax highlighting. I searched Visual Studio Code’s available extensions.

I wasn’t surprised at all by the first result, which provides syntax highlighting for this arcane language. I was surprised, however, by the second, which previews the output right there in the editor, as you type!

So there I was, nearly done cobbling together my own tooling to work on Graphviz diagrams, when I found a much smoother experience presented on a silver platter. A different developer contributed each piece of the puzzle, and I discovered and installed them in about three seconds. This is what I love about large-community projects like VS Code.

Old Software

It’s obviously not perfect, but Graphviz satisfied my criteria. When I skimmed through modern diagramming tools, nothing even came close! Do you know of any other tools that can do this kind of thing?

According to Wikipedia, Graphviz is 26 years old. This puts it right up there with venerable command-line utilities like curl and ImageMagick. What ancient software do you keep coming back to?