Recently another developer was setting up a new laptop. This usually goes fairly smoothly, but this time we ran into a glitch. Bundler just hung. We were rather frustrated.
If you really want to skip to the end, the bottom line is: don’t try to build OS X C libraries on a FAT file system. It doesn’t seem surprising that we’d run into trouble trying to do that, but it was a bit of a long road to dig down into that. Here are the gory details.
We had a fairly straightforward setup: a git repo, a gemfile, some Ruby code — nothing too fancy. But running bundle install
just left us sitting at:
] bundle install
Fetching gem metadata from https://rubygems.org/.......
Resolving dependencies...
Installing memcached (1.7.2)
I started out in the usual spot: looking at the log files. The mkmf.log file didn’t show anything interesting. Next, I pulled up pstree to look at what was actually happening:
-+= 13962 jrandomdev /Users/jrandomdev/.rbenv/versions/2.0.0-p195/bin/ruby /Users/jrandomdev/.rbenv/versions/2.0.0-p195/bin/bundle install
\-+- 14132 jrandomdev sh -c /Users/jrandomdev/.rbenv/versions/2.0.0-p195/bin/ruby extconf.rb 2>&1
\-+- 14133 jrandomdev /Users/jrandomdev/.rbenv/versions/2.0.0-p195/bin/ruby extconf.rb
\-+- 16916 jrandomdev ruby ../extconf-make.rb
\-+- 16917 jrandomdev sh -c make CXXFLAGS=' -O3 -fno-fast-math -ggdb3 -Wall -Wextra -Wno-unused-parameter -Wno-parentheses -Wno-long-long -Wno-missing-field-initializers -Wunused-variable -Wpointer-arith -Wwrite-strings -Wdeclaration-after-statement -Wshorten-64-to-32 -Wimplicit-function-declaration -std=gnu++98 -O3 -Wno-error=shorten-64-to-32 -pipe -O3 -Wno-error=shorten-64-to-32 -pipe' || true 2>&1
\-+- 16918 jrandomdev /Library/Developer/CommandLineTools/usr/bin/make CXXFLAGS= -O3 -fno-fast-math -ggdb3 -Wall -Wextra -Wno-unused-parameter -Wno-parentheses -Wno-long-long -Wno-missing-field-initializers -Wunused-variable -Wpointer-arith -Wwrite-strings -Wdeclaration-after-statement -Wshorten-64-to-32 -Wimplicit-function-declaration -std=gnu++98 -O3 -Wno-error=shorten-64-to-32 -pipe -O3 -Wno-error=shorten-64-to-32 -pipe
\-+- 16919 jrandomdev /Library/Developer/CommandLineTools/usr/bin/make all-recursive
\-+- 16920 jrandomdev /bin/sh -c fail=; if (target_option=k; case ${target_option-} in ?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target option '${target_option-}' specified" >&2; exit 1;; esac; has_opt=no; sane_makeflags=$MAKEFLAGS; if test -n ' Makefile' && test -n '1'; then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \^I]*) bs=\\; sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs^I]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; done; test $has_opt = yes); then failcom='fail=yes'; else failcom='exit 1'; fi; dot_seen=no; target=`echo all-recursive | sed s/-recursive//`; case "all-recursive" in distclean-* | maintainer-clean-*) list='libmemcached libmemcachedutil support clients tests' ;; *) list='libmemcached libmemcachedutil support clients tests' ;; esac; for subdir in $list; do echo "Making $target in $subdir"; if test "$subdir" = "."; then dot_seen=yes; local_target="$target-am"; else local_target="$target"; fi; (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && /Library/Developer/CommandLineTools/usr/bin/make $local_target) || eval $failcom; done; if test "$dot_seen" = "no"; then /Library/Developer/CommandLineTools/usr/bin/make "$target-am" || exit 1; fi; test -z "$fail"
\-+- 16925 jrandomdev /bin/sh -c fail=; if (target_option=k; case ${target_option-} in ?) ;; *) echo "am__make_running_with_option: internal error: invalid" "target option '${target_option-}' specified" >&;2; exit 1;; esac; has_opt=no; sane_makeflags=$MAKEFLAGS; if test -n ' Makefile' && test -n '1'; then sane_makeflags=$MFLAGS; else case $MAKEFLAGS in *\\[\ \^I]*) bs=\\; sane_makeflags=`printf '%s\n' "$MAKEFLAGS" | sed "s/$bs$bs[$bs $bs^I]*//g"`;; esac; fi; skip_next=no; strip_trailopt () { flg=`printf '%s\n' "$flg" | sed "s/$1.*$//"`; }; for flg in $sane_makeflags; do test $skip_next = yes && { skip_next=no; continue; }; case $flg in *=*|--*) continue;; -*I) strip_trailopt 'I'; skip_next=yes;; -*I?*) strip_trailopt 'I';; -*O) strip_trailopt 'O'; skip_next=yes;; -*O?*) strip_trailopt 'O';; -*l) strip_trailopt 'l'; skip_next=yes;; -*l?*) strip_trailopt 'l';; -[dEDm]) skip_next=yes;; -[JT]) skip_next=yes;; esac; case $flg in *$target_option*) has_opt=yes; break;; esac; done; test $has_opt = yes); then failcom='fail=yes'; else failcom='exit 1'; fi; dot_seen=no; target=`echo all-recursive | sed s/-recursive//`; case "all-recursive" in distclean-* | maintainer-clean-*) list='libmemcached libmemcachedutil support clients tests' ;; *) list='libmemcached libmemcachedutil support clients tests' ;; esac; for subdir in $list; do echo "Making $target in $subdir"; if test "$subdir" = "."; then dot_seen=yes; local_target="$target-am"; else local_target="$target"; fi; (CDPATH="${ZSH_VERSION+.}:" && cd $subdir && /Library/Developer/CommandLineTools/usr/bin/make $local_target) || eval $failcom; done; if test "$dot_seen" = "no"; then /Library/Developer/CommandLineTools/usr/bin/make "$target-am" || exit 1; fi; test -z "$fail"
\-+- 16926 jrandomdev /Library/Developer/CommandLineTools/usr/bin/make all
\-+- 16927 jrandomdev /Library/Developer/CommandLineTools/usr/bin/make all-am
\-+- 17413 jrandomdev /bin/sh ../libtool --tag=CC --mode=link clang -std=gnu99 -O3 -pedantic -Wall -Wextra -Wundef -Wshadow -fdiagnostics-show-option -fvisibility=hidden -Wformat=2 -Wconversion -Wstrict-aliasing -Wstrict-prototypes -Wmissing-prototypes -Wredundant-decls -Wmissing-declarations -Wcast-align -Wswitch-default -Wswitch-enum -Wwrite-strings -Wlogical-op -D_THREAD_SAFE -fPIC -O3 -Wno-error=shorten-64-to-32 -pipe -O3 -Wno-error=shorten-64-to-32 -pipe -version-info 2:0:0 -lsasl2 -fPIC -L. -L/Users/jrandomdev/.rbenv/versions/2.0.0-p195/lib -fstack-protector -L. -L/Users/jrandomdev/.rbenv/versions/2.0.0-p195/lib -fstack-protector -L/Users/jrandomdev/.rbenv/versions/2.0.0-p195/lib -o libmemcached.la -rpath /Volumes/repos/test1/vendor/gems/ruby/2.0.0/gems/memcached-1.7.2/ext/lib libmemcached_la-crc.lo libmemcached_la-memcached.lo libmemcached_la-memcached_auto.lo libmemcached_la-memcached_analyze.lo libmemcached_la-memcached_behavior.lo libmemcached_la-memcached_connect.lo libmemcached_la-memcached_delete.lo libmemcached_la-memcached_do.lo libmemcached_la-memcached_dump.lo libmemcached_la-memcached_exist.lo libmemcached_la-memcached_fetch.lo libmemcached_la-memcached_flush.lo libmemcached_la-memcached_get.lo libmemcached_la-memcached_touch.lo libmemcached_la-memcached_hash.lo libmemcached_la-memcached_hosts.lo libmemcached_la-memcached_io.lo libmemcached_la-memcached_purge.lo libmemcached_la-memcached_flush_buffers.lo libmemcached_la-md5.lo libmemcached_la-memcached_key.lo libmemcached_la-memcached_quit.lo libmemcached_la-memcached_parse.lo libmemcached_la-memcached_response.lo libmemcached_la-memcached_result.lo libmemcached_la-memcached_server.lo libmemcached_la-memcached_storage.lo libmemcached_la-memcached_string.lo libmemcached_la-memcached_stats.lo libmemcached_la-memcached_strerror.lo libmemcached_la-memcached_verbosity.lo libmemcached_la-memcached_version.lo libmemcached_la-murmur_hash.lo libmemcached_la-jenkins_hash.lo libmemcached_la-memcached_allocators.lo libmemcached_la-byteorder.lo libmemcached_la-memcached_sasl.lo libmemcachedcallbacks.la
\--- 18718 jrandomdev sleep 2
Hmm, that’s strange. I mean, I suppose it makes sense that it’s sleeping, since it clearly isn’t *doing* anything. But *why*?!
Some digging found “this bug report”:https://lists.gnu.org/archive/html/bug-libtool/2013-06/msg00001.html titled “bug#14579: locking breaks across filesystems”. We were on an encrypted filesystem, so maybe that’s what’s happening. I then tried to duplicated it by creating a TrueCrypt volume on my machine and seeing if it broke in the same way. It didn’t. But then another developer tried the same thing and was able to replicate the problem. Now I had an example that worked and one that didn’t; what was the difference?
After I watched the other dev walk through it again, I noticed something that we did differently: he selected FAT as the filesystem. That is TrueCrypt’s default, so he just clicked through it. Based on that bug report, which described how libtool uses symlinks to create lock files, I realized that the filesystem was probably the culprit: FAT doesn’t support symlinks.
This one was a frustrating one to track down because it’s so specific to that use case. Not all gems have native components, so those all worked fine. And not all gems that have native components require libtool to build, so you can work successfully on a FAT-formatted TrueCrypt volume for a long time before you start running into these issues.
As frustrating as these are to diagnose, these are my favorite sorts of bugs to talk about afterwards: the kind where the bizarre symptoms make absolutely no sense at first, but turn out to follow logically from the root cause once you’ve figured that out. What crazy bugs have you found?