FFI: Foreign Function Interfaces for Fun & Industry

February 15, 2013

Article summary

What Is FFI?
Why Use FFI?
When Not To Use FFI
Making a C Library FFI Friendly
Reaching Beyond C
Wrapping Up

I love writing code in high-level languages, like Ruby. But a lot of cool and useful libraries are written in lower-level languages, like C. Alas! What is a programmer to do?

There are three common approaches to solving this conundrum:

Port all or part of the library to your language of choice.
Write an extension in C code to bridge the gap between the library and your language.
Wrap the library using your language’s foreign function interface (FFI) support.

I used each of these three approaches over the 7-year history of developing Rubygame. I began with a combination of Ruby port and C extension, then migrated to a combination of Ruby port and FFI wrapper using Ruby-FFI. And a few years ago, fellow Atom Shawn Anderson wrote about his experience creating a Ruby-FFI wrapper for the Chipmunk game physics engine.

Based on our experience, we have to say: FFI is awesome. An FFI wrapper is much easier to write and maintain than a C extension, more portable across platforms and language implementations, and easier for users to install.

Although most of my FFI experience is with Ruby, many other languages offer some form of FFI support: Common Lisp, Haskell, Lua, Python, Perl, and Java, to name just a few. Many of the same benefits (and limitations) of FFI that I’ll be describing in this post apply just as much to other languages as they do to Ruby.

What Is FFI?

FFI refers to the ability of code written in one language (the “host language,” such as Ruby) to access and invoke functions written in another language (the “guest language,” such as C). The term “foreign” refers to the fact that the functions come from another language and environment.

Depending on the language and its FFI support, you might also be able to access global named variables, automatically convert data types between the host and guest languages, and have code in the guest language invoke functions in the host language as callbacks.

In interpreted languages like Ruby, it’s usually not possible to use a library’s compile-time features like C preprocessor macros and constants (i.e. things #define’d in the library headers). This is because the FFI accesses the library’s binary code (e.g. its .so, .dylib, or .dll) directly, without compiling any code.

However, the FFI support in some compiled languages works by compiling down to C code; in these cases, you may be able to use compile-time features. It all depends on the language and how it implements FFI.

Why Use FFI?

As I mentioned earlier, using FFI has many benefits over writing extensions in C. When I migrated Rubygame from an extension written in C to a wrapper using Ruby-FFI, the most significant benefits I observed were:

It’s much easier to write and maintain your code, because it’s written in your language of choice and can leverage the language’s high-level features and metaprogramming to abstract the wrapper code.
It’s easier for users to install, because they don’t need a C compiler or the development headers for the library you are wrapping. This is especially important if you want your library to be accessible to Windows and MacOS X users, because those platforms don’t have a C compiler installed by default.
In the case of Ruby, FFI code is more portable to other implementations than extensions written in C. For example, code using Ruby-FFI can run just as well on MRI (the “usual” Ruby implementation), JRuby, and Rubinius with no modifications.

Whatever your language of choice, if you need to interface with a library written in C, I highly recommend trying out FFI first.

When Not To Use FFI

As amazing as FFI is, it’s not always the right tool for the job. For example, a C extension may be a better approach in any of these scenarios:

You need to implement your own low-level or highly-optimized code. Ideally, the functions in the C library you are wrapping will do most of the heavy lifting, but if you need to write some custom code to directly process huge arrays of numerical or binary data, you might need to write code in C or another lower-level language to get the performance you want.
You need to perform some delicate callbacks from the guest language into the host language. Although it’s sometimes possible (depending on the host language’s FFI support) to perform callbacks, some kinds of complex callback function signatures can be quite tricky to satisfy through FFI.
The library makes heavy use of compile-time or preprocessor features, such as C macros. In the case of simple macros, you may be able to reimplement their behavior as functions in your language of choice. But if the library does some serious macro-fu, you might be better off just writing a C extension.

Constants created using #define can also be a slight nuisance, since most FFI systems will not be able to see them. You can re-define the constants in your own code, but be aware that if the library changes the values of those constants, you’ll need to update your code as well. This can make supporting multiple library versions a tricky endeavor — especially if the library’s version number itself is defined only as a preprocessor constant!

Making a C Library FFI Friendly

Looking at it from another angle, here’s a checklist for making a C library FFI friendly. It’s not hard, and it drastically increases the potential user base of the library.

Always, always, always export the library’s version number(s) as either global const variables (e.g. three const uint8_t variables) or as a function that returns the version number(s) in a struct. It’s fine to also #define version numbers in the library headers, but #define alone is not sufficient.
Either export important constants as global const variables (perhaps in addition to using enum or #define), or publicly document their values and make a commitment to announce any future changes well in advance.
Provide function equivalents of any important preprocessor macros, and/or clearly document the purpose and functionality of each important macro, so that it can be reimplemented in another language.
Keep callback signatures simple, shallow, and well-documented. This is especially important if callbacks are necessary to use the library’s core functionality.
For libraries written in C++, consider also offering a C-compatible interface (using extern "C" { ... }), at least for the core functionality of the library. When C++ libraries are compiled, function and method names are “mangled” in compiler-specific ways that are not entirely predictable, making it tricky to wrap them with FFI.

It’s worth noting that, besides making the library FFI friendly, these are all just general good practices that make the library more stable, mature, and accessible.

Reaching Beyond C

Many languages provide a C library that allows developers to interface with that language — for example, libobjc for Objective-C, liblua for Lua, and the Java Native Interface (JNI). Such libraries are typically used within programs written in C or C++, but if you’re feeling clever — and more than a bit foolhardy — you can actually use FFI to build a bridge from Ruby into the other language!

I used this technique in Rubygame to interface with Objective-C so I could make calls into MacOS X’s Cocoa UI framework. Here’s a barebones example of using Ruby-FFI to interface with Objective-C on MacOS X:

# Load the 'ffi' gem.
require 'ffi'

module ObjC
  extend FFI::Library

  # Load the 'libobjc' library.
  ffi_lib 'objc'

  # Bind the 'sel_registerName' function, which accepts a String and
  # returns an equivalent Objective-C selector (i.e. message name).
  attach_function :sel_registerName, [:string], :pointer

  # Bind the 'objc_msgSend' function, which sends a message to an
  # Objective-C object. It accepts a pointer to the object being sent
  # the message, a pointer to a selector, and a varargs array of
  # arguments to be sent with the message. It returns a pointer to the
  # result of sending the message.
  attach_function :objc_msgSend, [:pointer, :pointer, :varargs], :pointer

  # A convenience method using objc_msgSend and sel_registerName to easily
  # send Objective-C messages from Ruby.
  def self.msgSend( id, selector, *args )
    selector = sel_registerName(selector) if selector.is_a? String
    return objc_msgSend( id, selector, *args )
  end

  # Bind the 'objc_getClass' function, which accepts the name of an
  # Objective-C class, and returns a pointer to that class.
  attach_function :objc_getClass, [:string], :pointer
end


module Cocoa
  extend FFI::Library

  # Load the Cocoa framework's binary code
  ffi_lib '/System/Library/Frameworks/Cocoa.framework/Cocoa'

  # Needed to properly set up the Objective-C environment.
  attach_function :NSApplicationLoad, [], :bool
  NSApplicationLoad()

  # Accepts a Ruby String and creates an equivalent NSString instance
  # and returns a pointer to it.
  def self.String_to_NSString( string )
    nsstring_class = ObjC.objc_getClass("NSString")
    ObjC.msgSend( nsstring_class, "stringWithUTF8String:",
                  :string, string )
  end

  # Accepts a pointer to an NSString object, and returns the string
  # contents as a Ruby String.
  def self.NSString_to_String( nsstring_pointer )
    c_string_pointer = ObjC.msgSend( nsstring_pointer, "UTF8String" )
    if c_string_pointer.null?
      return "(NULL)"
    else
      return c_string_pointer.read_string()
    end
  end
end


# Create a new empty NSMutableArray instance
nsmutablearray_class = ObjC.objc_getClass("NSMutableArray")
array = ObjC.msgSend( nsmutablearray_class, "array" )

# Add two NSString objects to the array
%w( Hello World ).each do |string|
  ObjC.msgSend( array, "addObject:",
                :pointer, Cocoa.String_to_NSString(string) )
end

# Print the array's description (analogous to Ruby's "inspect")
description = ObjC.msgSend( array, "description" )
puts Cocoa.NSString_to_String( description )

# The console output when this script is run:
#
#   (
#       Hello,
#       World
#   )

As you can see, connecting Ruby ↔ C ↔ Objective-C requires a significant amount of glue code. After all, you’re dealing with three different systems for objects/data and functions/methods. But, it can be done! And you can actually make it quite manageable and easy to use by investing in some useful abstractions, as I did for the real Rubygame code.

Of course, if you want to write Ruby code that interfaces with Objective-C, it’s probably easier to use MacRuby (or RubyMotion for iOS development). Likewise, JRuby is a solid platform for interfacing with Java code. But FFI can still be useful to provide broader platform support, or just for the intellectual challenge. (My code actually allowed JRuby code to interface with Cocoa. That’s a chain of Ruby ↔ Java Virtual Machine ↔ C ↔ Objective-C!)

Wrapping Up

As you can see, FFI can be a powerful tool in Ruby and other languages. If you need to wrap a library in another language, I would encourage you to reach for FFI first, and only resort to a C extension if some special circumstance makes it a necessity. Your users will thank you, and so will your future self as you maintain your code.

Conversation

Pankaj Thaulia says:

March 27, 2013

Hi John,

Thanks for an informative tutorial!

Query: Can we use FFI to call the Java methods from the JAR’ed classes (wrapper.jar)?

Details:
I am already using FFI to access a bunch of C DLLs (or .so on Linux) into my Cucumber-Ruby framework to test the APIs. Now, there’s a Java Wrapper built on top of those C libraries for Android Application Development. So my challenge is to test that Java Wrapper using my Cucumber-Ruby framework. Is it possible with FFI, just like for C libraries?

Note: I am still doing the R&D on that, and haven’t yet tried calling or including the “wrapper.jar” in my Ruby code.

I would appreciate your help/suggestions on this.

Thanks
Pankaj Thaulia.

John Croisant says:

March 28, 2013

Hi, Pankaj. I’m glad you enjoyed this post!

I’m afraid I don’t have experience interacting with Java via FFI, so I’m not the best person to answer your question. But, two things I would suggest trying are:

1. Use JRuby for this task, if possible. JRuby can interact with Java libraries directly, so testing the wrapper should be quite easy.

2. If using JRuby is not possible, research the Java Native Interface (JNI). The JNI is what you would use to interact with Java from C, so it may be possible to wrap the JNI using Ruby FFI, and use that as a bridge to interact with Java. This would be much, much harder than using JRuby, though!

Good luck, and let me know how it goes. :)

Arseni says:

July 1, 2013

Hello, John!

As you’ve used FFI in a real-world task, can you say something about performance and memory management? Is it worth it?
I mean, if I need to process some data using external C code or an external program, would I benefit from FFI compared to opening an IO connection to the same program?
I’m a bit lost as to whether I need to switch to C, or just wrap the programs I use as a DLL, to make my code fast enough and still maintainable using Ruby.

John Croisant says:

July 8, 2013

Hi, Arseni.

I have not compared the performance of FFI with a DLL versus IO with an external process, so I can’t really give you a definite answer. I’m sure there are trade-offs that depend on a number of factors, such as:

* Whether the DLL or external program already exists, or if you are starting from scratch. For this factor, I would favor FFI if the DLL exists or you are starting from scratch. If the program exists and has a well-defined IO interface, using that is probably an easier path.

* How much overhead there is in parsing the output of the program, and in formatting the input sent to the program. If it is a complex format (e.g. XML), then it may be more efficient to access the data structures directly via FFI (if that is possible). If the input/output format is something simple like a comma- or newline-separated list, IO is probably more efficient. This factor becomes more important depending on how much data is being passed back and forth over IO.

* The method used to communicate with the program. For example, Unix sockets are quite efficient in terms of performance, but somewhat awkward to use in code, and not available on all platforms. TCP/IP or HTTP connections will be less efficient in performance, but easier to use in code due to the libraries available. Like before, this factor becomes more important as the amount of IO increases.

So, in the end it is a judgement call based on the specific scenario. The performance of FFI is pretty good, so it usually won’t be the most important factor in the decision.

The most reliable way to tell which is better, of course, is to implement in both ways and compare! :)

