Hey, C Is a Functional Language Too!

So it turns out C is a functional language too!

On the way to Strange Loop this year, John Van Enk and I were trying to find a way to write some C code that avoided dynamic (malloc) allocation. We discovered a technique that allows you to forgo the use of malloc in many common cases. It also enables very pure functional C code.

You doubt? I shall demonstrate! I will show how you can write a linked list reversal function in C using:

  • No mutation!
  • Linked lists, with no malloc!

And this isn’t just a trick that only works in this special case, it is quite generally applicable.
Behold!


#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

/*
This is our integer linked list type (Null is an empty list)
(it is immutable, note the consts)
*/
typedef struct IntList_s const * const IntList;
static IntList const Nil = NULL; /* the empty list */

/*
This is a Cons element. (singly linked list element)
(note: also immutable)
*/
typedef struct IntList_s
{
  int32_t const value;
  IntList next;
} Cons;

/*
This is a pointer to where we are going to store the the "result" of our
 computation. We won't really know the type of it until we have our whole 
 chain of calls built up, so it's void * for now. 
 I think if I did this in C++ I could use templates to make this
 statically typed.
*/
typedef void * const CPS_Result;

/* prototypes and typedefs for the continuations */
typedef void (*MakeListCallback)(IntList list, CPS_Result result);
void make_list( uint32_t const array_size
              , int32_t const array[]
              , IntList new_list
              , MakeListCallback const callback
              , CPS_Result result);

void reverse_and_stuff(IntList list, CPS_Result result);
void stuff_list(IntList list, CPS_Result result);

typedef void (*ReversedListCallback)( IntList reversed_list
                                    , CPS_Result result);
void reverse( IntList list
            , IntList reversed_list
            , ReversedListCallback const callback
            , CPS_Result result);

typedef void(*VoidMappable)(int32_t const value);
void void_map_array( VoidMappable const f
                   , uint32_t const size
                   , int32_t const * const array);
void print_value(int32_t const value);


/*
Define an array to reverse, and an array to store the result, and 
 then do stuff
*/
int main(int argc, char * argv[])
{
  (void)argc;
  (void)argv;
  
  int32_t my_array[] = { 2, 5, 6, 1, 9, 23, 7654, 12, 0, 4};
  uint32_t const my_array_size = sizeof(my_array)/sizeof(my_array[0]);
  
  int32_t result[my_array_size];
  /* call make_list and pass reverse_and_stuff as the "continuation".
     The action to perform next. */
  make_list(my_array_size, my_array, Nil, reverse_and_stuff, result);

  /* print out our reversed array */
  void_map_array(print_value, my_array_size, result);
  printf("\n");
  return 0;
}


/* constructs a linked list from an array */
void make_list( uint32_t const array_size
              , int32_t const array[]
              , IntList new_list
              , MakeListCallback const callback
              , CPS_Result result)
{
  if (array_size > 0)
  {
    Cons cell = { .value = array[array_size - 1], .next = new_list };
    make_list(array_size - 1, array, &cell, callback, result);
  }
  else
  {
    /* call our "continuation" with our result */
    callback(new_list, result);
  }
}

/* function that reverses a list and then stores it in an array */
void reverse_and_stuff(IntList list, CPS_Result result)
{
  reverse(list, Nil, stuff_list, result);
}

/* stuffs a linked list into an array */
void stuff_list(IntList list, CPS_Result result)
{
  if (Nil != list)
  {
    int32_t * array = result;
    array[0] = list->value;
    stuff_list(list->next, array + 1);
  }
}

void reverse( IntList list
            , IntList reversed_list
            , ReversedListCallback const callback
            , CPS_Result result)
{
  if (Nil == list)
  {
    callback(reversed_list, result);
  }
  else
  {
    /* build up the reversed list */
    Cons cell = { .value = list->value, .next = reversed_list };
    reverse(list->next, &cell, callback, result);
  }
}

/*
"loops" over an array and performs action f on each element
*/
void void_map_array( VoidMappable const f
                   , uint32_t const size
                   , int32_t const * const array)
{
  if (size > 0)
  {
    f(array[0]);
    void_map_array(f, size - 1, array + 1);
  }
}


void print_value(int32_t const value)
{
  printf("%d ", value);
}

This style actually has a special name: Continuation Passing Style. If you’ve worked with node.js it should look familiar.

The trick is to allocate values on the stack and then pass pointers to them to the next “continuation” to build up your results. This lets you get away with a functional style in C without the need to call malloc and free (malloc and free pretty much ruin everything).

As long as your intermediate values don’t get so big that you blow the stack, you should be good. The main limitation is that you need to know the size of the return value.

Would this ever actually be useful?

While I find this style strangely addictive, I don’t think I would advocate its general use. However it has a few interesting advantages that might be handy in certain situations:

  • Fixed Memory Bounds – Since everything is allocated on the stack. If we can prove that our functions terminate (say we show that a function is structurally recursive ). We get a proof of finite memory use with it!
  • Better Function Composition – I was kinda surprised by this, but I guess I shouldn’t have been. The code is a bit more ugly, but it’s definitely easier to build large things up using smaller reusable pieces.
  • No Malloc – Almost non-existent allocation overhead.

So what does “functional language” mean, then?

Almost every language has a functional subset. And it seems pretty silly to call a language “functional” just because it has a functional subset.

I think it would be much more meaningful if we described a language as “functional” only if it encourages a functional style. A functional style is a style that emphasizes“values” over states or “places”, expressions over statements, recursion over mutation.

So my title is misleading. I don’t think C is a functional language. But it’s an awful lot of fun (and sometimes very useful) to use C’s functional subset.

 

Conversation
  • Andrew says:

    Posts like this are why I love this blog :) Will definitely have to come back and take a deeper dive into the code.

  • Artyom Shalkhakov says:

    By the way, this style of programming is supported in ATS (www.ats-lang.org).

    Here is an example:

    http://ats-lang.svn.sourceforge.net/viewvc/ats-lang/trunk/utils/scripts/pkgconfig.dats?revision=2990&view=markup

    (starting at line 100)

    Please note that this program is memory- and type-safe, and involves dynamic memory allocation only to return result to the caller. It is certainly possible, I think, to reimplement this code in CPS as in your example, while still retaining safety.

    Your example can be translated to ATS, too (also retaining safety).

  • Bobby Newmark says:

    OK, I’ll be that guy. The code is cut off horizontally and it makes it really annoying to read. Yes, I could copy and paste it, but I shouldn’t have to.

    • Job Vranish Job Vranish says:

      You’re absolutely right. 130+ column lines… I’m so ashamed D:
      Fixed.

  • Nice. This is one half of a generational garbage-collection technique called Cheney-on-the-MTA, invented by Henry Baker: http://www.pipeline.com/~hbaker1/CheneyMTA.html and used in the Chicken Scheme compiler: http://www.call-cc.org/ in order to simultaneously (1) target C code (2) have a good C-FFI (3) support garbage-collection and (4) support proper tail calls.

  • This trick has been used in the Linux kernel for singly and doubly linked lists for ages. Check out list.h and its attendant uses for examples of its use all around the kernel code.

  • Richo says:

    Stating broadly that “malloc(2) and free(2) break everything” is about as counterproductive as you can be.

    At this point you’re writing the antithesis of C, you’re taking style over function.

    More specifically, if you’re writing anything sizable, and allocating everything on the stack you’re going to end up in a world of pain, not to mention callback hell.

    For a more useful example, do exactly the same thing, but allocate your memory on the heap at the stop of your stack and then pass those pointers through.

  • Matt Wilbur says:

    Another good one Job. How’s the AE embedded language coming?

  • […] Hey, C Is a Functional Language Too! mah, dubbi; però il sito è bello e loro sono cool ::: Atomic Spin […]

  • Ron Lewis says:

    I’m thinking malloc() and free() do indeed ruin everything as far as memory leaks go. I have yet to figure out some way you have a memory leak without heap functions malloc(), calloc(), realloc(), and any others, if any.

    I see two uses for malloc():
    1. to return some kinds of data from a function.
    2. to allocate a variable amount of memory.

    I see how to eliminate #1. Instead of using a malloc() inside the function, make the calling function responsible for providing the memory. Then a point to that memory is passed as an argument to the called function. This solution might look like this:

    int *called_function(int *result)
    {
    // … some useful code …
    return result;
    }

    int a[5];
    int *b;
    void calling_function(void)
    {
    int c[] = {1, 2, 3, 0};
    int *d;

    b = called_function(&a);
    d = called_function(&c);
    }

    And, of course, the calling function could use malloc() if desired and is also responsible for calling free().

    I have not figured out how to eliminate #2, allocating a variable amount of memory. Allocating the max memory for everything is not good enough because the program might run on different systems with different amounts of memory available. And you might need to reuse memory. Allocating on the stack allows reuse of memory.

    Reading and writing to files might eliminate #2 but might not be practical because file i/o slows the program.

    Are there other uses for malloc() other than #1 and #2? Can we have memory leaks even without malloc()?

    • Ben Goldberg says:

      Memory leaks without malloc are easy. Just use sbrk.

  • Comments are closed.