Generic Memoization in C#

I have seen some very nice, generic forms of memoization in the dynamic languages I’ve used. In languages like Ruby and Perl, for example, dynamically redefining a method to be a memoized version of itself is a good way to transparently handle it. However, I haven’t seen any examples of generic case memoization for C# methods that I’ve been happy with. So I took a stab at it.

First, let’s review what memoization is. Simply put, memoization is caching the results of a function so that the potentially expensive calculations don’t have to be repeated. We’re trading space in memory for speed. Depending on how the memoized function is used, and the characteristics of the function, the result can range from a slight to a dramatic increase in performance. See “the Wikipedia article on the topic”:http://en.wikipedia.org/wiki/Memoization if you’d like more details.

As with any type of caching, there are some constraints. First of all, your function must have no side effects. It also must always return the same value when given the same input since the input will be the keys to the cached results. I’ve imposed a few additional constraints for this implementation:

* All input objects must have sane implementations of GetHashCode.
* Memoized methods on a class will be properties of type Func<>.

Regarding the second point above, I’m not sure how I feel about it but there were a few reasons I chose this method. First of all, it means the memoized function is not redefined on each call. Also, the field initializer gets run on every object instantiation so it handles static as well as non-static methods properly. All this while keeping normal-looking method call syntax for those using the method. It does cause IntelliSense to behave a little differently, because it is a property. I’m interested to hear if other people have a better way of handling this. As a result of my decision, my code for declaring a memoized method is as follows:

public Func<string, string> WithOneInput = Memoizer.Memoize(
    (string x) =>
        {
            return String.Format("Hello, {0}", x);
        }
);

Now for the code that implements memoization. There are three interesting cases to tackle — they are the usual 0, 1, and 2 or more.

h2. No input

Memoization with no input is caching only a single value, but still worthwhile if the value is costly to compute.

public static Func<TReturn> Memoize<TReturn>(Func<TReturn> func)
{
    object cache = null;
    return () =>
    {
        if (cache == null)
            cache = func();
        return (TReturn)cache;
    };
}

h2. One input

When we have only one input value, we can use the input object itself as the key into the dictionary cache.

public static Func<TSource, TReturn> Memoize<TSource, TReturn>(Func<TSource, TReturn> func)
{
    var cache = new Dictionary<TSource, TReturn>();
    return s =>
    {
        if (!cache.ContainsKey(s))
        {
            cache[s] = func(s);
        }
        return cache[s];
    };
}

h2. More than one input

When we have more than one input value, we have to compute a reasonable key for the dictionary. I opted to use the concatenation of the object hash codes as that seems like a reasonable approach. If there’s a better way I’d like to hear your input.

public static Func<TSource1, TSource2, TReturn> Memoize<TSource1, TSource2, TReturn>(Func<TSource1, TSource2, TReturn> func)
{
    var cache = new Dictionary<string, TReturn>();
    return (s1, s2) =>
    {
        var key = s1.GetHashCode().ToString() + s2.GetHashCode().ToString();
        if (!cache.ContainsKey(key))
        {
            cache[key] = func(s1, s2);
        }
        return cache[key];
    };
}

h2. See the entire example on github

The “entire example is hosted on github”:https://github.com/atomicobject/CSharpMemoizer for those interested in seeing it in its entirety.

 
Conversation
  • Pete Johanson says:

    For the 2+ input parameters case, you may be able to use System.Tuple. I would be surprised if System.Tuple didn’t have a sane Equals/GetHashCode implementation which would allow you to simply create a new tuple and then use it for the keys of your dictionary cache.

    Also, doing so would avoid the (however unlikely) case where the concatenation of the keys causes a collision between different inputs. A System.Tuple with a sane Equals method would still properly work even with a hashcode collision since Dictionary uses Equals as a fallback to check uniqueness when there is such a collision.

  • Ron Jeffries says:

    Very interesting, and clear, thanks.

    Seems to me that the 2+ case can fail with a hash collision, which is not impossible.

    I’m not clear on the value of creating the generalized Memoizer, but suppose it might be there. And I’d like to see some more general way of supporting 0-n or at least 2-n, rather than needing to generate all the cases manually, which seems to be the case now.

  • Comments are closed.