Sunday, November 27, 2016

The Performance Cost of Boxing in .NET

I recently had to do some performance optimizations against a sorted dictionary that yielded some interesting results...

Background: I am used to using Tuples a lot, simply because they are easy to use and normally quite efficient. Please remember that Tuples were changed from structs to classes back in .NET 4.0.

Problem: A struct decreased performance!

I had a SortedDictionary that was using a Tuple as a key, so I thought "hey, I'll just change that tuple to a struct and reduce the memory usage." ...bad news, that made performance WORSE!

Why would using a struct make performance worse? It's actually quite simple and obvious when you think about it: it was causing comparisons to repeatedly box the primitive data structure, thus allocating more memory on the heap and triggering more garbage collections.

Solution: Use a struct with an IComparer.

I then created a custom struct and used that; it was must faster, but it was still causing boxing because of the non-generic IComparable interface. So finally I added a generic IComparer and passed that into my dictionary constructor; my dictionary then ran fast and efficient, causing a total of ZERO garbage collections!

See for yourself:

The Moral of the Story

Try to be aware of what default implementations are doing, and always remember that boxing to object can add up fast. Also, pay attention to the Visual Studio Diagnostics Tools window; it can be very informative!

Here is how many lines of code it took to achieve a 5x performance increase:

private struct MyStruct
{
    public MyStruct(int i, string s) { I = i; S = s; }
    public readonly int I;
    public readonly string S;
}
 
private class MyStructComparer : IComparer<MyStruct>
{
    public int Compare(MyStruct x, MyStruct y)
    {
        var c = x.I.CompareTo(y.I);
        return c != 0 ? c : StringComparer.Ordinal.Compare(x.S, y.S);
    }
}

Test Program

I have written some detailed comments in the Main function about what each test is doing and how it will affect performance. Let's take a look...

public static class Program
{
    public static void Main(string[] args)
    {
        // This is going to use the class implementation
        // of Tuple that was changed in .NET 4.0. It is
        // be inefficient for three reasons:
        // 1) It is allocating a new class on the heap 
        //    for each and every key.
        // 2) It is using the default object comparer
        //    to check each item, causing inefficient 
        //    comparisons and...
        // 3) Causing boxing of primitive values.
        ClassTupleAsKey();
        Reset();
        // DURATION: 8,479 miliseconds
 
        // This is going to use the struct implementation
        // of Tuple that was available from .NET 2.0
        // until 4.0. It is even less efficiente than the
        // class tuple for both reasons listed above, and 
        // one new reason:
        // 4) The StructTuple itself will be boxed when
        //    being passed between comparers, causing
        //    even more garbage collections to occur!
        StructTupleAsKey();
        Reset();
        // DURATION: 8,880 miliseconds
 
        // Now we move to a custom struct implementation
        // for the dictionary key, but because we are using
        // a SortedDictionary we still need to implement
        // IComparable, which is not generic, and thus will
        // be slow for one reason:
        // 1) The struct itself will be boxed when being
        //    passed between comparers.
        MyStructAsKey();
        Reset();
        // DURATION: 2,896 miliseconds
 
        // This is pretty much as fast as we can get; no
        // heap allocations, no boxing, fast comparisons.
        MyStructAsKeyWithCompare();
        // DURATION: 1,442 miliseconds
    }
 
    private static void Reset()
    {
        GC.Collect();
        Thread.Sleep(500);
    }
 
    private static void ClassTupleAsKey()
    {
        var dictionary = new SortedDictionary<ClassTuple<int, string>, int>();
        var sw = Stopwatch.StartNew();
 
        for (var i = 0; i < 1000000; i++)
        {
            var key = new ClassTuple<int, string>(i % 1000, i.ToString());
            dictionary.Add(key, i);
        }
 
        sw.Stop();
        Console.WriteLine("ClassTupleAsKey: " + sw.ElapsedMilliseconds);
    }
 
    private static void StructTupleAsKey()
    {
        var dictionary = new SortedDictionary<StructTuple<int, string>, int>();
        var sw = Stopwatch.StartNew();
 
        for (var i = 0; i < 1000000; i++)
        {
            var key = new StructTuple<int, string>(i % 1000, i.ToString());
            dictionary.Add(key, i);
        }
 
        sw.Stop();
        Console.WriteLine("StructTupleAsKey: " + sw.ElapsedMilliseconds);
    }
 
    private static void MyStructAsKey()
    {
        var dictionary = new SortedDictionary<MyStructA, int>();
        var sw = Stopwatch.StartNew();
 
        for (var i = 0; i < 1000000; i++)
        {
            var key = new MyStructA(i % 1000, i.ToString());
            dictionary.Add(key, i);
        }
 
        sw.Stop();
        Console.WriteLine("StructAsKey: " + sw.ElapsedMilliseconds);
    }
 
    private static void MyStructAsKeyWithCompare()
    {
        var dictionary = new SortedDictionary<MyStructB, int>(MyStructBComparer.Instance);
        var sw = Stopwatch.StartNew();
 
        for (var i = 0; i < 1000000; i++)
        {
            var key = new MyStructB(i % 1000, i.ToString());
            dictionary.Add(key, i);
        }
 
        sw.Stop();
        Console.WriteLine("StructAsKeyWithCompare: " + sw.ElapsedMilliseconds);
    }
 
    public class ClassTuple<T1, T2> : IStructuralEquatable, IStructuralComparable, IComparable
    {
        private readonly T1 _item1;
        private readonly T2 _item2;
 
        public T1 Item1 => _item1;
        public T2 Item2 => _item2;
 
        public ClassTuple(T1 item1, T2 item2)
        {
            _item1 = item1;
            _item2 = item2;
        }
 
        public override bool Equals(object obj)
        {
            return ((IStructuralEquatable) this).Equals(obj, EqualityComparer<object>.Default);
        }
 
        bool IStructuralEquatable.Equals(object other, IEqualityComparer comparer)
        {
            var objTuple = other as ClassTuple<T1, T2>;
            if (objTuple == null) return false;
            return comparer.Equals(_item1, objTuple._item1) && comparer.Equals(_item2, objTuple._item2);
        }
 
        int IComparable.CompareTo(object obj)
        {
            return ((IStructuralComparable) this).CompareTo(obj, Comparer<object>.Default);
        }
 
        int IStructuralComparable.CompareTo(object other, IComparer comparer)
        {
            if (other == null) return 1;
            var objTuple = other as ClassTuple<T1, T2>;
 
            if (objTuple == null)
                throw new ArgumentException("ArgumentException_TupleIncorrectType", nameof(other));
 
            var c = comparer.Compare(_item1, objTuple._item1);
            return c != 0 ? c : comparer.Compare(_item2, objTuple._item2);
        }
 
        public override int GetHashCode()
        {
            return ((IStructuralEquatable) this).GetHashCode(EqualityComparer<object>.Default);
        }
 
        int IStructuralEquatable.GetHashCode(IEqualityComparer comparer)
        {
            return CombineHashCodes(comparer.GetHashCode(_item1), comparer.GetHashCode(_item2));
        }
 
        private static int CombineHashCodes(int h1, int h2)
        {
            return ((h1 << 5) + h1) ^ h2;
        }
    }
 
    public struct StructTuple<T1, T2> : IStructuralEquatable, IStructuralComparable, IComparable
    {
        private readonly T1 _item1;
        private readonly T2 _item2;
 
        public T1 Item1 => _item1;
        public T2 Item2 => _item2;
 
        public StructTuple(T1 item1, T2 item2)
        {
            _item1 = item1;
            _item2 = item2;
        }
 
        public override bool Equals(object obj)
        {
            return ((IStructuralEquatable) this).Equals(obj, EqualityComparer<object>.Default);
        }
 
        bool IStructuralEquatable.Equals(object other, IEqualityComparer comparer)
        {
            if (!(other is StructTuple<T1, T2>)) return false;
            var objTuple = (StructTuple<T1, T2>) other;
            return comparer.Equals(_item1, objTuple._item1) && comparer.Equals(_item2, objTuple._item2);
        }
 
        int IComparable.CompareTo(object obj)
        {
            return ((IStructuralComparable) this).CompareTo(obj, Comparer<object>.Default);
        }
 
        int IStructuralComparable.CompareTo(object other, IComparer comparer)
        {
            var objTuple = (StructTuple<T1, T2>) other;
            var c = comparer.Compare(_item1, objTuple._item1);
            return c != 0 ? c : comparer.Compare(_item2, objTuple._item2);
        }
 
        public override int GetHashCode()
        {
            return ((IStructuralEquatable) this).GetHashCode(EqualityComparer<object>.Default);
        }
 
        int IStructuralEquatable.GetHashCode(IEqualityComparer comparer)
        {
            return CombineHashCodes(comparer.GetHashCode(_item1), comparer.GetHashCode(_item2));
        }
 
        private static int CombineHashCodes(int h1, int h2)
        {
            return ((h1 << 5) + h1) ^ h2;
        }
    }
 
    private struct MyStructA : IComparable
    {
        public MyStructA(int i, string s)
        {
            I = i;
            S = s;
        }
 
        public readonly int I;
        public readonly string S;
 
        public int CompareTo(object obj)
        {
            var y = (MyStructA) obj;
            var c = I.CompareTo(y.I);
            return c != 0 ? c : StringComparer.Ordinal.Compare(S, y.S);
        }
    }
 
    private struct MyStructB
    {
        public MyStructB(int i, string s)
        {
            I = i;
            S = s;
        }
 
        public readonly int I;
        public readonly string S;
    }
 
    private class MyStructBComparer : IComparer<MyStructB>
    {
        public static readonly MyStructBComparer Instance = new MyStructBComparer();
 
        public int Compare(MyStructB x, MyStructB y)
        {
            var c = x.I.CompareTo(y.I);
            return c != 0 ? c : StringComparer.Ordinal.Compare(x.S, y.S);
        }
    }
}

Enjoy,
Tom

7 comments:

  1. What happens if you implement IComparable (generic) instead of IComparable (non-generic)?

    ReplyDelete
    Replies
    1. Implementing the generic IComparer makes all the difference! When it the IComparer is not generic it will have to box the struct for every comparison, leading to the performance problems described in the blog post.

      Delete
    2. This comment has been removed by the author.

      Delete
    3. I created another test case using a struct that implemented the generic IComparable interface, i.e. (using [] braces instead of <> due to HTML stripping):

      private struct ComparableStruct: IComparable[ComparableStruct]
      {
      public int CompareTo(ComparableStruct other)
      {
      int c = I.CompareTo(other.I);
      return c != 0 ? c : StringComparer.Ordinal.Compare(S, other.S);
      }
      }

      The results on my PC (i7 3770K @ 3.5GHz) showed that there was no boxing and it was only slightly slower than the struct with an IComparer (~1000 ms vs ~900 ms). It's also simpler to implement and easier to use since you don't have to remember to use the non-default Comparer every time you create a container...

      Delete
    4. Yes, you are absolutely correct; the IComparable will solve the problem as well! I apologize, when I read the question I thought that Mackenzie was referring to the non generic version of IComparer. Bottom line: prevent the cast to object!

      Delete
  2. It should be generated by the compiler

    ReplyDelete
    Replies
    1. The default Comparer implementation will often provide the desired functionality, however it will come at the performance cost of boxing the values for comparison.

      Delete

Real Time Web Analytics