Showing posts with label Performance. Show all posts

Thursday, March 30, 2017

C# String Interpolation Performance

Time for a follow up to my String.Concat vs String.Format Performance post from back in 2014!

I recently found out that string interpolation is not nearly as efficient as I would have thought. I had assumed that it was just doing a string concatenation, but it is actually doing a string format. This leads to a pretty significant performance degradation; the following test runs one million iterations of each.

| Number of Args | Interpolation (ms) | String.Format (ms) | String.Concat (ms) | String Add (ms) | StringBuilder (ms) |
| --- | --- | --- | --- | --- | --- |
| 2 | 262 | 260 | 19 | 18 | 34 |
| 3 | 367 | 367 | 25 | 24 | 35 |
| 4 | 500 | 513 | 31 | 32 | 41 |
| 5 | 646 | 635 | 67 | 66 | 44 |
| 6 | 740 | 723 | 79 | 76 | 49 |
| 7 | 802 | 819 | 86 | 85 | 52 |
| 8 | 938 | 936 | 97 | 98 | 58 |

So, what's the lesson? Don't use string interpolation in high performance areas (such as your logger)!
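
As a minimal sketch of what that means in practice, the two methods below build the same message. The class and method names are made up for illustration, and the comments assume a C# 6 era compiler, where an interpolated string is lowered to a String.Format call; newer compilers may generate different code.

```csharp
using System;

public static class LogMessages
{
    // Interpolated string: assumed here to compile down to
    // String.Format("{0} performed {1}", user, action).
    public static string Interpolated(string user, string action)
    {
        return $"{user} performed {action}";
    }

    // The + operator on strings compiles down to String.Concat,
    // which skips format-string parsing entirely.
    public static string Concatenated(string user, string action)
    {
        return user + " performed " + action;
    }
}
```

Both return the same text, so in a hot path such as a logger the second form can be swapped in without changing the output.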

Enjoy,
Tom

Sunday, November 27, 2016

The Performance Cost of Boxing in .NET

I recently had to do some performance optimizations against a sorted dictionary that yielded some interesting results...

Background: I am used to using Tuples a lot, simply because they are easy to use and normally quite efficient. Please remember that Tuples were changed from structs to classes back in .NET 4.0.

Problem: A struct decreased performance!

I had a SortedDictionary that was using a Tuple as a key, so I thought "hey, I'll just change that tuple to a struct and reduce the memory usage." ...bad news, that made performance WORSE!

Why would using a struct make performance worse? It's actually quite simple and obvious when you think about it: it was causing comparisons to repeatedly box the struct, thus allocating more memory on the heap and triggering more garbage collections.

Solution: Use a struct with an IComparer.

I then created a custom struct and used that; it was much faster, but it was still causing boxing because of the non-generic IComparable interface. So finally I added a generic IComparer and passed that into my dictionary constructor; my dictionary then ran fast and efficient, causing a total of ZERO garbage collections!

See for yourself:

The Moral of the Story

Try to be aware of what default implementations are doing, and always remember that boxing to object can add up fast. Also, pay attention to the Visual Studio Diagnostics Tools window; it can be very informative!

Here is how many lines of code it took to achieve a 5x performance increase:

private struct MyStruct
{
    public MyStruct(int i, string s) { I = i; S = s; }
    public readonly int I;
    public readonly string S;
}
 
private class MyStructComparer : IComparer<MyStruct>
{
    public int Compare(MyStruct x, MyStruct y)
    {
        var c = x.I.CompareTo(y.I);
        return c != 0 ? c : StringComparer.Ordinal.Compare(x.S, y.S);
    }
}
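
For completeness, here is a rough sketch of how the comparer gets wired into the dictionary. The struct and comparer are repeated (as public nested types) so the snippet stands on its own; the wrapper class, the Build method, and the sample keys and values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDictionarySketch
{
    public struct MyStruct
    {
        public MyStruct(int i, string s) { I = i; S = s; }
        public readonly int I;
        public readonly string S;
    }

    public class MyStructComparer : IComparer<MyStruct>
    {
        public int Compare(MyStruct x, MyStruct y)
        {
            var c = x.I.CompareTo(y.I);
            return c != 0 ? c : StringComparer.Ordinal.Compare(x.S, y.S);
        }
    }

    public static SortedDictionary<MyStruct, string> Build()
    {
        // Passing the generic comparer to the constructor means keys are
        // compared as MyStruct values instead of being boxed to object.
        var map = new SortedDictionary<MyStruct, string>(new MyStructComparer());
        map.Add(new MyStruct(2, "b"), "second");
        map.Add(new MyStruct(1, "z"), "first");
        map.Add(new MyStruct(2, "a"), "between");
        return map;
    }
}
```

Enumerating the result walks the keys in comparer order: by I first, then by S using ordinal string comparison.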

Test Program

I have written some detailed comments in the Main function about what each test is doing and how it will affect performance. Let's take a look...

Saturday, November 26, 2016

10x faster than Delegate.DynamicInvoke

This is a follow up to my previous blog posts, Optimizing Dynamic Method Invokes in .NET, and Dynamically Invoke Methods Quickly, with InvokeHelpers.EfficientInvoke. Basically, I have re-implemented this for Tact.NET in a way that makes it smaller, faster, and compatible with the .NET Standard.

So, how much faster is this new way of doing things? EfficientInvoker.Invoke is over 10x faster than Delegate.DynamicInvoke, and 10x faster than MethodInfo.Invoke.

Check out the source on GitHub:

Simple Explanation

Here is an example of a method and a class that we might want to invoke dynamically...

public class Tester
{
    public bool AreEqual(int a, int b)
    {
        return a == b;
    }
}

...and then here is the code that the EfficientInvoker will generate at runtime to call that method:

public static object GeneratedFunction(object target, object[] args)
{
    return (object)((Tester)target).AreEqual((int)args[0], (int)args[1]);
}

See, it's simple!
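
One way to generate a function of that shape at runtime is with expression trees. The sketch below is only an illustration of the idea, not the actual Tact.NET source; the InvokerSketch class and its Build method are hypothetical names, and it assumes a public instance method.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public class Tester
{
    public bool AreEqual(int a, int b)
    {
        return a == b;
    }
}

public static class InvokerSketch
{
    // Builds a delegate equivalent to:
    //   (target, args) => (object)((T)target).Method((T0)args[0], (T1)args[1], ...)
    public static Func<object, object[], object> Build(MethodInfo method)
    {
        var target = Expression.Parameter(typeof(object), "target");
        var args = Expression.Parameter(typeof(object[]), "args");

        // Cast each args[i] to the type of the matching parameter.
        var parameters = method.GetParameters();
        var callArgs = new Expression[parameters.Length];
        for (var i = 0; i < parameters.Length; i++)
        {
            var element = Expression.ArrayIndex(args, Expression.Constant(i));
            callArgs[i] = Expression.Convert(element, parameters[i].ParameterType);
        }

        // Cast the target to the declaring type and call the method on it.
        var instance = Expression.Convert(target, method.DeclaringType);
        var call = Expression.Call(instance, method, callArgs);

        // Void methods return null; everything else is converted (boxed) to object.
        Expression body = method.ReturnType == typeof(void)
            ? (Expression)Expression.Block(call, Expression.Constant(null, typeof(object)))
            : Expression.Convert(call, typeof(object));

        return Expression.Lambda<Func<object, object[], object>>(body, target, args).Compile();
    }
}
```

Compiling the expression is the expensive part, so the resulting delegate should be built once per method and cached; after that, each call is an ordinary delegate invoke plus the casts.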

Sunday, August 21, 2016

Dynamically Invoke Methods Quickly, with InvokeHelpers.EfficientInvoke

In my previous blog post, I talked about Optimizing Dynamic Method Invokes in .NET. In this post, we will use that information to create a static helper method that is twice as fast as MethodInfo.Invoke.

Basically, we create and cache a delegate in a concurrent dictionary, and then cast both it and its arguments to dynamics and invoke them directly. The concurrent dictionary introduces overhead, but it is still more than twice as fast as calling MethodInfo.Invoke. Please note that this method is highly optimized to reduce the use of hash code look ups, property getters, closure allocations, and if checks.

Let's take a look at the code...

InvokeHelpers.EfficientInvoke

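using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Linq.Expressions;

public static class InvokeHelpers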
{
    private const string TooManyArgsMessage = "Invokes for more than 10 args are not yet implemented";
 
    private static readonly Type VoidType = typeof(void);
 
    private static readonly ConcurrentDictionary<Tuple<string, object>, DelegatePair> DelegateMap 
        = new ConcurrentDictionary<Tuple<string, object>, DelegatePair>();
 
    public static object EfficientInvoke(object obj, string methodName, params object[] args)
    {
        var key = Tuple.Create(methodName, obj);
        var delPair = DelegateMap.GetOrAdd(key, CreateDelegate);
            
        if (delPair.HasReturnValue)
        {
            switch (delPair.ArgumentCount)
            {
                case 0: return delPair.Delegate();
                case 1: return delPair.Delegate((dynamic)args[0]);
                case 2: return delPair.Delegate((dynamic)args[0], (dynamic)args[1]);
                case 3: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2]);
                case 4: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3]);
                case 5: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4]);
                case 6: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5]);
                case 7: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6]);
                case 8: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6], (dynamic)args[7]);
                case 9: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6], (dynamic)args[7], (dynamic)args[8]);
                case 10: return delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6], (dynamic)args[7], (dynamic)args[8], (dynamic)args[9]);
                default: throw new NotImplementedException(TooManyArgsMessage);
            }
        }
 
        switch (delPair.ArgumentCount)
        {
            case 0: delPair.Delegate(); break;
            case 1: delPair.Delegate((dynamic)args[0]); break;
            case 2: delPair.Delegate((dynamic)args[0], (dynamic)args[1]); break;
            case 3: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2]); break;
            case 4: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3]); break;
            case 5: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4]); break;
            case 6: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5]); break;
            case 7: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6]); break;
            case 8: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6], (dynamic)args[7]); break;
            case 9: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6], (dynamic)args[7], (dynamic)args[8]); break;
            case 10: delPair.Delegate((dynamic)args[0], (dynamic)args[1], (dynamic)args[2], (dynamic)args[3], (dynamic)args[4], (dynamic)args[5], (dynamic)args[6], (dynamic)args[7], (dynamic)args[8], (dynamic)args[9]); break;
            default: throw new NotImplementedException(TooManyArgsMessage);
        }
 
        return null;
    }
 
    private static DelegatePair CreateDelegate(Tuple<string, object> key)
    {
        var method = key.Item2
            .GetType()
            .GetMethod(key.Item1);
 
        var argTypes = method
            .GetParameters()
            .Select(p => p.ParameterType)
            .Concat(new[] { method.ReturnType })
            .ToArray();
 
        var newDelType = Expression.GetDelegateType(argTypes);
        var newDel = Delegate.CreateDelegate(newDelType, key.Item2, method);
 
        return new DelegatePair(newDel, argTypes.Length - 1, method.ReturnType != VoidType);
    }
 
    private class DelegatePair
    {
        public DelegatePair(dynamic del, int argumentCount, bool hasReturnValue)
        {
            Delegate = del;
            ArgumentCount = argumentCount;
            HasReturnValue = hasReturnValue;
        }
 
        public readonly dynamic Delegate;
        public readonly int ArgumentCount;
        public readonly bool HasReturnValue;
    }
}

Now let's take a look at some performance tests...

Optimizing Dynamic Method Invokes in .NET

I recently had a lot of fun helping to optimize some RPC code that was using reflection to dynamically invoke methods in a C# application. Below is a list of implementations that we experimented with, and their performance.

  1. Directly Invoking the Method
  2. Using MethodInfo.Invoke
  3. Using Delegate.DynamicInvoke
  4. Casting to a Func
  5. Casting a Delegate to Dynamic

Spoilers: Here are the results. (The tests for this can be seen below.)

| Name | First Call (Ticks) | Next Million Calls | Invoke Comparison |
| --- | --- | --- | --- |
| Invoke | 1 | 39795 | - |
| MethodInfo.Invoke | 12 | 967523 | x24 |
| Delegate.DynamicInvoke | 32 | 1580086 | x39 |
| Func Invoke | 731 | 41331 | x1 |
| Dynamic Cast | 1126896 | 85495 | x2 |

Conclusion: Invoking a method or delegate directly is always fastest, but when you need to execute code dynamically, then (after the first invoke) the dynamic invoke of a delegate is significantly faster than using reflection.
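
To make the five approaches concrete, here is a minimal sketch of what each call style looks like. This is not the benchmark code; the Calculator class and the InvokeStyles helper are made up for illustration.

```csharp
using System;
using System.Reflection;

public class Calculator
{
    public int Add(int a, int b) { return a + b; }
}

public static class InvokeStyles
{
    public static int[] RunAll()
    {
        var calc = new Calculator();
        MethodInfo method = typeof(Calculator).GetMethod("Add");

        // 1. Directly invoking the method.
        int direct = calc.Add(1, 2);

        // 2. Reflection: arguments and result are boxed to object.
        int viaMethodInfo = (int)method.Invoke(calc, new object[] { 1, 2 });

        // 3. Late-bound delegate invoke.
        Delegate del = Delegate.CreateDelegate(typeof(Func<int, int, int>), calc, method);
        int viaDynamicInvoke = (int)del.DynamicInvoke(1, 2);

        // 4. Casting to a Func: only possible when the signature is known at compile time.
        var func = (Func<int, int, int>)del;
        int viaFunc = func(1, 2);

        // 5. Casting the delegate to dynamic: the call is bound by the runtime
        //    binder on first use, which is why the first call is so slow.
        dynamic dyn = del;
        int viaDynamic = dyn(1, 2);

        return new[] { direct, viaMethodInfo, viaDynamicInvoke, viaFunc, viaDynamic };
    }
}
```

All five produce the same result; the only difference is how much work the runtime does to get there.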

Let's take a look at the test code...

Saturday, January 16, 2016

How to Optimize Json.NET Serialization Performance

Newtonsoft is a pretty fast JSON serializer, but you can make it even faster!

By default, JsonConvert uses reflection to recursively search through the structure of an object during the serialization process. By implementing a custom JsonConverter that already knows the exact structure of the object, you can significantly increase serialization performance.

How much faster? That depends! The more complicated the data structure, the larger the performance gain. Below is a simple example...

| Action | Method | Milliseconds | Performance Increase |
| --- | --- | --- | --- |
| Serialize | Standard | 1134 | 115.59% |
| Serialize | Custom | 526 | |
| Deserialize | Standard | 1488 | 62.98% |
| Deserialize | Custom | 913 | |
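
To give an idea of what such a converter looks like, here is a minimal sketch for a made-up Person class. It is not the converter that was measured above; Person and PersonConverter are hypothetical names, and the snippet assumes the Newtonsoft.Json package is referenced.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical type used only for this example.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class PersonConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(Person);
    }

    // Writes each property by hand, so no reflection is needed per object.
    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var person = (Person)value;
        writer.WriteStartObject();
        writer.WritePropertyName("Name");
        writer.WriteValue(person.Name);
        writer.WritePropertyName("Age");
        writer.WriteValue(person.Age);
        writer.WriteEndObject();
    }

    // Reads properties by name; unknown properties are skipped.
    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null) return null;

        var person = new Person();
        while (reader.Read() && reader.TokenType == JsonToken.PropertyName)
        {
            switch ((string)reader.Value)
            {
                case "Name": person.Name = reader.ReadAsString(); break;
                case "Age": person.Age = reader.ReadAsInt32() ?? 0; break;
                default: reader.Skip(); break;
            }
        }
        return person;
    }
}
```

The converter is passed in at the call site, for example JsonConvert.SerializeObject(person, new PersonConverter()) or JsonConvert.DeserializeObject<Person>(json, new PersonConverter()). The trade-off is that the converter has to be kept in sync with the class by hand.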

Sunday, December 14, 2014

How much does RegexOptions.Compiled improve performance in .NET?

Just how much does the RegexOptions.Compiled flag improve regular expression performance in .NET? The answer: a lot! People have spoken about this before, but below are some more numbers to show you just how much it matters!

Performance Stats

| Character Count | Regex Pattern |
| --- | --- |
| 1 | [a-z] |
| 3 | [b-y].[1-8] |
| 5 | [b-y].[c-x].[1-8].[2-7] |
| 7 | [b-y].[c-x].[d-w].[1-8].[2-7].[3-6] |
| 9 | [b-y].[c-x].[d-w].[e-v].[1-8].[2-7].[3-6].[4-5] |

| RegexOptions | 1 | 3 | 5 | 7 | 9 |
| --- | --- | --- | --- | --- | --- |
| None | 234176 | 285067 | 653016 | 690282 | 687343 |
| Compiled | 193945 | 235213 | 430609 | 452483 | 454625 |
| Percent Gain | 17% | 17% | 34% | 34% | 34% |
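
As a minimal sketch of how the flag is applied, here is the 3 character pattern from the table above built both ways. The PatternMatcher class is a made-up name; the important part is that each Regex is constructed once and reused, since compiling has an up-front cost.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternMatcher
{
    // Interpreted at match time.
    private static readonly Regex Interpreted =
        new Regex("[b-y].[1-8]", RegexOptions.None);

    // Compiled to IL when constructed; slower to create, faster to match.
    private static readonly Regex Compiled =
        new Regex("[b-y].[1-8]", RegexOptions.Compiled);

    public static bool IsMatchInterpreted(string input)
    {
        return Interpreted.IsMatch(input);
    }

    public static bool IsMatchCompiled(string input)
    {
        return Compiled.IsMatch(input);
    }
}
```

Both instances give the same answers; only the matching speed differs.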

Sunday, July 27, 2014

RavenDB 2.5 vs 3.0 Write Performance (so far)

Is Voron outperforming Esent in RavenDB 3.0?

...not yet, at least in terms of write speed. I ran a few tests on my home machine to compare the write and indexing speeds of Raven 2.5 (build 2908) against Raven 3.0 (build 3358). Unfortunately, the results were not encouraging. However, it is worth pointing out that the Raven team did save their performance updates for last when releasing Raven 2.5, so I do expect that this will improve before we see an RC.

Test Results

Here are the results of my (little) performance tests:

| Document Count | RavenDB 2.5 Elapsed Import Time | RavenDB 2.5 Elapsed Index Time | RavenDB 3.0 Elapsed Import Time | RavenDB 3.0 Elapsed Index Time | Difference Import Percent | Difference Index Percent |
| --- | --- | --- | --- | --- | --- | --- |
| 0 - 100k | 0:57.48 | 1:41.45 | 1:08.59 | 1:25.39 | -19.33% | 15.82% |
| 100k - 200k | 1:02.68 | 1:34.85 | 1:10.87 | 1:35.65 | -13.08% | -0.84% |
| 200k - 300k | 1:00.34 | 2:17.84 | 1:12.94 | 1:47.20 | -20.89% | 22.22% |
| 300k - 400k | 1:00.85 | 1:38.59 | 1:13.46 | 1:45.61 | -20.73% | -7.12% |
| 400k - 500k | 1:02.03 | 1:38.70 | 1:12.03 | 1:58.51 | -16.12% | -20.07% |

Saturday, March 15, 2014

String.Concat vs StringBuilder Performance

Time for yet another micro-optimization!

Everyone knows that Strings are immutable in .NET, thus the StringBuilder class is very important for saving memory when manipulating large strings.

...but what about performance?

Interestingly, StringBuilder is just an all around better way to combine strings! It is more memory efficient, and less processor intensive; but not by much. Below is a comparison of performance between different ways of combining strings.
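
For reference, the three usual ways of combining strings look roughly like this. The StringCombiners class and its method names are made up for illustration, not taken from the benchmark.

```csharp
using System;
using System.Text;

public static class StringCombiners
{
    // One call; the result is allocated once at its final size.
    public static string WithConcat(string[] parts)
    {
        return string.Concat(parts);
    }

    // Each += allocates a brand new string, because strings are immutable.
    public static string WithAdd(string[] parts)
    {
        var result = string.Empty;
        foreach (var part in parts)
        {
            result += part;
        }
        return result;
    }

    // Appends into a growable buffer and allocates the final string once.
    public static string WithStringBuilder(string[] parts)
    {
        var builder = new StringBuilder();
        foreach (var part in parts)
        {
            builder.Append(part);
        }
        return builder.ToString();
    }
}
```

All three return the same text; they differ only in how many intermediate allocations happen along the way.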

Saturday, March 8, 2014

String.Concat vs String.Format Performance

Time for another micro-optimization!

When building strings it is almost always easiest to write and maintain a typical format statement. However, what is the cost of that over just concatenating strings? When building strings for cache keys (which I know are going to get called a lot) I try to use String.Concat instead of String.Format. Let's look at why!

Below is a table comparing the performance of String.Concat and String.Format. Each row is the number of arguments being concatenated, and the timing columns show the number of milliseconds it takes to complete 100,000 runs.

| Number of Args | String.Concat Milliseconds | String.Format Milliseconds | Concat Percent Faster |
| --- | --- | --- | --- |
| 2 | 4ms | 10ms | 150% |
| 3 | 3ms | 13ms | 333% |
| 4 | 4ms | 16ms | 300% |
| 5 | 12ms | 21ms | 75% |
| 6 | 14ms | 24ms | 71% |
| 7 | 16ms | 28ms | 75% |
| 8 | 18ms | 31ms | 72% |
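
Here is a minimal sketch of the cache key case mentioned above. The CacheKeys class and the "region:id" key layout are made up for illustration.

```csharp
using System;

public static class CacheKeys
{
    // Easy to read and maintain, but the format string is parsed on every call
    // and the int is boxed to object.
    public static string WithFormat(string region, int id)
    {
        return string.Format("{0}:{1}", region, id);
    }

    // Same result with no format parsing; the int is converted to a string up front.
    public static string WithConcat(string region, int id)
    {
        return string.Concat(region, ":", id.ToString());
    }
}
```

For a key that gets built on every request, the second version is the one I would reach for.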

Sunday, March 2, 2014

Log Performance in a Using Block with Common.Logging

You should be using Common.Logging to share your logger between projects.

Common.Logging is a great and lightweight way to share dependencies without requiring that you also share implementation. It is how several of my projects that use Log4Net are able to share resources with another team that uses NLog. But that is not what I am here to talk about!

How do you log performance quickly and easily?

No, I do not mean performance counters. No, I do not mean interceptors for dependency injection. I want something far more lightweight and simplistic! What I want is the ability to simply log if too much time is spent in a specific block of code. For example...

public void MyMethod1(ILog log)
{
    // If this using block takes more than 100 milliseconds,
    // then I want it to write to my Info log. However
    // if this using block takes more than 1 second,
    // then I want it to write to my Warn log instead.
    using (log.PerfElapsedTimer("MyMethod took too long!"))
    {
        var obj = GetFromApi();
        SaveToDatabase(obj);
    }
}

The PerfElapsedTimer is just a simple little extension method that I wrote; under the hood it just wraps a Stopwatch in an IDisposable. Feel free to grab the code from below and start using it yourself.
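
The general shape of that idea can be sketched as follows. This is a simplified illustration rather than the real extension: to keep it self-contained it targets a hypothetical ISimpleLog interface instead of Common.Logging's ILog, the MemoryLog class exists only to capture output, and the 100 millisecond and 1 second defaults are taken from the comments in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for a real logging interface.
public interface ISimpleLog
{
    void Info(string message);
    void Warn(string message);
}

// Hypothetical in-memory logger, used only to observe what was written.
public class MemoryLog : ISimpleLog
{
    public readonly List<string> Entries = new List<string>();
    public void Info(string message) { Entries.Add("INFO " + message); }
    public void Warn(string message) { Entries.Add("WARN " + message); }
}

public static class PerfLogExtensions
{
    public static IDisposable PerfElapsedTimer(
        this ISimpleLog log, string message, int infoMs = 100, int warnMs = 1000)
    {
        return new ElapsedTimer(log, message, infoMs, warnMs);
    }

    private sealed class ElapsedTimer : IDisposable
    {
        private readonly ISimpleLog _log;
        private readonly string _message;
        private readonly int _infoMs;
        private readonly int _warnMs;
        private readonly Stopwatch _stopwatch = Stopwatch.StartNew();

        public ElapsedTimer(ISimpleLog log, string message, int infoMs, int warnMs)
        {
            _log = log;
            _message = message;
            _infoMs = infoMs;
            _warnMs = warnMs;
        }

        // Runs when the using block exits; logs only if a threshold was exceeded.
        public void Dispose()
        {
            _stopwatch.Stop();
            var elapsed = _stopwatch.ElapsedMilliseconds;
            if (elapsed > _warnMs)
            {
                _log.Warn(_message + " (" + elapsed + " ms)");
            }
            else if (elapsed > _infoMs)
            {
                _log.Info(_message + " (" + elapsed + " ms)");
            }
        }
    }
}
```

Because the timer is disposed at the end of the using block, fast code paths log nothing at all, and the only overhead is one small allocation and a Stopwatch.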

Sunday, November 24, 2013

Generic Enum Attribute Caching

Attributes are wonderful for decorating your Enums with additional markup and metadata. However, looking up attributes via reflection is not always a fast operation in terms of performance. Additionally, no one likes typing that code over and over again.

Well not to worry, just use the following extension methods to help cache your Enum attribute look ups, and increase your application's performance! But how much faster is this? Good question...

| Iterations | No Cache (Average Elapsed Ticks) | With Cache (Average Elapsed Ticks) | Difference |
| --- | --- | --- | --- |
| 2 | 1182.00 | 3934.00 | 4x Slower |
| 10 | 20.10 | 7.10 | 3x Faster |
| 100 | 13.07 | 1.37 | 10x Faster |
| 1,000 | 13.27 | 1.40 | 10x Faster |
| 10,000 | 13.56 | 1.45 | 10x Faster |
| 100,000 | 13.02 | 1.33 | 10x Faster |
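
The caching idea can be sketched roughly like this. It is a simplified illustration, not the original extension methods; the EnumAttributeCache class, its GetAttribute method, and the Color enum are made-up names.

```csharp
using System;
using System.Collections.Concurrent;
using System.ComponentModel;
using System.Reflection;

public static class EnumAttributeCache
{
    // Keyed by (enum value, attribute type); a missing attribute is cached as null.
    private static readonly ConcurrentDictionary<Tuple<Enum, Type>, Attribute> Cache =
        new ConcurrentDictionary<Tuple<Enum, Type>, Attribute>();

    public static TAttribute GetAttribute<TAttribute>(this Enum value)
        where TAttribute : Attribute
    {
        var key = Tuple.Create(value, typeof(TAttribute));
        return (TAttribute)Cache.GetOrAdd(key, Lookup);
    }

    // The slow reflection path; runs only on a cache miss.
    private static Attribute Lookup(Tuple<Enum, Type> key)
    {
        FieldInfo field = key.Item1.GetType().GetField(key.Item1.ToString());
        return field == null ? null : Attribute.GetCustomAttribute(field, key.Item2);
    }
}

// Hypothetical enum used only for this example.
public enum Color
{
    [Description("Bright red")]
    Red,
    Green
}
```

A call such as Color.Red.GetAttribute<DescriptionAttribute>() pays for reflection once and hits the dictionary afterwards, which matches the pattern in the table: slower for the first couple of calls, faster from then on. Note that this simple key boxes the enum value and allocates a Tuple on every call.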