Some time ago at work, I ran across code resembling the following.
Please note that we are still on VS2013 and as such do not have access to C# 7’s pattern matching.
Anyway, traditionally C# developers have had two ways of solving the "I want a reference to my object as an `X`, but I also want to handle the case where my object is not an `X`" situation. One is to use the `is` operator and then a C-style cast. The other is to use the `as` operator and then a null check. This is what those look like in C# and CIL respectively.
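The original listings aren't reproduced here, but a minimal sketch of the two patterns, using a hypothetical `Animal`/`Dog` pair, looks something like this (the IL notes in the comments reflect what the pre-Roslyn compilers of the VS2013 era emit for each construct):

```csharp
object animal = new Dog(); // hypothetical types for illustration

// Option 1: "is" check, then a C-style cast.
// IL: isinst for the check, then castclass for the cast -- two type tests.
if (animal is Dog)
{
    Dog dog = (Dog)animal;
    dog.Bark();
}

// Option 2: "as" cast, then a null check.
// IL: a single isinst; the result is either the typed reference or null.
Dog maybeDog = animal as Dog;
if (maybeDog != null)
{
    maybeDog.Bark();
}
```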
The `as` version has fewer instructions and as such certainly seems shorter, and `is` is really the same thing as `as` with the result then collapsed down to a boolean. The magic behind them both is the `isinst` opcode, which returns the reference to the thing as the desired type if it matched and `null` otherwise. The `is` version runs the exact same thing and simply treats the return value as a boolean (where `null` is `false` and not-`null` is `true`). But I've always tended to prefer `as` for this type of scenario, since in the happy path the `is` approach would still require me to cast again, which is indicated in the IL by the extra `castclass` instruction.
I would like to use pattern matching from C# 7.0, either in a `switch` or in an `if`. That's not available to me, but both of those use the same `isinst` opcode under the hood anyway. Anyway, back to this thing…
I had occasion to add another type to that surfeit of `else if`s, so I did. Obviously, one would like to refactor the whole clusterf situation to leverage polymorphism, but in the project in question, that would take more development time than we could get permission from management to spend. This code also doesn't change all that often, so even if it's ugly, the ugliness really isn't costing us maintenance very often and seems to be working well as is.
Even so, I got a comment on my pull request saying the following.
…casting isn’t free, and this feels pretty inefficient to me. Could we just get the object’s type and then cast it once based off that, rather than casting it to every type of object that it might be?
Having no strong feelings one way or the other, I started to implement the requested approach. The obvious thing was to implement it with a similar set of nested `else if`s. Then it occurred to me that I could store the types in a dictionary and get constant-time lookup. O(1), amirite?
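A sketch of that dictionary idea, using hypothetical `Circle`/`Square` types and handler methods (and C# 5-era syntax, since we're on VS2013):

```csharp
// Map each concrete Type to a handler; lookup is nominally O(1).
private static readonly Dictionary<Type, Action<object>> Handlers =
    new Dictionary<Type, Action<object>>
    {
        { typeof(Circle), o => HandleCircle((Circle)o) },
        { typeof(Square), o => HandleSquare((Square)o) },
        // ...one entry per supported type...
    };

public static void Dispatch(object o)
{
    Action<object> handler;
    if (Handlers.TryGetValue(o.GetType(), out handler))
    {
        handler(o); // still casts inside the lambda to get a typed reference
    }
}
```

One caveat with this shape: `o.GetType()` has to match a dictionary key exactly, so subclasses of the registered types fall through, unlike with `is`/`as`.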
In wondering which implementation (dictionary vs. `else if`s) would be more performant, at some point it occurred to me that I might as well run a benchmark to see for myself. So I began to set it up. Then I figured while I was at it, I might as well benchmark the current implementation just to see how many sweet, sweet milliseconds I was going to save.
In an effort to make sure I do my benchmarking right, I searched for advice and discovered the following excellent post on benchmarking in C# by the illustrious Jon Skeet.
Some salient points made in that article include:
- Make your tests take minutes, not seconds. (to smooth out the bumps of the GC)
- In your benchmarking code, prefer local variables over fields. (the CLR can optimize locals better than fields in some cases, so this can reduce the time of your baseline)
- Check the results of the calculation are correct, not merely how long it took. (the compiler might optimize away the part you’re trying to benchmark if it doesn’t know for sure that you’ll actually use the result!)
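A stripped-down version of the harness shape those points suggest (the names and types here are my own stand-ins, not from Skeet's post):

```csharp
// Long run, locals only, and the result is checked afterwards so the
// JIT can't dead-code-eliminate the work being measured.
object input = new Circle();          // hypothetical test instance
const int Iterations = 900000000;
int matches = 0;                      // a local, and consumed below

var stopwatch = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < Iterations; i++)
{
    if (input is Circle)
    {
        matches++;
    }
}
stopwatch.Stop();

// Verify the answer, not just the time.
if (matches != Iterations)
    throw new InvalidOperationException("Benchmark produced wrong answer");
Console.WriteLine("{0} ms", stopwatch.ElapsedMilliseconds);
```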
I set up a new project with six classes called, unimaginatively, `A` through `F`, and ran them through each set of type-detection logic for 900 million iterations on my workstation, running in Release mode in VS2013 (i.e., the most relevant version for my use case).
I ran with six options. The one I called `Polymorphic` simply required all six types to implement a common C# interface and leveraged that interface to perform the operations. The one I called `IfIsCStyle` was a series of `else if`s where each condition was an `is` operator and each code block used a C-style cast. The one I called `CastAsEachTime` was the existing implementation: `as`-cast to all six types up front and then `else if` on whichever result was not `null`. The one I called `CastWithElse` was the same thing but with the casts moved into the `else` blocks. It will obviously be more performant than the `CastAsEachTime` version since it does less casting in the average and best cases, but the code visually resembles a descending staircase, which is distracting. I also added one called `TypeDictionary`, where I stored the desired types in a static dictionary, and one called `TypeElse`, which compared the object's `Type` against each of the six known desired types in `else if`s with the `==` operator. Then I ran the benchmark.
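To make the difference between the cast-based options concrete, here is roughly what `CastAsEachTime` and `CastWithElse` look like for two of my hypothetical types:

```csharp
// CastAsEachTime: every cast happens up front, even though at most one
// of them can succeed.
Circle circle = o as Circle;
Square square = o as Square;
if (circle != null)      { HandleCircle(circle); }
else if (square != null) { HandleSquare(square); }

// CastWithElse: each cast is deferred until the previous one has failed,
// at the cost of the "descending staircase" of nesting.
Circle c = o as Circle;
if (c != null) { HandleCircle(c); }
else
{
    Square s = o as Square;
    if (s != null) { HandleSquare(s); }
    // ...and so on, one level deeper per type...
}
```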
| Function | Adjusted Time (ms) | Speed Cost |
| --- | --- | --- |
I was surprised. For one thing, I was surprised by how `CastWithElse` fared; I have no good explanation for that. But what really surprised me was the fact that the `TypeDictionary` and `TypeElse` implementations were far and away the worst implementations thus far. How could this be?
I was able to find an explanation in this post. Essentially, using `isinst` allows the CLR to check whether the object belongs to the type without having to load the reflection metadata that's required when you create a whole `System.Type` object. That `Type.GetTypeFromHandle` call that's the backbone of C#'s `typeof` operator causes the entire `System.Type` to be created, where we really only need the inheritance-hierarchy data. And `isinst` is optimized for exactly that.
That link also mentioned a different possibility for solving my use case: `Type.GetTypeHandle`. It's a way of getting a type's unique identifier without having to load the entire type with all its metadata and dragons.
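A sketch of what that looks like in practice, with the handles cached in static fields (the types and handler names here are my hypothetical stand-ins):

```csharp
// RuntimeTypeHandle comparison avoids materializing full System.Type objects.
private static readonly RuntimeTypeHandle CircleHandle = typeof(Circle).TypeHandle;
private static readonly RuntimeTypeHandle SquareHandle = typeof(Square).TypeHandle;

public static void Dispatch(object o)
{
    RuntimeTypeHandle handle = Type.GetTypeHandle(o);
    if (handle.Equals(CircleHandle))
    {
        HandleCircle((Circle)o);  // my use case still needs the cast
    }
    else if (handle.Equals(SquareHandle))
    {
        HandleSquare((Square)o);
    }
}
```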
That could be a lot more efficient, and his benchmarking seems to indicate that it is. But the act of benchmarking had already surprised me once in this journey, so I tried my own. I tried one version with the type handles stored in static fields, `else if`-ing through comparisons with the incoming object's handle; I called it `HandleElse`. And I added one that stored the handles in a static dictionary, called `HandleDictionary`. Here are the results.
| Function | Adjusted Time (ms) | Speed Cost |
| --- | --- | --- |
Obviously, the new idea didn’t fare too well. I can only guess, but I suspect that it has to do with the fact that his benchmark only needed to compare to see if the types were equal but did not then need to actually obtain a reference to the instance as the desired type, so it had no need to cast at all. Adding a cast to the method, as my use case required, seems to have trumped the supposed performance gains of not loading the entire type object into memory.
At this point, not only was I sure the existing way was faster than what my pull request reviewer had recommended, but I was having fun. So I broke out my VS2017 Community Edition and reran the benchmarks, adding `PatternMatchIfElse` (C# 7.0 pattern matching in an `if`) and `PatternMatchSwitch` (C# 7.0 pattern matching in a `switch`). Just when I thought nothing could surprise me, I was once again surprised.
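For reference, the two C# 7.0 forms look like this (again with my hypothetical types):

```csharp
// PatternMatchIfElse: the type test and the typed variable in one step.
if (o is Circle c)
{
    HandleCircle(c);
}
else if (o is Square s)
{
    HandleSquare(s);
}

// PatternMatchSwitch: the same tests expressed as case patterns.
switch (o)
{
    case Circle circle:
        HandleCircle(circle);
        break;
    case Square square:
        HandleSquare(square);
        break;
}
```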
| Function | Adjusted Time (ms) | Speed Cost |
| --- | --- | --- |
I was very interested in the drop in performance of the `CastAsEachTime` version compared to the rest. My only guess is that the JIT compiler has recently received some optimizations related to common type-matching patterns, optimizations that cover some of the others but not `CastAsEachTime`.
In all the tests I did, the fastest any implementation ran was the polymorphic one on VS2017. It completed in 33,490 milliseconds. The slowest any implementation ran was the type handle dictionary in VS2013, which completed in 396,930 milliseconds. Thus, the maximum savings for worrying about all this performance was 404 milliseconds per one million iterations. The original function under discussion is not called in a tight loop. It is not the bottleneck. This performance difference is not even measurable to our end users.
So I learned that, when getting feedback of the "X would be more performant than this" variety, it's sometimes good to race your own horses. And ~~sometimes~~ most of the time, the difference is so minuscule that it's better to just focus on clean and readable code.
However, there are times when performance does matter. Just to prove this, I implemented an eleventh type checker. This one would try a C-style cast, catch the `InvalidCastException` when it failed, and proceed to try the next type. Remember the slowest one above (the type handle dictionary in 2013)? It ran 900 million iterations in 396,930 milliseconds. That's over 2,200 iterations per millisecond. My C-style cast version took… well, I never let it finish. I ran it on 3,000 iterations, and it took 212,815 milliseconds. That's about 70.94 milliseconds per iteration. Assuming that rate held constant, if I had let it finish the same 900 million iterations I gave the rest of the implementations, it would have finished in about two years plus a week. I elected not to let it run that long.
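For completeness, the shape of that pathological checker, sketched with my hypothetical types; every miss pays for a thrown and caught exception instead of a single `isinst`:

```csharp
public static void Dispatch(object o)
{
    try
    {
        Circle c = (Circle)o;  // throws InvalidCastException on mismatch
        HandleCircle(c);
        return;
    }
    catch (InvalidCastException)
    {
        // fall through and try the next type
    }

    try
    {
        Square s = (Square)o;
        HandleSquare(s);
        return;
    }
    catch (InvalidCastException)
    {
    }
    // ...and so on for the remaining types...
}
```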
So there it is. Performance doesn’t really matter, except when it does.
You can download the .cs file that I used for my benchmarking here.