Some time ago at work, I ran across code resembling the following.
Please note that we are still on VS2013 and as such do not have access to C# 7’s pattern matching.
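The shape of the code was roughly this (a reconstruction with hypothetical stand-in types; the real types and methods were different):

```csharp
// Foo, Bar, and Baz are hypothetical stand-ins for the real types.
void Handle(object value)
{
    var foo = value as Foo;
    var bar = value as Bar;
    var baz = value as Baz;
    // ...more casts, one for every other possible type...

    if (foo != null)
    {
        foo.DoFooThings();
    }
    else if (bar != null)
    {
        bar.DoBarThings();
    }
    else if (baz != null)
    {
        baz.DoBazThings();
    }
    // ...and so on.
}
```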
Anyway, traditionally C# developers have had two ways of solving the “I want a reference to my object as an X, but I also want to handle the case where my object is not an X” situation. One is to use the `is` operator and then a C-style cast. The other is to use the `as` operator and then a null check. This is roughly what those look like in C# (we'll get to the CIL in a moment).
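A minimal sketch of the two patterns (`Foo` and `DoFooThings` are hypothetical stand-ins):

```csharp
// Option 1: `is` test, then a C-style cast.
if (obj is Foo)
{
    var foo = (Foo)obj;   // casts a second time under the hood
    foo.DoFooThings();
}

// Option 2: `as` cast, then a null check.
var maybeFoo = obj as Foo;
if (maybeFoo != null)
{
    maybeFoo.DoFooThings();
}
```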
The `as` version has fewer instructions and as such certainly seems shorter, and `is` is really the same thing as `as` with the result then converted to a boolean. The magic behind them both is the `isinst` opcode, which returns the reference to the thing as the desired type if it matched and `null` otherwise. `is` runs the exact same thing and simply treats the return value as a boolean (where `null` is `false` and not-`null` is `true`). But I’ve always tended to prefer `as` over `is` for this type of scenario, since in the happy path the `is` method would still require me to cast again, which is indicated in the IL by the `castclass` opcode.
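In CIL terms, a hand-written sketch of the difference (not a verbatim compiler listing) looks like this:

```
// `if (obj is Foo) { ((Foo)obj).DoFooThings(); }` type-checks twice:
    ldloc.0
    isinst     Foo        // first check: null if obj is not a Foo
    brfalse.s  NOT_A_FOO
    ldloc.0
    castclass  Foo        // second check on the same object
    callvirt   instance void Foo::DoFooThings()
NOT_A_FOO:

// The `as` version checks only once:
    ldloc.0
    isinst     Foo        // null if obj is not a Foo
    stloc.1               // null check and call follow
```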
I would like to use pattern matching from C# 7.0, either in a `switch` or in an `if`. That’s not available to me, but both of those use the same `isinst` opcode under the hood. Anyway, back to this thing…
I had occasion to add another type to that surfeit of `as`es and `elseif`s, so I did. Obviously, one would like to refactor the clusterf situation to leverage polymorphism, but in the project in question that would take more development time than we would be able to get permission from management to spend. This code also doesn’t change all that often, so, even if it’s ugly, the ugliness really isn’t costing us maintenance very often and seems to be working well as is.
Even so, I got a comment on my pull request saying the following.
> …casting isn’t free, and this feels pretty inefficient to me. Could we just get the object’s type and then cast it once based off that, rather than casting it to every type of object that it might be?
Having no strong feelings one way or the other, I started to implement the requested approach. The obvious thing was to implement it with a similar set of nested `elseif`s. Then it occurred to me that I could store the types in a dictionary and get constant-time lookup. O(1), amirite?
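Something like this (a sketch, not the real code; the types and handler bodies are hypothetical stand-ins):

```csharp
using System;
using System.Collections.Generic;

static class TypeDispatch
{
    // One handler per concrete type, keyed by System.Type.
    // Foo and Bar are hypothetical stand-ins.
    static readonly Dictionary<Type, Action<object>> Handlers =
        new Dictionary<Type, Action<object>>
        {
            { typeof(Foo), o => ((Foo)o).DoFooThings() },
            { typeof(Bar), o => ((Bar)o).DoBarThings() },
            // ...one entry per remaining type...
        };

    public static void Dispatch(object obj)
    {
        Action<object> handler;
        if (Handlers.TryGetValue(obj.GetType(), out handler)) // O(1) lookup
        {
            handler(obj);
        }
    }
}
```

The appeal is one `GetType` call plus one hash lookup, instead of a chain of casts.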
In wondering which implementation (dictionary vs. `elseif`s) would be more performant, at some point it occurred to me that I might as well run a benchmark to see for myself. So I began to set it up. Then I figured while I was at it, I might as well benchmark the current implementation just to see how many sweet, sweet milliseconds I was going to save.
In an effort to make sure I did my benchmarking right, I searched for advice and discovered the following excellent post on benchmarking in C# by the illustrious Jon Skeet.
http://www.yoda.arachsys.com/csharp/benchmark.html
Some salient points made in that article include:
- Make your tests take minutes, not seconds. (to smooth out the bumps of the GC)
- In your benchmarking code, prefer local variables over fields. (the CLR can optimize locals better than fields in some cases, so this can reduce the time of your baseline)
- Check that the results of the calculation are correct, not merely how long it took. (the compiler might optimize away the part you’re trying to benchmark if it doesn’t know for sure that you’ll actually use the result!)
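Following that advice, the harness ended up looking roughly like this (a simplified sketch, not the actual file I used):

```csharp
using System;
using System.Diagnostics;

static class Benchmark
{
    // Runs `work` many times, keeps a running checksum so the compiler
    // can't optimize the call away, and reports the elapsed time.
    public static void Run(string name, long iterations, Func<int> work)
    {
        var stopwatch = Stopwatch.StartNew();
        long checksum = 0;                      // a local, not a field
        for (long i = 0; i < iterations; i++)
        {
            checksum += work();
        }
        stopwatch.Stop();
        // Printing the checksum both verifies the results and forces
        // the work to actually be performed.
        Console.WriteLine("{0}: {1} ms (checksum {2})",
            name, stopwatch.ElapsedMilliseconds, checksum);
    }
}
```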
I set up a new project with six classes called, unimaginatively, `A`, `B`, `C`, `D`, `E`, and `F`, and ran them through each set of type-detection logic for 900 million iterations on my workstation, running in Release mode in VS2013 (i.e. the most relevant version for my use case).
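The setup looked more or less like this (the interface name and method are my stand-ins; the real file differs):

```csharp
// Common interface used by the Polymorphic implementation.
interface IDoer
{
    int DoThing();
}

class A : IDoer { public int DoThing() { return 1; } }
class B : IDoer { public int DoThing() { return 2; } }
class C : IDoer { public int DoThing() { return 3; } }
class D : IDoer { public int DoThing() { return 4; } }
class E : IDoer { public int DoThing() { return 5; } }
class F : IDoer { public int DoThing() { return 6; } }
```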
I ran with six options:

- The one I called `Polymorphic` simply required all six types to implement a common C# interface and leveraged that interface to perform the operations.
- `IfIsCStyle` was a series of `elseif`s where each condition used the `is` operator and each code block used a C-style cast.
- `CastAsEachTime` was the existing implementation: `as` cast to all six types up front, then `elseif` on which result was not `null`.
- `CastWithElse` was the same thing but with all the casts moved into the `elseif` blocks. It will obviously be more performant than the `CastAsEachTime` version since it does less casting in the average and best cases, but the code visually resembles a descending staircase, which is distracting.
- `TypeDictionary` stored the desired types in a static dictionary.
- `TypeElse` compared the object’s `Type` against each of the six known desired types in `elseif`s with the `==` operator.

Then I ran the benchmark.
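For concreteness, here is roughly what `CastWithElse` and `TypeElse` looked like, using the toy types above (sketches, not the real file; `CastAsEachTime` is the cascade shape shown at the top of the post):

```csharp
// CastWithElse: each cast happens only inside its own branch,
// producing the "descending staircase".
static int CastWithElse(object obj)
{
    var a = obj as A;
    if (a != null) { return a.DoThing(); }
    else
    {
        var b = obj as B;
        if (b != null) { return b.DoThing(); }
        else
        {
            // ...and so on down the staircase for C through F...
            return 0;
        }
    }
}

// TypeElse: compare the object's Type against each known type.
static int TypeElse(object obj)
{
    var type = obj.GetType();
    if (type == typeof(A)) { return ((A)obj).DoThing(); }
    else if (type == typeof(B)) { return ((B)obj).DoThing(); }
    // ...C through F...
    else { return 0; }
}
```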
| Function | Adjusted Time (ms) | Speed Cost (vs. `Polymorphic`) |
|---|---|---|
| Polymorphic | 30322 | 0% |
| IfIsCStyle | 131462 | 76.93% |
| CastWithElse | 149699 | 79.74% |
| CastAsEachTime | 198622 | 84.73% |
| TypeDictionary | 356411 | 91.49% |
| TypeElse | 362060 | 91.63% |
I was surprised. For one thing, I was surprised that `IfIsCStyle` beat `CastWithElse`. I have no good explanation for that. But what really surprised me was the fact that the `TypeDictionary` and `TypeElse` implementations were far and away the worst implementations thus far. How could this be?
I was able to find an explanation in this post. Essentially, using `isinst` allows the CLR to check whether the object belongs to the type without having to load the various reflection metadata that’s required when you create a whole `System.Type` object. The `GetTypeFromHandle` call that’s the backbone of C#’s `typeof` operator causes the entire `System.Type` to be created, where we really only need the inheritance-hierarchy data. And `isinst` is optimized for exactly that.
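For reference, `typeof(A)` compiles down to something like this (an IL sketch):

```
ldtoken A              // push the type's RuntimeTypeHandle
call    class [mscorlib]System.Type
        [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
        // ^ materializes the full System.Type object
```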
That link also mentioned a different possibility for solving my use case: `Type.GetTypeHandle`. It’s a way of getting a type’s unique identifier without having to load the entire type with all its metadata and dragons.
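A sketch of that technique (the structure is mine; the handler bodies reuse the toy types from above):

```csharp
using System;

static class HandleDispatch
{
    // Cache the handles once up front. Type.GetTypeHandle(obj) retrieves
    // a handle for an instance without building the full Type object.
    static readonly RuntimeTypeHandle HandleOfA = typeof(A).TypeHandle;
    static readonly RuntimeTypeHandle HandleOfB = typeof(B).TypeHandle;
    // ...C through F...

    public static int Dispatch(object obj)
    {
        var handle = Type.GetTypeHandle(obj);
        if (handle.Equals(HandleOfA)) { return ((A)obj).DoThing(); }
        else if (handle.Equals(HandleOfB)) { return ((B)obj).DoThing(); }
        // ...C through F...
        return 0;
    }
}
```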
That could be a lot more efficient, and his benchmarking seems to indicate that it is. But the act of benchmarking had already surprised me once in this journey, so I tried my own. I tried one version with the type handles stored in static fields and `elseif`ing through the results of `Type.GetTypeHandle`, called `HandleElse`. And I added one that stored the handles in a static dictionary, called `HandleDict`. Here are the results.
| Function | Adjusted Time (ms) | Speed Cost (vs. `Polymorphic`) |
|---|---|---|
| Polymorphic | 30322 | 0% |
| IfIsCStyle | 131462 | 76.93% |
| CastWithElse | 149699 | 79.74% |
| CastAsEachTime | 198622 | 84.73% |
| HandleElse | 203630 | 85.11% |
| TypeDictionary | 356411 | 91.49% |
| TypeElse | 362060 | 91.63% |
| HandleDict | 396930 | 92.36% |
Obviously, the new idea didn’t fare too well. I can only guess, but I suspect it has to do with the fact that his benchmark only needed to compare the types for equality; it never needed to actually obtain a reference to the instance as the desired type, so it had no need to cast at all. Adding a cast to the method, as my use case required, seems to have trumped the supposed performance gains of not loading the entire type object into memory.
At this point, not only was I sure the existing way was faster than what had been recommended by my pull-request reviewer, but I was having fun. So I broke out my VS2017 Community Edition and reran the benchmarks, but with overloads for `PatternMatchIfElse` (C# 7.0 pattern matching in an `if`) and `PatternMatchSwitch` (C# 7.0 pattern matching in a `switch`). Just when I thought nothing could surprise me, I was once again surprised.
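The two new overloads looked roughly like this (again a sketch with the toy types):

```csharp
// C# 7.0 pattern matching in an if/else chain.
static int PatternMatchIfElse(object obj)
{
    if (obj is A a) { return a.DoThing(); }
    else if (obj is B b) { return b.DoThing(); }
    // ...C through F...
    else { return 0; }
}

// C# 7.0 pattern matching in a switch.
static int PatternMatchSwitch(object obj)
{
    switch (obj)
    {
        case A a: return a.DoThing();
        case B b: return b.DoThing();
        // ...C through F...
        default: return 0;
    }
}
```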
| Function | Adjusted Time (ms) | Speed Cost (vs. `Polymorphic`) |
|---|---|---|
| Polymorphic | 33490 | 0% |
| HandleElse | 58392 | 42.65% |
| PatternMatchSwitch | 63171 | 46.99% |
| PatternMatchIfElse | 64653 | 48.2% |
| CastWithElse | 65342 | 48.75% |
| IfIsCStyle | 66402 | 49.56% |
| TypeElse | 70936 | 52.79% |
| CastAsEachTime | 110476 | 69.69% |
| TypeDictionary | 269841 | 87.59% |
| HandleDict | 272549 | 87.71% |
I was very interested in the drop in performance of the `CastAsEachTime` version compared to the rest. My only guess is that the JIT compiler has recently received some optimizations related to common type-matching patterns, which included some of the others but did not include `CastAsEachTime`.
In all the tests I did, the fastest any implementation ran was the polymorphic one on VS2017, which completed in 33,490 milliseconds. The slowest any implementation ran was the type-handle dictionary in VS2013, which completed in 396,930 milliseconds. That’s a difference of 363,440 milliseconds over 900 million iterations, so the maximum savings for worrying about all this performance was about 404 milliseconds per one million iterations. The original function under discussion is not called in a tight loop. It is not the bottleneck. This performance difference is not even measurable to our end users.
So I learned that, when getting feedback of the “X would be more performant than this” variety, it’s sometimes good to race your own horses. And most of the time the difference is so minuscule that it’s better to just focus on clean, readable code.
However, there are times when performance does matter. Just to prove this, I implemented an eleventh type checker. This one would try a C-style cast, catch the `InvalidCastException` when it failed, and proceed to try the next type. Remember the slowest one above (the type-handle dictionary in 2013)? It ran 900 million iterations in 396,930 milliseconds. That’s over 2,200 iterations per millisecond. My C-style cast version took…well, I never let it finish. I ran it on 3,000 iterations, and it took 212,815 milliseconds. That’s about 70.94 milliseconds per iteration. Assuming that rate held constant, letting it finish the same 900 million iterations I gave to the rest of the implementations would have taken about two years plus a week. I elected not to let it run that long.
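For the morbidly curious, that checker looked roughly like this (a sketch; please don’t do this):

```csharp
// Exception-driven type checking: catastrophically slow, because throwing
// and catching an exception costs orders of magnitude more than isinst.
static int CastAndCatch(object obj)
{
    try { return ((A)obj).DoThing(); }
    catch (InvalidCastException) { }
    try { return ((B)obj).DoThing(); }
    catch (InvalidCastException) { }
    // ...C through F...
    return 0;
}
```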
So there it is. Performance doesn’t really matter, except when it does.
You can download the .cs file that I used for my benchmarking here.