5

How can I further improve the performance of the code below while still maintaining the public interface:

public interface IMapper<in TSource, in TDestination>
{
    void Map(TSource source, TDestination destination);
}

public static TDestination Map<TSource, TDestination>(
    this IMapper<TSource, TDestination> translator,
    TSource source)
    where TDestination : new()
{
    var destination = new TDestination();
    translator.Map(source, destination);
    return destination;
}

public static List<TDestination> MapList<TSource, TDestination>(
    this IMapper<TSource, TDestination> translator,
    List<TSource> source)
    where TDestination : new()
{
    var destinationCollection = new List<TDestination>(source.Count);
    foreach (var sourceItem in source)
    {
        var destinationItem = translator.Map(sourceItem);
        destinationCollection.Add(destinationItem);
    }
    return destinationCollection;
}

Example Usage

public class MapFrom { public string Property { get; set; } }

public class MapTo { public string Property { get; set; } }

public class Mapper : IMapper<MapFrom, MapTo>
{
    public void Map(MapFrom source, MapTo destination)
    {
        destination.Property = source.Property;
    }
}

var mapper = new Mapper();
var mapTo = mapper.Map(new MapFrom() { Property = "Foo" });
var mapToList = mapper.MapList(
    new List<MapFrom>() 
    {
        new MapFrom() { Property = "Foo" } 
    });

Current Benchmark

When I run a benchmark against a raw manual conversion, these are the numbers I get:

|             Method |  Job | Runtime |      Mean |     Error |    StdDev |       Min |       Max | Scaled | ScaledSD |  Gen 0 | Allocated |
|------------------- |----- |-------- |----------:|----------:|----------:|----------:|----------:|-------:|---------:|-------:|----------:|
|           Baseline |  Clr |     Clr |  1.969 us | 0.0354 us | 0.0332 us |  1.927 us |  2.027 us |   1.00 |     0.00 | 2.0523 |   6.31 KB |
|  Mapper            |  Clr |     Clr |  9.016 us | 0.1753 us | 0.2019 us |  8.545 us |  9.419 us |   4.58 |     0.12 | 2.0447 |   6.31 KB |
|           Baseline | Core |    Core |  1.820 us | 0.0346 us | 0.0355 us |  1.777 us |  1.902 us |   1.00 |     0.00 | 2.0542 |   6.31 KB |
|  Mapper            | Core |    Core |  9.043 us | 0.1725 us | 0.1613 us |  8.764 us |  9.294 us |   4.97 |     0.13 | 2.0447 |   6.31 KB |

Here is the code for the baseline:

var mapTo = new MapTo() { Property = mapFrom.Property };
var mapToCollection = new List<MapTo>(this.mapFrom.Count);
foreach (var item in this.mapFrom)
{
    destination.Add(new MapTo() { Property = item.Property });
}

Benchmark Code

I have a fully working project containing the mapper and Benchmark.NET project in the Dotnet-Boxed/Framework GitHub repository.

Muhammad Rehan Saeed
  • 28,236
  • 27
  • 169
  • 261
  • I've updated the question with details from a benchmark I did. I'd like to get as close to the baseline as possible and maybe even exceed it. – Muhammad Rehan Saeed Sep 30 '17 at 08:08
  • Why just not implement ISharedProperties interface and use O(1) assignment? – shadow32 Sep 30 '17 at 08:12
  • @shadow32 `MapFrom` and `MapTo` happen to have the same property name in my example but this may not always be the case. Think Automapper. – Muhammad Rehan Saeed Sep 30 '17 at 08:16
  • "Maybe even exceed it" - yeah, that's not going to happen, unless your baseline isn't really baseline. Your `MapList` method is as simple as it gets. If you want to have virtual methods, you will have to spend some CPU time on virtual dispatch. Also, is there a reason for reinventing the wheel? Why not simply use Automapper? AFAIK, Automapper uses Reflection.Emit to generate mapping code and compile it, in order to avoid property name lookups. – Groo Sep 30 '17 at 08:16
  • 5
    One performance bottleneck I've found is using `new T()` for a constrained type parameter. I've found that passing in a `Func()` instead can make a huge difference. If you could edit your code into a [mcve] that lets us really easily run the tests, I could see how much difference that makes. (But it's still not going to get to your baseline, of course...) – Jon Skeet Sep 30 '17 at 08:28
  • I've updated the question with a link to my GitHub repo with a full solution file which you can open and just run the benchmark.NET project, doesn't get any easier than that. – Muhammad Rehan Saeed Sep 30 '17 at 08:51
  • *[edited]* contrary to what I wrote previously, the top snipped does compile. Still, keep it in mind: if you ever want to be able to perform mapping with value type (struct) `TDestination`, your `IMapper.Map` method *must* return `TDestination`. – Kirill Shlenskiy Sep 30 '17 at 09:40
  • @KirillShlenskiy Yes, this interface is specific to complex `class` object types only. – Muhammad Rehan Saeed Sep 30 '17 at 09:44
  • Also, your SO code is a little oversimplified and will actually behave differently to what you have in your repository. For example, the SO code has a `foreach` over `List`, which should be faster than `foreach` over `TSourceCollection where TSourceCollection : IEnumerable` due to `List` providing a struct enumerator (which would need to be boxed in the `TSourceCollection` use case). – Kirill Shlenskiy Sep 30 '17 at 10:01
  • Ah, yes I forgot to check-in a change using foreach I was trying out. I've now done it for that one method. – Muhammad Rehan Saeed Sep 30 '17 at 11:00
  • Cool. At this point I'd try out Jon Skeet's `Func` suggestion, try using `destinationCollection[i] = translator.Map(sourceItem)` instead of using `Add`, and finally try inlining (or, rather, eliminating) your `Map` extension method (which may or may not make a difference as the method body is small enough that the compiler could already be inlining it for you - look at your `MapList` IL after compiling in Release to confirm). Other than those 3 things I am not really seeing any further room for optimisation. – Kirill Shlenskiy Sep 30 '17 at 11:16
  • Thanks, will try that. The really interesting thing is Automapper being so fast on .NET Core (Twice as fast compared to full framework and twice as fast as my code above too). I raised a [GitHub issue](https://github.com/AutoMapper/AutoMapper/issues/2345) asking when they think this might be the case. – Muhammad Rehan Saeed Sep 30 '17 at 12:43
  • I've raised an [issue](https://github.com/dotnet/roslyn/issues/22464) in the C# lang issue to get the generic `new()` constraint optimized. It turns out, that it calls `Activator.CreateInstance` which uses reflection. – Muhammad Rehan Saeed Oct 02 '17 at 09:39

1 Answers1

5

After implementing the suggestions discussed in the comments, here is the most efficient MapList<TSource, TDestination> implementation I've been able to come up with:

using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static List<TDestination> MapList<TSource, TDestination>(
    this IMapper<TSource, TDestination> translator,
    List<TSource> source)
    where TDestination : new()
{
    var destinationCollection = new List<TDestination>(source.Count);

    foreach (var sourceItem in source)
    {
        TDestination dest = Factory<TDestination>.Instance();
        translator.Map(sourceItem, dest);
        destinationCollection.Add(dest);
    }

    return destinationCollection;
}

static class Factory<T>
{
    // Cached "return new T()" delegate.
    internal static readonly Func<T> Instance = CreateFactory();

    private static Func<T> CreateFactory()
    {
        NewExpression newExpr = Expression.New(typeof(T));

        return Expression
            .Lambda<Func<T>>(newExpr)
            .Compile();
    }
}

Note that I managed to take advantage of Jon Skeet's suggestion not to use new TDestination() without requiring the caller to provide the Func<TDestination> delegate, thus preserving your API.

Of course, the cost of compiling the factory delegate is non-negligible, but in common mapping scenarios I expect it to be worth the trouble.

Kirill Shlenskiy
  • 8,623
  • 24
  • 37
  • That's really clever. Adding a `Func CreateNew()` method to the `IMapper` interface and removing `new()` as @JonSkeet suggested is 1.21 times the speed of the baseline while your code is 1.34 times slower, so very close. I know I said not to change the interface I can't quite decide which to go with. – Muhammad Rehan Saeed Oct 01 '17 at 12:28