3

For the following piece of code (.NET v4.0.30319) I am getting a null reference exception indicated below in the second continuation.

Most interestingly this issue has only occurred in machines with 8GB RAM but other users have 16GB and more and they haven't reported any issue, and it is a very intermittent issue which leads me to suspect a garbage collection issue.

The GetData() can be called multiple times so the first continuation of _businessObjectTask will only be called once as _businessObjects will have been populated from that point forward.

I imagine the Object reference not set to an instance of an object exception is being thrown because either

  1. _businessObjectTask is null and it can't continue from a null task.
  2. items variable which is passed in as a parameter is null somehow

The line number in my log file (748) points to the one highlighted below and not the lambda expression which points to #1 above instead of #2. I've played around in Visual Studio and each of the lines following businessObjectTask.ContinueWith() are considered different i.e. if it was a null reference within the lambda expression it would give a different line number to 748

Any help would be greatly appreciated.

Edit: This isn't related to What is a NullReferenceException, and how do I fix it? because that is a much more basic explanation of null reference whereas this is much more complicated and subtle.

Exception

Full details of the stack trace (edited for simplicity with dummy class and namespace names)

Object reference not set to an instance of an object.
   at ApplicationNamespace.ClassName`1.<>c__DisplayClass4e.<GetData>b__44(Task`1 t) in e:\ClassName.cs:line 748
   at System.Threading.Tasks.ContinuationTaskFromResultTask`1.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()

Code

private static IDictionary<string, IBusinessObject> _businessObjects;
private Task<IDictionary<string, IBusinessObject>> _businessObjectTask;

public Task GetData(IList<TBusinessItem> items))
{
    Log.Info("Starting GetData()");

    if (_businessObjects == null)
    {
        var businessObjectService = ServiceLocator.Current.GetInstance<IBusinessObjectService>();

        _businessObjectTask = businessObjectService.GetData(CancellationToken.None)
        .ContinueWith
        (
            t => 
            {
                _businessObjects = t.Result.ToDictionary(e => e.ItemId);

                return _businessObjects;
            },
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnRanToCompletion,
            TaskScheduler.Current
        );
    }


    var taskSetLEData = _businessObjectTask.ContinueWith // Line 748 in my code - "Object reference not set to an instance of an object." thrown here
    (
        task =>
        {
            items.ToList().ForEach
            (
                item =>
                {
                    IBusinessObject businessObject;

                    _businessObjects.TryGetValue(item.Id, out businessObject);
                    item.BusinessObject = businessObject;
                }
            );
        },
        CancellationToken.None,
        TaskContinuationOptions.OnlyOnRanToCompletion, 
        TaskScheduler.Default
    );
}

Resolution:

So having used what I've learned from this question I went back to the original code and figured it all out.

Turns out the reason for this NRE was because the _businessObjectTask is non-static where as the _businessObjects is static.

This means that _businessObjects is null the first time GetData() is called which then sets _businessObjectTask to non-null. Then when _businessObjectTask.ContinueWith gets called it is non-null and continues without problem.

However if a second instance of this class above is instantiated, _businessObjects has already been populated so _businessObjectTask remains null. Then when _businessObjectTask.ContinueWith gets called, a NRE on _businessObjectTask is thrown.

There were a couple of options I could have taken but I ended up removing the _businessObjectTask to a synchronous method call which means that I dont need to use the continuation anymore and I set _businessObjects once.

Community
  • 1
  • 1
Ciaran Martin
  • 529
  • 4
  • 12
  • Is `GetData` invoked by multiple threads in parallel? – Yacoub Massad Dec 09 '15 at 11:03
  • @Ciaran Martin, are you *sure* you posted the code in its entirety? You clearly have a race, but I don't see any `await`s between checking `_businessObjectTask` for null and setting it, and the call to `ContinueWith`, *or* any code setting `_businessObjectTask` back to null – Kirill Shlenskiy Dec 09 '15 at 11:03
  • @KirillShlenskiy I haven't pasted the entirety of the code but all the relevant pieces are there. I've updated the question and tags to state that I'm using .NET 4.0 which means async and await are unavailable – Ciaran Martin Dec 09 '15 at 11:09
  • *Edited*: nevermind, that wouldn't explain it. – Kirill Shlenskiy Dec 09 '15 at 11:09
  • @KirillShlenskiy I've added the detail of the stack trace which may be of assistance and related to the comment you just made and then edited. There is either a null reference at `_businessObjectTask` or one within the `ContinueWith` lambda expression – Ciaran Martin Dec 09 '15 at 11:13
  • I've updated the question with some background explaining why I think it `_businessObjectTask` which is causing the exception – Ciaran Martin Dec 09 '15 at 11:22
  • Your `_businessObjectTask` will, ineed, be null if you create another instance of your class and call `GetData` *after* the first instance successfully sets the static `_businessObjects` (one of the fields is static and one instance, hence the confusion). So, if `_businessObjects` is not null and you've just created your `ClassName` instance, `_businessObjectTask.ContinueWith` will throw. Maybe those people pointing to a basic `NullReferenceException` scenario were onto something after all... – Kirill Shlenskiy Dec 09 '15 at 11:23
  • And even the above explanation is plausible, and you have denied that the lambdas have anything to do with the exception, the way I'm reading the stack trace the exception is actually thrown inside a lambda's body (so, most likely `items == null` in the second anonymous function). Checking the arg at the beginning of `GetData` might be a good idea. – Kirill Shlenskiy Dec 09 '15 at 11:43
  • Ok, give me sometime to investigate further but please feel free to post an answer so I can mark it as the solution. The cause is either of your two suggestions so you can add both to the answer and I can come back later to confirm which it was. – Ciaran Martin Dec 09 '15 at 12:14
  • I'm curious about the cause, so I'll keep an eye on this question. However, I will not post an answer as, regardless of the cause, Marten's answer contains a superior solution (although you'll have to make a few simple tweaks for it to work with .NET 4.0) – Kirill Shlenskiy Dec 09 '15 at 14:46
  • @CiaranMartin, good to know I was onto something with my Dec 9 11:23 comment. Still baffled by the stack trace though. Thanks for the update. – Kirill Shlenskiy Dec 16 '15 at 13:11

1 Answers1

1

This is a synchronization problem.

You are assuming that _businessObjectTask is always assigned before _businessObjects.

That is, however, not guaranteed. It is possible that your continuation that assign _businessObjects is executed inline and therefore before businessObjectService.GetData(...).ContinueWith(...) returns.

// This assignment could happend AFTER the inner assignment.
_businessObjectTask = businessObjectService.GetData(CancellationToken.None)
    .ContinueWith
    (
        t => 
        {
           // This assignment could happen BEFORE the outer assignment.
            _businessObjects = t.Result.ToDictionary(e => e.ItemId);              

Therefore, it is possible that _businessObjects is not null although _businessObjectTask is null.

If a concurrent thread would enter your GetData method at that time it would clearly not enter

if (_businessObjects == null) // not entered because it's not null
{
    ...
}

...and instead continue with

var taskSetLEData = _businessObjectTask.ContinueWith // line 748

...which will cause a null reference exception since _businessObjectTask is null.


Here's how you could simplify your code and resolve this synchronization problem:

private Lazy<Task<IDictionary<string, IBusinessObject>>> _lazyTask =
    new Lazy<Task<IDictionary<string, IBusinessObject>>>(FetchBusinessObjects);

private static async Task<IDictionary<string, IBusinessObject>> FetchBusinessObjects()
{
    var businessObjectService = ServiceLocator.Current.GetInstance<IBusinessObjectService>();
    return await businessObjectService.GetData(CancellationToken.None).ToDictionary(e => e.ItemId);
}

public async Task GetData(IList<TBusinessItem> items)
{
    Log.Info("Starting GetData()");

    var businessObjects = await _lazyTask.Value;

    items.ToList().ForEach
    (
        item =>
        {
            IBusinessObject businessObject;
            businessObjects.TryGetValue(item.Id, out businessObject);
            item.BusinessObject = businessObject;
        }
    );
}

Notes:

  • Using Lazy<T> to ensure that the business object service is only invoked once (per instance of this class, whatever it is).

  • Using async/await to simplify code.

  • You may want to consider declaring _lazyTask as static. In your code there seem to be a mixup between static/non-static fields. I cannot know which is right for you.

Mårten Wikström
  • 10,284
  • 4
  • 39
  • 82
  • It seems unlikely for an async method *and* its continuation (which is scheduled without the `ExecuteSynchronously` flag) to both execute inline, but even if it happens, your theory doesn't seem consistent with the stack trace posted above (see `ApplicationNamespace.ClassName\`1.<>c__DisplayClass4e.b__44(Task\`1 t)`). I'm bad at reading stack traces though, so could be completely wrong. – Kirill Shlenskiy Dec 09 '15 at 14:09
  • It doesn't matter how unlikely it is to happen. If it *could* happen, then you can rest assure that it *will* occasionally do so, especially on your customer's machine in the production environment :-) – Mårten Wikström Dec 09 '15 at 14:20
  • I don't understand why you think that the stack trace would contradict my answer? – Mårten Wikström Dec 09 '15 at 14:21
  • The way I read it the last stack frame (on top) points to an anonymous method with a signature consistent with one of the continuation delegates. If the exception was really happening on the `_businessObjectTask.ContinueWith` line, wouldn't the top frame be `ApplicationNamespace.ClassName\`1.GetData(IList items)` (seeing as it's a "normal", synchronous method without any async-related compiler rewriting)? Like I said, I'm out of my comfort zone when it comes to reading stack traces, so correct me if I'm wrong. – Kirill Shlenskiy Dec 09 '15 at 14:31
  • I don't think it's possible to draw such conclusions from the stack trace without knowing how the code is invoked. In any case, there is a null reference synchronization problem as I've described it. There may however, be other problems too. – Mårten Wikström Dec 09 '15 at 14:35
  • Thank you for your answer but the solutions are not .NET 4.0 so are not available to me anyway but good to know - I hadn't know about Lazy. The issue caused by inlining may exist but it isn't likely in the actual scenario for multiple parallel calls to `GetData()` to happen at the exact same time. I've updated my answer to explain how I resolved the issue. – Ciaran Martin Dec 11 '15 at 17:07