9

I have a project with a couple of .NET Core based Azure Functions running on a schedule. One of them runs once every 10 minutes and serves to update view counts similar to how SO tracks question views and the other sends emails once a week. These functions were running just fine for a year or so. I recently updated them to use Azure Functions SDK v3 and Azure Functions runtime v3 as well as .NET Core 3.1 (basically moved from .NET Core 2.1 to .NET Core 3.1 so I needed to update the functions runtime).

At one point I received a bill much higher than usual. Turns out the functions which share the same underlying storage account began making A LOT of API transactions to the storage. Like many thousands each 5 minutes. Normally every run generates like 100 storage transactions (probably retrieving the function files?) but at some point the transactions jumped sharply. Upon restarting the functions the transaction drop to normal and everything is normal for a few days then they jump again and stay high until restart.

The functions code has not been changed with the upgrade only the SDK and the runtime. The function code has a constant number of log writes (like 7) through the SDK-provided logger and does not interact with the storage in any other way.

I have two identical environments one for test and one for production and both have the same problem. The interval that it takes for the function to go rogue is a couple of days but seems to be different each time. However if I restart both test and production at the same time the next spike happens simultaneously on both environments so there is something deterministic there.

According to my investigation through the Metrics tool the offending transaction types are Create, Close and ChangeNotify with some Cancel (but less than the others). The storage is not used for anything else (in fact it only exist because Azure Functions require a backing storage to store its files or something)

Here is the trigger code in case it is relevant

[FunctionName("ViewCountUpdater")]
public static async Task RunAsync([TimerTrigger("0 */10 * * * *"/*, RunOnStartup = true*/)]TimerInfo timer, ILogger log, ExecutionContext context)

I believe I have hit a bug with the Azure Functions runtime or the Azure Functions .NET Core SDK. Has anyone experienced this or know how to work around it?

Stilgar
  • 20,008
  • 6
  • 58
  • 93
  • have you opened a ticket case on Azure Support? They should be able to investigate, and if it was a bug, refund you – Thiago Custodio Feb 07 '20 at 13:21
  • 1
    Yes. I've been in an email thread with Azure support for multiple weeks now with no resolution. At least two people have looked into the problem and sent it to other people after confirming that the issue happens. According to the emails my issue is now transferred to the "Product Group" but no new person is in the email thread. I'm thinking of opening an issue in the Azure Functions SDK GitHub but I decided to try SO first. – Stilgar Feb 07 '20 at 13:30
  • Definitively looks like a bug. Send a tweet with a link to this question to @ AzureSupport and @ jeffhollan, Principal PM Mgr for Microsoft Azure Functions – CSharpRocks Feb 07 '20 at 13:37
  • @CSharpRocks I will give it a few more days, I don't want to skip the people in Azure support, they've been very nice and seem to be doing their best. Also I estimate that there is like 10% probability that there is something I did wrong. The main reason I posted this is that if someone else hits the bug they can find a result on Google and confirm that it is indeed a bug. – Stilgar Feb 07 '20 at 13:44

2 Answers2

2

Do you have AzureWebJobsDashboard enabled? If yes, you should disable it (delete the Connection String from the App Settings) and switch to application insights. This setting has been known to cause unexpected writes to storage which can't be properly explained.

https://github.com/Azure/Azure-Functions/issues/832

Alex AIT
  • 10,190
  • 3
  • 23
  • 49
  • I don't have a storage connection string under connection strings. I have a AzureWebJobsStorage key in settings but this I think is default for functions (I am not using it in my code). – Stilgar Feb 07 '20 at 14:42
  • I read through the GitHub comments. Looks suspiciously similar. I wonder if it is possible that Azure Functions have an implicit dashboard somehow. – Stilgar Feb 07 '20 at 15:28
  • Yes, you have to remove the setting even if you are not explicitly using it in your code – Alex AIT Feb 07 '20 at 15:47
  • Yeah but I don't have AzureWebJobsDashboard setting I only have AzureWebJobsStorage – Stilgar Feb 07 '20 at 15:54
  • Ah, sorry. I believe the storage is fine – Alex AIT Feb 07 '20 at 15:54
1

After weeks of investigation the Azure support team and I think we have found the line that causes the issue and this is it:

.AddJsonFile("local.settings.json", optional: true, reloadOnChange: true)

The config file was not published as part of the publish process and is not present in Azure. Experiments for now seem to confirm that when this is present the transactions spike and when not they are normal. This does not answer

  • Why the bug happens at all
  • Is it a regression in .NET Core or in the functions runtime?
  • Why does the bug happen randomly and not on every run?

Note that testing this takes time as I have to wait for days for the random spike and I can never be sure if it is gone forever so I am not 100% sure that at some point in the future the spike won't happen again and turn out that the issue was something else.

Stilgar
  • 20,008
  • 6
  • 58
  • 93