1

I'm trying to upload *.iso files via my API using multipartform-data and stream them into local folder. I used Stream.CopyAsync(destinationStream) and it worked slow, but not too bad. But now I need to report progress. So I used custom CopyTOAsync and added a progress report to it. But the method is very slow(not acceptable at all), even compared to Stream::CopyToASync.

 public async Task CopyToAsync(Stream source, Stream destination, long? contentLength, ICommandContext context, int bufferSize = 81920 )
    {
        var buffer = new byte[bufferSize];
        int bytesRead;
        long totalRead = 0;
        while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await destination.WriteAsync(buffer, 0, bytesRead);
            totalRead += bytesRead;
            context.Report(CommandResources.RID_IMAGE_LOADIND, Math.Clamp((uint)((totalRead * 100) / contentLength), 3, 99));
        }
        _logger.Info($"Total read during upload : {totalRead}");
    }

What I tried: default buffer size for Stream::CopyToAsync is 81920 bytes, I used the same value first, then I tried to increase buffer size up to 104857600 bytes- no difference.

Do you have any other ideas on how to improve the performance of custom CopyToAsync?

Scott
  • 9
  • 5
  • What are those files? You can use compressed stream to transfer uncompressed files. – Sinatr Mar 20 '20 at 10:54
  • How long does it take to transfer, say, a 1GB file? Are you uploading these files to a remote server? Or does everything stay on your local network? – Patrick Tucci Mar 20 '20 at 11:00
  • I hate to tell you but your approach is faulty to start with. Want faster performance? DO NOT SEND MULTIPARTFORM-DATA - the encoding there makes your data 30% larger to start with as it has to be ASCII encoded. Stream them in searate requests, no form, as binary content. Send necesary metadata aeitehr before (getting an upload url) or as headers. Look how Youtube does it in their SDK. – TomTom Mar 20 '20 at 11:00
  • 1
    @ChristophLütjen I'm skeptical that an `ArrayPool` would help considering that the code in question isn't allocating arrays in a tight-loop. – Dai Mar 20 '20 at 11:08

1 Answers1

2
  • Always use ConfigureAwait with await to specify thread synchronization for the async continuation.
    • Depending on the platform, omitting ConfigureAwait may default to synchronizing with the UI thread (WPF, WinForms) or to any thread (ASP.NET Core). If it's synchronizing with the UI thread inside your Stream copy operation then it's no wonder performance takes a nose-dive.
    • If you're running code in a thread-synchronized context, then your await statements will be unnecessarily delayed because the program schedules the continuation to a thread that might be otherwise busy.
  • Use a buffer sized at least a couple hundred KiB - or even megabyte-sized buffer for async operations - not a typical 4KiB or 80KiB sized array.
  • If you're using FileStream ensure you used FileOptions.Asynchronous or useAsync: true otherwise the FileStream will fake its async operations by performing blocking IO using a thread-pool thread instead of Windows' native async IO.

With respect to your actual code - just use Stream::CopyToAsync instead of reimplementing it yourself. If you want progress reporting then consider subclassing Stream (as a proxy wrapper) instead.

Here's how I would write your code:

  1. First, add my ProxyStream class from this GitHub Gist to your project.
  2. Then subclass ProxyStream to add support for IProgress:
  3. Ensure any FileStream instances are created with FileOptions.Asynchronous | FileOptions.SequentialScan.
  4. Use CopyToAsync.
public class ProgressProxyStream : ProxyStream
{
    private readonly IProgress<(Int64 soFar, Int64? total)> progress;
    private readonly Int64? total;

    public ProgressProxyStream( Stream stream, IProgress<Int64> progress, Boolean leaveOpen )
        : base( stream, leaveOpen ) 
    {
        this.progress = progress ?? throw new ArgumentNullException(nameof(progress));
        this.total = stream.CanSeek ? stream.Length : (Int64?)null;
    }

    public override Task<Int32> ReadAsync( Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken )
    {
        this.progress.Report( ( offset, this.total ) );
        return this.Stream.ReadAsync( buffer, offset, count, cancellationToken );
    }
}

If performance still suffers with the above ProgressProxyStream then I'm willing to bet the bottleneck is inside the IProgress.Report callback target (which I assume is synchronised to a UI thread) - in which case a better solution is to use a (System.Threading.Channels.Channel) for the ProgressProxyStream (or even your implementation of IProgress<T>) to dump progress reports to without blocking any other IO activity.

Dai
  • 110,988
  • 21
  • 188
  • 277
  • this is good advice, but given that the [framework's own implementation](https://referencesource.microsoft.com/#mscorlib/system/io/stream.cs,171) seems to take none of this into account, it does not really explain the "slow but not too bad" vs. "very slow" difference OP observes after adding the progress reporting part. they also have tried a reasonably increased buffer size. – Cee McSharpface Mar 20 '20 at 11:03
  • 1
    @CeeMcSharpface I'm willing to bet that there's blocking thread-synchronization happening inside the `context.Report` call. – Dai Mar 20 '20 at 11:05
  • @Dai , Thank you very much. The bottleneck was really inside the Report method. I appreciate your help. – Scott Mar 23 '20 at 10:35