What is the correct way to read from NetworkStream in .NET

Question

I've been struggling with this and can't find a reason why my code is failing to properly read from a TCP server I've also written. I'm using the TcpClient class and its GetStream() method but something is not working as expected. Either the operation blocks indefinitely (the last read operation doesn't timeout as expected), or the data is cropped (for some reason a Read operation returns 0 and exits the loop, perhaps the server is not responding fast enough). These are three attempts at implementing this function:

// this will break from the loop without getting the entire 4804 bytes from the server 
string SendCmd(string cmd, string ip, int port)
{
    var client = new TcpClient(ip, port);
    var data = Encoding.GetEncoding(1252).GetBytes(cmd);
    var stm = client.GetStream();
    stm.Write(data, 0, data.Length);
    byte[] resp = new byte[2048];
    var memStream = new MemoryStream();
    int bytes = stm.Read(resp, 0, resp.Length);
    while (bytes > 0)
    {
        memStream.Write(resp, 0, bytes);
        bytes = 0;
        if (stm.DataAvailable)
            bytes = stm.Read(resp, 0, resp.Length);
    }
    return Encoding.GetEncoding(1252).GetString(memStream.ToArray());
}

// this will block forever. It reads everything but freezes when data is exhausted
string SendCmd(string cmd, string ip, int port)
{
    var client = new TcpClient(ip, port);
    var data = Encoding.GetEncoding(1252).GetBytes(cmd);
    var stm = client.GetStream();
    stm.Write(data, 0, data.Length);
    byte[] resp = new byte[2048];
    var memStream = new MemoryStream();
    int bytes = stm.Read(resp, 0, resp.Length);
    while (bytes > 0)
    {
        memStream.Write(resp, 0, bytes);
        bytes = stm.Read(resp, 0, resp.Length);
    }
    return Encoding.GetEncoding(1252).GetString(memStream.ToArray());
}

// inserting a sleep inside the loop will make everything work perfectly
string SendCmd(string cmd, string ip, int port)
{
    var client = new TcpClient(ip, port);
    var data = Encoding.GetEncoding(1252).GetBytes(cmd);
    var stm = client.GetStream();
    stm.Write(data, 0, data.Length);
    byte[] resp = new byte[2048];
    var memStream = new MemoryStream();
    int bytes = stm.Read(resp, 0, resp.Length);
    while (bytes > 0)
    {
        memStream.Write(resp, 0, bytes);
        Thread.Sleep(20);
        bytes = 0;
        if (stm.DataAvailable)
            bytes = stm.Read(resp, 0, resp.Length);
    }
    return Encoding.GetEncoding(1252).GetString(memStream.ToArray());
}

The last one "works", but it certainly looks ugly to put a hard-coded sleep inside the loop considering that sockets already support read timeouts! Do I need to setup some property(ies) on the TcpClient of the NetworkStream? Does the problem resides in the server? The server don't close the connections, it is up to the client to do so. The above is also running inside the UI thread context (test program), maybe it has something to do with that...

Does someone know how to properly use NetworkStream.Read to read data until no more data is available? I guess what I'm wishing for is something like the old Win32 winsock timeout properties... ReadTimeout, etc. It tries to read until the timeout is reached, and then return 0... But it sometimes seem to return 0 when data should be available (or on the way.. can Read return 0 if is available?) and it then blocks indefinitely on the last read when data is not available...

Yes, I'm at a loss!

Be careful. The code in your attempt (and the answers) do not close client or stream, which causes a resource leak with big consequences if called repeatedly. You should probably do `using (var client = new System.Net.Sockets.TcpClient(ip, port)) using (var stm = client.GetStream())` then enclose the rest of the method in braces. This will guarantee that on method exit, whatever the reason, connections are closed, resources are reclaimed. — Stéphane Gourichon, May 13 '15 at 08:45
Have you seen the code for the TcpClient class? I advise to take a look at it... If you still want to answer back on the endpoint, you should not close the stream nor the tcpclient (which closes the stream, if I'm not mistaken). But the actual code I'm using is different, I'll dig it out and update the answers here too. — Loudenvier, May 18 '15 at 21:50
thanks for the suggestion. I found [TCPClient.cs](http://referencesource.microsoft.com/#system/net/System/Net/Sockets/TCPClient.cs). Indeed closing or disposing `TCPClient` disposes the stream. I've actually seen connection failures when a program made thousands of connections without much care (for other, bad, reasons). Why should one deviate from the usual IDisposable pattern here ? Implementing IDisposable is error-prone, `using()` it is easy and safe. — Stéphane Gourichon, May 20 '15 at 07:25

score 20 · Answer 1 · edited May 22 '17 at 15:55

Networking code is notoriously difficult to write, test and debug.

You often have lots of things to consider such as:

what "endian" will you use for the data that is exchanged (Intel x86/x64 is based on little-endian) - systems that use big-endian can still read data that is in little-endian (and vice versa), but they have to rearrange the data. When documenting your "protocol" just make it clear which one you are using.
are there any "settings" that have been set on the sockets which can affect how the "stream" behaves (e.g. SO_LINGER) - you might need to turn certain ones on or off if your code is very sensitive
how does congestion in the real world which causes delays in the stream affect your reading/writing logic

If the "message" being exchanged between a client and server (in either direction) can vary in size then often you need to use a strategy in order for that "message" to be exchanged in a reliable manner (aka Protocol).

Here are several different ways to handle the exchange:

have the message size encoded in a header that precedes the data - this could simply be a "number" in the first 2/4/8 bytes sent (dependent on your max message size), or could be a more exotic "header"
use a special "end of message" marker (sentinel), with the real data encoded/escaped if there is the possibility of real data being confused with an "end of marker"
use a timeout....i.e. a certain period of receiving no bytes means there is no more data for the message - however, this can be error prone with short timeouts, which can easily be hit on congested streams.
have a "command" and "data" channel on separate "connections"....this is the approach the FTP protocol uses (the advantage is clear separation of data from commands...at the expense of a 2nd connection)

Each approach has its pros and cons for "correctness".

The code below uses the "timeout" method, as that seems to be the one you want.

See http://msdn.microsoft.com/en-us/library/bk6w7hs8.aspx. You can get access to the NetworkStream on the TCPClient so you can change the ReadTimeout.

string SendCmd(string cmd, string ip, int port)
{
  var client = new TcpClient(ip, port);
  var data = Encoding.GetEncoding(1252).GetBytes(cmd);
  var stm = client.GetStream();
  // Set a 250 millisecond timeout for reading (instead of Infinite the default)
  stm.ReadTimeout = 250;
  stm.Write(data, 0, data.Length);
  byte[] resp = new byte[2048];
  var memStream = new MemoryStream();
  int bytesread = stm.Read(resp, 0, resp.Length);
  while (bytesread > 0)
  {
      memStream.Write(resp, 0, bytesread);
      bytesread = stm.Read(resp, 0, resp.Length);
  }
  return Encoding.GetEncoding(1252).GetString(memStream.ToArray());
}

As a footnote for other variations on this writing network code...when doing a Read where you want to avoid a "block", you can check the DataAvailable flag and then ONLY read what is in the buffer checking the .Length property e.g. stm.Read(resp, 0, stm.Length);

The size is unknown, can be on the MB range... As I've said on the question, the third option does read everything correctly, but I can't accept the need for that Sleep there, and I can't explain it either. I know what it does: by sleeping I give time for data to arrive in the socket, but I didn't want to sleep 20ms if data arrived before that. I know how to do that with non-blocking reads and my own overlapped IO... but I don't think I should need to resort to that in .NET! — Loudenvier, Oct 27 '12 at 05:13
What method do you use to allow the reading to know if it's read all the data i.e. the right amount/size? The typical way is to use a size header that is encoded in the first part of the response. — Colin Smith, Oct 27 '12 at 05:17
I do have a Content-lenght (the protocol is almost identical to HTTP), but for simplicity I would use a "incoming data timeout" and I thought that Read would block for a time and then timeout and not wait forever for that to arrive! With the "inside-loop" sleep I'll pay the 20ms price in every iteration, with a "incoming data timeout", I would pay it only when needed and at the last read. I can optimize it using the ContentLenght, but relying on the length alone is also risky as the server may misbehave and don't send everything! — Loudenvier, Oct 27 '12 at 05:24
ReadTimeout only times out after a read operation actually started. If no data arrives ever ReadTimeout has no effect. I'm looking into Socket.ReceiveTimeout... — Loudenvier, Oct 27 '12 at 05:47
I'm needing to monitor TCP traffic from my co's legacy PBX (via a known IP Address and Port #) to its legacy client software (Oaisys NetPhone v4.6) both no longer supported (unless we upgrade which we won't any time soon) to extract Caller ID phone #'s to work around the client sfw no longer logging Caller ID #'s under Windows 10 (vs. 7 or XP). Can I use `TcpClient` to do this (e.g. without possibly blocking comms between NetPhone andthe PBX)? Or would I have to use `IPGlobalPrperties` or somethng else? I'm totally new to networking sfw. Any help appreciated. — Tom, Jul 10 '17 at 21:52

score 10 · Accepted Answer · answered Oct 27 '12 at 06:13

Setting the underlying socket ReceiveTimeout property did the trick. You can access it like this: yourTcpClient.Client.ReceiveTimeout. You can read the docs for more information.

Now the code will only "sleep" as long as needed for some data to arrive in the socket, or it will raise an exception if no data arrives, at the beginning of a read operation, for more than 20ms. I can tweak this timeout if needed. Now I'm not paying the 20ms price in every iteration, I'm only paying it at the last read operation. Since I have the content-length of the message in the first bytes read from the server I can use it to tweak it even more and not try to read if all expected data has been already received.

I find using ReceiveTimeout much easier than implementing asynchronous read... Here is the working code:

string SendCmd(string cmd, string ip, int port)
{
  var client = new TcpClient(ip, port);
  var data = Encoding.GetEncoding(1252).GetBytes(cmd);
  var stm = client.GetStream();
  stm.Write(data, 0, data.Length);
  byte[] resp = new byte[2048];
  var memStream = new MemoryStream();
  var bytes = 0;
  client.Client.ReceiveTimeout = 20;
  do
  {
      try
      {
          bytes = stm.Read(resp, 0, resp.Length);
          memStream.Write(resp, 0, bytes);
      }
      catch (IOException ex)
      {
          // if the ReceiveTimeout is reached an IOException will be raised...
          // with an InnerException of type SocketException and ErrorCode 10060
          var socketExept = ex.InnerException as SocketException;
          if (socketExept == null || socketExept.ErrorCode != 10060)
              // if it's not the "expected" exception, let's not hide the error
              throw ex;
          // if it is the receive timeout, then reading ended
          bytes = 0;
      }
  } while (bytes > 0);
  return Encoding.GetEncoding(1252).GetString(memStream.ToArray());
}

Be careful using this hack. We have a Java application that relied on socket timeout to detect end-of-message and from time to time we lost a byte in the stream. Admittedly I do not know if this also holds in .NET, but I suspect it is the underlying TCP/IP stack that was the cause. You are simply not supposed to reuse a socket after a timeout. — Søren Boisen, Jun 19 '14 at 12:29
@SørenBoisen I'm not reusing the socket after the timeout... I won't loose data, but I can have a false "positive" timeout (if the network is slow data can take longer than the intercharacter timeout to arrive), but in this case the other side will simply handle a disconnection and, hopefully, try again. This is not actually a hack, the protocol, which I can't change, sends unknown sized messages, so I must rely on timeouts. I've actually made the code more robust with time, but didn't update the article. Now it's a good time. — Loudenvier, Jun 20 '14 at 04:39
10035 means it's trying really hard to connect, aka the stream is empty at the moment but if we wait briefly then the message will continue. I use the request header content length vs sumOfBytesRead to decide if a 10035 should keep retrying to read. — BeatriceThalo, Dec 23 '20 at 20:36

score 0 · Answer 3 · answered Oct 27 '12 at 04:59

0

As per your requirement, Thread.Sleep is perfectly fine to use because you are not sure when the data will be available so you might need to wait for the data to become available. I have slightly changed the logic of your function this might help you little further.

string SendCmd(string cmd, string ip, int port)
{
    var client = new TcpClient(ip, port);
    var data = Encoding.GetEncoding(1252).GetBytes(cmd);
    var stm = client.GetStream();
    stm.Write(data, 0, data.Length);
    byte[] resp = new byte[2048];
    var memStream = new MemoryStream();

    int bytes = 0;

    do
    {
        bytes = 0;
        while (!stm.DataAvailable)
            Thread.Sleep(20); // some delay
        bytes = stm.Read(resp, 0, resp.Length);
        memStream.Write(resp, 0, bytes);
    } 
    while (bytes > 0);

    return Encoding.GetEncoding(1252).GetString(memStream.ToArray());
}

Hope this helps!

answered Oct 27 '12 at 04:59

Furqan Safdar

14,930
12
52
85

1

I've written the method exactly as this after posting the question! Still I don't find it acceptable. If I was using good old WIN32 overlapped IO I could have "blocked" for 20 or more msecs until data started arriving, so my code could be safer with something like this pseudo-code: `stm.ReadIfDataBecomesAvailableInUpTo(timeout=2000)` I know 20ms is a small price to pay, but it will be paid in every interation! To ready a MB sized response it will be a HUGE OVERHEAD! I didn't want to resort to writing my own timeout logic... :-( – Loudenvier Oct 27 '12 at 05:17
7

You have to be careful with Sleeping on threads. If the time is too small, you are forcing unnecessary context switches on the OS, thus thrashing the CPU. A better mechanism would probably be interrupt-driven or 'event'-driven. For this you could use stm.BeginRead() which would allow you to have an event called when data is ready, that way your process can be in a blocked state where the OS will only awake it when the resource is ready. Sleep will yield back to the OS every time the process awoken. – user1132959 Jun 18 '13 at 03:13

What is the correct way to read from NetworkStream in .NET

3 Answers3

Linked

Related