We have an app that exports very long CSV files, thousands to millions of lines. We use a StringBuilder and lots of .Appends to construct the file, and this runs out of memory around 500,000 lines. StringWriter is based on StringBuilder, and fails at the same point in my tests.

MemoryStream (or MemoryTributary) should not have any problems dealing with a file this large, but its API is based on byte[]. I know I can convert, but this makes the code somewhat (lots?) more difficult to read.

Is there a simpler solution for writing multiple strings to a stream that I'm missing, or perhaps a clean way to implement such functionality (an extension, perhaps)?

Cœur
Maury Markowitz
  • Do you really need to have all the data in memory? Why don't you write it directly into the file you're exporting? – Joni Jan 20 '16 at 20:51
  • The way I understand it, when an append pushes a string OVER the current capacity, `StringBuilder` will first try to double its capacity before it "settles" for just enough to hold the current string plus the addition. My guess is that the attempt to double is where your error occurs. `StringBuilder` has a `Capacity` property. If you know roughly the size you need, set it so it doesn't have to do these in-memory resizes. – Jeroen Jan 20 '16 at 21:03
  • Force the application to run as a 64-bit application, and use Framework 4.5. Here's an SO question that might help: http://stackoverflow.com/questions/1087982/single-objects-still-limited-to-2-gb-in-size-in-clr-4-0 – JerryM Jan 20 '16 at 21:08
  • @Joni - that's the same question - how do I write lots of strings to X, whether X be a MemoryStream, a FileStream or a DeflateStream. – Maury Markowitz Jan 20 '16 at 21:24
  • @JerryM - the bit-size or actual memory doesn't make a difference; this is a well-known problem inside the .NET memory routines - Google up MemoryTributary some time. – Maury Markowitz Jan 20 '16 at 21:24
  • I'm not sure I am understanding the question, but is something like [`StreamWriter`](https://msdn.microsoft.com/en-us/library/system.io.streamwriter%28v=vs.110%29.aspx) what you are looking for? The API supports writing strings (with `Write` or `WriteLine`) directly to a file, or can be wrapped around a `MemoryStream` if you really need the data in memory. – Mark Jan 20 '16 at 21:39
  • Why don't you just read this in chunks if that is the case? Then you can perform your action and then continue reading... – zaggler Jan 20 '16 at 21:45

1 Answer

Just so there's some closure on this: Mark's answer was the one I was looking for. Building up the stream with StreamWriter works; StringWriter fails. I still have some encoding issues, but those are minor in comparison.
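
For anyone landing here later, this is roughly what the StreamWriter approach looks like. It is only a minimal sketch, not the actual export code: `CsvExport`, `WriteRows` and the `rows` parameter are names made up for illustration, the fields are assumed to need no CSV quoting or escaping, and the UTF-8-without-BOM encoding is just one plausible choice for the encoding issue mentioned above. The `StreamWriter` constructor with `leaveOpen` needs .NET 4.5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvExport
{
    // Hypothetical helper: writes each row to the stream as soon as it is produced,
    // so only the writer's buffer is held in memory rather than one giant string.
    public static void WriteRows(Stream output, IEnumerable<string[]> rows)
    {
        // Assumption: UTF-8 without a byte-order mark. Pick whatever the consumer expects.
        var encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);

        // leaveOpen: true keeps the underlying stream usable by the caller
        // after the writer is disposed (and flushed).
        using (var writer = new StreamWriter(output, encoding, bufferSize: 65536, leaveOpen: true))
        {
            foreach (var fields in rows)
            {
                // Assumption: fields contain no commas, quotes or newlines.
                writer.WriteLine(string.Join(",", fields));
            }
        }
    }
}
```

Because `WriteRows` takes a plain `Stream`, the same call should work whether the target is a `FileStream` (e.g. from `File.Create`), a `MemoryStream` or a `DeflateStream`.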

Maury Markowitz