I have large text files into which I need to embed a time code, part of the way through the file. I do this by reading from an ifstream value by value up to the insertion point, and then continuing through the rest of the file, copying each value into a new ofstream file.

These files are large, and almost all of this simple copy and paste operation occurs after the code insertion. This takes a while to execute. I was wondering if there was a way to optimize copying the rest of the file in bulk (rather than word-by-word iteration for the rest of the file). This is the relevant code segment:

while (!in.eof())
{
    in >> value;
    if ((counter > 392) && (counter < 399) && (timePosition < 6))
    {
        rounded = floorf(value * 1000) / 1000;
        value = rounded + (time[timePosition] * .00001);
        timePosition++;
    }
    out << value << " ";
    counter++;
}
Alex Mack
  • If you're using Linux, the `split` command could be a lot more efficient. I.e., split the file in two, write your new lines, then concatenate them all back together again. – JeffUK Feb 24 '19 at 23:14
  • https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong – Retired Ninja Feb 24 '19 at 23:16
  • Check the return value of >> – Kenny Ostrom Feb 24 '19 at 23:17
  • [Copy a file in a sane, safe and efficient way](https://stackoverflow.com/questions/10195343/copy-a-file-in-a-sane-safe-and-efficient-way) – Deduplicator Feb 24 '19 at 23:22
  • @JeffUK That sounds useful - is there a good example of split in code? (there will be a Linux version and a Windows version, but the Linux one needs to be the fast one) – Alex Mack Feb 24 '19 at 23:53
  • `out << in.rdbuf();`. – Pete Becker Feb 24 '19 at 23:54
  • @Kenny Ostrom Kenny, you lost me there. Could you explain? – Alex Mack Feb 24 '19 at 23:57
  • Do these files fit into memory? Then you could read the whole file into a std::string, parse to the right location, then write out the first part, then the inserted part, and finally the last part back to the output file. – J.R. Feb 24 '19 at 23:57
  • @Pete Becker - Could you invoke that .rdbuf() at the completion of inserting the time code (which would copy the rest of the text file from that point in the ifstream)? – Alex Mack Feb 25 '19 at 00:00
  • @J.R. They are about 11MB. Would the string parsing be a faster route? How would this be expressed in code? – Alex Mack Feb 25 '19 at 00:05
  • 11 MB sounds OK; can't say for sure if it is faster; you could try. To read the string: https://stackoverflow.com/questions/2602013/read-whole-ascii-file-into-c-stdstring – J.R. Feb 25 '19 at 00:11
  • @Deduplicator that's a great link, but it really only covers the case where you want to copy the entire file, not just a part of it. – Mark Ransom Feb 25 '19 at 18:01
  • @MarkRansom As others commented, one can adapt most of the given solutions appropriately. – Deduplicator Feb 25 '19 at 19:25
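
To illustrate the "check the return value of >>" comment above, here is a minimal sketch. The helper name `copyValues` is made up for this example, and it assumes the file holds whitespace-separated numbers. The extraction itself is the loop condition, so the body only runs when a value was actually read; a `while (!in.eof())` loop typically runs one extra time after the last value.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: copies whitespace-separated numbers from `in` to `out`,
// writing one space after each value. Returns how many values were copied.
std::size_t copyValues(std::istream& in, std::ostream& out)
{
    std::size_t count = 0;
    double value = 0.0;
    // `in >> value` returns the stream, which converts to false as soon as an
    // extraction fails (end of input or a token that is not a number).
    while (in >> value)
    {
        out << value << ' ';
        ++count;
    }
    // The loop can only end after a failed extraction.
    assert(in.fail());
    return count;
}
```

Called with an `std::ifstream` and an `std::ofstream`, this stops cleanly at the end of the file instead of writing the last value a second time.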

1 Answer

Pete Becker's comment above was just what was needed.

out << in.rdbuf();

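What used to take a minute now finishes in seconds. `out << in.rdbuf()` passes the input's stream buffer to the output stream, which copies everything that has not been read yet in bulk instead of value by value. The new code: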

// Stop after value 398, or earlier if an extraction fails.
while ((counter < 399) && (in >> value))
{
  // Values 393 through 398 receive the six time digits.
  if ((counter > 392) && (timePosition < 6))
  {
    rounded = floorf(value * 1000) / 1000;
    value = rounded + (time[timePosition] * .00001);
    timePosition++;
  }
  out << value << " ";
  counter++;
}
out << in.rdbuf();

Thank you to all of you who commented; you were very informative, and I now know a lot more than I did when I asked this question!
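
For anyone who wants to try the technique in isolation, here is a minimal sketch using string streams. `copyHeadThenRest` and `headCount` are names invented for this example, not part of my program. It re-writes the first few values with formatted I/O and then hands the remainder to `rdbuf()`.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: re-writes the first `headCount` numbers with formatted
// I/O, then bulk-copies whatever is left. Returns how many numbers were parsed.
std::size_t copyHeadThenRest(std::istream& in, std::ostream& out,
                             std::size_t headCount)
{
    std::size_t copied = 0;
    double value = 0.0;
    // The count check comes first, so no extra value is consumed once
    // `headCount` values have been read.
    while (copied < headCount && in >> value)
    {
        out << value << ' ';
        ++copied;
    }
    assert(copied <= headCount);
    // Streams every unread character, whitespace included, without parsing.
    // Note: if nothing is left to copy, this sets failbit on `out`.
    if (in)
        out << in.rdbuf();
    return copied;
}
```

The tail is copied byte for byte, so its original whitespace and line breaks survive; only the re-written head gets the single-space separators.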

Alex Mack
  • That doesn't seem like it would copy the whole file, just a small portion of it. – Mark Ransom Feb 25 '19 at 01:42
  • @Mark Ransom From what I've tested so far, it seems to have worked; it transfers the rest of the file extremely quickly after the code insertion loop (the [out << in.rdbuf();] at the end) – Alex Mack Feb 25 '19 at 01:50
  • I've been reading up on `rdbuf` and it does appear to have the ability to stream the entire contents of the file, but it would require `operator< – Mark Ransom Feb 25 '19 at 04:18
  • As an improvement, consider reading the first 399+ bytes into a buffer at once, replace parts as you want, and write it at once. Should be much faster. – Deduplicator Feb 25 '19 at 19:28
  • @Deduplicator That's the case; I'm just not sure how to express that extra bit of efficiency via code. How would one bring those bytes into a buffer, and access the specific bytes for modification? (pardon my ignorance) – Alex Mack Feb 25 '19 at 23:04
  • Use [`std::istream::read()`](https://en.cppreference.com/w/cpp/io/basic_istream/read) and [`std::ostream::write()`](https://en.cppreference.com/w/cpp/io/basic_ostream/write), and an automatic char-array. – Deduplicator Feb 25 '19 at 23:07
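
A rough sketch of what the `read()`/`write()` suggestion could look like. The names `patchAndCopy` and `kHeaderSize`, the 4096-byte buffer size, and the single-byte replacement are all assumptions made for illustration. It also assumes the data to change sits at a known byte offset, which is not guaranteed for the value-counted layout in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Assumed upper bound on how far into the file the edit can be.
constexpr std::size_t kHeaderSize = 4096;

// Hypothetical helper: reads the start of `in` into an automatic char array,
// overwrites one byte, writes the block back out, then bulk-copies the rest.
// Returns the number of bytes that went through the buffer.
std::streamsize patchAndCopy(std::istream& in, std::ostream& out,
                             std::size_t offset, char replacement)
{
    assert(offset < kHeaderSize);
    char header[kHeaderSize];
    in.read(header, static_cast<std::streamsize>(sizeof header));
    const std::streamsize got = in.gcount();  // bytes actually read
    if (static_cast<std::streamsize>(offset) < got)
        header[offset] = replacement;
    out.write(header, got);
    // A short read sets eofbit and failbit: the whole input fit in the buffer,
    // so there is nothing left to copy.
    if (in)
        out << in.rdbuf();
    return got;
}
```

For binary-exact offsets the files would normally be opened with `std::ios::binary`, since text mode can translate line endings on Windows.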