-3

Possible Duplicate:
What is the best way to slurp a file into a std::string in c++?

I'm trying to imitate PHP's file_get_contents() function for C++.

However, when I convert a char array into a string, it stops at the first null byte:

fread(charbuf, 1, file_size, fp);
string str(charbuf);

How can I initialize the string with a fixed size and read the file contents directly into it? Also, how do I check for errors, for example when there is not enough memory to allocate the string? Reading directly into the string would also get rid of the temporary buffer I'm currently allocating.

How about safety? Is it possible that several processes read the same file at the same time, or that one of them writes to it while I am reading it? How do I prevent that?

I hope you can answer other way than "string isn't binary container".

I ask to reopen this question for the fact: "Apparently, this question is as relevant as ever: two years later, the two most efficient solutions still copy the whole file contents in memory, and this copy cannot be elided by the optimizer. This is a quite unsatisfactory state of affairs. – Konrad Rudolph Oct 25 '10 at 6:25" at What is the best way to read an entire file into a std::string in C++? Or do you want me to create new question that asks to read a file without having extra copy of the string?

Community
Rookie
  • Do you have any problems with using `ifstream`? http://stackoverflow.com/questions/2912520/read-file-contents-into-a-string-in-c – F0G Sep 21 '12 at 13:58
  • @loler, does that allocate only that amount of memory as the file_size would be? or does it create temporary copy of the string? (for example if the file is 100MB, does it have 200MB allocated at some phase of the program?). – Rookie Sep 21 '12 at 14:07
  • **Isn't it funny** that this "exact duplicate" has -2 votes, but the duplicate has +18 votes? I guess it's not as exact a duplicate as you thought. – Rookie Sep 21 '12 at 14:15
  • 2
    No, it isn't funny, nor unexpected. One of the things that contributes to the perceived quality of a question is the research that went into it. There are at least two excellent questions on this subject already; this question may have been downvoted because you didn't find them. – Robᵩ Sep 21 '12 at 14:17
  • Also, string _is_ a binary container. It can contain NUL just fine, and it makes no assumptions whatsoever about what is being stored, not even endianness, for example. – sehe Sep 21 '12 at 14:22
  • **I ask to reopen this question for the fact:** "Apparently, this question is as relevant as ever: two years later, the two most efficient solutions still copy the whole file contents in memory, and this copy cannot be elided by the optimizer. This is a quite unsatisfactory state of affairs. – Konrad Rudolph Oct 25 '10 at 6:25" at http://stackoverflow.com/questions/116038/what-is-the-best-way-to-slurp-a-file-into-a-stdstring-in-c Or do you want me to create new question that asks to read a file without having extra copy of the string? – Rookie Sep 21 '12 at 14:23

2 Answers

6
std::ifstream fin("somefile.txt");
std::stringstream buffer;
buffer << fin.rdbuf();
std::string result = buffer.str();

This snippet reads the whole file into a std::string.

Denis Ermolin
  • there is extra memory allocation too: you convert (copy) buffer to string... i want to avoid all that and just read it directly to string container without any side-allocations. – Rookie Sep 21 '12 at 14:03
  • 1
    Present compilers implement return value optimization (RVO). No problem with the temporary object here. – Denis Ermolin Sep 21 '12 at 14:04
  • @Rookie honestly what do you care about one allocation more or less, unless you're on an embedded system, it isn't going to slow you down that much. – Tony The Lion Sep 21 '12 at 14:04
  • @TonyTheLion, i like to make it good, instead of "enough good". it hurts my brain to see its having a copy of something, which is waste of memory and time. – Rookie Sep 21 '12 at 14:08
  • 2
    @Rookie: I can understand the sentiment. However, many C++ constructs that *look* like they are copying data actually don't. (That's not because C++ is stupid, to the contrary: It is because C++ can optimize some things that other languages cannot.) First rule of optimization: Measure, optimize, measure. – DevSolar Sep 21 '12 at 14:15
  • @DevSolar, oh, so this stringstream isnt actually copying the data, just the pointer? But what happens when the `buffer` is destroyed? Thats why i thought it must be copying the data... – Rookie Sep 21 '12 at 14:17
  • @DevSolar, you are wrong: "Apparently, this question is as relevant as ever: two years later, the two most efficient solutions still copy the whole file contents in memory, and this copy cannot be elided by the optimizer. This is a quite unsatisfactory state of affairs. – Konrad Rudolph Oct 25 '10 at 6:25" at http://stackoverflow.com/questions/116038/what-is-the-best-way-to-slurp-a-file-into-a-stdstring-in-c – Rookie Sep 21 '12 at 14:22
  • @Rookie: Read again what I *said*, not what you did read into it. That comment is two years old, and you still didn't measure if you actually have a problem if you do it this way, or if it simply doesn't matter because the matrix multiplication in the critical path of your application needs three orders of magnitude more RAM and CPU (or the buffer handling has been optimized by other people in the meantime). – DevSolar Sep 21 '12 at 14:30
  • @DevSolar, its not really about optimization here, its about doing it right, instead of doing it "enough good". i could as well get a kick in my nuts every day, just because you say it doesnt reduce my work-time much. it still hurts, though, but i would like to get rid of that pain since **it is not needed**. – Rookie Sep 21 '12 at 14:37
  • 1
    @Rookie: Which brings us right back to square one: "I understand the sentiment..." I really do. But you *still* haven't measured whether *your* compiler can optimize the call in *your* library. And, as unfortunate as it is to perfectionists like you (and me), outside the critical path simplicity and readability trump optimization. (That's a statement with over twelve years hands-on experience behind it.) – DevSolar Sep 21 '12 at 14:41
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/16965/discussion-between-devsolar-and-rookie) – DevSolar Sep 21 '12 at 14:45
0

I hope you can answer other way than "string isn't binary container".

std::string is a binary container, but the constructor you chose takes a C-style string as an argument. Try a different constructor:

std::fread(charbuf, 1, file_size, fp);
std::string str(charbuf, file_size);
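
To illustrate the difference between the two constructors, here is a minimal sketch. The helper names `from_c_string` and `from_bytes` are made up for this example and do not come from the thread.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: uses the C-string constructor,
// which stops copying at the first NUL byte.
std::string from_c_string(const char* buf) {
    return std::string(buf);
}

// Hypothetical helper: uses the pointer-plus-length constructor,
// which copies exactly `len` bytes, NUL bytes included.
std::string from_bytes(const char* buf, std::size_t len) {
    return std::string(buf, len);
}
```

For a buffer holding the five bytes `a`, `b`, NUL, `c`, `d`, `from_c_string` yields a string of size 2, while `from_bytes` with a length of 5 keeps all five bytes.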

EDIT: Taking into account the requirement to avoid extra memory allocations:

std::string str(file_size, 0);
std::fread(&str[0], 1, file_size, fp);
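
The two lines above can be expanded into a complete function with error checking. The following is only a sketch under some assumptions: the function name `read_file` and the error messages are invented for illustration, the file is assumed to be a regular seekable file whose size fits in a `long`, and errors are reported by throwing. A failed allocation inside `std::string` surfaces as `std::bad_alloc`, which is how the "not enough memory" case from the question can be detected. Note that `resize` still zero-fills the buffer before `fread` overwrites it; this sketch avoids the second buffer, not the fill.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper (name and messages are illustrative, not from the thread).
// Reads a whole file into a std::string without a second, temporary buffer.
std::string read_file(const char* path) {
    std::FILE* fp = std::fopen(path, "rb");  // binary mode: no newline translation
    if (!fp) throw std::runtime_error("cannot open file");

    // Assumption: a regular, seekable file whose size fits in a long.
    long size = -1;
    if (std::fseek(fp, 0, SEEK_END) == 0) size = std::ftell(fp);
    if (size < 0 || std::fseek(fp, 0, SEEK_SET) != 0) {
        std::fclose(fp);
        throw std::runtime_error("cannot determine file size");
    }

    std::string str;
    try {
        // Single allocation of at least `size` bytes, zero-filled.
        // Throws std::bad_alloc if memory runs out.
        str.resize(static_cast<std::size_t>(size));
    } catch (...) {
        std::fclose(fp);
        throw;
    }

    // Taking &str[0] of an empty string is not safe before C++11, so guard it.
    std::size_t got = 0;
    if (!str.empty()) got = std::fread(&str[0], 1, str.size(), fp);
    const bool failed = std::ferror(fp) != 0;
    std::fclose(fp);
    if (failed) throw std::runtime_error("read error");

    str.resize(got);  // keep only the bytes actually read
    return str;
}
```

A caller would wrap `read_file("somefile.txt")` in a `try` block and catch `std::bad_alloc` and `std::runtime_error` separately if the two failure modes need different handling.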
Robᵩ
  • the problem here is that i need to allocate charbuf and then free it... so if i read 100MB file, i have 200MB data allocated at same time (for char* and std:string). it would be more efficient if i only had 100MB data allocated at once. – Rookie Sep 21 '12 at 14:01
  • See http://stackoverflow.com/questions/116038/what-is-the-best-way-to-slurp-a-file-into-a-stdstring-in-c. – Robᵩ Sep 21 '12 at 14:02
  • that seems to be exactly what the other guy here told, it still is using a temporary memory, so i get 200MB allocated at same time for 100MB file. – Rookie Sep 21 '12 at 14:04
  • could i perhaps use str.resize() somehow...? seems to have such function. – Rookie Sep 21 '12 at 14:06
  • that is same as the other guy (seems like he removed his post now); it has wasteful memory assignment to zeroes. he told that it cannot be avoided... thats a pity. – Rookie Sep 21 '12 at 14:11
  • @Rookie honestly you're being silly. That is not going to make a difference in performance. Have you measured if that would cause a bottleneck? I have my doubts – Tony The Lion Sep 21 '12 at 14:12
  • @TonyTheLion, well if its not possible to avoid that, i guess its my only go... but i just wonder, how does it do it internally? if i could get that code, i could get rid of memset() line of code. I really dont understand whats the point setting the memory which will be ultimately replaced with something else anyways. – Rookie Sep 21 '12 at 14:13
  • @Rookie mostly the point of doing `memset` is to fill with zero bytes, because you don't want to have memory full of garbage. At least for null bytes you can check, you can't check for junk – Tony The Lion Sep 21 '12 at 14:14
  • @TonyTheLion, indeed, but why would i need to check the data, if i check if the fread() function was successful? – Rookie Sep 21 '12 at 14:19
  • To reserve space without initializing, use [std::string::reserve](http://en.cppreference.com/w/cpp/string/basic_string/reserve) of course. And yes the performance _may_ be a factor if the file was actually a kernel special character device with a zero-copy path or something silly like that. Of course, _memory mapped files_ beat the crap out of all of these approaches. – sehe Sep 21 '12 at 14:27
  • @sehe, looks like reserve() works as i wanted!? why did nobody say this before in all those 10 topics? Is that safe to use then? – Rookie Sep 21 '12 at 14:53
  • 1
    @Rookie because it doesn't work. You can't `fread` into a `std::string` for a very simple reason: how do you make `str.size()` return the correct value? (hint: `str.resize()` will overwrite everything with zeroes). – R. Martinho Fernandes Sep 21 '12 at 14:54
  • Hmm. Sadly I was mentally confusing resize and reserve. @R.MartinhoFernandes is right. Will post another idea at the other answer – sehe Sep 21 '12 at 15:06
  • @R.MartinhoFernandes, so i guess the resize() method is best for avoiding extra memory allocations? – Rookie Sep 21 '12 at 16:20
  • @Rookie No, the resize method behaves more-or-less identically to the 2-argument constructor. It allocates memory and initializes all of the members. So, in terms of theoretical performance, they are equal. In terms of practical performance, you need to measure it in your own environment. – Robᵩ Sep 21 '12 at 16:45
  • Aaaaahhh! My eyes! Are you *seriously* using a C function to pump data into the innards of a C++ object, and advertising that as good practice? Please don't touch any of my projects... – DevSolar Sep 22 '12 at 05:04
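
On the `reserve` versus `resize` point raised in the comments above, a small sketch may help. It only reports what each call does to a fresh string; the struct `Sizes` and the two function names are hypothetical and exist just for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this illustration.
struct Sizes {
    std::size_t size;
    std::size_t capacity;
};

// reserve() only grows the capacity; the string still has size() == 0,
// so there are no elements that fread could legally write into.
Sizes after_reserve(std::size_t n) {
    std::string s;
    s.reserve(n);
    return Sizes{s.size(), s.capacity()};
}

// resize() creates n value-initialized (zero) characters, so size() == n
// and the range [&s[0], &s[0] + n) is valid to overwrite.
Sizes after_resize(std::size_t n) {
    std::string s;
    s.resize(n);
    return Sizes{s.size(), s.capacity()};
}
```

With `n = 100`, `after_reserve` reports a size of 0 and a capacity of at least 100, whereas `after_resize` reports a size of exactly 100. That is why reading into a merely reserved string does not work: the bytes would land in storage the string does not consider part of its contents.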