19

I have numerous text files of data in the form of float numbers. I'm looking for the fastest way to read them in C++. I can change the files to binary if that's fastest.

It would be great if you could give me a hint or refer me to a website with a complete explanation. I don't know whether there is a library that does the work fast; even open-source software that does the job would be helpful.

Lightness Races in Orbit
Kiarash

3 Answers

27

Having a binary file is the fastest option. Not only can you read it directly into an array with a raw istream::read in a single operation (which is very fast), but you can even map the file into memory if your OS supports it; you can use open/mmap on POSIX systems, CreateFile/CreateFileMapping/MapViewOfFile on Windows, or even the Boost cross-platform solution (thanks @Cory Nelson for pointing it out).

Quick & dirty examples, assuming the file contains the raw representation of some floats:

"Normal" read:

#include <fstream>
#include <vector>

// ...

// Open the stream in binary mode
std::ifstream is("input.dat", std::ios::binary);
// Determine the file length
is.seekg(0, std::ios_base::end);
std::size_t size = is.tellg();
is.seekg(0, std::ios_base::beg);
// Create a vector to store the data
std::vector<float> v(size / sizeof(float));
// Load the data
is.read((char*) &v[0], size);
// Close the file
is.close();

Using shared memory:

#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>

using namespace boost::interprocess;

// ....

// Create the file mapping
file_mapping fm("input.dat", read_only);
// Map the file in memory
mapped_region region(fm, read_only);
// Get the address where the file has been mapped
float * addr = (float *)region.get_address();
std::size_t elements = region.get_size() / sizeof(float);
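
For comparison, the POSIX open/mmap route mentioned above looks roughly like this (a minimal sketch; error handling omitted):

#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Open the file and determine its size
int fd = open("input.dat", O_RDONLY);
struct stat sb;
fstat(fd, &sb);
// Map the file and treat the mapping as an array of floats
float * addr = (float *)mmap(nullptr, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
std::size_t elements = sb.st_size / sizeof(float);
// ... use the data ...
munmap(addr, sb.st_size);
close(fd);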
Matteo Italia
  • [Boost Interprocess](http://www.boost.org/doc/libs/1_47_0/doc/html/interprocess/sharedmemorybetweenprocesses.html#interprocess.sharedmemorybetweenprocesses.mapped_file) supplies cross-platform memory-mapped files. – Cory Nelson Jul 19 '11 at 23:04
  • @Cory: uh, nice, didn't know that Boost got that too. – Matteo Italia Jul 19 '11 at 23:05
  • memory mapping is going to be very fast if the file is already cached. if not, `read` will beat it. – Karoly Horvath Jul 19 '11 at 23:05
  • is actually mmap going to help in continuos (or how to call it) read? – tomasz Jul 19 '11 at 23:06
  • @yi_H: I don't think that `mmap` will be any slower than `read`. – Matteo Italia Jul 19 '11 at 23:07
  • Theoretically it shouldn't, for both of them the OS should figure out that it has to prefetch the next blocks, but practically `read` is faster. At least that's what I've measured. – Karoly Horvath Jul 19 '11 at 23:11
  • I don't know if this is helpful, but I tested reading 10 million float numbers with the "normal read" and the "shared memory read". With O3 optimization, it took 185 ms with the normal read and 328 ms with the boost one. O3 speeds up the normal read by 100% but doesn't change the boost reading speed. – wizmer Apr 21 '15 at 23:21
8

Your bottleneck is the I/O. You want the program to read as much data into memory as possible in the fewest I/O calls. For example, reading 256 numbers with one fread is faster than 256 freads of one number each.
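
For illustration, a minimal sketch of the difference (the file name and element count here are assumptions):

#include <cstdio>

int main() {
    const std::size_t N = 256;
    float buf[N];

    std::FILE * f = std::fopen("input.dat", "rb");
    if (!f) return 1;

    // Fast: one call transfers all 256 floats at once
    std::size_t got = std::fread(buf, sizeof(float), N, f);

    // Slow: 256 separate calls, each paying the per-call overhead
    // for (std::size_t i = 0; i < N; ++i)
    //     std::fread(&buf[i], sizeof(float), 1, f);

    std::fclose(f);
    return got == N ? 0 : 1;
}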

If you can, format the data file to match the target platform's internal floating-point representation, or at least your program's representation. This removes the overhead of translating the textual representation into the internal one.
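
If the data starts out as text, a one-time converter along these lines pays the parsing cost once, after which every run can use the fast binary path (the file names are assumptions):

#include <fstream>
#include <vector>

int main() {
    // Parse the text file once...
    std::ifstream in("input.txt");
    std::vector<float> v;
    float f;
    while (in >> f)
        v.push_back(f);

    // ...and write the raw floats so later runs can read them in one gulp
    std::ofstream out("input.dat", std::ios::binary);
    out.write(reinterpret_cast<const char*>(v.data()), v.size() * sizeof(float));
    return 0;
}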

Bypass the OS and use the DMA controller to read in the file data, if possible. The DMA chip takes the burden of reading data into memory off the shoulders of the processor.

Compact your data file. The data file should occupy one contiguous set of sectors on the disk; this reduces the time spent seeking to different areas on the physical platters.

Have your program demand exclusive control over the disk resource and the processors: block all other unimportant tasks and raise the priority of your program's execution.
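
Raising the priority is OS-specific; on POSIX systems a minimal sketch looks like this (the value -10 is just an example, and negative nice values generally require elevated privileges):

#include <sys/resource.h>

int main() {
    // Ask the scheduler to favour this process (0 = the calling process)
    setpriority(PRIO_PROCESS, 0, -10);
    // ... proceed with the I/O-heavy work ...
    return 0;
}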

Use multiple buffers to keep the disk drive spinning. A large portion of time is spent waiting for the hard drive to accelerate and decelerate. Your program can process data from one buffer while something else is storing data into another, which leads to ...

Multi-thread. Create one thread to read in the data and alert the processing task when the buffer is not empty.
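
A minimal sketch of that reader/consumer split using C++11 threads (the chunk size and file name are assumptions):

#include <condition_variable>
#include <cstddef>
#include <fstream>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    std::vector<float> buffer;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    // Reader thread: pulls one chunk at a time off the disk
    std::thread reader([&] {
        std::ifstream is("input.dat", std::ios::binary);
        std::vector<float> chunk(4096);
        while (is.read((char*)chunk.data(), chunk.size() * sizeof(float)) || is.gcount() > 0) {
            std::size_t n = is.gcount() / sizeof(float);
            std::lock_guard<std::mutex> lock(m);
            buffer.insert(buffer.end(), chunk.begin(), chunk.begin() + n);
            cv.notify_one();
        }
        { std::lock_guard<std::mutex> lock(m); done = true; }
        cv.notify_one();
    });

    // Consumer: processes data as soon as the buffer is not empty
    std::vector<float> work;
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return !buffer.empty() || done; });
        if (buffer.empty() && done) break;
        work.swap(buffer);   // grab everything read so far
        lock.unlock();
        // ... process `work` while the reader keeps filling `buffer` ...
        work.clear();
    }

    reader.join();
    return 0;
}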

These should keep you busy for a while. All other optimizations will result in negligible performance gains. (Such as accessing the hard drive controller directly to transfer into one of your buffers.)

Thomas Matthews
  • The OP has a text file. Speedup #2, converting that text file to binary, is going to speed things up immensely. Follow that by #1, read as much as you can in one gulp. Everything after that is gravy. – David Hammen Jul 19 '11 at 23:33
2

Also pay attention to the compile mode. When I parsed a file with 1M lines, debug mode took 50 seconds to parse the data and append it to my container, while release mode was at least ten times faster, about 4 seconds. The code below reads the whole file into memory before using istringstream to parse the data as comma-separated 2D points.

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace std;

vector<float> in_data;
string raw_data;

ifstream ifs;
ifs.open(_file_in.c_str(), ios::binary);
ifs.seekg(0, ios::end);
long length = ifs.tellg();
ifs.seekg(0, ios::beg);
char * buffer = new char[length];
ifs.read(buffer, length);
// Copy exactly `length` bytes: the buffer is not null-terminated
raw_data.assign(buffer, length);
ifs.close();
delete[] buffer;
cout << "Size: " << raw_data.length()/1024/1024.0 << "Mb" << endl;
istringstream _sstr(raw_data);
string _line;

while (getline(_sstr, _line)) {
    istringstream _ss(_line);
    vector<float> record;
    // maybe using boost/Tokenizer is a good idea ...
    while (_ss)
    {
        string s;
        if (!getline(_ss, s, ',')) break;
        record.push_back(atof(s.c_str()));
    }
    in_data.push_back(record[0]);   // keeps only the first coordinate of each point
}
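
The debug/release gap above comes mostly from optimization flags; with GCC or Clang, for example, the two builds differ roughly like this (the source file name is made up):

g++ -O0 -g parser.cpp   # debug build: no optimization
g++ -O3 parser.cpp      # release build: full optimization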
Brian Ng