I have to read an 8192x8192 matrix into memory, and I want to do it as fast as possible.
Right now I have this structure:
char inputFile[8192][8192*4]; // Each number has at most 3 digits
int8_t matrix[8192][8192]; // Matrix to be populated
// Read entire file line by line using fgets
while (lineNum < 8192 && fgets(inputFile[lineNum], MAXCOLS, fp)) lineNum++;
// Populate the matrix in parallel
for (t = 0; t < NUM_THREADS; t++){
    pthread_create(&threads[t], NULL, ParallelRead, (void *)(intptr_t)t);
}
In the function ParallelRead, I parse each line, call atoi on each number, and populate the matrix. The parallelism is line-wise: thread t parses lines t, t + NUM_THREADS, t + 2 * NUM_THREADS, and so on.
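To make the line-wise split concrete, here is a minimal sketch of what such a worker could look like. It is not the actual ParallelRead from my program: it assumes the numbers on each line are separated by whitespace, it uses strtol instead of atoi so the end of each number is known, and the helper names parse_line and parse_all as well as the small ROWS/COLS values are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define ROWS 8          /* 8192 in the real program; kept small for the sketch */
#define COLS 8
#define NUM_THREADS 2

static char inputFile[ROWS][COLS * 4];   /* raw text lines, filled by fgets */
static int8_t matrix[ROWS][COLS];        /* parsed values */

/* Hypothetical helper: parse one whitespace-separated line into a row.
 * Returns how many values were stored. Values that do not fit in int8_t
 * are narrowed by the cast, so the input is assumed to stay in range. */
static int parse_line(const char *line, int8_t *row, int max_cols)
{
    int n = 0;
    char *end;
    while (n < max_cols) {
        long v = strtol(line, &end, 10);
        if (end == line)
            break;                       /* no further number on this line */
        row[n++] = (int8_t)v;
        line = end;                      /* continue after the parsed number */
    }
    return n;
}

/* Thread t handles rows t, t + NUM_THREADS, t + 2 * NUM_THREADS, ... */
static void *ParallelRead(void *arg)
{
    int t = (int)(intptr_t)arg;
    for (int r = t; r < ROWS; r += NUM_THREADS)
        parse_line(inputFile[r], matrix[r], COLS);
    return NULL;
}

/* Hypothetical driver: start the workers and wait for all of them. */
static void parse_all(void)
{
    pthread_t threads[NUM_THREADS];
    for (int t = 0; t < NUM_THREADS; t++) {
        int rc = pthread_create(&threads[t], NULL, ParallelRead,
                                (void *)(intptr_t)t);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(threads[t], NULL);
}
```

Each thread writes only to its own rows of matrix, so no locking is needed in this layout; parse_all would be called once after the fgets loop has filled inputFile.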
On a two-core system with 2 threads, the timings are:
Loading big file (fgets) : 5.79126
Preprocessing data (Parallel Read) : 4.44083
Is there a way to optimize this any further?