In the following code I'm trying to count how many rows of fileA share the same value in the second column. (Each row has two columns, and both are integers.) Sample of fileA:
1 22
8 3
9 3
I have to write the output in fileB like this:
22 1
3 2
Because 22 appears once in the second column (and 3 appears 2 times).
fileA is very large (30G), and there are 41,000,000 elements in it (in other words, fileB has 41,000,000 rows). This is the code that I wrote:
void function(){
    unsigned long int size = 41000000;
    int* inDeg = new int[size];
    for(int i=0 ; i<size; i++)
    {
        inDeg[i] = 0;
    }
    ifstream input;
    input.open("/home/fileA");
    ofstream output;
    output.open("/home/fileB");
    int a,b;
    while(!input.eof())
    {
        input>>a>>b;
        inDeg[b]++; //<------getting error here.
    }
    input.close();
    for(int i=0 ; i<size; i++)
    {
        output<<i<<"\t"<<inDeg[i]<<endl;
    }
    output.close();
    delete[] inDeg;
}
I'm getting a segmentation fault on the second line of the while loop, on the 547387th iteration. I have already assigned 600M to the stack memory based on this. I'm using gcc 4.8.2 (on Mint17 x86_64).
Solved
I analysed fileA thoroughly. As hyde mentioned, the problem wasn't with the hardware. The segfault was caused by out-of-range indexing: some values in the second column were larger than the array size. Changing the size to 61,500,000 solved my problem.