I am trying to write a program to compare two text files using UNIX API calls. Here are the contents of my two files:
f1.txt
This is my sample.
It contains text
And for some reason
The last few chars
are duplicated?
f2.txt
This is another sample
Sometimes instead of
duplicating the last few chars,
it prints another new line
instead
4567865
I have a C++ file that opens and reads these files. My OpenRead function takes a filename as a C string, reads the contents of the text file into a string, and returns it.
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <iostream>
#include <string>
#include <cstring>
using namespace std;
string OpenRead(const char*);
size_t fsize(const char*); // declared here because OpenRead calls it before its definition
int main(int argc, char **argv)
{
    string text1 = "", text2 = "";
    string file1(argv[1]);
    string file2(argv[2]);
    text1 = OpenRead(file1.c_str());
    text2 = OpenRead(file2.c_str());
    cout << text1 << endl;
    cout << text2 << endl;
    exit(EXIT_SUCCESS);
    return 0;
}
string OpenRead(const char* filename)
{
    int inFD1;
    string text;
    char * buf = new char[fsize(filename)];
    inFD1 = open(filename, O_RDONLY, 0);
    if(inFD1 < 0) exit(EXIT_FAILURE);
    else
    {
        while (read(inFD1, buf, sizeof(int)) != 0)
            text += buf; //cout << buf;
    }
    close(inFD1);
    delete [] buf;
    return text;
}
size_t fsize(const char *filename) {
    struct stat st;
    if (stat(filename, &st) == 0)
        return st.st_size;
    return -1;
}
The issue is that when I compile this into an executable and run it as fileComp f1.txt f2.txt, it opens and reads the files almost correctly, but the output has extra characters appended to the end. Here is what the output looks like:
This is my sample.
It contains text
And for some reason
The last few chars
are duplicated?
e
This is another sample
Sometimes instead of
duplicating the last few chars,
it prints another new line
instead
4567865
8
For some reason it appends an e to the first file and an 8 to the second. The behavior varies between text files, but it always appends stray characters from the buffer to the end.
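
A likely explanation, offered as a sketch rather than a confirmed diagnosis: read() stores raw bytes and does not null-terminate the buffer, while text += buf treats buf as a C string and keeps copying until it happens to hit a '\0'. The loop also ignores how many bytes read() actually returned, so a short final read leaves bytes from the previous chunk in the buffer. The version below appends exactly the number of bytes reported by read(). The name readAll and the 4096-byte chunk size are illustrative choices, not part of the original program, and retrying on EINTR is left out for brevity.

```cpp
#include <fcntl.h>      // open
#include <unistd.h>     // read, close
#include <stdexcept>    // std::runtime_error
#include <string>

// Hypothetical replacement for OpenRead: returns the whole file as a string.
std::string readAll(const char* filename)
{
    int fd = open(filename, O_RDONLY);
    if (fd < 0)
        throw std::runtime_error("open failed");

    std::string text;
    char buf[4096];   // arbitrary chunk size; no need to size it to the file
    ssize_t n;

    // read() returns the number of bytes stored in buf,
    // 0 at end of file, and -1 on error.
    while ((n = read(fd, buf, sizeof buf)) > 0)
        text.append(buf, static_cast<std::string::size_type>(n)); // exactly n bytes

    close(fd);
    if (n < 0)
        throw std::runtime_error("read failed");
    return text;
}
```

Because append(buf, n) takes an explicit length, the buffer never needs a terminating '\0', and leftover bytes from an earlier read() are never copied.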