Hello experts,
I am trying to read a large file in consecutive blocks of 10000 lines, because the
file is too large to read in at once. The "skip" argument of read.csv comes in handy
for this (see below). However, I noticed that the program slows down towards the end
of the file (for large values of i).
I suspect this is because each call to read.csv(file, skip = nskip, nrows = block)
starts reading the file from the beginning and scans forward until the requested
starting line is reached, which becomes increasingly time-consuming as i grows.
Question: Is there a way to continue reading the file from the position where the
previous block ended, instead of starting over each time?
numberOfBlocksInFile <- 800
block <- 10000
for (i in 1:(numberOfBlocksInFile - 1))
{
    print(i)
    nskip <- i * block
    # header = FALSE, otherwise the first data line of each block
    # would be consumed as a (bogus) header row
    out <- read.csv(file, skip = nskip, nrows = block, header = FALSE)
    colnames(out) <- names   # 'names' holds the column names, obtained earlier
    .....
    print("keep going")
}
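For what it's worth, one approach I have been considering is to open the file as a
connection once and pass that connection to read.csv repeatedly, so that each call
resumes where the previous one stopped. A small self-contained sketch (it writes a
toy 25-row CSV to a temporary file just for illustration; the filename and block
size are placeholders):

```r
# Sketch: keep one connection open so each read.csv() call continues
# from where the previous block ended, instead of re-scanning with skip=.
path <- tempfile(fileext = ".csv")                  # toy demo file
write.csv(data.frame(x = 1:25, y = letters[1:25]), path, row.names = FALSE)

block <- 10
hdr <- names(read.csv(path, nrows = 1))             # grab column names once
con <- file(path, open = "r")                       # open the connection once
invisible(readLines(con, n = 1))                    # skip the header line

total <- 0
repeat {
  chunk <- tryCatch(
    read.csv(con, nrows = block, header = FALSE, col.names = hdr),
    error = function(e) NULL                        # read.csv errors at EOF
  )
  if (is.null(chunk)) break
  total <- total + nrow(chunk)                      # ... process chunk here ...
}
close(con)
total   # 25 rows read, in blocks of 10, 10 and 5
```

Because the connection is already open, read.csv reads from the current position
and leaves it open, so no lines are re-scanned between blocks. I would be happy to
hear if there is a cleaner or faster way.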
Many thanks :-)