We are trying to parse a file and are stuck on one problem.

The problem is how to move the file pointer to the next line as soon as we see that we do not need the current line.

Suppose the file (file.txt) looks like this:

A quick brown fox 
// Blah blah
// Blah blah
jumps over the little lazy dog

In our program, we iterate over the file line by line like this (skeleton):

std::ifstream fp("file.txt");
do {
  std::string str;
  std::getline(fp, str);
  std::cout << str << std::endl;
} while (!fp.eof());

So the code above also reads the lines starting with //, all the way to the end of each such line, which wastes time.

The question is: is it possible to skip the lines which start with //? This would save us the time spent traversing such a line to its end.

Any help is appreciated.

Hemant Bhargava
  • No it's not possible. How would you even know that a line is a "comment" (begins with `"//"`) if you don't even read it? – Some programmer dude Nov 30 '18 at 07:01
  • And regarding the "save us time" argument, have you *measured* that it's a bottleneck? Always do that first before thinking about possible optimizations. – Some programmer dude Nov 30 '18 at 07:02
  • Lastly, while the code you show is pseudo-code, I think you should read [Why is iostream::eof inside a loop condition considered wrong?](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong) anyway. – Some programmer dude Nov 30 '18 at 07:03
  • Interestingly the reason why you can't skip comments without reading is the same reason you loop on eof: You don't know if you've found it until after you've looked. – user4581301 Nov 30 '18 at 07:05
  • *Problem is to move the file pointer to the next line* -- Read the file in as binary in one big memory blob, and parse that blob. Then there is no concern about the file pointer. – PaulMcKenzie Nov 30 '18 at 07:18
  • @Someprogrammerdude, Regarding the first point, how does the compiler do that when it sees comments in a cpp file? Doesn't it skip the whole line if it starts with //? I know they do that in lex/yacc or whatever, but a similar concept can be used here too, can't it? – Hemant Bhargava Nov 30 '18 at 08:04
  • @Someprogrammerdude, Regarding your second point, the file I wrote is just a skeleton. Basically, my idea is to read/capture only the lines which start with something. And yes, that is a bottleneck for me. – Hemant Bhargava Nov 30 '18 at 08:05
  • @Someprogrammerdude, Third point, I agree with that but that is not the intention of this question. – Hemant Bhargava Nov 30 '18 at 08:06
  • @user4581301, I can skip the whole line if I have looked at the line and if it starts with //. Possible, right? – Hemant Bhargava Nov 30 '18 at 08:07
  • @PaulMcKenzie, Did not get it at all. – Hemant Bhargava Nov 30 '18 at 08:07
  • A compiler, generally, when it sees the character `/` followed by another `/`, reads the rest of the line (until `\n`) while discarding all the characters. There's no "magic" that can have it skip reading the full line. – Some programmer dude Nov 30 '18 at 08:52
  • @HemantBhargava -- How big is this file? If you want to avoid what you say is "slow", then read the entire file contents into memory by opening it as a binary file and reading it into memory, not one line at a time, but in one big chunk. Then you have the entire contents in memory (in a char array or similar) and you can do whatever you want with it, thus not having to deal with "file pointers". – PaulMcKenzie Nov 30 '18 at 11:12

1 Answer

Below is a straightforward solution to your problem:

#include<fstream>
#include<iostream>
#include<string>
using namespace std;

// Prototype
bool startswith(string text, string sub);

// Main driver
int main(int argc, char **argv)
{
     string filename("tmp.txt");
     ifstream input(filename);
     string line;
     string sub("//"); // prefix we'll be looking for at the start of 'line'

     if(!input.is_open()){
         cerr << "Could not open " << filename << endl;
         return 1;
     }

     while(!getline(input, line).fail()){
         if(!startswith(line, sub)){
             cout << line << endl;
         }
     }
     input.close();

     return 0;
}


bool startswith(string text, string sub){
    // Check whether the string "sub" matches the beginning of the string "text".
    // compare() clamps the length, so a "text" shorter than "sub" never matches.
    return text.compare(0, sub.length(), sub) == 0;
}

When I test this code with the input (tmp.txt):

A quick brown fox 
// Blah blah
// Blah blah
jumps over the little lazy dog

Compiling with:

g++ -std=c++11 escape.cpp -o a.out

Executing:

./a.out

The output is:

A quick brown fox
jumps over the little lazy dog
eapetcho
  • You too should probably take some time to read [Why is iostream::eof inside a loop condition considered wrong?](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong) And this doesn't solve the problem of the OP, how to skip the lines *without* reading them. – Some programmer dude Nov 30 '18 at 07:49
  • This does not answer my question. I am assuming that getline also goes to the end of that line to search for the '\n' character. – Hemant Bhargava Nov 30 '18 at 08:08