I'm having trouble filtering out the 'commented' lines in a text file. I want to skip every line that begins with # or // while reading the file.

Thanks!

string code, name, year, semester, value, data;

char delimiter = '|';

ifstream ifsUnits;
ifsUnits.open("./data/units.txt");

if (ifsUnits.fail())
    cout << "\nError reading from file <units.txt>.";
else
{
    while (!ifsUnits.eof())
    {
        getline(ifsUnits, data);
        stringstream ssData(data);
        while (ssData.good())
        {
            getline(ssData, code, delimiter);
            getline(ssData, name, delimiter);
            getline(ssData, year, delimiter);
            getline(ssData, semester, delimiter);
            getline(ssData, value, delimiter);

            lUnits.push_back(Unit(stoi(code), name, stoi(year), stoi(semester), stoi(value)));
        }
    }
}
ifsUnits.close();

Text file contents:

//idNumber|name <-- I want to bypass all the lines starting with //
1001|Mary Doe
1002|John Down
1003|John Doe
1004|Marilyn Hendrix

zumuha
  • how about just getting the first segment `code`, detecting if it starts with `//` and then `continue`? – JHBonarius Dec 22 '20 at 19:59
  • @JHBonarius there's probably no guaranteed `delimiter` in a commented line. But that's something OP should clarify – scohe001 Dec 22 '20 at 19:59
  • A separate issue: [Why it's bad to use feof() to control a loop](https://faq.cprogramming.com/cgi-bin/smartfaq.cgi?answer=1046476070&id=1043284351). The article is about C, but it applies here as well. – Michael Burr Dec 22 '20 at 20:00
  • just suggesting. Could also just get the first two chars, and add them to `code` if they're not `//`. – JHBonarius Dec 22 '20 at 20:01
  • One way: instead of calling `getline()` directly, write a wrapper that calls `getline()` and discards any that start with "//" or "#". Return the first line that doesn't. – Michael Burr Dec 22 '20 at 20:02
  • Simply check if the first two characters of `data` contain `'#'` or `'/'`. – πάντα ῥεῖ Dec 22 '20 at 20:02
  • Or better, use a regex that checks for any amount of leading whitespace followed by `#` or `"//"` and if it matches, simply discard the line by `continue;` in your read loop. (always control your read loop with the return from your read function and stream-state) – David C. Rankin Dec 22 '20 at 20:03
  • @David Regex is total overkill for this simple thing. – πάντα ῥεῖ Dec 22 '20 at 20:03
  • I would have generally thought so too, but given their simple implementation, it's certainly an option. Of course `.find_first_of()` and `.find_first_not_of()` are also simple alternatives. – David C. Rankin Dec 22 '20 at 20:05
  • [Watch out for `while (!ifsUnits.eof())`](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-i-e-while-stream-eof-cons). `while (ssData.good())` will result in similar bugs.. – user4581301 Dec 22 '20 at 20:15

1 Answer

You're already pulling one line into a string at a time. Just check if that string starts with your comment characters and if so, continue:

while (getline(ifsUnits, data)) // more robust than eof check
{
    // Skip comment lines: those starting with "//" or "#"
    if ((data.size() > 1 && data[0] == '/' && data[1] == '/')
        || (!data.empty() && data[0] == '#')) { continue; }

    stringstream ssData(data);
    // ...
}
scohe001
  • `while (getline(ifsUnits, data)) // more robust than eof check` Why is this method more robust? Thanks for your answer. – zumuha Dec 22 '20 at 20:58
  • @zumuha [Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?](https://stackoverflow.com/q/5605125/2602718) – scohe001 Dec 22 '20 at 21:19