32

I am interested in discussing methods for using stringstream to parse a line with multiple types. I would begin by looking at the following line:

"2.832 1.3067 nana 1.678"

Now let's assume I have a long line that has multiple strings and doubles. The obvious way to solve this is to tokenize the string and then check the conversion of each token. I am interested in skipping this second step and using stringstream directly to find only the numbers.

I figured a good way to approach this would be to read through the string and check if the failbit has been set, which it will if I try to parse a string into a double.

Say I have the following code:

string a("2.832 1.3067 nana 1.678");

 stringstream parser;
 parser.str(a);

 for (int i = 0; i < 4; ++i)
 {
     double b;
     parser >> b;
     if (parser.fail())
     {
         std::cout << "Failed!" << std::endl;
         parser.clear();
     }
     std::cout << b << std::endl;
 }

It will print out the following:

2.832
1.3067
Failed!
0
Failed!
0

I am not surprised that it fails to parse a string, but what is happening internally such that it fails to clear its failbit and parse the next number?

πάντα ῥεῖ
Fantastic Mr Fox
  • Check my answer here please: [c++ moving to next element in a file.txt](http://stackoverflow.com/a/24501035/1413395). I think it's relevant. – πάντα ῥεῖ Jul 01 '14 at 07:24
  • @πάνταῥεῖ Ahhh, ok so it gets stuck on the first fail. – Fantastic Mr Fox Jul 01 '14 at 07:27
  • @πάνταῥεῖ I have un-deleted at your request. After your suggested post I was pretty sure it was a duplicate. But if you would like to add an answer that would be great. – Fantastic Mr Fox Jul 01 '14 at 22:42
  • @πάνταῥεῖ I think that is reasonably simple, right? I am doing a for loop and only reading 4 times because I know what length it should be. If it fails to parse the string and so doesn't move on, then it will just fail to parse it again. Using the code in your suggested answer provides the result: `2.832 1.3067 Failed! 0 1.678` – Fantastic Mr Fox Jul 01 '14 at 23:19
  • I'm afraid I have sent so many questions containing `!istream::eof()` being closed as duplicates unnecessarily. Waiting until someone finds the _real complete_ dupe for this one. THX a lot for your effort @Ben! I think this is a good starting point for a canonical answer of a FAQ. – πάντα ῥεῖ Jul 02 '14 at 00:05

5 Answers

24

The following code skips the bad word and collects the valid double values:

istringstream iss("2.832 1.3067 nana 1.678");
double num = 0;
while(iss >> num || !iss.eof()) {
    if(iss.fail()) {
        iss.clear();
        string dummy;
        iss >> dummy;
        continue;
    }
    cout << num << endl;
}

Here's a fully working sample.


Your sample almost got it right; it was just missing a step to consume the invalid input field from the stream after detecting its wrong format:

 if (parser.fail()) {
     std::cout << "Failed!" << std::endl;
     parser.clear();
     string dummy;
     parser >> dummy;
 }

clear() only resets the error flags; it does not advance the read position. In your case the extraction therefore tries to read from "nana" again on the last iteration, hence the last two lines in the output.
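
As a rough sketch of how the repaired loop could be packaged, the helper below reads until the stream is exhausted instead of a fixed four times, and discards one whitespace-delimited token whenever extraction fails. The name skip_bad_tokens and the choice to return a std::vector are assumptions made for illustration; they are not part of the question's code.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): collect every token that
// parses as a double and skip whole tokens that do not.
std::vector<double> skip_bad_tokens(const std::string& line)
{
    std::istringstream parser(line);
    std::vector<double> values;
    double value = 0;
    while (true)
    {
        if (parser >> value)
        {
            values.push_back(value);
        }
        else if (parser.eof())
        {
            break;              // nothing left to read
        }
        else
        {
            parser.clear();     // reset failbit so the stream is usable again
            std::string dummy;
            parser >> dummy;    // consume the token that caused the failure
        }
    }
    return values;
}
```

For example, skip_bad_tokens("1.5 zzz 2.5") should return 1.5 and 2.5, with zzz discarded.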

Also note the trickery around iostream::fail() and how iostream::eof() is actually tested in my first sample. There's a well-known Q&A explaining why simply testing for EOF as a loop condition is considered wrong, and it explains well how to break the input loop when unexpected or invalid values are encountered. How to skip or ignore invalid input fields isn't explained there, though (and wasn't asked for).
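
To see why a condition of the form `iss >> num || !iss.eof()` keeps looping on a bad token but stops at the end of input, it can help to look at the two flags after a single extraction attempt. The sketch below is illustrative only; StreamState and state_after_extraction are made-up names, not standard library facilities.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of the two flags the loop condition relies on.
struct StreamState
{
    bool fail;
    bool eof;
};

// Try to extract one double from `text` and report the resulting flags.
StreamState state_after_extraction(const std::string& text)
{
    std::istringstream iss(text);
    double num = 0;
    iss >> num;
    return StreamState{iss.fail(), iss.eof()};
}
```

For "zzz 1.5" the extraction fails but eof() is still false, so the loop body runs and can clear the error and skip the bad field. For an empty string both flags are set, so the loop ends.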

Community
πάντα ῥεῖ
3

I have built a more fine-tuned version that can skip invalid input character by character (without needing whitespace to separate the double numbers):

#include <cctype>   // std::isdigit
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main() {

    istringstream iss("2.832 1.3067 nana1.678 xxx.05 meh.ugh");
    double num = 0;
    while(iss >> num || !iss.eof()) {
        if(iss.fail()) {
            iss.clear();
            while(iss) {
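                // peek() returns int; keep it as int so EOF is handled safely
                int next = iss.peek();
                if(std::isdigit(next) || next == '.') {
                    // Stop consuming invalid double characters
                    break;
                }
                else {
                    // Consume one invalid character; unlike >>, ignore()
                    // does not skip whitespace and swallow the next digit
                    iss.ignore();
                }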
            }
            continue;
        }
        cout << num << endl;
    }
    return 0;
}

Output

 2.832
 1.3067
 1.678
 0.05

Live Demo

πάντα ῥεῖ
3

A few minor differences from πάντα ῥεῖ's answer: this version also handles e.g. negative number representations, and is (IMHO) a little simpler to read.

#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::istringstream iss("2.832 1.3067 nana1.678 x-1E2 xxx.05 meh.ugh");
    double num = 0;
    for (; iss; )
        if (iss >> num)
            std::cout << num << '\n';
        else if (!iss.eof())
        {
            iss.clear();
            iss.ignore(1);
        }
}

Output:

2.832
1.3067
1.678
-100
0.05

(see it running here)

Tony Delroy
2

If you like concision - here's another option that (ab?)uses && to get cout done only when a number's been parsed successfully, and when a number isn't parsed it uses the comma operator to be able to clear() stream error state inside the conditional before reading a character to be ignored...

#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::istringstream iss("2.832 1.3067 nana1.678 x-1E2 xxx.05 meh.ugh");
    double num = 0;
    char ignored;
    while (iss >> num && std::cout << num << '\n' ||
           (iss.clear(), iss) >> ignored)
        ;
}

http://ideone.com/WvtvfU

Tony Delroy
1

You can use std::istringstream::eof() to validate input like this:

#include <string>
#include <sstream>
#include <iostream>

// remove white-space from each end of a std::string
inline std::string& trim(std::string& s, const char* t = " \t")
{
    s.erase(s.find_last_not_of(t) + 1);
    s.erase(0, s.find_first_not_of(t));
    return s;
}

// serial input
std::istringstream in1(R"~(
 2.34 3 3.f 3.d .75 0 wibble 
)~");

// line input
std::istringstream in2(R"~(
2.34
 3

3.f
3.d
.75
0
wibble 
)~");

int main()
{
    std::string input;

    // NOTE: This technique will not work if input is empty
    // or contains only white-space characters. Therefore
    // it is safe to use after a conditional extraction
    // operation >> but it is not reliable after std::getline()
    // without further checks.

    while(in1 >> input)
    {
        // input will not be empty and will not contain white-space.
        double d;
        if((std::istringstream(input) >> d >> std::ws).eof())
        {
            // d is a valid double
            std::cout << "d1: " << d << '\n';
        }
    }

    std::cout << '\n';

    while(std::getline(in2, input))
    {
        // eliminate blank lines and lines
        // containing only white-space (trim())
        if(trim(input).empty())
            continue;

        // NOW this is safe to use

        double d;
        if((std::istringstream(input) >> d >> std::ws).eof())
        {
            // d is a valid double
            std::cout << "d2: " << d << '\n';
        }
    }
}

This works because the eof() check ensures that only the double was entered and not garbage like 12d4.
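
The same check can be wrapped in a small predicate. The sketch below is one possible packaging, with the hypothetical name is_whole_double; it also tests that the extraction itself succeeded, which covers the empty-input case mentioned in the code comments above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true only if `token` holds a double and nothing else
// (leading and trailing white-space is tolerated).
bool is_whole_double(const std::string& token)
{
    std::istringstream iss(token);
    double value = 0;
    // Checking the extraction first guards against empty input, where
    // eof() alone would report true even though nothing was parsed.
    return (iss >> value) && (iss >> std::ws).eof();
}
```

With this, is_whole_double("2.34") should be true, while is_whole_double("12d4") and is_whole_double("") should both be false.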

Galik
  • If `cin >> input` fails at eof, the `if` statement will evaluate `true` - easily fixed ala `if (cin >> input && (...)) ...`. – Tony Delroy Mar 31 '15 at 02:24
  • @TonyD You are totally right, it is important to remember this technique expects `input` to contain *something*. – Galik Mar 31 '15 at 11:32