-3

I would like to write a function that breaks up a string by token, I came up with the following so far:

#include <cstring>
#include <iostream>
#include <vector>
#define MAXLEN 20

void mytoken(std::string input, std::vector<std::string> & out);

int main() 
{
    std::vector<std::string> out;
    std::string txt = "XXXXXX-CA";
    mytoken(txt, out);
    std::cout << "0: " << out[0] <<std::endl;
    std::cout << "1: " << out[1] <<std::endl;
}

void mytoken(std::string instr, std::vector<std::string> & out) {
    std::vector<std::string> vec;
    char input[MAXLEN] = {0};
    strcpy(input, instr.c_str());
    char *token = std::strtok(input, "-");
    while (token != NULL) {
        std::cout << token << '\n';
        token = std::strtok(NULL, "-");
        out.push_back(token);
    }    
}

which produces the following output:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
XXXXXX
CA
bash: line 7: 21987 Aborted                 (core dumped) ./a.out

and I wonder why that is.

stdcerr
  • 9,675
  • 15
  • 50
  • 95
  • 1
    So you've got a core dump. What does your debugger say about it? – melpomene Nov 19 '18 at 02:25
  • The terminate message mentions "null not valid", for constructing a string. Look at where your code mentions null. – Drew Dormann Nov 19 '18 at 02:29
  • 1
    Looks like you throw away the first token, search again for `-`, there isn't one, and you try to construct a string with NULL. Are you sure you don't want to move the `push_back` above the second `strtok`? – Retired Ninja Nov 19 '18 at 02:41
  • @RetiredNinja right! `out.push_back(token);` needs to go ahead of `token = std::strtok(NULL, "-");` - ands it works as expected! Move this to an answer and I'll gladly accept! – stdcerr Nov 19 '18 at 02:44
  • 1
    @cerr *I would like to write a function that breaks up a string by token* -- Using `strtok` is one of the worst ways of doing this. – PaulMcKenzie Nov 19 '18 at 02:46
  • @PaulMcKenzie Judged by what? – stdcerr Nov 19 '18 at 02:48
  • 1
    @cerr `strtok` 1) Destroys the original input. 2) Is not thread safe 3) You can't use `strtok` twice or more within nested calls, since it uses a static buffer. It is a function best left for newbies and `C` coders who want to handle that mess. You're using C++, and there are far more better ways to tokenize a string by using what C++ offers. – PaulMcKenzie Nov 19 '18 at 02:50
  • @PaulMcKenzie yes, These are fair arguments!This should address these issues: https://stackoverflow.com/questions/14265581/parse-split-a-string-in-c-using-string-delimiter-standard-c – stdcerr Nov 19 '18 at 02:52
  • 1
    Yes, all of those solutions seem to not use `strtok`. I bet even `C` programmers wouldn't mind seeing `strtok` die (but of course a lot of code would break). – PaulMcKenzie Nov 19 '18 at 02:54
  • @PaulMcKenzie Thanks, I've got a working solution now because you posted that comment and https://stackoverflow.com/questions/14265581/parse-split-a-string-in-c-using-string-delimiter-standard-c if you want to post an answer referring to this link, i will accept! – stdcerr Nov 19 '18 at 03:00

1 Answers1

2

It is better to use 'c++-style' functions : it is a bit simpler and more readable:

#include <sstream>

void mytoken(std::string instr, std::vector<std::string> & out)
{
    std::istringstream ss(instr);
    std::string token;
    while(std::getline(ss, token, '-'))
    {
        std::cout << token << '\n';
        out.push_back(token);
    }
}

For your example to work correctly, you need to change the order of operations in your loop:

//...
while(token != NULL)
{
    out.push_back(token);
    std::cout << token << '\n';
    token = std::strtok(NULL, "-");
}
snake_style
  • 1,120
  • 7
  • 15