
I am using the strtok() function to split a string into tokens. The problem is that when two delimiters appear in a row, the empty token between them is skipped.

/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,,";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str,", ");
  while (pch != NULL)
  {
    printf ("Token = %s\n",pch);
    pch = strtok (NULL, ", ");
  }
  return 0;
}

And outputs:

Splitting string "Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,," into tokens:
Token = Test=
Token = 0.28
Token = 0.0
Token = 1
Token = 1.9
Token = 2.2
Token = 1.0
Token = 8
Token = 4
Token = 42

Is there an easy way to get all tokens, including the empty ones? I need to know whether there is anything between two delimiters, because sometimes I get ,, and sometimes ,xxx,

Thank you.

iSwear
  • If you're on a system which has [`strsep`](http://man7.org/linux/man-pages/man3/strsep.3.html) you can use it instead. – Some programmer dude Feb 18 '17 at 13:38
  • You could look at [I need a mix of `strtok()` and `strtok_single()`](http://stackoverflow.com/questions/30294129/i-need-a-mix-of-strtok-and-strtok-single) for ideas too. There are many questions linked to from there that could also help. You're not the first to run into problems like this. – Jonathan Leffler Feb 18 '17 at 20:59
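
The strsep() suggestion above can be sketched roughly as follows. This is only a minimal sketch: it assumes a libc that provides strsep() (glibc or a BSD libc; it is not part of standard C), and print_fields() is a hypothetical helper name, not something from the question.

```c
#define _DEFAULT_SOURCE  /* assumption: needed to expose strsep() on glibc */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits buf in place on any byte in delim,
 * prints every field (empty ones included) and returns the field count. */
static int print_fields(char *buf, const char *delim)
{
  char *rest = buf;  /* strsep() advances this and sets it to NULL after the last field */
  char *field;
  int count = 0;
  assert(buf != NULL && delim != NULL);
  while ((field = strsep(&rest, delim)) != NULL)
  {
    printf("Token = %s\n", field);
    ++count;
  }
  return count;
}
```

Calling print_fields(str, ", ") on a writable array such as char str[] = "1,,1.9,"; should report four fields, two of them empty. Like strtok(), strsep() overwrites the delimiters in the buffer, so it cannot be used on a string literal.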

1 Answer


strtok() explicitly does the opposite of what you want.

Found in an online manual:

A sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter. Delimiter bytes at the start or end of the string are ignored. Put another way: the tokens returned by strtok() are always nonempty strings.

strtok(3) - Linux man page

I implemented strtoke(), a variant of strtok() that behaves similarly but also returns empty tokens:

/* strtoke example */
#include <stdio.h>
#include <string.h>

/* behaves like strtok() except that it returns empty tokens also
 */
char* strtoke(char *str, const char *delim)
{
  static char *start = NULL; /* stores string str for consecutive calls */
  char *token = NULL; /* found token */
  /* start a new scan if a string is given, else continue the previous one */
  if (str) start = str;
  /* check whether any text is left to parse */
  if (!start) return NULL;
  /* remember current start as found token */
  token = start;
  /* find next occurrence of delim */
  start = strpbrk(start, delim);
  /* replace delim with terminator and move start to follower */
  if (start) *start++ = '\0';
  /* done */
  return token;
}

int main ()
{
  char str[] ="Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,,";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtoke(str,", ");
  while (pch != NULL)
  {
    printf ("Token = %s\n",pch);
    pch = strtoke(NULL, ", ");
  }
  return 0;
}

Compiled and tested with gcc on cygwin:

$ gcc -o test-strtok test-strtok.c

$ ./test-strtok.exe 
Splitting string "Test= 0.28,0.0,1,,1.9,2.2,1.0,,8,4,,,42,," into tokens:
Token = Test=
Token = 0.28
Token = 0.0
Token = 1
Token = 
Token = 1.9
Token = 2.2
Token = 1.0
Token = 
Token = 8
Token = 4
Token = 
Token = 
Token = 42
Token = 
Token = 

Another quote from the same man page:

Be cautious when using these functions. If you do use them, note that:

  • These functions modify their first argument.
  • These functions cannot be used on constant strings.
  • The identity of the delimiting byte is lost.
  • The strtok() function uses a static buffer while parsing, so it's not thread safe. Use strtok_r() if this matters to you.

These caveats also apply to my strtoke().
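
If the static buffer is a concern, the same idea can be written in the style of strtok_r(), with the caller holding the position pointer. The following is a minimal sketch under that assumption; strtoke_r() and its saveptr parameter are hypothetical names chosen to mirror strtok_r(), not an existing library function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reentrant variant of strtoke(): the caller owns the
 * position pointer, so no static state is shared between calls. */
char *strtoke_r(char *str, const char *delim, char **saveptr)
{
  char *token;
  assert(delim != NULL && saveptr != NULL);
  /* start a new scan if a string is given, else continue the previous one */
  if (str) *saveptr = str;
  /* nothing left to parse */
  if (!*saveptr) return NULL;
  /* remember current position as found token */
  token = *saveptr;
  /* find next delimiter byte */
  *saveptr = strpbrk(*saveptr, delim);
  /* replace it with a terminator and step past it */
  if (*saveptr) *(*saveptr)++ = '\0';
  return token;
}
```

The first call passes the string and the address of a char * variable; later calls pass NULL and the same variable, for example pch = strtoke_r(NULL, ", ", &save);. The other caveats still hold: the input is modified in place and the identity of the delimiter is lost.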

Scheff's Cat