1

Given a path

/level1/level2/level3/level4

I want to split this string so that I can retrieve each individual entry,
i.e. "level1", "level2", "level3", "level4".

My first idea was to use strtok, but apparently most people recommend against using this function. What is another approach, so that I can pass in a string (char* path) and get each entry, split at "/"?

zx485
SS'

2 Answers

4

strtok is actually the preferred way to tokenize a string such as this. You just need to be aware that:

  • The original string is modified
  • The function uses static data during its parsing, so it's not thread safe and you can't interleave parsing of two separate strings.

If you don't want the original string modified, make a copy using strdup, work on the copy, and then copy the results as needed. If you need to worry about multiple threads or interleaved usage, use strtok_r instead, which takes an additional state parameter.
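As a minimal sketch of the advice above (not part of the original answer): the helper below, hypothetically named split_path, tokenizes a strdup'd copy with strtok_r so the caller's string is left untouched. The function name, the caller-supplied output array, and the `max` limit are assumptions made for illustration. strdup and strtok_r are POSIX functions, so the feature-test macro is defined first.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strdup and strtok_r */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `path` on '/' into at most `max` entries.
 * Each entry stored in `out` is a malloc'd copy the caller must free.
 * Returns the number of entries, or -1 if an allocation fails. */
int split_path(const char *path, char **out, int max) {
    assert(path != NULL && out != NULL);

    /* Work on a copy, because strtok_r writes '\0' over the delimiters. */
    char *copy = strdup(path);
    if (copy == NULL) {
        return -1;
    }

    int n = 0;
    char *state = NULL;  /* per-call parser state, instead of strtok's static data */
    for (char *tok = strtok_r(copy, "/", &state);
         tok != NULL && n < max;
         tok = strtok_r(NULL, "/", &state)) {
        out[n] = strdup(tok);
        if (out[n] == NULL) {
            /* Undo the partial result before reporting failure. */
            for (int i = 0; i < n; i++) {
                free(out[i]);
            }
            free(copy);
            return -1;
        }
        n++;
    }

    free(copy);
    return n;
}
```

Called with "/level1/level2/level3/level4" and an array of eight pointers, this should return 4 and fill the first four slots with "level1" through "level4". Because strtok_r treats a run of delimiters as a single separator, a path such as "/level1//level2" yields no empty entries.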

dbush
  • Yeah, it will probably be OK when it comes to `char[]` (https://ideone.com/p2H6WX), but it will be a problem with `const char *ptr` – Michi Nov 21 '17 at 19:45
  • As noted in [my answer](https://stackoverflow.com/a/47421849/14660), there's more to splitting up a Unix path than just splitting on `/`. – Schwern Nov 21 '17 at 20:12
3

Splitting Unix paths is more than just splitting on /. These all refer to the same path...

  • /foo/bar/baz/
  • /foo/bar/baz
  • /foo//bar/baz

As with many complex tasks, it's best not to do it yourself, but to use existing functions. In this case there are the POSIX dirname and basename functions.

  • dirname returns the parent path in a filepath
  • basename returns the last portion of a filepath

Using these together, you can split Unix paths.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libgen.h>

int main(void) {
    // basename and dirname may modify their argument, so use a writable array.
    char filepath[] = "/foo/bar//baz/";

    char *fp = filepath;
    while( strcmp(fp, "/") != 0 && strcmp(fp, ".") != 0 ) {
        char *base = basename(fp);
        puts(base);

        fp = dirname(fp);
    }

    // Differentiate between /foo/bar and foo/bar
    if( strcmp(fp, "/") == 0 ) {
        puts(fp);
    }
}

// baz
// bar
// foo
// /

It's not the most efficient approach, since it makes multiple passes through the string, but it is correct. Note that the components come out last to first.
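If the multiple passes matter, or the path is a `const char *` that must not be modified, one possible alternative is a single forward scan with strspn and strcspn. This is only a sketch and not part of the answer above: the helper name next_component and its out-parameters are made up for illustration, and unlike the dirname/basename loop it does not report whether the path was absolute.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: finds the next component of `path` at or after
 * offset *pos. On success it stores the component's start and length,
 * advances *pos past it, and returns 1. Returns 0 when none remain.
 * The input string is never modified. */
int next_component(const char *path, size_t *pos,
                   const char **start, size_t *len) {
    assert(path != NULL && pos != NULL && start != NULL && len != NULL);

    const char *p = path + *pos;
    p += strspn(p, "/");        /* skip a run of one or more separators */
    if (*p == '\0') {
        return 0;               /* only separators (or nothing) were left */
    }

    *start = p;
    *len = strcspn(p, "/");     /* length up to the next '/' or the end */
    *pos = (size_t)(p - path) + *len;
    return 1;
}
```

Starting with `pos` set to 0 and calling this in a loop over "/foo//bar/baz/" should yield "foo", "bar", and "baz" in order. The components are not NUL-terminated, so print them with something like `printf("%.*s\n", (int)len, start)` or copy them out if separate strings are needed.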

Schwern