
According to the GNU C Library (glibc) documentation for getcwd()...

The GNU C Library version of this function also permits you to specify a null pointer for the buffer argument. Then getcwd allocates a buffer automatically, as with malloc(see Unconstrained Allocation). If the size is greater than zero, then the buffer is that large; otherwise, the buffer is as large as necessary to hold the result.

I now draw your attention to the implementation using the standard getcwd(), described in the GNU documentation:

/* As quoted from the glibc manual.  xmalloc is the manual's wrapper
   around malloc that aborts on allocation failure; the function also
   needs <unistd.h> (getcwd), <errno.h> (errno) and <stdlib.h> (free). */
char* gnu_getcwd ()
{
   size_t size = 100;
   while (1)
   {
      char *buffer = (char *) xmalloc (size);
      if (getcwd (buffer, size) == buffer)
         return buffer;
      free (buffer);
      if (errno != ERANGE)
         return 0;     /* real failure, e.g. EACCES */
      size *= 2;       /* ERANGE: buffer too small, retry larger */
   }
}

This seems great for portability and stability, but all that allocating and freeing of memory looks like a clunky compromise. Is this a possible performance concern, given that the function may be called frequently?

*It's easy to say "profile it", but profiling can't account for every possible system, present or future.

JasonM
  • The actual glibc implementation of `getcwd` on Linux doesn't use the implementation described in the manual. Instead it allocates a buffer of the biggest possible pathname that the corresponding Linux system call supports. If that fails, the glibc implementation falls back on the traditional method of walking up the directory tree through `..` directory entries. – Ross Ridge Jul 26 '14 at 22:38

3 Answers


The initial size is 100, enough to hold a 99-character path, which is longer than most paths on a typical system. In the common case there is therefore no repeated "allocating and freeing memory": the first allocation succeeds, and no more than 98 bytes are wasted.

The heuristic of doubling at each try means that at a maximum, a logarithmic number of spurious allocations take place. On many systems, the maximum length of a path is otherwise limited, meaning that there is a finite limit on the number of re-allocations caused.

This is about the best one can do as long as getcwd is used as a black box.

Pascal Cuoq

This is not a performance concern because it's the getcwd function. If that function is in your critical path then you're doing it wrong.

Joking aside, none of this code could be removed. The only way profiling could improve it is by adjusting the magic number "100" (a speed/space trade-off), and even then you'd only have optimized it for your file system.

You might also think of replacing free/malloc with realloc, but that would result in an unnecessary memory copy, and with the error checking wouldn't even be less code.
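For comparison, here is a hypothetical realloc-based variant (the function name is illustrative, not from the question). As noted above, realloc may copy the stale buffer contents on each growth, which getcwd then immediately overwrites, and with error checking it is no shorter:

```c
/* Hypothetical realloc-based variant: grows one buffer instead of
 * freeing and reallocating.  realloc(NULL, n) behaves like malloc(n)
 * on the first pass; later passes may copy old contents pointlessly. */
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

char *getcwd_realloc(void)
{
    size_t size = 100;
    char *buffer = NULL;

    while (1) {
        char *tmp = realloc(buffer, size);
        if (tmp == NULL) {          /* allocation failure */
            free(buffer);
            return NULL;
        }
        buffer = tmp;
        if (getcwd(buffer, size) == buffer)
            return buffer;
        if (errno != ERANGE) {      /* real error, not "too small" */
            free(buffer);
            return NULL;
        }
        size *= 2;                  /* retry with a larger buffer */
    }
}
```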

ams

Thanks for the input, everyone. I have concluded what should have been obvious from the start: define the initial value ("100" in this case) and the growth formula (×2 in this case) based on the target platform. This could account for all systems, especially with the use of additional flags.
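That idea can be sketched like this: derive the starting size from the platform's PATH_MAX where it is defined, falling back to a compile-time default otherwise. The macro names, the 256 fallback, and the function name are illustrative assumptions, not from the question:

```c
/* Sketch of a platform-tuned version: starting size and growth factor
 * come from the platform.  CWD_INITIAL_SIZE / CWD_GROWTH and the 256
 * fallback are illustrative, not part of the original question. */
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

#ifdef PATH_MAX
#define CWD_INITIAL_SIZE PATH_MAX   /* big enough on the first try */
#else
#define CWD_INITIAL_SIZE 256        /* arbitrary fallback */
#endif
#define CWD_GROWTH 2                /* doubling, as in the manual */

char *tuned_getcwd(void)
{
    size_t size = CWD_INITIAL_SIZE;

    while (1) {
        char *buffer = malloc(size);
        if (buffer == NULL)
            return NULL;
        if (getcwd(buffer, size) == buffer)
            return buffer;
        free(buffer);
        if (errno != ERANGE)        /* real failure, give up */
            return NULL;
        size *= CWD_GROWTH;
    }
}
```

With PATH_MAX as the starting size, the loop body almost never runs more than once, at the cost of a larger initial allocation.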

JasonM