6

Very often, I have to use multiple libraries that handle errors differently or define their own enums for errors. This makes it difficult to write functions that have to deal with errors from different sources and then return an error code of their own. For example:

int do_foo_and_bar()
{
    int err;
    if ((err = libfoo_do_something()) < 0) {
        // return err and indication that it was caused by foo
    }
    if ((err = libbar_do_something()) < 0) {
        // return err and indication that it was caused by bar
    }
    // ...
    return 0;
}

I've thought of two possible solutions:

  • Create my own list of error codes and translate each library's error codes into them, using functions like int translate_foo_error(int err). I would write my own string representations for each error.
  • Create a struct my_error that holds both an enum identifying the library and the error code. The translation to string would be delegated to an appropriate function for each library.
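A minimal sketch of the second option, under stated assumptions: libfoo_strerror(), libbar_strerror(), and the LIBFOO_* codes are hypothetical stand-ins for whatever the real libraries provide, and the my_error_* helpers are illustrative names, not an established API.

```c
/* Hypothetical stand-ins for two third-party libraries' error APIs. */
enum { LIBFOO_EIO = -1, LIBFOO_ENOMEM = -2 };

const char *libfoo_strerror(int err) {
    switch (err) {
    case LIBFOO_EIO:    return "foo: I/O failure";
    case LIBFOO_ENOMEM: return "foo: out of memory";
    default:            return "foo: unknown error";
    }
}

const char *libbar_strerror(int err) {
    return err == -1 ? "bar: bad argument" : "bar: unknown error";
}

/* Identifies which library produced the code. */
enum my_error_source { MY_ERR_NONE = 0, MY_ERR_FOO, MY_ERR_BAR };

struct my_error {
    enum my_error_source source;
    int code;                     /* the library's own code, untranslated */
};

struct my_error my_error_make(enum my_error_source source, int code) {
    struct my_error e = { source, code };
    return e;
}

int my_error_is_ok(struct my_error e) {
    return e.source == MY_ERR_NONE;
}

/* Delegate the string lookup to the library that owns the code. */
const char *my_error_str(struct my_error e) {
    switch (e.source) {
    case MY_ERR_NONE: return "success";
    case MY_ERR_FOO:  return libfoo_strerror(e.code);
    case MY_ERR_BAR:  return libbar_strerror(e.code);
    }
    return "unknown error source";
}
```

With this in place, do_foo_and_bar() could return struct my_error instead of int, for example return my_error_make(MY_ERR_FOO, err); in the first branch, and the caller would print my_error_str() of the result.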

This seems like a problem that would come up very often, so I'm curious: how is this usually handled? It seems like the former is what most libraries do, but the latter is less work and builds on the tools the libraries already provide. It doesn't help that most tutorials just print a message to stderr and exit on any error. I'd rather have each function indicate what went wrong, so that the caller can decide how to handle it.

  • You also have to watch out for the ones that reverse the convention: 1 for success, 0 for failure, and then you need to call another routine to find out what the failure was. One error for every function is a good option, but what happens when there are additions to the library: do you add it at the bottom, shift everything else down, or assign blocks of codes for each library? Each technique has its own problems. – cup Jul 08 '14 at 06:39

2 Answers

2

The answer is, it depends on your code's constraints.

collectd prints to standard error then bails if it hits a fatal error.

OpenGL will set some shared state that you can query. Ignoring this error often results in undefined behavior.

collectd has lots of threading concerns, and most errors can't be fixed or recovered from by the program. For example, if a plugin depends on some library and a call to that library fails, the plugin itself knows the most about how to recover from that error. Bubbling that error up isn't helpful, as the collectd core will never know about plugin N+1.

On the other hand, OpenGL applications are usually responsible for any errors encountered and can attempt to correct them. For example, if compiling a shader fails, the application might fall back to a special one for a specific vendor or platform.

Based on your program's design, consider the following:

  1. Will bubbling up the error allow you to make a better decision?
    • Does the caller know about your implementation? Should they?
  2. Are the likely errors correctable by your application?
    • E.g. if you can't open a socket or file, you can retry or fail; there is not much else to do.
  3. Are there constraints around global state if you make a GetLastError() function?
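As a rough illustration of the "retry or fail" case in point 2, here is a sketch of a helper that retries only when the failure looks transient. The name open_with_retry() and its parameters are hypothetical, and it assumes a POSIX system where fopen() sets errno on failure.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: try fopen() up to max_attempts times.
 * On failure, returns NULL and stores the last errno value in *err_out
 * so the caller can decide what to do with it. */
FILE *open_with_retry(const char *path, const char *mode,
                      int max_attempts, int *err_out) {
    if (max_attempts <= 0) {
        *err_out = EINVAL;
        return NULL;
    }
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        FILE *f = fopen(path, mode);
        if (f != NULL) {
            *err_out = 0;
            return f;
        }
        *err_out = errno;
        /* Retrying only makes sense for transient conditions;
         * a missing file will not appear by itself. */
        if (errno != EINTR && errno != EAGAIN)
            break;
    }
    return NULL;
}
```

The caller gets the raw errno back and can either report it with strerror() or map it into its own error type.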

Update:

Going with the global state option you might have something like this:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>

char* last_err = NULL;

/* Takes a const pointer so that string literals can be passed safely. */
void set_err(const char* error_message) {
    /* free(NULL) is a no-op, so no NULL check is needed. */
    free(last_err);
    /* Make a deep copy to be safe.
     * The error string might be dynamically allocated by an external library.
     * We can't know for sure where it came from.
     */
    last_err = strdup(error_message);
}

int can_sqrt(int a) {
    if(a < 0) {
        set_err("We can't take the square root of a negative number");
        return 0;
    }
    return 1;
}

int main(int argc, char* argv[]) {
    int i = 1;
    for(i = 1; i < argc; i++) {
      int square = atoi(argv[i]);
      if(can_sqrt(square)) {
        fprintf(stdout, "the square root of %d is: %.0f\n", square, sqrt(square));
      } else {
        fprintf(stderr, "%s\n", last_err);
      }
    }
    return 0;
}

Running the above program

$ ./a.out -1 2 -4 0 -6 4
We can't take the square root of a negative number
the square root of 2 is: 1
We can't take the square root of a negative number
the square root of 0 is: 0
We can't take the square root of a negative number
the square root of 4 is: 2
EnabrenTane
  • Thanks, I think that cleared up a lot of what I was thinking about. How would you handle a situation where you wanted failure to just result in stopping the current action and printing a message to the user? I can think how to do this with C++ exceptions, but I'm curious to know how it is handled in C. Global state sounds like the way to go to minimize data passed between functions, since I don't have any issues with threads. – James Hagborg Jul 10 '14 at 18:02
  • @JamesHagborg I have updated my answer with a working example of how you might use global state to track error messages. – EnabrenTane Jul 11 '14 at 17:40
  • @JamesHagborg Obviously the example is trivial, and you could wrap the global data in a structure like you suggested. You could even add a mutex around reading and writing the error string if you need some concurrency, though each process or thread could have its own last-error data. – EnabrenTane Jul 11 '14 at 17:44
0

I like to use thread-local storage (TLS) to store errors detected deep within libraries. It is fast and thread-safe. The only real issue is that the error then belongs to the thread that called the failing function. This can be a problem in some threading models (for example, anonymous threads in a thread pool): other threads can't see the error unless you have some way of propagating it from one thread to another. There are ways to do that too, though, and I believe this method of error propagation is fast, efficient, and makes for more elegant code in the library.

The general philosophy is to push error reporting, as well as recovery decisions, upwards towards the interface. Error recovery can be handled at any level in the call stack (for example, at a mid-layer level before the interface), but the general idea is for every function in the library to push responsibility upwards, towards the caller. Each function should assume a little bit of responsibility and then pass the rest back up the call chain.

For example, every (well, maybe most) function in the library could log any error to TLS and return a boolean indicating whether the operation succeeded. The caller can then look at the returned boolean and, if the operation was unsuccessful, either decide to do something about it (such as retrying) or just abort, clean up, and return false. If the information stored in TLS is a structure, you could aggregate error information (as well as any remedial actions taken) up the call chain. This process can continue all the way back up to the interface level. At any time, a caller can ask for the last error and decide what to do based on it.

Obviously, your library would need to provide a top-level SetLastError()/GetLastError() pair of interface functions. You would also probably have code on entry to every interface function (except SetLastError()/GetLastError(), of course) that resets the last-error state when the interface function is called.
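The approach described above might look roughly like the following sketch. It assumes a C11 compiler (for _Thread_local); struct lib_error, parse_positive(), lib_double_positive(), and the numeric codes are hypothetical names chosen for illustration, and only the SetLastError()/GetLastError() pair comes from the description itself.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error record; each thread gets its own copy. */
struct lib_error {
    int code;             /* 0 means "no error" */
    char message[128];
};

static _Thread_local struct lib_error tls_last_error;

void SetLastError(int code, const char *message) {
    tls_last_error.code = code;
    snprintf(tls_last_error.message, sizeof tls_last_error.message,
             "%s", message);
}

const struct lib_error *GetLastError(void) {
    return &tls_last_error;
}

/* Internal helper: logs the error to TLS and reports success as a boolean. */
static bool parse_positive(const char *text, long *out) {
    char *end;
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        SetLastError(1, "not a number");
        return false;
    }
    if (value <= 0) {
        SetLastError(2, "value must be positive");
        return false;
    }
    *out = value;
    return true;
}

/* Interface function: resets the last error on entry, then passes the
 * boolean result up to the caller. */
bool lib_double_positive(const char *text, long *out) {
    long value;
    SetLastError(0, "");
    if (!parse_positive(text, &value))
        return false;     /* details are already stored in TLS */
    *out = value * 2;
    return true;
}
```

A caller checks the boolean first and only then asks GetLastError() for details. Because the storage is thread-local, a worker thread in a pool would have to copy the struct somewhere shared before another thread could see it.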