258

What is your favorite method to handle errors in Bash? The best example of handling errors I have found on the web was written by William Shotts, Jr at http://www.linuxcommand.org.

He suggests using the following function for error handling in Bash:

#!/bin/bash

# A slicker error handling routine

# I put a variable in my scripts named PROGNAME which
# holds the name of the program being run.  You can get this
# value from the first item on the command line ($0).

# Reference: This was copied from <http://www.linuxcommand.org/wss0150.php>

PROGNAME=$(basename "$0")

function error_exit
{

#   ----------------------------------------------------------------
#   Function for exit due to fatal program error
#       Accepts 1 argument:
#           string containing descriptive error message
#   ---------------------------------------------------------------- 

    echo "${PROGNAME}: ${1:-"Unknown Error"}" 1>&2
    exit 1
}

# Example call of the error_exit function.  Note the inclusion
# of the LINENO shell variable.  It contains the current
# line number.

echo "Example of error with line number and message"
error_exit "$LINENO: An error has occurred."

Do you have a better error handling routine that you use in Bash scripts?

codeforester

14 Answers

169

Use a trap!

tempfiles=( )
cleanup() {
  rm -f "${tempfiles[@]}"
}
trap cleanup 0

error() {
  local parent_lineno="$1"
  local message="$2"
  local code="${3:-1}"
  if [[ -n "$message" ]] ; then
    echo "Error on or near line ${parent_lineno}: ${message}; exiting with status ${code}"
  else
    echo "Error on or near line ${parent_lineno}; exiting with status ${code}"
  fi
  exit "${code}"
}
trap 'error ${LINENO}' ERR

...then, whenever you create a temporary file:

temp_foo="$(mktemp -t foobar.XXXXXX)"
tempfiles+=( "$temp_foo" )

and $temp_foo will be deleted on exit, and the current line number will be printed. (set -e will likewise give you exit-on-error behavior, though it comes with serious caveats and weakens code's predictability and portability).

You can either let the trap call error for you (in which case it uses the default exit code of 1 and no message) or call it yourself and provide explicit values; for instance:

error ${LINENO} "the foobar failed" 2

will exit with status 2, and give an explicit message.
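A minimal sketch of a variation raised in the comments on this answer: passing the failing command and its exit status into the handler as explicit trap arguments, since `BASH_COMMAND` may not survive into the function call on every bash release. The handler name `report_error` and the subshell demo are assumptions made for illustration, not part of the answer's code.

```shell
#!/bin/bash
# Hypothetical handler: receives the line number, the failing command text,
# and its exit status as explicit arguments from the trap string.
report_error() {
  local lineno="$1" command="$2" status="$3"
  echo "Error on or near line ${lineno}: '${command}' exited with status ${status}" >&2
  exit "$status"
}

# Run the demo in a command substitution (a subshell) so that the handler's
# 'exit' ends only the subshell, not the calling shell.
demo_output=$(
  exec 2>&1        # fold stderr into stdout so the message is captured
  trap 'report_error "$LINENO" "$BASH_COMMAND" "$?"' ERR
  false            # fails with status 1 and fires the ERR trap
  echo "not reached"
)
demo_status=$?

echo "$demo_output"
echo "subshell exit status: $demo_status"
```

The trap string is single-quoted so that `$LINENO`, `$BASH_COMMAND`, and `$?` are expanded when the trap fires rather than when it is installed.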

Charles Duffy
  • 4
    @draemon the variable capitalization is intentional. All-caps is conventional only for shell builtins and environment variables -- using lowercase for everything else prevents namespace conflicts. See also http://stackoverflow.com/questions/673055/correct-bash-and-shell-script-variable-capitalization/673940#673940 – Charles Duffy Jun 09 '11 at 03:25
  • 1
    before you break it again, test your change. Conventions are a good thing, but they're secondary to functioning code. – Draemon Jun 09 '11 at 21:10
  • 3
    @Draemon, I actually disagree. Obviously-broken code gets noticed and fixed. Bad-practices but mostly-working code lives forever (and gets propagated). – Charles Duffy May 22 '14 at 16:55
  • 1
    but you didn't notice. Broken code get noticed *because* functioning code is the primary concern. – Draemon Jul 11 '14 at 18:54
  • @Draemon, the `function` keyword is bad practice, introducing gratuitous incompatibility with POSIX sh for no benefit whatsoever (as opposed to this code's other incompatibilities with POSIX sh, which add value). I'd appreciate it, at this point, if you'd let my code be. – Charles Duffy Aug 29 '14 at 19:23
  • 5
    it's not exactly gratuitous (http://stackoverflow.com/a/10927223/26334) and if the code is already incompatible with POSIX removing the function keyword doesn't make it any more able to run under POSIX sh, but my main point was that you've (IMO) devalued the answer by weakening the recommendation to use set -e. Stackoverflow isn't about "your" code, it's about having the best answers. – Draemon Aug 29 '14 at 22:18
  • @Draemon, to be sure. That said, it _is_ accepted protocol that edits can be rejected for substantially changing the meaning -- and also that edit wars are considered exceedingly poor form. If I need provide meta links on either of those points, let me know. – Charles Duffy Aug 30 '14 at 05:03
  • @Draemon, ...moreover, as that particular topic is one that I rather frequently and publicly rant about (as a proponent of being in the habit of writing POSIX-compliant code by default, and making deviations by intent), making a change contrary to that (1) puts my name next to practices which I'm very publicly opposed to, and (2) feels a bit like it might be deliberate trolling. – Charles Duffy Aug 30 '14 at 05:35
  • you'll note that I haven't changed anything back again precisely because this has clearly descended into an edit war (which was not the intention). I wouldn't know what you're generally opposed to, but note that you included the function keyword originally. Not trolling, just have a different opinion. – Draemon Aug 30 '14 at 13:47
  • 2
    For those reading this today: Some recent versions of bash have a bug impacting accuracy of `LINENO` within trap handlers. Thus, there are cases where this won't work today where it used to be functional in the past. – Charles Duffy Feb 04 '15 at 20:45
  • @CharlesDuffy Correct me if I'm wrong but I think you are needlessly using `local code="${3:-1}"`. It's the same as `local code="$3"`. If you want to get last character of third argument you should use `local code="${3: -1}"` or `local code="${3:(-1)}"`. – piotrekkr May 13 '15 at 07:36
  • 1
    @piotrekkr, the usage is correct as-given -- and this is not a substring operation (as it would be if I changed it per your suggestions); check behavior when `$3` is and is not unset. The intended expansion -- which happens to be what this correctly functions as -- is `${varname:-defaultvalue}`, not `${varname:offset}`. – Charles Duffy May 13 '15 at 12:53
  • @CharlesDuffy you are right. It seems that I misunderstood the example at tldp.org `stringZ=abcABC123ABCabc; echo ${stringZ:-4} Defaults to full string, as in ${parameter:-default}`. Bash syntax is quite misleading: there is only a one-space difference between `${3: -1}` and `${3:-1}`, and I get two different things... – piotrekkr May 13 '15 at 19:12
  • @piotrekkr, I'd strongly recommend using the bash-hackers reference (http://wiki.bash-hackers.org/syntax/pe) or the Wooledge one (see http://mywiki.wooledge.org/BashFAQ/073 and links from same); the ABS is rather well-known for its poor maintenance (and tendency to showcase bad practices in examples). – Charles Duffy May 13 '15 at 19:43
  • @piotrekkr, ...re: "bash syntax", `${foo:-bar}` is mandatory to be compliant with POSIX sh; the only place where there would be room for bash to do anything different would be the substring syntax, and variance there would be incompatible with ksh. Thus, for compatibility's sake, there's very little choice. – Charles Duffy May 13 '15 at 19:49
  • Has anyone looked at creating or using a GitHub project to host a solution? This [answer](http://stackoverflow.com/a/30019669/320399) looks compelling. I'm considering trying it, but would really like input from any of you. – blong Jul 20 '15 at 13:52
  • @blong, I did a quick code read and filed a few tickets about obvious issues some time back. Even with the more serious of those tickets resolved, however, I'm conservative enough to be wary of anything so heavy-handed (and that library is indeed heavy-handed), particularly in a language like bash with so little care about scope. – Charles Duffy Jul 20 '15 at 14:33
  • @blong, ...there are also several forward-compatibility issues, such as function names using characters not guaranteed by either the POSIX spec or bash documentation to be supported. Works now, sure, but that doesn't mean it'll continue to work in the future. – Charles Duffy Jul 20 '15 at 14:35
  • (also, *grumble* about using `echo -e` rather than `printf` with a format string -- means that escape sequences inside of data can be honored rather than treated literally). – Charles Duffy Jul 20 '15 at 14:36
  • ...also, *grumble* about sloppy quoting practices. – Charles Duffy Jul 20 '15 at 14:38
  • @CharlesDuffy Thanks for the feedback. In that case, can you better explain your answer? Is the usage identified by your last 2 lines? `fn_facade="$()"` and then `past_fns+=( "$fn_facade" )` ? – blong Jul 20 '15 at 15:28
  • @blong, no -- I didn't go so generic as to create a stack of cleanup functions, though it'd be easy to do. `cleanup_fns=( )` at the top; `for cleanup_fn in "${cleanup_fns[@]}"; do "$cleanup_fn"; done` in the handler; then, `cleanup_foo() { some function; }` to define a new function and `cleanup_fns+=( cleanup_foo )`. Though, really, one could just iterate over all defined functions named starting with `cleanup_` if one wanted such a thing. The code you quoted is specific to cleanup of temporary files. – Charles Duffy Jul 20 '15 at 15:34
  • @Draemon, `set -e` is [actively harmful](http://mywiki.wooledge.org/BashFAQ/105) to portability (not only across shells, but also across individual releases within a shell), predictability, and consequently readability of nontrivial code. I'm entirely comfortable defending any decision against advising its use. See https://www.in-ulm.de/~mascheck/various/set-e/, focused *specifically* on portability-related aspects of behavior. – Charles Duffy Jun 28 '17 at 14:14
  • @CharlesDuffy, in the case where trap calls `error` would you consider using `${BASH_COMMAND}` to retrieve and report the problematic command? Additionally, `$?` when retrieved first thing within `error` appears to be reporting the return code associated with the failed command - do you see any issues with saving it off and propagating it out of `error`? – iruvar Sep 05 '17 at 02:52
  • @iruvar, you'd need to pass it as an argument in the trap itself -- `trap 'error "$LINENO" "$BASH_COMMAND" "$?"' ERR`. Whereas `$?` will survive into the call, `BASH_COMMAND` won't. – Charles Duffy Sep 05 '17 at 15:49
  • @CharlesDuffy, thank you. Incidentally `"${BASH_COMMAND}"` survives into the call to `error` at least in my use case. But I agree it would make sense to pass it as an argument to `trap` – iruvar Sep 05 '17 at 18:27
  • @iruvar, there are definitely releases of bash where it doesn't, and you get the text of the function call itself in `BASH_COMMAND`. I'd need to research to find the details, but from a compatibility perspective, better to do the safe thing. (Aside: There's absolutely no value provided by `"${foo}"` over `"$foo"` except for consistency with cases where the brackets *are* necessary, either for a parameterized expansion or concatenation with a following string containing characters legal in names). – Charles Duffy Sep 05 '17 at 18:55
  • @CharlesDuffy I added your trap model to a script on Bash 4.4. It does well trapping errors I make at the prompt, but I have not been able to trap the same error in a function. Does trap not run in functions? – Buoy Mar 19 '18 at 18:15
  • @CharlesDuffy I made progress with shopt -s extdebug obtained here in the 2nd answer, trap works in functions: https://unix.stackexchange.com/questions/419017/return-trap-in-bash-not-executing-for-function – Buoy Mar 19 '18 at 20:17
  • @RickLan, the "the" you proposed adding implied that `set -e` only weakens *this specific* code's predictability and portability. That's not what the prose was written to state; `set -e` weakens *all* code's predictability and portability. The original grammar was thus correct for the intended meaning. – Charles Duffy Oct 02 '19 at 15:19
  • Do I need to enable the `errexit` shell option in order for the trap to work? @CharlesDuffy Thanks for your help! It seems like everytime I look for something related to bash **yo̲u̲** have the answer and it is always the best one. – Elie G. Jan 15 '20 at 16:30
  • No, you don't need to enable errexit for an ERR trap to trigger. (That said, ERR traps are subject to the [same general caveats `errexit` is](http://mywiki.wooledge.org/BashFAQ/105#Exercises), making them... less than reliable). – Charles Duffy Jan 15 '20 at 16:43
  • What should I use then? – Elie G. Jan 15 '20 at 16:53
  • Explicit, manual error handling, same as you would in C or Go or another language that doesn't support exceptions -- check each command's return value, explicitly act accordingly. – Charles Duffy Jan 15 '20 at 16:56
  • Doc for `trap`: https://www.gnu.org/software/bash/manual/html_node/Bourne-Shell-Builtins.html#index-trap – Jacktose Nov 18 '20 at 18:55
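The stack of cleanup functions described in the comments above can be sketched roughly as follows. The function names (`run_cleanups`, `cleanup_tempdir`, `cleanup_lock`) are hypothetical, and the cleanup bodies only print a message here so the ordering is visible; a real script would remove files or release resources instead.

```shell
#!/bin/bash
# Sketch of a cleanup-function stack; all names are illustrative.
cleanup_fns=( )

# Handler: call every registered cleanup function in registration order.
run_cleanups() {
  local cleanup_fn
  for cleanup_fn in "${cleanup_fns[@]}"; do
    "$cleanup_fn"
  done
}

# Demo inside a command substitution so the EXIT trap fires when the
# subshell ends and its output can be captured.
cleanup_log=$(
  trap run_cleanups EXIT

  cleanup_tempdir() { echo "removing temp dir"; }
  cleanup_lock()    { echo "releasing lock"; }
  cleanup_fns+=( cleanup_tempdir cleanup_lock )

  echo "doing work"
)
echo "$cleanup_log"
```

Registering functions rather than file names keeps the handler generic: each part of a script can append its own cleanup step without editing the trap.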
134

That's a fine solution. I just wanted to add

set -e

as a rudimentary error mechanism. It will immediately stop your script if a simple command fails. I think this should have been the default behavior: since such errors almost always signify something unexpected, it is not really 'sane' to keep executing the following commands.
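One caveat worth seeing in action before relying on this: errexit is ignored for every command inside a function when that function is called as part of an `if` condition (or an `&&`/`||` list). A minimal sketch, with the function name `step` chosen purely for illustration:

```shell
#!/bin/bash
# Illustrative function: the first command fails; the second only runs
# when errexit is being ignored.
step() {
  false
  echo "step continued past the failure"
}

# Plain call: errexit aborts the subshell at 'false', so nothing is printed.
plain_output=$( set -e; step; echo "after step" )

# Call inside an 'if' condition: errexit is ignored for the whole function body.
conditional_output=$( set -e; if step; then echo "step reported success"; fi )

echo "plain:       '${plain_output}'"
echo "conditional: '${conditional_output}'"
```

The same function therefore behaves differently depending on how it is called, which is one reason explicit checks are often suggested for code that must not continue after a failure.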

Bruno De Fraine
  • 33
    `set -e` is not without gotchas: See http://mywiki.wooledge.org/BashFAQ/105 for several. – Charles Duffy Jul 30 '12 at 16:41
  • 3
    @CharlesDuffy, some of the gotchas can be overcome with `set -o pipefail` – hobs Sep 07 '12 at 22:31
  • 7
    @CharlesDuffy Thanks for pointing to the gotchas; overall though, I still think `set -e` has a high benefit-cost ratio. – Bruno De Fraine Sep 11 '12 at 08:21
  • 3
    @BrunoDeFraine I use `set -e` myself, but a number of the other regulars in irc.freenode.org#bash advise (in quite strong terms) against it. At a minimum, the gotchas in question should be well-understood. – Charles Duffy Sep 11 '12 at 13:17
  • 3
    set -e -o pipefail -u # and know what you are doing – Sam Watkins Jul 03 '14 at 08:32
  • More explanation on `set -euo pipefail` can be found here: https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html – Hieu Oct 12 '14 at 01:52
  • `set -euo pipefail; shopt -s failglob # safe mode` – Tom Hale Feb 26 '17 at 05:31
  • 2
    @TomHale, can you answer every exercise in BashFAQ#105 (re: `set -e` corner cases) correctly? Are you confident you'll never hit any of the portability bugs in https://www.in-ulm.de/~mascheck/various/set-e/? There's nothing "safe" about what you're calling "safe mode". – Charles Duffy Jan 04 '18 at 17:28
  • This is not a solution, it's a band-aid. `set -e` can easily and should be replaced with actual (robust) error checking. – Alexej Magura Oct 02 '19 at 19:47
86

Reading all the answers on this page inspired me a lot.

So, here's my hint:

file content: lib.trap.sh

lib_name='trap'
lib_version=20121026

stderr_log="/dev/shm/stderr.log"

#
# TO BE SOURCED ONLY ONCE:
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

if test "${g_libs[$lib_name]+_}"; then
    return 0
else
    if test ${#g_libs[@]} == 0; then
        declare -A g_libs
    fi
    g_libs[$lib_name]=$lib_version
fi


#
# MAIN CODE:
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

set -o pipefail  # trace ERR through pipes
set -o errtrace  # trace ERR through 'time command' and other functions
set -o nounset   ## set -u : exit the script if you try to use an uninitialised variable
set -o errexit   ## set -e : exit the script if any statement returns a non-true return value

exec 2>"$stderr_log"


###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##
#
# FUNCTION: EXIT_HANDLER
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

function exit_handler ()
{
    local error_code="$?"

    test $error_code == 0 && return;

    #
    # LOCAL VARIABLES:
    # ------------------------------------------------------------------
    #    
    local i=0
    local regex=''
    local mem=''

    local error_file=''
    local error_lineno=''
    local error_message='unknown'

    local lineno=''


    #
    # PRINT THE HEADER:
    # ------------------------------------------------------------------
    #
    # Color the output if it's an interactive terminal
    test -t 1 && { tput bold; tput setf 4; }                            ## red bold
    echo -e "\n(!) EXIT HANDLER:\n"


    #
    # GETTING LAST ERROR OCCURRED:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    #
    # Read last line from the error log
    # ------------------------------------------------------------------
    #
    if test -f "$stderr_log"
        then
            stderr=$( tail -n 1 "$stderr_log" )
            rm "$stderr_log"
    fi

    #
    # Managing the line to extract information:
    # ------------------------------------------------------------------
    #

    if test -n "$stderr"
        then        
            # Exploding stderr on :
            mem="$IFS"
            local shrunk_stderr=$( echo "$stderr" | sed 's/\: /\:/g' )
            IFS=':'
            local stderr_parts=( $shrunk_stderr )
            IFS="$mem"

            # Storing information on the error
            error_file="${stderr_parts[0]}"
            error_lineno="${stderr_parts[1]}"
            error_message=""

            for (( i = 3; i <= ${#stderr_parts[@]}; i++ ))
                do
                    error_message="$error_message "${stderr_parts[$i-1]}": "
            done

            # Removing last ':' (colon character)
            error_message="${error_message%:*}"

            # Trim
            error_message="$( echo "$error_message" | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//' )"
    fi

    #
    # GETTING BACKTRACE:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #
    _backtrace=$( backtrace 2 )


    #
    # MANAGING THE OUTPUT:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    local lineno=""
    regex='^([a-z]{1,}) ([0-9]{1,})$'

    if [[ $error_lineno =~ $regex ]]

        # The error line was found on the log
        # (e.g. type 'ff' without quotes wherever)
        # --------------------------------------------------------------
        then
            local row="${BASH_REMATCH[1]}"
            lineno="${BASH_REMATCH[2]}"

            echo -e "FILE:\t\t${error_file}"
            echo -e "${row^^}:\t\t${lineno}\n"

            echo -e "ERROR CODE:\t${error_code}"             
            test -t 1 && tput setf 6                                    ## white yellow
            echo -e "ERROR MESSAGE:\n$error_message"


        else
            regex="^${error_file}\$|^${error_file}\s+|\s+${error_file}\s+|\s+${error_file}\$"
            if [[ "$_backtrace" =~ $regex ]]

                # The file was found on the log but not the error line
                # (could not reproduce this case so far)
                # ------------------------------------------------------
                then
                    echo -e "FILE:\t\t$error_file"
                    echo -e "ROW:\t\tunknown\n"

                    echo -e "ERROR CODE:\t${error_code}"
                    test -t 1 && tput setf 6                            ## white yellow
                    echo -e "ERROR MESSAGE:\n${stderr}"

                # Neither the error line nor the error file was found on the log
                # (e.g. type 'cp ffd fdf' without quotes wherever)
                # ------------------------------------------------------
                else
                    #
                    # The error file is the first on backtrace list:

                    # Exploding backtrace on newlines
                    mem=$IFS
                    IFS='
                    '
                    #
                    # Substring: I keep only the newline
                    # (others needed only for tabbing purpose)
                    IFS=${IFS:0:1}
                    local lines=( $_backtrace )

                    IFS=$mem

                    error_file=""

                    if test -n "${lines[1]}"
                        then
                            array=( ${lines[1]} )

                            for (( i=2; i<${#array[@]}; i++ ))
                                do
                                    error_file="$error_file ${array[$i]}"
                            done

                            # Trim
                            error_file="$( echo "$error_file" | sed -e 's/^[ \t]*//' | sed -e 's/[ \t]*$//' )"
                    fi

                    echo -e "FILE:\t\t$error_file"
                    echo -e "ROW:\t\tunknown\n"

                    echo -e "ERROR CODE:\t${error_code}"
                    test -t 1 && tput setf 6                            ## white yellow
                    if test -n "${stderr}"
                        then
                            echo -e "ERROR MESSAGE:\n${stderr}"
                        else
                            echo -e "ERROR MESSAGE:\n${error_message}"
                    fi
            fi
    fi

    #
    # PRINTING THE BACKTRACE:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    test -t 1 && tput setf 7                                            ## white bold
    echo -e "\n$_backtrace\n"

    #
    # EXITING:
    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #

    test -t 1 && tput setf 4                                            ## red bold
    echo "Exiting!"

    test -t 1 && tput sgr0 # Reset terminal

    exit "$error_code"
}
trap exit_handler EXIT                                                  # ! ! ! TRAP EXIT ! ! !
trap exit ERR                                                           # ! ! ! TRAP ERR ! ! !


###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##
#
# FUNCTION: BACKTRACE
#
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~##

function backtrace
{
    local _start_from_=0

    local params=( "$@" )
    if (( "${#params[@]}" >= "1" ))
        then
            _start_from_="$1"
    fi

    local i=0
    local first=false
    while caller $i > /dev/null
    do
        if test -n "$_start_from_" && (( "$i" + 1   >= "$_start_from_" ))
            then
                if test "$first" == false
                    then
                        echo "BACKTRACE IS:"
                        first=true
                fi
                caller $i
        fi
        let "i=i+1"
    done
}

return 0



Example of usage:
file content: trap-test.sh

#!/bin/bash

source 'lib.trap.sh'

echo "doing something wrong now .."
echo "$foo"

exit 0


Running:

bash trap-test.sh

Output:

doing something wrong now ..

(!) EXIT HANDLER:

FILE:       trap-test.sh
LINE:       6

ERROR CODE: 1
ERROR MESSAGE:
foo:   unassigned variable

BACKTRACE IS:
1 main trap-test.sh

Exiting!


In a terminal, the output is colored and the error message appears in the language of the current locale.

Luca Borrione
  • 3
    this thing is awesome.. you should create a github project for it, so people can easily make improvements and contribute back. I combined it with log4bash and together it creates a powerful env for creating good bash scripts. – Dominik Dorn Dec 15 '13 at 00:13
  • 1
    FYI -- `test ${#g_libs[@]} == 0` isn't POSIX-compliant (POSIX test supports `=` for string comparisons or `-eq` for numeric comparisons, but not `==`, not to mention the lack of arrays in POSIX), and if you're **not** trying to be POSIX compliant, why in the world are you using `test` at all rather than a math context? `(( ${#g_libs[@]} == 0 ))` is, after all, easier to read. – Charles Duffy Feb 14 '14 at 20:24
  • The `function` keyword is also non-POSIX; the compatible way to declare functions is `foo() { ... }`, with no `function` keyword at all. – Charles Duffy Feb 14 '14 at 20:24
  • 2
    @Luca - this is truly great! Your picture inspired me to create my own implementation of this, which takes it even a few steps further. I've posted it in my [answer below](http://stackoverflow.com/questions/64786/error-handling-in-bash/30019669#30019669). – niieani May 03 '15 at 21:40
  • Why is it this should only be sourced once? Will something break if it's sourced multiple times? – blong Jul 20 '15 at 19:03
  • 3
    Bravissimo!! This is an excellent way to debug a script. *Grazie mille* The only thing I added was a check for OS X like this: `case "$(uname)" in Darwin ) stderr_log="${TMPDIR}stderr.log";; Linux ) stderr_log="/dev/shm/stderr.log";; * ) stderr_log="/dev/shm/stderr.log" ;; esac` – SaxDaddy Aug 27 '16 at 05:50
  • Add a link to gist so we can all fork / improve. – johndpope Apr 25 '17 at 14:13
  • if `test ${#g_libs[@]} == 0` is about declaration, it would be better to use similar: `declare -p g_libs > /dev/null 2> /dev/null || declare -A g_libs`. – Kirby Jul 26 '18 at 11:25
  • 1
    A bit of a shameless self-plug, but we've taken this snippet, cleaned it up, added more features, improved the output formatting, and made it more POSIX compatible (works on both Linux and OSX). It's published as part of Privex ShellCore on Github: https://github.com/Privex/shell-core – Someguy123 Oct 08 '19 at 04:39
23

An equivalent alternative to "set -e" is

set -o errexit

It makes the meaning of the flag somewhat clearer than just "-e".

Random addition: to temporarily disable the flag, and return to the default (of continuing execution regardless of exit codes), just use

set +e
echo "commands run here returning non-zero exit codes will not cause the entire script to fail"
echo "false returns 1 as an exit code"
false
set -e

This precludes proper error handling mentioned in other responses, but is quick & effective (just like bash).
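One thing to watch with the `set +e` ... `set -e` pair is that the closing `set -e` turns errexit on even if the caller never had it enabled. A minimal sketch of restoring the previous state instead, by inspecting `$-`; the helper name `run_tolerant` and the variable `tolerant_status` are assumptions for illustration only:

```shell
#!/bin/bash
# Hypothetical helper: run a command with errexit off, record its status in
# tolerant_status, then restore whatever errexit state the caller had.
run_tolerant() {
  local had_errexit=0
  case $- in *e*) had_errexit=1 ;; esac   # $- lists the active option letters
  set +e
  "$@"
  tolerant_status=$?
  if (( had_errexit )); then set -e; fi
  return 0
}

# Caller with errexit on: it should still be on afterwards.
state_on=$(
  set -e
  run_tolerant false
  case $- in
    *e*) echo "errexit still on, status was $tolerant_status" ;;
    *)   echo "errexit lost" ;;
  esac
)

# Caller with errexit off: it should stay off.
state_off=$(
  set +e
  run_tolerant false
  case $- in
    *e*) echo "errexit turned on" ;;
    *)   echo "errexit still off, status was $tolerant_status" ;;
  esac
)

echo "$state_on"
echo "$state_off"
```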

Dave Snigier
  • 1
    using `$(foo)` on a bare line rather than just `foo` is usually the Wrong Thing. Why promote it by giving it as an example? – Charles Duffy Apr 08 '13 at 17:28
20

Inspired by the ideas presented here, I have developed a readable and convenient way to handle errors in bash scripts in my bash boilerplate project.

By simply sourcing the library, you get the following out of the box (i.e. it will halt execution on any error, as if using set -e thanks to a trap on ERR and some bash-fu):

bash-oo-framework error handling

There are some extra features that help handle errors, such as try and catch, or the throw keyword, that allows you to break execution at a point to see the backtrace. Plus, if the terminal supports it, it spits out powerline emojis, colors parts of the output for great readability, and underlines the method that caused the exception in the context of the line of code.

The downside is - it's not portable - the code works in bash, probably >= 4 only (but I'd imagine it could be ported with some effort to bash 3).

The code is separated into multiple files for better handling, but I was inspired by the backtrace idea from the answer above by Luca Borrione.

To read more or take a look at the source, see GitHub:

https://github.com/niieani/bash-oo-framework#error-handling-with-exceptions-and-throw

Community
niieani
  • This is inside the _Bash Object Oriented Framework_ project. ... Luckily it only has 7.4k LOC (according to [GLOC](https://github.com/artem-solovev/gloc) ). OOP -- Object-oriented pain? – ingyhere Dec 02 '19 at 16:53
  • @ingyhere it's highly modular (and delete-friendly), so you can only use the exceptions part if that is what you came for ;) – niieani Dec 04 '19 at 19:51
11

I prefer something really easy to call. So I use something that looks a little complicated, but is easy to use. I usually just copy-and-paste the code below into my scripts. An explanation follows the code.

#This function is used to cleanly exit any script. It does this displaying a
# given error message, and exiting with an error code.
function error_exit {
    echo
    echo "$@"
    exit 1
}
#Trap the killer signals so that we can exit with a good message.
trap "error_exit 'Received signal SIGHUP'" SIGHUP
trap "error_exit 'Received signal SIGINT'" SIGINT
trap "error_exit 'Received signal SIGTERM'" SIGTERM

#Alias the function so that it will print a message with the following format:
#prog-name(@line#): message
#We have to explicitly allow aliases, we do this because they make calling the
#function much easier (see example).
shopt -s expand_aliases
alias die='error_exit "Error ${0}(@`echo $(( $LINENO - 1 ))`):"'

I usually put a call to the cleanup function inside the error_exit function, but this varies from script to script, so I left it out. The traps catch the common terminating signals and make sure everything gets cleaned up. The alias is what does the real magic. I like to check everything for failure, so in general I call programs in an "if !" type statement. By subtracting 1 from the line number, the alias tells me where the failure occurred. It is also dead simple to call, and pretty much idiot proof. Below is an example (just replace /bin/false with whatever you are going to call).

#This is an example usage, it will print out
#Error prog-name (@1): Who knew false is false.
if ! /bin/false ; then
    die "Who knew false is false."
fi
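For comparison, a function-based variant that avoids `shopt -s expand_aliases` might look like the sketch below. It is an assumption-laden alternative rather than the code above: it relies on bash's `BASH_LINENO` array, whose first element is the line from which the current function was called, so no subtraction is needed.

```shell
#!/bin/bash
# Hypothetical function-based variant of the 'die' alias.
# BASH_LINENO[0] is the line number from which this function was called.
die() {
  printf 'Error %s(@%s): %s\n' "$0" "${BASH_LINENO[0]}" "$*" >&2
  exit 1
}

# Demo in a command substitution so 'exit' does not end the calling shell;
# stderr is folded into stdout so the message can be captured.
die_message=$( exec 2>&1; false || die "Who knew false is false." )
die_status=$?

echo "$die_message"
echo "exit status: $die_status"
```

Because `die` is an ordinary function, the short form `some_command || die "message"` works without enabling aliases.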
  • 2
    Can you expand on the statement _"We have to explicitly allow aliases"_ ? I'd be worried that some unexpected behavior might result. Is there a way to achieve the same thing with a smaller impact? – blong Jul 29 '15 at 13:19
  • I dont need `$LINENO - 1`. Show correctly without it. – kyb Apr 13 '18 at 18:17
  • Shorter usage example in bash and zsh `false || die "hello death"` – kyb Apr 13 '18 at 18:18
6

Another consideration is the exit code to return. Just "1" is pretty standard, although there are a handful of reserved exit codes that bash itself uses, and that same page argues that user-defined codes should be in the range 64-113 to conform to C/C++ standards.

You might also consider the bit vector approach that mount uses for its exit codes:

 0  success
 1  incorrect invocation or permissions
 2  system error (out of memory, cannot fork, no more loop devices)
 4  internal mount bug or missing nfs support in mount
 8  user interrupt
16  problems writing or locking /etc/mtab
32  mount failure
64  some mount succeeded

OR-ing the codes together allows your script to signal multiple simultaneous errors.
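A minimal sketch of the bit-vector idea in bash arithmetic. The flag names and meanings below are hypothetical and are not mount's; each flag occupies one bit so that several can be combined into a single status.

```shell
#!/bin/bash
# Hypothetical error flags, one bit each.
ERR_CONFIG=1
ERR_NETWORK=2
ERR_DISK=4

# Print one line per flag that is set in the given status.
describe_status() {
  local combined="$1"
  if (( combined & ERR_CONFIG ));  then echo "config error";  fi
  if (( combined & ERR_NETWORK )); then echo "network error"; fi
  if (( combined & ERR_DISK ));    then echo "disk error";    fi
}

combined=0
combined=$(( combined | ERR_CONFIG ))   # record a configuration problem
combined=$(( combined | ERR_DISK ))     # record a disk problem as well

echo "combined status: $combined"
describe_status "$combined"
```

Since exit statuses are limited to the range 0-255, this scheme leaves room for at most eight independent flags.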

yukondude
4

I use the following trap code; it also allows errors to be traced through pipes and 'time' commands.

#!/bin/bash
set -o pipefail  # trace ERR through pipes
set -o errtrace  # trace ERR through 'time command' and other functions
function error() {
    JOB="$0"              # job name
    LASTLINE="$1"         # line of error occurrence
    LASTERR="$2"          # error code
    echo "ERROR in ${JOB} : line ${LASTLINE} with exit code ${LASTERR}"
    exit 1
}
trap 'error ${LINENO} ${?}' ERR
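
The effect of the `set -o pipefail` line can be seen in isolation with a small sketch like this one (the variable names are illustrative). Without it, a pipeline reports only the status of its last command, so a failure earlier in the pipe never reaches the ERR trap.

```shell
#!/bin/bash
# Without pipefail, a pipeline's status is that of its last command.
default_status=0
( set +o pipefail; false | true ) || default_status=$?

# With pipefail, the status is that of the rightmost command that failed.
pipefail_status=0
( set -o pipefail; false | true ) || pipefail_status=$?

echo "without pipefail: $default_status"
echo "with pipefail:    $pipefail_status"
```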
Jean-Christophe Meillaud
Olivier Delrieu
3

Not sure if this will be helpful to you, but I modified some of the suggested functions here in order to include the check for the error (exit code from prior command) within it. On each "check" I also pass as a parameter the "message" of what the error is for logging purposes.

#!/bin/bash

error_exit()
{
    if [ "$?" != "0" ]; then
        log.sh "$1"
        exit 1
    fi
}

Now to call it within the same script (or in another one if I use export -f error_exit) I simply write the name of the function and pass a message as parameter, like this:

#!/bin/bash

cd /home/myuser/afolder
error_exit "Unable to switch to folder"

rm *
error_exit "Unable to delete all files"

Using this I was able to create a really robust bash file for some automated process and it will stop in case of errors and notify me (log.sh will do that)

eraxillan
Nelson Rodriguez
  • 2
    Consider using the POSIX syntax for defining functions -- no `function` keyword, just `error_exit() {`. – Charles Duffy Apr 08 '13 at 17:30
  • 2
    is there a reason why you don't just do `cd /home/myuser/afolder || error_exit "Unable to switch to folder"` ? – Pierre-Olivier Vares Jul 29 '14 at 15:59
  • @Pierre-OlivierVares No particular reason about not using ||. This was just an excerpt of an existing code and I just added the "error handling" lines after each concerning line. Some are very long and it was just cleaner to have it on a separate (immediate) line – Nelson Rodriguez Jan 18 '17 at 16:00
  • Looks like a clean solution, though, shell check complains: https://github.com/koalaman/shellcheck/wiki/SC2181 – mhulse Mar 18 '20 at 07:11
3

This has served me well for a while now. It prints error or warning messages in red, one line per parameter, and allows an optional exit code.

# Custom errors
EX_UNKNOWN=1

warning()
{
    # Output warning messages
    # Color the output red if it's an interactive terminal
    # @param $1...: Messages

    test -t 2 && tput setf 4 >&2

    printf '%s\n' "$@" >&2

    test -t 2 && tput sgr0 >&2 # Reset terminal
    true
}

error()
{
    # Output error messages with optional exit code
    # @param $1...: Messages
    # @param $N: Exit code (optional)

    messages=( "$@" )

    # If the last parameter is a number, it's not part of the messages
    last_parameter="${messages[@]: -1}"
    if [[ "$last_parameter" =~ ^[0-9]+$ ]]
    then
        exit_code=$last_parameter
        unset "messages[$((${#messages[@]} - 1))]"
    fi

    warning "${messages[@]}"

    exit ${exit_code:-$EX_UNKNOWN}
}
l0b0
  • 48,420
  • 21
  • 118
  • 185
3

I've used

die() {
        echo "$1"
        kill $$
}

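before; I think because 'exit' was failing for me for some reason. (A likely cause: exit only leaves the current subshell when it is called inside one, whereas kill $$ signals the top-level shell.) The above defaults seem like a good idea, though.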
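
Here is a small sketch of one situation where the two behave differently; I am assuming the call happened inside a command substitution, which may or may not have been the case. Each scenario runs in its own child bash so the demo shell itself survives.

```bash
#!/usr/bin/env bash
# exit inside $( ) only ends the command-substitution subshell,
# so the script carries on afterwards.
with_exit=$(bash -c '
    die() { echo "$1" >&2; exit 1; }
    value=$(die "fatal" 2>/dev/null)
    echo "still running"
') || true

# kill $$ signals the top-level shell ($$ keeps the parent PID even
# inside a subshell), so nothing after the call gets to run.
with_kill=$(bash -c '
    die() { echo "$1" >&2; kill $$; }
    value=$(die "fatal" 2>/dev/null)
    echo "still running"
' 2>/dev/null) || true

echo "with exit: '${with_exit}'"
echo "with kill: '${with_kill}'"
```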

pjz
  • 38,171
  • 5
  • 45
  • 60
1

This trick is useful for missing commands or functions. The name of the missing function (or executable) will be passed in $_.

function handle_error {
    status=$?
    last_call=$1

    # 127 is 'command not found'
    (( status != 127 )) && return

    echo "you tried to call $last_call"
    return
}

# Trap errors.
trap 'handle_error "$_"' ERR
Orwellophile
  • 11,307
  • 3
  • 59
  • 38
  • Wouldn't `$_` be available in the function the same as `$?`? I'm not sure there is any reason to use one in the function but not the other. – ingyhere Dec 02 '19 at 16:32
1

This function has been serving me rather well recently:

action () {
    # Test if the first parameter is non-zero
    # and return straight away if so
    if test "$1" -ne 0
    then
        return "$1"
    fi

    # Discard the control parameter
    # and execute the rest
    shift 1
    "$@"
    local status=$?

    # Test the exit status of the command run
    # and display an error message on failure
    if test "${status}" -ne 0
    then
        echo "Command \"$*\" failed" >&2
    fi

    return ${status}
}

You call it by passing 0 or the last return value as the first argument, followed by the command to run, so you can chain commands without having to check for error values. With this, this statement block:

command1 param1 param2 param3...
command2 param1 param2 param3...
command3 param1 param2 param3...
command4 param1 param2 param3...
command5 param1 param2 param3...
command6 param1 param2 param3...

Becomes this:

action 0 command1 param1 param2 param3...
action $? command2 param1 param2 param3...
action $? command3 param1 param2 param3...
action $? command4 param1 param2 param3...
action $? command5 param1 param2 param3...
action $? command6 param1 param2 param3...

<<<Error-handling code here>>>

If any of the commands fails, its error code is simply passed through to the end of the block. I find it useful when you don't want subsequent commands to execute if an earlier one failed, but you also don't want the script to exit straight away (for example, inside a loop).
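
To make the loop case concrete, here is a self-contained sketch. It uses a condensed copy of action, and step and chain are hypothetical helpers: step records its name and returns whatever status it is given.

```bash
#!/usr/bin/env bash
# Condensed copy of the action function, so the sketch runs on its own.
action() {
    if [ "$1" -ne 0 ]; then return "$1"; fi
    shift
    "$@"
    local status=$?
    if [ "$status" -ne 0 ]; then echo "Command \"$*\" failed" >&2; fi
    return "$status"
}

# Hypothetical command: records that it ran, then returns the given status.
log=""
step() { log="$log $1"; return "$2"; }

chain() {
    action 0  step a 0
    action $? step b 3    # fails with status 3
    action $? step c 0    # skipped, because b failed
}

for attempt in 1 2; do
    chain || echo "attempt $attempt failed with status $?"
done
echo "steps executed:$log"
```

Step c never runs, yet the loop itself continues to the second attempt, and the status of the first failing step is what reaches the error-handling code.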

xarxziux
  • 341
  • 3
  • 13
0

Using trap is not always an option. For example, if you're writing a reusable function that needs error handling and that can be called from any script (after sourcing the file with helper functions), that function cannot assume anything about when the outer script exits, which makes using traps very difficult. Another disadvantage of traps is poor composability: you risk overwriting a trap that was set earlier in the caller chain.

There is a little trick that can be used to do proper error handling without traps. As you may already know from other answers, set -e has no effect inside a command if you use the || operator after it, even if you run it in a subshell; e.g., this doesn't work as intended:

#!/bin/sh

# prints:
#
# --> outer
# --> inner
# ./so_1.sh: line 16: some_failed_command: command not found
# <-- inner
# <-- outer

set -e

outer() {
  echo '--> outer'
  (inner) || {
    exit_code=$?
    echo '--> cleanup'
    return $exit_code
  }
  echo '<-- outer'
}

inner() {
  set -e
  echo '--> inner'
  some_failed_command
  echo '<-- inner'
}

outer

But the || operator is needed to prevent returning from the outer function before cleanup. The trick is to run the inner command in the background and then immediately wait for it. The wait builtin returns the exit code of the inner command, and now you're using || after wait, not after the inner function, so set -e works properly inside the latter:

#!/bin/sh

# prints:
#
# --> outer
# --> inner
# ./so_2.sh: line 27: some_failed_command: command not found
# --> cleanup

set -e

outer() {
  echo '--> outer'
  inner &
  wait $! || {
    exit_code=$?
    echo '--> cleanup'
    return $exit_code
  }
  echo '<-- outer'
}

inner() {
  set -e
  echo '--> inner'
  some_failed_command
  echo '<-- inner'
}

outer

Here is a generic function that builds upon this idea. It should work in all POSIX-compatible shells if you remove the local keywords, i.e. replace every local x=y with just x=y:

# [CLEANUP=cleanup_cmd] run cmd [args...]
#
# `cmd` and `args...` A command to run and its arguments.
#
# `cleanup_cmd` A command that is called after cmd has exited,
# and gets passed the same arguments as cmd. Additionally, the
# following environment variables are available to that command:
#
# - `RUN_CMD` contains the `cmd` that was passed to `run`;
# - `RUN_EXIT_CODE` contains the exit code of the command.
#
# If `cleanup_cmd` is set, `run` will return the exit code of that
# command. Otherwise, it will return the exit code of `cmd`.
#
run() {
  local cmd="$1"; shift
  local exit_code=0

  local e_was_set=1
  if ! is_shell_attribute_set e; then
    set -e
    e_was_set=0
  fi

  "$cmd" "$@" &

  wait $! || {
    exit_code=$?
  }

  if [ "$e_was_set" = 0 ] && is_shell_attribute_set e; then
    set +e
  fi

  if [ -n "$CLEANUP" ]; then
    RUN_CMD="$cmd" RUN_EXIT_CODE="$exit_code" "$CLEANUP" "$@"
    return $?
  fi

  return $exit_code
}


is_shell_attribute_set() { # attribute, like "x"
  case "$-" in
    *"$1"*) return 0 ;;
    *)    return 1 ;;
  esac
}

Example of usage:

#!/bin/sh
set -e

# Source the file with the definition of `run` (previous code snippet).
# Alternatively, you may paste that code directly here and comment the next line.
. ./utils.sh


main() {
  echo "--> main: $@"
  CLEANUP=cleanup run inner "$@"
  echo "<-- main"
}


inner() {
  echo "--> inner: $@"
  sleep 0.5; if [ "$1" = 'fail' ]; then
    oh_my_god_look_at_this
  fi
  echo "<-- inner"
}


cleanup() {
  echo "--> cleanup: $@"
  echo "    RUN_CMD = '$RUN_CMD'"
  echo "    RUN_EXIT_CODE = $RUN_EXIT_CODE"
  sleep 0.3
  echo '<-- cleanup'
  return $RUN_EXIT_CODE
}

main "$@"

Running the example:

$ ./so_3 fail; echo "exit code: $?"

--> main: fail
--> inner: fail
./so_3: line 15: oh_my_god_look_at_this: command not found
--> cleanup: fail
    RUN_CMD = 'inner'
    RUN_EXIT_CODE = 127
<-- cleanup
exit code: 127

$ ./so_3 pass; echo "exit code: $?"

--> main: pass
--> inner: pass
<-- inner
--> cleanup: pass
    RUN_CMD = 'inner'
    RUN_EXIT_CODE = 0
<-- cleanup
<-- main
exit code: 0

The only thing you need to be aware of when using this method is that modifications of shell variables made by the command you pass to run will not propagate to the calling function, because the command runs in a subshell.
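
A minimal sketch of that caveat, independent of run itself. The function names are made up, and handing the value back through a temporary file is just one possible workaround, not something run does for you.

```bash
#!/usr/bin/env bash
counter=0
bump() { counter=$((counter + 1)); }

# Background jobs run in a subshell, just like the command given to run,
# so the increment is lost when the job finishes.
bump &
wait $!
echo "after background call: counter=$counter"

# One possible workaround: pass the value back through a temporary file.
tmp=$(mktemp)
bump_to_file() { counter=$((counter + 1)); echo "$counter" > "$tmp"; }
bump_to_file &
wait $!
counter=$(cat "$tmp")
rm -f "$tmp"
echo "after reading the file: counter=$counter"
```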

skozin
  • 3,191
  • 2
  • 18
  • 23