8

While looking for a way to catch errors in my Bash scripts, I've been experimenting with "set -e", "set -E", and the "trap" command. Along the way, I've discovered some strange behavior in how $LINENO is evaluated inside functions. First, here's a stripped-down version of how I'm trying to log errors:

#!/bin/bash

set -E
trap 'rc=$?; echo "Failed on line: $LINENO at command: $BASH_COMMAND"; exit $rc' ERR

Now, the behavior is different based on where the failure occurs. For example, if I follow the above with:

echo "Should fail at: $((LINENO + 1))"
false

I get the following output:

Should fail at: 6
Failed on line: 6 at command: false

Everything is as expected. Line 6 is the line containing the single command "false". But if I wrap up my failing command in a function and call it like this:

function failure {
    echo "Should fail at $((LINENO + 1))"
    false
}
failure

Then I get the following output:

Should fail at 7
Failed on line: 5 at command: false

As you can see, $BASH_COMMAND contains the correct failing command: "false", but $LINENO is reporting the first line of the "failure" function definition as the current command. That makes no sense to me. Is there a way to get the line number of the line referenced in $BASH_COMMAND?

It's possible this behavior is specific to older versions of Bash. I'm stuck on 3.2.51 for the time being. If the behavior has changed in later releases, it would still be nice to know if there's a workaround to get the value I want on 3.2.51.

EDIT: I'm afraid some people are confused because I broke up my example into chunks. Let me try to clarify what I have, what I'm getting, and what I want.

This is my script:

#!/bin/bash

set -E
function handle_error {
    local retval=$?
    local line=$1
    echo "Failed at $line: $BASH_COMMAND"
    exit $retval
}
trap 'handle_error $LINENO' ERR

function fail {
    echo "I expect the next line to be the failing line: $((LINENO + 1))"
    command_that_fails
}

fail

Now, what I expect is the following output:

I expect the next line to be the failing line: 14
Failed at 14: command_that_fails

What I actually get is the following output:

I expect the next line to be the failing line: 14
Failed at 12: command_that_fails

BUT line 12 is not command_that_fails. Line 12 is function fail {, which is somewhat less helpful. I have also examined the ${BASH_LINENO[@]} array, and it does not have an entry for line 14.

user108471
  • FYI -- don't use the `function` keyword; it's gratuitously incompatible with POSIX, but adds no functionality over the syntax that's actually compatible with other shells: `failure() { ... }` – Charles Duffy Jun 25 '14 at 01:25
  • I appreciate that, but given that I make no effort to write POSIX-compatible scripts, the `function` keyword is my way of warning people about the fact in an obvious sort of way. – user108471 Jun 25 '14 at 01:29
  • This looks to be version-specific; when I try to reproduce it on modern releases, it correctly reports the error on line 14, not line 12. – Charles Duffy Jun 25 '14 at 02:08
  • That's also true in the ideone sandbox: http://ideone.com/RqdLjZ shows your issue failing to reproduce. – Charles Duffy Jun 25 '14 at 02:10
  • Okay -- I *can* reproduce this on the 3.2.x release shipped with OS X, though it doesn't happen on any bash 4 release I've yet tried. – Charles Duffy Jun 25 '14 at 02:13
  • Thanks, now we're getting somewhere. I had a suspicion that this was related to my older Bash version. Can you think of a workaround? – user108471 Jun 25 '14 at 02:16
  • Trying to find one; I'll add another answer or amend my existing one if I do. – Charles Duffy Jun 25 '14 at 02:18
  • Added it as a separate answer to avoid polluting the answer for folks on modern shells with awful hackery. – Charles Duffy Jun 25 '14 at 02:31

3 Answers

10

For bash releases prior to 4.1, a special level of awful, hacky, performance-killing hell is needed to work around a bug wherein, on errors, $LINENO jumps back to the function's definition line before the error handler is invoked.

#!/bin/bash

set -E
set -o functrace
function handle_error {
    local retval=$?
    local line=${last_lineno:-$1}
    echo "Failed at $line: $BASH_COMMAND"
    echo "Trace: " "$@"
    exit $retval
}
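# Only install the DEBUG-trap hack on bash 3.x and 4.0, which exhibit the bug.
if (( ${BASH_VERSION%%.*} <= 3 )) || [[ ${BASH_VERSION%.*} = 4.0 ]]; then
        # Before each command, shift the recorded line numbers along, so that
        # last_lineno holds the line seen at the previous DEBUG event.
        trap '[[ $FUNCNAME = handle_error ]] || { last_lineno=$real_lineno; real_lineno=$LINENO; }' DEBUG
fi
trap 'handle_error $LINENO ${BASH_LINENO[@]}' ERR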

fail() {
    echo "I expect the next line to be the failing line: $((LINENO + 1))"
    command_that_fails
}

fail
Charles Duffy
  • I see. You're setting a DEBUG trap to save the line number at every single line, then accessing the latest value of that variable within the ERR handler. I'll have a chance to confirm this tomorrow, but it looks to me like it will work. Thank you. – user108471 Jun 25 '14 at 02:38
  • @user108471, not the latest -- the second-to-latest; the *latest* value is that of the function definition point, whereas the one prior to it is the error's actual location (hence the distinction between `real_lineno` and `last_lineno`). Obviously, that's only true on bash versions exhibiting this bug, so it's important to turn the hack on only on bash versions that require it. A more sophisticated version might actually test for the bug and decide whether to enable the behavior based on results. – Charles Duffy Jun 25 '14 at 02:38
  • It would also be helpful if I could find any mention of this bug or the fix, so I could know in what version the behavior was corrected. As it stands, I'm just assuming this was corrected in Bash 4. – user108471 Jun 25 '14 at 02:50
  • Changelog entry j between 4.0->4.1 looks like a strong candidate to me. – Charles Duffy Jun 25 '14 at 03:04
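
The idea raised in the comments above, testing for the bug itself rather than for a version number, could be sketched roughly as follows. This is only an illustration under stated assumptions: `detect_lineno_bug`, `probe`, `check`, and the `lineno_bug` variable are hypothetical names that do not appear in the answer, and the sketch assumes the bug shows up as the ERR trap seeing a `$LINENO` other than the line of the failing command.

```shell
#!/usr/bin/env bash

# detect_lineno_bug is a hypothetical helper, not part of the answer above.
# It sets the global variable lineno_bug to "yes", "no", or "unknown".
# Call it as a plain statement (not inside `if`, `!`, `&&`, or `||`), because
# bash suppresses ERR traps in those contexts and the probe would never fire.
# Call it before installing your own ERR trap and without `set -e` active.
detect_lineno_bug() {
    (
        set -E
        probe() {
            expected=$((LINENO + 1))   # line number of the `false` below
            false
        }
        check() {
            # $1 is the $LINENO value seen by the ERR trap.
            [[ $1 -eq $expected ]] && exit 0   # correct line was reported
            exit 1                             # some other line: bug present
        }
        trap 'check "$LINENO"' ERR
        probe
        exit 2   # the ERR trap never ran, so no conclusion can be drawn
    )
    case $? in
        0) lineno_bug=no ;;
        1) lineno_bug=yes ;;
        *) lineno_bug=unknown ;;
    esac
}

detect_lineno_bug
echo "LINENO bug present: $lineno_bug"
```

With something like this in place, the version comparison guarding the DEBUG trap could be swapped for a check such as `[[ $lineno_bug = yes ]]`. The probe runs in a subshell so that its traps and helper functions do not leak into the calling script.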
5

BASH_LINENO is an array. You can refer to different values in it (${BASH_LINENO[0]}, ${BASH_LINENO[1]}, etc.) to walk back up the call stack. (Its entries correspond to those in the FUNCNAME and BASH_SOURCE arrays, if you want to get fancy and actually print a stack trace: ${BASH_LINENO[i]} is the line in ${BASH_SOURCE[i+1]} from which ${FUNCNAME[i]} was called.)

Even better, though, you can just inject the correct line number in your trap:

failure() {
  local lineno=$1
  echo "Failed at $lineno"
}
trap 'failure ${LINENO}' ERR

You might also find my prior answer at https://stackoverflow.com/a/185900/14122 (with a more complete error-handling example) interesting.
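
The stack-walking idea mentioned at the top of this answer can be sketched as below. This is a minimal illustration, not the linked answer's code: `print_stack`, `inner`, and `outer` are hypothetical names chosen for the example, and only the `FUNCNAME`, `BASH_SOURCE`, and `BASH_LINENO` arrays come from bash itself.

```shell
#!/usr/bin/env bash

# print_stack is a hypothetical helper name used only for this illustration.
print_stack() {
    local i
    # Frame 0 is print_stack itself, so start from its caller at index 1.
    for (( i = 1; i < ${#FUNCNAME[@]}; i++ )); do
        # BASH_LINENO[i-1] is the line in BASH_SOURCE[i] from which
        # FUNCNAME[i-1] was called, i.e. where FUNCNAME[i] currently is.
        printf '  at %s (%s:%s)\n' \
            "${FUNCNAME[i]}" "${BASH_SOURCE[i]}" "${BASH_LINENO[i-1]}"
    done
}

# Two nested functions, just to give the trace something to show.
inner() { print_stack; }
outer() { inner; }

outer
```

Calling `print_stack` from inside an error handler would list each enclosing function with the file and line it had reached. On releases affected by the bug discussed in this question, the line numbers it reports may be off in the same way as `$LINENO`.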

Charles Duffy
-1

That behaviour is very reasonable.

The whole picture of the call stack provides comprehensive information whenever an error occurs. Your example demonstrates a good error message: you can see where the error actually occurred, which line called the function, etc.

If the interpreter/compiler can't precisely indicate where the error actually occurs, you could be more easily confused.

  • I disagree that it's reasonable, because it only tells me which function had the failure, not what line of the function failed. – user108471 Jun 25 '14 at 01:41
  • What you described is how it was designed. You can find this behaviour in almost all interpreters and compilers, though some might provide additional information about the call stack. – Christopher C. S. Ke Jun 25 '14 at 04:43
  • I'm sorry, but you are wrong. I think you misunderstand why the behavior is incorrect here. It turns out that this was a bug that was officially fixed in Bash 4.1. (See item p in the 4.0 to 4.1-alpha change list: http://lists.gnu.org/archive/html/bug-bash/2009-12/msg00139.html). – user108471 Jun 25 '14 at 13:35
  • Yes, you are right. I misunderstood your question. I just tested your example, and the behavior you expected is actually the one I had considered reasonable. Thanks for the clarification. – Christopher C. S. Ke Jun 25 '14 at 14:54