
I'm currently debugging a shell script which acts as a master script in a data pipeline. To run the pipeline, you feed a bunch of arguments into the shell script. From there, the shell script sequentially calls 6 different scripts [4 in R, 2 in Python], writes output to log files, and so on. Basically, my idea is to use this script to automate a data pipeline that takes a long time to run.

Right now, if any of the individual R or Python scripts breaks within the shell script, it just jumps to the next script it's supposed to call. However, running script 03.py requires scripts 01.R and 02.R to have fully run and processed their data; otherwise 03 will produce erroneous output, which will then be written out and further processed in later scripts.

What I want to do is:

1. Break the overall shell script if there's an error in any of the R scripts.
2. Output a message telling me where this error happened [line of the individual R/Python script].

Here's a sample of the master.sh shell script which calls the individual scripts.

#############
# STEP 2 : RUNNING SCRIPTS 
#############

# A - 01.R 
#################################################################

# log_file - this needs to be reassigned for every individual script
log_file=01.log
current_time=$(date)
echo "Current time: $current_time"

echo "Now running script 01. Log file output being written to $log_file_dir$log_file."
Rscript 01.R -f "$input_file" -s "$sql_db" > "${log_file_dir}${log_file}"

# current time/date
current_time=$(date)
echo "Current time: $current_time"

# B - 02.R 
#################################################################

log_file=02.log
current_time=$(date)
echo "Current time: $current_time"

echo "Now running script 02. Log file output being written to $log_file_dir$log_file"

Rscript 02.R -f "$input_file" -s "$sql_db" > "${log_file_dir}${log_file}"

# PRINT OUT TIMINGS
current_time=$(date)
echo "Current time: $current_time"

This sequence is repeated throughout the master.sh script until script 06.R, after which it collates some data retrieved from output files and log files and prints it to stdout.

Here's some sample output that gets printed by my current master.sh, which shows how the script just keeps moving even though 01.R has produced an error.

file: test-data/minisample.txt
There are a total of 101 elements in file.
Using the main database.
Writing log-files to this directory: log_files/minisample/.
Writing output-csv with classifications to output/minisample.csv.
Current time: Wed Nov 14 18:19:53 UTC 2018
Now running script 01. Log file output being written to log_files/minisample/01.log.
Loading required package: stringi
Loading required package: dplyr

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: RMySQL
Loading required package: DBI
Loading required package: methods
Loading required package: hms
Error: The following 2 arguments need to be provided:
  -f <input file>.csv
  -s <MySQL db name>
Execution halted
Current time: Wed Nov 14 18:19:54 UTC 2018
./master.sh: line 95: -1: substring expression < 0
./master.sh: line 100: -1: substring expression < 0
./master.sh: line 104: -1: substring expression < 0
Total time taken to run script 01.R:
Average time taken per user to run script 01.R:
Total time taken to run pipeline so far [01/06]:
Average time taken per user to run pipeline so far [01/06]:
Current time: Wed Nov 14 18:19:54 UTC 2018
Now running script 02. Log file output being written to log_files/minisample/02.log

Seeing as the R script 01.R produces an error, I want the script master.sh to stop. But how? Any help would be greatly appreciated, thanks in advance!

nikUoM

2 Answers


As another user mentioned, simply running set -e will make your script terminate on the first error. However, if you want more control, you can also check the exit status with ${?} (or simply $?), assuming your program gives an exit code of 0 on success and non-zero otherwise. For example, with wget:

#!/bin/bash
url=https://nosuchaddress1234.com/nosuchpage.html
error_file=errorFile.txt
wget ${url} 2> ${error_file}    # wget's error output goes to the error file
exit_status=${?}                # capture the exit code before anything else overwrites $?
if [ ${exit_status} -ne 0 ]; then
    echo -n "wget ${url} "
    if [ ${exit_status} -eq 4 ]; then
        echo "- Network failure."
    elif [ ${exit_status} -eq 8 ]; then
        echo "- Server issued an error response."
    else
        echo "- Other error"
    fi
    echo "See ${error_file} for more details"
    exit ${exit_status}
fi
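
Applied to the Rscript calls in the question's master.sh, a sketch of the same exit-status pattern might look like the following. The run_step helper is hypothetical, and $input_file, $sql_db, and $log_file_dir are assumed to be set earlier in the script, as in the question:

#!/bin/bash

# Hypothetical helper: run one pipeline step, write its output to a log,
# and abort the whole pipeline if the step exits non-zero.
run_step() {
    local script=$1
    local log_file=$2
    echo "Now running ${script}. Log file output being written to ${log_file_dir}${log_file}."
    Rscript "$script" -f "$input_file" -s "$sql_db" > "${log_file_dir}${log_file}" 2>&1
    local exit_status=$?
    if [ "${exit_status}" -ne 0 ]; then
        echo "ERROR: ${script} exited with status ${exit_status}." >&2
        echo "See ${log_file_dir}${log_file} for details." >&2
        exit "${exit_status}"
    fi
}

run_step 01.R 01.log
run_step 02.R 02.log
# ...repeat for the remaining steps; the Python ones would call python instead of Rscript

Redirecting with 2>&1 also sends R's error messages into the log file, so the failure details end up next to that step's normal output. If you don't need the custom message, set -e at the top of master.sh gives you the same abort-on-first-error behaviour with less code.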
PageFault

I like to put some boilerplate at the top of most scripts like this -

# on any unhandled command failure, report the script name and line, then abort
trap 'echo >&2 "ERROR in $0 at line $LINENO, Aborting"; exit $LINENO;' ERR
set -u    # treat expansion of unset variables as an error

While coding and debugging, I usually add

set -x

And a lot of trace "comments" with colons -

: this will parse its args but only show under set -x

Then the trick is to make sure any errors you know are OK get handled explicitly. A failure that's tested by a conditional (if, ||, &&) doesn't trigger the ERR trap, so those are safe.

if grep foo nonexistentfile
then : do the success stuff
else : if you *want* a failout here, just call false
     false here will abort # args don't matter :)
fi

By the same token, if you just want to catch and ignore a known possible error -

ls $mightNotExist ||: # || says "do on fail"; : is a shell built-in that does nothing and returns true

Just always check your likely errors. Then the only thing that will crash your script is a genuinely unexpected failure.
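
Put together for the master.sh in the question, a minimal sketch might look like the following ($input_file, $sql_db, and $log_file_dir are assumed to be set earlier in the script, as in the question):

#!/bin/bash
# Report the master.sh line of the first unhandled failure, then abort.
trap 'echo >&2 "ERROR in $0 at line $LINENO, Aborting"; exit 1;' ERR
set -u    # referencing an unset variable is also an error

echo "Now running script 01. Log file output being written to ${log_file_dir}01.log."
Rscript 01.R -f "$input_file" -s "$sql_db" > "${log_file_dir}01.log"

# Only reached if 01.R exited 0; a failure above fires the ERR trap instead.
echo "Now running script 02. Log file output being written to ${log_file_dir}02.log."
Rscript 02.R -f "$input_file" -s "$sql_db" > "${log_file_dir}02.log"

Each Rscript call is a plain top-level command here (not part of a conditional), so any non-zero exit fires the trap and stops the pipeline before the next step runs.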

Paul Hodges