
I can't seem to figure out how to write a regex correctly in an if statement. I wanted it to print out all the lines with "End Date" in it.

NUMBERS contains a text file with the following content:

End Date    ABC ABC ABC ABC ABC ABC
05/15/13    2   7   1   1   4   5  
04/16/13    4   3   0   1   3   6  
03/17/13    6   9   3   8   5   9  
02/18/13    8   2   7   1   0   1  
01/19/13    1   9   2   2   5   2  
12/20/12    7   2   7   1   0   1 

Here is a snippet of my code that I am having problems with:

if [ -f $NUMBERS ]
then
        while read line
        do
                if [ $line = ^End ]
                then
                        echo "$line"
                else
                        echo "BROKEN!"
                        break                   
                fi
        done < $NUMBERS
else
        echo "===== $NUMBERS failed to read  ====="
fi

The output is:

BROKEN!

glenn jackman
dalawh
  • You don't need regular expressions for something this simple: `[[ $line == End* ]]` – glenn jackman May 16 '13 at 00:21
  • Does your file actually contain the `>` characters at the start of each line? – glenn jackman May 16 '13 at 00:22
  • @glennjackman I was doing the regex for the ones with dates, but it wasn't working, so I did it for the first line to make sure nothing was going wrong. The actual file does not contain the ">". – dalawh May 16 '13 at 02:07

3 Answers


If you're using bash, try the =~ operator inside [[ ]]:

...
if [[ $line =~ ^End ]]

Note that the following will NOT work, because quoting the pattern makes bash match it as a literal string rather than as a regex:

if [[ "$line" =~ "^End" ]]
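To see the difference between the two forms, here is a minimal sketch, assuming bash 3.2 or later with default settings. The sample line is taken from the question; the variable names `unquoted`, `quoted`, `from_var`, and `re` are made up for illustration.

```shell
#!/usr/bin/env bash
# Compare an unquoted and a quoted right-hand side of =~.
line='End Date    ABC ABC ABC ABC ABC ABC'

if [[ $line =~ ^End ]]; then
    unquoted=match        # unquoted: ^ is a regex anchor
else
    unquoted=no-match
fi

if [[ $line =~ "^End" ]]; then
    quoted=match
else
    quoted=no-match       # quoted: bash looks for the literal text ^End
fi

# Storing the pattern in a variable and expanding it unquoted
# keeps it a regex and sidesteps the quoting question.
re='^End'
if [[ $line =~ $re ]]; then
    from_var=match
else
    from_var=no-match
fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "variable: $from_var"
```

Note that the left-hand side (`$line`) does not need quotes inside `[[ ]]`, since no word splitting happens there.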
datguy

The portable solution is to use case, which supports wildcards (glob wildcards, not actual regular expressions) out of the box. The syntax is slightly freaky, but you get used to it.

while read -r line; do
    case $line in
        End*)
            echo "$line"   # your stuff here
            ;;             # double semicolon closes the branch
    esac
done < "$NUMBERS"
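Since the asker also wanted to pick out the date lines, the same `case` construct can be sketched with a bracket glob. This is only an illustration: `classify` is a hypothetical helper name, and the date pattern assumes the two-digit `MM/DD/YY` layout shown in the question.

```shell
#!/bin/sh
# classify is a hypothetical helper used only for this illustration.
# It prints which kind of line it was given, using glob patterns only.
classify() {
    case $1 in
        End*)
            echo header ;;
        [0-9][0-9]/[0-9][0-9]/[0-9][0-9]*)
            echo date ;;      # two digits, slash, two digits, slash, two digits
        *)
            echo other ;;     # anything else
    esac
}

classify 'End Date    ABC ABC ABC ABC ABC ABC'
classify '05/15/13    2   7   1   1   4   5'
classify 'something else'
```

Patterns are tried top to bottom and the first match wins, so the catch-all `*` branch has to come last.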
tripleee
  • @cnst Indeed, this should be portable all the way back to the original Bourne `sh`. (No guarantees for `csh`, which some *BSD variants still - inexplicably - seem to cherish.) – tripleee Mar 23 '19 at 07:14

You can use the following to check whether the line begins with End:

if [[ "$line" == End* ]]

You can use the following if you want to use a regex:

if [[ "$line" =~ ^End* ]]

http://tldp.org/LDP/abs/html/comparison-ops.html

Also note that it is recommended to always quote your variables.

http://tldp.org/LDP/abs/html/quotingvar.html
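A rough sketch of why the quoting matters with single brackets, using a shortened line from the question. The variable names `unquoted_status` and `quoted_status` are made up for this example.

```shell
#!/bin/sh
line='End Date    ABC'

# Unquoted: $line is split into several words before [ ever sees it,
# so the test receives too many arguments and fails with an error.
[ $line = 'End Date    ABC' ] 2>/dev/null
unquoted_status=$?

# Quoted: [ receives exactly one word on the left-hand side,
# and the string comparison succeeds.
[ "$line" = 'End Date    ABC' ]
quoted_status=$?

echo "unquoted exit status: $unquoted_status"
echo "quoted exit status:   $quoted_status"
```

The unquoted form is also what breaks the `[ $line = ^End ]` test in the question: the error makes the test return a non-zero status, so the `else` branch runs.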

Bill
  • I thought there was no difference between = and ==. Is there? Also, what is the difference from using one [] and two []? – dalawh May 16 '13 at 02:08
  • `if [ "$a" = "$b" ]` -- does a string comparison, but `==` with `[[ ]]` can be used for regex searching. Pls look at the link I gave in the answer. Hope it helps. Copied from that link -- `"The == comparison operator behaves differently within a double-brackets test than within single brackets."` – Bill May 16 '13 at 02:11
  • can you show me your code, it should work...i have tested it. – Bill May 16 '13 at 02:28
  • @dalawh I had a small typo. I updated the answer. Pls use `End*` and see if that works. – Bill May 16 '13 at 02:32
  • `[` is the basic old Bourne `sh` command `test` which does not support regex. The `[[` variant is a Bash (and `ksh` etc) extension with improved robustness around corner cases as well as more capabilities, including `=~` regex matching. The correct string equality operator is `=` though Bash leniently permits `==` as an alias. – tripleee Mar 23 '19 at 07:18
  • The trailing asterisk in `^End*` is wrong; it says you want zero or more occurrences of `d` (and so the regex is equivalent to `^En`). – tripleee Mar 23 '19 at 07:20
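The point about the trailing asterisk can be checked with a small sketch, assuming bash. `matches` is a hypothetical helper, and `Enormous` is an arbitrary test word that starts with `En` but not with `End`.

```shell
#!/usr/bin/env bash
# matches is a hypothetical helper: prints yes if string $1 matches regex $2.
matches() {
    if [[ $1 =~ $2 ]]; then echo yes; else echo no; fi
}

# As a regex, d* means "zero or more d", so ^End* accepts anything starting with En.
r1=$(matches 'Enormous' '^End*')
r2=$(matches 'Enormous' '^En')
r3=$(matches 'Enormous' '^End')

# As a glob with ==, End* means "End followed by anything".
if [[ Enormous == End* ]]; then r4=yes; else r4=no; fi

echo "regex ^End* : $r1"
echo "regex ^En   : $r2"
echo "regex ^End  : $r3"
echo "glob  End*  : $r4"
```

So `End*` is right for `==` (glob matching), while `^End` is the regex to use with `=~`.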