
I have an assignment that asks me to print the number of words that are longer than a given number, say k, which is read from the keyboard, and then to sort the results. So far I have tried it this way:

#!bin/bash
k=0
if [ $# -eq 0 ]
then 
    echo "No argument supplied."
    exit 1
fi
echo -n "Give the minimal lenght of the words : "
read k 
for files in "$@"
do
    if [ -f "$files" ]; then
        echo "$(cat $files | egrep -o '[^ ]{k,}' $files | wc -w) : $files."
    else
        echo "Error: File $files has not been found."
    fi
done | sort -n

My issue is that whenever I run this program with k in the egrep -o '[^ ]{k,}' part, it always gives a wrong answer. But if I replace k with an integer literal, it works exactly as I want. What is the right way to make this code work with k read from the keyboard? I can't work out the syntax; I have also tried "$k", $k, ((k)), and k. Any help or hint is welcome; I am stuck.

mklement0
Wiuwi
    While this is a different command the concept of the question is the same "how do I use shell variables in a string part of a command I am running" so the answer is still the correct one. http://stackoverflow.com/questions/7680504/sed-substitution-with-bash-variables – Etan Reisner Mar 29 '15 at 22:50
    As an additional point `echo $(command)` is almost **never** at all useful. – Etan Reisner Mar 29 '15 at 22:51

1 Answer


Try

echo "$(egrep -o '[^ ]{'"$k"',}' "$files" | wc -w): $files"
  • Your immediate problem was using k instead of $k.
    • To refer to a variable in Bash, you must prefix its name with $ (by contrast, you mustn't use $ when assigning to it). In some cases, to separate the variable name from subsequent tokens, you must enclose the name in {..}, e.g., ${k}, which you may also opt to do for visual clarity, even when it's not strictly required.
  • Your next problem was using single quotes to delimit the egrep search regex (inside the command substitution), which prevents expansion of variable references.
    • Note that even though you were using a double-quoted string overall, embedded command substitutions ($(...)) are their own worlds, in which the usual parsing rules apply (single-quoted strings are literals, whereas double-quoted strings may have embedded variable references, command substitutions, arithmetic expansions).
  • However, simply switching to double quotes (inside which variable references are expanded) may not fix the problem in this case, due to a bug in bash 3.2.57 related to brace expansion ({...}) inside a command substitution:
    • This bug is no longer present in bash 4.3.30 (don't know when it was fixed), so there you could use a single, double-quoted string: "[^ ]{$k,}". However, the solution presented works in both bash 3.x and 4.x
    • Here's a minimal example that demonstrates the bug in bash 3.2.57:
      • k=3; echo "$(egrep -co "[^ ]{$k,}" <<<$'abc\nde')"
      • This should return 1, but returns 0 2, due to mistakenly applying brace expansion to {3,} (resulting in two strings: egrep -co '[^ ]3' and egrep -co '[^ ]'), even though it is contained inside a double-quoted string.
  • Thus, the answer is to splice in the variable reference between two single-quoted strings:
    • '[^ ]{'"$k"',}' concatenates literal [^ ]{ with the value of variable $k and literal ,}. If $k is 4, for instance, egrep then sees the following string: [^ ]{4,}
  • (Also, the initial cat $files | was unnecessary, since the file is also passed as an operand (non-option argument), and you should always double-quote variables containing filenames, so the command won't break with filenames with embedded spaces.)
  • Finally, I suggest renaming $files to $file to avoid confusion: in each iteration of the loop, you're only dealing with a single file.
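
The quoting splice described above can be tried in isolation. The following is a minimal sketch, not part of the original script: the function name count_long_words and the sample text are made up for illustration, and grep -E is used in place of egrep (the two are equivalent; newer GNU grep versions warn about egrep).

```shell
#!/bin/bash
# Sketch only: count_long_words is a hypothetical helper, not from the question.
# It reads text from stdin and counts the words of at least $1 characters.
count_long_words() {
    local k=$1
    # '[^ ]{' and ',}' are single-quoted literals; "$k" is expanded between them,
    # so with k=4 grep receives the regex [^ ]{4,}
    grep -Eo '[^ ]{'"$k"',}' | wc -w
}

k=4
regex='[^ ]{'"$k"',}'
echo "$regex"    # prints [^ ]{4,}

# Assumed sample input: 'three' and 'fourteen' have at least 4 characters, 'one' does not.
sample='one three fourteen'
count=$(printf '%s\n' "$sample" | count_long_words "$k")
echo "words with at least $k characters: $count"
```

Because the braces sit inside single quotes, the shell never sees them as a brace-expansion candidate, which is why this form avoids the bash 3.2.57 problem described above.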
mklement0
  • Hello, and thanks for the fast answer and the detailed explanation; it helped me understand the thought process :). First of all, yes, cat was unnecessary; it was a reflex from the labs, where I wrote only one-line commands. As for egrep -co, I might have an issue: I had also tried egrep -co before posting this question, and it returned only the number of lines that had a word longer than my k. I use the longer version with wc in order to count all words. Also, I'm using 4.3.3, and your other solution works. Thanks :) – Wiuwi Mar 30 '15 at 08:43
  • @Wiuwi: My pleasure; sorry that I messed up the `egrep -co` part - it's fixed now. – mklement0 Mar 30 '15 at 12:44
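
The difference raised in the comments, counting matching lines versus counting matching words, can be seen with a small sketch. The sample text below is an assumption chosen for illustration, and grep -E stands in for egrep.

```shell
#!/bin/bash
# Assumed sample: line 1 has three words of 4+ characters, line 2 has none, line 3 has one.
text=$'alpha beta gamma\nhi\ndelta'
k=4

# -c counts the LINES that contain at least one match.
lines=$(printf '%s\n' "$text" | grep -Ec '[^ ]{'"$k"',}')

# -o prints every match on its own line, so wc -w counts the matching WORDS.
words=$(printf '%s\n' "$text" | grep -Eo '[^ ]{'"$k"',}' | wc -w)

echo "matching lines: $lines"
echo "matching words: $words"
```

For this sample, two lines contain a match but four words match, which is why the pipeline through wc is the one that suits the assignment.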