74

I've been trying to read input into environment variables from program output like this:

echo first second | read A B ; echo $A-$B 

And the result is:

-

Both A and B are always empty. I read that bash executes piped commands in a sub-shell, which basically prevents one from piping input to read. However, the following:

echo first second | while read A B ; do echo $A-$B ; done

Seems to work, the result is:

first-second

Can someone please explain the logic here? Is it that the commands inside the while ... done construct are actually executed in the same shell as echo and not in a sub-shell?

codeforester
huoneusto

4 Answers

71

How to do a loop against stdin and get result stored in a variable

Under bash (and other shells too), when you pipe something to another command with |, you implicitly create a fork: a subshell that is a child of the current session and cannot affect the current session's environment.

So this:

TOTAL=0
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 |
  while read A B;do
      ((TOTAL+=A-B))
      printf "%3d - %3d = %4d -> TOTAL= %4d\n" "$A" "$B" "$((A-B))" "$TOTAL"
    done
echo final total: $TOTAL

won't give the expected result:

  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343
echo final total: $TOTAL
final total: 0

The computed TOTAL can't be reused in the main script.

Inverting the fork

By using Process Substitution, Here Documents or Here Strings, you can invert the fork:

Here strings

read A B <<<"first second"
echo $A
first

echo $B
second

Here Documents

while read A B;do
    echo $A-$B
    C=$A-$B
  done << eodoc
first second
third fourth
eodoc
first-second
third-fourth

outside of the loop:

echo : $C
: third-fourth
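
A here document can also carry the output of a command, through command substitution in its body. A minimal sketch, assuming the question's echo first second as the producer; the delimiter EOF is left unquoted so that the body is expanded:

```bash
# The here document body is expanded by the current shell, so $(...) runs
# the producer while read itself stays in the current shell.
read -r A B <<EOF
$(echo first second)
EOF
echo "$A-$B"    # prints: first-second
```

Only the producer runs in a subshell here, so A and B stay set afterwards.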

Here Commands (process substitution)

TOTAL=0
while read A B;do
    ((TOTAL+=A-B))
    printf "%3d - %3d = %4d -> TOTAL= %4d\n" "$A" "$B" "$((A-B))" "$TOTAL"
  done < <(
    printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664
)
  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343

# and finally out of loop:
echo $TOTAL
-343

Now you can use $TOTAL in your main script.

Piping to a command list

But if you only need to work on stdin, you can put a small script inside the fork:

printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 | {
    TOTAL=0
    while read A B;do
        ((TOTAL+=A-B))
        printf "%3d - %3d = %4d -> TOTAL= %4d\n" "$A" "$B" "$((A-B))" "$TOTAL"
    done
    echo "Out of the loop total:" $TOTAL
  }

Will give:

  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343
Out of the loop total: -343

Note: $TOTAL cannot be used in the main script (after the closing curly bracket } ).
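
If the main script does need the value, one possible way (a sketch, not something shown in the examples above) is to let the command list print its result and capture that with command substitution. SUM is a name introduced here for illustration only:

```bash
# The whole pipeline runs inside $(...); only its printed output comes back.
TOTAL=$(
    printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 | {
        SUM=0
        while read -r A B; do
            SUM=$((SUM + A - B))
        done
        echo "$SUM"          # the only thing written to stdout
    }
)
echo "final total: $TOTAL"   # prints: final total: -343
```

This only works when the fork prints nothing but the value to stdout; progress lines like those above would have to go to stderr.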

Using lastpipe bash option

As @CharlesDuffy correctly pointed out, there is a bash option used to change this behaviour. But for this, we have to first disable job control:

shopt -s lastpipe           # Set *lastpipe* option
set +m                      # Disabling job control
TOTAL=0
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 |
  while read A B;do
      ((TOTAL+=A-B))
      printf "%3d - %3d = %4d -> TOTAL= %4d\n" "$A" "$B" "$((A-B))" "$TOTAL"
    done

  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343

echo final total: $TOTAL
final total: -343

This will work, but I (personally) don't like it, because it is not standard and won't help to make the script readable. Also, disabling job control seems an expensive price for this behaviour.

Note: Job control is enabled by default only in interactive sessions. So set +m is not required in normal scripts.

So code that leaves out set +m behaves differently when run in a console than when run in a script. This is not going to make it easy to understand or to debug...
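
The note above can be checked from any shell: bash -c starts a non-interactive bash, so job control is already off and lastpipe takes effect without set +m. A minimal sketch, assuming a bash recent enough to have lastpipe (4.2 or newer) is on the PATH; RESULT is an illustrative name:

```bash
# Run the question's one-liner in a non-interactive bash with lastpipe set.
RESULT=$(bash -c 'shopt -s lastpipe; echo first second | read A B; echo "$A-$B"')
echo "$RESULT"    # prints: first-second
```

Typing the same three commands at an interactive prompt without set +m prints only "-", which is exactly the difference described above.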

F. Hauri
  • 1
    Thank you. Soon after I had posted my question I realized that the `while` loop was still executed in child instead of the parent and that I couldn't use `A` and `B` outside the loop. – huoneusto Dec 07 '12 at 14:57
  • 2
    One note here -- there are shells that support creating subshells for the other components of the pipeline, and using the existing shell for the right-hand side. This is `shopt -s lastpipe` in bash; ksh88 and ksh93 behave similarly out-of-the-box. – Charles Duffy Apr 02 '15 at 22:21
  • @CharlesDuffy Many thanks! I've edited my answer, but I dislike this strange option.... – F. Hauri Apr 06 '15 at 16:45
  • 1
    It might be worth noting that job control is turned off by default in noninteractive shells (thus, in all scripts not explicitly reenabling it). Thus, the limitation you note is not relevant in all situations. (To be sure, though, having code that doesn't work identically between interactive and noninteractive shells is a disadvantage in and of itself). – Charles Duffy Apr 06 '15 at 16:50
  • @CharlesDuffy Note added... But I don't like this anyways! ;-) – F. Hauri Apr 06 '15 at 17:02
  • This answer is monumentally thorough, but it leaves the impression there is still no way to use `read` to capture standard out when you want to put read at the tail of a pipe-like syntax on a single line. This is what the OP was trying to do. You can do this with subshells or grouping, as other answers show. But is there a way to do this both easily and correctly? -- that is, without loops or subshells or groups, and with process substitution? – algal Feb 19 '16 at 18:22
  • @algal see ***Here strings*** and ***Here commands*** paragraphs, in my explanation. `echo blah | read foo` have to be replaced by `read foo < – F. Hauri May 28 '16 at 09:59
  • ... And no, @algal, the op asked about a `while` loop! (look at the title of this page) – F. Hauri May 28 '16 at 10:21
  • @F.Hauri when I read the body of the question itself, not just the title, it seems to me the OP is trying to put read at the end and is wondering why he or she must use while. They're not trying to use while. And they were trying to put read at the end. Maybe I assume too much. – algal May 28 '16 at 17:31
26

A much cleaner workaround:

first="firstvalue"
second="secondvalue"
read -r a b < <(echo "$first $second")
echo "$a $b"

This way, read isn't executed in a sub-shell (which would clear the variables as soon as that sub-shell has ended). Instead, the variables you want to use are echoed in a sub-shell that automatically inherits the variables from the parent shell.

immotus
21

First, this pipe-chain is executed:

echo first second | read A B

then

echo $A-$B

Because the read A B is executed in a subshell, A and B are lost. If you do this:

echo first second | (read A B ; echo $A-$B)

then both read A B and echo $A-$B are executed in the same subshell (see the bash manpage and search for (list)).

pbhd
  • Argh... I was just a bit late... – anishsane Dec 07 '12 at 13:38
  • 3
    btw, you don't need subshell. you can simply use grouping. – anishsane Dec 07 '12 at 13:38
  • @anishsane: That's strange: If you open two console, in first hit `tty` for knowing which pty is used, than on second `watch ps --tty pts/NN` (where NN is in the answer of 1st console). Than in 1st, try: `echo | (sleep 10)` and `echo | { sleep 10 ; }`, you will see that the second syntax will generate two forks while first bounce only one subshell. – F. Hauri Sep 05 '13 at 17:38
  • For anyone else wondering about it, you need to remember spaces and semicolons if you want to use grouping operators instead of creating a subshell: `echo something | { read myvar; echo $myvar; }` – algal Feb 19 '16 at 18:11
  • Thanks, best answer for me. – Eric Duminil Nov 24 '16 at 09:25
1

What you are seeing is the separation between processes: the read occurs in a subshell - a separate process which cannot alter the variables in the main process (where echo commands later occur).

A pipeline (like A | B) implicitly places each component in a sub-shell (a separate process), even for built-ins (like read) that usually run in the context of the shell (in the same process).

The difference in the case of "piping into while" is an illusion. The same rule applies there: the loop is the second half of a pipeline, so it is in a subshell. But the whole loop is in the same subshell, so the separation of processes does not apply inside it.
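
A small sketch of that point, assuming bash with its default options (no lastpipe); LAST is a name introduced here for illustration:

```bash
LAST=unset
echo first second | while read -r A B; do
    LAST="$A-$B"                      # assigned in the loop's subshell
    echo "inside the loop: $LAST"     # prints: inside the loop: first-second
done
echo "after the loop: $LAST"          # prints: after the loop: unset
```

The assignment is visible to everything else inside the loop, and is gone once the subshell running the loop exits.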

  • Syntaxe: `A | B` will create **one** separated process for `B`! Not *each component in a sub-shell*: `A` will remain in main (parent) shell! (except if `shopt -s lastpipe` where `A` will be run in a subshell, **but no `B`** !) – F. Hauri May 28 '16 at 11:02
  • @F.Hauri no, you have that backwards. **A** will always be in a subshell; **B** will be in a subshell _unless_ you use `shopt -s lastpipe` – Martin Kealey May 07 '21 at 07:32
  • `( set x y z ; set a b c | set d e f ; echo "$@" )` x y z `( shopt -s lastpipe ; set x y z ; set a b c | set d e f ; echo "$@" )` d e f – Martin Kealey May 07 '21 at 07:35