
Hello, I am writing a script to extract some lines from a list. I have a list of names, and I want to create a file containing only the information for those people. The full list looks as follows:

Name    Address Age
Steve   72 Andover Dr.  11
Yuri    1133 Nicholas Av.       14
David   110 Arlington Street    11
Andy    38 Devonshire Road      11
Dan     38 Devonshire Road      14
Blaine  59 Village Road 11
Justin  59 Village Road 11
Rob     39 Jackson Drive        15
Russell 39 Jackson Dr   14
Justin  16 Crestwood Rd.        17
Tristan 75 Peak Avenue  11
Marty   101 Captains Walk       16
Jason   73 Minuteman Dr.        13
Joe     73 Minuteman Dr.        11
Philip  61 Prospect Avenue      11
Evan    281 Burnt Plains Road   11

And the list of names is the following:

Andy
Russell
Philip
Evan

I want to produce a new file called newlist.txt that contains only the following information:

Andy    38 Devonshire Road      11
Russell 39 Jackson Dr   14
Philip  61 Prospect Avenue      11
Evan    281 Burnt Plains Road   11

In order to achieve that I tried writing the following script:

#!/bin/bash
while read line
do
        grep $line $2 >> newlist.txt
done <$1

I do get newlist.txt with the information for those people, so the script works. However, I am not sure this is the best approach, or whether I am taking enough precautions. Since I am going to run it on a very long file, I decided to post my code and ask for suggestions. Thanks anyhow.

Cyrus
neo33

2 Answers


With GNU grep, sed and bash:

grep -f <(sed 's/.*/^&[ \t]/' file2) file1

Output:

Andy    38 Devonshire Road      11
Russell 39 Jackson Dr   14
Philip  61 Prospect Avenue      11
Evan    281 Burnt Plains Road   11

See: The Stack Overflow Regular Expressions FAQ
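
A minimal, more portable sketch of the same idea, assuming the names are in `names.txt` and the full list is in `list.txt` (the answer calls them `file2` and `file1`). It swaps `[ \t]` for the POSIX class `[[:blank:]]`, so it should not depend on sed expanding `\t` in the replacement, and it writes the patterns to a temporary file instead of using process substitution. The sample data is a shortened copy of the list in the question. Names containing regex metacharacters would still need escaping, which this sketch does not attempt.

```shell
#!/bin/sh
# Sketch only: names.txt, list.txt and patterns.txt are assumed file names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Small sample taken from the question.
printf '%s\n' 'Name    Address Age' \
  'Andy    38 Devonshire Road      11' \
  'Dan     38 Devonshire Road      14' \
  'Russell 39 Jackson Dr   14' \
  'Philip  61 Prospect Avenue      11' \
  'Evan    281 Burnt Plains Road   11' > list.txt
printf '%s\n' Andy Russell Philip Evan > names.txt

# Turn each name into an anchored pattern such as ^Andy[[:blank:]].
sed 's/.*/^&[[:blank:]]/' names.txt > patterns.txt

# One grep pass over the list; > overwrites, so reruns do not duplicate lines.
grep -f patterns.txt list.txt > newlist.txt
cat newlist.txt
```

Because grep reads the list once, the matches come out in the order they appear in `list.txt`, not in the order of `names.txt`.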

Community
Cyrus

Using awk you can do this without using regex in a single command:

awk 'FNR==NR{seen[$1]; next} $1 in seen' names.txt list.txt

Andy    38 Devonshire Road      11
Russell 39 Jackson Dr   14
Philip  61 Prospect Avenue      11
Evan    281 Burnt Plains Road   11
anubhava
  • Thanks for this approach; it seems awk is a very powerful command. – neo33 Aug 17 '16 at 13:57
  • Yes indeed awk is as powerful as BASH itself (or in some cases more powerful). You can check: [Effective AWK Programming Book Online - GNU](https://www.gnu.org/s/gawk/manual/gawk.pdf) – anubhava Aug 17 '16 at 14:00
  • Just one question: what is the meaning of seen[$1]? – neo33 Aug 17 '16 at 14:01
  • `seen[$1]` populates an associative array named `seen`, using `$1` as the key (`$1` is the first column of the first file in the list, i.e. `names.txt`) – anubhava Aug 17 '16 at 14:02
  • Thanks, I understand. I really appreciate the support. – neo33 Aug 17 '16 at 14:05
  • Hello, one more question: what is the meaning of FNR==NR? This command is really helpful, but I am still unsure about that part. – neo33 Aug 24 '16 at 13:27
  • `NR` is the overall record number and `FNR` is the record number within the current file. So the `FNR==NR` condition makes sure we execute that block only for the first file in the list, i.e. `names.txt` – anubhava Aug 24 '16 at 14:06
  • Thanks, I really appreciate the support. I now have the same problem but with a different separator, "|". Do you think adding -F "|" is enough to get the same result? – neo33 Aug 24 '16 at 14:32
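
A minimal sketch of the pipe-separated case raised in the last comment, assuming a hypothetical `list_pipe.txt` whose first field is still the name. Under that assumption, adding `-F'|'` to the same awk command appears to be enough, because a single character other than a space is used literally as the field separator. The file name and sample rows are invented for illustration.

```shell
#!/bin/sh
# Sketch only: list_pipe.txt and its contents are hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' Andy Evan > names.txt
printf '%s\n' 'Name|Address|Age' \
  'Andy|38 Devonshire Road|11' \
  'Dan|38 Devonshire Road|14' \
  'Evan|281 Burnt Plains Road|11' > list_pipe.txt

# -F sets the field separator for both files; names.txt has no "|",
# so each of its lines is a single field and $1 is still the whole name.
awk -F'|' 'FNR==NR{seen[$1]; next} $1 in seen' names.txt list_pipe.txt > newlist.txt
cat newlist.txt
```

With `|` as the separator, whitespace is no longer trimmed, so a trailing space after a name in either file would become part of `$1` and prevent the match.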