
I want to extract, using Perl, xxx (the string after Block:) and prod (the string after Milestone:). The strings after Block: and Milestone:, and the number of spaces around them, vary from line to line. So far I have only been able to capture the label, the separators, and the first value with the code below:

use strict;
use warnings;

my $file = 'xxx.txt';
open my $fh, '<', $file or die "Could not open '$file' $!\n";
while (my $line = <$fh>){
    chomp $line;
    # my @stage_status = $line =~ /(\:.*)\s*$/;
    my @stage_status = $line =~ /\b(Block)(\W+)(\w+)/;
    foreach my $stage_statuss (@stage_status){
        print "$stage_statuss\n";
    }
}

Example of line in a file:

| Block:                   | xxx | Milestone:           | prod        |
Blurman
    As per comments below OP is attempting this code in Perl/Grep and needs regex help therefore it is relevant tag here. – anubhava Feb 01 '21 at 14:49

2 Answers

Using GNU grep (the -P option needs PCRE support) you can do:

grep -oP '\b(Block|Milestone)\W+\K\w+' file

xxx
prod

RegEx Details:

  • \b: Word boundary
  • (Block|Milestone): Match Block or Milestone
  • \W+: Match 1+ non-word characters
  • \K: Reset the start of the reported match, discarding everything matched so far
  • \w+: Match 1+ word characters
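
To see what \K contributes, here is a minimal sketch that runs the same pattern with and without it. The helper names match_with_label and match_value_only are made up for this illustration, and it assumes GNU grep built with PCRE support.

```shell
# Hypothetical helpers; both read the table rows from standard input.
# Assumes GNU grep with PCRE support (-P).
match_with_label() { grep -oP '\b(Block|Milestone)\W+\w+'; }
match_value_only() { grep -oP '\b(Block|Milestone)\W+\K\w+'; }

line='| Block:                   | xxx | Milestone:           | prod        |'

# Without \K, -o prints the label and the separators along with the value.
printf '%s\n' "$line" | match_with_label

# With \K, only the part matched after \K is printed.
printf '%s\n' "$line" | match_value_only
```

The first call prints each label together with its separators and value; the second prints only xxx and prod, which is why \K is used here in place of a capture group.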

Update:

Suggested Perl code as per OP's edited question:

use strict;
use warnings;

my $file = 'xxx.txt';
open my $fh, '<', $file or die "Could not open '$file' $!\n";

while (my $line = <$fh>){
    chomp $line;
    print "checking: $line\n";
    my @stage_status = $line =~ /\b(?:Block|Milestone)\W+(\w+)/g;
    
    foreach my $stage_statuss (@stage_status){
       print "$stage_statuss\n";
    }
}

Output:

checking: | Block:                   | xxx | Milestone:           | prod        |
xxx
prod
anubhava

You could do this with awk by choosing a suitable field separator. Set the field separator to either a pipe followed by whitespace, or a run of whitespace on its own. Then, in the main program, print the 4th field when the 2nd field is Block: and print the 8th field when the 6th field is Milestone:.

awk -F'\\|[[:space:]]+|[[:space:]]+' '$2=="Block:"{print $4} $6=="Milestone:"{print $8}' Input_file


2nd solution: Almost the same as the 1st solution above, except that it uses a single field-separator pattern (optional whitespace, a pipe, then whitespace or end of line), so the values land in the 3rd and 5th fields.

awk -F'([[:space:]]+)?\\|([[:space:]]+|$)' '$2=="Block:"{print $3} $4=="Milestone:"{print $5}' Input_file
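
If the field numbers look arbitrary, a quick way to check them is to dump every field awk produces for each separator. This is only a sketch: show_fields is a made-up helper name, and it assumes an awk that supports POSIX character classes such as [[:space:]] (gawk does).

```shell
# Hypothetical helper: print every field awk produces for a given separator.
# $1 is the field-separator regex; rows are read from standard input.
show_fields() {
    awk -F"$1" '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

line='| Block:                   | xxx | Milestone:           | prod        |'

# Separator from the 1st solution: the values land in fields 4 and 8.
printf '%s\n' "$line" | show_fields '\\|[[:space:]]+|[[:space:]]+'

# Separator from the 2nd solution: the values land in fields 3 and 5.
printf '%s\n' "$line" | show_fields '([[:space:]]+)?\\|([[:space:]]+|$)'
```

With the 1st separator, the whitespace before each pipe and the pipe itself are matched as two separate separators, which leaves an empty field between them; that is why the values sit at positions 4 and 8 there and at 3 and 5 with the single pattern.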
RavinderSingh13