1

Hi, I want to search for something in a file that looks similar to this:

Start Cycle
report 1
report 2
report 3
report 4
End Cycle

... and it goes on and on like this.

I want to search for "Start Cycle" and then pull out report 1 and report 3 from that block. My regex looks something like this:

(Start Cycle .*\n)(.*\n)(.*\n)(.*\n)

The above regex selects Start Cycle and the next three lines, but I want to omit the third line from my result. Is that possible? Or is there an easier Perl script that would do this? I am expecting a result like:

Start Cycle
report 1
report 3
thejartender
FatDaemon

8 Answers

5

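The following code prints the odd-numbered report lines between Start Cycle and End Cycle:

while (<$filehandle>) {
    # The flip-flop is true from the "Start Cycle" line through the "End Cycle" line
    if (/Start Cycle/ .. /End Cycle/) {
        # Print only "report N" lines where N is odd
        print if /report (\d+)/ and $1 % 2;
    }
}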
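
For readers unfamiliar with the `..` operator in scalar context (the flip-flop): it turns true on the line where the left pattern matches and stays true through the line where the right pattern matches. A minimal sketch of that behaviour on its own, where the helper name `lines_in_cycles` and the sample lines are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: keeps only the lines from "Start Cycle" through "End Cycle".
sub lines_in_cycles {
    my @kept;
    for (@_) {
        # Scalar-context ".." is the flip-flop operator: true from the line
        # matching the left pattern through the line matching the right one.
        push @kept, $_ if /Start Cycle/ .. /End Cycle/;
    }
    return @kept;
}

my @lines = ("noise\n", "Start Cycle\n", "report 1\n", "End Cycle\n", "more noise\n");
print lines_in_cycles(@lines);
```

Both marker lines are included in the range, so any filtering of the lines in between (such as the odd-number test) has to happen inside the `if`.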
Ether
2

You can find the text between the start and end markers, then split its contents into lines. Here is an example:

my $text = <<TEXT;
Start Cycle
report 1
report 2
report 3
report 4
End Cycle
TEXT

## find text between all start/end pairs
while ($text =~ m/^Start Cycle$(.*?)^End Cycle$/msg) {
    my $reports_text = $1;
    ## remove leading spaces
    $reports_text =~ s/^\s+//;
    ## split text by newlines
    my @report_parts = split(/\r?\n/m, $reports_text);
    ## print the first and third reports
    print "$_\n" for @report_parts[0, 2];
}
Ivan Nevostruev
2

Perhaps a crazy way to do it: alter Perl's understanding of an input record.

$/ = "End Cycle\n";
print( (/(.+\n)/g)[0,1,3] ) while <$file_handle>;
FMc
1

The regex populates $1, $2, $3 and $4 with the contents of each pair of brackets.

So if you just look at the contents of $1, $2 and $4 you have what you want.

Alternatively you can just leave off the brackets from the third line.

Your regex should look something like

/Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g

In scalar context (for example, as a while condition), the /g lets you evaluate the regex repeatedly and get the next match each time.
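
As a rough sketch of how that loop might look (the sub name `extract_reports` and the two-block sample text are made up for illustration, and the regex assumes every block holds exactly four report lines):

```perl
use strict;
use warnings;

# Hypothetical helper: returns one [first report, third report] pair per block.
sub extract_reports {
    my ($text) = @_;
    my @pairs;
    # In scalar context, /g resumes after the previous match on each iteration.
    while ($text =~ /Start Cycle\n(.+)\n.+\n(.+)\n.+\nEnd Cycle/g) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

my $text = join '',
    "Start Cycle\nreport 1\nreport 2\nreport 3\nreport 4\nEnd Cycle\n",
    "Start Cycle\nreport A\nreport B\nreport C\nreport D\nEnd Cycle\n";

for my $pair (extract_reports($text)) {
    print "Start Cycle\n$pair->[0]\n$pair->[1]\n";
}
```

Because the second and fourth report lines are matched without parentheses, only the wanted lines end up in $1 and $2.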

mollmerx
1

If you wanted to leave all of the surrounding code the same but stop capturing the third thing, you could simply remove the parens that cause that line to be captured:

(Start Cycle .*\n)(.*\n).*\n(.*\n)
hobbs
1

I took the OP's question as a Perl exercise and came up with the following code. It was just written for learning purposes. Kindly correct me if anything looks suspicious.

while (<>) {
    if (/Start Cycle/) {
        push @block, $_;
        push @block, scalar <> for 1..3;
        print @block[0,1,3];
        @block = ();
    }
}

Another version (edited and thanks,@FM):

local $/;
$_ = <>;
@block = (/(Start Cycle\n)(.+\n).+\n(.+\n)/g);
print @block;
Mike
  • Looks good, Mike -- nice use of array slices, slurp mode, and regex in list context. Two minor thoughts. (1) In example #1, if you add `my @block` as the first command inside the loop, you will properly scope the array and can remove `@block = ()`. See this for some detail: http://stackoverflow.com/questions/845060/what-is-the-difference-between-my-and-our-in-perl/990945#990945. (2) Example #2 is somewhat misleading because you don't need the loop at all. If you remove the loop and use `$_ = <>` instead, your code will work the same way and express its behavior more clearly. – FMc Nov 26 '09 at 15:15
  • @FM, Thanks for sharing the thoughts :) I didn't know that a `my` declaration could replace the array-emptying line so naturally here. Thanks for the pointer. As for the second snippet, I agree: since slurp mode is enabled, the while loop is not really a loop. My understanding of the while statement was definitely faulty. – Mike Nov 27 '09 at 04:07
  • wow just looks like there are multiple ways to do it in perl. :) I am still a n00b – FatDaemon Nov 30 '09 at 18:35
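
A sketch of suggestion (1) from the first comment above, assuming every heading is followed by at least three more lines. The array is declared with `my` inside the loop, so each cycle starts with a fresh array and no explicit reset is needed. The helper name `pick_reports` and the in-memory filehandle are only there to keep the example self-contained; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: reads from any filehandle and returns the selected lines.
sub pick_reports {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {
        next unless $line =~ /Start Cycle/;
        my @block = ($line);                     # lexical: a fresh array on every iteration
        push @block, scalar <$fh> for 1 .. 3;    # read the next three lines
        push @out, @block[0, 1, 3];              # heading, first report, third report
    }
    return @out;
}

my $sample = "Start Cycle\nreport 1\nreport 2\nreport 3\nreport 4\nEnd Cycle\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print pick_reports($fh);
close $fh;
```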
0

Update: I did not originally notice that this was just @FM's answer in a slightly more robust and longer form.

#!/usr/bin/perl

use strict; use warnings;

{
    local $/ = "End Cycle\n";
    while ( my $block = <DATA> ) {
        last unless my ($heading) = $block =~ /^(Start Cycle\n)/g;
        print $heading, ($block =~ /([^\n]+\n)/g)[1, 3];
    }
}

__DATA__
Start Cycle
report 1
report 2
report 3
report 4
End Cycle

Output:

Start Cycle
report 1
report 3
Community
Sinan Ünür
0
while (<>) {
    if (/Start Cycle/) {
        print $_;
        $_ = <>;
        print $_;
        $_ = <>; $_ = <>;
        print $_;
    }
}
ghostdog74