How can I search for patterns in text that span multiple lines and have fixed positions relative to each other? For example, take a pattern consisting of three x characters directly below each other. I want to find it at any position in the line, not just at the beginning. Thank you in advance for the answer!

  • The problem is that I have already written a regular expression that comes close to finding this pattern, for example with re.compile(r"(.*)x(.*)\n(.*)x(.*)\n(.*)x(.*)"). This finds three x characters on three consecutive lines, but their horizontal positions within the lines are not required to be the same (the x's are not necessarily directly below each other). How can I add this constraint? Thanks a lot! – Richard Eisenstein Mar 14 '19 at 00:43
  • You should include your `Input` and your `Desired Output` so people who answer don't have to guess based on sentences – FailSafe Mar 15 '19 at 02:59

1 Answer

I believe the problem you are asking about is "Find patterns that appear at the same offset in a series of lines."

I do not think this describes a regular language, so you would need to draw on Python's extended regex features to have a chance at a regex-based solution. But I do not believe Python supports sufficiently extended features to accomplish this task [1].

If it is acceptable that they occur at a particular offset (rather than "any offset, so long as the offset is consistent"), then something like this should work:

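/^.{OFFSET}PATTERN.*\n^.{OFFSET}PATTERN.*\n^.{OFFSET}PATTERN/, using the MULTILINE flag so that ^ matches at the beginning of every line instead of only at the beginning of the entire text. Here OFFSET is the fixed column number and PATTERN is the text you are looking for (x in your example). The slashes are only delimiters and are not part of the pattern you pass to Python's re module.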
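As a rough illustration of the fixed-offset regex, here is a minimal Python sketch. The helper name build_fixed_offset_regex, its rows parameter, and the sample text are made up for this example. It assumes PATTERN is already a valid regex fragment.

```python
import re

def build_fixed_offset_regex(pattern, offset, rows=3):
    """Compile a regex that matches `pattern` at column `offset`
    on `rows` consecutive lines (hypothetical helper, zero-based offset)."""
    # One line of the match: exactly `offset` characters, then the pattern.
    one_line = r"^.{" + str(offset) + "}" + pattern
    # ".*\n" skips the rest of the current line before the next "^".
    return re.compile(r".*\n".join([one_line] * rows), re.MULTILINE)

text = "abxcd\nefxgh\nijxkl\nmnopq\n"
rx = build_fixed_offset_regex("x", 2)
m = rx.search(text)
print(m.start() if m else None)  # prints 0, the match begins on the first line
```

Note that the offset has to be known in advance. With offset 1 instead of 2, the same text produces no match.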

[1] In particular, you could use a backreference to capture the text preceding the desired pattern on one line, but I do not think you can query the length of the captured content "inline". You could search for the same leading text again on the next line, but that does not sound like what you want.
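If the offset is not known in advance ("any offset, so long as the offset is consistent"), one way around the limitation is to step outside a single regex: use re only to find the columns where the pattern matches on each line, and compare the columns in plain Python. This is a sketch of that alternative, not a pure-regex solution. The function name find_vertical_runs and the sample input are assumptions made for illustration.

```python
import re

def find_vertical_runs(text, pattern, rows=3):
    """Return (line_index, column) pairs where `pattern` matches at the
    same column on `rows` consecutive lines. Indices are zero-based.
    Hypothetical helper, not part of the re module."""
    rx = re.compile(pattern)
    lines = text.split("\n")
    # For every line, collect each column at which the pattern matches.
    # Trying every start position also catches overlapping matches.
    columns = [
        {col for col in range(len(line)) if rx.match(line, col)}
        for line in lines
    ]
    hits = []
    for i in range(len(lines) - rows + 1):
        # Columns shared by this line and the next rows - 1 lines.
        shared = set.intersection(*columns[i:i + rows])
        hits.extend((i, col) for col in sorted(shared))
    return hits

sample = "a x b\nccx d\n  x\nx\n"
print(find_vertical_runs(sample, "x"))  # prints [(0, 2)]
```

The x on the fourth line is ignored because it sits in column 0 while the three above it sit in column 2.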

James Davis