0

I have a text file in which some rows contain data like the sample below. I want to change the rows matching these patterns by adding more spaces at the beginning (these rows currently have 14 leading spaces, and I want to make that 34).

I cannot simply replace 14 spaces with 34, because other rows also start with 14 spaces but do not match the pattern below.

          9
          1P
          PKC
          ABC1
          1BC1C
          ZBC12X
          A4C12XZ
          H4C12XZQ
          94C12XZQQ
          Q4C12XZQQT
          A4C12XZQQTY

I am using 11 separate patterns, one per length, to search for these 11 cases, as below:

[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$
[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$

Below is the Python code I'm using.

import re

input_file = open("1.txt", "r")

for line in input_file:
    if re.match('[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$', line):
        print (line)

I need guidance on how to replace the text to get the output below. I can use `re.sub`, but how can I add only the spaces without replacing the other characters in the original line, which should be used only for matching the pattern?

Output:

                              9
                              1P
                              PKC
                              ABC1
                              1BC1C
                              ZBC12X
                              A4C12XZ
                              H4C12XZQ
                              94C12XZQQ
                              Q4C12XZQQT
                              A4C12XZQQTY
Rahul
  • 2
    @toolic While we're at it, we can ditch the square brackets: ` {14}`. @Rahul it looks like you're new to regex and confused about what some of the common special characters represent. I'd recommend going to regex101.com and experimenting, or checking out the answer to this question for a general regex reference: https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean – CAustin Nov 13 '19 at 23:14
  • 2
    You could capture the 14 spaces and 1-11 repetitions of the character class `[A-Z0-9]` in two groups, then replace with both groups, e.g. `\1extraspaceshere\2` for the pattern `^( {14})([0-9A-Z]{1,11})`; see https://regex101.com/r/DG3FNF/1 – The fourth bird Nov 13 '19 at 23:14
  • 1
    @The fourth bird: Thanks, this sounds helpful. I'm going through this link and will try to incorporate the changes. I'll post whether it works or if I hit any issue. – Rahul Nov 13 '19 at 23:19
  • @CAustin: Thank you for sharing the link. I'm new indeed, but I will go through this link and make changes for my requirement. – Rahul Nov 13 '19 at 23:21
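
A minimal sketch of the two-group approach suggested in the comments above, assuming the rows to change start with exactly 14 spaces followed by 1 to 11 uppercase letters or digits. The names `PATTERN`, `EXTRA`, and `widen_indent` are hypothetical, and the comma is left out of the character class because inside `[...]` it would match a literal comma.

```python
import re

# Group 1: the 14 leading spaces. Group 2: 1-11 uppercase letters or digits.
# The quantifiers replace the eleven hand-written patterns.
PATTERN = re.compile(r"^( {14})([0-9A-Z]{1,11})$")

# 14 existing spaces + 20 added spaces = 34 leading spaces.
EXTRA = " " * 20


def widen_indent(line):
    """Return the line with 20 extra leading spaces if it matches PATTERN.

    Lines that do not match are returned unchanged.
    """
    # Rebuild the matched text from both groups with the extra spaces between.
    return PATTERN.sub(lambda m: m.group(1) + EXTRA + m.group(2), line)
```

Because `$` also matches just before a trailing newline, a line read from a file keeps its line ending after the substitution.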

2 Answers

1

I suggest the following RegEx: https://regex101.com/r/6crgHK/1

Then, your substitution pattern would be:

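import re

# Leading whitespace, then uppercase letters/digits up to the end of the line.
pattern = re.compile(r'\s+([\dA-Z]+)$')

# The with block closes the file once the loop finishes.
with open("1.txt", "r") as input_file:
    for line in input_file:
        if re.match(pattern, line):
            # Keep the captured characters (\1); only the indentation changes.
            line = re.sub(pattern, r'34spaces\1', line)
            # line still ends with its own newline, so don't print a second one.
            print(line, end="")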

Of course, you'll need to replace the `34spaces` part with 34 actual spaces. :)
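
As a hedged illustration of that last step, the 34 spaces can be built with string repetition instead of being typed out. This sketch assumes the same pattern as above; the names `replacement` and `reindent` are hypothetical.

```python
import re

pattern = re.compile(r"\s+([\dA-Z]+)$")

# 34 spaces followed by a backreference to the first capture group.
replacement = " " * 34 + r"\1"


def reindent(line):
    """Return the line re-indented to 34 spaces if it matches, else unchanged."""
    # match() anchors at the start, so only lines that begin with whitespace
    # and contain nothing but uppercase letters/digits after it are rewritten.
    if pattern.match(line):
        return pattern.sub(replacement, line)
    return line
```

The `match` check matters here: `sub` on its own searches anywhere in the line, so it would also rewrite the last word of an ordinary sentence that happens to end in uppercase characters.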

jayg_code
  • @Jayg_Code: Thank you for the answer, I'm checking this one. I'll post an update in a few minutes. – Rahul Nov 13 '19 at 23:26
  • @CAustin: Sure, I will try to use the multiplier as well. – Rahul Nov 13 '19 at 23:27
  • 1
    @Rahul, I'd like to add that, since you're just after the characters after the space, you're only really after the first capture group (i.e., what's in between the parentheses). I've edited my answer to reflect that. – jayg_code Nov 13 '19 at 23:33
  • Thank you, I see the updates. The code is searching and matching the string, but it still does not add the spaces. I get the same output as the input when I print the line like this: `print (line)`, then `re.sub(pattern, r'34ActualSpacesEntered\1', line)`, then `print (line)`. – Rahul Nov 13 '19 at 23:38
  • 1
    I've edited my answer again so you replace the value of the `line` variable with `re.sub`'s output. – jayg_code Nov 13 '19 at 23:44
  • Ohh, found my mistake: I was not storing the result in `line`. Thanks a lot :) Come to Mumbai and I'll give you a party :P One last question: when I write this output to another file using `with open('2.txt', 'w') as f1: f1.write(line2 + "\n")`, it only stores the last line. Any clue? – Rahul Nov 13 '19 at 23:54
  • You do need to concatenate the lines back together once you've edited them. My suggestion is to save all the lines in a list first (i.e., `revised_lines = []`), then at the print step do `revised_lines.append(line)` instead. Only after you've processed all the lines should you concatenate the output (i.e., `"\n".join(revised_lines)`) and save it to a new file. – jayg_code Nov 14 '19 at 00:05
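
The list-then-write suggestion from the last comment could be sketched roughly as follows. The function name `rewrite_file` is hypothetical; the pattern is the one from the answer, and the file names in the usage note are the ones mentioned in the thread. Lines read from a file keep their trailing newline, so this sketch uses `writelines` rather than joining with `"\n"`, which would double the line breaks.

```python
import re

pattern = re.compile(r"\s+([\dA-Z]+)$")
replacement = " " * 34 + r"\1"


def rewrite_file(src_path, dst_path):
    """Copy src_path to dst_path, re-indenting every matching line."""
    revised_lines = []
    with open(src_path, "r") as src:
        for line in src:
            if pattern.match(line):
                line = pattern.sub(replacement, line)
            # Collect every line, changed or not, so nothing is lost.
            revised_lines.append(line)
    # Open the output once, after the loop; reopening it in "w" mode
    # on every iteration would leave only the last line in the file.
    with open(dst_path, "w") as dst:
        dst.writelines(revised_lines)
```

Calling `rewrite_file("1.txt", "2.txt")` would then produce the new file in one pass.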
0
> import re
>
> input_file = open("1.txt", "r")
> space = ' ' * 14
> for line in input_file:
>     if re.match('[ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][ ][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z][0-9,A-Z]$', line):
>         print(space + line)

This might work! It prepends 14 spaces to every line that matches your pattern, so each matching line is printed with 14 extra leading spaces.

  • Hey Abhishek, thanks for the answer, but I need to actually store the string instead of printing it, preferably in a new text file. – Rahul Nov 13 '19 at 23:59