
I am new to regex and want to capture runs of capitalized words. Sometimes the capitalized words have special characters between them.

example 1:

string = 'MY MANAGEMENT PRIOR ASSESSMENT / NEW PLANNING SUPRESS RATE  - TEAM : 14'

I want the regex to capture all the capitalized words and the special characters that separate them:

"MY MANAGEMENT PRIOR ASSESSMENT / NEW PLANNING SUPRESS RATE  - TEAM"

example 2:

string2 = 'SPORT/TRACK INFO  ¶·»Sport Coverage(s): All Sport  primary ¶·»WWE Hi-Low:  ¶·»BBC Hi-Low: ¶·»Sports Issues: can run forever ¶·»BBC Sports: kjkj '

I want the regex to capture "SPORT/TRACK INFO", "WWE", "BBC"

Kalana
ryan b

1 Answer


Given that the capitalized words must stand apart from lowercase letters
and can have whitespace, -, /, or \ between them, the regex would be this:

[A-Z](?<![a-zA-Z][A-Z])(?:[A-Z]|[-\s/\\])*(?<=[A-Z])(?![a-zA-Z])

https://regex101.com/r/28FR7s/1
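
To make the pieces easier to follow, here is the same pattern written with re.VERBOSE and a comment on each part. This is only an annotated sketch of the regex above. The name CAPS_RUN and the shortened sample string are made up for illustration.

```python
import re

# Same pattern as above, split across lines with re.VERBOSE so each
# piece can carry a comment. CAPS_RUN is a hypothetical name.
CAPS_RUN = re.compile(r"""
    [A-Z]                  # first capital letter of the run
    (?<![a-zA-Z][A-Z])     # that capital must not directly follow another letter
    (?:[A-Z]|[-\s/\\])*    # more capitals, or separators: hyphen, whitespace, slash, backslash
    (?<=[A-Z])             # the run must end on a capital, which trims trailing separators
    (?![a-zA-Z])           # and the run must not be followed by a letter
""", re.VERBOSE)

sample = "WWE Hi-Low: BBC Sports"  # hypothetical shortened input
matches = CAPS_RUN.findall(sample)
print(matches)  # ['WWE', 'BBC']
```

Note that \s also matches newlines, so with multi-line input a run could span a line break. If that is not wanted, one option is to replace \s with a literal space in the character class.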


Python findall() example code

>>> import re
>>>
>>> string1 = 'MY MANAGEMENT PRIOR ASSESSMENT / NEW PLANNING SUPRESS RATE - TEAM : 14'
>>> string2 = 'SPORT/TRACK INFO ¶·»Sport Coverage(s): All Sport primary ¶·»WWE Hi-Low: ¶·»BBC Hi-Low: ¶·»Sports Issues: can run forever ¶·»BBC Sports: kjkj '
>>>
>>> Rx = r"[A-Z](?<![a-zA-Z][A-Z])(?:[A-Z]|[-\s/\\])*(?<=[A-Z])(?![a-zA-Z])"
>>>
>>> re.findall( Rx, string1 )
['MY MANAGEMENT PRIOR ASSESSMENT / NEW PLANNING SUPRESS RATE - TEAM']
>>> re.findall( Rx, string2 )
['SPORT/TRACK INFO', 'WWE', 'BBC', 'BBC']
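
The question lists "BBC" only once for example 2, while findall() returns every occurrence. If repeats are not wanted, one way (an assumption about the desired result, not part of the regex itself) is to drop duplicates while keeping first-seen order with dict.fromkeys(). The name unique is made up for illustration.

```python
import re

Rx = r"[A-Z](?<![a-zA-Z][A-Z])(?:[A-Z]|[-\s/\\])*(?<=[A-Z])(?![a-zA-Z])"
string2 = 'SPORT/TRACK INFO ¶·»Sport Coverage(s): All Sport primary ¶·»WWE Hi-Low: ¶·»BBC Hi-Low: ¶·»Sports Issues: can run forever ¶·»BBC Sports: kjkj '

# dict keys are unique and keep insertion order (Python 3.7+),
# so this removes repeats without reordering the matches.
unique = list(dict.fromkeys(re.findall(Rx, string2)))
print(unique)  # ['SPORT/TRACK INFO', 'WWE', 'BBC']
```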