Regular expression for a list of string objects

Question

I have the following list:

list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology','*ROS Sami']

My code is as follows:

import re
[item2 for item in list12 for item2 in item.split() if not re.match("^[*A-Z]+(0-9){4}$", item2)]

I get this output:

['First', 'Company', 'limited', 'Apple', 'Technology', 'Sami']

I expect the output to be:

['SOPL', 'ZTE', 'First', 'Company', 'limited', 'Apple', 'Technology', 'ROS', 'Sami']

I am not good with regular expressions. How can I get the output I need?

You may use `if not re.match(r"\**[A-Z]+[0-9]{4}$", item2)`. Note it will output `'*ROS'`, not `'ROS'`. See [demo](https://ideone.com/K4hYaN). However, the same output can be achieved if you replace `(0-9)` with `[0-9]` in your pattern. — Wiktor Stribiżew, Feb 19 '19 at 15:24
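The pattern suggested in the comment above can be tried as in the following minimal sketch. The names `code_pattern`, `kept`, and `stripped` are illustrative only, and the final `lstrip('*')` step is an assumption added here to turn `'*ROS'` into `'ROS'`; it is not part of the comment's suggestion.

```python
import re

list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology', '*ROS Sami']

# Optional asterisks, then uppercase letters, then exactly four digits.
# Note the character class [0-9]; (0-9) would match the literal text "0-9".
code_pattern = re.compile(r"\**[A-Z]+[0-9]{4}$")

# Keep every whitespace-separated token that is NOT a code such as '**FIRS0425'.
kept = [tok for item in list12 for tok in item.split() if not code_pattern.match(tok)]

# Assumed extra step: drop leading asterisks from the tokens that remain.
stripped = [tok.lstrip('*') for tok in kept]

print(kept)
print(stripped)
```

The first list still contains `'*ROS'`, as the comment notes; the second list has the asterisk removed.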

score 0 · Answer 1 · answered Feb 19 '19 at 15:28

Seems you're looking for

\b([A-Za-z]+)\b

In Python:

import re
list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology','*ROS Sami']

rx = re.compile(r'\b([A-Za-z]+)\b')
result = [word for item in list12 for word in rx.findall(item)]
print(result)

Which yields

['SOPL', 'ZTE', 'First', 'Company', 'limited', 'Apple', 'Technology', 'ROS', 'Sami']

answered Feb 19 '19 at 15:28

Jan

This solution is similar, but I have a huge text, and your code has two for loops that take up most of my run time. – Vas Feb 19 '19 at 15:49
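One possible response to the concern about the two `for` clauses, offered only as a sketch: join the strings once and call `findall` a single time on the joined text. This assumes it is acceptable to treat the whole list as one space-separated string. Whether it is actually faster on a large input has not been measured here, so it is worth timing on the real data.

```python
import re

list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology', '*ROS Sami']

# Same idea as the answer's pattern: runs of letters bounded by word boundaries.
# 'FIRS' in '**FIRS0425' is skipped because there is no boundary between 'S' and '0'.
rx = re.compile(r'\b[A-Za-z]+\b')

# Join with a space so words from adjacent items cannot merge, then scan once.
result = rx.findall(' '.join(list12))
print(result)
```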

score 0 · Accepted Answer · answered Feb 19 '19 at 15:28

A non-regex way in Python:

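list12 = ['**FIRS0425 SOPL ZTE First Company limited', 'Apple Technology','*ROS Sami']
# Join everything into one string, then split on whitespace to get every token.
# (Named `joined` rather than `str` to avoid shadowing the built-in type.)
joined = " ".join(list12)
list21 = joined.split()
# Skip tokens containing '**' and strip any leading/trailing '*' from the rest.
res = [k.strip('*') for k in list21 if '**' not in k]
print(res)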

Output:

['SOPL', 'ZTE', 'First', 'Company', 'limited', 'Apple', 'Technology', 'ROS', 'Sami']

DEMO: http://tpcg.io/s9aBhe

answered Feb 19 '19 at 15:28

Always Sunny

Any specific reason for the downvote? – Always Sunny Feb 20 '19 at 09:09
