Python regex: get the substring before and after a certain word that occurs only once

Asked May 22 '21 at 08:15

Active May 22 '21 at 09:48

Viewed 33 times

-1

text = "Kes local: 6,072 (5,452WN, 620BWN) Kes import:3 (1WN, 2BWN)  J kematian:46(45WN)"

I want to get the substring that starts at the word 'import' and ends at the first 'WN' or 'BWN' that follows it, whichever occurs first, and I want only that one match.

I used

re.search(r"(import)(.*)(WN|BWN)", text, re.IGNORECASE).group(0)
re.search(r"(import)(.*)(WN|BWN){0,1}", text, re.IGNORECASE).group(0)
re.search(r"(import)(.*)(?!WN|BWN)", text, re.IGNORECASE).group(0)

# None of these work. Output:
# 'import:3 (1WN, 2BWN)  J kematian:46(45WN'
# 'import:3 (1WN, 2BWN)  J kematian:46(45WN) J kumulatif: 2,040(0.42%)  J kes di ICU:559 Pesakit Intubated:303'

# Output that I want:
# 'import:3 (1WN'

Can anyone explain why my approaches don't work? I would also appreciate a solution.

edited May 22 '21 at 09:48

asked May 22 '21 at 08:15

Kyle

1

Use `(import)(.*?)(WN|BWN)` – anubhava May 22 '21 at 08:17
Huh? That's a bit mind-boggling. It works, but can I actually use * (zero or more occurrences) followed by ? (zero or one occurrence)? I used {0,1} and ?! on (WN|BWN), but those don't work. Mind elaborating a little? [updated question] – Kyle May 22 '21 at 08:31
1

@Kyle See: https://stackoverflow.com/a/5583884/8967612 and https://stackoverflow.com/q/2301285/8967612 – 41686d6564 May 22 '21 at 08:38
`.*?` doesn't mean match 0 or 1, it means match 0 or more, but the minimal number necessary to make the entire regex match. See https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean – Nick May 22 '21 at 09:22
Sorry, that was a typo from my copy-paste. – Kyle May 22 '21 at 09:45
Thanks, all. The answer is in a comment, though, so I can't mark it as accepted. – Kyle May 22 '21 at 09:47
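
To make the comments above concrete, here is a minimal sketch that runs the three patterns from the question next to the lazy pattern suggested in the first comment, using the sample text exactly as posted. The variable names (`greedy`, `lazy`, `optional`, `lookahead`) are only illustrative and do not come from the question.

```python
import re

# Sample text exactly as posted in the question.
text = "Kes local: 6,072 (5,452WN, 620BWN) Kes import:3 (1WN, 2BWN)  J kematian:46(45WN)"

# Greedy: .* first consumes everything to the end of the line, then backtracks
# just far enough for (WN|BWN) to match, which lands on the LAST "WN".
greedy = re.search(r"(import)(.*)(WN|BWN)", text, re.IGNORECASE).group(0)

# Lazy: .*? grows one character at a time, so the match ends at the FIRST
# "WN" or "BWN" after "import".
lazy = re.search(r"(import)(.*?)(WN|BWN)", text, re.IGNORECASE).group(0)

# Optional group: {0,1} lets (WN|BWN) match nothing, so .* keeps the whole rest.
optional = re.search(r"(import)(.*)(WN|BWN){0,1}", text, re.IGNORECASE).group(0)

# Negative lookahead: it only asserts that "WN"/"BWN" does not follow the point
# where .* stopped, which is trivially true at the end of the string.
lookahead = re.search(r"(import)(.*)(?!WN|BWN)", text, re.IGNORECASE).group(0)

print(greedy)     # import:3 (1WN, 2BWN)  J kematian:46(45WN
print(lazy)       # import:3 (1WN
print(optional)   # everything from "import" to the end of the text
print(lookahead)  # same as optional
```

In short, `{0,1}` and `(?!...)` both leave the greedy `.*` free to run to the end of the string, while `.*?` is what makes the match stop at the first terminator.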


0 Answers