1

My problem is that I need to find multiple elements in one string.

For example, I have a string that looks like this:

line = 'if ((var.equals("INPUT")) || (var.equals("OUTPUT"))'

and I have this code to find everything between (" and "):

char1 = '("'
char2 = '")'


add = line[line.find(char1)+2 : line.find(char2)]
list.append(add)

The current result is just:

['INPUT']

but I need the result to look like this:

['INPUT','OUTPUT', ...]

After it finds the first match, it stops searching for other matches, but I need to find everything in that string that matches this search.

I also need to append every single match to the list.

Cœur
M4I3X

4 Answers

5

The simplest:

>>> import re
>>> s = """line = if ((var.equals("INPUT")) || (var.equals("OUTPUT"))"""
>>> r = re.compile(r'\("(.*?)"\)')
>>> r.findall(s)
['INPUT', 'OUTPUT']

The trick is to use .*?, the non-greedy form of .*, so that each match stops at the first ") it reaches instead of the last one.
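To see why the non-greedy form matters, here is a small comparison sketch on the same input. The variable names `greedy` and `lazy` are only for illustration.

```python
import re

s = 'if ((var.equals("INPUT")) || (var.equals("OUTPUT"))'

# Greedy: .* runs to the end of the string, then backtracks to the LAST '")'
greedy = re.findall(r'\("(.*)"\)', s)

# Non-greedy: .*? expands one character at a time and stops at the FIRST '")'
lazy = re.findall(r'\("(.*?)"\)', s)

print(greedy)  # ['INPUT")) || (var.equals("OUTPUT']
print(lazy)    # ['INPUT', 'OUTPUT']
```

The greedy pattern returns a single match that swallows everything between the first (" and the last "), which is usually not what is wanted here.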

Samuel GIFFARD
  • This is the way to do it. +1 – Ma0 Nov 23 '18 at 08:44
  • Thanks, I'm using this now. – M4I3X Nov 23 '18 at 09:06
  • One more thing: how can I get the result of findall into a variable? If I try `result = r.findall(s)` I get this error: `TypeError: expected string or bytes-like object` – M4I3X Nov 23 '18 at 09:15
  • Your code should work. It looks more like an encoding issue. Are you on Python 2 or Python 3? Check the type of your input with something like `print(my_input.__class__)` – Samuel GIFFARD Nov 23 '18 at 09:22
  • I solved it: I was reading lines from a file but forgot to assign them to a variable ^^ – M4I3X Nov 23 '18 at 09:32
1

You should look into regular expressions because that's a perfect fit for what you're trying to achieve.

Let's examine a regular expression that does what you want:

import re
regex = re.compile(r'\("([^"]+)"\)')

It matches the string (" then captures anything that isn't a quotation mark and then matches ") at the end.

By using it with findall you will get all the captured groups:

In [1]: import re

In [2]: regex = re.compile(r'\("([^"]+)"\)')

In [3]: line = 'if ((var.equals("INPUT")) || (var.equals("OUTPUT"))'

In [4]: regex.findall(line)
Out[4]: ['INPUT', 'OUTPUT']
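
As a rough illustration of what "anything that isn't a quotation mark" implies, the sketch below compares this pattern with a non-greedy `.*?` pattern on text that contains an embedded quotation mark. The sample string `s` is made up for the comparison and is not from the question.

```python
import re

# Hypothetical input: the first argument itself contains quotation marks
s = 'log("say "hi" now") end("x")'

# Negated class: the capture may not contain '"', so the first call is skipped
no_quotes = re.findall(r'\("([^"]+)"\)', s)

# Non-greedy dot: the capture may contain '"' and runs up to the first '")'
lazy = re.findall(r'\("(.*?)"\)', s)

print(no_quotes)  # ['x']
print(lazy)       # ['say "hi" now', 'x']
```

Which behaviour is preferable depends on whether embedded quotation marks should count as part of a match.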
Raniz
  • NB: Will not work if there's a `"` in the string that he wants to find. Non-greedy star operator `*?` is the clean way to go there. – Samuel GIFFARD Nov 23 '18 at 08:48
  • That was intentional though – Raniz Nov 23 '18 at 08:49
  • Regular expressions are not generally a good fit for [bracket matching](https://stackoverflow.com/a/546457/4050925). For this particular task it works, but it can break easily (or get messy) if you do something slightly more complicated. Just a heads-up to the OP – Robin Nemeth Nov 23 '18 at 09:24
0

If you don't want to use regex, this will help you.

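line = 'if ((var.equals("INPUT")) || (var.equals("OUTPUT"))'
char1 = '("'
char2 = '")'
results = []  # named `results` so the built-in `list` is not shadowed

# First match: the text between the first '("' and the first '")'
add = line[line.find(char1)+2 : line.find(char2)]
results.append(add)
# Skip past the first closing quote and search the remainder the same way
line1 = line[line.find(char2)+1:]
add = line1[line1.find(char1)+2 : line1.find(char2)]
results.append(add)
print(results)    # ['INPUT', 'OUTPUT']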

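Just add the last 3 lines before the print to your code and you're done. Note that this handles exactly two matches; each additional match needs the same steps repeated.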
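
The repeated find-and-slice steps above can be folded into a loop that handles any number of matches. This is a minimal sketch, not part of the original answer; the function name `find_between` and the variable `pos` are made up here, and it relies on the optional start argument of `str.find`.

```python
def find_between(line, char1='("', char2='")'):
    """Return every substring of `line` found between char1 and char2."""
    matches = []
    pos = 0  # where the next search begins
    while True:
        start = line.find(char1, pos)
        if start == -1:          # no further opening delimiter
            break
        start += len(char1)      # move past the opening delimiter
        end = line.find(char2, start)
        if end == -1:            # opening delimiter without a closing one
            break
        matches.append(line[start:end])
        pos = end + len(char2)   # continue after the closing delimiter
    return matches


line = 'if ((var.equals("INPUT")) || (var.equals("OUTPUT"))'
print(find_between(line))  # ['INPUT', 'OUTPUT']
```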

Sandesh34
0

If I understand you correctly, something like this should help:

line = 'line = if ((var.equals("INPUT")) || (var.equals("OUTPUT"))'
items = []
start = None  # index just past the most recent '("', or None when not inside a match
c = 0
# Stop one character early so that line[c:c + 2] is always two characters long
while c < len(line) - 1:
    pair = line[c:c + 2]
    if pair == '("':
        start = c + 2
    elif pair == '")' and start is not None:
        items.append(line[start:c])
        start = None
    c += 1

print(items)    # ['INPUT', 'OUTPUT']
Andrey Suglobov