0

I am trying to get everything between /* and */.

I am using the re library and the following expression to extract the data:

pattern = re.compile(r'(?<=\/\*)(.|\n)+?(?=\*\/)')
result = pattern.findall(query)

Here, query is a string containing the data shown in the image (it may contain line breaks, etc.).

My expression is not working correctly, and I get this result:

['\n', '\n', '\n', '\n']

How can I get all the content between /* and */?

  • If your file came from Windows it might have `\r` characters in it as well as `\n` – Nick May 03 '21 at 23:03
  • If you `open()` a file with Python in the default text mode, universal newlines apply, so line endings are translated to just `\n`! – ti7 May 03 '21 at 23:04
  • Capturing content inside C-style comments [Python Regex reading in c style comments](https://stackoverflow.com/questions/25735506/python-regex-reading-in-c-style-comments) – DarrylG May 03 '21 at 23:08

2 Answers

0

The last time I had this issue, I resolved it by simply removing the newlines from the string. For your example:

query = query.replace('\n', '')
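A minimal sketch of this approach, assuming a made-up sample `query` (the real one comes from the asker's data). Because `str.replace` returns a new string, the result has to be assigned; note also that flattening the text merges the lines inside the comment.

```python
import re

# Hypothetical sample input, invented for illustration.
query = "select 1\n/*\nname = foo\ntable = bar\n*/\nfrom t"

# str.replace returns a new string; the original is left unchanged.
flat = query.replace('\n', '')

# With no newlines left, a plain non-greedy `.` is enough.
result = re.findall(r'/\*(.*?)\*/', flat)
print(result)  # ['name = footable = bar']
```

The output shows the trade-off: the comment is found, but its two lines are glued together.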
Marcin-99
0
  1. Use re.DOTALL to make . match newlines
  2. Don't escape forward slashes
  3. Non-capturing syntax is: (?:<expr>)
>>> import re
>>> txt = """
/*
name = foo
table = bar
*/ don't capture this
"""
>>> re.findall(r'(?:/\*)(.*?)(?:\*/)', txt, re.DOTALL)
['\nname = foo\ntable = bar\n']
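For reference, a short sketch of why the pattern from the question returned only newlines, and of a lookaround-based variant that avoids the problem. The sample `query` string is an assumption made up for illustration.

```python
import re

# Hypothetical sample input with two comments, invented for illustration.
query = "/*\nname = foo\n*/ skip this /* second\n*/"

# Pattern from the question: when a pattern has one capturing group,
# findall returns that group, and a repeated group keeps only its
# last iteration (here, the final newline before each */).
broken = re.findall(r'(?<=/\*)(.|\n)+?(?=\*/)', query)
print(broken)  # ['\n', '\n']

# Same lookarounds without a capturing group: findall returns the
# whole match, and re.DOTALL lets `.` match newlines.
fixed = re.findall(r'(?<=/\*).+?(?=\*/)', query, re.DOTALL)
print(fixed)  # ['\nname = foo\n', ' second\n']
```

Calling `.strip()` on each result removes the surrounding whitespace if it is not wanted.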
Woodford