I am not very familiar with regular expressions and would like somebody to explain this one in terms I can understand, i.e. outline what each part of the regex is doing:
re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')
So far, this is what I have come up with:
re.compile
is a regex method... or something along those lines
r'
is simply needed in regex
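For what it's worth, here is a minimal sketch of what those first two pieces appear to do. The `sample` string is made up for illustration; the only assumption is Python's standard `re` module.

```python
import re

# re.compile turns the pattern text into a reusable pattern object.
pattern = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')

# The r prefix is a Python raw string literal, not a regex feature:
# backslashes are kept as written instead of being treated as escapes.
raw = r'\d'
plain = '\\d'
same = raw == plain  # both are a backslash followed by the letter d

# Compiling is optional; re.search also accepts the pattern text directly.
sample = 'ATGAAATAG'  # made-up test string
m1 = pattern.search(sample)
m2 = re.search(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)', sample)
```

Since this particular pattern contains no backslashes, the `r` prefix changes nothing here; it is mostly a habit that avoids surprises in patterns that do.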
After that, I'm not too sure...
Searches for the piece ATG in the string
?:[ACTG]{3}
searches for a piece of the string made up of the characters A, C, T, G (does the order of these matter?) that is {3}, i.e. three characters long.
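A small sketch to check the order question, assuming only the standard `re` module; the `sample` string and variable names are made up for illustration.

```python
import re

# Inside [...] the order of the characters does not matter: both
# classes match exactly one character out of the same set.
a = re.compile(r'[ACTG]{3}')
b = re.compile(r'[GTCA]{3}')

sample = 'GATTACA'  # made-up test string
first_a = a.search(sample).group(0)
first_b = b.search(sample).group(0)

# {3} repeats the preceding item three times, so every match is three
# characters long, in any combination of the four letters.
codons = a.findall(sample)
```

Both patterns find the same first match, and `findall` returns the non-overlapping three-letter pieces, dropping the leftover single character at the end.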
+?
something about matching at least once, but as few times as possible...? Which part of the pattern would be matched at least once, but a minimal number of times?
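To see what `+?` changes, here is a hedged comparison of the lazy pattern against a greedy variant (`+` without the `?`). The greedy variant and the `sample` string are my own additions for illustration, not part of the original pattern.

```python
import re

lazy = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')
greedy = re.compile(r'ATG((?:[ACTG]{3})+)(?:TAG|TAA|TGA)')  # hypothetical variant

# Made-up string with two possible stop points: TAG early, TGA at the end.
sample = 'ATGAAATAGCCCTGA'

# + applies to the group (?:[ACTG]{3}), i.e. "one or more triplets".
# With +? the engine takes as few triplets as it can before a stop matches.
lazy_inner = lazy.search(sample).group(1)

# With plain + it takes as many triplets as it can and then backs off
# only until a stop can still follow.
greedy_inner = greedy.search(sample).group(1)
```

So the part that repeats "at least once, but minimally" is the three-letter group just before the `+?`.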
?:
searches for TAG|TAA|TGA
within the string. Once it finds one of these, what happens?
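As far as I can tell, a rough sketch of what comes back once a match is found looks like this. It assumes only the standard `re` module; the `sample` string is made up.

```python
import re

pattern = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')
sample = 'CCATGAAACCCTAAGG'  # made-up test string

m = pattern.search(sample)  # a match object, or None if nothing matches

whole = m.group(0)  # everything the pattern matched, ATG through the stop
inner = m.group(1)  # only what the capturing ( ... ) group matched

# (?: ... ) groups without capturing, so only one group is numbered.
n_groups = pattern.groups

# With exactly one capturing group, findall returns that group's text.
found = pattern.findall(sample)
```

In other words, `?:` does not search for anything by itself; it only marks the parentheses as grouping-only, so the stop codon ends the match but is not stored as a separate group.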
Would I be able to do something like
key_words = "TAG TAA TGA".replace(" ", "|")
so that I can have a long list without having to type |
a bunch of times if I have over 100 substrings?
I would then put this into the pattern, something like this:
...(?:key_words)')
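A minimal sketch of how that could look, assuming the substrings are kept in a list (the name `stop_codons` is hypothetical). Note that the variable has to be spliced into the pattern text; writing `(?:key_words)` inside the quotes would look for the literal text "key_words".

```python
import re

stop_codons = ['TAG', 'TAA', 'TGA']  # hypothetical list; could be much longer

# '|'.join builds the alternation. re.escape guards against entries
# containing regex metacharacters (it changes nothing for plain letters).
key_words = '|'.join(re.escape(word) for word in stop_codons)

# Splice the alternation into the pattern text by concatenation.
pattern = re.compile(r'ATG((?:[ACTG]{3})+?)(?:' + key_words + r')')

# The replace() approach gives the same text for this simple input.
replaced = "TAG TAA TGA".replace(" ", "|")
```

The `replace(" ", "|")` idea works too, as long as none of the substrings contain spaces or special regex characters; joining a list is just a bit easier to maintain with 100+ entries.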
Examples and simple explanations always work wonders - thanks!