Find all permutations of a string by matching terms in any order

Question

I want to find and replace the following string:

<tag a=“x” b=“y” c=“z”/>

However, the attributes can appear in any order, e.g.

<tag c=“z” b=“y” a=“x”/>
<tag b=“y” a=“x” c=“z”/>

What would be the regex term to find all instances of this string?

This looks like an [XY Problem](http://xyproblem.info/). I don't think the proper solution is to do string/regex search in the first place, but rather to use an HTML parser and search based on attributes for example. — Samuel Dion-Girardeau, Mar 11 '20 at 16:43
Maybe this: `(?=.*c\=“z”)(?=.*b\=“y”)(?=.*a\=“x”)$`. https://regex101.com/r/gT8wK5/1349 — Eraklon, Mar 11 '20 at 16:51
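The lookahead idea from the comment above can be turned into a find-and-replace. The following is a minimal Python sketch, not a tested solution for real DOCX content. It assumes the attribute values are wrapped in the curly quotes shown in the question, that the tag carries exactly these three attributes, and that `replace_tag` and the replacement text `<new/>` are hypothetical names chosen for illustration.

```python
import re

# Assumption: values are delimited by the curly quotes “ ” used in the question.
PATTERN = re.compile(
    r'<tag'
    r'(?=[^>]*\sa=“x”)'          # a=“x” appears somewhere before the closing '>'
    r'(?=[^>]*\sb=“y”)'          # b=“y” appears somewhere before the closing '>'
    r'(?=[^>]*\sc=“z”)'          # c=“z” appears somewhere before the closing '>'
    r'(?:\s+[abc]=“[xyz]”){3}'   # consume exactly three attributes
    r'\s*/>'
)


def replace_tag(text, replacement):
    """Replace every <tag .../> that carries a=x, b=y and c=z in any order."""
    return PATTERN.sub(replacement, text)


sample = '<tag c=“z” b=“y” a=“x”/> <tag a=“x” b=“y”/> <tag b=“y” a=“x” c=“z”/>'
print(replace_tag(sample, '<new/>'))
# → <new/> <tag a=“x” b=“y”/> <new/>
```

Each lookahead stops at the first `>`, so it cannot be satisfied by an attribute that belongs to a later tag.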

score 0 · Answer 1 · answered Mar 11 '20 at 17:00

I believe the regex query that you want is:

<tag ([a-z]){1}=“([a-z]){1}” ([a-z]){1}=“([a-z]){1}” ([a-z]){1}=“([a-z]){1}”/>

Let me explain the elements of this query:

([a-z]) matches a single lowercase letter from a to z and captures it.
Adding {1} tells the engine to match that group exactly once, which is already the default, so {1} is redundant.
So ([a-z]){1} matches exactly one lowercase letter.

If we apply this element on its own to the example:

<tag a=“x” b=“y” c=“z”/>

matched strings: t, a, g, a, x, b, y, c, z, in that order.

If you add the literal structure of your string to the query:

tag ([a-z]){1}=“([a-z]){1}”

matched string: tag a=“x”
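To see what this query does and does not check, here is a small Python sketch. It assumes the curly quotes from the question and drops the redundant `{1}` quantifiers. `GENERIC`, `TARGET` and `is_target` are hypothetical names. Because the pattern accepts any single-letter attribute, the sketch adds a second step that compares the captured name/value pairs against the wanted set.

```python
import re

# The answer's pattern without the redundant {1}; curly quotes are assumed.
GENERIC = re.compile(
    r'<tag ([a-z])=“([a-z])” ([a-z])=“([a-z])” ([a-z])=“([a-z])”/>'
)

# The three attribute/value pairs the question is looking for.
TARGET = {('a', 'x'), ('b', 'y'), ('c', 'z')}


def is_target(tag):
    """True if the tag has exactly the wanted pairs, in any order."""
    m = GENERIC.fullmatch(tag)
    if m is None:
        return False
    g = m.groups()
    # Groups alternate name, value, name, value, ...
    return set(zip(g[0::2], g[1::2])) == TARGET


print(GENERIC.fullmatch('<tag c=“z” b=“y” a=“x”/>').groups())
# → ('c', 'z', 'b', 'y', 'a', 'x')
print(GENERIC.fullmatch('<tag q=“r” q=“r” q=“r”/>') is not None)  # → True
print(is_target('<tag q=“r” q=“r” q=“r”/>'))                      # → False
```

The regex alone matches unrelated tags such as the `q=“r”` example, so the set comparison is what restricts the result to permutations of the wanted attributes.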

Hope this helps!

EnriqueBet

Thanks @CarySwoveland, I tested it on an online regex match. But you are right, in python the "/" should be escaped, and according to your comments the solution should be: `` – EnriqueBet Mar 11 '20 at 17:57
Thank you for your very instructional answer. However I should have specified in my question that this was only a simplified example. I am looking for a specific xml tag in a DOCX file and my problem is that MS Word randomly reorders the attributes in some instances (I believe when a user with a different system language saves the file). – Invertedchicken Mar 11 '20 at 17:58
The online regex engine that I am using is [regexr.com](https://regexr.com/). I understand your point on the string ``, but would this kind of string be valid for an XML or HTML document? Just trying to justify my not so good answer hahahaha :P! – EnriqueBet Mar 12 '20 at 15:28
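Since the real input is XML inside a DOCX file, the parser route suggested in the comments on the question avoids the ordering problem entirely, because a parser exposes attributes as an unordered mapping. The sketch below uses Python's standard `xml.etree.ElementTree`. It assumes straight ASCII quotes (curly quotes are not valid XML attribute delimiters), and it ignores the namespaces that real Word XML uses. The `<root>` wrapper, `WANTED` and `find_matching` are hypothetical names for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the attributes that identify the tag of interest.
WANTED = {"a": "x", "b": "y", "c": "z"}


def find_matching(xml_text, tag_name="tag", wanted=WANTED):
    """Return elements named tag_name whose attributes equal `wanted`."""
    root = ET.fromstring(xml_text)
    # el.attrib is a dict, so attribute order in the file is irrelevant.
    return [el for el in root.iter(tag_name) if dict(el.attrib) == wanted]


doc = (
    '<root>'
    '<tag c="z" b="y" a="x"/>'
    '<tag a="x" b="y"/>'
    '<tag b="y" a="x" c="z"/>'
    '</root>'
)
hits = find_matching(doc)
print(len(hits))  # → 2
```

Once the elements are found, they can be modified through the element objects and the tree serialized again, instead of editing the text with a regex.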

Cary Swoveland · Answer 2 · 2020-03-18T12:38:15.337

This is one way:

^<tag +([abc])=“([xyz])” +(?!\1)([abc])=“(?!\2)([xyz])” +(?!\1|\3)[abc]=“(?!\2|\4)[xyz]”\/>$

^         # match beginning of line
<tag      # match '<tag'
 +        # match 1+ spaces
([abc])   # match 'a', 'b' or 'c' in cap group 1
=“        # match '=“'
([xyz])   # match 'x', 'y' or 'z' in cap group 2
” +       # match '”' followed by 1+ spaces
(?!\1)    # following cannot match contents of cap group 1
([abc])   # match 'a', 'b' or 'c' in cap group 3
=“        # match '=“'
(?!\2)    # following cannot match contents of cap group 2
([xyz])   # match 'x', 'y' or 'z' in cap group 4
” +       # match '”' followed by 1+ spaces
(?!\1|\3) # following cannot match contents of cap group 1 or 3
[abc]=“   # match 'a', 'b' or 'c' followed by '=“'
(?!\2|\4) # following cannot match contents of cap group 2 or 4
[xyz]”\/> # match 'x', 'y' or 'z' followed by '”/>'
$         # match end of line
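The pattern explained above can be checked against every ordering with a short Python sketch. It assumes the opening quote is “ and the closing quote is ”, as in the question's sample strings, and it uses `re.MULTILINE` so that `^` and `$` apply per line. `PERM`, `build` and `ordered` are hypothetical names.

```python
import re
from itertools import permutations

# Assumption: values open with “ and close with ”, as in the question.
PERM = re.compile(
    r'^<tag +([abc])=“([xyz])” +(?!\1)([abc])=“(?!\2)([xyz])” +'
    r'(?!\1|\3)[abc]=“(?!\2|\4)[xyz]”/>$',
    re.MULTILINE,
)


def build(names, values):
    """Build a <tag .../> string from parallel name and value sequences."""
    attrs = ' '.join(f'{n}=“{v}”' for n, v in zip(names, values))
    return f'<tag {attrs}/>'


pairs = [('a', 'x'), ('b', 'y'), ('c', 'z')]
# All six orderings of the three name/value pairs.
ordered = [build(*zip(*p)) for p in permutations(pairs)]

print(all(PERM.match(t) for t in ordered))              # → True
print(bool(PERM.match('<tag a=“x” a=“x” c=“z”/>')))      # → False
# Names and values are checked independently of each other:
print(bool(PERM.match('<tag a=“y” b=“x” c=“z”/>')))      # → True
```

The last line shows a limitation: the negative lookaheads prevent repeated names and repeated values, but they do not tie each name to its value, so a tag such as `a=“y” b=“x”` is accepted as well.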
