RegEx: Do not match - please check my solution

Question

I'm trying to use diff's --ignore-matching-lines= to ignore lines that contain certain patterns. The reason is that I have a bash script that uses HPE's RESTful API to check/patch BIOS settings on the hosts. I use --ignore-matching-lines= to omit a setting from the comparison when the RegEx matches both the BIOS setting on the host and the corresponding line of a basic settings template that I have stored in a variable. If diff finds that the settings do not match the "golden configuration", it shows the differences on the terminal and prompts to apply the correct config. The settings I'm using the RegEx for are the UEFI boot order and Intel SGX settings.

This particular RegEx question is about the SGX settings. HPE has occasionally set the Epoch to a default of all 0's. I need to ensure that the SgxEpoch value is not all 0's but a random 32-character string, as in the following instance; otherwise we have a problem securing our enclave secrets.

Broken:

"SgxEpoch": "00000000000000000000000000000000"

OK:

"SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI"

I did quite a bit of looking and found that Wiktor Stribiżew was extremely helpful in showing how you can use POSIX to "match everything but" (I found out diff doesn't support lookaheads as it is BRE) - Regex: match everything but

So I came up with the following ERE version which also looks for "SgxEpoch": "" as that is how my comparison template is defined - https://regex101.com/r/Upd1KL/1

SgxEpochRegEx='"SgxEpoch":\s*(\"([^0].{31}|.[^0].{30}|.{2}[^0].{29}|.{3}[^0].{28}|.{4}[^0].{27}|.{5}[^0].{26}|.{6}[^0].{25}|.{7}[^0].{24}|.{8}[^0].{23}|.{9}[^0].{22}|.{10}[^0].{21}|.{11}[^0].{20}|.{12}[^0].{19}|.{13}[^0].{18}|.{14}[^0].{17}|.{15}[^0].{16}|.{16}[^0].{15}|.{17}[^0].{14}|.{18}[^0].{13}|.{19}[^0].{12}|.{20}[^0].{11}|.{21}[^0].{10}|.{22}[^0].{9}|.{23}[^0].{8}|.{24}[^0].{7}|.{25}[^0].{6}|.{26}[^0].{5}|.{27}[^0].{4}|.{28}[^0].{3}|.{29}[^0].{2}|.{30}[^0].|.{31}[^0])\"|"")'

And then converted it to BRE so it would work with diff:

SgxEpochRegEx='"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\)'
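
The ERE-to-BRE conversion above can be sketched mechanically. This is only an illustration under assumptions: it uses a scaled-down, hypothetical 4-character analogue of the 32-character pattern, and it relies on GNU grep and GNU sed, where \| is accepted in BRE as an extension. [[:space:]] stands in for \s.

```shell
# Scaled-down, hypothetical 4-character analogue of the 32-character pattern.
ere='"SgxEpoch":[[:space:]]*("([^0].{3}|.[^0].{2}|.{2}[^0].|.{3}[^0])"|"")'

# Prefix every ERE grouping, interval and alternation character with a
# backslash to get the GNU BRE spelling of the same pattern.
bre=$(printf '%s\n' "$ere" | sed 's/[(){}|]/\\&/g')
printf 'BRE: %s\n' "$bre"

# Both spellings should agree on every sample line.
for line in '"SgxEpoch": "0000"' '"SgxEpoch": "0100"' '"SgxEpoch": ""'; do
    if printf '%s\n' "$line" | grep -Eq "$ere"; then e="match"; else e="no match"; fi
    if printf '%s\n' "$line" | grep -q "$bre"; then b="match"; else b="no match"; fi
    printf '%s -> ERE: %s, BRE: %s\n' "$line" "$e" "$b"
done
```

The sed expression is only safe here because the pattern contains no literal parentheses, braces or pipes that would need to stay unescaped.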

Here's an example of the BRE version catching either a non-zero or blank version of the Epoch as well as not catching something.

[~]$ echo '"SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI"' | grep '"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\)'
"SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI"
[~]$ echo '"SgxEpoch": ""' | grep '"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\)'
"SgxEpoch": ""
[~]$ echo '"SgxEpoch": "00000000000000010000000000000000"' | grep '"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\)'
"SgxEpoch": "00000000000000010000000000000000"
[~]$ echo '"SgxEpoch": "00000000000000000000000000000000"' | grep '"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\)'
[~]$

The problem I have is I don't understand how the above RegEx works. It appears to me that it is all just a bunch of alternatives using the meta-character | where I am basically saying if x number of "any character except newline" (via .{x}) and then followed by a 0 and then again x number of any character except newline (via .{x}). It seems to me that this RegEx should match the following examples but it doesn't... and I don't understand why.

"SgxEpoch": "00000000000000010000000000000000"
"SgxEpoch": "00000000000000000000000001000000"
"SgxEpoch": "00000100000000000000000000000000"

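One way to check this reading is to generate the alternation in a loop rather than typing it out. The sketch below assumes grep -E is available; build_epoch_ere and epoch_ok are hypothetical helper names, and [[:space:]] stands in for \s. Each generated branch has a [^0] at one fixed position, so a branch can only match when that position holds a character other than 0.

```shell
# Build the 32-branch ERE in a loop instead of typing it by hand.
# Branch i reads: i arbitrary characters, one character that is not "0",
# then 31-i arbitrary characters, so position i must be non-zero.
build_epoch_ere() {
    branches=""
    i=0
    while [ "$i" -lt 32 ]; do
        rest=$((31 - i))
        branch="[^0]"
        if [ "$i" -gt 0 ]; then branch=".{$i}$branch"; fi
        if [ "$rest" -gt 0 ]; then branch="$branch.{$rest}"; fi
        branches="${branches:+$branches|}$branch"
        i=$((i + 1))
    done
    # Also accept the empty value "" used by the comparison template.
    printf '%s' "\"SgxEpoch\":[[:space:]]*(\"($branches)\"|\"\")"
}

# Succeeds when the line carries a non-zero or empty SgxEpoch value.
epoch_ok() {
    printf '%s\n' "$1" | grep -Eq "$(build_epoch_ere)"
}

for line in \
    '"SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI"' \
    '"SgxEpoch": ""' \
    '"SgxEpoch": "00000000000000010000000000000000"' \
    '"SgxEpoch": "00000000000000000000000000000000"'
do
    if epoch_ok "$line"; then echo "match:    $line"; else echo "no match: $line"; fi
done
```

Under grep, only the all-zero line should be reported as "no match", which agrees with the grep transcript earlier in the question.
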
Here it is working very similarly to how it is implemented in the bash script. The first input, <(curl ...), pulls all BIOS settings from the host in JSON format. For the second input, <(echo ...), I echo a variable with the desired BIOS settings in generic form. Both are then fed through | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges' to alphabetize both inputs to diff and to pretty-print them, so that the JSON settings are separated by \n and --suppress-common-lines only displays the discrepancies to the user. perl removes the \n characters from array-type values because I could not get diff to match across newlines as "one chunk" with something like ..

[awilk00@nvdejb-dc-2p ~]$ SgxEpochRegEx='"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\),'
[awilk00@nvdejb-dc-2p ~]$ hostsettings='{"ServicePhone":"","SgxEpoch": "00000000000000000000000000000000","SgxEpochControl":"SgxEpochNoChange","DefaultBootOrder":["PcieSlotNic","EmbeddedFlexLOM","EmbeddedStorage","PcieSlotStorage","Usb","Cd","UefiShell","Floppy"]}'
[awilk00@nvdejb-dc-2p ~]$ desiredsettings='{"ServicePhone":"","SgxEpoch": "","SgxEpochControl":"SgxEpochNoChange","DefaultBootOrder":["Floppy","Cd","Usb","EmbeddedStorage","PcieSlotStorage","EmbeddedFlexLOM","PcieSlotNic","UefiShell"]}'
[awilk00@nvdejb-dc-2p ~]$ echo "${hostsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges'
{
    "DefaultBootOrder": ["PcieSlotNic","EmbeddedFlexLOM","EmbeddedStorage","PcieSlotStorage","Usb","Cd","UefiShell","Floppy"],
    "ServicePhone": "",
    "SgxEpoch": "00000000000000000000000000000000",
    "SgxEpochControl": "SgxEpochNoChange"
}
[awilk00@nvdejb-dc-2p ~]$ echo "${desiredsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges'
{
    "DefaultBootOrder": ["Floppy","Cd","Usb","EmbeddedStorage","PcieSlotStorage","EmbeddedFlexLOM","PcieSlotNic","UefiShell"],
    "ServicePhone": "",
    "SgxEpoch": "",
    "SgxEpochControl": "SgxEpochNoChange"
}
[awilk00@nvdejb-dc-2p ~]$ diff --report-identical-files --suppress-common-lines --side-by-side --ignore-matching-lines="${SgxEpochRegEx}" <(echo "${hostsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges') <(echo "${desiredsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges')
    "DefaultBootOrder": ["PcieSlotNic","EmbeddedFlexLOM","Emb |     "DefaultBootOrder": ["Floppy","Cd","Usb","EmbeddedStorage
    "SgxEpoch": "00000000000000000000000000000000",           |     "SgxEpoch": "",
[awilk00@nvdejb-dc-2p ~]$
[awilk00@nvdejb-dc-2p ~]$
[awilk00@nvdejb-dc-2p ~]$ hostsettings='{"ServicePhone":"","SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI","SgxEpochControl":"SgxEpochNoChange","DefaultBootOrder":["PcieSlotNic","EmbeddedFlexLOM","EmbeddedStorage","PcieSlotStorage","Usb","Cd","UefiShell","Floppy"]}'
[awilk00@nvdejb-dc-2p ~]$ echo "${hostsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges'
{
    "DefaultBootOrder": ["PcieSlotNic","EmbeddedFlexLOM","EmbeddedStorage","PcieSlotStorage","Usb","Cd","UefiShell","Floppy"],
    "ServicePhone": "",
    "SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI",
    "SgxEpochControl": "SgxEpochNoChange"
}
[awilk00@nvdejb-dc-2p ~]$ echo "${desiredsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges'
{
    "DefaultBootOrder": ["Floppy","Cd","Usb","EmbeddedStorage","PcieSlotStorage","EmbeddedFlexLOM","PcieSlotNic","UefiShell"],
    "ServicePhone": "",
    "SgxEpoch": "",
    "SgxEpochControl": "SgxEpochNoChange"
}
[awilk00@nvdejb-dc-2p ~]$ diff --report-identical-files --suppress-common-lines --side-by-side --ignore-matching-lines="${SgxEpochRegEx}" <(echo "${hostsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges') <(echo "${desiredsettings}" | python -m json.tool | perl -00pe 's:\[.*?\]:($x=$&)=~s/\s//gs;$x:ges')
    "DefaultBootOrder": ["PcieSlotNic","EmbeddedFlexLOM","Emb |     "DefaultBootOrder": ["Floppy","Cd","Usb","EmbeddedStorage
[awilk00@nvdejb-dc-2p ~]$

NOTE: I'm almost certain I found a thread on stack overflow where someone reviewed diff's source code and found it does a compare on a line by line basis but cannot find that thread ATM.

NOTE2: I noticed that when diff finds differing lines immediately preceding a line that would have matched the --ignore-matching-lines regex, diff does not drop the two lines that match the regex. Instead it shows them as a difference immediately after the preceding non-matching line in which it found a difference. Hope I didn't butcher that too badly. For example:

[~]$ SgxEpochRegEx='"SgxEpoch":\s*\("\([^0].\{31\}\|.[^0].\{30\}\|.\{2\}[^0].\{29\}\|.\{3\}[^0].\{28\}\|.\{4\}[^0].\{27\}\|.\{5\}[^0].\{26\}\|.\{6\}[^0].\{25\}\|.\{7\}[^0].\{24\}\|.\{8\}[^0].\{23\}\|.\{9\}[^0].\{22\}\|.\{10\}[^0].\{21\}\|.\{11\}[^0].\{20\}\|.\{12\}[^0].\{19\}\|.\{13\}[^0].\{18\}\|.\{14\}[^0].\{17\}\|.\{15\}[^0].\{16\}\|.\{16\}[^0].\{15\}\|.\{17\}[^0].\{14\}\|.\{18\}[^0].\{13\}\|.\{19\}[^0].\{12\}\|.\{20\}[^0].\{11\}\|.\{21\}[^0].\{10\}\|.\{22\}[^0].\{9\}\|.\{23\}[^0].\{8\}\|.\{24\}[^0].\{7\}\|.\{25\}[^0].\{6\}\|.\{26\}[^0].\{5\}\|.\{27\}[^0].\{4\}\|.\{28\}[^0].\{3\}\|.\{29\}[^0].\{2\}\|.\{30\}[^0].\|.\{31\}[^0]\)"\|""\),'
[~]$ desiredsettings='{"SgxEpoch": "","SgxEpochControl":"SgxEpochNoChange","DefaultBootOrder":["Floppy"]}'
[~]$ hostsettings='{"SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI","SgxEpochControl":"SgxEpochNoChange","DefaultBootOrder":["PcieSlotNic"]}'
[~]$ diff --report-identical-files --side-by-side --ignore-matching-lines="${SgxEpochRegEx}" <(echo "${hostsettings}" | python -m json.tool) <(echo "${desiredsettings}" | python -m json.tool)
{                                                               {
    "DefaultBootOrder": [                                           "DefaultBootOrder": [
        "PcieSlotNic"                                         |         "Floppy"
    ],                                                              ],
    "SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI",                 "SgxEpoch": "",
    "SgxEpochControl": "SgxEpochNoChange"                           "SgxEpochControl": "SgxEpochNoChange"
}                                                               }
[~]$ diff --report-identical-files --side-by-side --ignore-matching-lines="${SgxEpochRegEx}" <(echo "${hostsettings}" | python -m json.tool | grep -v '[]],') <(echo "${desiredsettings}" | python -m json.tool | grep -v '[]],')
{                                                               {
    "DefaultBootOrder": [                                           "DefaultBootOrder": [
        "PcieSlotNic"                                         |         "Floppy"
    "SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI",           |     "SgxEpoch": "",
    "SgxEpochControl": "SgxEpochNoChange"                           "SgxEpochControl": "SgxEpochNoChange"
}                                                               }
[~]$

Because of the sensitivity of the data, I wanted to be sure I had a clear understanding of how this RegEx worked.

I would also greatly appreciate any syntax verification that you can offer around the huge regex line since I don't really understand how it works.

The ask here is for someone to explain/verify the regex. I thought it would be best that I explain my end goal due to the XY Problem, but really I can't afford the time investment currently to re-write everything. I am open to that going forward but I have a date to meet.

Aaron

Your regex is too long to read IMO, and, as formatted, can't even be viewed on a single line. Please either fix your formatting, or better yet, give us a minimal question. — Tim Biegeleisen, Feb 19 '18 at 10:20
I'm with Tim. Your solution seems overly complicated. Maybe it would be faster to ask us how to solve the actual problem instead of asking how to fix you current approach. See [XY Problem](https://meta.stackexchange.com/a/66378/374558). — Socowi, Feb 19 '18 at 10:23
@TimBiegeleisen - I didn't want to make the RegEx that long but based on the solution by Wiktor Stribiżew in https://stackoverflow.com/questions/1687620/regex-match-everything-but there didn't seem to be an alternative to get it to work in POSIX. Additionally, I can see the entire regex when I look at the question? — R37ribution, Feb 19 '18 at 10:31
@Socowi - the problem I'm trying to solve is to identify when I have all 0's in the SgxEpoch value returned from the REST interface. If so, then I have some code to set it to a random alphanumeric value. Namely, `cat /dev/urandom | tr -dc 'A-Z0-9' | fold -w 32 | head -n 1`. In my current solution I require the regex for that check to be in BRE format so I can use `diff` to omit it when it's not all 0's. I'm using `diff` already for all of the rest of the settings. — R37ribution, Feb 19 '18 at 10:34
Note that when you say you added `"SgxEpoch": ""` to the verification, you added it so that it is considered valid, is that what you wanted to do, or the contrary? Also, `.` is not "any character except newline (via .{x})" and will actually catch new lines, see example here: https://regex101.com/r/UUr8nq/6 — Kaddath, Feb 19 '18 at 10:34
@Kaddath - Yes, the SgxEpoch key will always be there. `diff`'s implementation of regex for the purposes of the `--ignore-matching-lines=` doesn't continue across newlines. — R37ribution, Feb 19 '18 at 10:36
@R37ribution it seems that in BRE "Even alternation is not supported." ([link here](https://www.regular-expressions.info/posix.html)), have you tested a basic alternative to check if it worked? Like does regex `^(a|b)$` actually match `b`? Also, your current regex allows spaces (" "), tabs and also things like `"` in your value — Kaddath, Feb 19 '18 at 10:59

Socowi · Answer 1 · 2018-02-19T15:54:54.360

1

Your regex seems fine. Maybe the problem is diff's --ignore-matching-lines option (shorthand -I), which works a bit differently than one might expect.

How does `diff -I regex` work?

Let a and b be two files.

If I hadn't read the documentation, I would expect that the following two commands are equivalent:

diff -I regex a b
diff <(grep -v regex a) <(grep -v regex b)

This is wrong in two ways:

diff always considers pairs of lines from a and b. Such a pair is ignored only if both lines (the line from a and the line from b) match the regex.
Even if both lines from the pair match, it can happen that they are not ignored. Not only do both lines from the pair have to match, but all lines from the hunk have to match!

Example for the second point (-y is a shorthand for --side-by-side):

diff --suppress-common-lines -yI '1\|2' <(printf '1\n') <(printf '2\n')

diff --suppress-common-lines -yI '1\|2' <(printf '1\n1\n') <(printf '2\nX\n')
1                                 | 2
1                                 | X

The first command worked as expected, but the second command didn't. Instead of the line pair (1,2) diff tried to match all lines from the hunk ((1,1),(2,X)). One line from that hunk did not match therefore the whole hunk was printed.
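
A small sketch of the same hunk rule, assuming GNU diff and using made-up file contents: when an identical line sits between two changed pairs, diff puts them into separate hunks, and -I can then drop the hunk whose lines all match. The bracket expression [12] is used as an equivalent of 1\|2.

```shell
workdir=$(mktemp -d)

# Adjacent changes: both changed pairs land in one hunk, and X does not
# match the regex, so the whole hunk (including the 1/2 pair) is printed.
printf '1\n1\n' > "$workdir/a1"
printf '2\nX\n' > "$workdir/b1"
out_adjacent=$(diff -I '[12]' "$workdir/a1" "$workdir/b1" || true)

# An identical line between the changes splits them into two hunks.
# The 1/2 hunk matches completely and is ignored; the 1/X hunk is printed.
printf '1\nsame\n1\n' > "$workdir/a2"
printf '2\nsame\nX\n' > "$workdir/b2"
out_split=$(diff -I '[12]' "$workdir/a2" "$workdir/b2" || true)

printf 'adjacent changes:\n%s\n\n' "$out_adjacent"
printf 'changes split by a common line:\n%s\n' "$out_split"

rm -rf "$workdir"
```

This would also be consistent with the behavior in NOTE2 of the question, where removing the common `],` line merges the SgxEpoch pair into the hunk of the preceding difference.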

Alternative to `diff -I`

I'm not entirely sure what your typical input and expected output is. What I guessed:

1. You have the file original that might contain the line
   "SgxEpoch": "00000000000000000000000000000000"
2. You generate the file generated, in which the 0-lines are fixed to something like
   "SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI"
3. (Step where you need help)
   You want to compare the two files original and generated to make sure that your script from the second step did the right thing. To make the comparison easier, you only want to see the differences when the corresponding line from original was either
   "SgxEpoch": "00000000000000000000000000000000"
   or
   "SgxEpoch": ""

There is an easy solution for this. Simply grep the output of diff:

diff --suppress-common-lines -y original generated |
grep -E '^\s*"SgxEpoch"\s*:\s*"0*"'
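
A minimal end-to-end sketch of this pipeline, assuming GNU diff and using hypothetical sample contents for original and generated (the ServicePhone value is invented only to provide a second, unrelated difference). [[:space:]] replaces \s for portability.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-ins for the real "original" and "generated" files.
printf '%s\n' '{' \
    '    "ServicePhone": "",' \
    '    "SgxEpoch": "00000000000000000000000000000000",' \
    '}' > "$workdir/original"
printf '%s\n' '{' \
    '    "ServicePhone": "555",' \
    '    "SgxEpoch": "5FSQUWEED6XPC8PJ2CWZGQIS4WWKLKUI",' \
    '}' > "$workdir/generated"

# diff prints every differing pair; grep keeps only the pairs whose
# left-hand (original) side is an all-zero or empty SgxEpoch.
filtered=$(diff --suppress-common-lines -y "$workdir/original" "$workdir/generated" |
    grep -E '^[[:space:]]*"SgxEpoch"[[:space:]]*:[[:space:]]*"0*"')
printf '%s\n' "$filtered"

rm -rf "$workdir"
```

With this sample data the ServicePhone difference is filtered out and only the SgxEpoch pair remains.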

edited Feb 19 '18 at 15:54

answered Feb 19 '18 at 13:09

Socowi


I really appreciate the help. This program is pretty complicated and I'm aware of the XY Problem as you had mentioned earlier. I have this 100% implemented I just need to verify that my current regex works as I expect it to. – R37ribution Feb 20 '18 at 10:04
@R37ribution Then you should have tried [codereview.stackexchange](https://codereview.stackexchange.com/). Your regex works with `grep`, but not with `diff -I` due to the mentioned behavior. – Socowi Feb 20 '18 at 10:07
I do similar to your second example where `diff – R37ribution Feb 20 '18 at 10:21
The pairing of line `a` and `b` in your example is the reason I have the regex matching both `00000000000000000000000000000000` and `` since the template settings are "" (blank). – R37ribution Feb 20 '18 at 10:24
I don't believe that the comment "Not only both lines from the pair have to match but all lines from the hunk have to match!" is true... `[~]$ diff --suppress-common-lines -yI '1\|2' – R37ribution Feb 20 '18 at 10:24
@R37ribution I still think it's true. That's what the documentation says. In your example, the first line is `2` for both files. Due to `--suppress-common-lines` both first lines are ignored and not part of the hunk. Regarding the other comments: When you give more details on your question, please edit your question instead of writing comments. – Socowi Feb 20 '18 at 10:34
I experimented with `printf` a bit and deleted my previous comment - I've always just used `echo` so I apologize for my ignorance. I think this command does what you were intending to do with `printf` - `diff --report-identical-files --suppress-common-lines --side-by-side --ignore-matching-lines='1\|2' – R37ribution Feb 20 '18 at 10:56
*»what you were intending to do«*. I did not intend anything without doing it. What did you want to show with this example? PS: `printf '%s\n%s\n%s\n' "1" "2" "3"` is the same as `printf '1\n2\n3\n'` or even `printf '%s\n' 1 2 3`. – Socowi Feb 20 '18 at 11:06
I believe this is a bug in diff - I have updated my question to explain this. `diff` ignores the regex for the line immediate after a miss. If you try these two variations of `printf` with `diff` I think you will see it looks like a bug as well `diff --report-identical-files --suppress-common-lines --side-by-side --ignore-matching-lines='1\|2' – R37ribution Feb 20 '18 at 13:04
@R37ribution it is not a bug. In your example everything works exactly as described in `diff`'s documentation which I tried to explain in my answer. In both cases there is only one hunk. In the first case, the hunk contains the complete input including the non-matching line `X`, therefore the hunk is printed. In the second case, the hunk contains only both first lines. The second lines are identical and ignored due to `--ignore-matching-lines`. Both lines from the hunk match `1\|2`, therefore the hunk is not printed. – Socowi Feb 20 '18 at 13:18
I'm trying hard to understand this. I read the links on how diff uses hunks and how you can pass the `--minimal` flag to "produce a smaller set of differences" but adding `diff --text --minimal` didn't change the output at all from the commands I updated my question with. Simply removing the `],` from the input to diff results in diff including the next line previously omitted by regex. If I put any other line in front of the section the regex covers, i.e. `"ServicePhone":"",` diff will not find that line in the "hunk" as being different. Why? If the regex is separate from the comparison? – R37ribution Feb 20 '18 at 15:03
