
I'm trying to find all the places we have ambiguous format specifiers in our code base.

For example:

int64_t fixed_size_type;
printf ("Ambiguous replacement value: %lld", fixed_size_type);

Which results in:

format '%lld' expects argument of type 'long long int', but argument 5 has type 'int64_t {aka long int}' [-Wformat=]

I'm using regex to search the code base, and I want to find all occurrences of %lld where the values have NOT been cast into (long long).

Using the VSCode search tab, I can find all appropriately cast occurrences using:

%lld.*(\(long long\))

However, I don't know how to ask it to show me the strings where (long long) is not present.

When I try:

%lld(?!.*\(long long\))

VSCode gives me the following error message:

Error parsing regex near 'lld(?!.*\(' at character offset 6: Unrecognized flag: '!'. (Allowed flags: i, m, s, U, u, x.)
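
For comparison, the same pattern does seem to be accepted by engines that implement lookaheads. The following is only a minimal sketch, assuming Python's `re` module and a few sample lines made up for illustration (they are not from the real code base); it applies the pattern one line at a time:

```python
import re

# Pattern from the question: "%lld" that is NOT followed, later on the
# same line, by the literal text "(long long)".
UNCAST_LLD = re.compile(r"%lld(?!.*\(long long\))")

# Hypothetical sample lines, invented for this example.
sample_lines = [
    'printf("value: %lld", fixed_size_type);',
    'printf("value: %lld", (long long)fixed_size_type);',
    'printf("no specifier here");',
]

# Keep only the lines where the pattern matches somewhere.
flagged = [line for line in sample_lines if UNCAST_LLD.search(line)]

for line in flagged:
    print(line)
```

Only the first sample line is flagged, so the pattern itself expresses the intended search; the error above appears to come from the search engine's dialect rather than from the pattern's logic.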

I have tried several examples I have seen in forums with similar problems, and I have tried to use regex documentation to decipher those examples, all to no avail.

Most examples suggest using a negative lookahead or lookbehind to check for the existence of the (long long) substring. I thought parentheses made a group. Why can't I just negate a group with forward logic?
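
Plain parentheses only group or capture; they have no "not" form, and the error text suggests that this engine reads `(?` as the start of an inline flag group, which would explain why `!` is rejected. One way to get the same effect without any lookaround is to split the check into two ordinary searches: does the line contain `%lld`, and does it also contain `(long long)`? A minimal sketch of that idea, assuming Python and a hypothetical helper name `lacks_cast`:

```python
import re

HAS_LLD = re.compile(r"%lld")             # the format specifier
HAS_CAST = re.compile(r"\(long long\)")   # the literal cast text

def lacks_cast(line: str) -> bool:
    """True when the line contains %lld but no (long long) anywhere.

    Hypothetical helper: the "negation" is done in ordinary code,
    not inside the regex, so no lookahead support is needed.
    """
    return bool(HAS_LLD.search(line)) and not HAS_CAST.search(line)

# Hypothetical sample lines, invented for this example.
print(lacks_cast('printf("value: %lld", fixed_size_type);'))
print(lacks_cast('printf("value: %lld", (long long)fixed_size_type);'))
```

This is slightly stricter than the lookahead version, because a cast appearing before `%lld` on the same line also suppresses the report.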

Unfortunately, regex processing and syntax are difficult to comprehend when you have so little working experience with regex.

NOTE: The duplicate suggestion offers a great explanation of why this doesn't work in VSCode, but it still doesn't provide the solution to my question.

Zak
  • @anubhava I have already tried, but I'm getting, `Error parsing regex near 'lld(?!.*\(' at character offset 6: Unrecognized flag: '!'. (Allowed flags: i, m, s, U, u, x.)` – Zak Apr 03 '18 at 17:21
  • Yes. I'm using the **VSCode** search bar, I'll add it to my post – Zak Apr 03 '18 at 17:23
  • Trying to use regex to parse C code is futile. – tripleee Apr 03 '18 at 17:33
  • C (or any other language) code is nothing more than text strings, which is in fact what regex is designed to parse. Therefore, there should be a way in regex to match the beginning of a string, while ensuring the remaining string does not contain a certain substring. The absolute worst case scenario would be that VSCode has an inferior regex parser, which would still have nothing to do with parsing C code specifically. – Zak Apr 03 '18 at 17:51
  • Look up the difference between regular and context-free languages. – tripleee Apr 03 '18 at 18:26
  • @tripleee Just to clarify, when I said, "_which is in fact what regex is designed to parse_", I did not mean in the same fashion as the C language parser. Rather as simply as being capable of searching an ASCII string for a matching set of characters (all of which would have nothing to do with context-free or regular languages). I wish to invoke regex to search strings for matching substrings, which is what the language is designed to do. – Zak Apr 03 '18 at 19:19
  • Perhaps I could improve my post if I understood the point you are trying to make. Could you explicitly state how you believe regex will fall short (with examples). Post it as an answer, and if you are correct I will mark it correct. Thanks. – Zak Apr 03 '18 at 19:22
  • Your immediate problem seems to be that VScode apparently doesn't support the regex dialect you are attempting to use. Negative lookaheads are not a standard regex facility; it is only supported in variants which implement some form of Perl 5/PCRE regular expressions. – tripleee Apr 03 '18 at 21:23
  • If you used e.g. Perl to print matches on `%lld((?!\(long long\)).)*$` it would still not work where the format string was manipulated by the preprocessor or was defined programmatically, or more mundanely do what you want where your format string contained multiple occurrences of `%lld` or had the arguments on a different physical line than the format string. The proper solution really is a static code analysis tool which understands C syntax and can reason in terms of code semantics. – tripleee Apr 03 '18 at 21:30
  • As it happens, multiple `%lld` occurrences in a format string is a good example of the difference between regular and context-free. You cannot solve this with a regex in the general case (though you could devise a - ridiculously complex - regex to cope with, say, up to three or up to five occurrences). – tripleee Apr 03 '18 at 21:36
  • Since VSCode [supports any lookarounds now](https://stackoverflow.com/questions/39915644/regex-look-behind-in-vs-code/39930449#39930449), the question is obsolete. – Wiktor Stribiżew Sep 10 '19 at 12:53
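
The caveats raised in the comments (several `%lld` in one format string, and arguments on a different physical line than the format string) can be reproduced with a line-based search. A minimal sketch, assuming Python's `re` module and made-up C snippets that are not taken from the real code base:

```python
import re

# Line-based pattern from the question.
UNCAST_LLD = re.compile(r"%lld(?!.*\(long long\))")

# Hypothetical case 1: two specifiers, but only the first argument is cast.
two_specifiers_one_cast = 'printf("%lld %lld", (long long)a, b);'

# Hypothetical case 2: the cast sits on the next physical line.
split_call = [
    'printf("value: %lld",',
    '       (long long)fixed_size_type);',
]

# Case 1: the single cast satisfies the check for both specifiers,
# so the uncast argument b goes unreported.
missed = UNCAST_LLD.search(two_specifiers_one_cast) is None

# Case 2: the first line has no cast on it, so a correctly cast
# call is reported anyway.
false_alarm = UNCAST_LLD.search(split_call[0]) is not None

print(missed, false_alarm)
```

Both flags come out true, which is the practical limit of the approach: the search is useful for triage, while the compiler's own `-Wformat` diagnostics remain the reliable list of offenders.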

0 Answers