I was playing Regex Golf the other day, and the task was to match 'u' at the end of the string without using `$`. The goal was to match "fu", "tofu" and "snafu" but not "futz", "fusillade", "functional" or "discombobulated".

I came up with `fu[^tsn]`, which worked on regex101. However, it does not pass the test, because for some reason it matches none of "fu", "tofu" and "snafu". I'd like to know why it is not working, and whether there is a smarter way to work around this (bonus: is there any real-life situation where not using `$` would be better?).

Wiktor Stribiżew
Jingjie YANG
  • The question has been asked already. Add `(?!.)`. Can't find the other question now. – Wiktor Stribiżew Apr 11 '17 at 08:26
  • The question is not practical, thus closed. The `$` is actually a construct supported in all regex flavors, and should not be replaced with any other "alternatives" (the `\z` or `\Z` are specific cases that are language dependent). – Wiktor Stribiżew Apr 11 '17 at 08:34
  • The reason `fu[^tsn]` doesn't match "tofu" is that there is not one character which isn't t, s, or n after "fu" in that word. The approach to exclude alphabetics is flawed anyway -- `fu[^a-z]` would still match `fu-bar` for example; but even if your input consists solely of alphabetics, prohibiting any alphabetic after the match still doesn't solve the problem of matching only an empty string after the match. – tripleee Apr 11 '17 at 08:35

1 Answer

You can find a list of available regex elements at https://stackoverflow.com/a/22944075/224671. Some that you can try in order to pass the RegexGolf test:

  • `\b`: matches a word boundary, which coincides with the end of the string only when the input is a single word
  • `\z` or `\Z`: match the end of the string (note: not supported in JavaScript, and the exact semantics vary by flavor)
  • `(?!.)`: a negative look-ahead that succeeds as long as there is no next character; for single-line input it behaves like `$`.
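
As a rough check of the alternatives listed above, here is a minimal sketch using Python's `re` module (an assumption for illustration; RegexGolf itself runs in the browser). The helper `passes` and the word lists are made up for this example, and `\Z` is used because it is the end-of-string anchor Python has long supported.

```python
import re

should_match = ["fu", "tofu", "snafu"]
should_not_match = ["futz", "fusillade", "functional", "discombobulated"]

# Candidate replacements for "u$" (illustrative list, not exhaustive).
candidates = [r"u\b", r"u\Z", r"u(?!.)"]


def passes(pattern: str) -> bool:
    """True if pattern matches every wanted word and none of the unwanted ones."""
    rx = re.compile(pattern)
    return (all(rx.search(w) for w in should_match)
            and not any(rx.search(w) for w in should_not_match))


# Map each candidate to whether it solves the word lists above.
results = {p: passes(p) for p in candidates}
print(results)
```

All three candidates pass on these single-word inputs, which is the only situation in which `\b` can stand in for an end-of-string anchor.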

Your regex "works" on regex101 because `[^tsn]` matches the newline that follows `fu` in the multi-line test input. RegexGolf, however, tests each word as a whole input, so nothing follows `fu` and the match fails. Be careful when testing on regex101; switching to its "Unit Tests" mode is probably a better fit for your case.
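
The difference can be reproduced with a small sketch, again in Python as an assumed stand-in for both tools. The multi-line string imitates a test box with one word per line, which is an assumption about how the words were entered on regex101.

```python
import re

pattern = re.compile(r"fu[^tsn]")

# One word per line, as in a multi-line test-string box (assumed input shape).
multiline_input = "fu\ntofu\nsnafu\n"
matches_with_newlines = pattern.findall(multiline_input)

# Each word as its own complete string, so nothing follows the final "fu".
matches_alone = [w for w in ["fu", "tofu", "snafu"] if pattern.search(w)]

print(matches_with_newlines)  # every match includes the trailing newline
print(matches_alone)          # empty: [^tsn] has no character left to consume
```

`[^tsn]` is a character class that must consume one character, so it can never match at the very end of a string.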

Graham
kennytm
  • But what is your actual full regex? – Tim Biegeleisen Apr 11 '17 at 08:26
  • @TimBiegeleisen Replace `/fu$/` with `fu` plus one of these. – kennytm Apr 11 '17 at 08:27
  • `\b` does not match the end of string only. `\z` is not supported in a lot of regex flavors. `(?![tsn])` only fails the match before `t`, `s`, `n`. – Wiktor Stribiżew Apr 11 '17 at 08:30
  • @WiktorStribiżew Fixed. – kennytm Apr 11 '17 at 08:33
  • Yeah, but a lookahead is not supported in RE2, Go, POSIX. No sense using all these "alternatives". – Wiktor Stribiżew Apr 11 '17 at 08:35
  • @WiktorStribiżew I agree it is impractical, but have you checked OP's linked page? The question totally makes sense. https://i.stack.imgur.com/vyqJE.png – kennytm Apr 11 '17 at 08:44
  • @kennytm: No, sorry, I do not see any *practical* sense. Who will use `(?!.)` in real life scenarios? Would you? `\z` only makes sense in the context of Ruby regex (or strict validation). `\Z` is important in Python/Ruby. `\b` is a commonly used construct and is not a good choice even for the current question. – Wiktor Stribiżew Apr 11 '17 at 08:47
  • BTW, the `(?!.)` is from my comment, and it is the best choice for the current question. – Wiktor Stribiżew Apr 11 '17 at 08:51
  • @WiktorStribiżew OP's question about how to solve the RegexGolf problem, and the best choice is actually `u\b` (3 chars). – kennytm Apr 11 '17 at 08:54
  • *Regex matching **end of string** without using `$`* - `\b` does not only match the end of string, and not always does it match the end of string. – Wiktor Stribiżew Apr 11 '17 at 08:56
  • @WiktorStribiżew Yes they are not equivalent, but you are missing the whole point, the "no `$`" restriction comes from RegexGolf, the "`\b` works" result also comes from RegexGolf matching only a single word (which is clearly stated in the answer). There's no point extending the discussion if we keep ignoring that the whole question stems from an attempt to solve a RegexGolf problem. – kennytm Apr 11 '17 at 09:03
  • Yes, there was no point answering the question, as there is no real problem. – Wiktor Stribiżew Apr 11 '17 at 09:04
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/141412/discussion-between-kennytm-and-wiktor-stribizew). – kennytm Apr 11 '17 at 09:05