https://regex101.com/r/sB9wW6/1

(?:(?<=\s)|^)@(\S+) <-- the problem is in the positive lookbehind

This works in production: (?:\s|^)@(\S+), but I need the correct start index (without the leading space).

Here it is in JS:

var regex = new RegExp(/(?:(?<=\s)|^)@(\S+)/g);

Error parsing regular expression: Invalid regular expression: /(?:(?<=\s)|^)@(\S+)/

What am I doing wrong?

UPDATE

Ok, no lookbehind in JS :(

But anyway, I still need a regex that gives the proper start and end index of my match, without the leading space.

Charles Duffy
Kindzoku
  • There is no lookbehind in JavaScript – anubhava Sep 22 '16 at 10:07
  • Oh, thx! :D I didn't know :D Erm... Any idea how I can reach my goal then? :) – Kindzoku Sep 22 '16 at 10:07
  • Next time be careful and select `JavaScript` [like so](https://regex101.com/r/sB9wW6/2) – Thomas Ayoub Sep 22 '16 at 10:08
  • It's useful to select the JavaScript option on the left-hand side, to verify the syntax is actually valid for JS, not just for PCRE (which is the default) – VLAZ Sep 22 '16 at 10:08
  • In what cases should the regex not match? – revo Sep 22 '16 at 10:31
  • Always tag your questions with a specific implementation language. Regexes in Python are different from regexes in JavaScript are different from regexes in Java are different from regexes in `grep -E` are different from regexes in emacs are different from bash's built-in regexes... – Charles Duffy Feb 24 '20 at 00:40
  • ...oh -- I see it *was* tagged that way, but @WiktorStribiżew pulled the tag in making it into a generic reference question. Hmm -- maybe the title and text could use some editing to make it clear that this *is* a general reference question now, without needing to read the answer first. – Charles Duffy Feb 24 '20 at 00:43

1 Answer

Make sure you always select the right regex engine at regex101.com. For an example of what can go wrong otherwise, see an issue caused by using a JS-only regex with the [^] construct in Python.

JS regex did not support lookbehinds at the time this question was answered. Lookbehind was introduced in ECMAScript 2018 and is now increasingly widely supported. You do not really need it here, though, since you can use capturing groups:

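var re = /(?:\s|^)@(\S+)/g;
var str = 's  @vln1\n@vln2\n';
var res = [];
var m; // declared explicitly to avoid creating an implicit global
while ((m = re.exec(str)) !== null) {
  res.push(m[1]); // Group 1: the text after "@"
}
console.log(res); // ["vln1", "vln2"]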

The (?:\s|^)@(\S+) matches a whitespace or the start of string with (?:\s|^), then matches @, and then matches and captures into Group 1 one or more non-whitespace chars with (\S+).

To get the start/end indices, use

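var re = /(\s|^)@\S+/g;
var str = 's  @vln1\n@vln2\n';
var pos = [];
var m; // declared explicitly to avoid creating an implicit global
while ((m = re.exec(str)) !== null) {
  // Skip the captured leading whitespace (empty at the start of the string)
  pos.push([m.index + m[1].length, m.index + m[0].length]);
}
console.log(pos); // [[3, 8], [9, 14]]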
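For engines that do support ES2018 lookbehind, the pattern from the question should give the indices directly, since the lookbehind does not consume the whitespace. The following is only a sketch: `supportsLookbehind` and `findMentionSpans` are hypothetical helper names introduced here for illustration, and the fallback branch reuses the capturing-group offset trick shown above.

```javascript
// Hypothetical helper: detects lookbehind support at runtime.
// The pattern is built from a string so that older parsers do not
// reject the whole script on a regex literal they cannot parse.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: returns { name, start, end } for each "@word"
// that is at the start of the string or preceded by whitespace.
// "end" is exclusive, as with String.prototype.slice.
function findMentionSpans(str) {
  var spans = [];
  var m;
  if (supportsLookbehind()) {
    // Lookbehind consumes nothing, so m.index already points at "@".
    var re = new RegExp('(?:(?<=\\s)|^)@(\\S+)', 'g');
    while ((m = re.exec(str)) !== null) {
      spans.push({ name: m[1], start: m.index, end: m.index + m[0].length });
    }
  } else {
    // Fallback: capture the leading whitespace and skip over it.
    var reFallback = /(\s|^)@(\S+)/g;
    while ((m = reFallback.exec(str)) !== null) {
      spans.push({
        name: m[2],
        start: m.index + m[1].length,
        end: m.index + m[0].length
      });
    }
  }
  return spans;
}

console.log(findMentionSpans('s  @vln1\n@vln2\n'));
```

Both branches are meant to produce the same spans, so callers do not need to know which one ran.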

BONUS

My regex works at regex101.com, but not in...

Wiktor Stribiżew
  • My main goal is to get start and end indexes. – Kindzoku Sep 22 '16 at 10:12
  • The indexes of what? The pos after `@`? – Wiktor Stribiżew Sep 22 '16 at 10:13
  • The index of '@' and the end of the word. Right now, at the start of the string I get 0, but in the middle of the text I get index - 1 (because of the matched space) – Kindzoku Sep 22 '16 at 10:14
  • @Kindzoku let's backtrack - _why_ do you need the indices? Because I am not sure a regex would help, even if it worked. – VLAZ Sep 22 '16 at 10:16
  • Well, I added another snippet to return the list of start and end positions of `@\S+`-matching values. Really, no idea why you need them. – Wiktor Stribiżew Sep 22 '16 at 10:18
  • Actually, a lookbehind checks without consuming, so it works. Why is really not important. There are a lot of ways I can handle this. I just wanted to do it in an elegant way. I had `(?:\s|^)@(\S+)` at the start. I used the lookbehind to get rid of the space. – Kindzoku Sep 22 '16 at 10:19
  • So I guess it's not doable by regex then. – Kindzoku Sep 22 '16 at 10:20
  • Oh! That's it. Well, actually I still need the match, so `/(\s|^)@(\S+)/g`, but `m.index+m[1].length` - pretty smart! In my current solution I'm checking for a `' '` on zero index and do +1 in this case. But your solution is a lot better. Thx :) I think it'll be better to reduce the answer to second part. – Kindzoku Sep 22 '16 at 11:06