JavaScript Word Boundary with UTF-8 Characters

Asked Aug 22 '16 at 12:26

Active Aug 22 '16 at 12:41

Viewed 90 times

I'm trying to replace a word using word boundaries, but it doesn't work when a word begins with a non-ASCII (UTF-8) character. This is my code:

var text="Su çsu su çsuö suö";
var word="su";
var regex=new RegExp("\\b"+word+"\\b", "gi");
text=text.replace(regex,"<span style='color:red;'>"+word+"</span>");

console.log(text);

/* çsuö */

Every "su" gets replaced, including the one inside a word like this one: çsuö

edited Aug 22 '16 at 12:41

asked Aug 22 '16 at 12:26

ozen

Alas, you can't use word-boundaries with UTF-8. Maybe you could solve your problem by looking around for delimiters (spaces? punctuation?). – Aaron Aug 22 '16 at 12:30
I'm guessing it's because you are placing two "\" in front of the boundary keyword. Look at this example `'Su çsu su çsuö suö'.replace(/\b(su)\b/gi, '$1')` – n0m4d Aug 22 '16 at 12:33
@n0m4d: Looks like exactly the same result to me. – Whothehellisthat Aug 22 '16 at 12:37
Does this help? [SO](http://stackoverflow.com/questions/2881445/utf-8-word-boundary-regex-in-javascript) – Whothehellisthat Aug 22 '16 at 12:39
Standard Javascript regex word boundaries only match at "ASCII-character" words. There are several workarounds, up to using a different regex library. Which workaround you use depends on the circumstances. – Tomalak Aug 22 '16 at 12:39
Another workaround suggestion: https://github.com/mathiasbynens/es-regexp-unicode-character-class-escapes/blob/master/d-w-b.md – Tomalak Aug 22 '16 at 12:41
@Whothehellisthat That pattern doesn't replace the word at the beginning or end of the sentence. – ozen Aug 22 '16 at 12:52
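
One possible way to put the "look around for delimiters" suggestion from the comments into practice is sketched below. It is not taken from the thread: `highlightWord` is a hypothetical helper, and it assumes an engine that supports lookbehind assertions and Unicode property escapes (`\p{L}`, `\p{M}`, `\p{N}` with the `u` flag). Those features arrived with ES2018, so they were not generally available when this question was asked. Because the lookarounds also succeed at the start and end of the string, the sentence-edge problem mentioned in the last comment should not occur here.

```javascript
// Hypothetical helper (not from the original post): wraps whole-word matches
// of `word` in a red <span>, treating any Unicode letter, combining mark,
// digit, or underscore as a "word" character.
function highlightWord(text, word) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Negative lookbehind/lookahead stand in for \b.
  // \p{L}, \p{M} and \p{N} only work with the "u" flag.
  const regex = new RegExp(
    "(?<![\\p{L}\\p{M}\\p{N}_])" + escaped + "(?![\\p{L}\\p{M}\\p{N}_])",
    "giu"
  );
  // "$&" re-inserts the matched text, so the original casing ("Su") is kept.
  return text.replace(regex, "<span style='color:red;'>$&</span>");
}

const result = highlightWord("Su çsu su çsuö suö", "su");
console.log(result);
```

With the sample text from the question, only the standalone "Su" and "su" should end up wrapped, while "çsu", "çsuö" and "suö" stay untouched. On older engines without these features, a transpiler plugin or a different regex library (as Tomalak suggests) would be needed instead.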

0 Answers