Questions tagged [lookahead]

Lookahead is a zero-length assertion used in regular expressions and parsing; the term also names a depth-limiting technique in combinatorial search.

Lookahead is a zero-length assertion that checks the text ahead of the current position without consuming it. It is frequently used in regular expressions to assert whether or not a match is possible at that point. Further explanation in this tutorial.
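As a minimal sketch of the regex sense described above, the following uses Python's standard `re` module. The pattern names (`positive`, `negative`, `all_words`) and the sample strings are made up for illustration and do not come from any of the questions listed here.

```python
import re

# Positive lookahead: match "foo" only if "bar" follows.
# The "bar" part is checked but not consumed, so it is not in the match.
positive = re.compile(r"foo(?=bar)")

# Negative lookahead: match "foo" only if "bar" does NOT follow.
negative = re.compile(r"foo(?!bar)")

m = positive.search("foobar")
print(m.group(), m.span())                 # foo (0, 3)
print(negative.search("foobar"))           # None
print(negative.search("foobaz").group())   # foo

# Stacked lookaheads all start from the same position, so they act
# like a logical AND: both words must appear, in any order.
# (Hypothetical example words; re.DOTALL lets "." cross newlines.)
all_words = re.compile(r"^(?=.*\bred\b)(?=.*\bblue\b)", re.DOTALL)
print(bool(all_words.search("blue sky, red sun")))  # True
print(bool(all_words.search("red only")))           # False
```

Because each lookahead consumes no characters, several of them can be placed at the same position, which is why the stacked form is a common way to require multiple conditions at once.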

The term is also used in combinatorial search, where it determines how deep the graph representing the problem is explored, which allows better control of the search algorithm's running time. (Source: Wikipedia)

376 questions
803
votes
14 answers

Regular Expressions: Is there an AND operator?

Obviously, you can use the | (pipe?) to represent OR, but is there a way to represent AND as well? Specifically, I'd like to match paragraphs of text that contain ALL of a certain phrase, but in no particular order.
hugoware
  • 33,265
  • 24
  • 58
  • 70
78
votes
4 answers

Does lookaround affect which languages can be matched by regular expressions?

There are some features in modern regex engines which allow you to match languages that couldn't be matched without that feature. For example the following regex using back references matches the language of all strings that consist of a word that…
sepp2k
  • 341,501
  • 49
  • 643
  • 658
70
votes
3 answers

Regular expression negative lookahead

In my home directory I have a folder drupal-6.14 that contains the Drupal platform. From this directory I use the following command: find drupal-6.14 -type f -iname '*' | grep -P 'drupal-6.14/(?!sites(?!/all|/default)).*' | xargs tar -czf…
themesandmodules
  • 898
  • 1
  • 6
  • 7
34
votes
2 answers

String negation using regular expressions

Is it possible to do string negation in regular expressions? I need to match all strings that do not contain the string "..". I know you can use ^[^\.]*$ to match all strings that do not contain "." but I need to match more than one character. I…
Paul Bevis
  • 811
  • 1
  • 8
  • 14
21
votes
4 answers

LR(1) Item DFA - Computing Lookaheads

I have trouble understanding how to compute the lookaheads for the LR(1)-items. Let's say that I have this grammar: S -> AB; A -> aAb | a; B -> d. An LR(1)-item is an LR(0)-item with a lookahead. So we will get the following LR(0)-item for state 0: S ->…
mrjasmin
  • 1,124
  • 5
  • 16
  • 31
20
votes
3 answers

How does the regular expression ‘(?<=#)[^#]+(?=#)’ work?

I have the following regex in a C# program, and have difficulties understanding it: (?<=#)[^#]+(?=#) I'll break it down to what I think I understood: (?<=#) a group, matching a hash. what's `?<=`? [^#]+ one or more non-hashes (used to…
knittl
  • 197,664
  • 43
  • 269
  • 318
19
votes
2 answers

Javascript won't split using regex

Since I started writing this question, I think I figured out the answers to every question I had, but I thought I'd post anyway, as it might be useful to others and more clarification might be helpful. I was trying to use a regular expression with…
user45743
  • 449
  • 1
  • 6
  • 8
18
votes
3 answers

How to match multiple words in regex

Just a simple regex I don't know how to write. The regex has to make sure a string matches all 3 words. I see how to make it match any of the 3: /advancedbrain|com_ixxocart|p\=completed/ but I need to make sure that all 3 words are present in the…
UpHelix
  • 4,666
  • 10
  • 54
  • 82
18
votes
8 answers

Using lookahead with generators

I have implemented a generator-based scanner in Python that tokenizes a string into tuples of the form (token type, token value): for token in scan("a(b)"): print token would print ("literal", "a") ("l_paren", "(") ... The next task implies…
jena
  • 6,497
  • 1
  • 20
  • 23
15
votes
4 answers

REGEX - Matching any character which repeats n times

How to match any character which repeats n times? Example: for input: abcdbcdcdd for n=1: .......... for n=2: ......... for n=3: .. ..... for n=4: . . .. for n=5: no matches After several hours my best is this…
ferit
  • 4,937
  • 3
  • 27
  • 49
12
votes
4 answers

Regex to match all permutations of {1,2,3,4} without repetition

I am implementing the following problem in Ruby. Here's the pattern that I want: 1234, 1324, 1432, 1423, 2341 and so on, i.e. the digits in the four-digit number should be in the range [1-4] and should not repeat. To make you understand in a…
Apoorv Saxena
  • 3,736
  • 9
  • 27
  • 44
10
votes
4 answers

Need a regex to match a variable length string of numbers that can't be all zeros

I need to validate an input on a form. I'm expecting the input to be a number of 1 to 19 digits. The input can also start with zeros. However, I want to validate that the digits are not all zeros. I've got a regex that will ensure that the input is…
Notorious2tall
  • 1,418
  • 4
  • 16
  • 26
10
votes
1 answer

lookahead regex in nginx location

I'm trying to match /category/anything, except /category/paid in nginx location. I have the following regex, but it's not working. Google tells me that I can use lookahead in nginx. Am I doing something wrong? location ^/category(?!/paid)/ { }
Moon
  • 20,835
  • 65
  • 174
  • 263
9
votes
3 answers

Regex: negative look-ahead between two matches

I'm trying to build a regex somewhat like this: [match-word] ... [exclude-specific-word] ... [match-word] This seems to work with a negative look-ahead, but I'm running into a problem when I have a case like this: [match-word] ...…
Alexander Malfait
  • 2,611
  • 1
  • 20
  • 22
9
votes
3 answers

How to use regex lookahead to limit the total length of input string

I have this regular expression and want to add a rule that limits the total length to no more than 15 characters. I saw some lookahead examples, but they're not quite clear. Can you help me modify this expression to support the new rule? ^([A-Z]+(…
AustinTX
  • 1,112
  • 3
  • 19
  • 27