3

I have a query and want to get table names between from & where If its a single line and single table without alias i could do so:

(?<=from )([^#]\w*)(?=.*where)

I need to get each table except the prefixed table. i.e course c marks s

But i can't figure out regex for the following Query. (The where clause could be in same line or new line, on start of line or with space or tab)

from #prefix#student, course c, marks m
where .... 

There are also sub queries in some places, if that case could also be handled would help.

select ... from course c
where id = (select ... from student where ...)

I'm trying to find & replace in sublime text 3 editor

Test case queries:

//output [course]
select ... from course
where ...

//output [course c] [marks s]    
select ... from course c, marks s
where ....

//output [marks m]  
select ... from #prefix#course c, marks m
where ...

//output [student s]  
select ... from #prefix#course c
where id = (select ... from student s where ...)
Bsienn
  • 1,312
  • 2
  • 18
  • 30
  • what's your expected output? – Avinash Raj Nov 13 '15 at 10:37
  • @AvinashRaj i have edited the question please check. i need to get all the table names so i can replace them – Bsienn Nov 13 '15 at 10:40
  • Not sure, but I guess you may try [`\bfrom([^w]*(?:\bw(?!here\b)[^w]*)*)where`](https://regex101.com/r/uT9zJ9/1). – Wiktor Stribiżew Nov 13 '15 at 10:56
  • If you do not want to match any substring starting with `#` after `from`, I'd rather use [`\bfrom(?!\s*#)([^w]*(?:\bw(?!here\b)[^w]*)*)where`](https://regex101.com/r/uT9zJ9/2). – Wiktor Stribiżew Nov 13 '15 at 11:04
  • oddly enough @stribizhev & bobble Bubble regex are working in fiddle but not in sublime text editor. my regex for single work though – Bsienn Nov 13 '15 at 11:11
  • @bobblebubble i need all table names from query & sub query, one query may have more than one table with aliases. – Bsienn Nov 13 '15 at 11:29
  • 1
    @Bsienn Thanks for answer! My regex was for the opposite. To exclude the tablenames in subqueries. Best to describe the problem as accurate as possible in question. – bobble bubble Nov 13 '15 at 12:17

1 Answers1

3

You can use the following regex:

\bfrom\b(?!\s*#)([^w]*(?:\bw(?!here\b)[^w]*)*)\bwhere\b

See the regex demo

Check Case sensitive option in case you need that.

If you need to just highlight all between from and where, use lookarounds:

(?<=\bfrom\b)(?!\s*#)([^w]*(?:\bw(?!here\b)[^w]*)*)(?=\bwhere\b)

See another demo and the screen with results:

enter image description here

Regex breakdown:

  • (?<=\bfrom\b) - check if there is a whole word from before the next...
  • (?!\s*#) - make sure there is no 0 or more whitespaces followed by #
  • ([^w]*(?:\bw(?!here\b)[^w]*)*) - match any text that is not where up to...
  • (?=\bwhere\b) - a whole word where.

UPDATE

Since you need to get comma-separated values excluding prefixed names with their aliases, you need a boundary-constrained regex. It can be achieved with \G operator:

(?:\bfrom\b(?:\s*#\w+(?:\s*\w+))*+|(?!^)\G),?\s*\K(?!(?:\w+ )?\bwhere\b)([\w ]+)(?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b)

Here,

  • (?:\bfrom\b(?:\s*#\w+(?:\s*\w+))*+|(?!^)\G) - matches from (as a whole word) followed by optional whitespace followed by # and 1 or more alphanumerics that are followed by whitespaces+alphanumerics (alias)
  • ,?\s*\K - optional (1 or 0) commas followed by 0 or more whitespaces that are followed by \K that forces the engine to omit the whole chunk of text matched so fat
  • (?!(?:\w+ )?\bwhere\b) - a restrictive lookahead with which we forbid the next or the word following the next word to be equal to where
  • ([\w ]+) - our match, 1 or more alphanumerics or space (may be replaced with [\w\h]+)
  • (?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b) - a trailing boundary: there must be text other than where up to the first where.
Wiktor Stribiżew
  • 484,719
  • 26
  • 302
  • 397
  • Can you please suggest me some recourse to learn regex plz? this is my 1st try on regex and i was banging my head. This worked for query & subb-query as well in same regex. Awesome. Thanks :) – Bsienn Nov 13 '15 at 11:14
  • 1
    This task is not a basic-level one. You need to learn look-arounds, negated character classes, word boundaries, unrolling-the-loop... Certainly you could just use `(?s)from.*?where` but what if the document is too long and contained `nowhere` somewhere in-between? I tried to come up with the safest regex, and it requires a lot of regex knowledge. – Wiktor Stribiżew Nov 13 '15 at 11:16
  • 1
    I updated the first regex with word boundaries. I can only suggest doing all lessons at [regexone.com](http://regexone.com/), reading through [regular-expressions.info](http://www.regular-expressions.info), [regex SO tag description](http://stackoverflow.com/tags/regex/info) (with many other links to great online resources), and the community SO post called [What does the regex mean](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean). Also, [rexegg.com](http://rexegg.com) is a very, very good Web site for regex learning (use a reference). – Wiktor Stribiżew Nov 13 '15 at 11:18
  • @strbizhev one more thing. in my query there are more than 1 table comma separated. this regex isn't getting individual table if its 2 tables or more. it gets whole string within from & to in that case. could you help? – Bsienn Nov 13 '15 at 11:28
  • Try [`(?:\bfrom\b(?!\s*#)|(?!^)\G),?\s*\K(?!\bwhere\b)(\w+)(?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b)`](https://regex101.com/r/uT9zJ9/6). I do not see any examples, so I am just guessing you do not want to return `student c`, but `student`, `c` as separate matches. If not, use [`(?:\bfrom\b(?!\s*#)|(?!^)\G),?\s*\K(?!\bwhere\b)([\w ]+)(?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b)`](https://regex101.com/r/uT9zJ9/7). – Wiktor Stribiżew Nov 13 '15 at 11:38
  • thanks for reply, i have an expample in my question. its: `from #prefix#student, course c, marks s where .... ` the table i want are `course c` & `marks s`. aliase are not necessory to get but tables not having prefix. sorry for late reply – Bsienn Nov 13 '15 at 11:52
  • 1
    Then, try [`(?:\bfrom\b(?:\s*#\w+)*|(?!^)\G),?\s*\K(?!\bwhere\b)([\w ]+)(?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b)`](https://regex101.com/r/uT9zJ9/8). – Wiktor Stribiżew Nov 13 '15 at 11:54
  • just 1 problem in that, its getting alias of prefixed table. if it won't get that. will solve my problem 100% https://regex101.com/r/lJ4rW3/1 – Bsienn Nov 13 '15 at 12:07
  • Mind-blowing.... This [`(?:\bfrom\b(?:\s*#\w+(?:\s*\w+)?)*|(?!^)\G),?\s*\K(?!\bwhere\b)([\w ]+)(?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b)`](https://regex101.com/r/lJ4rW3/2) should work. – Wiktor Stribiżew Nov 13 '15 at 12:14
  • in the last query the prefixed query have alias that is getting in match. shouldn't get matched as single character alias `c` otherwise its almost what i desire. https://regex101.com/r/lJ4rW3/2 – Bsienn Nov 13 '15 at 12:38
  • 1
    Try [`(?:\bfrom\b(?:\s*#\w+(?:\s*\w+))*+|(?!^)\G),?\s*\K(?!(?:\w+ )?\bwhere\b)([\w ]+)(?=[^w]*(?:\bw(?!here\b)[^w]*)*\bwhere\b)`](https://regex101.com/r/lJ4rW3/3). – Wiktor Stribiżew Nov 13 '15 at 12:50
  • 1
    This solved the issue. Thanks. Thanks a lot, you helped me a lot and those tutorials are great. Thanks – Bsienn Nov 13 '15 at 13:06