0

I need a RegEx to extract the table names and aliases from an expression like:

INNER JOIN dbo.table1 t ON t.x = table n.y
INNER JOIN table2 on table2.x = table n.y
INNER JOIN table3 t3 on t3.x = tablen.y

From the above query, I would want to capture:

(Table Name, Alias)
table1, t
table2, " " <-- No alias specified so blank or null
table3, t3

Note that there are the following ways to specify a table:

  1. with or without an owner such as dbo.table1
  2. with or without an alias

In any event, all I care about is the table name and the alias.

In case it makes a difference, the Query will be against a SQL Server database, and I will be using it inside a C# application.

I am familiar with iterating through the Matches of Regex Match Collection as well as the RegEx groups. I am struggling only with the proper RegEx expression to use.

stema
  • 80,307
  • 18
  • 92
  • 121

2 Answers2

2

You could use:

(?i)\bJOIN\s+([\w.]+)(?:\s+(?!ON\b)(\w+))?

But note that this could give invalid results, for example if you have quoted strings with matching content. In such cases you'd be better off using a proper SQL parser instead of regex.

Qtax
  • 31,392
  • 7
  • 73
  • 111
2

You are assuming that a regular expression is powerful enough to parse a context-free language like SQL.

It isn't.

You need to reconsider your strategy, and therefore your question, completely.

I would look for an SQL parser, in one of the recognized parser generators, e.g. YACC/Bison, ANTLR, JavaCC, ...

user207421
  • 289,834
  • 37
  • 266
  • 440
  • *[It is](http://stackoverflow.com/questions/7434272/match-an-bn-cn-e-g-aaabbbccc-using-regular-expressions-pcre)*. But using a proper parser is usually better. – Qtax Jun 11 '12 at 11:14
  • @QTax It *isn't*, by definition. Otherwise parsers and REs wouldn't be different things. Your citation does not support the assertion that SQL can be parsed with an RE. – user207421 Jun 11 '12 at 11:16
  • @EJP, see link. Regex in this context are not regular. – Qtax Jun 11 '12 at 11:17
  • @Qtax I saw the link and I read it. Nowhere does it say SQL can be parsed with a RE. If you have a counter-example that answers the OP's actual question please provide it. – user207421 Jun 11 '12 at 11:18
  • @EJP, it doesn't say that, it just invalidates your argument. – Qtax Jun 11 '12 at 11:19
  • @Qtax A counter-example, i.e. an RE that can parse a CFL, and specifically *this* CFL, would invalidate my argument. A claim that regexes are more powerful than strict regular expressions as defined by Chomsky does not. – user207421 Jun 11 '12 at 11:50
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/12384/discussion-between-qtax-and-ejp) – Qtax Jun 11 '12 at 11:56
  • @EJP, your argument is that regex cannot parse CFL. The link shows that they can. And since you state that SQL is CFL, your argument is invalid. – Qtax Jun 11 '12 at 12:04
  • @EJP I don't believe I am assuming anything and I have already rethought my strategy several times over and have landed here. I'm not trying to "...parse a context-free language like SQL" as you say. I'm trying to break down a SINGLE line pattern that starts with INNER JOIN. I've use RegEx before for similiar requirements, but alas - my RegEx Fu is no longer strong and I felt the need for assistance. So please, look at this with an open mind, as to how you would break down this single line pattern to extract 2 pieces of information (table and optional alias), rather than parse SQL. – Worlds Local Yank Jun 11 '12 at 13:18
  • @Qtax Your citation only shows that one extended RE can parse one CFL grammar. Not proven. You would have to show me a proof that an extended RE is equivalent to a DPDA. Arbitrary Internet posts won't do it. The Chomsky hierarchy and all that stuff comes with *real* proofs. – user207421 Jun 12 '12 at 08:03
  • @user1448758 So you are assuming that an extended RE can parse an INNER JOIN clause. – user207421 Jun 12 '12 at 08:04
  • @EJP I guess I can't argue that. I am trying to use RegEx to break down a line pattern so yes, I am assuming there is an RE pattern that will allow this. I'm failing to understand your point, however. By undertaking anything aren't you assuming it can be done? I've read your previous comments and understand that a SQL Parser would be better. But are you saying definitively that what I am asking for is impossible? – Worlds Local Yank Jun 12 '12 at 11:19
  • @WorldsLocalYank What part of 'it isn't' don't you understand? It is proven impossible by the results of Chomsky 1956. – user207421 Jan 17 '17 at 23:03