Suppose that *a* is a Java identifier. I would like a regex to match things like this:

#a or #a.a.a (with ".a" repeated any number of times)

but not this:

#a. (ending with a dot)

So in a phrase like "#a.a less than #a." it would match only the first #a.a (because the second one, #a., ends with a dot).

This regex:

#[a-zA-Z_$][\\w$]*(\\.[a-zA-Z_$][\\w$]*)*

almost does the job, but it matches the last case too.
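To make the problem concrete, here is a minimal sketch that runs the regex above against the example phrase. The class name `OriginalRegexDemo` and the helper `findAll` are made up for illustration; only `java.util.regex` is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OriginalRegexDemo {
    // The regex from the question, written as a Java string literal.
    static final Pattern ORIGINAL =
        Pattern.compile("#[a-zA-Z_$][\\w$]*(\\.[a-zA-Z_$][\\w$]*)*");

    // Hypothetical helper: collects every substring the pattern finds.
    static List<String> findAll(String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = ORIGINAL.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // The trailing "#a." still yields "#a", which is the unwanted match.
        System.out.println(findAll("#a.a less than #a.")); // prints [#a.a, #a]
    }
}
```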

Thank you.

Marcos

cathulhu
Marcos
  • Possible duplicate: http://stackoverflow.com/questions/5205339/regular-expression-matching-fully-qualified-class-names – Lii May 02 '16 at 10:39

2 Answers

2

This can be done with a negative lookahead. The pattern first matches "#" followed by an identifier, then zero or more repetitions of "." followed by an identifier. The lookahead then rejects the match if what follows is zero or more identifier characters and then a period. This assumes the i (case-insensitive) modifier is on.

At first I only checked that the match was not followed by a period, but then the regex engine simply backtracks and drops the last character of the match.

#([a-z_$][a-z_$\d]*)(\.[a-z_$][a-z_$\d]*)*(?![a-z_$\d]*\.)

Results

#abc           => YES
#abc.abc       => YES
#abc.a23.abc   => YES
#abc.abc.abc.  => NO
#abc.2bc.abc   => NO
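The results above, and the backtracking issue with the simpler lookahead, can be checked with a small Java sketch. The class name `LookaheadDemo`, the helper `firstMatch`, and the `NAIVE` pattern (a reconstruction of the "not followed by a period" first attempt) are assumptions for illustration; `Pattern.CASE_INSENSITIVE` stands in for the i modifier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Assumed first attempt: only forbid a period right after the match.
    static final Pattern NAIVE = Pattern.compile(
        "#([a-z_$][a-z_$\\d]*)(\\.[a-z_$][a-z_$\\d]*)*(?!\\.)",
        Pattern.CASE_INSENSITIVE);

    // The pattern from this answer: also forbid "identifier chars, then a period".
    static final Pattern FIXED = Pattern.compile(
        "#([a-z_$][a-z_$\\d]*)(\\.[a-z_$][a-z_$\\d]*)*(?![a-z_$\\d]*\\.)",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: first match in the input, or null if there is none.
    static String firstMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "#abc", "#abc.abc", "#abc.a23.abc", "#abc.abc.abc.", "#abc.2bc.abc"
        };
        for (String s : inputs) {
            // Prints the matched text for the YES rows and null for the NO rows.
            System.out.println(s + " => " + firstMatch(FIXED, s));
        }
        // The naive lookahead backtracks by one character instead of failing.
        System.out.println(firstMatch(NAIVE, "#abc.abc.abc.")); // prints #abc.abc.ab
    }
}
```

Note that `find()` is used rather than `matches()`, since the goal is to locate references inside longer text.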

Daniel Gimenez
  • @Marcos: added digits. The accepted answer also did not work for digits. – Daniel Gimenez Jul 18 '13 at 12:50
  • The complete regex that I'm using is this: (?i)#[a-zA-Z_$][\\w$]*(?:\\.[a-zA-Z_$][\\w$]*)*(?!\\w*\\.) So it works with digits. – Marcos Jul 18 '13 at 12:54
  • @DanielGimenez: My answer definitely works with digits, you can try yourself. – anubhava Jul 18 '13 at 12:55
  • @anubhava you're right. I suppose our answers are redundant because at the end I reached the same answer you had without the `\w`. I will delete in a few after I know you read this comment. – Daniel Gimenez Jul 18 '13 at 13:01
  • @DanielGimenez: Yes at this point I would think both answers look same (after you start using `\w`). But I would say just leave it like this, why delete. – anubhava Jul 18 '13 at 13:16
2

You almost got it right but some minor adjustments are needed. Consider this regex:

#[A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*(?!\w*\.)

Live Demo: http://www.rubular.com/r/kJbSJKHhtv

Translated to Java:

(?i)#[a-z_$][\\w$]*(?:\\.[a-z_$][\\w$]*)*(?!\\w*\\.)
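
As a rough usage sketch, the Java form of the regex can be compiled once and used to pull every reference out of a piece of text. The class name `RefExtractor` and the method `extract` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RefExtractor {
    // The Java-escaped regex from above; (?i) makes a-z case-insensitive.
    static final Pattern REF = Pattern.compile(
        "(?i)#[a-z_$][\\w$]*(?:\\.[a-z_$][\\w$]*)*(?!\\w*\\.)");

    // Hypothetical helper: returns all matches found in the text, in order.
    static List<String> extract(String text) {
        List<String> refs = new ArrayList<>();
        Matcher m = REF.matcher(text);
        while (m.find()) {
            refs.add(m.group());
        }
        return refs;
    }

    public static void main(String[] args) {
        System.out.println(extract("#a.a less than #a."));  // prints [#a.a]
        System.out.println(extract("#Abc.a23 and #x."));    // prints [#Abc.a23]
    }
}
```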
anubhava