I was going through a code snippet for a web crawler written in Java. It uses String url = "http://www.wikipedia.org/" to start the crawl and the regular expression "http://(\w+\.)*(\w+)". Can anybody explain what this regular expression means?

user3526905
  • Please consider bookmarking the [Stack Overflow Regular Expressions FAQ](http://stackoverflow.com/a/22944075/2736496) for future reference. An additional link that may be of interest is [matching urls](http://stackoverflow.com/a/190405/2736496), which is listed under "Common Validation Tasks". – aliteralmind Apr 12 '14 at 18:20

1 Answer

Well, let's look at the documentation, shall we?

  • The text http:// will be matched literally.
  • (...) denote capture groups
  • \w means "a word character" (a letter, a digit, or an underscore)
  • + means "one or more of the previous thing"
  • \. means a literal dot (.)
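  • * means "zero or more of the previous thing" (here, the whole capture group (\w+\.), so any number of word-plus-dot segments such as www. or en.)
  • Then another capture group, (\w+), for one or more word characters (the last part of the host name, such as org)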
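
To see how those pieces behave together, here is a minimal sketch in Java using the standard java.util.regex classes. The class name UrlRegexDemo, the helper firstMatch, and the sample strings are made up for illustration and are not from the crawler in the question. Note that inside a Java string literal each backslash has to be doubled, so \w is written as \\w.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlRegexDemo {
    // The regex from the question; backslashes are doubled for the Java string literal.
    static final Pattern URL = Pattern.compile("http://(\\w+\\.)*(\\w+)");

    // Hypothetical helper: returns the first substring of text that matches,
    // or null if nothing matches.
    static String firstMatch(String text) {
        Matcher m = URL.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The trailing slash is not a word character, so the match stops before it.
        System.out.println(firstMatch("see http://www.wikipedia.org/ for more"));
        // "https://" does not contain the literal text "http://", so there is no match.
        System.out.println(firstMatch("https://example.com"));
    }
}
```

Because \w covers only letters, digits, and underscores, the match ends at the first slash, hyphen, or other non-word character after the host name. For the first sample string the whole match is http://www.wikipedia.org, and since a repeated capture group keeps only its last repetition, group 1 holds wikipedia. and group 2 holds org.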
T.J. Crowder