I need a regex that extracts the link text from the following HTML:

<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>

The result should be:

Actions and Sci-Fi
Muhammad Muazzam

1 Answer

Don't parse HTML with regex. If you insist, you can use the regex below and read the text inside each anchor tag from capture group 1.

<a\s[^<>]*>([^<>]*)<\/a>
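
As a minimal sketch of how this pattern might be applied, here is one way to run it in Python against the markup from the question. The choice of Python and the names `ANCHOR_TEXT`, `html`, `genres`, and `result` are assumptions made for illustration; the question does not say which language is in use.

```python
import re

# Pattern from the answer; group 1 captures the text between <a ...> and </a>.
ANCHOR_TEXT = re.compile(r'<a\s[^<>]*>([^<>]*)<\/a>')

# Sample markup copied from the question.
html = '<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>'

# With exactly one capture group, findall returns the group-1 strings.
genres = ANCHOR_TEXT.findall(html)

# Join the captured link texts the way the question asks.
result = " and ".join(genres)
print(result)  # Actions and Sci-Fi
```

Note that the pattern requires at least one attribute (the `\s` after `<a`) and does not match anchors whose content contains nested tags.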

Explanation:

<a                       '<a'
\s                       whitespace (\n, \r, \t, \f, and " ")
[^<>]*                   any character except '<' or '>' (0 or more times)
>                        '>'
(                        group and capture to \1:
  [^<>]*                   any character except '<' or '>' (0 or more times)
)                        end of \1
<                        '<'
\/                       '/'
a>                       'a>'
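
For comparison with the advice at the top of this answer, here is a rough sketch of the same extraction done with an HTML parser instead of a regex. It assumes Python and its standard-library `html.parser` module; the class name `AnchorTextCollector` is a hypothetical helper written for this example, not part of any library.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects the text content of each <a> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.texts = []     # finished link texts, in document order
        self._parts = None  # text chunks of the <a> being read, None outside one

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._parts = []  # start collecting text for this anchor

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)  # only keep text found inside an anchor

    def handle_endtag(self, tag):
        if tag == "a" and self._parts is not None:
            self.texts.append("".join(self._parts))
            self._parts = None


collector = AnchorTextCollector()
collector.feed('<a href="/?genre=Action">Actions</a> <a href="/?genre=Sci-Fi">Sci-Fi</a>')
collector.close()

parsed = " and ".join(collector.texts)
print(parsed)  # Actions and Sci-Fi
```

Unlike the regex, this sketch also picks up link text that is wrapped in nested tags, and it does not depend on the anchor having attributes.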
Avinash Raj