
What regular expression would match a header that starts with a number, such as "1. Humility"?

Here's a screenshot of the sample data: http://www.knowledgenotebook.com/issue/sampleData.html

Thanks.

Don

1 Answer


I don't know which regex flavor you're using, so I assume it's Perl compatible.
You should always post some example data in case your perceptions of regex are unclear.

Breaking down what your 'Stop signs' are:

## left out of regex, this could be anything up here
##
(?:              # Start of non-capture group         START sign
     \d+\.           # 1 or more digits followed by '.'
   |              # or
     \(\d+\)         # '(' followed by 1 or more digits followed by ')'
                     # note that in some flavors (e.g. POSIX basic regex) \( starts a capture group
)                # End group
\s?              # 0 or 1 whitespace (includes \n)
[^\n<]+          # 1 or more of not \n AND not '<'    STOP signs

It seems you want all characters after the group, up to but not including the
very next \n or the very next '<'. In that case you should get rid of the \s?,
because \s includes newline: if it matches a newline here, [^\n<]+ will go on to
match the text of the following line.

(?:\d+\.|\(\d+\))[^\n<]+
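
To illustrate the difference, here is a minimal sketch in Python. It assumes Python's `re` module behaves like a Perl-compatible engine for these particular constructs, and the sample text is made up for the demonstration rather than taken from the question.

```python
import re

# Hypothetical sample text (not from the question): the second "header"
# has a newline right after the number.
text = "1. Humility<br>\n2.\nnot a header\n(3) Patience\n"

# Original pattern: \s? may consume the newline after "2.",
# so the match runs on into the next line.
with_ws = re.findall(r"(?:\d+\.|\(\d+\))\s?[^\n<]+", text)

# Simplified pattern: without \s?, a number followed directly
# by a newline is no longer matched.
without_ws = re.findall(r"(?:\d+\.|\(\d+\))[^\n<]+", text)

print(with_ws)     # ['1. Humility', '2.\nnot a header', '(3) Patience']
print(without_ws)  # ['1. Humility', '(3) Patience']
```

Both patterns stop at the first '<' or newline after the header text; only the first one can cross a line break between the number and the text.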

Edit - After viewing your sample, it appears that you are searching unrendered HTML
pasted into HTML content. In that case the header appears to be
'1. Self-Knowledge&lt;br&gt;', which, once the entities are converted, would be
1. Self-Knowledge<br>

  1. Self-Knowledge
    Superior leadership ...

You can add the entity to the mix so that all your bases are covered (i.e. entity, \n, <):

((?:\d+\.|\(\d+\)))[^\S\n]+((?:(?!&lt;|[\n<]).)+)

Where:
Capture group1 = '1.'
Capture group2 = 'Self-Knowledge'
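
As a quick check of the capture groups, here is a small Python sketch. The sample string is an assumption modeled on the header quoted above, with one extra made-up line that uses a literal '<', and Python's `re` is assumed to treat this pattern the same way a Perl-compatible engine would.

```python
import re

# The pattern from above: group 1 is the number marker, group 2 is the
# header text up to the first "&lt;" entity, '<', or newline.
header_re = re.compile(r"((?:\d+\.|\(\d+\)))[^\S\n]+((?:(?!&lt;|[\n<]).)+)")

# Hypothetical sample: one header ending in an escaped <br>,
# one ending in a literal <br>.
sample = (
    "1. Self-Knowledge&lt;br&gt;\n"
    "Superior leadership ...\n"
    "(2) Humility<br>\n"
)

headers = header_re.findall(sample)
for number, title in headers:
    print(number, title)
# 1. Self-Knowledge
# (2) Humility
```

The `[^\S\n]+` part requires at least one whitespace character that is not a newline between the number and the text, so a number at the end of a line is not treated as a header.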

Other than that, I don't know what it could be.