What regular expression can I use to search for a header that starts with a number, such as "1. Humility"?
Here's a screenshot of the sample data: http://www.knowledgenotebook.com/issue/sampleData.html
Thanks.
I don't know which regex flavor you're using, so I'll assume it's Perl compatible.
You should always post some example data in case your perceptions of regex are unclear.
Breaking down what your 'stop signs' are:
## left out of regex, this could be anything up here
##
(?: # Start of non-capture group START sign
\d+\. # 1 or more digits followed by '.'
| # or
\(\d+\) # '(' followed by 1 or more digits followed by ')'
# note that an unescaped ( here would instead start capture group 1
) # End group
\s? # 0 or 1 whitespace (includes \n)
[^\n<]+ # 1 or more chars that are neither \n nor '<' (the STOP signs)
It seems you want all chars after the group, up to but not including the
very next \n or the very next '<'. In that case you should get rid of the \s?:
because \s includes newline, if it matches a newline here, the match will carry on
into the following line until [^\n<]+ is satisfied.
(?:\d+\.|\(\d+\))[^\n<]+
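To see the difference the \s? makes, here is a minimal sketch in Python (an assumption on my part, since the regex flavor was never stated; Python's re module handles this syntax the same way). The sample string and the find_headers helper are made up for illustration.

```python
import re

# Pattern as broken down above, with the optional \s? still in place.
with_ws = re.compile(r"(?:\d+\.|\(\d+\))\s?[^\n<]+")
# Same pattern with \s? removed, as suggested.
without_ws = re.compile(r"(?:\d+\.|\(\d+\))[^\n<]+")

# Hypothetical sample: a bare "1." on its own line, then a real header.
sample = "1.\nnot a header\n2. Humility<br>"


def find_headers(pattern, text):
    """Return every substring of text matched by pattern."""
    return [m.group(0) for m in pattern.finditer(text)]


loose = find_headers(with_ws, sample)
strict = find_headers(without_ws, sample)

print(loose)   # the \s? version swallows the newline after "1."
print(strict)  # the stricter version only reports "2. Humility"
```

With \s? in place, the bare "1." pulls in the next line; without it, only the real header on a single line is matched.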
Edit: After viewing your sample, it appears that you are searching unrendered HTML
pasted into HTML content. In that case the header appears to be:
'1. Self-Knowledge&lt;br&gt;'
which, when the entities are converted, would be
1. Self-Knowledge<br>
You can add the entity to the mix so that all your bases are covered (i.e. entity, \n, <):
((?:\d+\.|\(\d+\)))[^\S\n]+((?:(?!&lt;|[\n<]).)+)
Where:
Capture group1 = '1.'
Capture group2 = 'Self-Knowledge'
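As a quick check of the two capture groups, here is a rough Python sketch (again assuming a Perl-compatible flavor; Python's re behaves the same for this pattern). It assumes the entity in question is the literal text &lt; and uses three made-up sample strings, one for each stop sign: the entity, a raw '<', and a newline. The split_header helper is hypothetical.

```python
import re

# Group 1: the number marker ("1." or "(1)").
# [^\S\n]+ : one or more whitespace chars that are not a newline.
# Group 2: header text, stopping before "&lt;", "<" or a newline.
header = re.compile(r"((?:\d+\.|\(\d+\)))[^\S\n]+((?:(?!&lt;|[\n<]).)+)")

# Hypothetical samples, one per stop sign.
samples = [
    "1. Self-Knowledge&lt;br&gt;",  # stops at the entity
    "(2) Humility<br>",             # stops at a raw '<'
    "3. Courage\nbody text",        # stops at the newline
]


def split_header(text):
    """Return (marker, title) for the first header found, or None."""
    m = header.search(text)
    return (m.group(1), m.group(2)) if m else None


results = [split_header(s) for s in samples]
for r in results:
    print(r)
```

Each sample should come back as a (marker, title) pair with the trailing tag, entity, or next line left out.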
Other than that, I don't know what it could be.