I have a long, Markdown-formatted string which consists of repeated sections of one or more headers and a multi-line description, like so:
**[Title1](link1) brief description** flag1, flag2
commentary,
occasionally multi-line
---
**[Title2](link2) brief description** flag3, flag4
**[Title3](link3) brief description** flag5, flag6, flag7
commentary
---
...
This order is occasionally broken by other text, interwoven between `---` and the next header.
I want to process it with a JavaScript regex in order to capture the title, link, description and commentary in separate capture groups. Ideally, from the example given, I would like to get something like:
1st match:
group 1: Title1
group 2: link1
group 3: brief description
group 4: commentary,
occasionally multi-line
2nd match:
group 1: Title2
group 2: link2
group 3: brief description
group 4: Title3
group 5: link3
group 6: brief description
group 7: commentary
...
I'm not going to lie: my regex skills could use some polishing. Still, I managed to solve this problem when restricting it to a single header per item, using a regex akin to `/\*\*\[(.*)\]\((.*)\)\s+(.*)\*\*.*\s+((?:.*\s)*?)?---/g`. With an unspecified number of headers, I'm not sure how to gather the separate fragments into concise groups, because no matter what I try, I either get separate matches for headers belonging to one item, or the second and subsequent headers get mashed together with the commentary.
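To make the problem concrete, here is a minimal reproduction of the single-header attempt run against the sample from the top of the question. The variable names (`input`, `singleHeader`, `matches`) are only illustrative, and the sample is trimmed to the two items shown above.

```javascript
// Sample input from the question; `input` is a hypothetical variable name.
const input = [
  "**[Title1](link1) brief description** flag1, flag2",
  "commentary,",
  "occasionally multi-line",
  "---",
  "**[Title2](link2) brief description** flag3, flag4",
  "**[Title3](link3) brief description** flag5, flag6, flag7",
  "commentary",
  "---",
].join("\n");

// The single-header regex: title, link, description, then everything up to `---`.
const singleHeader = /\*\*\[(.*)\]\((.*)\)\s+(.*)\*\*.*\s+((?:.*\s)*?)?---/g;

// matchAll requires the `g` flag and yields one match array per item.
const matches = [...input.matchAll(singleHeader)];

for (const [, title, link, description, commentary] of matches) {
  console.log({ title, link, description, commentary });
}
```

On this input the first item comes out as intended, while for the second item group 4 starts with the whole `**[Title3](link3) ...` header line rather than just the commentary, which is the "mashed" behaviour described above.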
Is this possible with regex only? I would like to avoid splitting by item boundaries (`**[` and `---` in this case) and chopping it further from there, because that seems less elegant than a single regex match.