How do I match every 1st, 4th, 7th, 10th highlighted line (starting at line 1, then +3) with regex?

Question

I fetched the page source of an Amazon best-sellers page and used a regex to find the names of the monitors I want. On Amazon the monitors are shown three per row, with prices. I essentially want the monitor at the start of each row, which means every 1st, 4th, 7th match and so on.

https://www.amazon.com/Best-Sellers-Electronics-Computer-Monitors/zgbs/electronics/1292115011

My regex is: `(?<='true'>\n\s+)\w.*.*(?=\n\s+</div>)`

How do I get every 1st, 4th, 7th, 10th highlighted line (starting at line 1, then +3) with regex? Maybe with find and replace?

Your link is broken. You should include the text you are trying to match directly in the question. The question already has a bit of a smell, because it looks like you are trying to parse HTML using regex. Instead, consider using an HTML parser. — Tim Biegeleisen, Jul 16 '17 at 14:55
@TimBiegeleisen Any recommendations? Hmm, it looks like the HTML parsers are for Java or Python. Looks like I'll be installing quite a lot — , Jul 16 '17 at 15:01
@TimBiegeleisen It works when I copy and paste it. For some reason stack overflow seems to create an error — , Jul 16 '17 at 15:17
You are using EditPad Pro, I suggest you do not add Notepad++ tag to your question. — Wiktor Stribiżew, Jul 16 '17 at 17:27
If this is for Notepad++, as tagged before the edit: you can identify the 1st, 4th, 7th item, e.g. by its rank number. In Notepad++, check the *. matches newline* checkbox and try a regex like [`zg_rankNumber[^>]+>\s*\b(?:[147]|1[0369]|2[258])\b.*?'true'>\s*\K[^ — bobble bubble, Jul 16 '17 at 17:30
@bobblebubble Thank you. I tried that in notepad++ and editpad and it gave me an error. In notepad++ I got Search "zg_rankNumber[^>]+>\s*\b(?:[147]|1[0369]|2[258])\b.*?'true'>‌\s*\K[^ — , Jul 17 '17 at 06:44
@bobblebubble I will try to modify it to see if it works in EditPad Pro. `\K` is not supported there, so I have to use something else — , Jul 17 '17 at 07:30
Hmm have been trying to modify.. not really sure how to use a \K alternative as it just does not work in my job. I also am not too sure how to get every nth match either. I will keep on digging... — , Jul 17 '17 at 08:35
If your tool supports lookbehind of variable length, try something like `(?<=_(?:[147]|1[0369]|2[258])\?[^>]*>(?:]*>\s*){3}]*'true'>\s*)\b[^ — bobble bubble, Jul 17 '17 at 10:43
@bobblebubble Hmmm no such luck. Did it work for you in editpad? I'll try and change around a few things — , Jul 17 '17 at 11:12
Na it worked for me in [regexstorm tool](http://www.regexstorm.net/tester). — bobble bubble, Jul 17 '17 at 11:14
@bobblebubble Uh okay. I think a lot of regex editors vary in capabilities I guess. Hopefully I can get it working in editpadpro as the job I have tends to have capabilities similar to that. — , Jul 17 '17 at 11:23
@bobblebubble Are there any guides on getting nth line of a match or something of the sort. Might help me work out what to change. — , Jul 17 '17 at 11:45
You're parsing HTML. If you view the source, you'll see that it's not every nth line you need but every 3rd enumerated item. You need an identifier from the source (I used the link or the rank number in the first regex). — bobble bubble, Jul 17 '17 at 12:38
@bobblebubble I can't really understand how [147]|1[0369]|2[258] represent anything I'm viewing in the source. Or how that would get every 3rd? Sorry — , Jul 18 '17 at 01:57
In the first pattern I used `zg_rankNumber`; in the second pattern, this part of the link: `/ref=zg_bs_1292115011_1?`, `/ref=zg_bs_1292115011_4?`, `/ref=zg_bs_1292115011_7?`. It's a responsive website: resize your browser window and you get two items per row instead of three. You need some item number from the source to get every 3rd item. — bobble bubble, Jul 18 '17 at 09:25
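To make the `[147]|1[0369]|2[258]` part less mysterious: it is just a spelled-out list of the rank numbers 1, 4, 7, ..., 28, i.e. every third number starting at 1. A minimal Python sketch that checks this for ranks 1 to 30 (the range of 30 is an assumption for illustration, not something stated in the thread):

```python
import re

# Alternation copied from the comments above. fullmatch() requires the whole
# number to match, which plays the role of the \b...\b anchors in the original.
RANK = re.compile(r"(?:[147]|1[0369]|2[258])")

# Ranks the alternation accepts, out of an assumed 1..30.
matched = [n for n in range(1, 31) if RANK.fullmatch(str(n))]

# Every third rank starting at 1, computed arithmetically for comparison.
every_third = list(range(1, 31, 3))

print(matched)
print(matched == every_third)
```

Each branch covers one decade: `[147]` handles the single-digit ranks, `1[0369]` the teens, and `2[258]` the twenties.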
@bobblebubble Thanks. Are there any guides online or anywhere you can point me to in the right direction so I can use this. I understand what you're saying but I feel like I would not be able to replicate. — , Jul 20 '17 at 13:39
You can download [notepad++](https://notepad-plus-plus.org/download/v7.4.2.html) and use [this pattern](https://regex101.com/r/Hnum3C/1). If you need a regex tutorial, try [regular-expressions.info](http://www.regular-expressions.info/) or see the [SO regex faq](https://stackoverflow.com/a/22944075/5527985). Wish you good luck. — bobble bubble, Jul 20 '17 at 14:47
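If a scripting language is an option, "every nth match" is usually easier to do outside the regex: collect all matches first, then keep every third one. The sketch below is only an illustration under assumptions. The `sample` markup is a hypothetical stand-in (the real Amazon source is not shown in the question), and the pattern is the one from the question with the lookbehind turned into a plain prefix plus a capture group, because Python's `re` module does not accept variable-length lookbehinds.

```python
import re

# Hypothetical stand-in for the page source: seven items, one name per block.
# The attribute name is made up; only the 'true'> ... </div> shape follows
# the regex in the question.
sample = "".join(
    f"<div aria-hidden='true'>\n    Monitor {i}\n  </div>\n" for i in range(1, 8)
)

# Same idea as the question's regex, with the name in a capture group
# instead of a lookbehind.
NAME = re.compile(r"'true'>\n\s+(\w.*)(?=\n\s+</div>)")

names = NAME.findall(sample)   # all names, in source order
first_in_row = names[::3]      # 1st, 4th, 7th, ... match

print(first_in_row)
```

This relies on each item matching exactly once and on the matches appearing in rank order. If that does not hold for the real page, keying on an identifier from the source, as suggested in the comments, is the safer route.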

0 Answers