0

In my answer to this SO question I originally tried using regex to grab the javascript array that follows json_items = and ends lazily at first ";" i.e. captures everything in the javascript array [array] variable called json_items.

You can see the start of the match here:

enter image description here

And the regex is saved here. I can correctly match with json_items = (.*?); in both python and PCRE (PHP) if I set the re.DOTALL flag or single line flag respectively.

For example, in Python the following functions correctly:

import requests, re, json

data = {
  'inputType': '1',
  'stringInput': 'https://www.youtube.com/channel/UC43lrLHl4EhxaKQn2HrcJPQ',
  'limit': '100',
  'keyType': 'default'
}

r = requests.post('https://youtube-playlist-analyzer.appspot.com/submit',  data=data)
p = re.compile(r'json_items = (.*?);', re.DOTALL)
results = json.loads(p.findall(r.text)[0])

The array is retrieved. However, I cannot seem to find the right setting (or to adjust regex appropriately) to do the same in VBA. My best attempt is shown below. It simply fails to match. How do I grab the javascript array as discussed using Regex in VBA please?

Option Explicit
Public Sub RegexMatch()

    Dim s As String, ws As Worksheet, body As String, re As Object

    body = "inputType=1&stringInput=https://www.youtube.com/channel/UC43lrLHl4EhxaKQn2HrcJPQ&limit=100&keyType=default"

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set re = CreateObject("vbscript.regexp")

    With CreateObject("MSXML2.XMLHTTP")
        .Open "POST", "https://youtube-playlist-analyzer.appspot.com/submit", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send body
        s = .responseText
    End With

    Dim matches As Object

    With re
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .pattern = "json_items = (.*?);"
        If .Test(s) Then
            Set matches = .Execute(s)
            Debug.Print matches(0).SubMatches(0)
        Else
           Debug.Print "Failed"
        End If
    End With
End Sub

To preserve two answers given in comments:

FlorentB:

var json_items\s*=\s*((?:"(?:\\"|[^"])+"|[^;"]+)+)

Wiktor Stribiżew:

There is no option, use [^] or [\s\S] / [\d\D] / [\w\W] instead of a dot. 
QHarr
  • 72,711
  • 10
  • 44
  • 81

0 Answers0