I need to declare a string in VBA for use as a regular expression pattern.
The string is: (?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")
Usually, to declare a string in VBA for use in a RegExp, you enclose it in double quotes, so it would look like this: "(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")". But that results in a VBA compile error, "Expected: end of statement", with [a-zA-Z0-9.-] highlighted.
This: "(?<="""[a-zA-Z0-9.-]*\d{8}.xml(?=""")" results in the same error.
This: "(?<=""""[a-zA-Z0-9.-]*\d{8}.xml(?="""")"
compiles, but when I use MsgBox to view the pattern it appears like this:
(?<=""[a-zA-Z0-9.-]*\d{8}.xml(?="")
with two double quotes in each place instead of one, so it won't work correctly as a RegExp pattern.
Arghhhh!
Here's the code I'm using for testing:
Sub tester()
    Dim PATH_TO_FILINGS As String
    'PATH_TO_FILINGS = "www.sec.gov/Archives/edgar/data/1084869/000110465913082760"
    PATH_TO_FILINGS = "www.sec.gov/Archives/edgar/data/1446896/000144689612000023"
    MsgBox GetInstanceDocumentPath(PATH_TO_FILINGS)
End Sub

Function GetInstanceDocumentPath(PATH_TO_FILINGS As String)
    'this part launches IE and goes to the correct directory
    If IEbrowser Is Nothing Then
        Set IEbrowser = CreateObject("InternetExplorer.application")
        IEbrowser.Visible = False
    End If
    IEbrowser.Navigate URL:=PATH_TO_FILINGS
    While IEbrowser.Busy Or IEbrowser.readyState <> 4: DoEvents: Wend
    'this part starts the regular expression engine and searches for the reg exp pattern (i.e. the file name)
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = "(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")" '"\w+(?=-)(-)\d{8}(.xml)"
    MsgBox RE.Pattern
    RE.IgnoreCase = True
    Dim INSTANCEDOCUMENT As Object
    Set INSTANCEDOCUMENT = RE.Execute(IEbrowser.Document.body.innerhtml)
    If INSTANCEDOCUMENT.Count = 1 Then
        GetInstanceDocumentPath = PATH_TO_FILINGS & "/" & INSTANCEDOCUMENT.Item(0)
    End If
End Function
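The quoting problem can also be reproduced on its own, without IE or the RegExp object. Here is a minimal sketch; QuoteTest and pat are made-up names for illustration and are not part of my real code. It only covers the one variant that compiles:

```vba
Sub QuoteTest()
    ' Hypothetical helper, only for reproducing the quoting behaviour.
    Dim pat As String

    ' Four double quotes in each place inside the literal: this compiles.
    pat = "(?<=""""[a-zA-Z0-9.-]*\d{8}.xml(?="""")"

    ' Show the resulting string in the Immediate window (Ctrl+G).
    Debug.Print pat

    ' Count the double-quote characters (Chr(34)) in the resulting string.
    Debug.Print Len(pat) - Len(Replace(pat, Chr(34), ""))
End Sub
```

The first line printed is the same doubled-quote string that MsgBox shows, and the count confirms that the string holds more quote characters than the two the pattern needs.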
Any thoughts on how to approach this are appreciated.