
OCaml: how do I parse large multiline text with the Str module so that a match does not stop at newline characters?

let get_info content =
  let re = Str.regexp "\\(.+?\\)" in
  match Str.string_match re content 0 with
    | true -> print_endline("-->"^(Str.matched_group 1 content)^"<--")
    | false -> print_endline("not found");;

This example returns only the first line, but I need the match to span multiple lines.

1 Answer

According to http://pleac.sourceforge.net/pleac_ocaml/patternmatching.html:

  • Str's regexps lack a whitespace-matching pattern.

So, here is a workaround suggested on that page:

#load "str.cma";;
...
let whitespace_chars =
  String.concat ""
    (List.map (String.make 1)
       [
         Char.chr 9;  (* HT *)
         Char.chr 10; (* LF *)
         Char.chr 11; (* VT *)
         Char.chr 12; (* FF *)
         Char.chr 13; (* CR *)
         Char.chr 32; (* space *)
       ])

and then

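(* Str has no (?:...) groups and no lazy +?; it writes grouping as \( \) and alternation as \|.
   "Non-whitespace or whitespace" matches any character, newlines included. *)
let re = Str.regexp ("\\(\\([^" ^ whitespace_chars ^ "]\\|[" ^ whitespace_chars ^ "]\\)+\\)") in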
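
Since Str's `.` matches any character except a newline, a pattern that has to cross line boundaries needs an explicit alternative for `\n`. Here is a minimal sketch of that idea. It is not taken from the page linked above: the helper `between` and the `<<` / `>>` markers in the usage note are hypothetical, chosen only for illustration, and the build command in the comment assumes ocamlfind is available.

```ocaml
(* Build, for example, with:
     ocamlfind ocamlopt -package str -linkpkg example.ml -o example
   In the toplevel, run  #load "str.cma";;  first. *)

(* Matches any single character, newline included: "." covers everything
   except '\n', and the alternative "\n" covers the newline itself.
   Str writes grouping as \( \) and alternation as \|. *)
let any_char = "\\(.\\|\n\\)"

(* Hypothetical helper: return the text between two literal markers,
   which may span several lines, or None if the markers are not found.
   The match is greedy, so it extends to the last end marker. *)
let between start_marker end_marker content =
  let re =
    Str.regexp
      (Str.quote start_marker ^ "\\(" ^ any_char ^ "*\\)" ^ Str.quote end_marker)
  in
  try
    ignore (Str.search_forward re content 0);
    (* Group 1 is the outer capture; group 2 is the any_char group. *)
    Some (Str.matched_group 1 content)
  with Not_found -> None
```

For instance, `between "<<" ">>" "a <<line one\nline two>> b"` should return the two-line text between the markers, and a string without the markers should give `None`.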
Wiktor Stribiżew