I need a regular expression that satisfies the statement below, for use in the pattern attribute of an Angular Material input field:

"Note that strings SHALL NOT exceed 1MB (1024*1024 characters) in size. Strings SHOULD not contain Unicode character points below 32, except for u0009 (horizontal tab), u0010 (carriage return) and u0013 (line feed). Leading and Trailing whitespace is allowed, but SHOULD be removed when using the XML format. Note: This means that a string that consists only of whitespace could be trimmed to nothing, which would be treated as an invalid element value. Therefore strings SHOULD always contain non-whitespace content"

I expect it to accept any string made up of Unicode characters at or above code point 32, including words separated by spaces.

I tried the regex "^((?![\u0001-\u0008]|[\u000B-\u000C]|[\u000E-\u0020]).)*$", but it does not work.

Wiktor Stribiżew
  • Looks like you are looking to create a regex, but do not know where to get started. Please check [Reference - What does this regex mean](https://stackoverflow.com/questions/22937618) resource, it has plenty of hints. Once you get some expression ready and still have issues with the solution, please edit the question with the latest details and we'll be glad to help you fix the problem. – Wiktor Stribiżew Jul 07 '19 at 11:45
  • I am not asking for a reference link, though it is good to know. I tried "^((?![\u0001-\u0008]|[\u000B-\u000C]|[\u000E-\u0020]).)*$" without success, so can you help me satisfy the statement quoted above? – user3008819 Jul 09 '19 at 01:41
  • So, you need it for the HTML5 pattern attribute, right? It seems you want to match a string that fully consists of ASCII "visible" characters + CR, LF or TAB, right? Try `pattern="[ -~\x0A\x0D\x09]*"`. If you want to also allow all other Unicode chars but emojis or other surrogate pairs, use `pattern="[ -\uFFFF\x0A\x0D\x09]*"` – Wiktor Stribiżew Jul 09 '19 at 15:12
  • Does `pattern="[ -\uFFFF\x0A\x0D\x09]*"` satisfy the statement "Strings SHOULD not contain Unicode character points below 32, except for u0009 (horizontal tab), u0010 (carriage return) and u0013 (line feed). Leading and Trailing whitespace is allowed"? – user3008819 Jul 11 '19 at 18:44

1 Answer

You may use

pattern="[ -\uFFFF\x0A\x0D\x09]*"

It will be "converted" to the regex ^(?:[ -\uFFFF\x0A\x0D\x09]*)$, which matches:

  • ^ - start of string
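  • [ -\uFFFF\x0A\x0D\x09]* - 0 or more chars from space (U+0020) through U+FFFF, the last code point of the Basic Multilingual Plane, plus LF (\x0A), CR (\x0D) and TAB (\x09), which sit below the space and so must be listed explicitly. Code points above U+FFFF (emojis and other surrogate pairs) are not covered by this range.
  • $ - end of string.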
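
To check the pattern outside the browser, here is a minimal sketch in plain JavaScript. It assumes the attribute value is wrapped as ^(?:...)$ and compiled with the u flag, which is how I understand the HTML pattern attribute to behave; Angular's own pattern validator may compile the string differently, so treat this as an approximation. The names compilePattern, isAllowed and hasContent are hypothetical helpers, not part of any library. The hasContent check covers the "strings SHOULD always contain non-whitespace content" part of the quoted statement, which the pattern alone does not enforce.

```javascript
// Hypothetical helper: anchors a pattern attribute value the way the
// browser is assumed to do it, i.e. ^(?:source)$ with the "u" flag.
function compilePattern(source) {
  return new RegExp("^(?:" + source + ")$", "u");
}

// Same value as in pattern="[ -\uFFFF\x0A\x0D\x09]*" (backslashes doubled
// because this is a JavaScript string literal).
const allowed = compilePattern("[ -\\uFFFF\\x0A\\x0D\\x09]*");

// True when every char is TAB, LF, CR, or in the range U+0020..U+FFFF.
function isAllowed(value) {
  return allowed.test(value);
}

// Hypothetical extra check: at least one non-whitespace character.
function hasContent(value) {
  return /\S/.test(value);
}

console.log(isAllowed("hello world"));           // true
console.log(isAllowed("line one\r\nline\ttwo")); // true
console.log(isAllowed("bell\u0007"));            // false (U+0007 is below 32)
console.log(isAllowed(""));                      // true (* allows an empty string)
console.log(isAllowed("\u{1F600}"));             // false (above U+FFFF under "u")
console.log(hasContent("   "));                  // false
```

Note that the empty string and whitespace-only strings pass the pattern itself, so combine it with required or a check like hasContent if those must be rejected.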
Wiktor Stribiżew