Regex for IBAN mask

Question

I am trying to extract text of the form "NL dd ABNA ddddddddd" from strings such as:

IBAN NL 91ABNA0417463300
IBAN NL91ABNA0417164300
Iban: NL 69 ABNA 402032566

The string may appear in three or more variations.

So far I have only come up with this:

NL\s?\d{2}\s?[A-Z]{4}0\s?\d{9}$

Which matches the first two examples, but not the third.

See: https://regex101.com/r/zGDXa2/1.

How can I handle the third case?

Why don't you remove all the whitespace? That would make the regex easier. In Python you can do `sentence = ' hello apple'`, `sentence = sentence.replace(" ", "")` — regina_fallangi, Mar 17 '19 at 18:47
I would also uppercase the strings; that way you do not have to handle lowercase letters in the regex. Just use `sentence.upper()` — regina_fallangi, Mar 17 '19 at 18:49
did you search SO? https://stackoverflow.com/questions/44656264 or https://stackoverflow.com/questions/23471591 or https://stackoverflow.com/questions/tagged/iban+regex ? — Patrick Artner, Mar 17 '19 at 18:57

score 3 · Answer 1 · answered Mar 17 '19 at 18:57

The problem in your regex101 demo is that there is an extra character after the $ in your regex. Remove it and change 0 to [0 ], so that the character after the bank code may be either a zero or a space. The pattern then matches your third line as well. The corrected regex is:

NL\s?\d{2}\s?[A-Z]{4}[0 ]\s?\d{9}$

Check your updated demo
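
The corrected pattern can also be checked outside regex101 with a few lines of Python. This is only a minimal sketch: it assumes each candidate is tested as a separate string (so $ anchors at the end of that string), and the names PATTERN, lines and matches are made up for illustration.

```python
import re

# Corrected pattern from this answer: the literal 0 becomes [0 ],
# so the character after the bank code may be a zero or a space.
PATTERN = re.compile(r"NL\s?\d{2}\s?[A-Z]{4}[0 ]\s?\d{9}$")

lines = [
    "IBAN NL 91ABNA0417463300",
    "IBAN NL91ABNA0417164300",
    "Iban: NL 69 ABNA 402032566",
]

# search() scans each string for the pattern; $ pins the match to its end.
matches = []
for line in lines:
    m = PATTERN.search(line)
    if m:
        matches.append(m.group(0))

for found in matches:
    print(found)
```

If the three lines are searched as one multi-line string instead, the pattern would need the re.MULTILINE flag so that $ also matches before each newline.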

score 1 · Accepted Answer · answered Mar 18 '19 at 13:27

You can use the following regex:

(?i)(?:(?<=IBAN(?:[:\s]\s|\s[:\s]))NL\s?\d{2}\s?[A-Z]{4}[0 ]\s?\d{9,10})|(?:(?<=IBAN[:\s])NL\s?\d{2}\s?[A-Z]{4}[0 ]\s?\d{9,10})

demo:

https://regex101.com/r/zGDXa2/11

If you work in Python, you can remove the (?i) and pass the flag re.I (or re.IGNORECASE) instead.

Tested on:

Uw BTW nummer NL80
 IBAN NL 11abna0317164300asdfasf234
iBAN NL21ABNA0417134300 22
Iban: NL 29 ABNA 401422366f sdf
IBAN :NL 39 ABNA 0822416395s
IBAN:NL 39 ABNA 0822416395s

Extracts:

NL 11abna0317164300
NL21ABNA0417134300
NL 29 ABNA 401422366
NL 39 ABNA 0822416395
NL 39 ABNA 0822416395
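
The test above can be reproduced in Python roughly as follows. Treat it as a sketch rather than a reference implementation: the inline (?i) is swapped for re.IGNORECASE as suggested, and the names IBAN_RE, text and extracted are illustrative. Both lookbehinds have a fixed width, which Python's re module requires.

```python
import re

# Same pattern as in this answer, with (?i) replaced by the re.IGNORECASE flag.
# The pattern contains no capturing groups, so findall() returns whole matches.
IBAN_RE = re.compile(
    r"(?:(?<=IBAN(?:[:\s]\s|\s[:\s]))NL\s?\d{2}\s?[A-Z]{4}[0 ]\s?\d{9,10})"
    r"|(?:(?<=IBAN[:\s])NL\s?\d{2}\s?[A-Z]{4}[0 ]\s?\d{9,10})",
    re.IGNORECASE,
)

text = """Uw BTW nummer NL80
 IBAN NL 11abna0317164300asdfasf234
iBAN NL21ABNA0417134300 22
Iban: NL 29 ABNA 401422366f sdf
IBAN :NL 39 ABNA 0822416395s
IBAN:NL 39 ABNA 0822416395s"""

extracted = IBAN_RE.findall(text)
for iban in extracted:
    print(iban)
```

The first line ("Uw BTW nummer NL80") yields nothing because the lookbehind requires the IBAN label directly before NL.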

score 0 · Answer 3 · answered Mar 17 '19 at 18:53

You can just remove all spaces and uppercase the rest, like this:

# str methods return a new string, so assign the result back
iban = "NL 91ABNA0417463300"
iban = iban.replace(" ", "").upper()

And then your regex would be:

NL\d{2}ABNA(\d{10}|\d{9})

It works in https://regex101.com/r/zGDXa2/1
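
Put together, the two steps (normalize, then match) might look like the sketch below. The helper names normalize and extract are hypothetical, and the sample strings are the ones from the question. It removes all whitespace with split/join rather than only spaces, which is a small assumption on top of the answer.

```python
import re

# Pattern from this answer, applied to the normalized string.
ABNA_RE = re.compile(r"NL\d{2}ABNA(\d{10}|\d{9})")


def normalize(raw):
    """Remove all whitespace and uppercase the rest."""
    return "".join(raw.split()).upper()


def extract(raw):
    """Return the normalized IBAN found in raw, or None if there is none."""
    m = ABNA_RE.search(normalize(raw))
    return m.group(0) if m else None


samples = [
    "IBAN NL 91ABNA0417463300",
    "IBAN NL91ABNA0417164300",
    "Iban: NL 69 ABNA 402032566",
]
found = [extract(s) for s in samples]
print(found)
```

Note that the result is the normalized form (no spaces, uppercase), not the original substring, and that the pattern only covers the ABNA bank code.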

regina_fallangi


Dmitrii · Answer 4 · 2020-01-03T02:19:19.337

This is not what you asked for, but it works.

An IBAN has a strict format, so it is simpler to normalize the string first and then slice out the parts you need by position: once normalized, every input has the same layout. For example:

CODE

#!/usr/bin/python3
# -*- coding: utf-8 -*-

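# Characters allowed in an IBAN: A-Z and 0-9
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def normalize(string):
    # Drop all whitespace and uppercase the rest
    stage1 = "".join(string.split()).upper()
    # Keep only characters from the IBAN alphabet (drops ':' and the like)
    stage2 = ''
    for l in stage1:
        if l in alphabet:
            stage2 = stage2 + l

    # Return everything after the leading 'IBAN' label
    return stage2.split('IBAN')[1]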


if __name__ == '__main__':

    IBAN_LIST = ['IBAN NL 91ABNA0417463300', 'IBAN NL91ABNA0417164300', 'Iban: NL 69 ABNA 402032566']

    for IBAN in IBAN_LIST:
        IBAN_normalized = normalize(IBAN)
        print(IBAN_normalized[2:4], IBAN_normalized[8:])

OUTPUT

91 0417463300
91 0417164300
69 402032566

It is not a regexp, but it should work faster. If you know a better way to normalize the string, please help improve it.

You can see source code here.
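
One possible way to shorten the normalization step is a single re.sub call that drops everything outside A-Z and 0-9. This is only a sketch under two assumptions: the label, when present, is the literal prefix "IBAN", and the slices [2:4] and [8:] are the same ones used in the code above. The names samples and parts are illustrative.

```python
import re


def normalize(raw):
    """Uppercase, keep only A-Z and 0-9, then strip a leading 'IBAN' label."""
    cleaned = re.sub(r"[^A-Z0-9]", "", raw.upper())
    # Assumption: the label, if present, is the literal prefix "IBAN".
    return cleaned[4:] if cleaned.startswith("IBAN") else cleaned


samples = [
    "IBAN NL 91ABNA0417463300",
    "IBAN NL91ABNA0417164300",
    "Iban: NL 69 ABNA 402032566",
]

# Same slices as above: check digits at [2:4], account number from index 8.
parts = [(n[2:4], n[8:]) for n in map(normalize, samples)]
for check_digits, account in parts:
    print(check_digits, account)
```

Unlike split('IBAN')[1], this version does not raise an IndexError when the label is missing; it simply returns the cleaned string.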
