Is there a way in Python to perform partial matching between a word and a generic pattern (a regular expression)?

The aim is to measure how far a word is from a given pattern, e.g. the distance of a word from the pattern of a license plate in the format AB123CD (two letters, three digits, then two letters again), expressed as a regular expression.

Examples:

- the word DF345EE matches the pattern exactly;

- the word D345EE would match with one more letter at the beginning;

- the word DFC45EE would match if the 'C' were a digit.

I looked into fuzzy matching, but it is usually used for searching for a word inside another string, not for comparing a word against a pattern.

Thanks!

n7h_m4d

1 Answer

The third-party regex module supports fuzzy matching; install it once with pip install regex. In the code below, {e<=2} allows at most 2 errors of any type (substitutions, insertions, deletions).

e counts errors of any type, s substitutions, i insertions and d deletions. Constraints can be combined, e.g. {1<=s<=2,2<i<=4,3<=d<6}.

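# Needs: python -m pip install regex
import regex as re

# License plate pattern (two letters, three digits, two letters),
# allowing at most 2 errors of any kind.
pattern = r'(?:[A-Z]{2}\d{3}[A-Z]{2}){e<=2}'

for word in [
    'DF345EE',
    'D345EE',
    'DFC45EE',
]:
    # fullmatch returns None if more than 2 errors would be needed
    m = re.fullmatch(pattern, word)
    # fuzzy_counts is a tuple: (substitutions, insertions, deletions)
    subs, ins, dels = m.fuzzy_counts
    print(m, '\n', f'{subs} substitutions, {ins} insertions, {dels} deletions')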

Outputs:

<regex.Match object; span=(0, 7), match='DF345EE'>
 0 substitutions, 0 insertions, 0 deletions
<regex.Match object; span=(0, 6), match='D345EE', fuzzy_counts=(1, 0, 1)>
 1 substitutions, 0 insertions, 1 deletions
<regex.Match object; span=(0, 7), match='DFC45EE', fuzzy_counts=(1, 0, 0)>
 1 substitutions, 0 insertions, 0 deletions
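
Note that by default the regex module reports the first match that satisfies the limit, not necessarily the one with the fewest errors, which is why D345EE shows 2 errors above although a single missing letter would explain it. If only the minimum distance is needed and the pattern has a fixed length, it can also be computed with the standard library alone. The following is a minimal sketch under that assumption: the pattern is described as a list of per-position character sets, the names PLATE and pattern_distance are hypothetical, and the algorithm is a plain Levenshtein-style dynamic program, not what the regex module does internally.

```python
import string

# Assumed pattern description: one set of allowed characters per position,
# mirroring [A-Z]{2}\d{3}[A-Z]{2}. These names are illustrative only.
LETTERS = set(string.ascii_uppercase)
DIGITS = set(string.digits)
PLATE = [LETTERS] * 2 + [DIGITS] * 3 + [LETTERS] * 2


def pattern_distance(word, classes=PLATE):
    """Return the minimum number of single-character edits
    (substitutions, extra characters, missing characters)
    needed for `word` to fit `classes`."""
    # prev[j]: cost of aligning the characters seen so far with the first j classes.
    # With no characters seen, every class position counts as missing.
    prev = list(range(len(classes) + 1))
    for i, ch in enumerate(word, start=1):
        curr = [i]  # no classes consumed: all i characters are extra
        for j, allowed in enumerate(classes, start=1):
            mismatch = 0 if ch in allowed else 1
            curr.append(min(
                prev[j - 1] + mismatch,  # keep ch, or substitute it
                prev[j] + 1,             # ch is an extra character
                curr[j - 1] + 1,         # a character is missing at position j
            ))
        prev = curr
    return prev[-1]


print(pattern_distance('DF345EE'))  # 0
print(pattern_distance('D345EE'))   # 1
print(pattern_distance('DFC45EE'))  # 1
```

This only returns the total distance; splitting it into separate substitution, insertion and deletion counts would require tracking which branch of the min was taken.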
Arty
  • Thx a lot! It's what I need :-) – n7h_m4d Sep 30 '20 at 09:20
  • @n7h_m4d BTW, if my answer solved your task, you can accept it (checkbox) and/or upvote it, because it looks like I solved the question and got `-1` points for that. – Arty Sep 30 '20 at 12:50