Questions tagged [string-matching]

String matching is the problem of finding occurrences of one string (“pattern”, “needle”) in another (“text”, “haystack”).

There are two types of string matching:

  • Exact
  • Approximate

Exact string matching is the problem of finding occurrence(s) of a pattern string within another string or body of text (NIST). For example, finding CGATCGATTA in CTAGATCCTGCGATCGATTAAGCCTGA.
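
As a rough illustration of the definition above, the following Python sketch reports every position where the pattern occurs in the text, using the DNA example. The helper name `find_all` is made up for this illustration and is not part of any library; it relies only on the built-in `str.find`.

```python
def find_all(pattern: str, text: str) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Overlapping occurrences are reported too, because the search
    resumes one character after the previous hit.
    """
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


text = "CTAGATCCTGCGATCGATTAAGCCTGA"
pattern = "CGATCGATTA"
print(find_all(pattern, text))  # [10]
```

This leans on the language's built-in substring search; the algorithms catalogued in references on exact matching (Knuth-Morris-Pratt, Boyer-Moore, and so on) solve the same problem with different performance trade-offs.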

A comprehensive online reference of string matching algorithms is Exact String Matching Algorithms by Christian Charras and Thierry Lecroq.

Approximate string matching, also called fuzzy string matching, finds parts of the text that match the pattern approximately, typically those within a given edit distance of the pattern.
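
One common way to make this concrete is a dynamic-programming scan in which a match may start anywhere in the text (often attributed to Sellers). The sketch below is only an illustration under that assumption: the function name `approx_find` and the `max_dist` parameter are hypothetical, and the distance used is plain Levenshtein distance with unit costs.

```python
def approx_find(pattern: str, text: str, max_dist: int) -> list[tuple[int, int]]:
    """Return (end, distance) pairs for approximate matches of pattern in text.

    `end` is an exclusive end index: some substring text[k:end] is within
    `max_dist` edits (insert, delete, substitute) of `pattern`.
    """
    m = len(pattern)
    # prev[i] = smallest edit distance between pattern[:i] and any
    # substring of the text ending at the previous text position.
    prev = list(range(m + 1))
    matches = []
    for j, ch in enumerate(text):
        curr = [0] * (m + 1)  # curr[0] stays 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitute
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character missing from the text
            )
        if curr[m] <= max_dist:
            matches.append((j + 1, curr[m]))
        prev = curr
    return matches


# Exact occurrences are the special case max_dist == 0.
print(approx_find("abc", "xxabcxx", 0))  # [(5, 0)]
```

The scan costs time proportional to the pattern length times the text length, so for large dictionaries or long texts a specialised library or index is usually a better fit than this sketch.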

1969 questions
7419
votes
3 answers

How to check whether a string contains a substring in JavaScript?

Usually I would expect a String.contains() method, but there doesn't seem to be one. What is a reasonable way to check for this?
gramm
  • 18,623
  • 6
  • 24
  • 25
2660
votes
36 answers

How do I check if a string contains a specific word?

Consider: $a = 'How are you?'; if ($a contains 'are') echo 'true'; Suppose I have the code above, what is the correct way to write the statement if ($a contains 'are')?
Charles Yeung
  • 36,649
  • 27
  • 83
  • 130
385
votes
6 answers

Check if string matches pattern

How do I check if a string matches this pattern? Uppercase letter, number(s), uppercase letter, number(s)... Example, These would match: A1B2 B10L1 C1N200J1 These wouldn't ('^' points to problem) a1B2 ^ A10B ^ AB400 ^
DanielTA
  • 4,537
  • 2
  • 20
  • 24
175
votes
10 answers

Return positions of a regex match() in Javascript?

Is there a way to retrieve the (starting) character positions inside a string of the results of a regex match() in Javascript?
stagas
  • 3,872
  • 3
  • 25
  • 28
172
votes
4 answers

PowerShell and the -contains operator

Consider the following snippet: "12-18" -Contains "-" You’d think this evaluates to true, but it doesn't. This will evaluate to false instead. I’m not sure why this happens, but it does. To avoid this, you can use this…
tnw
  • 12,391
  • 15
  • 64
  • 108
155
votes
24 answers

A better similarity ranking algorithm for variable length strings

I'm looking for a string similarity algorithm that yields better results on variable length strings than the ones that are usually suggested (levenshtein distance, soundex, etc). For example, Given string A: "Robert", Then string B: "Amy…
marzagao
  • 3,656
  • 4
  • 17
  • 14
148
votes
2 answers

High performance fuzzy string comparison in Python, use Levenshtein or difflib

I am doing clinical message normalization (spell check) in which I check each given word against a 900,000-word medical dictionary. I am more concerned about the time complexity/performance. I want to do fuzzy string comparison, but I'm not sure which…
Maggie
  • 5,331
  • 8
  • 38
  • 54
129
votes
9 answers

How to search a specific value in all tables (PostgreSQL)?

Is it possible to search every column of every table for a particular value in PostgreSQL? A similar question is available here for Oracle.
Sandro Munda
  • 36,427
  • 21
  • 94
  • 117
119
votes
3 answers

Check whether a string contains a substring

How can I check whether a given string contains a certain substring, using Perl? More specifically, I want to see whether s1.domain.com is present in the given string variable.
Belgin Fish
  • 16,577
  • 39
  • 98
  • 128
119
votes
11 answers

Javascript fuzzy search that makes sense

I'm looking for a fuzzy search JavaScript library to filter an array. I've tried using fuzzyset.js and fuse.js, but the results are terrible (there are demos you can try on the linked pages). After doing some reading on Levenshtein distance, it…
willlma
  • 6,485
  • 2
  • 24
  • 42
94
votes
15 answers

Regular Expression Match to test for a valid year

Given a value, I want to validate it to check if it is a valid year. My criterion is simple: the value should be an integer with 4 characters. I know this is not the best solution as it will not allow years before 1000 and will allow years such…
Ranhiru Jude Cooray
  • 18,386
  • 17
  • 80
  • 123
77
votes
4 answers

Filter multiple values on a string column in dplyr

I have a data.frame with character data in one of the columns. I would like to filter multiple options in the data.frame from the same column. Is there an easy way to do this that I'm missing? Example: data.frame name = dat days name 88 …
Tom O
  • 1,157
  • 2
  • 11
  • 14
60
votes
21 answers

javascript regular expression to check for IP addresses

I have several IP addresses like: 115.42.150.37 115.42.150.38 115.42.150.50 What type of regular expression should I write if I want to search for all 3 IP addresses? E.g., if I do 115.42.150.* (I will be able to search for all 3 ip…
KennC.
  • 2,985
  • 6
  • 18
  • 18
60
votes
11 answers

Regular Expression Arabic characters and numbers only

I want a regular expression to accept only Arabic characters, spaces and numbers. Numbers are not required to be in Arabic. I found the following expression: ^[\u0621-\u064A]+$ which accepts only Arabic characters while I need Arabic characters,…
moujtahed
  • 621
  • 1
  • 5
  • 7
59
votes
8 answers

Regex allow a string to only contain numbers 0 - 9 and limit length to 45

I am trying to create a regex to have a string only contain 0-9 as the characters and it must be at least 1 char in length and no more than 45. so example would be 00303039 would be a match, and 039330a29 would not. So far this is what I have but I…
NewToRegEx
  • 591
  • 1
  • 4
  • 3