I am trying to avoid using a regexp function when I need to match strings regardless to their accent or case. For instance, when I search for "VOILA", I want to match "voilà".
I noted that it is possible to create a custom collation to achieve that but I don't know where to start and what to expect in terms of performance:
I assume it will be faster than my REGEXP function defined in my VB.net code (see code below). Is that correct? EDIT: this approach doesn't work. I guess the regex is false. Since I am looking for another solution, I'll wait for answers instead of fixing it.
_ Class MyRegEx Inherits SQLiteFunction Public Overrides Function Invoke(args As Object()) As Object Return System.Text.RegularExpressions.Regex.IsMatch(Convert.ToString(args(1)), Convert.ToString(args(0))) End Function End Class
The regex build:
Dim testString As String = word.Replace("A", "[=a=]")
testString = testString.Replace("E", "[=e=]")
testString = testString.Replace("I", "[=i=]")
testString = testString.Replace("O", "[=o=]")
testString = testString.Replace("U", "[=u=]")
testString = testString.Replace("Œ", "œ")
testString = "(?i)(\W|^)" & testString & "(\W|$)"
- I have read the SQLite documentation on the creation of a custom collation but I don't really get it. Should I define this in my .net application? Can I embed my custom collation in the database to avoid to create it each time?
- I guess that it's a common issue considering the number of related questions. However, I haven't seen a single written example achieving this kind of accent/case insensitive search. Can someone share this?
The best lead I have so far would be to create a custom column without accents as described here. This approach seems to be satisfying, but I wanted to get the elegant solution (i.e. custom collate) instead, if it is possible in my case.
Thanks!