4

I have a column where some of the elements contain accented letters. eg : Grambú

My requirement is that when I search for "Grambu" I should get "Grambú" in the results as well.

For this requirement I tried using "COLLATE NOCASE" parameter for that specific column. But that didnt work.

When I searched for solutions in the web , I found many people suggesting normalizing the accented characters and creating another column based on it as the only option.

Is there any other easier solutions to this problem?

Dheeraj Vepakomma
  • 18,747
  • 12
  • 67
  • 98
Rojil Thomas
  • 299
  • 4
  • 11

3 Answers3

7

COLLATE NOCASE works only for the 26 upper case characters of ASCII.

Set the database's locale to one that has accented character support using setLocale() and use COLLATE LOCALIZED.

You may also try using COLLATE UNICODE. But beware of this bug: SQLite UNICODE sort broken in ICS - no longer case-insensitive.

Check the documentation for mention of these two collators in Android.

Also check out this online collation demo tool.

Dheeraj Vepakomma
  • 18,747
  • 12
  • 67
  • 98
0

http://www.sqlite.org/lang_expr.html

(A bug: SQLite only understands upper/lower case for ASCII characters by default. The LIKE operator is case sensitive by default for unicode characters that are beyond the ASCII range. For example, the expression 'a' LIKE 'A' is TRUE but 'æ' LIKE 'Æ' is FALSE.)

juFo
  • 15,486
  • 9
  • 90
  • 128
0

In Android sqlite, LIKE and GLOB ignore both COLLATE LOCALIZED and COLLATE UNICODE. However, there is a solution without having to add extra columns to your table. As @asat explains in this answer, you can use GLOB with a pattern that will replace each letter with all the available alternatives of that letter. In Java:

public static String addTildeOptions(String searchText) {
    return searchText.toLowerCase()
                     .replaceAll("[aáàäâã]", "\\[aáàäâã\\]")
                     .replaceAll("[eéèëê]", "\\[eéèëê\\]")
                     .replaceAll("[iíìî]", "\\[iíìî\\]")
                     .replaceAll("[oóòöôõ]", "\\[oóòöôõ\\]")
                     .replaceAll("[uúùüû]", "\\[uúùüû\\]")
                     .replace("*", "[*]")
                     .replace("?", "[?]");
}

And then (not literally like this, of course):

SELECT * from table WHERE lower(column) GLOB "*addTildeOptions(searchText)*"

This way, a user searching for either Grambu or Grambú will get the search converted into Gramb[uúùüû], returning both results.

It is important to notice that GLOB ignores COLLATE NOCASE, that's why I converted everything to lower case both in the function and in the query. Notice also that the lower() function in sqlite doesn't work on non-ASCII characters - but again those are probably the ones that you are already replacing!

The function also replaces both GLOB wildcards, * and ?, with "escaped" versions.

descatalogado
  • 101
  • 1
  • 5