I would like to know whether there is a method that compares two strings while ignoring accents, so that "noção" is equal to "nocao". It would be something like string1.methodCompareIgnoreAccent(string2);
- Have you looked at [`Collator`](http://docs.oracle.com/javase/8/docs/api/java/text/Collator.html)? – Jon Skeet Mar 03 '15 at 14:05
- You can also have a look at https://stackoverflow.com/questions/1008802/converting-symbols-accent-letters-to-english-alphabet. – Florent Bayle Mar 03 '15 at 14:10
- I have written a class for searching through Arabic texts by ignoring diacritics (NOT removing them). Maybe you can get the idea or use it in some way. https://gist.github.com/mehdok/e6cd1dfccab0c75ac7a9536c6afac8ff – mehdok Jul 19 '17 at 15:48
2 Answers
42
You can use a Java Collator to compare the texts while ignoring accents. Here is a simple example:
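import java.text.Collator;

/**
 * @author Kennedy
 */
public class SimpleTest
{
    public static void main(String[] args)
    {
        String a = "nocao";
        String b = "noção";
        final Collator instance = Collator.getInstance();
        // PRIMARY strength compares base letters only, so accents (and case) are ignored.
        // Collator.NO_DECOMPOSITION is a decomposition mode, not a strength; it only
        // appeared to work here because it has the same int value (0) as PRIMARY.
        instance.setStrength(Collator.PRIMARY);
        // Prints 0 when the strings are considered EQUAL
        System.out.println(instance.compare(a, b));
    }
}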
Documentation: JavaDoc
I won't explain it in detail, because I have used Collators only a little and I'm not an expert in them, but you can search online; there are some articles about them.
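As a rough sketch of how the comparison above could be wrapped into the kind of method the question asks for: the class name `AccentInsensitiveCompare` and the method `equalsIgnoringAccents` are made up for illustration, and `Locale.ROOT` is an assumption (a locale-specific collator may order or equate some characters differently). Note that `PRIMARY` strength also ignores case differences, which may or may not be what you want.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper class, not part of the JDK.
final class AccentInsensitiveCompare {

    // Returns true when the two strings differ only by accents (or case).
    static boolean equalsIgnoringAccents(String a, String b) {
        // Assumption: locale-neutral rules are good enough for this comparison.
        Collator collator = Collator.getInstance(Locale.ROOT);
        // PRIMARY strength looks at base letters only.
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // "no\u00e7\u00e3o" is "noção", written with escapes to avoid source-encoding issues.
        System.out.println(equalsIgnoringAccents("no\u00e7\u00e3o", "nocao")); // true
        System.out.println(equalsIgnoringAccents("no\u00e7\u00e3o", "nacao")); // false
    }
}
```

If the comparison must stay case-sensitive, a collator at `PRIMARY` strength is probably not the right tool on its own.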
Kennedy Oliveira
- This doesn't work; it won't print 0. Sometimes it prints -1, other times 1. – alexandre1985 Mar 03 '15 at 16:48
- Your first answer worked. Now I want to edit the answer so that it shows your first answer; do you know how I can do it? – alexandre1985 Mar 03 '15 at 20:07
6
There is no built-in method to do this, so you have to build your own.
Part of this solution is from here: it first splits every accented character into its unaccented base character followed by its combining diacritics, and then simply removes all the combining diacritics. Also see https://stackoverflow.com/a/1215117/4095834
And then your equals method will look like this:
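import java.text.Normalizer;
import java.text.Normalizer.Form;
import java.util.Objects;

public boolean equals(Object o) {
    // Code omitted
    // Strip accents from both values so that either one may contain them;
    // Objects.equals also copes with null fields
    if (Objects.equals(removeAccents(yourField), removeAccents(anotherField))) {
        return true;
    }
    return false;
}

// Decomposes accented characters (NFD), then drops the combining marks
public static String removeAccents(String text) {
    return text == null ? null : Normalizer.normalize(text, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}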