Like this:
scala> import java.text.Normalizer
import java.text.Normalizer
scala> def removeDiacritics(in: String) : String = {
| // Decompose each character into base letter + combining marks (NFD), then strip the marks
| Normalizer.normalize(in, Normalizer.Form.NFD).replaceAll("\\p{M}", "")
| }
removeDiacritics: (in: String)String
scala> val languages = List("Deutsch","english","español")
languages: List[String] = List(Deutsch, english, español)
scala> val results = languages.map(removeDiacritics).filter(_.contains("espan"))
results: List[String] = List(espanol)
scala>
The solution here provides a removeDiacritics function that you can map over the list before you call contains("espan"). The key is understanding that the normalizer, in NFD form, separates each diacritic from its base letter (ñ becomes n followed by the combining tilde U+0303), while the pattern \p{M} matches any Unicode combining mark, which is exactly what the separated diacritics are.
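To make the decomposition step visible, here is a small sketch that prints the UTF-16 units of a string after NFD normalization. The object name NfdDemo and the helper describe are made up for illustration; only java.text.Normalizer comes from the answer above.

```scala
import java.text.Normalizer

object NfdDemo {
  // Hypothetical helper: normalize to NFD and render each UTF-16 unit as U+XXXX.
  def describe(s: String): String =
    Normalizer.normalize(s, Normalizer.Form.NFD)
      .toList
      .map(c => f"U+${c.toInt}%04X")
      .mkString(" ")

  def main(args: Array[String]): Unit = {
    // "\u00F1" is ñ as a single precomposed character.
    println(describe("\u00F1")) // U+006E U+0303 (n, then COMBINING TILDE)

    // Stripping \p{M} removes the combining tilde and leaves the base letter.
    val stripped = Normalizer.normalize("\u00F1", Normalizer.Form.NFD).replaceAll("\\p{M}", "")
    println(stripped) // n
  }
}
```

U+0303 is in the Unicode Mark category, which is why \p{M} removes it and leaves the plain n behind.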
One side effect of this is that the string without the diacritics is returned. You might not want that, but I'll leave it as an exercise for you to return the original, now that you can do the comparison without the diacritics.
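One possible approach to that exercise, as a sketch rather than the only way to do it: strip the diacritics only inside the filter predicate, so the list keeps its original strings. The object name DiacriticSearch and the helper containsIgnoringDiacritics are hypothetical; removeDiacritics is the function from the transcript above.

```scala
import java.text.Normalizer

object DiacriticSearch {
  // Same helper as in the transcript: decompose to NFD, then strip combining marks.
  def removeDiacritics(in: String): String =
    Normalizer.normalize(in, Normalizer.Form.NFD).replaceAll("\\p{M}", "")

  // Hypothetical helper: compare on the stripped form, but keep the original items.
  def containsIgnoringDiacritics(items: List[String], needle: String): List[String] = {
    val plainNeedle = removeDiacritics(needle)
    items.filter(item => removeDiacritics(item).contains(plainNeedle))
  }

  def main(args: Array[String]): Unit = {
    val languages = List("Deutsch", "english", "español")
    // The match is made without diacritics, yet the result still carries the tilde.
    println(containsIgnoringDiacritics(languages, "espan"))
  }
}
```

Because the needle is normalized as well, searching for "españ" would match the same entry. The comparison is still case-sensitive; lowercasing both sides would be a separate step.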