
Is there any way to dynamically replace accented characters such as the following?

requesón => requeson

What I mean is that every accented (or otherwise marked) character would be replaced with its plain equivalent.

Is this possible using ColdFusion?

Miguel-F
Jordelca
  • I know that you could use a regular expression to _remove_ the special characters, but I'm not sure about replacing them. That would mean you need some sort of mapping between each character and its replacement. Are you dealing with a limited, known set of characters, or do you want to replace any special character that is encountered? – Miguel-F Jan 31 '13 at 16:22
  • I would need to replace every special character found, but I think that is going to be difficult, so I am already considering the option of replacing only known special characters. – Jordelca Jan 31 '13 at 16:29
  • It would also be nice to have a regex that selects, for example, "e" and all its accented variants. Then I could replace them with the plain letter. – Jordelca Jan 31 '13 at 16:34
  • What is the actual task you are trying to solve? (i.e. why do you [think you] need to remove accents?) – Peter Boughton Jan 31 '13 at 16:35
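
The character-class idea from the comments could look roughly like the sketch below. It is only an illustration, not something proposed by the commenters: the variable names are hypothetical, it covers just the lowercase variants è, é, ê and ë, and it assumes the input uses precomposed characters rather than separate combining marks. The accented characters are built with `chr()` and matched with Java regex `\uXXXX` escapes so the result does not depend on the template's file encoding.

```cfml
<cfscript>
// Hypothetical example input: "crème brûlée"
// chr(232) = è, chr(251) = û, chr(233) = é
input = "cr" & chr(232) & "me br" & chr(251) & "l" & chr(233) & "e";

// The class [\u00E8-\u00EB] matches è, é, ê and ë (U+00E8 to U+00EB).
// CFML strings are Java strings, so String.replaceAll() is available.
output = input.replaceAll("[\u00E8-\u00EB]", "e");

writeOutput(output); // creme brûlee (the û is outside the class, so it is left alone)
</cfscript>
```

Every base letter would need its own class and replacement, which is the per-character mapping the first comment warns about.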

1 Answer


You can strip the accents from characters using Java's text normalization (`java.text.Normalizer`) and a regex. There is a function on cflib that already does this:

From: http://cflib.org/udf/deAccent

function deAccent(str){
    // Based on the approach found here: http://stackoverflow.com/a/1215117/894061
    var Normalizer = createObject("java","java.text.Normalizer");
    var NormalizerForm = createObject("java","java.text.Normalizer$Form");
    // NFD splits each accented character into its base letter plus combining marks
    var normalizedString = Normalizer.normalize(str, NormalizerForm.NFD);
    // Match runs of combining diacritical marks (U+0300 to U+036F) and remove them
    var pattern = createObject("java","java.util.regex.Pattern").compile("\p{InCombiningDiacriticalMarks}+");
    return pattern.matcher(normalizedString).replaceAll("");
}
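
To see why the two steps work together, the sketch below runs them separately on the word from the question and prints the string length at each stage. The variable names are hypothetical, and the input is built with `chr()` so the result does not depend on the template's file encoding.

```cfml
<cfscript>
// chr(243) is "ó", so this is the string "requesón"
original = "reques" & chr(243) & "n";

Normalizer = createObject("java", "java.text.Normalizer");
NFD = createObject("java", "java.text.Normalizer$Form").NFD;

// Step 1: canonical decomposition turns "ó" into "o" followed by U+0301 (combining acute accent)
decomposed = Normalizer.normalize(original, NFD);

// Step 2: remove the combining marks, leaving only the base letters
stripped = decomposed.replaceAll("\p{InCombiningDiacriticalMarks}+", "");

writeOutput(len(original) & "<br>");   // 8
writeOutput(len(decomposed) & "<br>"); // 9 (one extra code unit for the combining accent)
writeOutput(stripped);                 // requeson
</cfscript>
```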
Peter Boughton
Nathan Strutz
  • The last two lines of that function can be merged into one: `return normalizedString.replaceAll("\p{InCombiningDiacriticalMarks}+","");` – Peter Boughton Jan 31 '13 at 16:50
  • Looking into what the NFD form changes - there's a spec here: http://www.unicode.org/reports/tr15/tr15-23.html#Decomposition - it doesn't replace `ß`, for example. When I try searching for that character in the document, stupid Chrome searches for `ss` instead, which appears over a hundred times. :/ – Peter Boughton Jan 31 '13 at 17:00
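
Putting the two comments above together, a compact variant might look like the following sketch. The function name `deAccentCompact` is hypothetical and not part of cflib. Mapping `ß` to `ss` is an assumption made here for illustration; NFD leaves `ß` unchanged, so any character like it needs an explicit replacement of your choosing.

```cfml
<cfscript>
// Hypothetical helper, not from cflib: merges the pattern/matcher lines into one
// replaceAll() call and adds one explicit mapping for a character NFD does not decompose.
function deAccentCompact(str) {
    var Normalizer = createObject("java", "java.text.Normalizer");
    var NFD = createObject("java", "java.text.Normalizer$Form").NFD;
    var normalized = Normalizer.normalize(javaCast("string", arguments.str), NFD);
    // Strip the combining marks produced by the decomposition
    var stripped = normalized.replaceAll("\p{InCombiningDiacriticalMarks}+", "");
    // Assumption: transliterate ß (U+00DF) as "ss"
    return stripped.replaceAll("\u00DF", "ss");
}

// Inputs are built with chr() so the result does not depend on file encoding:
// chr(243) = ó, chr(223) = ß
writeOutput(deAccentCompact("reques" & chr(243) & "n") & "<br>"); // requeson
writeOutput(deAccentCompact("Stra" & chr(223) & "e"));            // Strasse
</cfscript>
```

Other letters that have no canonical decomposition would need their own entries in the same way.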