1

I have a string in UTF-8 format and want to convert it to clean ANSI format. How can I do that?

Xplosive
  • What *exactly* do you mean by "ANSI"? There are various encodings which could be (inaccurately) referred to as "ANSI". See http://stackoverflow.com/questions/701882/what-is-ansi-format – Jon Skeet May 10 '14 at 19:10

3 Answers

2

You can do something like this:

new String("your utf8 string".getBytes(Charset.forName("utf-8")));

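With this approach, 4 bytes of UTF-8 are converted to 8 bytes of ANSI. Note that new String(byte[]) decodes with the platform default charset, so the result depends on how the JVM is configured.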
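To make the target encoding explicit instead of relying on the platform default, something along these lines may help. It is only a sketch: it assumes "ANSI" means windows-1252, and the class and method names (Utf8ToAnsiBytes, utf8ToAnsi) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8ToAnsiBytes {
    // Assumption: "ANSI" means windows-1252; swap in another code page if needed.
    static final Charset ANSI = Charset.forName("windows-1252");

    // Decode the UTF-8 bytes into a String, then re-encode in the target charset.
    static byte[] utf8ToAnsi(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        return text.getBytes(ANSI);
    }

    public static void main(String[] args) {
        // "café": é (U+00E9) takes 2 bytes in UTF-8 and 1 byte in windows-1252.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] ansi = utf8ToAnsi(utf8);
        System.out.println(utf8.length + " -> " + ansi.length); // prints "5 -> 4"
    }
}
```

Characters that windows-1252 cannot represent are replaced with '?' by getBytes, so this is still lossy for arbitrary input.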

EN20
1

You could use a Java function like this one to convert from UTF-8 to ISO_8859_1 (whose printable characters are a subset of Windows-1252, the encoding most often called "ANSI"):

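// Requires: import java.nio.charset.StandardCharsets;
private static String convertFromUtf8ToIso(String s1) {
    if (s1 == null) {
        return null;
    }
    // Java strings are UTF-16 internally, so no UTF-8 decoding step is needed here.
    // Encode to ISO-8859-1; unmappable characters are replaced with '?'.
    byte[] b = s1.getBytes(StandardCharsets.ISO_8859_1);
    // Decode the same bytes back so the result holds only ISO-8859-1 characters.
    return new String(b, StandardCharsets.ISO_8859_1);
}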

Here is a simple test:

String s1 = "your utf8 stringáçﬠ";
String res = convertFromUtf8ToIso(s1);
System.out.println(res);

This prints out:

your utf8 stringáç?

The last character (ﬠ, U+FB20) is lost because ISO_8859_1 cannot represent it: it takes 3 bytes in UTF-8, which puts it well outside the U+0000 to U+00FF range that ISO_8859_1 covers. ISO_8859_1 can represent á and ç.
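If silently getting a '?' is not acceptable, one option is to check or encode strictly with a CharsetEncoder. The following is a rough sketch, not a definitive implementation; the class and method names (StrictIsoEncoder, fitsIso, encodeStrict) are invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictIsoEncoder {
    // Returns true if every character of s can be encoded in ISO-8859-1.
    static boolean fitsIso(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    // Encodes s to ISO-8859-1, throwing instead of substituting '?'.
    static byte[] encodeStrict(String s) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buf = encoder.encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fitsIso("\u00e1\u00e7"));        // á and ç: prints true
        System.out.println(fitsIso("\u00e1\u00e7\ufb20"));  // adds U+FB20: prints false
    }
}
```

With encodeStrict, the caller decides what to do about unmappable input (reject it, transliterate it, or fall back to replacement) rather than finding '?' in the output later.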

gil.fernandes
0

Converting UTF-8 to ANSI is not possible in general, because ANSI (read here as 7-bit ASCII) has only 128 characters, while UTF-8 can encode every Unicode code point using up to 4 bytes each. It's like converting a long to an int: you lose information in most cases.
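The information loss is easy to see with a round trip through a 7-bit charset. This is a small sketch assuming "ANSI" is taken to mean US-ASCII; AsciiRoundTrip and its methods are illustrative names only.

```java
import java.nio.charset.StandardCharsets;

public class AsciiRoundTrip {
    // Encode to 7-bit US-ASCII and decode again; unmappable characters come back as '?'.
    static String throughAscii(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    // True only when the round trip returned the string unchanged.
    static boolean survivesAscii(String s) {
        return s.equals(throughAscii(s));
    }

    public static void main(String[] args) {
        System.out.println(throughAscii("plain text"));    // prints "plain text"
        System.out.println(throughAscii("\u00e1\u00e7"));  // á and ç: prints "??"
        System.out.println(survivesAscii("\u00e1\u00e7")); // prints false
    }
}
```

Once á and ç have both become '?', there is no way to tell them apart again, which is the same kind of loss as truncating a long to an int.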

Mike