I have a piece of code in Java which prints out a String:

    String original = new String("A" + "\u00ea" + "\u00f1" + "\u00fc" + "C");
    byte[] utf8Bytes = original.getBytes("UTF8");
    String roundTrip = new String(utf8Bytes, "UTF8");
    System.out.println("roundTrip = " + roundTrip);    // Output is [roundTrip = AêñüC]

When I call this Java code from Scala, I get back AΩ±ⁿC instead.

If I print the string's bytes in hex, the output from Scala and from Java is identical:

    try {
        String original = new String("A" + "\u00ea" + "\u00f1" + "\u00fc" + "C");
        byte[] utf8Bytes = original.getBytes("UTF8");
        byte[] defaultBytes = original.getBytes();
        String roundTrip = new String(utf8Bytes, "UTF8");
        printBytes(utf8Bytes, "utf8Bytes");
        System.out.println();
        printBytes(defaultBytes, "defaultBytes");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
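
The snippet above calls `printBytes`, which is not shown in the post. Here is a minimal sketch of such a helper, assuming it prints one line per byte in the `name[index] = 0x..` format seen in the output. The class name `HexDump` and the method `toHex` are hypothetical names chosen for illustration.

```java
import java.io.UnsupportedEncodingException;

class HexDump {
    // Formats one byte as lowercase hex with a 0x prefix.
    static String toHex(byte b) {
        // Mask with 0xff so negative bytes are not sign-extended.
        return String.format("0x%02x", b & 0xff);
    }

    // Prints each byte on its own line as name[index] = 0x..
    static void printBytes(byte[] array, String name) {
        for (int k = 0; k < array.length; k++) {
            System.out.println(name + "[" + k + "] = " + toHex(array[k]));
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "A\u00ea\u00f1\u00fcC";
        printBytes(original.getBytes("UTF8"), "utf8Bytes");
    }
}
```

Dumping the bytes this way sidesteps the console entirely, because every character printed is plain ASCII.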

Here's the output:

    utf8Bytes[0] = 0x41
    utf8Bytes[1] = 0xc3
    utf8Bytes[2] = 0xaa
    utf8Bytes[3] = 0xc3
    utf8Bytes[4] = 0xb1
    utf8Bytes[5] = 0xc3
    utf8Bytes[6] = 0xbc
    utf8Bytes[7] = 0x43

    defaultBytes[0] = 0x41
    defaultBytes[1] = 0xc3
    defaultBytes[2] = 0xaa
    defaultBytes[3] = 0xc3
    defaultBytes[4] = 0xb1
    defaultBytes[5] = 0xc3
    defaultBytes[6] = 0xbc
    defaultBytes[7] = 0x43

Reference: https://docs.oracle.com/javase/tutorial/i18n/text/string.html

Edit: I'm using Scala to call the Java code.

Why is the Scala output different from the Java output? How can I fix the Scala side so that it prints the same output as Java?

Top.Deck
  • What's the Scala code? And what's the point of the default thing if you want to work with UTF8 specifically? – pvg May 30 '17 at 22:20
  • My guess is this isn't something to do with scala/java but instead with how you're running them. Maybe your IDE or terminal or something. – Joe K May 30 '17 at 22:27
  • When I run the code translated to Scala I get the expected, not reported, output. – jwvh May 30 '17 at 22:40
  • I don't see any Scala code in your posting. – Bob Dalgleish May 31 '17 at 00:34
  • Are you running both programs with the same terminal settings? If you use a *nix variant, what's the output of the `locale` command? If you are using Windows, check: https://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using – Diego May 31 '17 at 00:48
  • @pvg I'm using Scala to call the Java function. I have input in several languages, so I think UTF8 is the best way to represent it. – Top.Deck May 31 '17 at 02:13
  • @JoeK I run both the Java code and the Scala code in the Windows terminal. – Top.Deck May 31 '17 at 02:14
  • @Diego Yes, under the same console. – Top.Deck May 31 '17 at 02:15
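
The comments point at the console rather than the language. One way to check that is sketched below, under stated assumptions: it prints the JVM's default encoding, writes the string through a `PrintStream` with an explicit UTF-8 encoding, and reproduces the garbled text by decoding Latin-1 bytes as code page 437. That last scenario is only a guess at what happened; the post does not confirm which encodings were involved. The class name `ConsoleEncodingCheck` and the helper `misread` are hypothetical.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

class ConsoleEncodingCheck {
    // Encodes text with one charset and decodes the bytes with another,
    // imitating a console whose code page differs from the writer's encoding.
    static String misread(String text, String writtenAs, String shownAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(shownAs));
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "A\u00ea\u00f1\u00fcC";

        // What the JVM believes its default encoding is.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        // Write through a stream with an explicit encoding instead of the default.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("roundTrip = " + original);

        // Assumed scenario: single-byte Latin-1 output shown in code page 437.
        // IBM437 is an extended charset, so check that this JVM provides it.
        if (Charset.isSupported("IBM437")) {
            System.out.println("misread   = " + misread(original, "ISO-8859-1", "IBM437"));
        }
    }
}
```

With those two charsets, `misread` turns `AêñüC` into `AΩ±ⁿC`, which matches the text reported from the Scala run. If the two launchers print different values for `file.encoding`, that difference is a likely suspect. The console also has to interpret the bytes as UTF-8 for the explicit stream to display correctly; on Windows, `chcp 65001` switches cmd.exe to the UTF-8 code page.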
