My conclusion is that the conversion depends on the default encoding (the Windows setting "Language for non-Unicode programs").
Here is the program for testing:
package test;

import java.io.FileOutputStream;

public class Test {
    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder();
        sb.append("[카운터] sysprop=[").append(System.getProperty("cenv"));
        if (args.length > 0) {
            sb.append("], cmd args=[").append(args[0]);
        }
        sb.append("], file.encoding=").append(System.getProperty("file.encoding"));
        // Write the result to a file instead of System.out
        FileOutputStream fout = new FileOutputStream("/testout");
        fout.write(sb.toString().getBytes("UTF-8"));
        fout.close();
        //Thread.sleep(10000); // For checking arguments using Process Explorer
    }
}
Test1: "Language for non-Unicode programs" is Korean(Korea)
Execute in command prompt: java -Dcenv=카운터 test.Test 카운터
(Korean chars are correct when I verify the arguments using Process Explorer)
Result: [카운터] sysprop=[카운터], cmd args=[카운터], file.encoding=MS949
Test2: "Language for non-Unicode programs" is Chinese(Traditional, Taiwan)
Execute in command prompt (pasted from clipboard): java -Dcenv=카운터 test.Test 카운터
(I cannot see the Korean chars in the command window. However, the Korean chars are correct when I verify the arguments using Process Explorer)
Result: [카운터] sysprop=[???], cmd args=[???], file.encoding=MS950
Test3: "Language for non-Unicode programs" is Chinese(Traditional, Taiwan)
Launch from Eclipse by setting Program arguments and VM arguments. The command line in Process Explorer is C:\pg\jdk160\bin\javaw.exe -agentlib:jdwp=transport=dt_socket,suspend=y,address=localhost:50672 -Dcenv=카운터 -Dfile.encoding=UTF-8 -classpath S:\ws\wtest\bin test.Test 카운터
(This is the same as what you see in the Properties dialog of the Eclipse Debug view.)
Result: [카운터] sysprop=[???], cmd args=[bin], file.encoding=UTF-8
Change the Korean chars to "碁石", which exists in both the MS950 and MS949 charsets:
- Test1 Result:
[碁石] sysprop=[碁石], cmd args=[碁石], file.encoding=MS949
- Test2 Result:
[碁石] sysprop=[碁石], cmd args=[碁石], file.encoding=MS950
- Test3 Result:
[碁石] sysprop=[碁石], cmd args=[碁石], file.encoding=UTF-8
Change the Korean chars to "鈥焢", which exists in the MS950 charset:
- Test1 Result:
[鈥焢] sysprop=[??], cmd args=[??], file.encoding=MS949
- Test2 Result:
[鈥焢] sysprop=[鈥焢], cmd args=[鈥焢], file.encoding=MS950
- Test3 Result:
[鈥焢] sysprop=[鈥焢], cmd args=[鈥焢], file.encoding=UTF-8
Change the Korean chars to "宽广", which exists in the GBK charset:
- Test1 Result:
[宽广] sysprop=[??], cmd args=[??], file.encoding=MS949
- Test2 Result:
[宽广] sysprop=[??], cmd args=[??], file.encoding=MS950
- Test3 Result:
[宽广] sysprop=[??], cmd args=[??], file.encoding=UTF-8
- Test4: to verify my assumption, I changed "Language for non-Unicode programs" to Chinese(Simplified, PRC) and executed
java -Dcenv=宽广 test.Test 宽广
in the command prompt.
Result: [宽广] sysprop=[宽广], cmd args=[宽广], file.encoding=GBK
During testing, I always checked the command line via Process Explorer and made sure all chars were correct.
However, the command-line argument chars are converted using the system default encoding before main(String[] args) of the Java class is invoked. Test3 shows that setting -Dfile.encoding=UTF-8 does not change this. If any char does not exist in the charset of the default encoding, the program receives an unexpected argument.
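The '?' substitution described above can be reproduced without changing the Windows setting. The following is only a minimal sketch, not the launcher's actual code: the class name RoundTripDemo and the helper roundTrip are made up for illustration, and it assumes that encoding to a charset and decoding back approximates what happens to the arguments. US-ASCII stands in for a code page that lacks the characters because every JVM ships it; MS949 and MS950 are tried only if the running JDK provides them.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; not part of the original test program.
class RoundTripDemo {

    // Encode text with the named charset and decode it again.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement bytes (a '?' for US-ASCII).
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // The Korean test string from the post, written with Unicode
        // escapes so the source file encoding does not matter.
        String korean = "\uCE74\uC6B4\uD130";
        String[] names = {"US-ASCII", "MS949", "MS950", "UTF-8"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this JDK");
                continue;
            }
            String back = roundTrip(korean, name);
            System.out.println(name + ": survives=" + back.equals(korean));
        }
    }
}
```

A charset that cannot represent the characters reports survives=false, which matches the pattern in the test results: the argument arrives intact only when the system code page contains every char.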
I'm not sure whether the problem is caused by java.exe/javaw.exe or by Windows, but passing non-ASCII parameters via command-line arguments is not a good idea.
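If a non-ASCII value has to reach the program anyway, one possible workaround is to keep the command line pure ASCII and decode inside the program. This is my assumption and was not part of the tests above. The sketch below uses URL-safe Base64 over the UTF-8 bytes (java.util.Base64, so Java 8 or later is assumed); the class name AsciiSafeArgs and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; not part of the original test program.
class AsciiSafeArgs {

    // Turn any string into an ASCII-only token (letters, digits, '-', '_').
    static String toAscii(String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(utf8);
    }

    // Recover the original string, independent of the default encoding.
    static String fromAscii(String token) {
        byte[] utf8 = Base64.getUrlDecoder().decode(token);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            // No argument given: print the token for the Korean test string.
            System.out.println(toAscii("\uCE74\uC6B4\uD130"));
            return;
        }
        String decoded = fromAscii(args[0]);
        // Print only the length so the console encoding cannot distort the check.
        System.out.println("decoded length=" + decoded.length());
    }
}
```

Run without arguments, it prints the token 7Lm07Jq07YSw for the Korean test string; passing that token back as the argument should yield a decoded length of 3 regardless of the "Language for non-Unicode programs" setting, because the command line contains only ASCII.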
BTW, I also tried executing the command via a .bat file (file encoding is UTF-8), in case someone is interested.
Test5: "Language for non-Unicode programs" is Korean(Korea)
The command line in Process Explorer is java -Dcenv=移댁슫?? test.Test 移댁슫??
(The Korean chars are garbled)
Result: [카운터] sysprop=[移댁슫??], cmd args=[移댁슫??], file.encoding=MS949
Test6: "Language for non-Unicode programs" is Korean(Korea)
Add another VM argument. The command line in Process Explorer is java -Dfile.encoding=UTF-8 -Dcenv=移댁슫?? test.Test 移댁슫??
(The Korean chars are garbled)
Result: [카운터] sysprop=[移댁슫??], cmd args=[移댁슫??], file.encoding=UTF-8
Test7: "Language for non-Unicode programs" is Chinese(Traditional, Taiwan)
The command line in Process Explorer is java -cp s:\ws\wtest\bin -Dcenv=儦渥?? test.Test 儦渥??
(The Korean chars are garbled)
Result: [카운터] sysprop=[儦渥??], cmd args=[儦渥??], file.encoding=MS950
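The garbling in Test5 to Test7 looks like the UTF-8 bytes of the .bat file being interpreted in the system code page. That reading is my assumption and was not verified in the tests. A small sketch of the effect, with a hypothetical class MojibakeDemo and helper misdecode:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original test program.
class MojibakeDemo {

    // Encode text as UTF-8, then decode those bytes with a different charset.
    static String misdecode(String text, String wrongCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName(wrongCharset));
    }

    public static void main(String[] args) {
        // The Korean test string from the post, written with Unicode escapes.
        String korean = "\uCE74\uC6B4\uD130";
        for (String name : new String[] {"MS949", "MS950"}) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this JDK");
                continue;
            }
            String garbled = misdecode(korean, name);
            // Print code points in hex so the console encoding cannot distort them.
            StringBuilder hex = new StringBuilder();
            for (char c : garbled.toCharArray()) {
                hex.append(String.format("U+%04X ", (int) c));
            }
            System.out.println(name + ": " + hex.toString().trim());
        }
    }
}
```

The printed code points can be compared with the garbled strings that Process Explorer showed for the .bat runs.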