ant-user mailing list archives

From: "Hardacker, Andrew"
Subject: RE: National characters filtering problem
Date: Fri, 11 Jun 2004 14:24:08 GMT

Radomir, I think you need to encode your non-ASCII characters for Java.

From Sun:
"The Java compiler and other Java tools can only process files which contain
Latin-1 and/or Unicode-encoded (\udddd notation) characters. native2ascii
converts files which contain other character encodings into files containing
Latin-1 and/or Unicode-encoded charaters." (Spelling verbatim)

# your file run through: native2ascii -encoding UTF8
loc.cmd.ok = OK
loc.cmd.back = \u03a0\u03af\u03c3\u03c9
loc.cmd.pause = \u03a0\u03b1\u03cd\u03c3\u03b7 = \u0395\u03c0\u03b9\u03bb\u03ad\u03be\u03c4\u03b5
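
As a rough illustration of the conversion shown above, here is a minimal Java
sketch that rewrites every non-ASCII character as a backslash-u escape. It is
not the native2ascii source; the class and method names are made up for this
example, and characters outside the Basic Multilingual Plane are not treated
specially.

```java
import java.nio.charset.StandardCharsets;

public class AsciiEscaper {
    /** Replaces every char above 0x7F with a backslash-u escape; ASCII passes through. */
    public static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                // four lowercase hex digits, as in the sample output above
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the Greek value of loc.cmd.back, as listed in the original mail
        byte[] utf8 = {(byte) 0xce, (byte) 0xa0, (byte) 0xce, (byte) 0xaf,
                       (byte) 0xcf, (byte) 0x83, (byte) 0xcf, (byte) 0x89};
        String greek = new String(utf8, StandardCharsets.UTF_8);
        System.out.println("loc.cmd.back = " + escapeNonAscii(greek));
    }
}
```

Run on those bytes, the sketch prints the same loc.cmd.back line as the
native2ascii output above.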

There's also an Ant task called, surprise, native2ascii, that I use to
convert localized property files.
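
The escaped form is useful for property files because java.util.Properties
decodes backslash-u escapes when it loads a stream. The sketch below shows
only that decoding step. It assumes the filter file is read the way a
properties file is read, which this thread does not confirm for Ant 1.5.3,
and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EscapedPropertiesDemo {
    /** Loads properties from ASCII text that may contain backslash-u escapes. */
    public static Properties load(String asciiText) {
        Properties props = new Properties();
        try {
            // Properties.load(InputStream) reads ISO 8859-1 and decodes the escapes
            props.load(new ByteArrayInputStream(
                    asciiText.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String file = "loc.cmd.ok = OK\n"
                    + "loc.cmd.back = \\u03a0\\u03af\\u03c3\\u03c9\n";
        String back = load(file).getProperty("loc.cmd.back");
        StringBuilder codePoints = new StringBuilder();
        for (char c : back.toCharArray()) {
            codePoints.append(String.format("U+%04X ", (int) c));
        }
        // Code points are printed so the result does not depend on the console encoding
        System.out.println(codePoints.toString().trim());
    }
}
```

The loaded value consists of the four Greek code points U+03A0 U+03AF U+03C3
U+03C9, regardless of the platform's default encoding.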



-----Original Message-----
From: Radomír Vencek
Sent: Friday, June 11, 2004 5:40 AM
Subject: National characters filtering problem

Hi all!


I’m using the <copy> task to copy and “nationalize” the Java sources of a
J2ME application by replacing ordinary @patterns@. This works properly for
most national characters.

But some of them are encoded incorrectly.

Right now I want to make a Greek port of my application, and here is the
problem: some characters are filtered incorrectly.

I have a normal Java source:

      public static final String [] cmdsLList =
      {
//          "@loc.cmd.back@",
            "Πίσω", // direct Greek text „Πίσω“ (binary: ce a0  ce af  cf 83  cf 89)
      };

I have a filter file, saved with Notepad in UTF-8 encoding:

loc.cmd.ok = OK
loc.cmd.back = Πίσω
loc.cmd.pause = Παύση = Επιλέξτε

And this is the part of build.xml that copies and nationalizes the source
code:

      <copy todir="Temp/Src" flatten="yes" filtering="true">
            <filterset begintoken="@" endtoken="@">
                  <filtersfile file="Temp/Locales/${bld.loc}.txt"/>
            </filterset>
            <fileset dir="Src" includes="**/*"/>
      </copy>

And the result is as follows.

Original Java source:

//                      "@loc.cmd.back@",

Copied and filtered Java source:

//                      "ΠίÏ?ω",






I can only hope that the characters in this mail are displayed correctly.

If not – please look at - copy of this


The problem is in the “loc.cmd.back” string (and of course many others), in
the σ (sigma) character, which is encoded in UTF-8 as the byte sequence
0xcf 0x83. After filtering, the result file contains the bytes 0xcf 0x3f,
which are not interpreted and displayed correctly.


BUT the problem also occurs if I put the Greek text directly into the
original Java source. This text should NOT be filtered, only copied, but in
this case too the binary representation of the sigma character changes
(the “?” character instead of “ƒ”).


If I comment out the filterset, the file is copied normally and nothing
changes (of course the @patterns@ are not replaced), and the Greek text is
displayed correctly.
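
The byte change described above (0xcf 0x83 becoming 0xcf 0x3f) is what one
would expect if the filtering copy decodes the file with a single-byte
platform charset in which 0x83 is unassigned and then writes it back with the
same charset. The sketch below reproduces that round trip in plain Java. It
assumes the platform default was windows-1250, which the mail does not state,
and that the JVM ships that charset; it does not call Ant, and the class and
method names are invented for the example.

```java
import java.nio.charset.Charset;

public class WrongCharsetRoundTrip {
    /** Decodes the bytes with the given charset and encodes them back with the same charset. */
    public static byte[] roundTrip(byte[] input, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        // Bytes with no mapping decode to U+FFFD, which is encoded back as '?'
        return new String(input, cs).getBytes(cs);
    }

    /** Formats bytes as space-separated two-digit lowercase hex. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the Greek string, as listed in the mail
        byte[] utf8 = {(byte) 0xce, (byte) 0xa0, (byte) 0xce, (byte) 0xaf,
                       (byte) 0xcf, (byte) 0x83, (byte) 0xcf, (byte) 0x89};
        // Assumed platform default: windows-1250 (an assumption, see above)
        System.out.println(hex(roundTrip(utf8, "windows-1250")));
        // Reading and writing as UTF-8 leaves the bytes unchanged
        System.out.println(hex(roundTrip(utf8, "UTF-8")));
    }
}
```

Under those assumptions the first line should read ce a0 ce af cf 3f cf 89,
with only the 0x83 byte replaced, and the second line should match the input.
If that is what happens here, one thing to try is telling the copy which
encoding the sources use; the Ant manual lists an encoding attribute on
<copy> for filter-copying, though whether it helps on Ant 1.5.3 has not been
verified here.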





I’m working on Windows XP Professional, Version 2002, with Service Pack 1,
and I’m using Ant 1.5.3.




