ant-user mailing list archives

From Radomír Věncek <>
Subject National characters filtering problem
Date Fri, 11 Jun 2004 09:40:03 GMT
Hi all!


I’m using the <copy> task to copy and “nationalize” (localize) the Java source code of a J2ME
application by replacing ordinary @patterns@. This process works properly for most
national characters.


But some of them are encoded incorrectly.

I am now making a Greek port of my application, and here is the problem: some characters
are filtered incorrectly.


I have an ordinary Java source file:


      public static final String[] cmdsLList =
      {
//          "@loc.cmd.back@",
            "Πίσω", // direct Greek text „Πίσω“ (binary: ce a0  ce af  cf 83  cf 89)
      };





I have a filter file saved with Notepad in UTF-8 encoding:


loc.cmd.ok = OK

loc.cmd.back = Πίσω

loc.cmd.pause = Παύση = Επιλέξτε
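
One possible factor, sketched under an assumption this message does not confirm: if the filters file is read through java.util.Properties.load(InputStream), its bytes are decoded as ISO-8859-1 regardless of how Notepad saved them, so every UTF-8 byte becomes a separate character. The names FilterFileDemo and loadValue below are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class FilterFileDemo {

    // Stores one "key = value" line as UTF-8 bytes (as an editor would save it),
    // then loads it with Properties.load(InputStream), which decodes ISO-8859-1.
    // Assumption: this mirrors how the filters file is read; the message does not say so.
    static String loadValue(String line, String key) {
        byte[] utf8 = line.getBytes(StandardCharsets.UTF_8);
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(utf8));
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // "\u03A0\u03AF\u03C3\u03C9" is the Greek word from loc.cmd.back, written as escapes.
        String value = loadValue("loc.cmd.back = \u03A0\u03AF\u03C3\u03C9", "loc.cmd.back");
        System.out.println("characters after loading: " + value.length());
        for (char c : value.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

If that assumption holds, the value for loc.cmd.back arrives as eight Latin-1 characters instead of four Greek letters. Writing such values as \uXXXX escapes is one way a properties file can carry them without depending on the encoding the file was saved in.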


And this is the part of build.xml that copies and localizes the source code:


      <copy todir="Temp/Src" flatten="yes" filtering="true" includeemptydirs="false">
            <filterset begintoken="@" endtoken="@">
                  <filtersfile file="Temp/Locales/${bld.loc}.txt"/>
            </filterset>
            <fileset dir="Src" includes="**/*"/>
      </copy>



And the result is the following:

Original Java source:




//                      "@loc.cmd.back@",






Copied and filtered Java source:



//                      "ΠίÏ?ω",






I can only hope that the characters in this mail are displayed correctly.

If not – please look at - copy of this mail.


The problem is in the “loc.cmd.back” string (and of course many others): the σ (sigma)
character, which is encoded in UTF-8 as the byte sequence 0xcf 0x83, comes out of filtering
as the bytes 0xcf 0x3f, which are not interpreted or displayed correctly.


BUT the problem also occurs if I include the Greek text directly in the original Java source.
This text should NOT be filtered, only copied, yet in this case too the binary representation
of the sigma character differs (a “?” character instead of “ƒ”).


If I comment out the filterset element, the file is copied normally and no change occurs (of course
the @patterns@ are not replaced), and the Greek text is displayed correctly.
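
The byte change described above is consistent with the text being decoded and then re-encoded through a single-byte code page in which 0x83 is unassigned. The following is a minimal sketch of that mechanism. It assumes (this is not stated in the message) that the filtering copy uses a platform default encoding such as windows-1250, where 0x83 has no character. RoundTripDemo, roundTrip and hex are hypothetical names.

```java
import java.nio.charset.Charset;

final class RoundTripDemo {

    // UTF-8 bytes of the Greek word quoted in the message: ce a0 ce af cf 83 cf 89
    static final byte[] PISO_UTF8 = {
        (byte) 0xce, (byte) 0xa0, (byte) 0xce, (byte) 0xaf,
        (byte) 0xcf, (byte) 0x83, (byte) 0xcf, (byte) 0x89
    };

    // Decodes the raw bytes with the given charset and encodes them again with
    // the same charset, as a Reader/Writer based copy would do.
    static byte[] roundTrip(byte[] raw, Charset cs) {
        String decoded = new String(raw, cs); // unassigned bytes become U+FFFD
        return decoded.getBytes(cs);          // unencodable chars become '?' (0x3f)
    }

    // Formats bytes as lowercase hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: windows-1250 stands in for the unknown platform default encoding.
        Charset assumedDefault = Charset.forName("windows-1250");
        System.out.println("before: " + hex(PISO_UTF8));
        // prints "after:  ce a0 ce af cf 3f cf 89"
        System.out.println("after:  " + hex(roundTrip(PISO_UTF8, assumedDefault)));
    }
}
```

Under that assumption only the 0x83 byte is lost, because the other seven bytes all map to characters in windows-1250 and survive the round trip. A copy that transfers bytes without decoding them would leave the file unchanged.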





I’m working on Windows XP Professional version 2002 with Service Pack 1, and I’m using
Ant 1.5.3.




