ant-user mailing list archives

From "Nathan Christiansen" <>
Subject RE: Problem with FixCRLF
Date Wed, 04 Jun 2003 15:36:52 GMT
The problem with this reasoning is that no SJIS character begins with either
0x0D or 0x0A; both of those are in the control character range.
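The byte-range claim can be checked directly. What follows is a minimal sketch, not anything from the Ant sources: the class and method names are made up for illustration, and it assumes the JVM ships the extended charsets "Shift_JIS" and "windows-31j" (Java's canonical name for MS932). It encodes every BMP character the charset supports and counts the multi-byte results that contain 0x0D or 0x0A.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class SjisLineBreakBytes {

    /** Counts BMP characters whose multi-byte encoding contains 0x0D or 0x0A. */
    static int countMultiByteCharsContainingCrOrLf(String charsetName) {
        Charset charset = Charset.forName(charsetName);
        CharsetEncoder encoder = charset.newEncoder();
        int offenders = 0;
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            char c = (char) cp;
            // Skip surrogates and anything the charset cannot represent.
            if (Character.isSurrogate(c) || !encoder.canEncode(c)) {
                continue;
            }
            byte[] bytes = String.valueOf(c).getBytes(charset);
            if (bytes.length < 2) {
                continue; // single-byte characters (ASCII, half-width kana)
            }
            for (byte b : bytes) {
                if (b == 0x0D || b == 0x0A) {
                    offenders++;
                    break;
                }
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Shift_JIS", "windows-31j"}) {
            System.out.println(name + ": " + countMultiByteCharsContainingCrOrLf(name));
        }
    }
}
```

Both counts should come out as zero, which means a CR or LF byte in an SJIS or MS932 file can only be a real line-ending character, never half of a Kanji.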

If <fixcrlf.../> were not recognizing the character set, none of the Kanji would come
through OK. (Learned after a frustrating week.)

You may need to use Microsoft's Code Page 932 (MS932 or Windows Japanese) instead of SJIS
for the encoding.  Shift-JIS is the standard implementation, and MS932 is Microsoft's extension
to the standard. (Java supports several Japanese character encodings. See

MS Windows uses Microsoft's Code Page 932 (MS932) instead of the standard Shift-JIS. However,
every multi-byte character in SJIS is encoded exactly the same way in MS932. MS932 just adds 360
more multi-byte characters that SJIS reserves for IBM escape sequences.
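To see how the difference between the two encodings could mangle "a little bit" of a file, here is a rough sketch under some assumptions: the class and method names are hypothetical, "windows-31j" is used as Java's canonical name for MS932, and the circled digit one (U+2460) is assumed to be one of the Microsoft extension characters that Java's strict "Shift_JIS" charset does not map. The code decodes Windows-produced bytes with a given charset, re-encodes them, and reports whether the bytes survived.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class Ms932RoundTrip {

    /** Bytes as a Windows editor would write them: hiragana A, circled one, CRLF. */
    static byte[] sampleWindowsBytes() {
        return "\u3042\u2460\r\n".getBytes(Charset.forName("windows-31j"));
    }

    /** Decodes with the named charset, re-encodes, and compares with the input. */
    static boolean survivesRoundTrip(byte[] original, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        // Unmappable input is silently replaced during decoding and encoding,
        // which is how a few characters can be lost without any error.
        String decoded = new String(original, charset);
        byte[] reencoded = decoded.getBytes(charset);
        return Arrays.equals(original, reencoded);
    }

    public static void main(String[] args) {
        byte[] bytes = sampleWindowsBytes();
        System.out.println("windows-31j round trip intact: "
                + survivesRoundTrip(bytes, "windows-31j"));
        System.out.println("Shift_JIS round trip intact:   "
                + survivesRoundTrip(bytes, "Shift_JIS"));
    }
}
```

Only the extension characters are affected in the second case, while ordinary kana and Kanji pass through unchanged, which would fit the symptom of a page that looks almost right.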

-- Nathan Christiansen
   Tahitian Noni International

-----Original Message-----
From: W. Sean Hennessy []
Sent: Tuesday, June 03, 2003 4:13 PM
To: Ant Users List
Subject: RE: Problem with FixCRLF

I'll wager the fixcrlf task is not multi-byte char set capable.
It might not distinguish between UTF-8 and UTF-16 encoded files.
This means any combination crlf (0x0D 0x0A) at the byte level is being
converted to just lf (0x0A), hence the corruption of your SHIFT_JIS files
whose single char represented by the two byte combination of 0x0D0A are
being converted to 0x0A by fixcrlf.

-----Original Message-----
From: Bill Chmura []
Sent: Tuesday, June 03, 2003 1:48 PM
Subject: Problem with FixCRLF

On Sun JDK 1.4.1_02 / Ant 1.5.1

I have a task that goes through a directory on a Windows box and converts
all the CR/LF line endings to Unix linefeeds, so that when I archive it into a
tar.gz and upload it, it's all ready on the Unix end.

Here is the code

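<target name="makeunixlf">
  <!-- The archived message cuts off after "includes"; eol="lf" and the
       closing tags are assumed from the description above. -->
  <fixcrlf srcdir="${webroot.dir}"
           eol="lf"
           includes="**/*.html, **/*.css, **/*.txt, **/*.sh, **/*.js, **/*.cgi, **/*.pl, **/*.pm" />
</target>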

This works great, fantastic, etc... Everything I hoped it would be.

The problem I noticed is that I have some web pages that are in the
SHIFT_JIS (Japanese) character set.  When I run these pages through it, it
mangles a little bit of the text (little enough that I did not notice it at
first).  Now, I cannot read Japanese, so it could be converting
everything into huge profanities for all I know.  I do know that the
files after I run the makeunixlf target above differ from the files before.

Traditionally these files have been posted through FTP, so I am not sure
why the conversion is any different.  It should still be the same.
Any ideas?

PS.  The makeunixlf is in a shared library we have for ant, so I cannot
modify it to just exclude certain files unless I pass it in as a
variable to use as a default exclude...



William B Chmura
Director of Internet Technology
Explosivo Internet Technology Group
Tel: (888) 560-YWEB

To unsubscribe, e-mail:
For additional commands, e-mail:
