From: "Bill Chmura"
To: "'Ant Users List'"
Subject: RE: Problem with FixCRLF
Date: Wed, 4 Jun 2003 12:04:52 -0400

Okay, in my template program (a combination of Ant, Velocity and a custom task) I set the output to SHIFT_JIS... I don't know if that changes things? It's weird, and I know I am probably doing things wrong, but it's all I could do.

I can give it a try.

-----Original Message-----
From: Nathan Christiansen [mailto:Nathan_Christiansen@tni.com]
Sent: Wednesday, June 04, 2003 11:56 AM
To: Ant Users List
Subject: RE: Problem with FixCRLF

All Microsoft products will call the encoding that they are using Shift-JIS; however, Java calls the same encoding MS932.

So... In your task that is doing the Japanese pages (thanks to W.
Sean Hennessy for the code to distinguish) you would be better off using:

instead of:

--
Nathan Christiansen
Tahitian Noni International
http://www.tahitiannoni.com

-----Original Message-----
From: Bill Chmura [mailto:Bill@Explosivo.com]
Sent: Wednesday, June 04, 2003 9:47 AM
To: 'Ant Users List'
Subject: RE: Problem with FixCRLF

Well, it's a really weird thing, this whole Kanji thing. We get the content in a Word doc or text file. Do you know how weird this stuff looks in a text file??? Anywho... We load it into Word so we can at least see it in Kanji, then we add in the hypertext tags in English. The whole thing gets processed later through a template system written in Java, which we had problems with there also...

Long story made short is that the solution of not fixing the LF when charset=SHIFT_JIS is in the content works for now. It's a band-aid, but the output seems okay in the browser after the upload.

Phooey

-----Original Message-----
From: Nathan Christiansen [mailto:Nathan_Christiansen@tni.com]
Sent: Wednesday, June 04, 2003 11:37 AM
To: Ant Users List
Subject: RE: Problem with FixCRLF

The problem with this reasoning is that there are no SJIS characters that begin with either 0x0D or 0x0A; both of those are in the escape character range. If was not recognizing the character set, none of the Kanji would come through OK. (Learned after a frustrating week.)

You may need to use Microsoft's Code Page 932 (MS932 or Windows Japanese) instead of SJIS for the encoding. Shift-JIS is the standard implementation, and MS932 is Microsoft's extension to the standard.

(Java supports several Japanese character encodings. See http://java.sun.com/j2se/1.4.1/docs/guide/intl/encoding.doc.html)

MS Windows uses Microsoft's Code Page 932 (MS932) instead of the standard Shift-JIS. However, every multi-byte character in SJIS is encoded exactly the same in MS932. MS932 just adds 360 more multi-byte characters that SJIS reserves for IBM escape sequences.
--
Nathan Christiansen
Tahitian Noni International
http://www.tahitiannoni.com

-----Original Message-----
From: W. Sean Hennessy [mailto:shennessy@goldenhourdata.com]
Sent: Tuesday, June 03, 2003 4:13 PM
To: Ant Users List
Subject: RE: Problem with FixCRLF

I'll wager the fixcrlf task is not multi-byte char set capable. It might not distinguish between UTF-8 and UTF-16 encoded files. This means any CR LF combination (0x0D 0x0A) at the byte level is being converted to just LF (0x0A), hence the corruption of your SHIFT_JIS files, in which a single char represented by the two-byte combination 0x0D0A is being converted to 0x0A by fixcrlf.

-----Original Message-----
From: Bill Chmura [mailto:Bill@Explosivo.com]
Sent: Tuesday, June 03, 2003 1:48 PM
To: user@ant.apache.org
Subject: Problem with FixCRLF

On Sun JDK 1.4.1_02 / Ant 1.5.1

I have a task that goes through a directory on a Windows box and converts all the CR/LF line endings into the Unix linefeed, so when I archive it into a tar.gz and upload it, it's all ready on the Unix end. Here is the code

This works great, fantastic, etc... Everything I hoped it would be.

The problem I noticed is that I have some web pages that are in the SHIFT_JIS (Japanese) character set. When I run these pages through, it mangles a little bit of the text (little enough that I did not notice it at first). Now, I cannot read Japanese, so it could be converting everything into huge profanities for all I know. I do know that the files are different before and after I run the makeunixlf above.

Traditionally these files have been posted thru FTP, so I am not sure why the conversion is any different. It should still be the same, right?

Any ideas?

PS. The makeunixlf is in a shared library we have for Ant, so I cannot modify it to just exclude certain files unless I pass it in as a variable to use as a default exclude...
TIA

Bill

William B Chmura
Director of Internet Technology
Explosivo Internet Technology Group
http://www.Explosivo.com
Tel: (888) 560-YWEB

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@ant.apache.org
For additional commands, e-mail: user-help@ant.apache.org
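
The point made in the thread, that no SJIS character contains 0x0D or 0x0A, can be checked with a small sketch. This is not code from anyone in the thread and it is not how fixcrlf works internally; it is only an illustration in plain Java. The class and method names are hypothetical. The byte ranges are an assumption taken from the commonly documented layout of Shift-JIS / MS932 (lead bytes 0x81-0x9F and 0xE0-0xFC, trail bytes 0x40-0x7E and 0x80-0xFC), and the sample bytes are assumed to be the Shift-JIS encoding of a three-character Japanese word. If those ranges hold, a CR LF to LF conversion done on raw bytes cannot split a two-byte character, which would suggest the mangling comes from decoding or re-encoding the text with the wrong character set rather than from the line-ending change itself.

```java
import java.io.ByteArrayOutputStream;

class SjisCrlfSketch {

    // Assumed lead-byte ranges of a two-byte Shift-JIS / MS932 character.
    static boolean isLeadByte(int b) {
        return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
    }

    // Assumed trail-byte ranges of a two-byte Shift-JIS / MS932 character.
    static boolean isTrailByte(int b) {
        return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
    }

    // True if CR (0x0D) or LF (0x0A) could ever be half of a two-byte character.
    static boolean crOrLfCanBePartOfMultiByteChar() {
        for (int b : new int[] {0x0A, 0x0D}) {
            if (isLeadByte(b) || isTrailByte(b)) {
                return true;
            }
        }
        return false;
    }

    // Byte-level CR LF -> LF conversion: drops a CR only when an LF follows it.
    static byte[] crlfToLf(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        for (int i = 0; i < in.length; i++) {
            if (in[i] == 0x0D && i + 1 < in.length && in[i + 1] == 0x0A) {
                continue; // skip the CR; the LF is written on the next pass
            }
            out.write(in[i]);
        }
        return out.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two two-byte characters, CR LF, one two-byte character, CR LF.
        byte[] input = {
            (byte) 0x93, (byte) 0xFA, (byte) 0x96, 0x7B, 0x0D, 0x0A,
            (byte) 0x8C, (byte) 0xEA, 0x0D, 0x0A
        };
        System.out.println(crOrLfCanBePartOfMultiByteChar()); // prints "false"
        System.out.println(toHex(crlfToLf(input)));           // prints "93 FA 96 7B 0A 8C EA 0A"
    }
}
```

Working on bytes here is a deliberate choice: it sidesteps the SHIFT_JIS versus MS932 naming question entirely, since no decoding takes place and the multi-byte characters pass through untouched.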