ant-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 36290] - <copy filtering="on"> mutilates LATIN1 text files
Date Tue, 23 Aug 2005 08:14:33 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=36290>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=36290

------- Additional Comments From anathaniel@apache.org  2005-08-23 10:14 -------
It need not be LATIN1, but it must be an encoding that can read an arbitrary 
byte stream into Java chars and write it out again so that the new file is a 
byte-for-byte copy of the original.  For token replacement to work, ASCII must 
be a subset of that encoding.

UTF-8 cannot be used because some of the 256 byte values, and many byte 
sequences, are invalid in UTF-8 and would not survive the round trip.

Fixed-width multi-byte encodings such as UTF-16 cannot be used because they 
would fail on files with an odd number of bytes.
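
The constraints above can be checked with a small Java sketch.  It is only an 
illustration, not Ant's implementation: the class and method names 
(RoundTripCheck, survivesRoundTrip, replaceToken) are made up for this example. 
It decodes bytes with a charset, re-encodes them, and compares the result with 
the input; it also shows an ASCII token being replaced in a UTF-8 file that was 
read as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of Ant.
class RoundTripCheck {

    /** Decodes the bytes with the charset, re-encodes, and compares with the input. */
    static boolean survivesRoundTrip(byte[] original, Charset charset) {
        String decoded = new String(original, charset);
        byte[] reencoded = decoded.getBytes(charset);
        return Arrays.equals(original, reencoded);
    }

    /** Returns an array holding each byte value 0x00..0xFF exactly once. */
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    /**
     * Replaces an ASCII token while treating the input as ISO-8859-1.
     * Bytes outside the token pass through unchanged, whatever the
     * file's real encoding is, as long as that encoding keeps ASCII
     * characters as single ASCII bytes.
     */
    static byte[] replaceToken(byte[] input, String token, String value) {
        String text = new String(input, StandardCharsets.ISO_8859_1);
        return text.replace(token, value).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        // Every byte maps to exactly one char in ISO-8859-1.
        System.out.println(survivesRoundTrip(all, StandardCharsets.ISO_8859_1)); // true
        // Malformed UTF-8 input is replaced with U+FFFD when decoding.
        System.out.println(survivesRoundTrip(all, StandardCharsets.UTF_8));      // false
        // Three bytes cannot be split into two-byte UTF-16 code units.
        System.out.println(survivesRoundTrip(new byte[] {0x41, 0x42, 0x43},
                StandardCharsets.UTF_16BE));                                     // false
    }
}
```

Because the token and its replacement are plain ASCII here, replaceToken also 
works on a UTF-8 file: the non-ASCII bytes are carried through as opaque 
ISO-8859-1 chars and written back unchanged.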

I came across this problem in the Cocoon build file macro, where an unknown set 
of XML files, encoded in either UTF-8 or ISO-8859-1 at the author's choice, 
needs to be copied, possibly with token replacement.  The platform default 
encoding is outside our control.

Of course, an even better solution would be if the copy+filtering task looked 
at the <?xml encoding="..."?> to determine the correct encoding for XML files.
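
A rough sketch of what such a lookup could look like, under stated 
assumptions: the class XmlEncodingSniffer and its method are hypothetical, not 
an existing Ant API, and the sketch only handles ASCII-compatible files 
without a byte order mark.  It decodes the first bytes as ISO-8859-1, which is 
safe because the XML declaration itself is plain ASCII, and falls back to a 
caller-supplied default when no encoding is declared.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Ant.
class XmlEncodingSniffer {

    // Matches the encoding pseudo-attribute inside a leading XML declaration.
    // [^>]*? cannot cross the closing "?>", so attributes of the root
    // element are never mistaken for the declaration.
    private static final Pattern DECL = Pattern.compile(
            "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /**
     * Returns the encoding named in the XML declaration at the start of
     * head, or fallback if there is no declaration or no encoding in it.
     */
    static String declaredEncoding(byte[] head, String fallback) {
        int length = Math.min(head.length, 200); // the declaration is short
        String prolog = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = DECL.matcher(prolog);
        return m.find() ? m.group(1) : fallback;
    }
}
```

A caller would typically pass "UTF-8" as the fallback, since that is what the 
XML specification assumes when no encoding is declared and no byte order mark 
is present.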

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org