commons-dev mailing list archives

From "Honton, Charles" <Charles_Hon...@intuit.com>
Subject Re: [io] Unicode escape/unescape Writer/Reader
Date Wed, 07 Mar 2012 16:56:46 GMT
Emmanuel,

Isn't this performing the function of a java.nio.charset.CharsetDecoder or
an org.apache.commons.codec.StringDecoder?

Regards,
Chas Honton


On 3/7/12 8:12 AM, "Emmanuel Bourg" <ebourg@apache.org> wrote:

>I now have an implementation ready for the reader in the [csv] source
>code:
>
>https://svn.apache.org/repos/asf/commons/sandbox/csv/trunk/src/main/java/org/apache/commons/csv/UnicodeUnescapeReader.java
>
>I think I'll also handle other escape sequences such as \n or \t.
>
>Emmanuel Bourg
>
>
>On 12/11/2011 00:27, Emmanuel Bourg wrote:
>> Hi,
>>
>> It seems that unescaping Unicode escape sequences (\u1234) in an input
>> stream is a common need. [configuration] does it for
>> PropertiesConfiguration, and [csv] can also decode these sequences
>> optionally.
>>
>> In the other direction, there is also a need to escape Unicode
>> characters not supported by a given encoding when writing (see
>> CONFIGURATION-457).
>>
>> I think these features could be implemented as a UnicodeUnescapeReader
>> and a UnicodeEscapeWriter that might fit into [io].
>>
>> For the reader, any Unicode escape sequence would be transformed into
>> the corresponding Unicode character, or ignored if the sequence is not
>> valid.
>>
>> For the writer, a target charset would be specified in the constructor,
>> and any character not supported by this charset would be turned into
>> \uxxxx.
>>
>> What do you think?
>>
>> Emmanuel Bourg
>
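
For illustration, the reader idea quoted above could be sketched roughly as
follows. This is not the UnicodeUnescapeReader from the [csv] sandbox; the
class name UnescapeReaderSketch and the unescape() helper are hypothetical.
The sketch assumes that "ignored" means an invalid sequence is passed through
unchanged, and it does not treat a doubled backslash as an escaped backslash.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical name; not the class from the [csv] sandbox.
final class UnescapeReaderSketch extends Reader {
    private final PushbackReader in;

    UnescapeReaderSketch(Reader source) {
        // Pushback room for 'u' plus up to four candidate hex digits.
        this.in = new PushbackReader(source, 5);
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        if (c != '\\') {
            return c;
        }
        int u = in.read();
        if (u != 'u') {
            if (u != -1) {
                in.unread(u);
            }
            return c;
        }
        char[] seen = new char[4];
        int count = 0;
        int value = 0;
        boolean valid = true;
        while (count < 4) {
            int h = in.read();
            if (h == -1) {
                valid = false;
                break;
            }
            seen[count++] = (char) h;
            int digit = hexValue(h);
            if (digit < 0) {
                valid = false;
                break;
            }
            value = (value << 4) | digit;
        }
        if (valid) {
            return value;
        }
        // Assumed policy: an invalid sequence is passed through unchanged,
        // so everything after the backslash is pushed back for later reads.
        in.unread(seen, 0, count);
        in.unread('u');
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return (n == 0 && len > 0) ? -1 : n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Strict ASCII hex digit check; returns -1 for anything else.
    private static int hexValue(int ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        return -1;
    }

    // Convenience helper for trying the reader on a string.
    static String unescape(String text) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new UnescapeReaderSketch(new StringReader(text))) {
            for (int c = r.read(); c != -1; c = r.read()) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Wrapping the source in a PushbackReader keeps the invalid-sequence case
simple: the characters already consumed are handed back and re-read as
ordinary text on the following calls.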

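The writer side of the quoted proposal could look something like the sketch
below: the target charset is given in the constructor, and any character its
encoder cannot represent is written as a \uXXXX sequence. The class name
EscapeWriterSketch and the escape() helper are hypothetical. Characters are
tested one UTF-16 unit at a time with CharsetEncoder.canEncode(char), so
supplementary characters (surrogate pairs) are always escaped as two units,
even when the charset could encode the pair.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical name; a sketch of the proposed writer, not an existing class.
final class EscapeWriterSketch extends Writer {
    private final Writer out;
    private final CharsetEncoder encoder;

    EscapeWriterSketch(Writer out, Charset target) {
        this.out = out;
        // Dedicated encoder, used only to ask whether a char is encodable.
        this.encoder = target.newEncoder();
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            char c = buf[i];
            if (encoder.canEncode(c)) {
                out.write(c);
            } else {
                // Backslash, 'u', then four lowercase hex digits.
                out.write(String.format("\\u%04x", (int) c));
            }
        }
    }

    @Override
    public void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    // Convenience helper for trying the writer on a string.
    static String escape(String text, Charset target) {
        StringWriter sink = new StringWriter();
        try (Writer w = new EscapeWriterSketch(sink, target)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

With ISO-8859-1 as the target, for example, an accented Latin letter would
pass through untouched while the euro sign would be escaped, since that
charset has no mapping for it.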


