couchdb-dev mailing list archives

From Justin Cormack <jus...@specialbusservice.com>
Subject Re: unicode output representation
Date Wed, 15 Apr 2009 11:43:57 GMT

On 13 Apr 2009, at 23:39, Chris Anderson wrote:

> On Mon, Apr 13, 2009 at 3:34 PM, dmi <losthost@yandex.ru> wrote:
>> Hello All!
>>
>> CouchDB now uses a modified version of mochijson2 for JSON output.
>> The standard behavior of this library is to accept Unicode in all
>> forms (unicode, utf8, \uXXXX) via decode/1,
>> but when Unicode is emitted via encode/1 to the client app, all
>> non-ASCII characters are converted to \uXXXX form.
>>
>> This is done for maximal compatibility. But I suspect that modern
>> software that may want to interact with CouchDB will have no
>> problems with raw UTF-8.
>>
>> A recent version of mochiweb (r99) introduces an optional capability
>> for mochijson2 to emit raw UTF-8.
>> The proposed way is:
>>
>> Encoder = mochijson2:encoder([{utf8, true}]),
>> JSON = Encoder(json())
>>
>> I have tested this patch (in reduced form) against CouchDB and it  
>> seems to be working.
>>
>> I think that bringing this option to CouchDB will be a good
>> improvement for developers of international software.
>>
>
> Thanks for digging in here.
>
> To avoid incompatibility with old software, we may want to either:
>
> - make this a request time option
> - switch intelligently on some http request header
>
> Any thoughts on how best to do this? Should utf8 be the default, or  
> \uXXXX?
>
> Once we have these questions answered, if you put a patch in JIRA[1]
> it's likely to be accepted.
>
> [1] http://issues.apache.org/jira/browse/COUCHDB
>

In my experience raw UTF-8 output is better than \uXXXX escapes (and
shorter!). The JSON spec (http://json.org) specifically says that a
parser *must* accept any Unicode character in a string other than the
ones that have to be backslash-escaped (the double quote, the backslash
and control characters), and I was very surprised to find that a lot of
JSON tools produce the \uXXXX versions by default.

Because it is part of the spec, I don't see any problem in just
changing the default.
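
To illustrate the size and compatibility point, here is a minimal
sketch in Python (not Erlang, and not the mochijson2 API), using only
the standard json module. The sample document is made up for
illustration. Python's json.dumps happens to be one of the tools that
escapes by default; passing ensure_ascii=False makes it emit raw
Unicode instead.

```python
import json

# Hypothetical sample document containing non-ASCII text.
doc = {"greeting": "Привет"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(doc)

# Raw Unicode output: characters are left as-is in the JSON text.
raw = json.dumps(doc, ensure_ascii=False)

# What would actually go over the wire if the response is UTF-8 encoded.
escaped_bytes = escaped.encode("utf-8")
raw_bytes = raw.encode("utf-8")

# A compliant parser decodes both forms to the same value.
assert json.loads(escaped_bytes.decode("utf-8")) == doc
assert json.loads(raw_bytes.decode("utf-8")) == doc

print(escaped)
print(raw)
print(len(escaped_bytes), "bytes escaped vs", len(raw_bytes), "bytes raw")
```

Both forms decode to the same value, so a compliant client should not
notice the difference; the raw form is simply fewer bytes for
non-ASCII text.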

>
> -- 
> Chris Anderson
> http://jchrisa.net
> http://couch.io

