couchdb-dev mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: unicode output representation
Date Mon, 13 Apr 2009 22:39:59 GMT
On Mon, Apr 13, 2009 at 3:34 PM, dmi <losthost@yandex.ru> wrote:
> Hello All!
>
> CouchDB is now using a modified version of mochijson2 for JSON output.
> The standard behavior of this library is to accept unicode in all forms
> (unicode, utf8, \uXXXX) via decode/1, but when unicode is emitted via
> encode/1 to the client app, all unicode symbols are converted to \uXXXX
> form.
>
> This is done for maximal compatibility. But I suspect that modern
> software that may want to interact with CouchDB will have no problems
> with raw UTF-8.
>
> A recent version of mochiweb (r99) introduces an optional capability for
> mochijson2 to emit raw UTF-8.
> The proposed way is:
>
> Encoder = mochijson2:encoder([{utf8, true}]),
> JSON = Encoder(json())
>
> I have tested this patch (in reduced form) against CouchDB and it seems to be working.
>
> I think that bringing this option to CouchDB would be a good improvement
> for developers of international software.
>

Thanks for digging in here.

To avoid incompatibility with old software, we may want to either:

- make this a request-time option
- switch intelligently on some HTTP request header

Any thoughts on how best to do this? Should utf8 be the default, or \uXXXX?
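
To make the two options concrete, here is a minimal sketch in Python rather than Erlang. It is not mochijson2 or CouchDB code: `json.dumps` with `ensure_ascii` stands in for the encoder's \uXXXX versus raw UTF-8 behavior, and the `utf8` query parameter and the use of `Accept-Charset` are hypothetical names chosen only for illustration. The default stays at \uXXXX so that old clients see no change.

```python
import json


def encode_json(value, raw_utf8=False):
    # raw_utf8=False escapes every non-ASCII character as \uXXXX
    # (the current behavior described in the quoted message).
    # raw_utf8=True emits the characters directly, analogous to the
    # proposed {utf8, true} encoder option.
    text = json.dumps(value, ensure_ascii=not raw_utf8)
    return text.encode("utf-8")


def wants_raw_utf8(query_params, headers):
    # Hypothetical policy: an explicit request-time option wins,
    # otherwise fall back to a request header, otherwise keep \uXXXX.
    flag = query_params.get("utf8")  # assumed parameter name
    if flag is not None:
        return flag.lower() == "true"
    accept_charset = headers.get("Accept-Charset", "")  # assumed header choice
    return "utf-8" in accept_charset.lower()


def respond(value, query_params, headers):
    # Pick the output form per request and return the response body bytes.
    return encode_json(value, raw_utf8=wants_raw_utf8(query_params, headers))
```

Both forms decode to the same value on the client, so the choice only affects payload size and compatibility with clients that mishandle non-ASCII bytes. For example, `respond({"city": "Köln"}, {"utf8": "true"}, {})` would return raw UTF-8 bytes, while `respond({"city": "Köln"}, {}, {})` would return the escaped form.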

Once we have these questions answered, if you put a patch in JIRA[1]
it's likely to be accepted.

[1] http://issues.apache.org/jira/browse/COUCHDB


-- 
Chris Anderson
http://jchrisa.net
http://couch.io
