couchdb-dev mailing list archives

From "mark (JIRA)" <j...@apache.org>
Subject [jira] Created: (COUCHDB-333) Json handling of UTF8 strings not in accordance with rfc4627
Date Sat, 25 Apr 2009 17:00:30 GMT
Json handling of UTF8 strings not in accordance with rfc4627
------------------------------------------------------------

                 Key: COUCHDB-333
                 URL: https://issues.apache.org/jira/browse/COUCHDB-333
             Project: CouchDB
          Issue Type: Bug
          Components: Database Core
    Affects Versions: 0.9
         Environment: couchdb 0.9.0  spidermonkey 0.7.0 erlang R12B3
            Reporter: mark


Some Unicode characters escaped in the JSON \uXXXX format are rejected with an "invalid_json" error. For example, sending a character as a UTF-16 surrogate pair:

curl -X PUT -d '{"revisions":[],"_id":"U_1d11e","codepoint":"3441","definition":"\uD834\uDD1E G clef character"}' http://localhost:5984/mydb/U_1d11e

yields

{"error":"invalid_json","reason":"{\"revisions\":[],\"_id\":\"U_1d11e\",\"codepoint\":\"3441\",\"definition\":\"\\uD834\\uDD1E G clef character\"}"}

However, RFC 4627 states:
   To escape an extended character that is not in the Basic Multilingual
   Plane, the character is represented as a twelve-character sequence,
   encoding the UTF-16 surrogate pair.  So, for example, a string
   containing only the G clef character (U+1D11E) may be represented as
   "\uD834\uDD1E".

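The arithmetic behind the RFC's example can be sketched as follows. This is only an illustration of the standard UTF-16 surrogate-pair calculation, written in Python; the function names are made up for this sketch and have nothing to do with CouchDB's own code.

```python
def to_surrogate_pair(code_point):
    """Split a code point outside the BMP into a UTF-16 (high, low) pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the Basic Multilingual Plane")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def json_escape_astral(code_point):
    """Return the twelve-character JSON escape for a non-BMP code point."""
    high, low = to_surrogate_pair(code_point)
    return "\\u%04X\\u%04X" % (high, low)


# G clef, U+1D11E, as in the RFC example
print(json_escape_astral(0x1D11E))
```

For U+1D11E the offset is 0xD11E, which splits into 0x34 and 0x11E, giving the pair D834 DD1E quoted in the RFC.
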
Furthermore, CouchDB accepts escaped strings of the form \uXXXXXXXX (eight hex digits), which the JSON RFC does not mention as an acceptable escape:

curl -X PUT -d '{"revisions":[],"_id":"U_1d11e","codepoint":"3441","definition":"\u0001D11E G clef character"}' http://localhost:5984/mydb/U_1d11e

yields
{"ok":true,"id":"U_1d11e","rev":"1-1270273433"}

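For comparison, here is how another JSON parser treats the two escape forms. This is a small sketch using Python's standard json module, not CouchDB's parser, and the variable names are only for illustration. It reads the surrogate pair as a single character, and reads \u0001D11E as the four-digit escape \u0001 followed by the literal text "D11E".

```python
import json

# The twelve-character surrogate-pair escape from the RFC example.
pair = json.loads('"\\uD834\\uDD1E"')

# The eight-digit form: only the first four hex digits belong to the escape.
eight = json.loads('"\\u0001D11E"')

print(len(pair), hex(ord(pair)))   # one character, U+1D11E
print(len(eight), repr(eight))     # U+0001 followed by the text D11E
```

Under that reading the second request is valid JSON, but the stored string would not contain the G clef character.
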

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

