couchdb-dev mailing list archives

From "Joan Touzet (JIRA)" <>
Subject [jira] Created: (COUCHDB-345) High ASCII can be
Date Wed, 06 May 2009 22:15:30 GMT
High ASCII can be 

                 Key: COUCHDB-345
             Project: CouchDB
          Issue Type: Bug
    Affects Versions: 0.9
         Environment: OSX 10.5.6
            Reporter: Joan Touzet

It is possible to PUT/POST a document into CouchDB with a "high ASCII" value that cannot be
retrieved. This results from not escaping a non-ASCII value into \u#### when PUT/POSTing the

This sample code will recreate the problem using the hex value D8 (Ø) in a possibly unsavoury
test string. It requires the file in the same directory, containing the code from
the "Example wrapper class" at
to run.

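from Couch import Couch

db = Couch('localhost', '5984')
# Python 2: hex-decode into a byte string that contains the raw byte 0xD8
badtext = "4E45494D454E2046D85252204641454E21".decode("hex")
# The raw byte is placed in the JSON body without \u#### escaping
doc = """
{
    "Message":\"""" + badtext + """\"
}
"""
db.saveDoc('utf8_fail', doc, 'fail')
db.openDoc('utf8_fail', 'fail')

Sample output against 0.9.0 is as follows: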

{
    "ok": true
}
{
    "id": "fail",
    "ok": true,
    "rev": "1-76726372"
}
{
    "error": "ucs",
    "reason": "{bad_utf8_character_code}"
}

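The rejected byte can be inspected without a running server. The following is a Python 3 sketch (the reproduction script above is Python 2) that decodes the same hex string and reports where strict UTF-8 decoding fails. The helper name first_invalid_utf8_offset is hypothetical and is not part of CouchDB or the wrapper class; it only illustrates why a lone D8 byte is not valid UTF-8.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks strict UTF-8 decoding,
    or None if the data is valid UTF-8. (Hypothetical helper for illustration.)"""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# Same bytes as the test string in the report.
raw = bytes.fromhex("4E45494D454E2046D85252204641454E21")

offset = first_invalid_utf8_offset(raw)
# 0xD8 announces a two-byte UTF-8 sequence, but the next byte (0x52, "R")
# is not a continuation byte, so decoding fails at the D8 byte.
print(offset, hex(raw[offset]))  # prints: 8 0xd8
```

The same character encoded properly as UTF-8 (the two bytes C3 98) passes the check, which suggests the stored bytes are a single-byte encoding such as Latin-1 rather than UTF-8.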

Note that this defect exposed a second problem: the bad_utf8_character_code exception, thrown
when a design document attempted to map() the bad document, caused Futon to fail silently
while building the view. Nothing indicated the failure except the debug log, which showed two
attempts to build the view, both failing, followed by an uncaught exception error for Futon.

Based on this, there are likely other areas in the codebase that do not handle the bad_utf8_character_code
exception correctly.
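
As a client-side precaution, a document body can be serialized so that every non-ASCII character is sent as a \u#### escape. The following is a minimal Python 3 sketch, not a fix for the server. It assumes the test bytes were meant as Latin-1 text (the report does not name the encoding), and the helper name to_ascii_json is hypothetical.

```python
import json


def to_ascii_json(doc: dict) -> str:
    """Serialize doc to JSON containing only ASCII characters.
    (Hypothetical helper; ensure_ascii=True is the json.dumps default,
    stated explicitly here for clarity.)"""
    return json.dumps(doc, ensure_ascii=True)


raw = bytes.fromhex("4E45494D454E2046D85252204641454E21")
text = raw.decode("latin-1")  # assumption: the bytes are Latin-1 text

body = to_ascii_json({"Message": text})
print(body)  # prints: {"Message": "NEIMEN F\u00d8RR FAEN!"}
```

Because the resulting body is pure ASCII, it is also valid UTF-8, so no raw high byte reaches the server.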

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
