couchdb-dev mailing list archives

From "Joan Touzet (JIRA)" <j...@apache.org>
Subject [jira] Updated: (COUCHDB-345) "High ASCII" can be inserted into db but not retrieved
Date Wed, 06 May 2009 22:23:30 GMT

     [ https://issues.apache.org/jira/browse/COUCHDB-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joan Touzet updated COUCHDB-345:
--------------------------------

    Attachment: badtext.tar.gz

Code to reproduce the defect, including the prerequisite wrapper module.

> "High ASCII" can be inserted into db but not retrieved
> ------------------------------------------------------
>
>                 Key: COUCHDB-345
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-345
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 0.9
>         Environment: OSX 10.5.6
>            Reporter: Joan Touzet
>         Attachments: badtext.tar.gz
>
>
> It is possible to PUT/POST a document into CouchDB with a "high ASCII" value that cannot
> be retrieved. This results from not escaping a non-ASCII value into \u#### when PUT/POSTing
> the document.
> This sample code will recreate the problem using the hex value D8 (Ø) in a possibly
> unsavoury test string. It requires the file Couch.py in the same directory, containing the
> code from the "Example wrapper class" at http://wiki.apache.org/couchdb/Getting_started_with_Python
> to run.
> ================================================
> #!/usr/bin/python
> from Couch import Couch
> db = Couch('localhost', '5984')
> db.createDb('utf8_fail')
> badtext = "4E45494D454E2046D85252204641454E21".decode("hex")
> doc = """
> {
>     "Message":\"""" + badtext + """\",
> }
> """
> db.saveDoc('utf8_fail', doc, 'fail')
> db.openDoc('utf8_fail', 'fail')
> ================================================
> Sample output against 0.9.0 is as follows:
> {
>     "ok": true
> }
> {
>     "id": "fail", 
>     "ok": true, 
>     "rev": "1-76726372"
> }
> {
>     "error": "ucs", 
>     "reason": "{bad_utf8_character_code}"
> }
> ================================================
> Please note this defect turned up another problem, namely that the bad_utf8_character_code
> exception thrown by a design document attempting to map() the bad document caused Futon to
> fail silently in building the view, with no indication (except via debug log) that there was
> a failure. The log indicated two attempts to build the view, both failing, followed by an
> uncaught exception error for Futon.
> Based on this, there are likely other areas in the codebase that do not handle the bad_utf8_character_code
> exception correctly.
> My belief is that CouchDB shouldn't accept this input and should have rejected the PUT/POST,
> or should have escaped the input itself before the insertion.
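
For illustration only, here is a minimal sketch of the client-side escaping the report describes, written in Python 3 rather than the Python 2 of the reproduction script. It assumes the byte D8 was meant as Latin-1 (ISO-8859-1) "Ø"; the report does not name an encoding, and the helper names is_valid_utf8 and build_doc are hypothetical. It does not touch CouchDB itself and is not the fix applied to the server.

```python
import json

# The raw bytes from the report. Hex D8 is "Ø" in Latin-1, but as a lone
# byte followed by ASCII "R" it is not a valid UTF-8 sequence.
raw = bytes.fromhex("4E45494D454E2046D85252204641454E21")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def build_doc(data: bytes, encoding: str = "latin-1") -> str:
    """Decode the bytes (Latin-1 is an assumption) and serialise them as
    JSON with every non-ASCII character escaped as \\uXXXX."""
    text = data.decode(encoding)
    return json.dumps({"Message": text}, ensure_ascii=True)


body = build_doc(raw)
print(is_valid_utf8(raw))  # the raw bytes are not valid UTF-8
print(body)                # pure-ASCII JSON, with the "Ø" escaped
```

Building the body with json.dumps instead of string concatenation also avoids the trailing comma present in the hand-written document of the reproduction script.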

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

