incubator-couchdb-user mailing list archives

From "John Evans" <j...@jpevans.com>
Subject Unnecessary escaping in URLs?
Date Sat, 28 Jun 2008 22:18:05 GMT
Hi All,

I am storing normal URLs (e.g. "http://www.example.com/foo/bar/baz.html") in
a field in CouchDB, which I access from Python.  Initially I used simplejson
as my parser since there was example code based on it.  As I have started
working with larger data sets and larger records, I have found that
simplejson is very slow, so I looked into faster alternatives.  I discovered
cjson, which is very fast but less tolerant, so now in my CouchDB client
class, everywhere that used to call simplejson tries cjson first, and if
cjson raises an exception it falls back to simplejson.  Something like this:

    import cjson
    import simplejson

    try:
        native_data = cjson.decode(raw_data)  # fast path
    except cjson.DecodeError:
        # Slower but more tolerant fallback.
        native_data = simplejson.loads(raw_data)

Well, what I have discovered is that when reading the URLs I have stored,
these two implementations treat them differently.  Everything works as I
would expect with simplejson, but with cjson the URLs show up with
backslashes in them, so the example above comes out as
"http:\/\/www.example.com\/foo\/bar\/baz.html".  I thought at first that
this must be a bug in cjson, but then I pulled up a record in the browser
(and again with curl) and saw that the extra backslashes are actually
returned by CouchDB.  I double-checked that my POSTs and PUTs do not include
them, so CouchDB seems to be adding them on output.  Since the data decodes
correctly with simplejson, my guess is that this may still be a bug in how
cjson parses strings, but I'm curious: is this additional escaping
intentional?  If so, is it necessary, and why?

(and if anyone has any recommendations on how to get cjson to parse it
correctly, that would be great too :))
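
To make the question concrete, here is a minimal sketch of the behaviour
described above.  It uses Python's standard json module as a stand-in for
simplejson (an assumption, so that the snippet runs without either
third-party parser).  The JSON grammar allows "\/" as an optional escape for
"/", so a conforming decoder should return the plain URL.  The helper
strip_solidus_escapes is hypothetical and not part of cjson or CouchDB; it
shows one possible workaround, rewriting the raw text before handing it to a
decoder that leaves "\/" untouched.

```python
import json
import re

# Matches an escaped solidus ("\/") whose backslash is not itself escaped,
# i.e. the "\/" is preceded by an even number (possibly zero) of backslashes.
_ESCAPED_SOLIDUS = re.compile(r'(?<!\\)((?:\\\\)*)\\/')


def strip_solidus_escapes(raw_json):
    """Hypothetical helper: rewrite every "\\/" escape in raw JSON text to "/".

    Escaped backslashes ("\\\\") are kept as they are, so a literal
    backslash followed by a slash in the data is not altered.
    """
    return _ESCAPED_SOLIDUS.sub(r'\1/', raw_json)


# Raw response text as described in the message (slashes escaped).
raw = r'{"url": "http:\/\/www.example.com\/foo\/bar\/baz.html"}'

# A conforming decoder turns "\/" back into "/".
decoded = json.loads(raw)

# Pre-processed text that a stricter decoder could be given instead.
cleaned = strip_solidus_escapes(raw)
```

With this in place, the fallback code above could call
cjson.decode(strip_solidus_escapes(raw_data)) instead of decoding raw_data
directly, at the cost of one extra pass over the text.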

Thanks,
-
John
