incubator-couchdb-user mailing list archives

From "Paul Davis" <>
Subject Re: Document Updates
Date Sat, 15 Nov 2008 01:37:55 GMT
I wrote a fuzz thing to go along with the diff testing.

You can get it with:

$ sudo easy_install jsontools

# Examples
from StringIO import StringIO
import jsontools
stream = StringIO()

# Fuzzy objects
fj = jsontools.FuzzyJson()
obj1 = fj.generate(1).next()
obj2 = fj.modify(obj1)

# Diff the objects
jsontools.jsondiff(obj1, obj2, stream=stream)

# Apply the diff
result = jsontools.jsonapply(stream, obj1)

# Compare them
assert jsontools.jsoncmp(result, obj2) == 2

Any comments?


On Thu, Nov 13, 2008 at 9:34 PM, Paul Davis <> wrote:
> I don't think we need canonical JSON.
> The Spec definitely needs to be disambiguated though. As I see it
> there are two interpretations:
> 1. Order of fields matters which means repeated fields are ok
> 2. Order does not matter which means repeated fields are NOT ok
> It doesn't matter which is chosen, but one of them must be to make this work.
> Also, I got bored. So I implemented JSON diff in python for Case #2.
> I gotta jet, but when I get home in a bit I'm gonna write a JSON fuzz
> library and then pound the diff thing with it.
> Not sure if it's obvious or not, but switching from case 2 to 1 is
> straightforward. Also, my current array diff implementation is kinda
> whack. Indels screw up the rest of the diff, as in, it's not so much a
> diff as a "delete the rest and add new". Getting this optimal is actually an
> N^2 runtime algorithm via dynamic programming (Smith-Waterman style).
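
As a rough illustration of the dynamic-programming idea mentioned above, here is a minimal sketch of an array diff built on a longest-common-subsequence table. It is not the jsontools implementation and not Smith-Waterman proper; the function name `array_diff` and the `keep`/`del`/`add` op names are made up for this example. The table costs O(N*M) time and space, which is the N^2 behaviour for arrays of similar length.

```python
def array_diff(old, new):
    """Return ('keep' | 'del' | 'add', value) ops that turn old into new.

    Hypothetical helper for illustration only; elements are compared with ==.
    """
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the front, emitting one op per step
    ops = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            ops.append(("keep", old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("del", old[i]))
            i += 1
        else:
            ops.append(("add", new[j]))
            j += 1
    # Whatever is left on either side is pure deletion or pure insertion
    ops.extend(("del", v) for v in old[i:])
    ops.extend(("add", v) for v in new[j:])
    return ops
```

With this approach a single insertion or deletion in the middle of an array yields one `add` or `del` op, and the elements after it are still reported as `keep` rather than being deleted and re-added.
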
> Also, do note that the Erlang parser and Python (and I assume Ruby is
> in the Python boat) have different behaviors with respect to the 2
> cases. Erlang is Case 1, Python is Case 2.
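
To make the two cases concrete, here is a small sketch using the `json` module from current Python's standard library (an assumption: the thread does not say which Python JSON parser was in use at the time). By default a repeated field is silently collapsed so the last value wins, which matches Case 2 only loosely; `object_pairs_hook` exposes the ordered pairs needed for Case 1, or lets you reject repeats outright. The helper `reject_duplicates` is a made-up name for illustration.

```python
import json

text = '{"a": 1, "b": 2, "a": 3}'

# Default behaviour: the object becomes a dict, and for a repeated
# name only the last value is kept.
case2 = json.loads(text)

# Case 1 style: keep every (name, value) pair in document order.
case1 = json.loads(text, object_pairs_hook=list)


def reject_duplicates(pairs):
    """Strict Case 2: build a dict, but refuse repeated field names."""
    names = [name for name, _ in pairs]
    if len(names) != len(set(names)):
        raise ValueError("repeated field name in JSON object")
    return dict(pairs)


try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

A diff tool has to pick one of these views up front, since the list-of-pairs form and the dict form of the same text do not compare equal.
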
> Paul
> On Thu, Nov 13, 2008 at 8:20 PM, Chris Anderson <> wrote:
>> On Thu, Nov 13, 2008 at 5:02 PM, ara.t.howard <> wrote:
>>> On Nov 13, 2008, at 5:49 PM, Antony Blakey wrote:
>>>> You could use the view mechanism, and attach a "language" attribute, and
>>>> have this be a general transformation interface, which would indeed be very
>>>> nice. For efficiency you would want to apply this over sets of documents,
>>>> and probably in a transactional context like bulk update does now.
>>>> However... Damien wants something to use in replication, which would mean
>>>> that javascript would then become a required, rather than an optional part
>>>> of Couch, because replication would require it (unless you made the
>>>> replication diff generator pluggable ... but why go there?). The benefit
>>>> of the declarative diff format is that applying a diff can be done within
>>>> Couch.
>>> couldn't these queries run in the view server?  in fact any mechanism which
>>> would allow the view server could accomplish this with a protocol between it
>>> and the db server.  basically it's an addition to the map/reduce
>>> functionality which would alter documents on the fly.
>> Antony's right that replication currently does not depend on the
>> availability of the view server. And I think it is smart to avoid that
>> dependence, when possible.
>> Alas, my attempt to bypass all the craziness that is canonical JSON,
>> has come short of that. Oh wells...
>> --
>> Chris Anderson
