couchdb-user mailing list archives

From "Paul Davis" <paul.joseph.da...@gmail.com>
Subject Re: Document Updates
Date Sat, 15 Nov 2008 01:41:27 GMT
On Fri, Nov 14, 2008 at 8:37 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> I wrote a fuzz thing to go along with the diff testing.
>
> You can get it with:
>
> $ sudo easy_install jsontools
>
> # Examples
> from StringIO import StringIO
> import jsontools
> stream = StringIO()
>
> # Fuzzy objects
> fj = jsontools.FuzzyJson()
> obj1 = fj.generate(1).next()
> obj2 = fj.modify(obj1)
>
> # Diff the objects
> jsontools.jsondiff(obj1, obj2, stream=stream)
>
> # Apply the diff
> stream.seek(0)
> result = jsontools.jsonapply(stream, obj1)
>
> # Compare them
> assert jsontools.jsoncmp(result, obj2) == 2
>

== True
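For anyone without jsontools installed, the round trip in the quoted example (diff two objects, apply the diff to the first, compare with the second) can be sketched in plain Python. This is a minimal sketch, not the jsontools implementation: `diff_objects` and `apply_diff` are hypothetical names, the diff is a list of tuples rather than a stream, and only top-level fields of Case 2 objects (order ignored, no repeated fields) are handled.

```python
# Hypothetical helpers for illustration only; this is NOT the jsontools API.
_MISSING = object()  # sentinel so that a stored None still counts as a value


def diff_objects(old, new):
    """Diff two dicts as a list of ("del" | "set", key, value) operations."""
    ops = []
    # Fields that disappeared.
    for key in old:
        if key not in new:
            ops.append(("del", key, None))
    # Fields that were added or whose value changed.
    for key, value in new.items():
        if old.get(key, _MISSING) != value:
            ops.append(("set", key, value))
    return ops


def apply_diff(ops, obj):
    """Return a copy of obj with the diff operations applied."""
    result = dict(obj)
    for op, key, value in ops:
        if op == "del":
            del result[key]
        else:
            result[key] = value
    return result


obj1 = {"a": 1, "b": [1, 2], "c": "x"}
obj2 = {"a": 1, "b": [1, 3], "d": None}

patch = diff_objects(obj1, obj2)
result = apply_diff(patch, obj1)
assert result == obj2
```

A fuzz test in the same spirit would generate many random `obj1`/`obj2` pairs and check that the final assertion always holds.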

> Any comments?
>
> Paul
>
> On Thu, Nov 13, 2008 at 9:34 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> I don't think we need canonical JSON.
>>
>> The Spec definitely needs to be disambiguated though. As I see it
>> there are two interpretations:
>>
>> 1. Order of fields matters which means repeated fields are ok
>> 2. Order does not matter which means repeated fields are NOT ok
>>
>> It doesn't matter which is chosen, but one of them must be picked to make this work.
>>
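The two interpretations can be made concrete with the `json` module in current Python standard libraries. This is only an illustration of the distinction, assuming a Python recent enough to have `object_pairs_hook`; `reject_duplicates` is a hypothetical helper, and nothing here is taken from json-diff.py.

```python
import json

text = '{"a": 1, "b": 2, "a": 3}'

# Case 2 (order does not matter): a plain dict holds one value per field,
# so the repeated "a" silently collapses to the last value seen.
as_dict = json.loads(text)

# Case 1 (order matters): object_pairs_hook receives every (name, value)
# pair in document order, so repeated fields survive.
as_pairs = json.loads(text, object_pairs_hook=list)


def reject_duplicates(pairs):
    """Strict Case 2: refuse repeated fields instead of dropping them."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("repeated field: %r" % key)
        seen[key] = value
    return seen


try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_ok = True
except ValueError:
    strict_ok = False
```

A diff written against the pair list has to treat objects like arrays; a diff written against the dict can key everything by field name, which is why the choice has to be made up front.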
>> Also, I got bored. So I implemented JSON diff in python for Case #2.
>>
>> http://www.davispj.com/svn/projects/json-diff/json-diff.py
>>
>> I gotta jet, but when I get home in a bit I'm gonna write a JSON fuzz
>> library and then pound the diff thing with it.
>>
>> Not sure if it's obvious or not, but switching from case 2 to 1 is
>> straightforward. Also, my current array diff implementation is kinda
>> whack: indels screw up the rest of the diff, as in, it's not so much a
>> diff as a "delete the rest and add new". Getting this optimal is actually
>> an N^2 runtime algorithm via dynamic programming (Smith-Waterman style).
>>
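The N^2 dynamic programming mentioned above can be sketched with a longest-common-subsequence table. Note this swaps in plain LCS rather than Smith-Waterman scoring, and `array_diff` is a hypothetical name, not something from json-diff.py. The point is that a single insertion or deletion no longer turns the rest of the array into "delete rest and add new".

```python
def array_diff(old, new):
    """Return a list of ("keep" | "del" | "add", item) operations."""
    n, m = len(old), len(new)
    # lcs[i][j] = length of the longest common subsequence of old[i:] and new[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the front, emitting operations as we go.
    ops, i, j = [], 0, 0
    while i < n and j < m:
        if old[i] == new[j]:
            ops.append(("keep", old[i]))
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append(("del", old[i]))
            i += 1
        else:
            ops.append(("add", new[j]))
            j += 1
    ops.extend(("del", item) for item in old[i:])
    ops.extend(("add", item) for item in new[j:])
    return ops


def apply_array_diff(ops):
    """Rebuild the new array from the operations."""
    return [item for op, item in ops if op != "del"]
```

Filling the table costs time and memory proportional to len(old) * len(new), which is the N^2 in question.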
>> Also, do note that the Erlang parser and Python (and I assume Ruby is
>> in the Python boat) behave differently with respect to the 2
>> cases. Erlang is Case 1, Python is Case 2.
>>
>> Paul
>>
>>
>> On Thu, Nov 13, 2008 at 8:20 PM, Chris Anderson <jchris@apache.org> wrote:
>>> On Thu, Nov 13, 2008 at 5:02 PM, ara.t.howard <ara.t.howard@gmail.com> wrote:
>>>>
>>>> On Nov 13, 2008, at 5:49 PM, Antony Blakey wrote:
>>>>
>>>>> You could use the view mechanism, and attach a "language" attribute, and
>>>>> have this be a general transformation interface, which would indeed be very
>>>>> nice. For efficiency you would want to apply this over sets of documents,
>>>>> and probably in a transactional context like bulk update does now.
>>>>>
>>>>> However... Damien wants something to use in replication, which would mean
>>>>> that javascript would then become a required, rather than an optional part
>>>>> of Couch, because replication would require it (unless you made the
>>>>> replication diff generator pluggable ... but why go there?). The benefit of
>>>>> the declarative diff format is that applying a diff can be done within
>>>>> Couch.
>>>>
>>>> couldn't these queries run in the view server?  in fact any mechanism which
>>>> would allow the view server could accomplish this with a protocol between it
>>>> and the db server.  basically it's an addition to the map/reduce
>>>> functionality which would alter documents on the fly.
>>>>
>>>
>>> Antony's right that replication currently does not depend on the
>>> availability of the view server. And I think it is smart to avoid that
>>> dependence, when possible.
>>>
>>> Alas, my attempt to bypass all the craziness that is canonical JSON,
>>> has come up short of that. Oh wells...
>>>
>>> --
>>> Chris Anderson
>>> http://jchris.mfdz.com
>>>
>>
>
