couchdb-dev mailing list archives

From Daniel Friesen <li...@danielfriesen.name>
Subject Re: Erlang vs Scala
Date Mon, 13 Apr 2009 05:01:13 GMT
It wasn't XML vs. JSON that made us go with the xml database.
The issue was CouchDB's methodology of dealing with whole documents and 
using views vs. Sedna (XQuery) which allows modification of portions of 
the document and provides extensive querying and indexing capability.

Our data is hierarchical and quite complex. Because of how we would have 
to work with it in CouchDB, each individual object would need to be 
stored in its own document (we cannot store several objects in the same 
document without running into the whole commit conflict issue). However, 
the views system does not let us query for a bunch of objects the way we 
need, because we cannot deal with the data recursively.
This was discussed on the user list and led to a proposal on how to deal 
with recursive data inside of CouchDB, which then moved to the dev list. 
However, it also ended up with me searching for other non-conventional 
database types and settling on an XML+XQuery database as the most 
mature method of dealing with our use case.

To detail, we use "Jits" inside of our app. These "Jits" can have any 
number of jits as children. There is no real limitation on the depth of 
this structure, and jits are not limited to having only one parent. As a 
result within a site Jits can also belong to multiple pages. The only 
limitation I have placed is that data can't be cyclic and can't belong 
to more than one site.
In CouchDB we would have needed a structure similar to:
document aaa: { "isa": "jit", "state": {}, "children": ["bbb", "ccc"] }
document bbb: { "isa": "jit", "state": {}, "children": ["ddd"] }
document ccc: { "isa": "jit", "state": {}, "children": ["ddd"] }
document ddd: { "isa": "jit", "state": {}, "children": [] }
But unfortunately we can't grab data hierarchically without either doing 
a pile of queries or complicating the data structure to a degree that I 
cannot sanely justify inflicting on the program when I have other 
options. Avoiding huge piles of queries and overcomplicated data 
structures was the whole reason behind moving away from MySQL and into 
using non-traditional databases.
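
To make the "pile of queries" concrete, here is a rough Python sketch. 
It is not CouchDB client code: `DOCS`, `fetch` and `load_tree` are 
hypothetical names, and `fetch` is only an in-memory stand-in for one 
GET of a document by id. It walks the four example documents above and 
counts how many fetches it takes to load one jit with its descendants.

```python
# Hypothetical in-memory stand-in for the four example documents.
DOCS = {
    "aaa": {"isa": "jit", "state": {}, "children": ["bbb", "ccc"]},
    "bbb": {"isa": "jit", "state": {}, "children": ["ddd"]},
    "ccc": {"isa": "jit", "state": {}, "children": ["ddd"]},
    "ddd": {"isa": "jit", "state": {}, "children": []},
}

fetch_count = 0


def fetch(doc_id):
    """Stand-in for one request for a document by id; counts round trips."""
    global fetch_count
    fetch_count += 1
    return DOCS[doc_id]


def load_tree(doc_id, seen=None):
    """Collect a jit and all of its descendants, one fetch per distinct jit."""
    if seen is None:
        seen = {}
    if doc_id in seen:
        # A jit can have several parents, so skip ones already loaded.
        return seen
    doc = fetch(doc_id)
    seen[doc_id] = doc
    for child_id in doc["children"]:
        load_tree(child_id, seen)
    return seen


tree = load_tree("aaa")
print(sorted(tree), fetch_count)
```

The children of a jit are only known once that jit has been fetched, so 
the client has to go back to the database again and again. Batching the 
ids per level would cut this to one request per level, but the number 
of round trips still grows with the depth of the structure.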

In the XML database, instead of the structure above, I can place all the 
jits into one document for the site (because I can modify individual 
parts of it without worrying about conflicts in other parts of the 
document) and make use of XQuery to do iteration and recursion and 
return whatever I need.
<site id="...">
<jits>
<jit id="aaa"><state/><children><jit>bbb</jit><jit>ccc</jit></children></jit>
<jit id="bbb"><state/><children><jit>ddd</jit></children></jit>
<jit id="ccc"><state/><children><jit>ddd</jit></children></jit>
<jit id="ddd"><state/><children/></jit>
</jits>
</site>

So fetching a jit by document id in Couch is equivalent to 
`doc(...)/site/jits/jit[@id="..."]` in XQuery. But because of FLWOR and 
the other parts of XQuery I should theoretically be able to deal with 
the recursive nature of our data on the database side.

~Daniel Friesen (Dantman, Nadir-Seen-Fire)

Dean Landolt wrote:
> On Wed, Apr 8, 2009 at 10:18 PM, Wout Mertens <wout.mertens@gmail.com> wrote:
>
>   
>> Hi Daniel,
>>
>> Interesting, what hierarchy can you express in Sedna (XML) that you
>> can't express in CouchDB (JSON)?
>>     
>
>
> Let me start by saying I think json has already won the utility vs.
> usability tradeoff war. That said, there is nothing you can express in json
> that you can't express with xml+xsd (unwieldy as this may be). But there are
> things baked into xml that aren't easy (or even possible) with json --
> namespaces come to mind. Things like XPath and XQuery can be bridged (or
> bested) with a few client libraries (e.g. JSONPath and JSONQuery). But json
> will always live in a flat namespace -- and that's part of the charm.
>
> And let's not forget that (perhaps due largely to momentum) there's a metric
> shitton of good structured data locked up in xml, xhtml, even tidyable html.
> So to answer your question...*the web*. That's tough to express purely in
> json -- but luckily couch is flexible enough to allow external indexing and
> querying against all this tagsoup crap using something like sedna, monetdb,
> etc. And will no doubt be even more so in the future. I know you were
> kidding...just wanted to get across the
> *we-can't-discount-xml-and-the-others* point.
>
>   

