couchdb-dev mailing list archives

From Robert Dionne <dio...@dionne-associates.com>
Subject Re: couch_gen_btree: pluggable storage / tree engines
Date Fri, 06 Feb 2009 12:31:07 GMT

Robert Dionne
Chief Programmer
dionne@dionne-associates.com
203.231.9961



On Feb 1, 2009, at 8:52 AM, Martin Scholl wrote:

> Hello Robert,
>
>
> Robert Dionne wrote:
>> Martin,
>>
>>   I'm very keen on relationships between documents. Coming from the
>> description logic community, I'd like to allow users to declare certain
>> fields that relate documents and then compute transitive closures over
>> dags whose nodes are documents and whose arcs the fields of interest.
>> This goes against the grain of couchdb as collections of unrelated
>> documents, I know, but it's what I want to do as couchdb's schema-less
>> design offers many advantages over relational databases. Relational
>> databases aren't that great for storing graphs either.
> I like the idea (reminds me of an RDF DB btw), especially when used
> together with views.

It does, though I'm convinced the OWL/RDF community is laboring under
a delusion that the semantic web can be enabled if we just get enough
ontologies out there and can federate them. Even RDF is overkill for
most applications.


>
>>
>>   I don't need to run full classification algorithms in the document
>> store, but would like to just maintain relationships (user-defined) and
>> transitive closures of them. Inferencing would perhaps be better done
>> externally similar to the hypercouch work. So this would best be served
>> by pluggable indexing and maybe pluggable storage, though I think I
>> could live without the latter for now.
> With Antony's latest hints (thank you Antony!) in mind, I think I will
> implement first sketches in an external way first. FTI is implemented in
> the same way afair.
>
>>
>>   So I'm very excited about your ideas. I too have been reviewing the
>> code with this in mind and I would agree with others that it's perhaps a
>> post 1.0 task. From the little time I've spent chasing down a couple of
>> bugs I've seen there are a few subtle aspects to it. I've also noticed
>> that the style of design in this community is more bottom-up, which is
>> how it should be when building something new, so prototypes are perhaps
>> better for fleshing out ideas. Anyway I'm very happy to help and
>> collaborate on this as I can.
> Great! I will just publish my results on github. I hope, others will
> join then.
>
> What worries me most is that I am still unsure how to differentiate between
> design docs and indexing schemes, and when to use which infrastructure.
> Applied to the doc-relationship example you gave: how should
> "intermediate results" of the dag processing be treated? As documents?
> Should they be put into view functions? Should views be able to hint
> which indexing scheme is to be used? Depending on the index type,
> indexing and doc / view-processing can become inherently coupled and
> complex. Is this still CouchDB then?

Great question. I'd say no: it runs entirely against the grain of what
CouchDB is. Documents aren't supposed to be related to one another.
But relational databases don't handle this kind of thing either, so I
figure why not CouchDB, as it offers other features that solve lots of
problems. Here's a typical use case (quoted words are documents;
those with asterisks are fields between documents):

"heart disease"  is *located_in* the "heart"
"myopathy" is *located_in* the "myocardium"
"myocardium" is *part_of* the "heart"

A reasoner might allow one to compose two relations, e.g.
*located_in* composed with *part_of* is equal to *located_in*, and
thus conclude that myopathy is a disease of the heart.
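
To make the composition concrete, here is a minimal Python sketch of closing a set of links under that one rule. It is only an illustration under stated assumptions: the triple representation, the rule table, and the names COMPOSITION_RULES, LINKS and close_under_composition are hypothetical and have nothing to do with CouchDB's actual code or API.

```python
from collections import defaultdict

# Hypothetical rule table: (r1, r2) -> r3 means that
# "x r1 y" together with "y r2 z" implies "x r3 z".
COMPOSITION_RULES = {
    ("located_in", "part_of"): "located_in",
}

# The three links from the example, as (document, field, document) triples.
LINKS = {
    ("heart disease", "located_in", "heart"),
    ("myopathy", "located_in", "myocardium"),
    ("myocardium", "part_of", "heart"),
}


def close_under_composition(triples, rules=COMPOSITION_RULES):
    """Return all triples derivable from `triples` using `rules`."""
    facts = set(triples)
    frontier = set(facts)  # facts not yet combined with the others
    while frontier:
        # Index the known facts by subject and by object for lookups.
        by_subject = defaultdict(set)
        by_object = defaultdict(set)
        for s, r, o in facts:
            by_subject[s].add((r, o))
            by_object[o].add((s, r))

        derived = set()
        for s, r1, o in frontier:
            # Frontier fact as the left operand: (s r1 o) then (o r2 z).
            for r2, z in by_subject[o]:
                r3 = rules.get((r1, r2))
                if r3 is not None:
                    derived.add((s, r3, z))
            # Frontier fact as the right operand: (x r0 s) then (s r1 o).
            for x, r0 in by_object[s]:
                r3 = rules.get((r0, r1))
                if r3 is not None:
                    derived.add((x, r3, o))

        # Only facts that are genuinely new need another round.
        frontier = derived - facts
        facts |= frontier
    return facts


closed = close_under_composition(LINKS)
for triple in sorted(closed - LINKS):
    print(triple)
```

Because each round only combines the newly derived links with what is already known, the same loop could in principle be seeded with just the links of a changed document, which is the incremental flavour I have in mind.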

So these transitive closures of links between documents would need to
be incrementally computed and treated the same as views. I think this
would be best implemented with plugins in the same VM? This kind of
processing seems to require a tighter coupling than something like
full text indexing.

regards,

Bob



>
>
> Martin

