Mailing-List: contact couchdb-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: couchdb-dev@incubator.apache.org
Received-SPF: pass (athena.apache.org: domain of paul.joseph.davis@gmail.com
 designates 74.125.92.147 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=message-id:date:from:to:subject:in-reply-to:mime-version
         :content-type:content-transfer-encoding:content-disposition
         :references;
        b=ACAwIM34YooGqkKUqWaG2zVlDUkMoYbQzzEly2sYOsyrmZF1VwZidg494u9Ojc2dDu
         gPBgUnJZg4eGyv0k0sfHCLdSQxxQ+YoFEYSXk09whF8TI9gS7QedufNC7oUfZ8dtmDAk
         3hJRFneJfaaDPCO+zO7cRrlPrMMXdkd5jwaeE=
Message-ID: <e2111bbb0811041639o291bae40hbd41c6176bf27fe1@mail.gmail.com>
Date: Tue, 4 Nov 2008 19:39:08 -0500
From: "Paul Davis" <paul.joseph.davis@gmail.com>
To: couchdb-dev@incubator.apache.org
Subject: Re: Architectural advice requested, and some 1.0 suggestions.
In-Reply-To: <4C915D97-E639-4FEC-BCB4-F8E6B42FE7AD@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <4C915D97-E639-4FEC-BCB4-F8E6B42FE7AD@gmail.com>

On Tue, Nov 4, 2008 at 6:47 PM, Antony Blakey <antony.blakey@gmail.com> wrote:
> Hi, [ I'm posting to both -user and -dev because IMO it's relevant to both ]
>
> I'm in the middle of a trial port of a system from postgresql to couchdb.
> The system has a distribution/replication requirement due primarily to
> unreliable connectivity and limited end-to-end bandwidth (and high latency)
> over a geographically dispersed set of users.
>
> I have two requirements that don't fit well with the basic couchdb model.
> I've developed solutions for which I'd appreciate some feedback. Also it's a
> request for the notification mechanism to be improved, something like
> _external to moved to the core, and/or an in-process erlang plugin model
> that could replace notification/_external-based techniques.
>
> My solution is a generalization of the GeoCouch technique:
> http://vmx.cx/cgi-bin/blog/index.cgi/geocouch-geospatial-queries-with-couchdb,
> and it allows me to mix SQL and couchdb access, all with the couchdb model
> and without breaking replication.
>
> Requirement 1: I have a User/Role/Permission model for security. Users have
> roles and roles have permissions, and the primary (and very common)
> operation is to check for a transitive (user, permission) pair, which
> involves a join. It would be wrong to invert the user-role relationship and
> put both a list of users and a list of permissions into the role document
> because it is the user that joins roles and the maintenance of the user-role
> relationship is done when editing the user.
>
> Requirement 2: I have a metadata search mechanism that allows the user to
> construct arbitrary queries (in the problem domain, not SQL). The
> transformation of the model from relational to document-based has eliminated
> any joins, but the problem isn't amenable to the oft-described key-based
> virtual joins, and I need to avoid retrieving potentially large sets of
> results and doing set operations in the client because it doesn't scale.
>
> My solution is to use the notification mechanism to maintain (multiple)
> SQLite databases containing the document keys I need to search over. In the
> URP example, I store (db, src, dest) records. I also store the last seqno,
> so that I can do incremental updates to SQLite.
>
> Then, using _external, I can make queries of SQLite to get me either (user,
> permission) pairs using a self-join, or in the case of the arbitrary
> metadata queries, a list of document ids.
>
> The primary difficulties I have with this simple model are:
>
> 1) The notification mechanism doesn't give me enough information. Currently
> I have to do a _all_docs_by_seq and check for deletions by attempting to get
> each document, which I have to do for every document in any case (unless I
> use transparent id's) to determine if I'm interested in it, and then get the
> data. I presume this technique works because deletion is actually a form of
> update until compaction (I copied it from GeoCouch).
>
> ** SUGGESTION ** I would prefer an update interface that gave me a stream of
> (old, new) document pairs, which covers add/update/delete, plus a (from, to)
> seqno pair. Have a 'from' seqno lets me know when I have to trigger a full
> rescan, which I need to do in a variety of circumstances such as
> configuration change.
>
> 2) The process is not as performant as it could be because of the JSON
> marshalling.
>
> ** SUGGESTION ** I would prefer to write the notification process in erlang,
> and be passed the internal representation of the pre/post documents to avoid
> marshalling. Same for map/reduce. I know I can modify couchdb explicitly to
> do this, but that commits me to git tracking/merging etc. A in-process
> plugin option would be better.
>
> 3) Allowing the notification and _external handlers to communicate is more
> work than it should be. I want to do this so that the _external handler can
> cache the queries and invalidate the cache in response to updates.
>
> As an alternative, for the URP data, I could use memcached and let the
> notification process invalidate the cache, which would mean that the
> _external handler doesn't have to cache at all because the client would only
> hit it on a memcached miss. The ultimate extension of this would be to have
> no _external handler, and have the notification process not use SQLite at
> all and instead let the client maintain the data in memcached using normal
> couchdb requests to calculate the transitive closure, which would be
> invalidated by the notification process. Unless the client passivates this
> data to disk it would mean recreating it (on-demand, incrementally) each
> time the client process starts. This is a difficult problem with multiple
> clients (from a development coordination perspective), so maybe an _external
> process is still required, even if it doesn't use on-disk passivation.
>
> That solution doesn't work for the generalized metadata queries, but that's
> not a problem.
>
> ** SUGGESTION ** I think the _external mechanism (or something better)
> should be integrated as a matter of priority. IMO this is a key mechanism
> for augmenting the map/reduce model to satisfy more use-cases.
>
> ** SUGGESTION ** A plugin mechanism should allow a plugin to both listen to
> notifications and respond to e.g. _external.
>
> Antony Blakey
> --------------------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Human beings, who are almost unique in having the ability to learn from the
> experience of others, are also remarkable for their apparent disinclination
> to do so.
>  -- Douglas Adams
>
>
>

This is gonna be quick as I'm getting to go watching the election
coverage, so lets get started:

1) Pretty sure when you iterate over _all_docs_by_seq, each document
should have a _deleted=true member if it's been deleted, no reason to
check the database if it was deleted. That's not to say that there may
be extra information that could be passed to the update processes. (On
second read through I'm starting to doubt this now. It could be that
you end up getting this from the _all_docs. My general thing is read
500 or so rows from _all_docs_by_seq and then do a multi-get on those
500 docs etc. so the deleted info might come from _all_docs).

2) Currently trunk does have an easy method for registering external
erlang code easily via the different daemons and httpd handlers. Two
things though, the startup script needs to have  a couple extra
parameters for telling erlang to load other directories than its own
lib directory. (Or maybe possible config directives? Not sure on
erlang's runtime directory path support mechanism) Also, there are
parts of the api that will probably need to be extended for you to get
all the bells and whistles you're wanting. (ie, an erlang
udpate_notification handler or some such)

3) This would be alleviated I think if we flesh out the process for
adding plugins etc. Your updater/query code would all be in one module
and have access to everything required.

Those are my thoughts for now. I'll think more on it later.

HTH,
Paul