couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: multiview on github
Date Tue, 21 Sep 2010 16:19:31 GMT
> 1) How do you get a row count with a view for a startkey and endkey
> that would solve one of my problems?

Looks like we don't have an API for it yet, but the basic idea is that
you run a reduce with the given query parameters to get this info. In
all views there's a built-in reduce function that does row counting,
so it's just a matter of exposing an API to query this. There used to be
an example in couch_db.erl that did this with just a startkey for
enum_docs_since but it appears to have changed to be more complicated
for _changes.
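
To make the counting idea concrete, here is a minimal sketch in Python rather than Erlang. It is not CouchDB's code: the CountedIndex class, its chunk layout, and the count_range name are all made up for illustration. It only shows why a stored count reduction makes a startkey/endkey row count cheap: chunks that lie fully inside the range contribute their stored count, and only the chunks at the edges are inspected.

```python
import bisect


class CountedIndex:
    """Hypothetical stand-in for a view index: sorted keys split into
    chunks, each with a stored row count (the "reduction" value)."""

    def __init__(self, keys, chunk_size=4):
        ordered = sorted(keys)
        self.chunks = [ordered[i:i + chunk_size]
                       for i in range(0, len(ordered), chunk_size)]
        # One stored count per chunk, computed once at build time.
        self.counts = [len(chunk) for chunk in self.chunks]

    def count_range(self, startkey, endkey):
        """Count rows with startkey <= key <= endkey."""
        total = 0
        for chunk, count in zip(self.chunks, self.counts):
            if chunk[-1] < startkey or chunk[0] > endkey:
                continue  # chunk lies entirely outside the range
            if startkey <= chunk[0] and chunk[-1] <= endkey:
                total += count  # fully covered: reuse the stored count
            else:
                # Partially covered: count only the keys inside the range.
                lo = bisect.bisect_left(chunk, startkey)
                hi = bisect.bisect_right(chunk, endkey)
                total += hi - lo
        return total


index = CountedIndex(["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])
print(index.count_range("c", "h"))
```

A real view index is a tree, so the same trick applies at every level instead of a single flat list of chunks.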

> 2) How do you test for document id inclusion in the results of a view?

How do you mean? I'm proposing the bloom filter method which is just a
constant-space set data-structure that can be used to test for
existence of a key. The first draft implementation would just stream
each query's results to build a bloom filter for that query.
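
As a rough sketch of what I mean, in Python rather than Erlang: the BloomFilter class, the build_filter helper, and the bit and hash counts below are all illustrative assumptions, not anything that exists in multiview or CouchDB. The filter uses a fixed number of bits no matter how many ids are streamed in. A lookup can give a false positive but never a false negative.

```python
import hashlib


class BloomFilter:
    """Hypothetical fixed-size bloom filter over string keys."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly present", False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def build_filter(doc_ids, num_bits=1024, num_hashes=4):
    """Consume a stream of document ids and return the populated filter."""
    bloom = BloomFilter(num_bits, num_hashes)
    for doc_id in doc_ids:
        bloom.add(doc_id)
    return bloom


# The iterator stands in for ids streamed out of one view query.
bloom = build_filter(iter(["doc1", "doc2", "doc3"]))
print("doc2" in bloom)
```

Sizing the filter (bits and hash count) against the expected number of rows is what controls the false positive rate, so a real version would need to pick those from the query rather than hard-code them.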


> <ncb>
> fti and spatial code is only called if the query asks for it, I will
> look into this.

I'm not sure how best to handle this; I just know that I really
don't like seeing spatial/fti specific code in trunk when the spatial
and fti code is not.

> <ncb>
> ok, it is really unclear in couchdb when to use supervisor,
> gen_servers, I wrote multiview as a gen_server since I thought it
> similar to an EJB and encapsulated unit of work that I wanted to
> delegate tasks to and not hog the HTTP process.
>
> Saying that if couch_query_rings use gen_server delegates as you
> recommend below then that will achieve that goal.

It's a bit complicated and in the end comes down to just having the
experience. Though it's important to remember that Erlang processes are
extremely lightweight. Doing operations directly in the HTTP request
processes is fine because each request has its own process (well,
keep-alive requests re-use the process, but that's orthogonal).

Whether or not the ring uses a gen_server the idea was just to
abstract the different query nodes in the ring as a Pid which should
make the code cleaner and easier to understand as well as allow for
the other query types to be added in dynamically.

> <ncb>
> plugins would be good, but honestly it isn't hard to change local.ini,
> With the multiview I would rather see focus on external
> http_db_handlers such as FTI and getting them streaming the results
> rather than having to write a complete result on one stdio line.
>
> I would like this is trunk mainly because I want to hack on trunk and
> to do that I need to be a committer :-) Plugins work fine.

When I say plugins, I'm generally just referring to formalizing how
external code should integrate with CouchDB. I.e., making use of
default.d instead of editing default.ini or local.ini directly.

As to updating the external API, there was some talk at CouchCamp on
changing the current system to allow a bit more flexibility to this by
giving CouchDB a reverse proxy system for externals instead of using a
stdio protocol. If we did that, then multiview could just define a
simple api that various external indexers could choose to support. And
the same would work for internal indexers as well.

Becoming a committer is as easy as writing enough accepted patches
that everyone gets tired of applying them for you. We're always
looking for more help.

HTH,
Paul Davis
