couchdb-user mailing list archives

From Simon Metson
Subject Re: Efficient couchdb queries with explicitly specified keyset of >1000 IDs?
Date Wed, 29 Sep 2010 14:46:19 GMT
	I think you could use the "fetch related data" feature added in 0.11  

  for a nice summary). You could have a doc defining your groups and  
then emit its member IDs from a view and use include_docs to pull in the
documents in that group. I've not used that feature for thousands of
dependent docs, but I suspect it should work and not be too inefficient,
depending on your document structure. If you want to narrow down by other
keys, they'd need to be in the group document (I think; I don't think you
can further filter on the included docs).
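
A minimal sketch of that idea, assuming a hypothetical group document with a "members" array of document IDs and a design doc named "groups". The document layout, the names and the server URL are illustrative assumptions, not something from this thread. The map function is JavaScript kept as a string; the Python only assembles the design doc and the query URL.

```python
import json
from urllib.parse import urlencode, quote

# Hypothetical group document: one doc per group, listing member IDs.
group_doc = {
    "_id": "group-a",
    "type": "group",
    "members": ["doc-1", "doc-2", "doc-3"],
}

# Map function (JavaScript). Emitting a value of the form {"_id": ...}
# makes include_docs=true return the referenced document rather than
# the group document that produced the row.
MAP_MEMBERS = """
function (doc) {
  if (doc.type === 'group' && doc.members) {
    doc.members.forEach(function (id) {
      emit(doc._id, {_id: id});
    });
  }
}
"""

# Design document to store in the database (assumed name: _design/groups).
design_doc = {
    "_id": "_design/groups",
    "views": {"members": {"map": MAP_MEMBERS}},
}


def members_url(base, db, group_id):
    """Build the view URL that pulls in every document of one group."""
    params = urlencode({
        "key": json.dumps(group_id),  # view keys are JSON-encoded
        "include_docs": "true",
    })
    return f"{base}/{quote(db)}/_design/groups/_view/members?{params}"
```

For example, members_url("http://localhost:5984", "mydb", "group-a") gives the URL to GET for group "group-a"; each returned row should then carry the linked member document in its "doc" field.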

The alternative is to store, in each document, the list of groups it
belongs to (which you don't want to do) and build a view indexed by
group.
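
Roughly, that could look like the sketch below. It assumes each document has a "groups" array plus one extra field to narrow by, here called "category"; both field names are made up for illustration. The Python function mirrors the JavaScript map so the key layout can be checked without a server.

```python
import json
from urllib.parse import urlencode

# JavaScript map function for the design doc (illustrative field names).
MAP_BY_GROUP = """
function (doc) {
  (doc.groups || []).forEach(function (g) {
    emit([g, doc.category], null);
  });
}
"""


def map_by_group(doc):
    """Python mirror of MAP_BY_GROUP: one [group, category] key per group."""
    return [([g, doc.get("category")], None) for g in doc.get("groups", [])]


def group_range(group, category=None):
    """Query parameters selecting one group, optionally one category."""
    if category is None:
        # In CouchDB's view collation an object sorts after strings,
        # numbers and arrays, so [group, {}] closes the range of all
        # keys whose first element is this group.
        start, end = [group], [group, {}]
    else:
        start = end = [group, category]
    return urlencode({
        "startkey": json.dumps(start),
        "endkey": json.dumps(end),
    })
```

Putting the group first in the key is what lets a single range query cover one group and, if wanted, narrow further by the second key element.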

If each document is in only one group, and all queries are only of
interest within a single group (e.g. you don't need views over the
whole dataset, or those views are simple), you could have a database
per group.
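
A small sketch of the routing this implies, assuming group labels are turned into database names with a "group_" prefix (the prefix, the helper names and the example URL are assumptions). CouchDB database names must start with a lowercase letter and may only contain lowercase letters, digits and the characters _ $ ( ) + - /, so anything else is replaced here.

```python
import re
from urllib.parse import quote

# Characters that are NOT allowed in a CouchDB database name.
_INVALID = re.compile(r"[^a-z0-9_$()+/-]")


def db_name_for_group(group, prefix="group_"):
    """Map a group label to a legal CouchDB database name."""
    return prefix + _INVALID.sub("_", group.lower())


def view_url(base, group, ddoc, view):
    """URL of a view inside the database that belongs to one group."""
    db = quote(db_name_for_group(group), safe="")
    return f"{base}/{db}/_design/{ddoc}/_view/{view}"
```

Each group database would need its own copy of the design doc, and two labels that differ only in replaced characters would collide, so a real setup would want a stricter naming scheme.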

On 29 Sep 2010, at 15:27, Heiko Schaefer wrote:

> Hello Couchdb-User-List,
> I've been building a little family of REST services with couchdb over the
> past months and have made a lot of progress. It's a great experience to get
> to know and use couchdb. Right now I'm trying to figure out how to
> implement a new requirement - and I feel a little lost.
> Here's what I am considering, and I would like to know if it sounds at all
> feasible:
> I expect to have on the order of 100,000 documents in my couchdb, very
> soon. But an (independent and external) service may define groups of
> documents (a few thousand documents being one group). I'd like to be
> able to do searches (query couchdb views) on just the subset of documents
> that are in one group.
> I'm not very keen to mark my documents themselves as members of a group,
> if it can be at all avoided. So I thought maybe I can make queries that
> explicitly list the relevant few thousand couch IDs.
> Example use-case:
> There are 100,000 documents in the couchdb. An externally supplied list
> specifies a subset of 5,000 IDs.
> I'd like to efficiently query a view, but only considering the documents
> within this subset of 5,000 IDs. Then I would like to further narrow
> down these results by other keys.
> [The IDs would probably not map 1:1 onto the couchdb "_id", but rather
> onto another field (or two) of the documents]
> Would that be as mad (and inefficient) as it sounds? Is there any other
> way to achieve my objectives with couchdb that I might be missing?
> Sorry if my question is not worded properly or clearly - I don't feel
> fluent in the couchdb terminology and way of thinking yet.
> Thanks in advance for any help and pointers.
> cheers,
> :) Heiko
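
For reference, the explicit key list described in the quoted message can be sent by POSTing a JSON body of the form {"keys": [...]} to a view. The sketch below only prepares such requests; the view name, the database name and the batch size of 1000 are arbitrary assumptions, and the view is assumed to be keyed by the external ID field.

```python
import json


def chunked(ids, size):
    """Split a list of IDs into consecutive batches of at most `size`."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def keys_requests(base, db, ddoc, view, ids, size=1000):
    """Yield (url, body) pairs for POSTing key lists to a view."""
    url = f"{base}/{db}/_design/{ddoc}/_view/{view}"
    for chunk in chunked(list(ids), size):
        # The body must be sent with Content-Type: application/json.
        yield url, json.dumps({"keys": chunk})
```

Each pair could then be sent with urllib.request.Request(url, data=body.encode(), headers={"Content-Type": "application/json"}). Narrowing the results by other keys would still have to happen on the client or through a differently keyed view.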
