incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: getting unique set of document id's
Date Sun, 05 Jul 2009 22:44:04 GMT
On Sun, Jul 5, 2009 at 4:18 PM, Ross Bates<rbates@gmail.com> wrote:
> Hi Paul - thank you for the pointers. Something I'm unclear on though...
> using a sum in the reduce returns something like this for all the tags:
>
> foo, 3
> bar, 5
> baz, 7
>
> When I use the multi-key fetch against the view it doesn't return specific
> docid's for each tag, just a subset of tags and their counts.
>
> POST {"keys": ["foo", "bar"]}
>
> foo, 3
> bar, 5
>
> How can I get access to the list of docid's which make up the total?
>

The idea is that now that you have the count for each tag, you know what
order to do the client-side join in. I.e., in this case you start by
retrieving the docids that have the tag 'foo', then intersect them with
the docids for 'bar'. This limits the candidate set to a maximum of 3
docids, because that's the smallest count.
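
A minimal sketch of that client-side join, assuming the tag counts and
the per-tag docid lists have already been fetched from the two views.
The name intersectByTags and the shape of both arguments are made up
for illustration; they are not part of CouchDB:

```javascript
// Hypothetical helper, not a CouchDB API.
// counts:   { tag: count }        e.g. rows from the count view (reduce = sum)
// idsByTag: { tag: [docid, ...] } e.g. rows from the map-only tag view
function intersectByTags(counts, idsByTag) {
  // Visit tags in order of ascending count so the candidate set
  // starts out as small as possible.
  var tags = Object.keys(counts).sort(function (a, b) {
    return counts[a] - counts[b];
  });
  if (tags.length === 0) {
    return [];
  }

  var candidates = idsByTag[tags[0]] || [];
  // Stop early once no candidates are left.
  for (var i = 1; i < tags.length && candidates.length > 0; i++) {
    var ids = new Set(idsByTag[tags[i]] || []);
    candidates = candidates.filter(function (id) {
      return ids.has(id);
    });
  }
  return candidates;
}
```

With counts of foo = 3 and bar = 5, the loop starts from the three
'foo' docids and only keeps those that also appear under 'bar'.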

HTH,
Paul Davis

>
>
> On Sat, Jul 4, 2009 at 3:52 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>
>> On Sat, Jul 4, 2009 at 2:48 PM, Ross Bates<rbates@gmail.com> wrote:
>> > Hi All - finally got up and running with 0.9.0 and have been experimenting
>> > with the POST {"keys": ["key1", "key2", ...]} feature and have a question.
>> >
>> > Take the typical example of a set of blog posts which can be tagged with
>> > 1...n tags. Pretend I want to find all posts tagged with both "foo" and
>> > "bar".
>> >
>> > A simple map would look something like this
>> >
>> > function(doc) {
>> >   for (var i in doc.tags) {
>> >     emit(doc.tags[i], doc._id);
>> >   }
>> > }
>> >
>> > So now when I post {"keys": ["foo", "bar"]} to the view I get all the
>> > documents tagged "foo", "bar", and also duplicates for any document
>> > that is tagged with both.
>> >
>> > Is the best option to deduplicate the doc._id on the client and
>> > resubmit to all docs, or can this be handled in a reduce function?
>> >
>> > Thank you for any help!
>> >
>> > Ross
>> >
>> > -----
>> > @rossbates
>> > rossbates.com
>> >
>>
>> Assuming the set of tags you want to query isn't known until query
>> time, you need to do the intersection on the client side. One way to
>> make this faster is to create a view that lists the tag counts, using
>> a map of doc.tags.forEach(function(tag) {emit(tag, 1);}); and a reduce
>> of "return sum(values)". Query this view with the multi-key fetch to
>> get the count for each tag, then fetch the docids per tag in order of
>> ascending counts.
>>
>> Paul Davis
>>
>
