incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: multiple key word count query problem
Date Mon, 20 Jul 2009 04:35:13 GMT
On Sun, Jul 19, 2009 at 11:06 PM, Tommy Chheng<tommy.chheng@gmail.com> wrote:
> so for keys with two or more parameters, only the first parameter can be
> used for range selection? the 2nd and remaining keys can only be used for
> grouping/sorting?
>

There are no parameters here. There's only one key, and it happens to
be an array. Array sorting works the same way it does in any other
situation: elements are compared position by position, and if the first
elements aren't equal there's no reason to consider the second position
of the arrays. Second elements aren't treated any differently than the
first or eighth.
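
To make the sorting point concrete, here is a rough sketch of how array
keys compare. It is a simplified stand-in, not CouchDB's actual
collation code: typeRank, collate and inRange are names made up for
illustration, only null, strings, arrays and objects are handled, and
strings are compared by code unit where CouchDB uses ICU.

```javascript
// Simplified stand-in for view key collation (assumption: keys contain
// only null, strings, arrays and objects; real CouchDB also orders
// booleans and numbers, and collates strings with ICU).
function typeRank(v) {
  if (v === null) return 0;
  if (typeof v === 'string') return 1;
  if (Array.isArray(v)) return 2;
  return 3; // objects, e.g. {} used as a "high" sentinel
}

// Returns a negative number, zero, or a positive number, like a sort comparator.
function collate(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra < rb ? -1 : 1;
  if (ra === 1) return a < b ? -1 : (a > b ? 1 : 0);
  if (ra === 2) {
    var n = Math.min(a.length, b.length);
    for (var i = 0; i < n; i++) {
      var c = collate(a[i], b[i]);
      if (c !== 0) return c; // the first differing element decides
    }
    return a.length - b.length; // a shorter prefix sorts first
  }
  return 0; // null vs null; objects are not compared further in this sketch
}

// True when key lies in the inclusive slice [startkey, endkey].
function inRange(key, startkey, endkey) {
  return collate(startkey, key) <= 0 && collate(key, endkey) <= 0;
}
```

Under this ordering, every key whose first element is a string lies
between [null,"0808605"] and [{},"0808605"], because null sorts before
any string and {} sorts after any string. The second element is never
reached, which is why that range returns all rows. key=[null,"0808605"]
asks for an exact match, and no row has null as its first element.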

> the problem with having two views:
> If i had two views, one for [word, doc] => count and [doc, word] => count;
> it would be re-doing the same word counting function twice.
>

Indeed. But as Nitin notes, we do this to ensure that views can be
calculated incrementally, among other things.
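
For reference, the second view could look something like the sketch
below. It mirrors your map function with the key order swapped; the
emit stub and the rows array are only there so the snippet runs outside
CouchDB, and the view name used afterwards is hypothetical.

```javascript
// Stub standing in for CouchDB's emit(), so this runs outside CouchDB.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function for the second view: same word split as the original,
// but the document id comes first in the key.
function map(doc) {
  if (doc['couchrest-type'] == 'NsfGrant') {
    var words = doc['abstract'].split(/\W+/);
    words.forEach(function (word) {
      if (word.length > 1) emit([doc['_id'], word], 1);
    });
  }
}
```

The reduce function stays the same as in your existing view. If the
view were saved as, say, by_doc_word_count, then
startkey=["0808605"]&endkey=["0808605",{}]&group_level=2 would select
the words for that one document.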

> I'm gonna try to compute the docs word counts and store the results in
> database itself.

This may save you some computation, but I'd be greatly surprised if
having two separate views caused you issues. It may be slower when bulk
loading data, but under a normal production load the extra computation
most likely isn't going to affect you.


Paul Davis

>
> thanks,
> tommy
>
> On Jul 19, 2009, at 7:16 PM, Paul Davis wrote:
>
>> On Sun, Jul 19, 2009 at 9:14 PM, Tommy Chheng<tommy.chheng@gmail.com>
>> wrote:
>>>
>>> I have a simple word count view defined as:
>>> --------
>>> function(doc) {
>>>  if(doc['couchrest-type'] == 'NsfGrant'){
>>>   var words = doc['abstract'].split(/\W+/);
>>>   words.forEach(function(word){
>>>     if (word.length > 1) emit([word, doc['_id']],1);
>>>   });
>>>  }
>>> }
>>>
>>> function(keys, values, rereduce) {
>>>  return sum(values);
>>> }
>>> --------
>>> where the key's first parameter is the word and the 2nd parameter is the
>>> document_id.
>>>
>>> so i can do a query like this to get all the documents with the word
>>> "the"
>>> correctly.
>>>
>>> http://localhost:5984/nsf_grants/_design/NsfGrant/_view/by_word_doc_count?startkey=["the"]&endkey=["the",{}]&group_level=2
>>>
>>> I'm having trouble doing queries on the 2nd parameter, how can i find all
>>> the words in a particular document?
>>> I tried
>>>
>>> http://localhost:5984/nsf_grants/_design/NsfGrant/_view/by_word_doc_count?key=[null,"0808605"]&group_level=2
>>> which gives nothing(thinking that null would match all words)
>>> and
>>>
>>> http://localhost:5984/nsf_grants/_design/NsfGrant/_view/by_word_doc_count?startkey=[null,"0808605"]&endkey=[{},"0808605"]&group_level=2
>>> which gives all results. Why is this?
>>>
>>> Thanks,
>>> Tommy
>>>
>>
>> Querying a view is asking for a slice of a sorted list. Start and end
>> keys delimit the range of rows returned. The solution to your problem
>> is to create a second view so you can query by docid.
>>
>> Paul Davis
>
>
