incubator-couchdb-user mailing list archives

From Alexander Uvarov <alexander.uva...@gmail.com>
Subject Re: Any better solution for my case?
Date Mon, 17 May 2010 03:23:08 GMT

On 17.05.2010, at 6:58, Jarrod Roberson wrote:
> 
> you don't understand my approach: the list function doesn't apply any rules,
> it just merges the duplicate documents. It is exactly the same thing that an
> RDBMS or, say, Lucene would do.

Thanks, now it seems I got it. Sleep deprivation is not cool :(

> 
> your View function would look something like this:
> I don't know what pages_count might need to be queried on so I skipped it,
> exercise for the reader :-) I am sure you could apply the same logic to
> pages_count that I applied to tags.

Not applicable. "pages_count": { "in": [1, 500]} is a range.

emit(['pages_count',doc.pages_count],null);

'{"keys":[["pages_count",1], ["pages_count",2], ["pages_count",3], ... ["pages_count",500]]}'
is unacceptable: that is 500 keys for a single criterion. Unfortunately there is no range
support in a multi-key (keys) query in CouchDB.
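
For a single criterion on its own, a view keyed on ['pages_count', n] can be read as a range with the startkey and endkey view parameters. That does not help with the multi-key POST above, since a keys body cannot be combined with startkey/endkey in the same request. A minimal sketch of building such a range URL, where rangeQueryUrl and the view URL passed to it are hypothetical names, not part of CouchDB:

```javascript
// Hypothetical helper: builds a range query URL for a view whose map function
// emits [field, value] keys, e.g. emit(['pages_count', doc.pages_count], null).
// startkey and endkey are standard CouchDB view parameters; their values must
// be JSON-encoded and then URL-encoded.
function rangeQueryUrl(viewUrl, field, min, max) {
  var params = [
    'startkey=' + encodeURIComponent(JSON.stringify([field, min])),
    'endkey=' + encodeURIComponent(JSON.stringify([field, max]))
  ];
  return viewUrl + '?' + params.join('&');
}
```

A range like this would also cover float values, because view keys are compared numerically rather than enumerated one by one.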

The following is also unacceptable: it turns into millions of keys, and don't forget float values:

for (var i = 0; i < doc.pages_count; i++) {
  emit(...)
}

> 
> function(doc)
> {
>    emit(['type',doc.type],null);
>    emit(['color',doc.color],null);
>    emit(['condition',doc.condition],null);
>    for (var t in doc.tags)
>    {
>        emit(['tag',doc.tags[t]], null);
>    }
> }
> 
> of course you could do doctype checking if you have multiple types of
> documents and checking to see if each property actually exists but I didn't
> want to muddy the water with boilerplate code.
> then you use the generic merge list List function I wrote that reduces all
> the results down to one unique set of ids/documents.
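
The type and property checks mentioned above could be sketched roughly as follows. This is only an illustration: the 'item' type is taken from the example query in this thread, and the emit() stand-in exists only so the function can run outside CouchDB's view server.

```javascript
// Stand-in for the emit() global that CouchDB provides to map functions.
var emitted = [];
function emit(key, value) { emitted.push([key, value]); }

// Guarded variant of the quoted map function.
function map(doc) {
  // Assumption: only documents of type 'item' are searchable.
  if (doc.type !== 'item') return;
  emit(['type', doc.type], null);
  // Emit a key only when the property actually exists on the document.
  if (doc.color !== undefined) emit(['color', doc.color], null);
  if (doc.condition !== undefined) emit(['condition', doc.condition], null);
  // Treat a missing or non-array tags field as "no tags".
  var tags = Array.isArray(doc.tags) ? doc.tags : [];
  for (var i = 0; i < tags.length; i++) {
    emit(['tag', tags[i]], null);
  }
}
```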
> 
> something like this
> 
> curl -X POST -H 'Content-Type: application/json' \
>   'http://localhost:5984/yourdatabase/_design/yourdesigndoc/_list/merge_search/search_criteria?include_docs=true' \
>   -d '{"keys":[["tag","cool"],["tag","awesome"],["color","black"],["type","item"],["condition","mint"]]}'
> 
> the one assumption I make is that I want the docs back on the queries, it
> would be easy enough to change the list function to optionally process the
> docs if they don't exist and only return back a unique set of keys.
> This is a generic way to search without having to have custom views for each
> field you want to search.
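
The merge_search list function itself is not shown in this thread. A minimal sketch of what such a merging list function might look like, assuming it keeps the first row seen for each document id; the getRow() and send() stand-ins below mimic the globals CouchDB provides to list functions so the sketch can run under plain Node.js:

```javascript
// Stand-ins for CouchDB's getRow() and send() list-function globals.
var pendingRows = [];
var output = '';
function getRow() { return pendingRows.shift(); } // undefined when exhausted
function send(chunk) { output += chunk; }

// Hypothetical merging list function, not the original merge_search.
function mergeSearch(head, req) {
  var seen = Object.create(null); // document ids already collected
  var docs = [];
  var row;
  while ((row = getRow())) {
    if (!seen[row.id]) {
      seen[row.id] = true;
      // With include_docs=true each row carries the document; otherwise
      // fall back to returning just the id.
      docs.push(row.doc || { _id: row.id });
    }
  }
  send(JSON.stringify({ rows: docs }));
}
```

A document that matches several of the posted keys appears once in the output, which is the union of all matches rather than their intersection.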

> Since you have a single user per database, you could just create permanent
> views on demand, it would take time to build the indexes based on how big
> each database was and it could bloat the database with unnecessary
> duplication

This is what I am considering as the best solution. The only thing I don't like is JavaScript
code generation for a map function.
The ability to pass extra parameters from the design doc to the map function would be much more elegant.
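
To make the code generation concrete, here is a minimal sketch that handles exact-match criteria only (no ranges). The names buildDesignDoc and matches and the shape of the criteria object are hypothetical; the returned object follows the usual design document layout with a views member holding the map source as a string.

```javascript
// Hypothetical generator: turns exact-match criteria into a design document
// containing a single permanent view.
function buildDesignDoc(name, criteria) {
  // JSON.stringify quotes field names and values safely inside the source.
  var conditions = Object.keys(criteria).map(function (field) {
    return 'doc[' + JSON.stringify(field) + '] === ' + JSON.stringify(criteria[field]);
  });
  // With no criteria at all, every document matches.
  var test = conditions.length ? conditions.join(' && ') : 'true';
  var mapSource = 'function(doc) { if (' + test + ') { emit(doc._id, null); } }';
  return {
    _id: '_design/' + name,
    views: { matches: { map: mapSource } }
  };
}
```

For example, buildDesignDoc('black_mint', { color: 'black', condition: 'mint' }) yields a map function that emits only documents whose color is 'black' and whose condition is 'mint'.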

> , but this generic fashion avoids having all that duplicate data
> where they just have say a single tag difference in the "criteria".

Storage is not a scarce resource here.

> 
> you could even make the index marginally smaller by just using single
> character names for the field names if there were "lots" of documents to
> index: t instead of tags, c1 instead of color, c2 instead of condition.

