incubator-couchdb-user mailing list archives

From Dan Woolley <danwool...@gmail.com>
Subject Re: Multiple search criteria with ranges
Date Tue, 16 Dec 2008 14:19:58 GMT
Thanks for all the good ideas - Couchdb definitely has an active  
community willing to help.  I'm going to try to run some tests over  
the holidays, with this and the array intersection methods on the  
server side, and see if it's worth pursuing further.


Dan Woolley
profile:  http://www.linkedin.com/in/danwoolley
company:  http://woolleyrobertson.com
product:  http://dwellicious.com
blog:  http://tzetzefly.com



On Dec 15, 2008, at 6:25 PM, Paul Davis wrote:

> I thought of a possible way to overcome this that may or may not work
> depending on the underlying data.
>
> General scheme is to basically make a view for each filter that you
> want to be able to combine. Then with a group reduce you could figure
> out the number of records in each range. Then fetch the records for
> the view with the fewest results and then multi get against all other
> filters removing keys if they come back with an error.
>
> More concretely:
>
> Each view would be:
>
> map:
> function(doc) {emit(doc.filter_field, 1);}
>
> reduce:
> function(keys, values) {return sum(values);}
>
> To get the count for a specific range you'd do:
>
> GET http://127.0.0.1:5984/db_name/_view/filters/by_field_x?group=true&startkey=min&endkey=max
>
> And in the client code you would merge each of the results. For things
> with a more continuous range like price, you may need to bucket
> appropriately. This query should be run once for each field.
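The count-merging step above can be sketched in client code roughly as follows. This is only a sketch: it assumes each `group=true` response has the shape `{rows: [{key, value}]}`, and the helper names `countInRange` and `orderByCount`, along with the sample numbers, are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: total the per-key counts in one group=true response.
// Assumes the response looks like {rows: [{key: ..., value: count}, ...]}.
function countInRange(response) {
  return response.rows.reduce((total, row) => total + row.value, 0);
}

// Hypothetical helper: order the filters from fewest to most matches,
// so the client knows which view to fetch first.
function orderByCount(responsesByField) {
  return Object.keys(responsesByField)
    .map((field) => ({ field, count: countInRange(responsesByField[field]) }))
    .sort((a, b) => a.count - b.count);
}

// Made-up sample responses, one per filter view.
const sampleCounts = {
  beds: { rows: [{ key: 4, value: 120 }, { key: 5, value: 30 }] },
  baths: { rows: [{ key: 2, value: 200 }, { key: 3, value: 90 }] },
  price: { rows: [{ key: 350000, value: 7 }, { key: 375000, value: 5 }] },
};

const orderedFilters = orderByCount(sampleCounts);
console.log(orderedFilters.map((entry) => entry.field).join(", "));
```

With the sample numbers, price has the fewest matches, so its view would be the one to fetch first.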
>
> Then say we want to filter on fields X, Y, Z (assuming num_in(X) <
> num_in(Y) < num_in(Z))
>
> DocIds  = GET http://127.0.0.1:5984/db_name/_view/filters/by_field_X?startkey=minX&endkey=maxX
> Then intersect with a call to Y and Z views:
> POST http://127.0.0.1:5984/db_name/_view/filters/by_field_Y?startkey=minY&endkey=maxY
> BODY: {"keys": [DocIds]}
>
> That make sense?
>
> Paul
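One way the intersection step might look on the client is sketched below. A multi-key POST matches on the keys a view emits, so with views keyed on the field value the doc IDs would not match directly; this sketch therefore intersects on the `id` of each row in the client instead. It assumes the rows were fetched without the reduce step (for instance with `reduce=false`, where the server supports it) and have the shape `{rows: [{id, key, value}]}`. The names `docIds` and `intersectIds` and the sample rows are hypothetical.

```javascript
// Collect doc IDs from a view response of assumed shape
// {rows: [{id: ..., key: ..., value: ...}, ...]}.
function docIds(response) {
  return response.rows.map((row) => row.id);
}

// Intersect several ID arrays, starting from the smallest one so the
// candidate list only ever shrinks. Each pass builds a Set for lookups,
// which keeps the work roughly linear in the total number of IDs.
function intersectIds(idLists) {
  if (idLists.length === 0) return [];
  const sorted = [...idLists].sort((a, b) => a.length - b.length);
  let survivors = [...new Set(sorted[0])];
  for (const list of sorted.slice(1)) {
    const lookup = new Set(list);
    survivors = survivors.filter((id) => lookup.has(id));
    if (survivors.length === 0) break; // nothing left to match
  }
  return survivors;
}

// Made-up sample rows for the three range queries.
const priceRows = { rows: [{ id: "a", key: 360000, value: 1 }, { id: "c", key: 399000, value: 1 }] };
const bedsRows = { rows: [{ id: "a", key: 4, value: 1 }, { id: "b", key: 5, value: 1 }, { id: "c", key: 4, value: 1 }] };
const bathsRows = { rows: [{ id: "b", key: 2, value: 1 }, { id: "c", key: 3, value: 1 }, { id: "d", key: 2, value: 1 }] };

const matchingIds = intersectIds([docIds(priceRows), docIds(bedsRows), docIds(bathsRows)]);
console.log(matchingIds);
```

Whether this holds up for arrays of 100K elements would still need to be measured against real data.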
>
>
>
>
>
>
> On Sun, Dec 14, 2008 at 12:06 PM, Dan Woolley <danwoolley@gmail.com> wrote:
>> I'm researching Couchdb for a project dealing with real estate listing
>> data. I'm very interested in Couchdb because the schema less nature,
>> RESTful interface, and potential off-line usage with syncing fit my
>> problem very well.  I've been able to do some prototyping and search on
>> ranges for a single field very successfully.  I'm having trouble
>> wrapping my mind around views for a popular use case in real estate,
>> which is a query like:
>>
>> Price = 350000-400000
>> Beds = 4-5
>> Baths = 2-3
>>
>> Any single range above is trivial, but what is the best model for
>> handling this AND scenario with views?  The only thing I've been able to
>> come up with is three views returning doc id's - which should be very
>> fast - with an array intersection calculation on the client side.
>> Although I haven't tried it yet, that client side calculation worries me
>> with a potential document with 1M records - the client would potentially
>> be dealing with calculating the intersection of multiple 100K element
>> arrays.  Is that a realistic calculation?
>>
>> Please tell me there is a better model for dealing with this type of
>> scenario - or that this use case is not well suited for Couchdb at this
>> time and I should move along.
>>
>>
>> Dan Woolley
>> profile:  http://www.linkedin.com/in/danwoolley
>> company:  http://woolleyrobertson.com
>> product:  http://dwellicious.com
>> blog:  http://tzetzefly.com
>>
>>
>>
>>

