couchdb-user mailing list archives

From Matt Goodall <matt.good...@gmail.com>
Subject Re: Newbie :Filtering using complex key and array
Date Fri, 20 Nov 2009 15:40:08 GMT
2009/11/20 Sebastien PASTOR <sebastien.pastor@gmx.com>:
> Thanks for the quick reply Matt,
>
> It does sound like a trick because it might generate a much bigger list
> than I'll have documents in the db (times the number of delivery areas). But
> I guess as long as the map index is generated only once it will just be
> a matter of more space used to store the index, right?

Correct. CouchDB's views essentially trade disk space for predictable
query times (assuming the view is up to date, of course).

One thing you can do to reduce the size of views, if necessary, is to
emit a null value, i.e. emit(key, null), and add the include_docs=true
query parameter when GET'ing the view. See
http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options.

However, that may make things a little slower and put a bit more load
on CouchDB.  If everything's in the view then CouchDB can jump into
the view at the startkey and simply stream the view until the endkey.
With include_docs=true there's another lookup to get the doc per row
returned by the view.

I've never measured the performance hit of include_docs. I just know I
use it quite a lot and haven't worried too much about it so far ;-).
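
A minimal sketch of the null-value approach, in case it helps. The emit()
stub is only there so the snippet runs outside CouchDB, and the database
name ("restaurants") and design document name ("shops") are made up for
illustration; only the view name comes from this thread.

```javascript
// Stand-in for CouchDB's built-in emit(), so the sketch runs outside CouchDB.
var viewRows = [];
function emit(key, value) {
  viewRows.push({ key: key, value: value });
}

// Map function that emits null instead of doc.name to keep the index small.
function mapWithNullValue(doc) {
  doc.delivery_areas.forEach(function (area) {
    emit([area, doc.type], null);
  });
}

mapWithNullValue({
  name: "Pizza Torino",
  delivery_areas: [75019, 75018],
  type: "italian"
});

// Hypothetical database and design document names. include_docs=true asks
// CouchDB to attach the full document to every row it returns.
var queryUrl = "/restaurants/_design/shops/_view/getShops" +
  "?startkey=" + encodeURIComponent(JSON.stringify([75018, null])) +
  "&endkey=" + encodeURIComponent(JSON.stringify([75018, {}])) +
  "&include_docs=true";
```

Each row then carries only its key in the index, and the name is read from
the attached document rather than from the row's value.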

> I guess this is how things should be
> handled when using CouchDB, not obvious when coming from SQL :)

Yep, it's quite a different way of thinking about your data.

> At least it works exactly as expected. One little thing that bugs me is
> that the reduce function returns :
>
> {"rows":[
> {"key":[75010,"indian"],"value":1},
> {"key":[75010,"italian"],"value":2},
> {"key":[75010,"japanese"],"value":1}
> ]}

As you've probably discovered, you can get three types of count from
this view by playing with the view's group_level query param:

1. The number of restaurants of each type in a delivery area, e.g.
italian restaurants in 75010.
2. The number of restaurants in a delivery area.
3. The total number of restaurants in your database.
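
With [area, type] keys, those three counts correspond to group_level=2,
group_level=1 and group_level=0 respectively. The sketch below only imitates
that grouping locally on the rows quoted above; regroup() is a made-up helper
for illustration, not something CouchDB provides.

```javascript
// Reduced rows as returned at group_level=2 (the output quoted above).
var groupedRows = [
  { key: [75010, "indian"], value: 1 },
  { key: [75010, "italian"], value: 2 },
  { key: [75010, "japanese"], value: 1 }
];

// Local imitation of group_level: truncate each key to `level` elements
// and sum the values that share the truncated key.
function regroup(rows, level) {
  var totals = {};
  var order = [];
  rows.forEach(function (row) {
    var key = level === 0 ? null : row.key.slice(0, level);
    var id = JSON.stringify(key);
    if (!(id in totals)) {
      totals[id] = { key: key, value: 0 };
      order.push(id);
    }
    totals[id].value += row.value;
  });
  return order.map(function (id) { return totals[id]; });
}

var perAreaAndType = regroup(groupedRows, 2); // count 1: the three rows unchanged
var perArea = regroup(groupedRows, 1);        // count 2: one row per delivery area
var grandTotal = regroup(groupedRows, 0);     // count 3: a single row with key null
```

Against the real view you would simply add ?group_level=2, ?group_level=1 or
?group_level=0 to the query instead of regrouping anything yourself.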

>
> If I want to return only the restaurant type and the number of
> restaurants,
> something like:
> {"key":"italian","value":1},
>
> is that something that could/should be done in Lists or in some way in the
> reduce function?

For this you'll need an additional view. The map would probably
emit(doc.type, 1) and the reduce would calculate the sum.
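
A rough sketch of what that extra view could look like. The emit() and sum()
stubs stand in for the functions CouchDB's view server provides, and the
function names and sample documents are invented for illustration.

```javascript
// Stand-ins for CouchDB's built-in emit() and sum(), so the sketch runs on its own.
var typeRows = [];
function emit(key, value) {
  typeRows.push({ key: key, value: value });
}
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Map: one row per restaurant, keyed by its type.
function mapByType(doc) {
  emit(doc.type, 1);
}

// Reduce: add up the 1s. A plain sum also works unchanged when rereducing.
function reduceCount(keys, values, rereduce) {
  return sum(values);
}

// Hypothetical sample documents.
[
  { name: "Pizza Torino", type: "italian" },
  { name: "Trattoria Roma", type: "italian" },
  { name: "Sakura", type: "japanese" }
].forEach(mapByType);

// Local imitation of querying with group=true: reduce each key's rows separately.
var valuesByType = {};
typeRows.forEach(function (row) {
  (valuesByType[row.key] = valuesByType[row.key] || []).push(row.value);
});
var countsByType = Object.keys(valuesByType).sort().map(function (type) {
  return { key: type, value: reduceCount(null, valuesByType[type], false) };
});
```

Querying the real view with ?group=true should give rows of the shape you
asked for, one per restaurant type.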

- Matt

>
> Thanks again !
>
> Sebastien
>
> On Fri, Nov 20, 2009 at 12:15:59PM +0000, Matt Goodall wrote:
>> 2009/11/20 Sebastien PASTOR <sebastien.pastor@gmx.com>:
>> > Sorry for the previous blank mail ... here is the content :
>> >
>> > Hi there,
>> >
>> > Pretty new to CouchDB. I've read a lot about CouchDB and finally dived
>> > into it with a small project :)
>> > I am trying to do a simple thing and I am not sure at all if I am going
>> > the right way :
>> >
>> > my docs look like this :
>> >
>> > {
>> >   "name":"Pizza Torino",
>> >   "delivery_areas":[75019,75018,75012,75013,75010],
>> >   "type":"italian"
>> > }
>> >
>> > I managed to get all shops by type and get a reduce function to
>> > do the sum ( not much i know but still quite an accomplishment for
>> > me :) )
>> > I then tried to get my result filtered by delivery_areas, as in
>> > getting only shops that do delivery in postal code 75019. I just
>> > could not have anything that worked using startkey and endkey ...
>> > is it the way to go or is storing delivery_areas within an array
>> > not right ?
>> >
>> > my last map function looks like this :
>> >   "getShops" : {
>> >     "map" : "function(doc){
>> >       emit([doc.delivery_areas,doc.type],doc.name)
>> >     }
>> >   }
>> >
>> > Thanks for pointing me to the right direction
>>
>> Storing a list of delivery areas is good. The "trick" is to emit
>> multiple rows per document, e.g.
>>
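>>     function(doc) {
>>       // Emit one row per delivery area so each area can be queried on its own.
>>       for (var i = 0; i < doc.delivery_areas.length; i++) {
>>         emit([doc.delivery_areas[i], doc.type], doc.name);
>>       }
>>     }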
>>
>> You can then query with startkey=[75018, null] and endkey=[75018, {}]
>> to get all restaurants in the 75018 area.
>>
>> The null and {} might look a bit weird at first but it's all to do
>> with how CouchDB orders rows in a view. See
>> http://wiki.apache.org/couchdb/View_collation#Collation_Specification
>> for details.
>>
>> Hope this helps.
>>
>> - Matt
>
