couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aroj George <aro...@gmail.com>
Subject Re: filtering on timestamp + aggregation on another field
Date Mon, 14 Mar 2011 18:13:16 GMT
Thanks for the below.

Another option we came up with is as below,

map:
for each level in the location hierarchy:
     emit([level,timestamp],doc)

which will produce something like the below,

*for given documents:*
{ timestamp : 01/01/2011, location : [India, Maharashtra,Pune] , other_attrs
}
{ timestamp : 01/02/2011, location : [India, Maharashtra,Mumbai] ,
other_attrs }

*Map output:*
1. [India,01/01/2011], doc
2. [Maharashtra,01/01/2011], doc
3. [Pune,01/01/2011], doc
4. [India,01/02/2011], doc
5. [Maharashtra,01/02/2011], doc
6. [Mumbai,01/02/2011], doc

Now we can have a query like,
startkey=[India,01/01/2011] & endkey=[India,01/03/2011] & group_level=1

which should give me the documents grouped on India but filtered on
timestamp..

The question is, is this a good solution? One concern being the number of
records in the view now is number_of_levels * num_documents
ie in this case 2 documents * 3 levels = 6 records in the view.

Will couch performance suffer with this approach?

Rgds,
Aroj



On Mon, Mar 14, 2011 at 9:47 PM, Kinley Dorji <kinleyd@gmail.com> wrote:

> Hi Aroj,
>
> Unlike in SQL, CouchDB views cannot be filtered by value. They can be
> only filtered by key for performance reasons as the key is where the
> indexing action takes place.
>
> So the solution for you would be to have a view of this type:
>
> emit (doc.timestamp, {country:doc.country, city:doc.city,
> hospital:doc.hospital, rooms:doc.rooms}); // this representing the
> output of the map stage
>
> This would create a view that you could filter based on your start and
> end time stamp keys for the desired period.
>
> The aggregation would take place in your code in the reduce stage,
> which would manipulate the json that you created on the value side of
> your key/value pair for a particular time stamp to get whatever
> aggregation you need.
>
> Hope that helps.
>
> On Mon, Mar 14, 2011 at 10:02 PM, Aroj George <arojis@gmail.com> wrote:
> > Thanks Kinley. But just emitting the timestamp does not solve the problem
> > for us.
> >
> > We want to be able to emit something like,
> >
> > [India,Mahrashtra,Pune,timestamp]  (where Pune is a city in state
> > Maharashtra which is in country India)
> >
> > and then we would like to filter using the timestamp and group by
> location
> > on say India as,
> >
> > *startkey=India,%,%,timestamp1 & endkey=India,%,%,timestamp2 &
> group_level =
> > 1*
> >
> > This should give me all results grouped by location India, but filtered
> out
> > between the two timestamps.
> >
> > But ofcourse there is no % filter possible it seems.
> >
> > A SQL equivalent would be,
> >
> > select * from table where timestamp > timestamp1 and timestamp <
> timestamp2
> > group by country.
> >
> > What's the right thing to do when faced with such a requirement?
> >
> > we could do a temp view like below?
> >
> > if (doc.timestamp > T1 & doc.timestamp < T2) emit(doc.location_path,doc)
> >
> > But aren't temp views considered bad for performance.
> >
> > Please do suggest the best way to solve this problem in couchdb.
> >
> >
> > Rgds,
> > Aroj
> >
> >
> > On 14 Mar 2011 18:59, "Kinley Dorji" <kinleyd@gmail.com> wrote:
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message