incubator-couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mathieu Castonguay <mcastong...@justlexit.com>
Subject Re: Question about multiple keys with ranges
Date Mon, 13 Feb 2012 21:27:23 GMT
Thanks for the explanation Michael. This works great if you already know
the value of the date, but if it's dynamic, how can I replace this line

var d = new Date(Date.parse("2012-02-11T22:00:00"))

with the date from the key? Can I access key[0] or something along those
lines from inside my map function?

On Mon, Feb 13, 2012 at 3:46 PM, Michael Miller <mike@cloudant.com> wrote:

> Hi Mathieu,
>
> Sorry to jump in on this conversation late.  This is a bit verbose, but
> I've seen this question go by unanswered way too many times and decided to
> be proactive.
>
> *Long story short: CouchDB is ideal for this, even on big data sets.  It
> will be fast at scale.
>
> * Details:  When working with dates in couchdb, I almost always find
> myself using the following pattern:
>
> 1) Store the date-time in either epoch seconds or a ISO std format, both
> of which are convenient to work with in javascript (for server-side views
> as well as client applications).  Your choice of ISO 8601 formatted works
> nicely with the JS Date class:
>
> var d = new Date(Date.parse("2012-02-11T22:00:00"))
>
> 2) Then, in preparation for future reduces you will likely end up wanting,
> I'd use a compound key structured like:
> [<userId>, year, month, day]
>
> So, the map code would be:
>
> function(doc){
>    if (doc && doc.userId && doc.timeScheduled && doc.dollarValue)
{
>        var d = new Date(Date.parse("2012-02-11T22:00:00"));
>        //note, Month runs [0,11]
>        emit( [doc.userId, d.getFullYear(), d.getMonth(), d.getDate()],
> doc.dollarValue);
>    }
> }
>
> where I've assumed that you may want to aggregate on some fictitious
> doc.dollarValue numerical field.  For that, you would add to your design
> document a builtin reduce function:
>
> "reduce": "_stats"
>
> to get the count, sum, min value, max value, mean and std-dev.  Let's
> suppose we call this view "idByTime" and it  lives in the design_doc called
> "selectors".
>
> 3) Now, to query this for the SELECT you want you would do:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\[
> "bob",2012,0,1\]&endkey=\["bob",2012,0,25\]'
>
> to get the list of document ids that fall within Jan 1, 2012 and Jan 25,
> 2012 for user id "bob".
>
> Now, if you want to get the full documents, you can just change that to:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\[
> "bob",2012,0,1\]&endkey=\["bob",2012,0,25\]&include_docs=true'
>
> 4) Now, the real fun comes when you can use that same index to do
> query-time rollup that's super fast.  For this the thing you want to note
> is the group_level option at query time.  If you have a key of 'n'
> dimensions (n=4 in our case), then you can roll it up from dimensionality
> n=0 through n=4.  So, at full dimensionality:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=4
> '
>
> will give you the values for all users aggregated by day.  You can add
> startkey and endky just as before to slice into the range.
>
> Now if you want to roll it up by user/year/month:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=3
> '
>
> by user/year:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=2
> '
>
> by user:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=1
> '
>
> and ultimately roll up over all users:
>
> curl -X GET '
> http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=0
> '
>
> Note that group_level=0 => "group=false", and group_level = n =>
> "group=true" in the view query options at:
>
> http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options.
>
> I prefer to just be explicit with the group_level and forget that
> group=true/false exists.
>
> Thanks, Mike
>
> p.s., apologies for any typos, I was cribbing this from some cloudant
> blog-posts in the making.
>
>
>
> On Feb 13, 2012, at 11:11 AM, Mathieu Castonguay wrote:
>
> > I tried that exact example with
> >
> ?startKey=["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-11T22:00:00"]&endkey=["26de9c438e5d1c0f075f2ae6ad0bcc82",{}]
> > and I still get records in the past:
> >
> > {"total_rows":3,"offset":0,"rows":[
> >
> {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2","2012-02-13T16:18:19.565+0000"],"value":"344e921af796598bcd709ba973003c60"},
> >
> {"id":"344e921af796598bcd709ba973001d3f","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T21:44:14.920+0000"],"value":"344e921af796598bcd709ba973001d3f"},
> >
> {"id":"344e921af796598bcd709ba973002c01","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T22:05:48.218+0000"],"value":"344e921af796598bcd709ba973002c01"}
> > ]}
> >
> >
> > The view's map function is:
> >
> > function(doc) { if(doc.userId && doc.timeScheduled)
> > {emit([doc.userId,doc.timeScheduled], doc._id)} }
> >
> >
> >
> >
> > On Mon, Feb 13, 2012 at 1:55 PM, James Klo <jim.klo@sri.com> wrote:
> >
> >> Not sure how you are querying, but are you doing the equivalent to this?
> >> startkey and endkey should be expressed as JSON
> >>
> >> curl -g '
> >>
> http://localhost:5984/orders/_design/Order/_view/by_users_after_time?startkey=[
> >>
> "f98ba9a518650a6c15c566fc6f00c157","2012-01-01T11:40:52.280Z"]&endkey=["userid",{}]'
> >>
> >>
> >> *
> >> Jim Klo
> >> Senior Software Engineer
> >> Center for Software Engineering
> >> SRI International
> >> e. jim.klo@sri.com
> >> p. 805.542.9330 x121
> >> m.  805.286.1350
> >> f. 805.546.2444
> >> *
> >>
> >> On Feb 13, 2012, at 10:27 AM, Mathieu Castonguay wrote:
> >>
> >> I tried reversing the keys with no luck. I still get timestamps that
> are in
> >> the past (before the startKey).
> >>
> >> On Sat, Feb 11, 2012 at 6:37 PM, James Klo <jim.klo@sri.com> wrote:
> >>
> >> Reverse the key. [userid, time]
> >>
> >>
> >> CouchDB is all about understanding collation. Basically views are
> >>
> >> sorted/grouped from left to right alphanumeric. See
> >>
> >> http://wiki.apache.org/couchdb/View_collation for the finer details as
> >>
> >> there are more rules than the basics I mention.
> >>
> >>
> >> so the reversal sorts the view by userid first, then date as string.
> >>
> >> Instead of sorting by dates then userids.
> >>
> >>
> >> You do it this way because you know the exact userid, but not the exact
> >>
> >> date. If you knew the exact date, but not the userid, what you have
> >>
> >> currently would be better.
> >>
> >>
> >> - Jim
> >>
> >>
> >>
> >> Sent from my iPad
> >>
> >>
> >> On Feb 11, 2012, at 1:54 PM, "Mathieu Castonguay" <
> >>
> >> mcastonguay@justlexit.com> wrote:
> >>
> >>
> >> I have a simple document named Order structure with the fields id, name,
> >>
> >> userId and timeScheduled.
> >>
> >>
> >> What I would like to do is create a view where I can find the
> >>
> >> document.idfor those who's userId is some value and timeScheduledis
> >>
> >> after a given date.
> >>
> >>
> >> My view:
> >>
> >>
> >> "by_users_after_time": {
> >>
> >>     "map": "function(doc) { if (doc.userId && doc.timeScheduled) {
> >>
> >> emit([doc.timeScheduled, doc.userId], doc._id); }}"
> >>
> >> }
> >>
> >>
> >> If I do
> >>
> >>
> >>
> >>
> localhost:5984/orders/_design/Order/_view/by_users_after_time?startKey="[2012-01-01T11:40:52.280Z,f98ba9a518650a6c15c566fc6f00c157]"
> >>
> >> I get every result back. Is there a way to access key[1] to do an if
> >>
> >> doc.userId == key[1] or something along those lines and simply emit on
> >>
> >> the
> >>
> >> time?
> >>
> >>
> >> This would be the SQL equivalent of select id from Order where userId =
> >>
> >> "f98ba9a518650a6c15c566fc6f00c157" and timeScheduled >
> >>
> >> 2012-01-01T11:40:52.280Z;
> >>
> >>
> >> I did quite a few Google searches but I can't seem to find a good
> >>
> >> tutorial
> >>
> >> on working with multiple keys. It's also possible that my approach is
> >>
> >> entirely flawed so any guidance would be appreciated.
> >>
> >>
> >> Thank you,
> >>
> >>
> >> Matt
> >>
> >>
> >>
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message