incubator-couchdb-user mailing list archives

From Mathieu Castonguay <mcastong...@justlexit.com>
Subject Re: Question about multiple keys with ranges
Date Tue, 14 Feb 2012 01:18:45 GMT
Yes, it was me who misunderstood your example. I've been trying a few
things now and it's working great. Thank you for your help.

On Mon, Feb 13, 2012 at 7:46 PM, Michael Miller <mike@cloudant.com> wrote:

> Thanks Simon,
>
> Mathieu, I'm afraid that I may have misunderstood what you're trying to do.
> I assumed the timestamp was a static property of the document. The role
> of the map function is to render those static properties into a static
> index, and then to use dynamic start/stop keys at query time to do range
> queries. It's a common misperception to think that you are interacting
> with the map function at query time, but you aren't. You can only interact
> with the output of the map function, so you want to put the logic into the
> startkey/endkey to slice into the index appropriately. Are we on the right
> track?
>
> -M
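
The split Mike describes (a static index built from the documents, then sliced with keys at query time) can be sketched as a toy model. This is not CouchDB's implementation: `compareKeys`, `buildIndex`, `queryRange` and the sample documents are hypothetical, the comparison only handles arrays of ASCII strings, and the string "9999" stands in for the `{}` high sentinel used elsewhere in this thread.

```javascript
// The map function sees only the document, never the query parameters.
function map(doc, emit) {
  if (doc.userId && doc.timeScheduled) {
    emit([doc.userId, doc.timeScheduled], doc._id);
  }
}

// Simplified key comparison for arrays of ASCII strings (hypothetical helper).
// Real CouchDB collation follows ICU rules and handles more types.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Index time: run map over every document once and sort the rows by key.
function buildIndex(docs) {
  const rows = [];
  for (const doc of docs) {
    map(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows.sort((r, s) => compareKeys(r.key, s.key));
}

// Query time: no map code runs; rows between startkey and endkey are selected.
function queryRange(index, startkey, endkey) {
  return index.filter(
    (row) => compareKeys(row.key, startkey) >= 0 && compareKeys(row.key, endkey) <= 0
  );
}

const docs = [
  { _id: "a", userId: "bob", timeScheduled: "2012-02-10T21:44:14" },
  { _id: "b", userId: "bob", timeScheduled: "2012-02-13T16:18:19" },
  { _id: "c", userId: "alice", timeScheduled: "2012-02-12T09:00:00" },
];
const index = buildIndex(docs);
const hits = queryRange(index, ["bob", "2012-02-11T22:00:00"], ["bob", "9999"]);
console.log(hits.map((row) => row.id));
```

Changing the start or end key changes which rows come back without touching the map function or rebuilding the index.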
>
> On Feb 13, 2012, at 4:33 PM, Simon Metson wrote:
>
> > Hi,
> > Do you mean how do you query the view for a given date? Once the
> > document is written I'd assume it has a fixed date, e.g. you'd do something
> > like:
> >> var d = new Date(Date.parse(doc.date));
> >>
> >>
> >
> >
> > You don't want to dynamically generate the date in the view, as this
> > will be the date the view was created, not the date of the query or the
> > date associated to the data.
> > Cheers
> > Simon
> >
> >
> > On Monday, 13 February 2012 at 21:27, Mathieu Castonguay wrote:
> >
> >> Thanks for the explanation Michael. This works great if you already know
> >> the value of the date, but if it's dynamic, how can I replace this line
> >>
> >> var d = new Date(Date.parse("2012-02-11T22:00:00"))
> >>
> >> with the date from the key? Can I access key[0] or something along those
> >> lines from inside my map function?
> >>
> >> On Mon, Feb 13, 2012 at 3:46 PM, Michael Miller <mike@cloudant.com> wrote:
> >>
> >>> Hi Mathieu,
> >>>
> >>> Sorry to jump in on this conversation late. This is a bit verbose, but
> >>> I've seen this question go by unanswered way too many times and decided
> >>> to be proactive.
> >>>
> >>> *Long story short: CouchDB is ideal for this, even on big data sets. It
> >>> will be fast at scale.
> >>>
> >>> * Details: When working with dates in couchdb, I almost always find
> >>> myself using the following pattern:
> >>>
> >>> 1) Store the date-time in either epoch seconds or an ISO standard format,
> >>> both of which are convenient to work with in javascript (for server-side
> >>> views as well as client applications). Your choice of ISO 8601 format
> >>> works nicely with the JS Date class:
> >>>
> >>> var d = new Date(Date.parse("2012-02-11T22:00:00"))
> >>>
> >>> 2) Then, in preparation for future reduces you will likely end up wanting,
> >>> I'd use a compound key structured like:
> >>> [<userId>, year, month, day]
> >>>
> >>> So, the map code would be:
> >>>
> >>> function(doc){
> >>>   if (doc && doc.userId && doc.timeScheduled && doc.dollarValue) {
> >>>     var d = new Date(Date.parse("2012-02-11T22:00:00"));
> >>>     // note, Month runs [0,11]
> >>>     emit([doc.userId, d.getFullYear(), d.getMonth(), d.getDate()],
> >>>          doc.dollarValue);
> >>>   }
> >>> }
> >>>
> >>> where I've assumed that you may want to aggregate on some fictitious
> >>> doc.dollarValue numerical field. For that, you would add to your design
> >>> document a builtin reduce function:
> >>>
> >>> "reduce": "_stats"
> >>>
> >>> to get the sum, count, min value, max value and sum of squares (from which
> >>> the mean and std-dev follow). Let's suppose we call this view "idByTime"
> >>> and it lives in the design_doc called "sectors".
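
The map function above parses a literal date string. A minimal sketch of the same map step that reads the date from the document instead (as Simon suggests elsewhere in this thread) might look like the following. The `emit` stand-in, the sample documents and the choice of UTC getters are assumptions made so the sketch runs outside CouchDB and does not depend on the server's time zone.

```javascript
// Stand-in for CouchDB's emit(); it collects rows so the sketch runs on its own.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Same key shape as the map function above, but the date comes from the document.
function map(doc) {
  if (doc && doc.userId && doc.timeScheduled && doc.dollarValue) {
    const d = new Date(Date.parse(doc.timeScheduled));
    // getUTCMonth() runs 0..11, so January is 0.
    emit([doc.userId, d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()], doc.dollarValue);
  }
}

// Hypothetical sample documents.
map({ userId: "bob", timeScheduled: "2012-01-05T10:00:00Z", dollarValue: 20 });
map({ userId: "bob", timeScheduled: "2012-02-11T22:00:00Z", dollarValue: 5 });
map({ userId: "bob", dollarValue: 7 }); // no timeScheduled, so nothing is emitted

console.log(JSON.stringify(rows));
```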
> >>>
> >>> 3) Now, to query this for the SELECT you want you would do:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\["bob",2012,0,1\]&endkey=\["bob",2012,0,25\]'
> >>>
> >>> to get the list of document ids that fall between Jan 1, 2012 and Jan 25,
> >>> 2012 for user id "bob".
> >>>
> >>> Now, if you want to get the full documents, you can just change that to:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\["bob",2012,0,1\]&endkey=\["bob",2012,0,25\]&include_docs=true'
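
When such a query is built from code rather than typed into a shell, the keys have to be JSON-encoded and then URL-escaped. A minimal sketch, assuming a hypothetical `viewUrl` helper and reusing the URL from the curl examples above:

```javascript
// Hypothetical helper: JSON-encodes each parameter value, then URL-escapes it.
function viewUrl(base, params) {
  const query = Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join("&");
  return `${base}?${query}`;
}

const url = viewUrl("http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime", {
  reduce: false,
  startkey: ["bob", 2012, 0, 1],
  endkey: ["bob", 2012, 0, 25],
  include_docs: true,
});
console.log(url);
```

Escaping the brackets and quotes this way avoids the backslashes that the curl examples need on the command line.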
> >>>
> >>> 4) Now, the real fun comes when you can use that same index to do
> >>> query-time rollup that's super fast. For this the thing you want to note
> >>> is the group_level option at query time. If you have a key of 'n'
> >>> dimensions (n=4 in our case), then you can roll it up from dimensionality
> >>> n=0 through n=4. So, at full dimensionality:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=4'
> >>>
> >>> will give you the values for all users aggregated by day. You can add
> >>> startkey and endkey just as before to slice into the range.
> >>>
> >>> Now if you want to roll it up by user/year/month:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=3'
> >>>
> >>> by user/year:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=2'
> >>>
> >>> by user:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=1'
> >>>
> >>> and ultimately roll up over all users:
> >>>
> >>> curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=0'
> >>>
> >>> Note that group_level=0 => "group=false", and group_level = n =>
> >>> "group=true" in the view query options at:
> >>>
> >>> http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options.
> >>>
> >>> I prefer to just be explicit with the group_level and forget that
> >>> group=true/false exists.
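
The effect of group_level can be illustrated with a toy rollup: rows are grouped by the first `groupLevel` elements of their key and then reduced. This is only a sketch, not CouchDB's implementation. The `rollup` helper and the sample rows are hypothetical, and it computes just a sum and a count rather than everything `_stats` returns.

```javascript
// Groups rows by the first `groupLevel` key elements and aggregates their values.
// groupLevel 0 puts every row into a single group whose key is null.
function rollup(rows, groupLevel) {
  const groups = new Map();
  for (const { key, value } of rows) {
    const groupKey = groupLevel === 0 ? null : key.slice(0, groupLevel);
    const id = JSON.stringify(groupKey);
    const group = groups.get(id) || { key: groupKey, sum: 0, count: 0 };
    group.sum += value;
    group.count += 1;
    groups.set(id, group);
  }
  return [...groups.values()];
}

// Hypothetical rows in the [userId, year, month, day] key shape used above.
const rows = [
  { key: ["alice", 2012, 0, 3], value: 10 },
  { key: ["bob", 2012, 0, 5], value: 20 },
  { key: ["bob", 2012, 0, 5], value: 1 },
  { key: ["bob", 2012, 1, 11], value: 5 },
];

const byDay = rollup(rows, 4); // per user/year/month/day
const byUser = rollup(rows, 1); // per user
const overall = rollup(rows, 0); // everything in one row
console.log(JSON.stringify(byDay));
console.log(JSON.stringify(byUser));
console.log(JSON.stringify(overall));
```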
> >>>
> >>> Thanks, Mike
> >>>
> >>> p.s., apologies for any typos, I was cribbing this from some cloudant
> >>> blog-posts in the making.
> >>>
> >>>
> >>>
> >>> On Feb 13, 2012, at 11:11 AM, Mathieu Castonguay wrote:
> >>>
> >>>> I tried that exact example with
> >>>> ?startKey=["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-11T22:00:00"]&endkey=["26de9c438e5d1c0f075f2ae6ad0bcc82",{}]
> >>>> and I still get records in the past:
> >>>>
> >>>> {"total_rows":3,"offset":0,"rows":[
> >>>> {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2","2012-02-13T16:18:19.565+0000"],"value":"344e921af796598bcd709ba973003c60"},
> >>>> {"id":"344e921af796598bcd709ba973001d3f","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T21:44:14.920+0000"],"value":"344e921af796598bcd709ba973001d3f"},
> >>>> {"id":"344e921af796598bcd709ba973002c01","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T22:05:48.218+0000"],"value":"344e921af796598bcd709ba973002c01"}
> >>>> ]}
> >>>>
> >>>>
> >>>> The view's map function is:
> >>>>
> >>>> function(doc) { if(doc.userId && doc.timeScheduled)
> >>>> {emit([doc.userId,doc.timeScheduled], doc._id)} }
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Mon, Feb 13, 2012 at 1:55 PM, James Klo <jim.klo@sri.com> wrote:
> >>>>
> >>>>> Not sure how you are querying, but are you doing the equivalent to this?
> >>>>> startkey and endkey should be expressed as JSON
> >>>>>
> >>>>> curl -g 'http://localhost:5984/orders/_design/Order/_view/by_users_after_time?startkey=["f98ba9a518650a6c15c566fc6f00c157","2012-01-01T11:40:52.280Z"]&endkey=["userid",{}]'
> >>>>>
> >>>>>
> >>>>> *
> >>>>> Jim Klo
> >>>>> Senior Software Engineer
> >>>>> Center for Software Engineering
> >>>>> SRI International
> >>>>> e. jim.klo@sri.com
> >>>>> p. 805.542.9330 x121
> >>>>> m. 805.286.1350
> >>>>> f. 805.546.2444
> >>>>> *
> >>>>>
> >>>>> On Feb 13, 2012, at 10:27 AM, Mathieu Castonguay wrote:
> >>>>>
> >>>>> I tried reversing the keys with no luck. I still get timestamps that
> >>>>> are in the past (before the startKey).
> >>>>>
> >>>>> On Sat, Feb 11, 2012 at 6:37 PM, James Klo <jim.klo@sri.com> wrote:
> >>>>>
> >>>>> Reverse the key. [userid, time]
> >>>>>
> >>>>> CouchDB is all about understanding collation. Basically views are
> >>>>> sorted/grouped from left to right alphanumeric. See
> >>>>> http://wiki.apache.org/couchdb/View_collation for the finer details as
> >>>>> there are more rules than the basics I mention.
> >>>>>
> >>>>> So the reversal sorts the view by userid first, then by date as a string,
> >>>>> instead of sorting by dates then userids.
> >>>>>
> >>>>> You do it this way because you know the exact userid, but not the exact
> >>>>> date. If you knew the exact date, but not the userid, what you have
> >>>>> currently would be better.
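
The effect of key order on collation can be seen in a small sketch. It assumes a simplified comparison over arrays of ASCII strings (the `compareKeys` helper is hypothetical and much cruder than CouchDB's real collation rules), and the sample keys are made up.

```javascript
// Simplified left-to-right comparison of array keys (hypothetical helper).
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Keys as [userid, time]: all rows for one user end up next to each other.
const userFirst = [
  ["bob", "2012-02-13"],
  ["alice", "2012-02-12"],
  ["bob", "2012-02-10"],
].sort(compareKeys);

// The same data keyed as [time, userid]: users are interleaved by date.
const timeFirst = userFirst.map(([user, time]) => [time, user]).sort(compareKeys);

console.log(JSON.stringify(userFirst));
console.log(JSON.stringify(timeFirst));
```

With the user first, a single startkey/endkey pair can cover "this user, from this date onward" as one contiguous slice. With the time first, any date range also picks up rows belonging to other users.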
> >>>>>
> >>>>>
> >>>>> - Jim
> >>>>>
> >>>>>
> >>>>>
> >>>>> Sent from my iPad
> >>>>>
> >>>>>
> >>>>> On Feb 11, 2012, at 1:54 PM, "Mathieu Castonguay" <mcastonguay@justlexit.com> wrote:
> >>>>>
> >>>>> I have a simple document structure named Order with the fields id, name,
> >>>>> userId and timeScheduled.
> >>>>>
> >>>>> What I would like to do is create a view where I can find the
> >>>>> document.id for those whose userId is some value and timeScheduled is
> >>>>> after a given date.
> >>>>>
> >>>>> My view:
> >>>>>
> >>>>> "by_users_after_time": {
> >>>>>   "map": "function(doc) { if (doc.userId && doc.timeScheduled) { emit([doc.timeScheduled, doc.userId], doc._id); }}"
> >>>>> }
> >>>>>
> >>>>> If I do
> >>>>> localhost:5984/orders/_design/Order/_view/by_users_after_time?startKey="[2012-01-01T11:40:52.280Z,f98ba9a518650a6c15c566fc6f00c157]"
> >>>>> I get every result back. Is there a way to access key[1] to do an if
> >>>>> doc.userId == key[1] or something along those lines and simply emit on
> >>>>> the time?
> >>>>>
> >>>>> This would be the SQL equivalent of select id from Order where userId =
> >>>>> "f98ba9a518650a6c15c566fc6f00c157" and timeScheduled >
> >>>>> 2012-01-01T11:40:52.280Z;
> >>>>>
> >>>>> I did quite a few Google searches but I can't seem to find a good
> >>>>> tutorial on working with multiple keys. It's also possible that my
> >>>>> approach is entirely flawed so any guidance would be appreciated.
> >>>>>
> >>>>> Thank you,
> >>>>>
> >>>>> Matt
> >
>
>
