Date: Tue, 14 Feb 2012 09:38:04 +0000
From: Simon Metson
To: user@couchdb.apache.org
Subject: Re: Question about multiple keys with ranges

Hi,
Looks like the Date.parse is failing, try emitting the doc.timeScheduled as the value instead of the _id (aside: it's probably not worth emitting the _id as a value since it's in the view result anyway...) and then checking on the command line that what's returned is parseable.
Cheers
Simon

On Tuesday, 14 February 2012 at 01:54, Mathieu Castonguay wrote:

> Actually disregard that, it's still not working... :(
>
> The view:
>
> function(doc) {
>   if (doc.userId && doc.timeScheduled) {
>     var d = new Date(Date.parse(doc.timeScheduled));
>     emit([doc.userId, d.getFullYear(), d.getMonth(), d.getDate()], doc._id);
>   }
> }
>
> When I do ?startKey=["226de9c438e5d1c0f075f2ae6ad0bcc82",2012,1,11]
>
> I get these results, which seems to get null for those values.
>
> {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2",null,null,null],"value":"344e921af796598bcd709ba973003c60"},
> {"id":"344e921af796598bcd709ba973004cd9","key":["26de9c438e5d1c0f075f2ae6ad0b39b2",null,null,null],"value":"344e921af796598bcd709ba973004cd9"},
> {"id":"344e921af796598bcd709ba973001d3f","key":["26de9c438e5d1c0f075f2ae6ad0bcc82",null,null,null],"value":"344e921af796598bcd709ba973001d3f"},
> {"id":"344e921af796598bcd709ba973002c01","key":["26de9c438e5d1c0f075f2ae6ad0bcc82",null,null,null],"value":"344e921af796598bcd709ba973002c01"}
>
> If I do the full thing with the end key:
> ?startKey=["226de9c438e5d1c0f075f2ae6ad0bcc82",2012,1,11]&endkey=["226de9c438e5d1c0f075f2ae6ad0bcc82",2012,3,25]
>
> I get no results:
>
> {"total_rows":4,"offset":0,"rows":[]}
>
> On Mon, Feb 13, 2012 at 8:18 PM, Mathieu Castonguay <mcastonguay@justlexit.com> wrote:
>
> > Yes, it was me that misunderstood your example. I've been trying a few
> > things now and it's working great, thank you for your help.
> >
> > On Mon, Feb 13, 2012 at 7:46 PM, Michael Miller wrote:
> >
> > > Thanks Simon,
> > >
> > > Mathieu, I'm afraid that I may have misunderstood what you're trying to
> > > do. I assumed the timestamp was a static property of the document. The
> > > role of the map function is to render those static properties into a static
> > > index, and then to use dynamic start/stop keys at query time to do range
> > > queries. It's a common misperception to think that you are interacting
> > > with the map function at query time, but you aren't. You can only interact
> > > with the output of the map function, so you want to put the logic into the
> > > startkey/endkey to slice into the index appropriately. Are we on the right
> > > track?
> > >
> > > -M
> > >
> > > On Feb 13, 2012, at 4:33 PM, Simon Metson wrote:
> > >
> > > > Hi,
> > > > Do you mean how do you query the view for a given date?
> > > > Once the document is written I'd assume it has a fixed date, e.g. you'd
> > > > do something like:
> > > >
> > > > var d = new Date(Date.parse(doc.date));
> > > >
> > > > You don't want to dynamically generate the date in the view, as this
> > > > will be the date the view was created, not the date of the query or the
> > > > date associated with the data.
> > > > Cheers
> > > > Simon
> > > >
> > > > On Monday, 13 February 2012 at 21:27, Mathieu Castonguay wrote:
> > > >
> > > > > Thanks for the explanation Michael. This works great if you already know
> > > > > the value of the date, but if it's dynamic, how can I replace this line
> > > > >
> > > > > var d = new Date(Date.parse("2012-02-11T22:00:00"))
> > > > >
> > > > > with the date from the key? Can I access key[0] or something along those
> > > > > lines from inside my map function?
> > > > >
> > > > > On Mon, Feb 13, 2012 at 3:46 PM, Michael Miller <mike@cloudant.com> wrote:
> > > > >
> > > > > > Hi Mathieu,
> > > > > >
> > > > > > Sorry to jump in on this conversation late. This is a bit verbose, but
> > > > > > I've seen this question go by unanswered way too many times and decided
> > > > > > to be proactive.
> > > > > >
> > > > > > * Long story short: CouchDB is ideal for this, even on big data sets. It
> > > > > > will be fast at scale.
> > > > > >
> > > > > > * Details: When working with dates in CouchDB, I almost always find
> > > > > > myself using the following pattern:
> > > > > >
> > > > > > 1) Store the date-time in either epoch seconds or an ISO standard format,
> > > > > > both of which are convenient to work with in JavaScript (for server-side
> > > > > > views as well as client applications).
> > > > > > Your choice of ISO 8601 format works nicely with the JS Date class:
> > > > > >
> > > > > > var d = new Date(Date.parse("2012-02-11T22:00:00"))
> > > > > >
> > > > > > 2) Then, in preparation for future reduces you will likely end up wanting,
> > > > > > I'd use a compound key structured like:
> > > > > > [, year, month, day]
> > > > > >
> > > > > > So, the map code would be:
> > > > > >
> > > > > > function(doc){
> > > > > >   if (doc && doc.userId && doc.timeScheduled && doc.dollarValue) {
> > > > > >     var d = new Date(Date.parse("2012-02-11T22:00:00"));
> > > > > >     //note, Month runs [0,11]
> > > > > >     emit( [doc.userId, d.getFullYear(), d.getMonth(), d.getDate()],
> > > > > >       doc.dollarValue);
> > > > > >   }
> > > > > > }
> > > > > >
> > > > > > where I've assumed that you may want to aggregate on some fictitious
> > > > > > doc.dollarValue numerical field. For that, you would add to your design
> > > > > > document a builtin reduce function:
> > > > > >
> > > > > > "reduce": "_stats"
> > > > > >
> > > > > > to get the sum, count, min value, max value and sum of squares (from which
> > > > > > the mean and std-dev follow). Let's suppose we call this view "idByTime"
> > > > > > and it lives in the design_doc called "sectors".
> > > > > >
> > > > > > 3) Now, to query this for the SELECT you want you would do:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\["bob",2012,0,1\]&endkey=\["bob",2012,0,25\]'
> > > > > >
> > > > > > to get the list of document ids that fall within Jan 1, 2012 and Jan 25,
> > > > > > 2012 for user id "bob".
> > > > > >
> > > > > > Now, if you want to get the full documents, you can just change that to:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\["bob",2012,0,1\]&endkey=\["bob",2012,0,25\]&include_docs=true'
> > > > > >
> > > > > > 4) Now, the real fun comes when you use that same index to do
> > > > > > query-time rollup that's super fast. For this, the thing you want to
> > > > > > note is the group_level option at query time. If you have a key of 'n'
> > > > > > dimensions (n=4 in our case), then you can roll it up from dimensionality
> > > > > > n=0 through n=4. So, at full dimensionality:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=4'
> > > > > >
> > > > > > will give you the values for all users aggregated by day. You can add
> > > > > > startkey and endkey just as before to slice into the range.
> > > > > >
> > > > > > Now if you want to roll it up by user/year/month:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=3'
> > > > > >
> > > > > > by user/year:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=2'
> > > > > >
> > > > > > by user:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=1'
> > > > > >
> > > > > > and ultimately roll up over all users:
> > > > > >
> > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=0'
> > > > > >
> > > > > > Note that group_level=0 => "group=false", and group_level = n =>
> > > > > > "group=true" in the view query options at:
> > > > > >
> > > > > > http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options.
> > > > > >
> > > > > > I prefer to just be explicit with the group_level and forget that
> > > > > > group=true/false exists.
> > > > > >
> > > > > > Thanks, Mike
> > > > > >
> > > > > > p.s., apologies for any typos, I was cribbing this from some cloudant
> > > > > > blog-posts in the making.
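
The map function from Mike's message, adapted so that it reads each document's own doc.timeScheduled instead of the literal date used for illustration, might look like the sketch below. Several things here are assumptions rather than anything stated in the thread: the emit stub and the sample documents exist only so the snippet runs outside CouchDB, normalizeOffset is a hypothetical helper, and the UTC getters are chosen so the key does not depend on the server's time zone. Rewriting a trailing "+0000" as "+00:00" matches the ECMAScript date-time string format; whether a particular view-server JavaScript engine accepts the colon-less form is engine-dependent, so treat that as a possible cause of a failed parse, not a confirmed one.

```javascript
// Stand-in for CouchDB's emit(); a real view server provides emit itself.
var rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Hypothetical helper: rewrite a trailing "+0000" / "-0530" offset as "+00:00" / "-05:30".
function normalizeOffset(s) {
  return String(s).replace(/([+-]\d{2})(\d{2})$/, "$1:$2");
}

// Map sketch: key is [userId, year, month, day]; month runs 0..11.
function map(doc) {
  if (!doc || !doc.userId || !doc.timeScheduled) { return; }
  var t = Date.parse(normalizeOffset(doc.timeScheduled));
  // A failed parse yields NaN, and NaN key parts serialize to null in JSON,
  // so skip such documents instead of emitting them.
  if (isNaN(t)) { return; }
  var d = new Date(t);
  emit([doc.userId, d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()],
       doc.dollarValue);
}

// Made-up sample documents.
map({ userId: "bob", timeScheduled: "2012-02-13T16:18:19.565+0000", dollarValue: 10 });
map({ userId: "bob", timeScheduled: "not a date", dollarValue: 5 });

console.log(JSON.stringify(rows));
```

Only the first sample document produces a row; the second is skipped because its timestamp does not parse. The date still comes from the document at index time, and the range is still chosen at query time through startkey and endkey.
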
> > > > > >
> > > > > > On Feb 13, 2012, at 11:11 AM, Mathieu Castonguay wrote:
> > > > > >
> > > > > > > I tried that exact example with
> > > > > > > ?startKey=["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-11T22:00:00"]&endkey=["26de9c438e5d1c0f075f2ae6ad0bcc82",{}]
> > > > > > > and I still get records in the past:
> > > > > > >
> > > > > > > {"total_rows":3,"offset":0,"rows":[
> > > > > > > {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2","2012-02-13T16:18:19.565+0000"],"value":"344e921af796598bcd709ba973003c60"},
> > > > > > > {"id":"344e921af796598bcd709ba973001d3f","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T21:44:14.920+0000"],"value":"344e921af796598bcd709ba973001d3f"},
> > > > > > > {"id":"344e921af796598bcd709ba973002c01","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T22:05:48.218+0000"],"value":"344e921af796598bcd709ba973002c01"}
> > > > > > > ]}
> > > > > > >
> > > > > > > The view's map function is:
> > > > > > >
> > > > > > > function(doc) { if(doc.userId && doc.timeScheduled)
> > > > > > > {emit([doc.userId,doc.timeScheduled], doc._id)} }
> > > > > > >
> > > > > > > On Mon, Feb 13, 2012 at 1:55 PM, James Klo <jim.klo@sri.com> wrote:
> > > > > > >
> > > > > > > > Not sure how you are querying, but are you doing the equivalent to this?
> > > > > > > > startkey and endkey should be expressed as JSON
> > > > > > > >
> > > > > > > > curl -g 'http://localhost:5984/orders/_design/Order/_view/by_users_after_time?startkey=["f98ba9a518650a6c15c566fc6f00c157","2012-01-01T11:40:52.280Z"]&endkey=["userid",{}]'
> > > > > > > >
> > > > > > > > Jim Klo
> > > > > > > > Senior Software Engineer
> > > > > > > > Center for Software Engineering
> > > > > > > > SRI International
> > > > > > > > e. jim.klo@sri.com
> > > > > > > > p. 805.542.9330 x121
> > > > > > > > m. 805.286.1350
> > > > > > > > f. 805.546.2444
> > > > > > > >
> > > > > > > > On Feb 13, 2012, at 10:27 AM, Mathieu Castonguay wrote:
> > > > > > > >
> > > > > > > > I tried reversing the keys with no luck. I still get timestamps that are in
> > > > > > > > the past (before the startKey).
> > > > > > > >
> > > > > > > > On Sat, Feb 11, 2012 at 6:37 PM, James Klo <jim.klo@sri.com> wrote:
> > > > > > > >
> > > > > > > > Reverse the key. [userid, time]
> > > > > > > >
> > > > > > > > CouchDB is all about understanding collation. Basically views are
> > > > > > > > sorted/grouped from left to right alphanumeric. See
> > > > > > > > http://wiki.apache.org/couchdb/View_collation for the finer details as
> > > > > > > > there are more rules than the basics I mention.
> > > > > > > >
> > > > > > > > so the reversal sorts the view by userid first, then date as string.
> > > > > > > > Instead of sorting by dates then userids.
> > > > > > > >
> > > > > > > > You do it this way because you know the exact userid, but not the exact
> > > > > > > > date.
> > > > > > > > If you knew the exact date, but not the userid, what you have
> > > > > > > > currently would be better.
> > > > > > > >
> > > > > > > > - Jim
> > > > > > > >
> > > > > > > > Sent from my iPad
> > > > > > > >
> > > > > > > > On Feb 11, 2012, at 1:54 PM, "Mathieu Castonguay" <mcastonguay@justlexit.com> wrote:
> > > > > > > >
> > > > > > > > I have a simple document structure named Order with the fields id, name,
> > > > > > > > userId and timeScheduled.
> > > > > > > >
> > > > > > > > What I would like to do is create a view where I can find the
> > > > > > > > document.id for those whose userId is some value and timeScheduled is
> > > > > > > > after a given date.
> > > > > > > >
> > > > > > > > My view:
> > > > > > > >
> > > > > > > > "by_users_after_time": {
> > > > > > > >   "map": "function(doc) { if (doc.userId && doc.timeScheduled) {
> > > > > > > >   emit([doc.timeScheduled, doc.userId], doc._id); }}"
> > > > > > > > }
> > > > > > > >
> > > > > > > > If I do
> > > > > > > > localhost:5984/orders/_design/Order/_view/by_users_after_time?startKey="[2012-01-01T11:40:52.280Z,f98ba9a518650a6c15c566fc6f00c157]"
> > > > > > > > I get every result back. Is there a way to access key[1] to do an if
> > > > > > > > doc.userId == key[1] or something along those lines and simply emit on
> > > > > > > > the time?
> > > > > > > >
> > > > > > > > This would be the SQL equivalent of select id from Order where userId =
> > > > > > > > "f98ba9a518650a6c15c566fc6f00c157" and timeScheduled >
> > > > > > > > 2012-01-01T11:40:52.280Z;
> > > > > > > >
> > > > > > > > I did quite a few Google searches but I can't seem to find a good tutorial
> > > > > > > > on working with multiple keys. It's also possible that my approach is
> > > > > > > > entirely flawed so any guidance would be appreciated.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > >
> > > > > > > > Matt
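
The "userId equals X and timeScheduled after T" range query discussed throughout the thread can be modelled outside CouchDB roughly as follows. This is a simplified sketch and not CouchDB's implementation: compare handles only the value types used here, strings are compared by plain code units and not by the ICU collation CouchDB applies, {} serves as a "sorts after everything" sentinel, and the keys are made up.

```javascript
// Type order loosely following CouchDB view collation:
// null < booleans < numbers < strings < arrays < objects.
function rank(v) {
  if (v === null) return 0;
  if (typeof v === "boolean") return 1;
  if (typeof v === "number") return 2;
  if (typeof v === "string") return 3;
  if (Array.isArray(v)) return 4;
  return 5; // objects
}

function compare(a, b) {
  var ra = rank(a), rb = rank(b);
  if (ra !== rb) return ra - rb;
  if (ra === 4) { // arrays: element by element, left to right
    for (var i = 0; i < Math.min(a.length, b.length); i++) {
      var c = compare(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;
  }
  if (ra === 0 || ra === 5) return 0; // nulls are equal; objects are not compared further in this sketch
  return a < b ? -1 : a > b ? 1 : 0;
}

// Return the sorted keys that fall within [startkey, endkey], inclusive.
function range(keys, startkey, endkey) {
  return keys.slice().sort(compare).filter(function (k) {
    return compare(k, startkey) >= 0 && compare(k, endkey) <= 0;
  });
}

// Made-up [userId, timeScheduled] keys.
var keys = [
  ["alice", "2012-03-01T09:00:00.000Z"],
  ["bob", "2011-12-31T23:59:59.000Z"],
  ["bob", "2012-01-15T08:30:00.000Z"],
  ["bob", "2012-02-10T21:44:14.920Z"],
  ["carol", "2012-01-20T12:00:00.000Z"]
];

// [userId, time]: only bob's rows after the start time.
var hits = range(keys, ["bob", "2012-01-01T11:40:52.280Z"], ["bob", {}]);

// [time, userId]: every later row, whoever the user is.
var flipped = keys.map(function (k) { return [k[1], k[0]]; });
var all = range(flipped, ["2012-01-01T11:40:52.280Z", "bob"], [{}]);

console.log(JSON.stringify(hits), all.length);
```

With the user id first, the range holds only bob's two later rows; with the time first, it holds four rows from three different users, which is why Jim suggests reversing the key. Separately, the queries quoted in the thread spell the parameter startKey, whereas CouchDB documents it as startkey (alias start_key). If the server ignores the unrecognized spelling, which is an assumption here, that alone would explain getting rows from before the intended start.
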