From: Justin Walgran
To: user@couchdb.apache.org
Date: Thu, 17 Mar 2011 16:34:14 -0400
Subject: Re: Paging large result sets with sorting

I'm sorry, I oversimplified my
problem statement. Your solution is correct if I only need to select
by month. Unfortunately I also need to support an arbitrary inspection
date range for filtering results. February 6th to March 14th for
example. This is where the trouble creeps in.

Justin

On Thu, Mar 17, 2011 at 4:29 PM, Keith Gable wrote:
> Then simply emit the name before the day of the month. Then, it'll
> sort by name then day of month.
>
> On Thu, Mar 17, 2011 at 3:17 PM, Justin Walgran wrote:
>> Thanks for the thoughtful reply, Keith.
>>
>> Assume these input docs:
>>
>>  { "inspection_date": "2011-03-01", "homeowner_name": "Bob" }
>>
>>  { "inspection_date": "2011-03-02", "homeowner_name": "Keith" }
>>
>>  { "inspection_date": "2011-03-03", "homeowner_name": "Alice" }
>>
>> The key output from
>> by_inspection_date_and_homeowner_name?reduce=false&startkey=[2011,3,0]&endkey=[2011,3,{}]
>> would be:
>>
>>  [2011,3,1,"Bob"]
>>  [2011,3,2,"Keith"]
>>  [2011,3,3,"Alice"]
>>
>> Which is not sorted by home owner name. That's the gotcha.
>>
>>
>> Justin
>>
>> On Thu, Mar 17, 2011 at 2:13 PM, Keith Gable wrote:
>>> Uh. This sounds simple?
>>>
>>> view: by_home_owner_name:
>>> if (doc.home_owner_name) { emit(doc.home_owner_name, 1); }
>>>
>>> view: by_inspection_date:
>>> if (doc.inspection_date) {
>>>   var d = new Date(doc.inspection_date);
>>>   emit ([ d.getFullYear(), d.getMonth() + 1, d.getDate() ], 1);
>>> }
>>>
>>> To look for all of my inspections:
>>> ...by_home_owner_name?key=Keith Gable
>>>
>>> To get all of the inspections for today:
>>> ...by_inspection_date?reduce=false&key=[2011,3,17]
>>>
>>> To get all of the inspections for this month:
>>> ...by_inspection_date?reduce=false&startkey=[2011,3,0]&endkey=[2011,3,{}]
>>>
>>>
>>> Combining the two:
>>>
>>> view: by_inspection_date_and_homeowner_name:
>>> if (doc.inspection_date && doc.homeowner_name) {
>>>   var d = new Date(doc.inspection_date);
>>>   emit ([ d.getFullYear(), d.getMonth() + 1, d.getDate(),
>>>     doc.homeowner_name ], 1);
>>> }
>>>
>>> ...by_inspection_date_and_homeowner_name?reduce=false&startkey=[2011,3,0]&endkey=[2011,3,{}]
>>>
>>> Will result in:
>>> [2011,3,1,"Alice"]
>>> [2011,3,1,"Bob"]
>>> [2011,3,2,"Keith"]
>>>
>>>
>>> Does any of that not do what you want?
>>>
>>> On Thu, Mar 17, 2011 at 12:33 PM, Justin Walgran wrote:
>>>> Assume a CouchDB storing and indexing housing inspection records. Each
>>>> inspection document has two important fields.
>>>>
>>>>  - Home owner name
>>>>  - Inspection date
>>>>
>>>> There are about 15,000 inspection documents generated per month.
>>>>
>>>> I need to quickly retrieve a list of inspections for January, sorted
>>>> by home owner name.
>>>>
>>>> The issue I am running into is the fact that the size of the result
>>>> set requires paging the data using limit and startkey. This would
>>>> require that the view key be the inspection date, which means the
>>>> results cannot be sorted by home owner name. The size of the data
>>>> means that pulling it all down to the client and sorting in the
>>>> browser is not performant.
>>>>
>>>> Is there a clever way to solve this problem?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Justin
>>>>
>>>
>>>
>>>
>>> --
>>> Keith Gable
>>> A+ Certified Professional
>>> Network+ Certified Professional
>>> Web Developer
>>>
>>
>
>
>
> --
> Keith Gable
> A+ Certified Professional
> Network+ Certified Professional
> Web Developer
>
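
The gotcha discussed in this thread can be reproduced without a CouchDB server. The sketch below is not from the thread: mapDoc and compareKeys are hypothetical helpers, and compareKeys is only a simplified stand-in for CouchDB's view collation (it handles numbers and strings, nothing else). It builds the same [year, month, day, name] keys for the three example docs, splitting the date string by hand to sidestep time zone effects, and sorts them the way a view index would.

```javascript
// Hypothetical stand-in for the combined view's map function:
// returns the key [year, month, day, homeowner_name] for one doc.
function mapDoc(doc) {
  var parts = doc.inspection_date.split("-").map(Number);
  return [parts[0], parts[1], parts[2], doc.homeowner_name];
}

// Simplified array-key comparison (assumption: keys hold only numbers
// and strings). Elements are compared left to right, and numbers sort
// before strings, as in CouchDB's collation order.
function compareKeys(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    var x = a[i], y = b[i];
    if (typeof x !== typeof y) return typeof x === "number" ? -1 : 1;
    if (x < y) return -1;
    if (x > y) return 1;
  }
  return a.length - b.length;
}

// The three example docs from the thread.
var docs = [
  { inspection_date: "2011-03-01", homeowner_name: "Bob" },
  { inspection_date: "2011-03-02", homeowner_name: "Keith" },
  { inspection_date: "2011-03-03", homeowner_name: "Alice" }
];

// Keys in index order: the day is compared before the name.
var keys = docs.map(mapDoc).sort(compareKeys);
var namesInKeyOrder = keys.map(function (k) { return k[3]; });

// What the report actually wants: names in alphabetical order.
var namesSortedByName = namesInKeyOrder.slice().sort();

console.log(namesInKeyOrder);   // order returned by the date range query
console.log(namesSortedByName); // order the report needs
```

Because the day sits ahead of the name in the key, any startkey/endkey range spanning more than one day returns rows grouped by day first, which is why the name order is lost.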