incubator-couchdb-user mailing list archives

From Mathieu Castonguay <mcastong...@justlexit.com>
Subject Re: Question about multiple keys with ranges
Date Tue, 14 Feb 2012 16:10:19 GMT
Ok the issue has finally been resolved and this may be valuable information
for everyone, so I'm adding it to this thread.

I also want to thank everyone who took the time to help me debug.

Essentially, Date.parse() doesn't accept the +0000 offset on the timestamps.
After stripping the +0000 with a substring call, everything worked.

For the record,

document.write(new Date("2012-02-13T16:18:19.565+0000")); // Outputs "Invalid Date"
document.write(Date.parse("2012-02-13T16:18:19.565+0000")); // Outputs NaN

But if you remove the +0000, both lines of code work perfectly.
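
A minimal sketch of an alternative to stripping the offset, offered as an assumption rather than what was actually deployed: rewrite "+0000" as "+00:00", which is the offset form the ES5 date-time string format specifies, before parsing. The helper names normalizeOffset and scheduleKey are hypothetical, and the UTC getters are a choice made here so the key does not depend on the server's time zone (the view in this thread used getFullYear(), getMonth() and getDate()).

```javascript
// Hypothetical helper (not from the thread): rewrite a trailing "+0000"-style
// offset as "+00:00". Strings ending in "Z" or "+00:00" are left unchanged.
function normalizeOffset(timestamp) {
  return timestamp.replace(/([+-]\d{2})(\d{2})$/, "$1:$2");
}

// Hypothetical helper: build the [userId, year, month, day] key discussed in
// this thread. Returns null when the timestamp cannot be parsed, so the
// caller can skip the document instead of emitting null key components.
function scheduleKey(doc) {
  var ms = Date.parse(normalizeOffset(doc.timeScheduled));
  if (isNaN(ms)) {
    return null;
  }
  var d = new Date(ms);
  // getUTCMonth() runs 0-11, so February is 1.
  return [doc.userId, d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()];
}

// Inside a CouchDB map function (with the helpers defined in the function
// body) this might be used as:
//   function(doc) {
//     if (doc.userId && doc.timeScheduled) {
//       var key = scheduleKey(doc);
//       if (key) { emit(key, null); }
//     }
//   }
```

Keeping the offset may matter because a date-time string with no offset can be interpreted as local time by newer JavaScript engines, which could shift the day component of the key.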

Again thanks everyone for the support.

On Tue, Feb 14, 2012 at 10:30 AM, Robert Newson <rnewson@apache.org> wrote:

> These results;
>
>
> {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2",null,null,null],"value":"344e921af796598bcd709ba973003c60"},
>
> {"id":"344e921af796598bcd709ba973004cd9","key":["26de9c438e5d1c0f075f2ae6ad0b39b2",null,null,null],"value":"344e921af796598bcd709ba973004cd9"},
>
> are the output of your map function, so you *are* emitting null, null,
> null from your map function. This explains why your queries don't work
> as you expected.
>
> B.
>
> On 14 February 2012 15:21, Mathieu Castonguay <mcastonguay@justlexit.com>
> wrote:
> > I tried lowercase "startkey"; it made no difference.
> >
> > As for the doc.timeScheduled, they are normally formatted timestamps and none
> > are null, i.e.
> >
> > "2012-02-13T16:18:19.565+0000"
> >
> >
> > On Tue, Feb 14, 2012 at 4:38 AM, Simon Metson <simonmetson@googlemail.com> wrote:
> >
> >> Hi,
> >> Looks like the Date.parse is failing, try emitting the doc.timeScheduled
> >> as the value instead of the _id (aside: it's probably not worth emitting
> >> the _id as a value since it's in the view result anyway...) and then
> >> checking on the command line that what's returned is parseable.
> >> Cheers
> >> Simon
> >>
> >>
> >> On Tuesday, 14 February 2012 at 01:54, Mathieu Castonguay wrote:
> >>
> >> > Actually disregard that, it's still not working... :(
> >> >
> >> > The view:
> >> >
> >> > function(doc) { if(doc.userId && doc.timeScheduled) {var d = new
> >> > Date(Date.parse(doc.timeScheduled)); emit([doc.userId,
> >> > d.getFullYear(), d.getMonth(), d.getDate()], doc._id)} }
> >> >
> >> >
> >> >
> >> > When I do ?startKey=["226de9c438e5d1c0f075f2ae6ad0bcc82",2012,1,11]
> >> >
> >> > I get these results, which seem to have null for those values.
> >> >
> >> > {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2",null,null,null],"value":"344e921af796598bcd709ba973003c60"},
> >> > {"id":"344e921af796598bcd709ba973004cd9","key":["26de9c438e5d1c0f075f2ae6ad0b39b2",null,null,null],"value":"344e921af796598bcd709ba973004cd9"},
> >> > {"id":"344e921af796598bcd709ba973001d3f","key":["26de9c438e5d1c0f075f2ae6ad0bcc82",null,null,null],"value":"344e921af796598bcd709ba973001d3f"},
> >> > {"id":"344e921af796598bcd709ba973002c01","key":["26de9c438e5d1c0f075f2ae6ad0bcc82",null,null,null],"value":"344e921af796598bcd709ba973002c01"}
> >> >
> >> > If I do the full thing with the end key:
> >> >
> >> > ?startKey=["226de9c438e5d1c0f075f2ae6ad0bcc82",2012,1,11]&endkey=["226de9c438e5d1c0f075f2ae6ad0bcc82",2012,3,25]
> >> >
> >> > I get no results:
> >> >
> >> > {"total_rows":4,"offset":0,"rows":[]}
> >> >
> >> >
> >> > On Mon, Feb 13, 2012 at 8:18 PM, Mathieu Castonguay <mcastonguay@justlexit.com> wrote:
> >> >
> >> > > Yes, it was me that misunderstood your example, I've been trying a few
> >> > > things now and it's working great, thank you for your help.
> >> > >
> >> > >
> >> > > On Mon, Feb 13, 2012 at 7:46 PM, Michael Miller <mike@cloudant.com> wrote:
> >> > >
> >> > > > Thanks Simon,
> >> > > >
> >> > > > Mathieu, I'm afraid that I may have misunderstood what you're trying to
> >> > > > do. I assumed the timestamp was a static property of the document. The
> >> > > > role of the map function is to render those static properties into a
> >> > > > static index, and then to use dynamic start/stop keys at query time to do
> >> > > > range queries. It's a common misperception to think that you are
> >> > > > interacting with the map function at query time, but you aren't. You can
> >> > > > only interact with the output of the map function, so you want to put the
> >> > > > logic into the startkey/endkey to slice into the index appropriately. Are
> >> > > > we on the right track?
> >> > > >
> >> > > > -M
> >> > > >
> >> > > > On Feb 13, 2012, at 4:33 PM, Simon Metson wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > > Do you mean how do you query the view for a given date? Once the
> >> > > > > document is written I'd assume it has a fixed date, e.g. you'd do
> >> > > > > something like:
> >> > > > > > var d = new Date(Date.parse(doc.date));
> >> > > > >
> >> > > > > You don't want to dynamically generate the date in the view, as this
> >> > > > > will be the date the view was created, not the date of the query or
> >> > > > > the date associated to the data.
> >> > > > > Cheers
> >> > > > > Simon
> >> > > > >
> >> > > > >
> >> > > > > On Monday, 13 February 2012 at 21:27, Mathieu Castonguay wrote:
> >> > > > >
> >> > > > > > Thanks for the explanation Michael. This works great if you already
> >> > > > > > know the value of the date, but if it's dynamic, how can I replace
> >> > > > > > this line
> >> > > > > >
> >> > > > > > var d = new Date(Date.parse("2012-02-11T22:00:00"))
> >> > > > > >
> >> > > > > > with the date from the key? Can I access key[0] or something along
> >> > > > > > those lines from inside my map function?
> >> > > > > >
> >> > > > > > On Mon, Feb 13, 2012 at 3:46 PM, Michael Miller <mike@cloudant.com> wrote:
> >> > > > > >
> >> > > > > > > Hi Mathieu,
> >> > > > > > >
> >> > > > > > > Sorry to jump in on this conversation late. This is a bit verbose, but
> >> > > > > > > I've seen this question go by unanswered way too many times and
> >> > > > > > > decided to be proactive.
> >> > > > > > >
> >> > > > > > > * Long story short: CouchDB is ideal for this, even on big data sets.
> >> > > > > > > It will be fast at scale.
> >> > > > > > >
> >> > > > > > > * Details: When working with dates in CouchDB, I almost always find
> >> > > > > > > myself using the following pattern:
> >> > > > > > >
> >> > > > > > > 1) Store the date-time in either epoch seconds or an ISO standard format,
> >> > > > > > > both of which are convenient to work with in JavaScript (for server-side
> >> > > > > > > views as well as client applications). Your choice of ISO 8601 formatting
> >> > > > > > > works nicely with the JS Date class:
> >> > > > > > >
> >> > > > > > > var d = new Date(Date.parse("2012-02-11T22:00:00"))
> >> > > > > > >
> >> > > > > > > 2) Then, in preparation for future reduces you will likely end up wanting,
> >> > > > > > > I'd use a compound key structured like:
> >> > > > > > > [<userId>, year, month, day]
> >> > > > > > >
> >> > > > > > > So, the map code would be:
> >> > > > > > >
> >> > > > > > > function(doc){
> >> > > > > > >   if (doc && doc.userId && doc.timeScheduled && doc.dollarValue) {
> >> > > > > > >     var d = new Date(Date.parse("2012-02-11T22:00:00"));
> >> > > > > > >     //note, Month runs [0,11]
> >> > > > > > >     emit([doc.userId, d.getFullYear(), d.getMonth(), d.getDate()],
> >> > > > > > >          doc.dollarValue);
> >> > > > > > >   }
> >> > > > > > > }
> >> > > > > > >
> >> > > > > > > where I've assumed that you may want to aggregate on some fictitious
> >> > > > > > > doc.dollarValue numerical field. For that, you would add to your design
> >> > > > > > > document a builtin reduce function:
> >> > > > > > >
> >> > > > > > > "reduce": "_stats"
> >> > > > > > >
> >> > > > > > > to get the sum, count, min value, max value and sum of squares (from
> >> > > > > > > which the mean and std-dev can be derived). Let's suppose we call this
> >> > > > > > > view "idByTime" and it lives in the design_doc called "sectors".
> >> > > > > > >
> >> > > > > > > 3) Now, to query this for the SELECT you want
you would do:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\["bob",2012,0,1\]&endkey=\["bob",2012,0,25\]'
> >> > > > > > >
> >> > > > > > > to get the list of document ids that fall within Jan 1, 2012 and Jan 25,
> >> > > > > > > 2012 for user id "bob".
> >> > > > > > >
> >> > > > > > > Now, if you want to get the full documents, you can just change that to:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?reduce=false&startkey=\["bob",2012,0,1\]&endkey=\["bob",2012,0,25\]&include_docs=true'
> >> > > > > > >
> >> > > > > > > 4) Now, the real fun comes when you can use that same index to do
> >> > > > > > > query-time rollup that's super fast. For this the thing you want to note
> >> > > > > > > is the group_level option at query time. If you have a key of 'n'
> >> > > > > > > dimensions (n=4 in our case), then you can roll it up from dimensionality
> >> > > > > > > n=0 through n=4. So, at full dimensionality:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=4'
> >> > > > > > >
> >> > > > > > > will give you the values for all users aggregated by day. You can add
> >> > > > > > > startkey and endkey just as before to slice into the range.
> >> > > > > > >
> >> > > > > > > Now if you want to roll it up by user/year/month:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=3'
> >> > > > > > >
> >> > > > > > > by user/year:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=2'
> >> > > > > > >
> >> > > > > > > by user:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=1'
> >> > > > > > >
> >> > > > > > > and ultimately roll up over all users:
> >> > > > > > >
> >> > > > > > > curl -X GET 'http://demo.cloudant.com/dbname/_design/sectors/_view/idByTime?group_level=0'
> >> > > > > > >
> >> > > > > > > Note that group_level=0 => "group=false", and group_level=n =>
> >> > > > > > > "group=true" in the view query options at:
> >> > > > > > >
> >> > > > > > > http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options.
> >> > > > > > >
> >> > > > > > > I prefer to just be explicit with the group_level and forget that
> >> > > > > > > group=true/false exists.
> >> > > > > > >
> >> > > > > > > Thanks, Mike
> >> > > > > > >
> >> > > > > > > p.s., apologies for any typos, I was cribbing this from some Cloudant
> >> > > > > > > blog-posts in the making.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Feb 13, 2012, at 11:11 AM, Mathieu Castonguay wrote:
> >> > > > > > >
> >> > > > > > > > I tried that exact example with
> >> > > > > > > > ?startKey=["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-11T22:00:00"]&endkey=["26de9c438e5d1c0f075f2ae6ad0bcc82",{}]
> >> > > > > > > > and I still get records in the past:
> >> > > > > > > >
> >> > > > > > > > {"total_rows":3,"offset":0,"rows":[
> >> > > > > > > > {"id":"344e921af796598bcd709ba973003c60","key":["26de9c438e5d1c0f075f2ae6ad0b39b2","2012-02-13T16:18:19.565+0000"],"value":"344e921af796598bcd709ba973003c60"},
> >> > > > > > > > {"id":"344e921af796598bcd709ba973001d3f","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T21:44:14.920+0000"],"value":"344e921af796598bcd709ba973001d3f"},
> >> > > > > > > > {"id":"344e921af796598bcd709ba973002c01","key":["26de9c438e5d1c0f075f2ae6ad0bcc82","2012-02-10T22:05:48.218+0000"],"value":"344e921af796598bcd709ba973002c01"}
> >> > > > > > > > ]}
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > The view's map function is:
> >> > > > > > > >
> >> > > > > > > > function(doc) { if(doc.userId && doc.timeScheduled)
> >> > > > > > > > {emit([doc.userId,doc.timeScheduled], doc._id)} }
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Mon, Feb 13, 2012 at 1:55 PM, James Klo <jim.klo@sri.com> wrote:
> >> > > > > > > >
> >> > > > > > > > > Not sure how you are querying, but are you doing the equivalent to
> >> > > > > > > > > this? startkey and endkey should be expressed as JSON
> >> > > > > > > > >
> >> > > > > > > > > curl -g 'http://localhost:5984/orders/_design/Order/_view/by_users_after_time?startkey=["f98ba9a518650a6c15c566fc6f00c157","2012-01-01T11:40:52.280Z"]&endkey=["userid",{}]'
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > *
> >> > > > > > > > > Jim Klo
> >> > > > > > > > > Senior Software Engineer
> >> > > > > > > > > Center for Software Engineering
> >> > > > > > > > > SRI International
> >> > > > > > > > > e. jim.klo@sri.com (mailto:jim.klo@sri.com)
> >> > > > > > > > > p. 805.542.9330 x121
> >> > > > > > > > > m. 805.286.1350
> >> > > > > > > > > f. 805.546.2444
> >> > > > > > > > > *
> >> > > > > > > > >
> >> > > > > > > > > On Feb 13, 2012, at 10:27 AM, Mathieu Castonguay wrote:
> >> > > > > > > > >
> >> > > > > > > > > I tried reversing the keys with no luck. I still get timestamps that
> >> > > > > > > > > are in the past (before the startKey).
> >> > > > > > > > >
> >> > > > > > > > > On Sat, Feb 11, 2012 at 6:37 PM, James Klo <jim.klo@sri.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > Reverse the key. [userid, time]
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > CouchDB is all about understanding collation. Basically views are
> >> > > > > > > > > sorted/grouped from left to right alphanumeric. See
> >> > > > > > > > > http://wiki.apache.org/couchdb/View_collation for the finer details as
> >> > > > > > > > > there are more rules than the basics I mention.
> >> > > > > > > > >
> >> > > > > > > > > So the reversal sorts the view by userid first, then date as string,
> >> > > > > > > > > instead of sorting by dates then userids.
> >> > > > > > > > >
> >> > > > > > > > > You do it this way because you know the exact userid, but not the exact
> >> > > > > > > > > date. If you knew the exact date, but not the userid, what you have
> >> > > > > > > > > currently would be better.
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > - Jim
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > Sent from my iPad
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > On Feb 11, 2012, at 1:54 PM, "Mathieu Castonguay" <mcastonguay@justlexit.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > I have a simple document structure named Order with the fields id, name,
> >> > > > > > > > > userId and timeScheduled.
> >> > > > > > > > >
> >> > > > > > > > > What I would like to do is create a view where I can find the
> >> > > > > > > > > document.id for those whose userId is some value and timeScheduled is
> >> > > > > > > > > after a given date.
> >> > > > > > > > >
> >> > > > > > > > > My view:
> >> > > > > > > > >
> >> > > > > > > > > "by_users_after_time": {
> >> > > > > > > > > "map": "function(doc) { if (doc.userId && doc.timeScheduled) { emit([doc.timeScheduled, doc.userId], doc._id); }}"
> >> > > > > > > > > }
> >> > > > > > > > >
> >> > > > > > > > > If I do
> >> > > > > > > > > localhost:5984/orders/_design/Order/_view/by_users_after_time?startKey="[2012-01-01T11:40:52.280Z,f98ba9a518650a6c15c566fc6f00c157]"
> >> > > > > > > > >
> >> > > > > > > > > I get every result back. Is there a way to access key[1] to do an if
> >> > > > > > > > > doc.userId == key[1] or something along those lines and simply emit on
> >> > > > > > > > > the time?
> >> > > > > > > > >
> >> > > > > > > > > This would be the SQL equivalent of select id from Order where userId =
> >> > > > > > > > > "f98ba9a518650a6c15c566fc6f00c157" and timeScheduled >
> >> > > > > > > > > 2012-01-01T11:40:52.280Z;
> >> > > > > > > > >
> >> > > > > > > > > I did quite a few Google searches but I can't seem to find a good
> >> > > > > > > > > tutorial on working with multiple keys. It's also possible that my
> >> > > > > > > > > approach is entirely flawed so any guidance would be appreciated.
> >> > > > > > > > >
> >> > > > > > > > > Thank you,
> >> > > > > > > > >
> >> > > > > > > > > Matt
> >>
> >>
>
