lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jake Mannix <jake.man...@gmail.com>
Subject Re: lucene 2.9 sorting algorithm
Date Fri, 23 Oct 2009 02:35:16 GMT
Mark,

  We're not seeing exactly the numbers that Mike is seeing in his tests,
running with jdk 1.5 on intel macs, so we're trying to eliminate factors of
difference.

  Point 2 does indeed make a difference, we've seen it, and it's only fair:
the
single pq comparator does this branch optimization but the current patch
multi-pq
does not, so let's level the playing field.

  John's on the road with limited net connectivity, but we'll have some
numbers to
compare more over the weekend for sure.

  -jake

On Thu, Oct 22, 2009 at 7:29 PM, Mark Miller <markrmiller@gmail.com> wrote:

> Why? What might he find? Whats with the cryptic request?
>
> Why would Java 1.5 perform better than 1.6? It erases 20 and 40% gains?
>
> I know point 2 certainly doesn't. Cards on the table?
>
> John Wang wrote:
> > Hey Michael:
> >
> >        Would you mind rerunning the test you have with jdk1.5?
> >
> >        Also, if you would, change the comparator method to avoid
> > brachning for int and string comparators, e.g.
> >
> >
> >       return index.order[i.doc] - index.order[j.doc];
> >
> >
> > Thanks
> >
> >
> > -John
> >
> >
> > On Thu, Oct 22, 2009 at 2:38 AM, Michael McCandless
> > <lucene@mikemccandless.com <mailto:lucene@mikemccandless.com>> wrote:
> >
> >     On Thu, Oct 22, 2009 at 2:17 AM, John Wang <john.wang@gmail.com
> >     <mailto:john.wang@gmail.com>> wrote:
> >
> >     >      I have been playing with the patch, and I think I have some
> >     information
> >     > that you might like.
> >     >      Let me spend sometime and gather some more numbers and
> >     update in jira.
> >
> >     Excellent!
> >
> >     >      say bottom has ords 23, 45, 76, each corresponding to a
> >     string. When
> >     > moving to the next segment, you need to make bottom to have ords
> >     that can be
> >     > comparable to other docs in this new segment, so you would need
> >     to find the
> >     > new ords for the values in 23,45 and 76, don't you? To find it,
> >     assuming the
> >     > values are s1,s2,s3, you would do a bin. search on the new val
> >     array, and
> >     > find index for s1,s2,s3.
> >
> >     It's that inversion (from ord->Comparable in first seg, and
> >     Comparable->ord in second seg) that I'm trying to avoid (w/ this new
> >     proposal).
> >
> >     > Which is 3 bin searches per convert, I am not sure
> >     > how you can short circuit it. Are you suggesting we call
> >     Comparable on
> >     > compareBottom until some doc beats it?
> >
> >     I'm saying on seg transition you indeed get the Comparable for
> current
> >     bottom, but, don't attempt to invert it.  Instead, as seg 2 finds a
> >     hit, you get that hit's Comparables and compare to bottom.  If it
> >     beats bottom, it goes into the queue.  If it does not, you use the
> ord
> >     (in seg 2's ord space) to "learn" a bottom in the ord space of seg 2.
> >
> >     > That would hurt performance I lot though, no?
> >
> >     Yeah I think likely it would, since we're talking about a binary
> >     search on transition VS having to do possibly many
> >     upgrade-to-Comparable and compare-Comparabls to slowly learn the
> >     equivalent ord in the new segment.  I was proposing it for cases
> where
> >     inversion is very difficult.  But realistically, since you must keep
> >     around the ful ord -> Comparable for every segment anyway (in order
> to
> >     merge in the end), inversion shouldn't ever actually be "difficult"
> --
> >     it'd just be a binary search on presumably in-RAM storage.
> >
> >     Mike
> >
> >     ---------------------------------------------------------------------
> >     To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >     <mailto:java-dev-unsubscribe@lucene.apache.org>
> >     For additional commands, e-mail: java-dev-help@lucene.apache.org
> >     <mailto:java-dev-help@lucene.apache.org>
> >
> >
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Mime
View raw message