lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <markrmil...@gmail.com>
Subject Re: lucene 2.9 sorting algorithm
Date Fri, 23 Oct 2009 02:39:17 GMT
I am patient :) And I'm not speaking for Mike, I'm speaking for me. I'm
wondering what your seeing. Asking Mike to rerun the tests without
giving any further info (you didn't even say that your seeing something
different) is unfair to the rest of us ;)

Giving 0 info along with your request just makes 0 sense to me and I
said as much.

John Wang wrote:
> Mark:
>
>        Please be patient with me. I am seeing a difference and was
> wondering if Mike would see the same thing. I thought Michael would be
> willing to because he expressed interest in understanding what the
> performance discrepancies are.
>
>        Again, it is only a request. It is perfectly fine if Michael
> refuses to. But it would be great if Michael speaks for himself.
>
> Thanks
>
> -John
>
> On Thu, Oct 22, 2009 at 7:29 PM, Mark Miller <markrmiller@gmail.com
> <mailto:markrmiller@gmail.com>> wrote:
>
>     Why? What might he find? Whats with the cryptic request?
>
>     Why would Java 1.5 perform better than 1.6? It erases 20 and 40%
>     gains?
>
>     I know point 2 certainly doesn't. Cards on the table?
>
>     John Wang wrote:
>     > Hey Michael:
>     >
>     >        Would you mind rerunning the test you have with jdk1.5?
>     >
>     >        Also, if you would, change the comparator method to avoid
>     > brachning for int and string comparators, e.g.
>     >
>     >
>     >       return index.order[i.doc] - index.order[j.doc];
>     >
>     >
>     > Thanks
>     >
>     >
>     > -John
>     >
>     >
>     > On Thu, Oct 22, 2009 at 2:38 AM, Michael McCandless
>     > <lucene@mikemccandless.com <mailto:lucene@mikemccandless.com>
>     <mailto:lucene@mikemccandless.com
>     <mailto:lucene@mikemccandless.com>>> wrote:
>     >
>     >     On Thu, Oct 22, 2009 at 2:17 AM, John Wang
>     <john.wang@gmail.com <mailto:john.wang@gmail.com>
>     >     <mailto:john.wang@gmail.com <mailto:john.wang@gmail.com>>>
>     wrote:
>     >
>     >     >      I have been playing with the patch, and I think I
>     have some
>     >     information
>     >     > that you might like.
>     >     >      Let me spend sometime and gather some more numbers and
>     >     update in jira.
>     >
>     >     Excellent!
>     >
>     >     >      say bottom has ords 23, 45, 76, each corresponding to a
>     >     string. When
>     >     > moving to the next segment, you need to make bottom to
>     have ords
>     >     that can be
>     >     > comparable to other docs in this new segment, so you would
>     need
>     >     to find the
>     >     > new ords for the values in 23,45 and 76, don't you? To
>     find it,
>     >     assuming the
>     >     > values are s1,s2,s3, you would do a bin. search on the new val
>     >     array, and
>     >     > find index for s1,s2,s3.
>     >
>     >     It's that inversion (from ord->Comparable in first seg, and
>     >     Comparable->ord in second seg) that I'm trying to avoid (w/
>     this new
>     >     proposal).
>     >
>     >     > Which is 3 bin searches per convert, I am not sure
>     >     > how you can short circuit it. Are you suggesting we call
>     >     Comparable on
>     >     > compareBottom until some doc beats it?
>     >
>     >     I'm saying on seg transition you indeed get the Comparable
>     for current
>     >     bottom, but, don't attempt to invert it.  Instead, as seg 2
>     finds a
>     >     hit, you get that hit's Comparables and compare to bottom.
>      If it
>     >     beats bottom, it goes into the queue.  If it does not, you
>     use the ord
>     >     (in seg 2's ord space) to "learn" a bottom in the ord space
>     of seg 2.
>     >
>     >     > That would hurt performance I lot though, no?
>     >
>     >     Yeah I think likely it would, since we're talking about a binary
>     >     search on transition VS having to do possibly many
>     >     upgrade-to-Comparable and compare-Comparabls to slowly learn the
>     >     equivalent ord in the new segment.  I was proposing it for
>     cases where
>     >     inversion is very difficult.  But realistically, since you
>     must keep
>     >     around the ful ord -> Comparable for every segment anyway
>     (in order to
>     >     merge in the end), inversion shouldn't ever actually be
>     "difficult" --
>     >     it'd just be a binary search on presumably in-RAM storage.
>     >
>     >     Mike
>     >
>     >    
>     ---------------------------------------------------------------------
>     >     To unsubscribe, e-mail:
>     java-dev-unsubscribe@lucene.apache.org
>     <mailto:java-dev-unsubscribe@lucene.apache.org>
>     >     <mailto:java-dev-unsubscribe@lucene.apache.org
>     <mailto:java-dev-unsubscribe@lucene.apache.org>>
>     >     For additional commands, e-mail:
>     java-dev-help@lucene.apache.org
>     <mailto:java-dev-help@lucene.apache.org>
>     >     <mailto:java-dev-help@lucene.apache.org
>     <mailto:java-dev-help@lucene.apache.org>>
>     >
>     >
>
>
>     --
>     - Mark
>
>     http://www.lucidimagination.com
>
>
>
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>     <mailto:java-dev-unsubscribe@lucene.apache.org>
>     For additional commands, e-mail: java-dev-help@lucene.apache.org
>     <mailto:java-dev-help@lucene.apache.org>
>
>


-- 
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message