lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Wang <john.w...@gmail.com>
Subject Re: lucene 2.9 sorting algorithm
Date Fri, 23 Oct 2009 02:43:31 GMT
Mike:
       I did just post with what I saw, feel free to read and comment on it.

       I am simply trying to work with Michael on this and trying to
understand the code.

       As I have expressed previously, I have seen a difference between 1.5
and 1.6 that is significant. Since Mike has posted some numbers on jdk 1.6,
I was hoping to eliminate all variables relating to the index and
environment and see if he sees the same thing.

        I guess I should be more clear in the email.

-John

On Thu, Oct 22, 2009 at 7:39 PM, Mark Miller <markrmiller@gmail.com> wrote:

> I am patient :) And I'm not speaking for Mike, I'm speaking for me. I'm
> wondering what your seeing. Asking Mike to rerun the tests without
> giving any further info (you didn't even say that your seeing something
> different) is unfair to the rest of us ;)
>
> Giving 0 info along with your request just makes 0 sense to me and I
> said as much.
>
> John Wang wrote:
> > Mark:
> >
> >        Please be patient with me. I am seeing a difference and was
> > wondering if Mike would see the same thing. I thought Michael would be
> > willing to because he expressed interest in understanding what the
> > performance discrepancies are.
> >
> >        Again, it is only a request. It is perfectly fine if Michael
> > refuses to. But it would be great if Michael speaks for himself.
> >
> > Thanks
> >
> > -John
> >
> > On Thu, Oct 22, 2009 at 7:29 PM, Mark Miller <markrmiller@gmail.com
> > <mailto:markrmiller@gmail.com>> wrote:
> >
> >     Why? What might he find? Whats with the cryptic request?
> >
> >     Why would Java 1.5 perform better than 1.6? It erases 20 and 40%
> >     gains?
> >
> >     I know point 2 certainly doesn't. Cards on the table?
> >
> >     John Wang wrote:
> >     > Hey Michael:
> >     >
> >     >        Would you mind rerunning the test you have with jdk1.5?
> >     >
> >     >        Also, if you would, change the comparator method to avoid
> >     > brachning for int and string comparators, e.g.
> >     >
> >     >
> >     >       return index.order[i.doc] - index.order[j.doc];
> >     >
> >     >
> >     > Thanks
> >     >
> >     >
> >     > -John
> >     >
> >     >
> >     > On Thu, Oct 22, 2009 at 2:38 AM, Michael McCandless
> >     > <lucene@mikemccandless.com <mailto:lucene@mikemccandless.com>
> >     <mailto:lucene@mikemccandless.com
> >     <mailto:lucene@mikemccandless.com>>> wrote:
> >     >
> >     >     On Thu, Oct 22, 2009 at 2:17 AM, John Wang
> >     <john.wang@gmail.com <mailto:john.wang@gmail.com>
> >     >     <mailto:john.wang@gmail.com <mailto:john.wang@gmail.com>>>
> >     wrote:
> >     >
> >     >     >      I have been playing with the patch, and I think I
> >     have some
> >     >     information
> >     >     > that you might like.
> >     >     >      Let me spend sometime and gather some more numbers and
> >     >     update in jira.
> >     >
> >     >     Excellent!
> >     >
> >     >     >      say bottom has ords 23, 45, 76, each corresponding to a
> >     >     string. When
> >     >     > moving to the next segment, you need to make bottom to
> >     have ords
> >     >     that can be
> >     >     > comparable to other docs in this new segment, so you would
> >     need
> >     >     to find the
> >     >     > new ords for the values in 23,45 and 76, don't you? To
> >     find it,
> >     >     assuming the
> >     >     > values are s1,s2,s3, you would do a bin. search on the new
> val
> >     >     array, and
> >     >     > find index for s1,s2,s3.
> >     >
> >     >     It's that inversion (from ord->Comparable in first seg, and
> >     >     Comparable->ord in second seg) that I'm trying to avoid (w/
> >     this new
> >     >     proposal).
> >     >
> >     >     > Which is 3 bin searches per convert, I am not sure
> >     >     > how you can short circuit it. Are you suggesting we call
> >     >     Comparable on
> >     >     > compareBottom until some doc beats it?
> >     >
> >     >     I'm saying on seg transition you indeed get the Comparable
> >     for current
> >     >     bottom, but, don't attempt to invert it.  Instead, as seg 2
> >     finds a
> >     >     hit, you get that hit's Comparables and compare to bottom.
> >      If it
> >     >     beats bottom, it goes into the queue.  If it does not, you
> >     use the ord
> >     >     (in seg 2's ord space) to "learn" a bottom in the ord space
> >     of seg 2.
> >     >
> >     >     > That would hurt performance I lot though, no?
> >     >
> >     >     Yeah I think likely it would, since we're talking about a
> binary
> >     >     search on transition VS having to do possibly many
> >     >     upgrade-to-Comparable and compare-Comparabls to slowly learn
> the
> >     >     equivalent ord in the new segment.  I was proposing it for
> >     cases where
> >     >     inversion is very difficult.  But realistically, since you
> >     must keep
> >     >     around the ful ord -> Comparable for every segment anyway
> >     (in order to
> >     >     merge in the end), inversion shouldn't ever actually be
> >     "difficult" --
> >     >     it'd just be a binary search on presumably in-RAM storage.
> >     >
> >     >     Mike
> >     >
> >     >
> >     ---------------------------------------------------------------------
> >     >     To unsubscribe, e-mail:
> >     java-dev-unsubscribe@lucene.apache.org
> >     <mailto:java-dev-unsubscribe@lucene.apache.org>
> >     >     <mailto:java-dev-unsubscribe@lucene.apache.org
> >     <mailto:java-dev-unsubscribe@lucene.apache.org>>
> >     >     For additional commands, e-mail:
> >     java-dev-help@lucene.apache.org
> >     <mailto:java-dev-help@lucene.apache.org>
> >     >     <mailto:java-dev-help@lucene.apache.org
> >     <mailto:java-dev-help@lucene.apache.org>>
> >     >
> >     >
> >
> >
> >     --
> >     - Mark
> >
> >     http://www.lucidimagination.com
> >
> >
> >
> >
> >     ---------------------------------------------------------------------
> >     To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >     <mailto:java-dev-unsubscribe@lucene.apache.org>
> >     For additional commands, e-mail: java-dev-help@lucene.apache.org
> >     <mailto:java-dev-help@lucene.apache.org>
> >
> >
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Mime
View raw message