lucene-dev mailing list archives

From "Grant Ingersoll (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-443) ConjunctionScorer tune-up
Date Thu, 21 Sep 2006 01:11:24 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-443?page=comments#action_12436414 ] 
            
Grant Ingersoll commented on LUCENE-443:
----------------------------------------

Yonik, Paul, does either of you know the status of this one?  From the looks of it, it hasn't
been implemented.  It also has the highest number of votes in JIRA, so I thought I would take
a look at it.  One downside is that it is not in patch form, but it also doesn't look too hard
to extract the changes.

One problem I have with these performance issues is that we don't have a reliable benchmarking
suite.  I am not a lawyer, but might we be able to use something like http://trec.nist.gov/data/reuters/reuters.html
to build a sample benchmark suite?  This corpus, plus 100 or so queries, could work nicely.
Of course, we would have to figure out some way for those interested to get their hands on
the data.  What do others do for benchmarking?

> ConjunctionScorer tune-up
> -------------------------
>
>                 Key: LUCENE-443
>                 URL: http://issues.apache.org/jira/browse/LUCENE-443
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 1.9
>         Environment: Linux, Java 1.5, Large Index with 4 million items and some heavily nested boolean queries
>            Reporter: Abdul Chaudhry
>         Attachments: ConjunctionScorer.java, ConjunctionScorer.java
>
>
> I recently ran a load test on the latest code from Lucene, which is using a new
> BooleanScore, and noticed the ConjunctionScorer was crunching through objects, especially
> while sorting as part of the skipTo call. It turns a linked list into an array, sorts the
> array, then converts the array back to a linked list for further processing by the scoring
> engines below.
> I'm not sure if anyone else is experiencing this, as I have a very large index (> 4
> million items) and I am issuing some heavily nested queries.
> Anyway, I decided to change the linked list into an array and use a first and last marker
> to "simulate" a linked list.
> This scaled much better during my load test, as the Java garbage collector was less -
> umm - virulent.
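
For readers who want to see the array-with-markers idea from the issue description in concrete form, here is a minimal sketch. It is not the attached ConjunctionScorer.java and does not use the real Lucene Scorer API: DocIterator, ArrayConjunction, nextMatch and intersect are hypothetical names made up for illustration, and each sub-scorer is assumed to be a strictly increasing array of doc ids. The sub-iterators are sorted once up front; after that, "first" and "last" indices walk the array circularly, so the matching loop allocates nothing and never re-sorts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a sub-scorer: iterates a strictly increasing int[] of doc ids. */
final class DocIterator {
    private final int[] docs;
    private int pos = -1;

    DocIterator(int[] sortedDocs) { this.docs = sortedDocs; }

    /** Current doc id; only valid after next() or skipTo() returned true. */
    int doc() { return docs[pos]; }

    /** Moves to the next doc; returns false when exhausted. */
    boolean next() { return ++pos < docs.length; }

    /** Moves to the first doc >= target; returns false when exhausted. */
    boolean skipTo(int target) {
        if (pos < 0) pos = 0;
        while (pos < docs.length && docs[pos] < target) pos++;
        return pos < docs.length;
    }
}

/** Conjunction (AND) over several iterators, using an array as a circular list. */
final class ArrayConjunction {
    private final DocIterator[] scorers;
    private int first;        // index of the iterator on the smallest doc
    private int last;         // index of the iterator on the largest doc
    private boolean more;     // false once any iterator is exhausted
    private boolean started;  // true after the first nextMatch() call

    ArrayConjunction(int[]... lists) {
        scorers = new DocIterator[lists.length];
        for (int i = 0; i < lists.length; i++) scorers[i] = new DocIterator(lists[i]);
        more = scorers.length > 0;
        for (DocIterator s : scorers) more &= s.next();
        // Sort once by current doc; from here on only the two markers move.
        if (more) Arrays.sort(scorers, Comparator.comparingInt(DocIterator::doc));
        first = 0;
        last = scorers.length - 1;
    }

    /** Returns the next doc id present in every list, or -1 when there is none. */
    int nextMatch() {
        if (started) more = more && scorers[last].next();
        started = true;
        while (more && scorers[first].doc() < scorers[last].doc()) {
            // Skip the smallest iterator up to the largest doc; it becomes the new "last".
            more = scorers[first].skipTo(scorers[last].doc());
            last = first;
            first = (first == scorers.length - 1) ? 0 : first + 1;
        }
        return more ? scorers[first].doc() : -1;
    }

    /** Convenience helper: collects all matches into a list. */
    static List<Integer> intersect(int[]... lists) {
        ArrayConjunction c = new ArrayConjunction(lists);
        List<Integer> out = new ArrayList<>();
        for (int d = c.nextMatch(); d != -1; d = c.nextMatch()) out.add(d);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(intersect(
            new int[] {1, 3, 5, 7}, new int[] {3, 4, 5, 8}, new int[] {0, 3, 5, 9}));
        // prints [3, 5]
    }
}
```

The key invariant is that, walking circularly from first to last, the current doc ids never decrease. When the docs at first and last are equal, every iterator sits on the same doc, which is a match.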

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


