lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Performance between Filter and HitCollector?
Date Thu, 15 Mar 2007 03:09:44 GMT
eks dev and others - have you tried using the code from LUCENE-584?  Noticed any performance
increase when you disabled scoring?  I'd like to look at that patch soon and commit it if
everything is in place and makes sense, so I'm curious if you or anyone else already tried
this patch...

Thanks,
Otis
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

----- Original Message ----
From: eks dev <eksdev@yahoo.co.uk>
To: java-user@lucene.apache.org
Sent: Wednesday, March 14, 2007 3:59:25 PM
Subject: Re: Performance between Filter and HitCollector?

just to complete this fine answer,
there is also Matcher patch (https://issues.apache.org/jira/browse/LUCENE-584)  that could
bring the best of both worlds via e.g. ConstantScoringQuery or another abstraction that enables
disabling Scoring (where appropriate)

----- Original Message ----
From: Chris Hostetter <hossman_lucene@fucit.org>
To: java-user@lucene.apache.org
Sent: Wednesday, 14 March, 2007 7:15:06 PM
Subject: Re: Performance between Filter and HitCollector?


it's kind of an Apples/Oranges comparison .. in the examples you gave
below, one is executing an arbitrary query (which oculd be anything) the
other is doing a simple TermEnumeration.

Asuming that Query is a TermQuery, the Filter is theoreticaly going to be
faster becuase it does't have to compute any Scores ... generally speaking
a a Filter will alwyas be a little faster then a functionally equivilent
Query for the purposes of building up a simple BitSet of matching
documents because teh Query involves the score calcuations ... but the
Query is generally more usable.

The Query can also be more efficient in other ways, because the
HitCollector doesn't *have* to build a BitSet, it can deal with the
results in whatever way it wants (where as a Filter allways generates a
BitSet).

Solr goes the HitCollector route for a few reasons:
  1) allows us to use hte DocSet abstraction which allows other
     performance benefits over straight BitSets
  2) allows us to have simpler code that builds DocSets and DocLists
     (DocLists know about scores, sorting, and pagination) in a single
     pass when scores or sorting are requested.



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org






        
___________________________________________________________ 
All New Yahoo! Mail – Tired of unwanted email come-ons? Let our SpamGuard protect you. http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message