lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Li Li <>
Subject Re: how to extend Similarity in this situation?
Date Wed, 02 Jun 2010 07:11:13 GMT
thank you

2010/6/2 Rebecca Watson <>:
> Hi Li Li
> If you want to support some query types and not others you should
> overide/extend the queryparser so that you throw an exception / makes
> a different query type instead.
> Similarity doesn't do the actual scoring, it's used by the Query
> classes (actually the Scorer implementation used by the Query)
> to get tf / idf scores. So scoring is done by the Query types themselves
> (the Weight/Scorer classes used by the Query). So sometimes
> you change/extend Similarity to affect scoring and sometimes you
> need to change/extend the Query/Scorer/Weight classes etc.
> Luckily though, there are already some Query types that boost
> scores if the terms are closer -- the SpanQuery query types do this.
> You will either have to write your own queryparser (to convert query string
> to the Query object(s) required,
> or you could use e.g. the Surround queryparser supports specifying
> SpanQuery queries via user-entered text:
> look in the lucene contrib/surround package for the surround queryparser.
> I've messed about with using SpanQuery classes a bit and people report
> that they are slower - which is true because they use a lot of extra
> info to perform their matching/scoring -- but they are still easily
> sub-second over most
> queries / reasonable sized doc collection
> -- and they only get a lot slower if you have terms that occur very frequently
> in the document collection you are searching.
> So if you are removing stop words etc you shouldn't have too
> many problems efficiency wise.
> hope that helps,
> bec :)
> On 1 June 2010 22:07, Li Li <> wrote:
>> I want to only support boolean or query(as many search engine do). But
>> I want to boost document whose terms are closer.
>> e.g. the query terms are 'apache lucene'
>> doc1 apache has many projects such as lucene
>> doc2 The Apache HTTP Server Project is an effort to develop and
>> maintain an ...  Lucene is a .....
>> I want to give doc1 more boost.
>> So I need a place where I can get all position information
>> How to extend Similarity to achieve this?
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message