lucene-java-user mailing list archives

From Paul Taylor <paul_t...@fastmail.fm>
Subject Re: Uable to extends TopTermsRewrite in Lucene 4.1
Date Tue, 26 Feb 2013 18:01:29 GMT
On 26/02/2013 17:22, Uwe Schindler wrote:
>> Hi,
>>
>> You cannot override rewrite() because you could easily break the logic
>> behind TopTermsRewrite. If you want another behavior, subclass another
>> base class and wrap the TopTermsRewrite instead of subclassing it (the
>> generics also enforce that the rewrite needs to rewrite() to a class that’s
>> specified in the generics parameter).
>>
>> addClause() is not final, it's abstract. There is one "final" helper method used
>> by the rewrite itself, but the methods you need to override are abstract.
>>
>> Also your generics seem to be wrong, leading to the above question...
> In addition, you cast the result of super.rewrite() to DisjunctionMaxQuery, so it
> is definitely a DisjunctionMaxQuery (because getTopLevelQuery() always returns one,
> see generics). You then pass this DisjunctionMaxQuery to the getQueryBoost() method,
> which checks for instanceof PrefixQuery. This can never return true, so the boost
> is always 1. You can therefore nuke the whole rewrite() method (as it changes
> nothing) and only implement getTopLevelQuery() and addClause().
>
> Uwe

Thanks Uwe. As I said, I didn't really understand what I was doing, but
it did seem to do the job. I'll try out your recommendations.

Paul
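
For anyone reading this thread later, the shape Uwe describes (a base class
whose rewrite() is final, with getTopLevelQuery() and addClause() as the
abstract hooks, and a generics parameter that fixes the rewritten type) can be
sketched roughly as below. This is only a model: every class here
(SketchQuery, SketchTermQuery, SketchDisMaxQuery, TopTermsRewriteSketch,
DisMaxRewriteSketch) is a hypothetical stand-in written for illustration, not
a Lucene class, and the real Lucene 4.1 signatures take more parameters than
shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for query types; NOT the Lucene classes.
class SketchQuery {
    float boost = 1.0f;
}

class SketchTermQuery extends SketchQuery {
    final String term;
    SketchTermQuery(String term) { this.term = term; }
}

class SketchDisMaxQuery extends SketchQuery {
    final float tieBreaker;
    final List<SketchQuery> disjuncts = new ArrayList<>();
    SketchDisMaxQuery(float tieBreaker) { this.tieBreaker = tieBreaker; }
    void add(SketchQuery q) { disjuncts.add(q); }
}

// Simplified model of a rewrite base class: rewrite() is final and returns Q,
// so a subclass can only customise the two abstract hooks.
abstract class TopTermsRewriteSketch<Q extends SketchQuery> {
    protected abstract Q getTopLevelQuery();
    protected abstract void addClause(Q topLevel, String term, float boost);

    public final Q rewrite(List<String> matchingTerms) {
        Q top = getTopLevelQuery();
        for (String term : matchingTerms) {
            addClause(top, term, 1.0f);
        }
        return top;
    }
}

// The type argument is the concrete top-level query, so neither hook needs a cast.
class DisMaxRewriteSketch extends TopTermsRewriteSketch<SketchDisMaxQuery> {
    @Override
    protected SketchDisMaxQuery getTopLevelQuery() {
        return new SketchDisMaxQuery(0.1f);
    }

    @Override
    protected void addClause(SketchDisMaxQuery topLevel, String term, float boost) {
        SketchTermQuery tq = new SketchTermQuery(term);
        tq.boost = boost;
        topLevel.add(tq);
    }
}
```

With the type parameter set to the concrete top-level query, the compiler
checks what the casts in the original code were papering over; any custom
boosting would then have to live in addClause() or in a wrapper around the
rewrite, which is an assumption about where it fits rather than something this
thread settles.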
>
>>> -----Original Message-----
>>> From: Paul Taylor [mailto:paul_t100@fastmail.fm]
>>> Sent: Tuesday, February 26, 2013 5:34 PM
>>> To: java-user@lucene.apache.org
>>> Subject: Uable to extends TopTermsRewrite in Lucene 4.1
>>>
>>> In Lucene 3.6 I had code that replicated a DisMax query, and the
>>> search used fuzzy queries in some cases to match values. But I was
>>> finding that the score attributed to matches on fuzzy searches was
>>> completely different from the score attributed to matches on exact
>>> searches, so the total score returned was not good. I improved this by
>>> extending TopTermsRewrite so that if the query is a prefix query we
>>> boost it as if it were an exact match. I don't fully understand this, but
>>> it improved things somewhat. However, in Lucene 4.1 the rewrite() and
>>> addClause() methods are final.
>>>
>>> So how can I implement this in Lucene 4.1? Do I even need to, or is
>>> there a more intuitive way to improve the scoring?
>>>
>>> This is what I currently have; it won't compile because of the final
>>> methods:
>>>
>>>       //TODO FIXME WAS Overriding methods that are now final
>>>       public static class MultiTermUseIdfOfSearchTerm<Q extends
>>> DisjunctionMaxQuery> extends TopTermsRewrite<Query> {
>>>
>>>       //public static final class MultiTermUseIdfOfSearchTerm extends
>>> TopTermsRewrite<BooleanQuery> {
>>>           private final TFIDFSimilarity similarity;
>>>
>>>           public MultiTermUseIdfOfSearchTerm(int size) {
>>>               super(size);
>>>               this.similarity = new DefaultSimilarity();
>>>
>>>           }
>>>
>>>           @Override
>>>           protected int getMaxSize() {
>>>               return BooleanQuery.getMaxClauseCount();
>>>           }
>>>
>>>           @Override
>>>           protected DisjunctionMaxQuery getTopLevelQuery() {
>>>               return new DisjunctionMaxQuery(0.1f);
>>>           }
>>>
>>>           @Override
>>>           protected void addClause(Query topLevel, Term term, float boost) {
>>>               final Query tq = new ConstantScoreQuery(new TermQuery(term));
>>>               tq.setBoost(boost);
>>>               ((DisjunctionMaxQuery)topLevel).add(tq);
>>>           }
>>>
>>>           protected float getQueryBoost(final IndexReader reader, final
>>> MultiTermQuery query)
>>>                   throws IOException {
>>>               float idf = 1f;
>>>               float df;
>>>               if (query instanceof PrefixQuery)
>>>               {
>>>                   PrefixQuery fq = (PrefixQuery) query;
>>>                   df = reader.docFreq(fq.getPrefix());
>>>                   if(df>=1)
>>>                   {
>>>                       //Same as idf value for search term, 0.5 acts as length norm
>>>                       idf = (float)Math.pow(similarity.idf((int) df,
>>> reader.numDocs()),2) * 0.5f;
>>>                   }
>>>               }
>>>               return idf;
>>>           }
>>>
>>>           @Override
>>>           public Query rewrite(final IndexReader reader, final
>>> MultiTermQuery
>>> query) throws IOException {
>>>               DisjunctionMaxQuery  bq =
>>> (DisjunctionMaxQuery)super.rewrite(reader, query);
>>>
>>>               float idfBoost = getQueryBoost(reader, query);
>>>               Iterator<Query> iterator = bq.iterator();
>>>               while(iterator.hasNext())
>>>               {
>>>                   Query next = iterator.next();
>>>                   next.setBoost(next.getBoost() * idfBoost);
>>>               }
>>>               return bq;
>>>           }
>>>
>>>       }
>>>
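
As a side note on the arithmetic in getQueryBoost() above: the boost is the
squared idf of the prefix term times 0.5. A minimal standalone sketch of that
calculation follows, assuming the classic DefaultSimilarity-style formula
idf = 1 + ln(numDocs / (docFreq + 1)). The class and method names
(IdfBoostSketch, idf, boostFor) are hypothetical and exist only for this
illustration.

```java
// Hypothetical helper mirroring the boost arithmetic quoted above; not Lucene code.
class IdfBoostSketch {

    // Assumed classic TF-IDF idf: 1 + ln(numDocs / (docFreq + 1)).
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Boost = idf^2 * 0.5 when the term occurs at least once, otherwise 1.
    static float boostFor(long docFreq, long numDocs) {
        if (docFreq < 1) {
            return 1f;
        }
        float idf = idf(docFreq, numDocs);
        return idf * idf * 0.5f;
    }
}
```

For example, with docFreq = 9 and numDocs = 10 the logarithm term is ln(1) = 0,
so idf is 1 and the boost is 0.5; rarer terms get a larger boost.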
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org