On 04/05/2011 12:51, Paul Taylor wrote:
> On 04/05/2011 12:39, Ahmet Arslan wrote:
>>
>> Im receiving a number of searches with many ORs so that the total
>> number of matches is huge (> 1 million) although only the first 20
>> results are required. Analysis shows most time is spent scoring the
>> results. Now it seems to me if you sending a query with 10 OR
>> components, documents that match most of the terms are bound to get a
>> better score than a match that only matches one or two of the terms.
>> So does lucene do any optimization to not bother working out the
>> scores of the poor matches.
>>
>> EDIT:Actually not sure the statement because if only term matches it
>> could still get the highest score if the match was on the shortest term.
>>
>> But can you see my point is there way to get lucene discount the less
>> good matches without scoring them, or is there another approach. At
>> the moment we allow the full lucene syntax and use QueryParser to
>> parse a query and pass the resultant query to search unchanged
>> (execpt for handling of numeric fields), should I be modifying the
>> query somehow ?
>>
>>
>> You can restrict number of returned results by using a adaptively
>> computed BooleanQuery.html#setMinimumNumberShouldMatch(int) parameter.
>> For example, If you have 10 optional clauses you can set minimum
>> should match to 60% of 10 = 6.
>>
>> Similar mechanism exists in solr :
>> http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29
>>
>>
>>
> Thanks for the hint, so this could be done by overriding
> getBooleanQuery() in QueryParser ?
>
> Paul
>
Well I did extend QuerParser, and the method is being called but rather
disappointingly it had no noticeablke effect on how long queries took. I
really thought by reducing the number of matches the corresponding
scoring phase would be quicker.
@Override
protected Query getBooleanQuery(List<BooleanClause> clauses,
boolean disableCoord)
throws ParseException
{
BooleanQuery query = (BooleanQuery)
super.getBooleanQuery(clauses,disableCoord);
if(query!=null)
{
if(clauses.size() > 5)
{
query.setMinimumNumberShouldMatch(3);
}
}
return query;
}
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
|