lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From markharw00d <>
Subject Re: Query Scoring
Date Sat, 31 Dec 2005 13:11:24 GMT
Sorry to contradict, Erik, but the Highlighter's QueryScorer will make 
use of IDF, given a reader, in order to better prioritise which are the 
"best" bits of a document.
However, In the particular example given, the criteria includes several 
non-text fields which are not useful for IDF and general scoring 
purposes - these are perhaps better expressed using a filter of some 
form. Otherwise, why should the scarcity of a particular date in the 
given range boost one matching document above others? These numeric-type 
fields are simply mandatory boolean "hygiene factors" and  should 
ideally play no part in highlight selection or results ordering in 
general based on their IDF or TF.


Erik Hatcher wrote:

> Harini,
> I'm not sure I understand what you're asking.  IDF doesn't factor  
> into highlighting.
> IDF calculations are useful in scoring documents during a search,  
> such that the most relevant documents are returned, but again this is  
> unrelated to highlighting.
> Could you elaborate on what you're after?
>     Erik
> On Dec 30, 2005, at 12:02 PM, Harini Raghavan wrote:
>> Hi,
>> I have a requirement to highlight search keywords in the results and
>> display the matching fragment of the text with the results. I am using
>> the Hits highlighting mentioned in Lucene in Action.
>> Here is the search query(BooleanQuery) I am passing to the  
>> IndexSearcher
>> and QueryScorer:
>> +DocumentType:news
>> +(CompanyId:10 CompanyId:20 CompanyId:30 CompanyId:40)
>> +FilingDate:[20041201 TO 20051201]
>> +(Content:"cost saving" Content:"cost savings" Content:outsource
>> Content:outsources Content:downsize Content:downsizes
>> Content:restructuring Content:restructure)
>> I do not quite understand how the query scoring actually works &  how 
>> Inverse Document Frequency(IDF) calculations are useful?  Can
>> someone shed some light on this using the given query as an example?
>> Thanks,
>> Harini
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

NEW Yahoo! Cars - sell your car and browse thousands of new and used cars online!

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message