lucene-java-user mailing list archives

From: Grant Ingersoll
Subject: Re: Scoring results?!
Date: Wed, 09 May 2007 11:13:51 GMT
Hi Eric,

On May 9, 2007, at 2:39 AM, supereric wrote:

> How can I get the score for a tag word in Lucene? Suppose you have
> searched for a tag word and 3 matching documents are found.
> 1 - How can I find the number of occurrences in each document, so
> that I can sort the results by it?

Span queries tell you where the matches occur in a document, by term
position, but I am not sure what your sorting criteria would be.  The
Searcher's explain method can also show you why a particular document
received the score it did.
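
If the goal is simply to order hits by how often the term appears, the idea can be sketched in plain Java without any Lucene classes. Everything below (the class name TermFrequencySort, both method names, the tokenizing regex) is made up for illustration and is not Lucene's API; inside Lucene, per-document term frequency is already one of the factors in the default score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: plain Java, no Lucene classes involved.
class TermFrequencySort {

    /** Counts case-insensitive, whole-token occurrences of term in text. */
    static int countOccurrences(String text, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        int count = 0;
        // Assumption: tokens are runs of letters or digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(needle)) {
                count++;
            }
        }
        return count;
    }

    /** Returns document indexes ordered by descending occurrence count. */
    static List<Integer> rankByOccurrences(List<String> docs, String term) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            order.add(i);
        }
        // List.sort is stable, so documents with equal counts keep their original order.
        order.sort(Comparator
                .comparingInt((Integer i) -> countOccurrences(docs.get(i), term))
                .reversed());
        return order;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene lucene search",
                "search only",
                "Lucene, lucene and LUCENE");
        System.out.println(rankByOccurrences(docs, "lucene"));
    }
}
```

Sorting by raw count ignores document length, so a long document that mentions the term often will outrank a short, focused one; that is one reason the default scoring normalizes by field length.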

> I also want to apply some other policies for ranking the results.
> How should I handle that? For example, I want boldfaced tag words
> in an HTML document to score twice as high as normal text.

Although totally experimental at this stage, the new Payload stuff in
the trunk version of Lucene (or nightly builds) is designed for such
a scenario.  Check out BoostingTermQuery, which can boost a term's
score based on the contents of the payload stored at each position
of that term.  Feedback on the APIs is very much appreciated.
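
To make the idea concrete, here is a rough plain-Java sketch of per-occurrence weighting. It does not use the Payload or BoostingTermQuery APIs; the Token class, the bold flag standing in for a payload, and the 2.0 weight are all assumptions chosen to match the "bold counts double" example in the question.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: mimics the payload idea, not Lucene's API.
class BoldBoostSketch {

    /** One token occurrence; the bold flag stands in for a per-position payload. */
    static final class Token {
        final String text;
        final boolean bold;

        Token(String text, boolean bold) {
            this.text = text;
            this.bold = bold;
        }
    }

    static final float BOLD_WEIGHT = 2.0f;   // assumption: bold text counts double
    static final float NORMAL_WEIGHT = 1.0f;

    /** Sums the weight of every occurrence of term in the token list. */
    static float weightedFrequency(List<Token> tokens, String term) {
        float sum = 0f;
        for (Token t : tokens) {
            if (t.text.equalsIgnoreCase(term)) {
                sum += t.bold ? BOLD_WEIGHT : NORMAL_WEIGHT;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Roughly what "<b>lucene</b> search lucene" might look like after HTML-aware analysis.
        List<Token> doc = Arrays.asList(
                new Token("lucene", true),
                new Token("search", false),
                new Token("lucene", false));
        System.out.println(weightedFrequency(doc, "lucene"));
    }
}
```

The design point is that the weight is attached to each occurrence at analysis time, so the same term can count differently depending on where it appeared in the markup.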

> 2 - How can I omit some tag words from the index, for example
> common words in another language?

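See the StopFilter token filter and/or the StopAnalyzer.  Both accept
a custom list of stop words, so you can supply the common words of
whatever language you are indexing.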
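
Conceptually, a stop filter just drops tokens that appear in a given word set before they reach the index. The following is a minimal plain-Java sketch of that behavior, not the StopFilter class itself; the class name, the method name, and the small German word list are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: shows the effect of stop word removal, not Lucene's API.
class StopWordSketch {

    /** Returns the tokens that are not in stopWords, compared case-insensitively. */
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            // Assumption: the stop word set holds lowercase entries.
            if (!stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical stop word list for German text.
        Set<String> germanStopWords = new HashSet<>(Arrays.asList("der", "die", "das", "und"));
        List<String> tokens = Arrays.asList("Der", "Hund", "und", "die", "Katze");
        System.out.println(removeStopWords(tokens, germanStopWords));
    }
}
```

Whatever list is used, the same filtering should be applied at both index time and query time, otherwise queries containing stop words may not behave as expected.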



Grant Ingersoll
Center for Natural Language Processing
