lucene-dev mailing list archives

From Dmitry Serebrennikov <dmit...@earthlink.net>
Subject Re: What type of indexer is Lucene? Question reworded.
Date Thu, 07 Mar 2002 21:08:32 GMT
I can't answer all of these questions fully, but since Doug is out, I'll 
give it a start. Please check the FAQ for a more detailed explanation; I 
believe you will find enough information there to answer all of your 
questions. The FAQ is linked from the Jakarta page (there are actually 
two FAQs, so you might want to check both).

As far as I understand, Lucene is a probabilistic indexer. It supports 
boolean queries but it also supports phrase queries, where it does true 
ranking. The ranking is done based on how many of the search words 
appear in a document and how "important" the words are for that 
document, which is a function of the word frequency and the size of the 
document.
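
To make the ranking idea concrete, here is a toy sketch in Java. It is 
not Lucene's actual scoring formula; the class name ToyScorer and the 
tf / docLength weighting are assumptions made only for illustration. A 
document scores higher when more of the search words appear in it, when 
they appear more often, and when the document is shorter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical toy scorer, NOT Lucene's implementation.
class ToyScorer {

    // Sums, over the query terms, each term's frequency in the document
    // divided by the document length. Terms that do not occur add 0.
    static double score(List<String> queryTerms, List<String> docTokens) {
        if (docTokens.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (String term : queryTerms) {
            int tf = Collections.frequency(docTokens, term);
            total += (double) tf / docTokens.size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("lucene", "index");
        List<String> shortDoc = Arrays.asList("lucene", "index", "lucene");
        List<String> longDoc = Arrays.asList(
            "the", "index", "of", "a", "very", "long", "book");
        System.out.println("short doc: " + score(query, shortDoc));
        System.out.println("long doc:  " + score(query, longDoc));
    }
}
```

The short document contains both search words and is small, so it 
ranks above the long document that contains only one of them.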

For a given search, the type of result you get depends on the type of 
Query that is used. For example, boolean queries can have "traditional" 
AND terms which are all required for a match, but they can also have 
"optional" terms that rank the document higher if they are found, but do 
not rule out a document if they are not.
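
The difference between required and optional terms can be sketched as 
follows. This is a self-contained toy, not the Lucene API; the class 
ToyBooleanQuery, the match method, and the "count of matched terms" 
score are all hypothetical names and choices made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy matcher, NOT Lucene's BooleanQuery.
class ToyBooleanQuery {

    // Returns -1 if any required term is missing (the document is ruled out).
    // Otherwise returns the number of required plus optional terms found,
    // so optional terms can only raise the score, never exclude a document.
    static int match(Set<String> docTerms,
                     List<String> required,
                     List<String> optional) {
        int score = 0;
        for (String term : required) {
            if (!docTerms.contains(term)) {
                return -1;
            }
            score++;
        }
        for (String term : optional) {
            if (docTerms.contains(term)) {
                score++;
            }
        }
        return score;
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList("lucene");
        List<String> optional = Arrays.asList("ranking");
        Set<String> docA = new HashSet<>(Arrays.asList("lucene", "ranking"));
        Set<String> docB = new HashSet<>(Arrays.asList("lucene", "faq"));
        Set<String> docC = new HashSet<>(Arrays.asList("ranking", "faq"));
        System.out.println("docA: " + match(docA, required, optional));
        System.out.println("docB: " + match(docB, required, optional));
        System.out.println("docC: " + match(docC, required, optional));
    }
}
```

Here docA and docB both match, with docA ranked higher because it also 
contains the optional term, while docC is excluded because it lacks the 
required term.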

I hope this helps.
Dmitry.


Melissa Mifsud wrote:

>Hi again!
>
>I should really reword my question as follows:
>
>On which criteria are relevant documents chosen given a particular query
>
>and
>
>once retrieved, how are these documents ranked?
>
>The techniques by which this is done will then determine what type of IR model Lucene implements.
>
>Thanks again!
>
>Melissa
>





