lucene-java-user mailing list archives

From "Karsten Konrad" <Karsten.Kon...@xtramind.com>
Subject AW: Real Boolean Model in Lucene?
Date Mon, 01 Dec 2003 13:00:21 GMT

Hi,

>>
My Question: Does Lucene use TF/IDF for getting this? (which would mean it does not use the
boolean model for the boolean query...)
>>

Lucene indeed uses TF/IDF with length normalization for fields and documents. 

However, Lucene is backward compatible with the Boolean Model, in which
documents are represented as 0/1 vectors in vector space. Ranking just
adds weights to the elements of the result set, so the underlying
interpretation of a query result can still be that of a
propositional/Boolean model: if a document appears in the result,
its tokens make the query (which is in fact a propositional
formula over words and phrases) evaluate to true. Lucene's document
representation is more complex than the Boolean Model requires,
which is why Lucene can handle phrase and proximity searches
efficiently, but these appear to be compatible extensions:
if you can do it in the Boolean Model, you can do it in Lucene :)
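
To make the propositional reading concrete, here is a minimal sketch in plain
Java. It does not use Lucene at all; the class BooleanModelSketch and its
helpers term() and doc() are names made up for this illustration. A document
is reduced to its set of tokens, and a query is a formula that is either true
or false for that set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Illustration only: hypothetical names, no Lucene classes involved.
final class BooleanModelSketch {

    // An atomic formula: true if the document's token set contains the term.
    static Predicate<Set<String>> term(String t) {
        return tokens -> tokens.contains(t);
    }

    // A document in the Boolean Model is just the set of tokens it contains.
    static Set<String> doc(String... tokens) {
        return new HashSet<>(Arrays.asList(tokens));
    }

    public static void main(String[] args) {
        // "dog AND horse" as a propositional formula over terms.
        Predicate<Set<String>> query = term("dog").and(term("horse"));

        System.out.println(query.test(doc("dog", "horse", "barn"))); // true
        System.out.println(query.test(doc("dog", "cat")));           // false
    }
}
```

A ranked result list can be read the same way: every document in it makes the
formula true, and the score is extra information on top.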

One place where Lucene is not 100% compatible with a basic Boolean Model is
negation: you cannot simply ask for all documents that do not contain a
certain term unless you also have some term that appears in all documents
to pair it with. Not a big deal, really.
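
The workaround can be sketched on a toy inverted index, again in plain Java
rather than Lucene. The marker term "__all__" and the class NegationSketch
are assumptions made for this example: the marker is added to every document
at indexing time, so "everything except horse" becomes "__all__ minus horse".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: a toy inverted index, not Lucene's index format or API.
final class NegationSketch {

    // Hypothetical marker term that is indexed for every document.
    static final String ALL = "__all__";

    // Builds term -> sorted set of document ids; ids are list positions.
    static Map<String, Set<Integer>> index(List<List<String>> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            postings.computeIfAbsent(ALL, k -> new TreeSet<>()).add(id);
            for (String token : docs.get(id)) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // "NOT term": start from the postings of the marker term, then subtract.
    static Set<Integer> without(Map<String, Set<Integer>> postings, String term) {
        Set<Integer> result =
            new TreeSet<>(postings.getOrDefault(ALL, Collections.<Integer>emptySet()));
        result.removeAll(postings.getOrDefault(term, Collections.<Integer>emptySet()));
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("dog", "horse"),
            Arrays.asList("dog", "cat"),
            Arrays.asList("horse"));
        System.out.println(without(index(docs), "horse")); // [1]
    }
}
```

Without the marker term there is no posting list to subtract from, which is
why pure negation needs something that matches every document.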

If TF/IDF weighting is a problem for you, you can supply your own Similarity implementation
that removes all references to length normalization and document frequencies.
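
The effect of flattening those factors can be sketched as follows. Note the
assumptions: the Factors interface below is a simplified stand-in invented
for this example, not Lucene's Similarity API, and the score() method is a
reduced TF/IDF-style sum, not Lucene's actual scoring formula. The query is
treated as a conjunction and is assumed to be non-empty.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: hypothetical scoring factors, not Lucene's Similarity.
final class FlatScoringSketch {

    interface Factors {
        double tf(int freq);
        double idf(int docFreq, int numDocs);
        double lengthNorm(int numTokens);
    }

    // A TF/IDF-like weighting with length normalization.
    static final Factors TF_IDF = new Factors() {
        public double tf(int freq) { return Math.sqrt(freq); }
        public double idf(int docFreq, int numDocs) {
            return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
        }
        public double lengthNorm(int numTokens) { return 1.0 / Math.sqrt(numTokens); }
    };

    // Every factor is constant, so term statistics no longer influence the score.
    static final Factors FLAT = new Factors() {
        public double tf(int freq) { return 1.0; }
        public double idf(int docFreq, int numDocs) { return 1.0; }
        public double lengthNorm(int numTokens) { return 1.0; }
    };

    // Averaged per-term weight for a conjunctive query; 0 if any term is missing.
    static double score(List<String> doc, List<List<String>> corpus,
                        List<String> queryTerms, Factors f) {
        double sum = 0.0;
        for (String term : queryTerms) {
            int freq = Collections.frequency(doc, term);
            if (freq == 0) {
                return 0.0; // AND semantics: the document does not match
            }
            int docFreq = 0;
            for (List<String> d : corpus) {
                if (d.contains(term)) {
                    docFreq++;
                }
            }
            sum += f.tf(freq) * f.idf(docFreq, corpus.size()) * f.lengthNorm(doc.size());
        }
        return sum / queryTerms.size();
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("dog", "horse");
        List<String> b = Arrays.asList("dog", "dog", "horse", "barn", "hay");
        List<List<String>> corpus = Arrays.asList(a, b, Arrays.asList("cat"));
        List<String> query = Arrays.asList("dog", "horse");

        System.out.println(score(a, corpus, query, FLAT)); // 1.0
        System.out.println(score(b, corpus, query, FLAT)); // 1.0
    }
}
```

With the TF_IDF factors the two matching documents get different scores; with
FLAT every match scores 1.0, which is the plain Boolean outcome.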

Kind regards from Saarbrücken

--

Dr.-Ing. Karsten Konrad
Head of Artificial Intelligence Lab

XtraMind Technologies GmbH
Stuhlsatzenhausweg 3
D-66123 Saarbrücken
Phone: +49 (681) 3025113
Fax: +49 (681) 3025109
konrad@xtramind.com
www.xtramind.com



-----Original Message-----
From: ambiesense@gmx.de [mailto:ambiesense@gmx.de]
Sent: Monday, December 1, 2003 13:11
To: lucene-user@jakarta.apache.org
Subject: Real Boolean Model in Lucene?


Hi,

Is it possible to use a real Boolean model in Lucene for searching? When one uses the
QueryParser with a Boolean query (e.g. "dog AND horse"), one gets a list of documents from
the Hits object. However, these documents have a ranking (score).

My Question: Does Lucene use TF/IDF for getting this? (which would mean it does not use the
boolean model for the boolean query...)

How can one run a Boolean model search in which every hit has score=1? Is there an example?

Cheers,
Ralph


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


