lucene-java-user mailing list archives

From "Claudia Santos" <>
Subject Lucene retrieval model
Date Tue, 30 Dec 2008 09:03:03 GMT


I would like to know more about Lucene's retrieval model, specifically
its Boolean model.
Is it the standard Boolean model or an extended one? In other words, does Lucene
return only the documents that satisfy the Boolean expression, or does it include
in the search results every document that contains at least one of the query terms,
regardless of the Boolean connectors (AND, OR, NOT), and calculate a weight between
0 and 1 for each of those results?
The extended model gives a document that contains only one of the terms a
lower score than a document that contains both.
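
To make the distinction concrete, here is a minimal Python sketch of what I mean
by the two models. It assumes the p-norm formulation of the extended Boolean
model with binary term weights (1.0 if the term occurs in the document, 0.0
otherwise). The function names and the choice p = 2 are my own illustration;
none of this is Lucene code.

```python
def strict_and(weights):
    """Standard Boolean AND: 1.0 only if every query term is present."""
    return 1.0 if all(w > 0 for w in weights) else 0.0


def strict_or(weights):
    """Standard Boolean OR: 1.0 if at least one query term is present."""
    return 1.0 if any(w > 0 for w in weights) else 0.0


def pnorm_or(weights, p=2.0):
    """Extended Boolean OR (p-norm): grows with the number of matching terms."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights, p=2.0):
    """Extended Boolean AND (p-norm): penalizes, but does not exclude, partial matches."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


# Term weights of two hypothetical documents for a two-term query.
both_terms = [1.0, 1.0]  # document contains both query terms
one_term = [1.0, 0.0]    # document contains only the first query term

for name, weights in (("both terms", both_terms), ("one term", one_term)):
    print(name,
          "| strict AND:", strict_and(weights),
          "| p-norm AND:", round(pnorm_and(weights), 4),
          "| p-norm OR:", round(pnorm_or(weights), 4))
```

Under the standard model the one-term document is simply rejected by AND, while
under the extended model it is kept with a score strictly between 0 and 1.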

On the Apache Lucene Scoring page I did not find much about this:
"Lucene scoring uses a combination of the Vector Space Model (VSM) of
Information Retrieval and the Boolean model to determine how relevant a
given Document is to a User's query. In general, the idea behind the VSM is
the more times a query term appears in a document relative to the number of
times the term appears in all the documents in the collection, the more
relevant that document is to the query. It uses the Boolean model to first
narrow down the documents that need to be scored based on the use of boolean
logic in the Query specification. Lucene also adds some capabilities and
refinements onto this model to support boolean and fuzzy searching, but it
essentially remains a VSM based system at the heart."
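
As far as I understand the passage, it describes two stages: a Boolean filter
followed by VSM-style scoring. The Python sketch below is only my reading of it.
The sample documents, the `search` function, and the smoothed tf-idf weighting
are assumptions for illustration and are not Lucene's actual scoring formula.

```python
import math
from collections import Counter

# Hypothetical toy collection, whitespace-tokenized for simplicity.
docs = {
    "d1": "lucene scoring uses the vector space model",
    "d2": "the boolean model filters documents",
    "d3": "lucene lucene boolean query scoring",
}


def search(docs, required, optional=()):
    """Filter by required terms (Boolean stage), then rank by tf-idf (VSM-like stage)."""
    term_counts = {doc_id: Counter(text.split()) for doc_id, text in docs.items()}
    n_docs = len(term_counts)

    def idf(term):
        # Smoothed inverse document frequency (an assumed variant).
        df = sum(1 for counts in term_counts.values() if term in counts)
        return math.log((1 + n_docs) / (1 + df)) + 1.0

    query_terms = list(required) + list(optional)
    results = []
    for doc_id, counts in term_counts.items():
        # Stage 1: documents missing any required term are never scored.
        if not all(term in counts for term in required):
            continue
        # Stage 2: more occurrences of rarer terms give a higher score.
        score = sum(counts[term] * idf(term) for term in query_terms)
        results.append((doc_id, score))
    return sorted(results, key=lambda item: item[1], reverse=True)


hits = search(docs, required=["lucene"], optional=["boolean"])
for doc_id, score in hits:
    print(doc_id, round(score, 3))
```

Here "d2" is dropped by the Boolean stage because it lacks the required term,
and the remaining documents are ranked by the scoring stage. This is the
behaviour I would like to have confirmed or corrected.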

Thanks in advance for any responses