lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Lucene's use of vectors
Date Fri, 02 Mar 2012 16:09:52 GMT
On Thu, Mar 1, 2012 at 6:15 PM, Mike O'Leary <tmoleary@uw.edu> wrote:
> In the Javadoc page for the Similarity class, it says,
>
> "Lucene combines Boolean model (BM) of Information Retrieval with Vector Space Model
> (VSM) of Information Retrieval - documents "approved" by BM are scored by VSM."
>
> Is the Vector Space Model that is referred to here different than the term vectors that
> can optionally be stored in index fields?

Yes, they are different. The Javadoc refers to the model described at
http://en.wikipedia.org/wiki/Vector_space_model, and the scoring uses
statistics stored in the index. Term vectors are not used here.
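
To make "statistics stored in the index" concrete, here is a rough
sketch of textbook tf-idf scoring over a toy inverted index. It is not
Lucene's API and not its actual scoring formula; the class VsmSketch,
its methods, and the smoothed idf are assumptions made up for
illustration. The point is only that the score is computed from
index-wide numbers (document count, document frequency) plus the
per-document term frequency in the postings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: NOT Lucene's API or its real scoring formula.
final class VsmSketch {
    // Inverted index: term -> (docId -> frequency of the term in that doc).
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    private int numDocs = 0;

    // Tokenizes on whitespace, indexes the text, and returns the new doc id.
    int addDocument(String text) {
        int docId = numDocs++;
        for (String term : text.toLowerCase().split("\\s+")) {
            if (term.isEmpty()) continue;
            postings.computeIfAbsent(term, t -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
        return docId;
    }

    // Smoothed textbook idf, computed from index-wide statistics only.
    double idf(String term) {
        Map<Integer, Integer> p = postings.get(term);
        int docFreq = (p == null) ? 0 : p.size();
        return Math.log((double) (numDocs + 1) / (docFreq + 1)) + 1.0;
    }

    // Sum of tf * idf over the query terms that occur in the document.
    double score(List<String> queryTerms, int docId) {
        double sum = 0.0;
        for (String term : queryTerms) {
            Map<Integer, Integer> p = postings.get(term);
            if (p == null) continue;
            int tf = p.getOrDefault(docId, 0);
            sum += tf * idf(term);
        }
        return sum;
    }

    public static void main(String[] args) {
        VsmSketch index = new VsmSketch();
        int d0 = index.addDocument("lucene scores documents with the vector space model");
        int d1 = index.addDocument("term vectors store terms per document");
        List<String> query = Arrays.asList("vector", "model");
        System.out.println("doc " + d0 + ": " + index.score(query, d0));
        System.out.println("doc " + d1 + ": " + index.score(query, d1));
    }
}
```

Nothing in the sketch keeps a per-document list of terms: everything
needed for scoring comes from the postings and the counts derived from
them.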

Term vectors, by contrast, are really just like storing a separate
individual inverted index for each document. For example, they are
used by MoreLikeThis to retrieve the terms and frequencies from just
that one document.
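
As a rough picture of what a term vector gives you, the sketch below
builds a term -> frequency map for a single document and picks its most
frequent terms. This is not Lucene's term vector API and not the actual
MoreLikeThis algorithm; TermVectorSketch, termVector and topTerms are
hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only: NOT Lucene's term vector API or MoreLikeThis.
final class TermVectorSketch {
    // Builds the "term vector" of one document: term -> frequency in that doc.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> tv = new TreeMap<>();
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) tv.merge(term, 1, Integer::sum);
        }
        return tv;
    }

    // Returns up to n terms, most frequent first; ties are broken alphabetically.
    static List<String> topTerms(Map<String, Integer> tv, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(tv.entrySet());
        entries.sort((a, b) -> {
            int byFreq = Integer.compare(b.getValue(), a.getValue());
            return byFreq != 0 ? byFreq : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> tv = termVector("term vectors store the terms of one document term by term");
        System.out.println(tv);
        System.out.println(topTerms(tv, 2));
    }
}
```

The map is built from one document alone, with no reference to any
other document or to index-wide statistics, which is why it is useful
for "what is in this document" questions rather than for scoring.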

-- 
lucidimagination.com


