lucene-general mailing list archives

From Fredrik Andersson
Subject VSM in Lucene, again
Date Sun, 04 Sep 2005 21:44:55 GMT
Hi folks.

I read in last month's digest of this list, in a post by 
Rajesh Munavalli, that Lucene uses a VSM (vector space model) retrieval 
method. In my previous work with VSM, retrieval meant matching a query 
vector against the document vectors in the term-document space. I have 
dissected and customized a lot of Lucene's indexing and searching classes, 
but I have yet to discover where the actual dot product of the query vector 
and the document vectors is performed, if Lucene does in fact use this 
method. The method also involves a threshold angle below which a document 
is considered "close" to the query, a parameter that Lucene would benefit 
from exposing in its API. I have not seen any trace of this either. To make 
a long story short, a lot of the stuff that I usually associate with VSM and 
LSI information retrieval is missing or cleverly hidden.
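
To make concrete what I mean by the classic formulation, here is a minimal 
sketch in Python. This is not Lucene code and says nothing about how Lucene 
scores documents. The function names, the toy term weights, and the 
max_angle_degrees parameter are all made up for illustration. Each vector is 
a dict from term to weight; a document is kept only if the angle between it 
and the query is within the chosen limit.

```python
import math


def cosine_similarity(query, doc):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    # Dot product over the terms of the query; absent terms contribute 0.
    dot = sum(weight * doc.get(term, 0.0) for term, weight in query.items())
    norm_q = math.sqrt(sum(w * w for w in query.values()))
    norm_d = math.sqrt(sum(w * w for w in doc.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # an empty vector has no direction, so treat it as no match
    return dot / (norm_q * norm_d)


def retrieve(query, docs, max_angle_degrees):
    """Return (name, cosine) pairs within the angle limit, best match first."""
    # A smaller angle means a larger cosine, so the angle limit becomes
    # a lower bound on the cosine.
    threshold = math.cos(math.radians(max_angle_degrees))
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy term-document space with hypothetical weights.
docs = {
    "d1": {"lucene": 1.0, "search": 1.0},
    "d2": {"cooking": 1.0},
    "d3": {"lucene": 1.0},
}
query = {"lucene": 1.0}

loose = retrieve(query, docs, 60)   # d3 (0 degrees) and d1 (45 degrees) pass
strict = retrieve(query, docs, 30)  # only d3 passes
```

With a 60 degree limit both d3 and d1 come back, in that order; tightening 
the limit to 30 degrees leaves only d3. It is this kind of explicit vector 
comparison and cut-off that I have been looking for.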

If someone could shed some light on this issue, I would be very thankful. 
It's probably just that we have different notions of VSM, but I'd 
like to get this straightened out.

