lucene-java-user mailing list archives

From "Chong, Herb" <HCho...@bloomberg.com>
Subject RE: Vector Space Model in Lucene?
Date Fri, 14 Nov 2003 19:32:49 GMT
When people type in multiword queries, they are mostly interested in phrases in the linguistic
sense. Phrases don't cross sentence boundaries. You need certain features in the index and
in the ranking algorithm to capture that distinction and to rank documents that actually contain
the phrase higher than documents that just happen to contain the same words. The ranking also
has to accommodate the human tendency to leave off words after mentioning the full form of
the phrase once.
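
To make the distinction concrete, here is a minimal sketch in plain Python. It does not use Lucene, and it is not how Lucene ranks anything. The function names, the naive sentence splitter, and the phrase_boost value are all assumptions made up for this illustration. It only shows the idea that a document containing the query as a phrase inside one sentence should outrank a document that merely contains the same words. It does not attempt the harder part, shortened mentions of a phrase after its first full use.

```python
import re


def sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokens(text):
    # Lowercase alphanumeric word tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(words, phrase):
    # True if `phrase` occurs as a contiguous run inside `words`.
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def score(text, query, phrase_boost=2.0):
    # phrase_boost is a hypothetical constant chosen only for illustration.
    phrase = tokens(query)
    if not phrase:
        return 0.0
    per_sentence = [tokens(s) for s in sentences(text)]
    all_words = {w for words in per_sentence for w in words}
    query_terms = set(phrase)
    # Bag-of-words part: fraction of query terms present anywhere in the document.
    base = sum(1 for t in query_terms if t in all_words) / len(query_terms)
    # Phrase part: the boost applies only when the whole phrase sits in one sentence,
    # so a match that straddles a sentence boundary does not count.
    if any(contains_phrase(words, phrase) for words in per_sentence):
        return base + phrase_boost
    return base


query = "vector space model"
doc_a = "The vector space model ranks documents. It is widely used."
doc_b = "He drew a vector. Space model kits were on the shelf."
print(score(doc_a, query), score(doc_b, query))
```

Both sample documents contain all three query words, and in doc_b they are even adjacent in the raw text, but only doc_a has them together within a single sentence, so only doc_a receives the boost.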

Herb....

-----Original Message-----
From: Dror Matalon [mailto:dror@zapatec.com]
Sent: Friday, November 14, 2003 2:28 PM
To: Lucene Users List
Subject: Re: Vector Space Model in Lucene?


Hi,

I might be the only person on the list who's having a hard time
following this discussion. Would one of you wise folks care to point me
to a good "for dummies" resource, that is, an executive summary, about
the theoretical background of all of this? I understand the basic
premise of collecting the "words" and having pointers to documents and
weights, but beyond that ...

TIA,

Dror


