lucene-java-user mailing list archives

From "Chong, Herb" <>
Subject RE: inter-term correlation [was Re: Vector Space Model in Lucene?]
Date Mon, 17 Nov 2003 13:56:59 GMT
I am not implying that a match across a sentence boundary is rejected; I am saying that it
receives a lower score than a match within a single sentence.
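
To make the idea concrete, here is a minimal sketch of that kind of scoring. It is not Lucene code and not an existing Lucene API: the function names, the naive punctuation-based sentence splitter, and the `boundary_penalty` value are all assumptions made up for illustration. A pair of terms scores 1.0 when both occur in the same sentence, and the score is multiplied by the penalty for each sentence boundary between their closest occurrences.

```python
import re


def tokens_with_sentence(text):
    """Return (token, sentence_number) pairs using a naive splitter."""
    pairs = []
    # Assumption: '.', '!' or '?' ends a sentence. Real text is messier.
    for number, sentence in enumerate(re.split(r"[.!?]+", text)):
        for token in re.findall(r"\w+", sentence.lower()):
            pairs.append((token, number))
    return pairs


def pair_score(text, term_a, term_b, boundary_penalty=0.5):
    """Score a two-term match, lowered for each sentence boundary crossed.

    boundary_penalty is a hypothetical tuning parameter, not a Lucene setting.
    """
    pairs = tokens_with_sentence(text)
    sents_a = [s for t, s in pairs if t == term_a.lower()]
    sents_b = [s for t, s in pairs if t == term_b.lower()]
    if not sents_a or not sents_b:
        return 0.0  # one of the terms does not occur at all
    # Use the closest pair of occurrences, measured in sentences.
    crossed = min(abs(a - b) for a in sents_a for b in sents_b)
    return boundary_penalty ** crossed


if __name__ == "__main__":
    print(pair_score("Lucene scores terms.", "lucene", "terms"))
    print(pair_score("Lucene is fast. Terms are scored.", "lucene", "terms"))
```

The match across the boundary is still a match; it simply ranks below the match within one sentence. The splitter is the weak point of the sketch, since detecting sentence boundaries reliably is not trivial.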


-----Original Message-----
From: Tatu Saloranta []
Sent: Friday, November 14, 2003 8:15 PM
To: Lucene Users List
Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

Isn't that quite a strict interpretation, however? There are many cases where
linguistically separate sentences do have strong dependencies; in the web world,
simple things like list items may be very closely related. Put another way:
it may not be trivially easy to detect sentence boundaries, nor is it certain
that what is a boundary from a language viewpoint really is a hard boundary
from a semantic perspective. And are there not varying levels of separation
between sentences, not just one (sentences close to each other are often
related, back references being common)?
