lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Morus Walter <morus.wal...@tanto-xipolis.de>
Subject RE: Indexing Speed: Documents vs. Sentences
Date Thu, 18 Dec 2003 08:39:15 GMT
Jochen Frey writes:
> Dan, I will send you a separate e-mail directly to your address.
> 
> In the meanwhile, I hope to get input from other people. Maybe someone else
> knows how to solve my original problem below.
> 
recently there was a discussion on this list, how to use token positions
to store a sentence boundary in the index.
Basically you increase the token position by some large number (e.g. 1000)
at a sentence boundary. Then "x y"~20 won't match across sentence boundaries
(unless you use something like "x y"~1020).

Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message