lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <>
Subject Sentence boundary storage
Date Fri, 28 Oct 2005 21:46:08 GMT

Was wondering what people's experience is with storing sentence (or 
other) boundary information in Lucene.  For instance, for phrase 
queries, you may not want to match when two terms lie on either side of 
a sentence boundary.  I know for phrase queries the common approach is 
to make the position increment larger than one, which solves that 
immediate problem, but I have other uses for such information, too.  
Should I just store some type of boundary marker at the appropriate 
position and check to see if I have a boundary marker when doing my 
processing?  I know I need an Analyzer that can detect the boundaries, 
for starters.  What other issues have people run up against?


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message