lucene-java-user mailing list archives

From Erik Hatcher <>
Subject Re: Preventing phrase queries from matching across lines
Date Fri, 28 Apr 2006 14:57:18 GMT

On Apr 28, 2006, at 5:35 AM, Eric Jain wrote:
> What is the best way to prevent a phrase query such as "eggs white"  
> matching "fried eggs\nwhite snow"?
> Two possibilities I have thought about:
> 1. Replace all line breaks with a special string, e.g. "newline".
> 2. Have an analyzer somehow increment the position of a term for  
> each line break it encounters.
> Latter seems a bit more complicated to implement, but it would also  
> be more efficient, right? Or are there better options?

#2 shouldn't be too hard to implement, but you'll need to catch newlines
in the initial tokenizer. I'm not sure about the efficiency, though:
both options would require a tokenizer that detects newlines and then
either injects a new term or sets a flag so that the next term gets a
position increment bump.
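
To illustrate the flag approach from #2, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: the class LineGapPositions, the Term holder, the LINE_GAP value and both methods are hypothetical names made up for this example. It only shows the bookkeeping a real tokenizer would have to do, namely remembering that a line break was seen and giving the next term a larger position increment, so that an exact phrase match can no longer span two lines.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: simulates term positions, not a Lucene Tokenizer.
class LineGapPositions {
    // Assumed gap size; anything larger than the phrase slop in use would do.
    static final int LINE_GAP = 10;

    // A term together with its position in the token stream.
    static final class Term {
        final String text;
        final int position;

        Term(String text, int position) {
            this.text = text;
            this.position = position;
        }
    }

    // Splits on whitespace and bumps the position of the first term
    // that follows a line break.
    static List<Term> tokenize(String input) {
        List<Term> terms = new ArrayList<>();
        int position = -1;
        boolean pendingGap = false; // the "flag" set when a newline was seen
        for (String line : input.split("\n")) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                position += pendingGap ? 1 + LINE_GAP : 1;
                pendingGap = false;
                terms.add(new Term(word.toLowerCase(), position));
            }
            // Only request a gap once at least one term has been emitted.
            pendingGap = !terms.isEmpty();
        }
        return terms;
    }

    // Exact phrase match: the words must sit at consecutive positions.
    static boolean matchesPhrase(List<Term> terms, String... phrase) {
        for (int i = 0; i + phrase.length <= terms.size(); i++) {
            boolean ok = true;
            for (int j = 0; j < phrase.length && ok; j++) {
                Term t = terms.get(i + j);
                ok = t.text.equals(phrase[j])
                        && t.position == terms.get(i).position + j;
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Term> twoLines = tokenize("fried eggs\nwhite snow");
        List<Term> oneLine = tokenize("fried eggs white snow");
        System.out.println(matchesPhrase(twoLines, "eggs", "white")); // false
        System.out.println(matchesPhrase(oneLine, "eggs", "white"));  // true
    }
}
```

With these assumptions "eggs" ends up at position 1 and "white" at position 12 in the two-line input, so the phrase no longer matches. In Lucene itself the same effect would come from the tokenizer raising the position increment on the first token after a line break.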
