lucene-java-user mailing list archives

From Brandon Mintern <mint...@easyesi.com>
Subject ComplexPhraseQueryParser and stop words
Date Fri, 26 Oct 2012 22:37:22 GMT
We recently switched from QueryParser to ComplexPhraseQueryParser
(from lucene-queryparser-3.6.0.jar), and we've come across two
separate problems.

The first is that, because it parses quoted expressions twice, any
escaped character inside quotes must be double-escaped. For example, to
keep users from using : as query syntax, I escape it as \:. Inside
quotes, though, I have to write \\:, because the first parse turns \\
into \ and the second parse then applies the actual escape. Likewise, I
need to escape \ because it shows up frequently in paths. Outside
quotes, this is simply \\. Inside quotes, it must be \\\\.
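
A minimal sketch of the double-escaping workaround described above, in
plain Java with no Lucene dependency. The class and method names
(PhraseEscaper, escapeOnce, escapeForQuotedPhrase) are hypothetical, and
the escaped character set is limited to the two characters mentioned
here (: and \). A real query syntax has more special characters, and a
literal " inside a phrase is not handled.

```java
final class PhraseEscaper {

    // Escape for use outside quotes: prefix ':' and '\' with one backslash.
    // Assumption: only these two characters need escaping in this sketch.
    static String escapeOnce(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == ':' || c == '\\') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Escape for use inside a quoted phrase that is parsed twice:
    // escape once, then double every backslash so that the first parse
    // reduces the text back to the single-escaped form.
    static String escapeForQuotedPhrase(String input) {
        return escapeOnce(input).replace("\\", "\\\\");
    }
}
```

With this sketch, a : becomes \: outside quotes and \\: inside quotes,
and a single \ becomes \\ outside quotes and \\\\ inside quotes, which
matches the behavior described above.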

So that was a minor issue, but we were able to work around it without
too much trouble. This next problem, though, does not seem to have an
easy answer.

Our searches for quoted phrases that include stop words no longer
match. If a document contains the phrase "time to leave", only "time"
and "leave" get indexed, but their positions are maintained so that a
later search for "time to leave" works correctly. With the standard
QueryParser, this worked just fine. With the ComplexPhraseQueryParser,
it no longer works at all. Searching for time AND leave works, but
"time to leave" simply fails.
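
To make the expected behavior concrete, here is a toy model in plain
Java of position-preserving stop word removal and phrase matching. It
does not use Lucene and is not how either parser is implemented; the
class name, the method names, and the stop word set are all assumptions
made for illustration. A removed stop word leaves an empty slot, so the
surviving terms keep their original positions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class StopWordPositions {

    // Hypothetical stop word set; a real analyzer has its own list.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("to", "the", "a"));

    // Split on whitespace and replace stop words with null, so that
    // "time to leave" becomes [time, null, leave] (positions 0 and 2).
    static String[] analyze(String text) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (STOP_WORDS.contains(tokens[i])) {
                tokens[i] = null;
            }
        }
        return tokens;
    }

    // A phrase matches if, at some offset, every surviving phrase term
    // lines up with the same term at the same relative position.
    // Empty slots in the phrase match anything at that position.
    static boolean phraseMatches(String docText, String phrase) {
        String[] doc = analyze(docText);
        String[] query = analyze(phrase);
        for (int start = 0; start + query.length <= doc.length; start++) {
            boolean ok = true;
            for (int j = 0; j < query.length; j++) {
                if (query[j] != null && !query[j].equals(doc[start + j])) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, the phrase "time to leave" matches a document containing
"time to leave" because the gap left by "to" is kept on both sides. If
the query side dropped the gap and looked for "time" immediately
followed by "leave", the match would fail, which is one way a symptom
like the one above could arise.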

Does anyone know where I should start in solving this issue?

Thanks,
Brandon
