lucene-java-user mailing list archives

From SBS <jturn...@uow.edu.au>
Subject Strange StopFilter and stop words behaviour
Date Tue, 26 Jul 2011 03:25:13 GMT
My goal is to get meaningful results from search queries that
include words on the default stop words list, especially
"not".  I am using the StandardAnalyzer, and I have tried passing both null
and an empty set as the stop words argument to its constructor, hoping that
no words would be stripped, but I am getting strange results.

If I enter a query of just the word "not" I get no matches.  If I run a
query with just the word "included" I get lots of matches.  If I run the
query "not included" (without surrounding quotation marks) I get lots of
matches and the highlighter indicates that "not" is one of the matching
fragments.  But if I run the phrase query "not included" (with surrounding
quotation marks) I get no matches, even though the content contains many
occurrences of that exact phrase, which matched when I entered the same
query without the quotation marks.

What's going on here?  Why can't I search for the word "not" by itself or
inside a quoted phrase?  Similar behaviour occurs for other words like "the",
even though I am explicitly telling the analyzer not to remove any words (or
so I believe).  How can I configure a StandardAnalyzer so that every word in
the query is significant?
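
One way these symptoms could arise, offered as an assumption rather than a confirmed diagnosis, is a mismatch between index time and query time: the index was built while stop words were still being removed, and only the query-side analyzer was later given an empty stop set. The message does not say how the index was built, so this is a guess. The sketch below is a toy inverted index in plain Java, not Lucene code; the class `StopWordMismatchDemo`, its methods, and the abbreviated stop list are all hypothetical names made up for illustration. Under that assumption, a highlighter that re-analyzes the stored text with the query-side analyzer would still see and mark "not", even though the term is absent from the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these classes or methods are Lucene APIs.
class StopWordMismatchDemo {

    // Lowercase and split on non-letters. A removed stop word leaves a null
    // placeholder so that the remaining tokens keep their original positions.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (raw.isEmpty()) continue;
            tokens.add(stopWords.contains(raw) ? null : raw);
        }
        return tokens;
    }

    // term -> (document id -> positions of the term in that document)
    static Map<String, Map<Integer, List<Integer>>> buildIndex(List<String> docs, Set<String> stopWords) {
        Map<String, Map<Integer, List<Integer>>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            List<String> tokens = analyze(docs.get(docId), stopWords);
            for (int pos = 0; pos < tokens.size(); pos++) {
                String term = tokens.get(pos);
                if (term == null) continue; // stop word: never reaches the index
                index.computeIfAbsent(term, t -> new HashMap<>())
                     .computeIfAbsent(docId, d -> new ArrayList<>())
                     .add(pos);
            }
        }
        return index;
    }

    // Unquoted query: a document matches if it contains ANY of the query terms.
    static Set<Integer> anyTerm(Map<String, Map<Integer, List<Integer>>> index, String query, Set<String> stopWords) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyze(query, stopWords)) {
            if (term != null && index.containsKey(term)) hits.addAll(index.get(term).keySet());
        }
        return hits;
    }

    // Quoted query: all terms must occur at consecutive positions.
    // Toy limitation: a phrase whose first term was removed matches nothing.
    static Set<Integer> phrase(Map<String, Map<Integer, List<Integer>>> index, String query, Set<String> stopWords) {
        List<String> terms = analyze(query, stopWords);
        Set<Integer> hits = new TreeSet<>();
        if (terms.isEmpty() || terms.get(0) == null) return hits;
        Map<Integer, List<Integer>> first = index.getOrDefault(terms.get(0), Collections.emptyMap());
        for (Map.Entry<Integer, List<Integer>> e : first.entrySet()) {
            for (int start : e.getValue()) {
                if (matchesAt(index, terms, e.getKey(), start)) { hits.add(e.getKey()); break; }
            }
        }
        return hits;
    }

    static boolean matchesAt(Map<String, Map<Integer, List<Integer>>> index, List<String> terms, int docId, int start) {
        for (int i = 1; i < terms.size(); i++) {
            String term = terms.get(i);
            if (term == null) continue; // removed query term: any token may sit here
            List<Integer> positions = index.getOrDefault(term, Collections.emptyMap())
                                           .getOrDefault(docId, Collections.emptyList());
            if (!positions.contains(start + i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "Batteries are not included in the box",
            "Shipping is included");
        // Abbreviated stand-in for a default English stop list.
        Set<String> someStops = new HashSet<>(Arrays.asList("not", "the", "in", "are", "is"));
        Set<String> noStops = Collections.emptySet();

        // Assumed scenario: indexed WITH stop words, queried WITHOUT them.
        Map<String, Map<Integer, List<Integer>>> index = buildIndex(docs, someStops);
        System.out.println(anyTerm(index, "not", noStops));          // []
        System.out.println(anyTerm(index, "included", noStops));     // [0, 1]
        System.out.println(anyTerm(index, "not included", noStops)); // [0, 1]
        System.out.println(phrase(index, "not included", noStops));  // []

        // Same documents re-indexed with the empty stop set.
        Map<String, Map<Integer, List<Integer>>> reindexed = buildIndex(docs, noStops);
        System.out.println(anyTerm(reindexed, "not", noStops));         // [0]
        System.out.println(phrase(reindexed, "not included", noStops)); // [0]
    }
}
```

In this toy model the first four results mirror the four observations above, and the last two show both queries matching once the documents are re-indexed with the same empty stop set that the query side uses. Whether that applies here depends on how the existing index was created.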

Thanks,

-sbs
