lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: Strange StopFilter and stop words behaviour
Date Tue, 26 Jul 2011 10:56:57 GMT
I think that passing an empty set or null to StandardAnalyzer should
do what you want.  There are useful tips at
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F.

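My guess is that you aren't using the same no-stop-words version of
StandardAnalyzer at both index time and query time.  If the index was
built with the default stop words, "not" was never indexed, so it can't
match on its own or as part of a quoted phrase, whatever analyzer the
query uses.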
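
To illustrate the index-time/query-time point, here is a rough sketch in
plain Java.  It does not use Lucene at all: analyze() and phraseMatches()
are hypothetical toy stand-ins for an analyzer and a phrase query, and the
stop list is an assumed three-word sample, not the real default set.  It
also ignores details such as position increments, so treat it only as a
model of why the analyzers need to agree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model only: none of these names are Lucene APIs.
final class StopWordMismatchDemo {

    // Hypothetical analyzer: lowercase, split on non-letters, drop stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical phrase query: the query tokens must appear consecutively.
    static boolean phraseMatches(List<String> indexed, List<String> query) {
        return !query.isEmpty() && Collections.indexOfSubList(indexed, query) >= 0;
    }

    // Hypothetical unquoted query: any single query token may match.
    static boolean anyTermMatches(List<String> indexed, List<String> query) {
        for (String t : query) {
            if (indexed.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed sample stop list, standing in for the default one.
        Set<String> sampleStops = new HashSet<String>(Arrays.asList("not", "the", "a"));
        Set<String> noStops = Collections.<String>emptySet();
        String doc = "Batteries are not included";

        // Query analyzed without stop words in both cases below.
        List<String> query = analyze("not included", noStops);

        // Case 1: index built WITH stop words, so "not" never reached the index.
        List<String> mismatched = analyze(doc, sampleStops);
        System.out.println("mismatched, unquoted: " + anyTermMatches(mismatched, query));
        System.out.println("mismatched, phrase:   " + phraseMatches(mismatched, query));

        // Case 2: index rebuilt WITHOUT stop words, matching the query side.
        List<String> consistent = analyze(doc, noStops);
        System.out.println("consistent, phrase:   " + phraseMatches(consistent, query));
    }
}
```

In this toy model the mismatched case still matches the unquoted query
(through "included") but fails the phrase, which is the same pattern you
describe.  If that is what is happening, the fix would be to re-index
with the no-stop-words analyzer, not just to change the query side.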


--
Ian.

On Tue, Jul 26, 2011 at 4:25 AM, SBS <jturnbul@uow.edu.au> wrote:
> My goal is to be able to get meaningful results from search queries that
> include some words that are on the default stop words list, especially
> "not".  I am using the StandardAnalyzer and I have tried passing in null and
> an empty set for the set of stop words to use in the constructor hoping that
> no words would be stripped but I am getting strange results.
>
> If I enter a query of just the word "not" I get no matches.  If I run a
> query with just the word "included" I get lots of matches.  If I run the
> query "not included" (without surrounding quotation marks) I get lots of
> matches and the highlighter indicates that "not" is one of the matching
> fragments.  But if I run the query ""not included"" (with surrounding
> quotation marks) I get no matches even though there are many occurrences in
> the content of that exact phrase which were matched when I entered the same
> query without the quotation marks.
>
> What's going on here?  Why can't I search for the word "not" by itself or in
> a quote?  Similar behaviour happens for other words like "the" but I am
> explicitly telling the analyzer not to remove any words (or so I believe).
> How can I achieve a StandardAnalyzer where every word in the query is
> significant?
>
> Thanks,
>
> -sbs
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Strange-StopFilter-and-stop-words-behaviour-tp3199367p3199367.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>