lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: Search query problem
Date Fri, 08 Jan 2010 20:11:15 GMT
Looks like PorterStemFilter converts "Lowe's" to low.  Not very surprising.

Options include

.  Drop the stemming

.  Index stemmed and non-stemmed variants and search both, maybe
boosting the non-stemmed variant.

If you really want exact matches only, you may also/instead want
untokenized fields.  Apostrophes etc can be a problem.  Look into what
analyzers do and use Luke to see what is indexed.

--
Ian.


On Fri, Jan 8, 2010 at 8:01 PM, Jamie <jamie@stimulussoft.com> wrote:
> Hi There
>
> We are trying to search for the exact word "Lowe's" across a large set of
> indexed data. Our results include everything with "low" in it. Thus, we are
> receiving a much larger data set that we expected. The data is indexing
> using the analyzer:
>           TokenStream result = new StandardTokenizer(reader);
>           result = new StandardFilter(result);
>           result = new LowerCaseFilter(result);
>           result = new StopFilter(result, stopTable);
>           result = new PorterStemFilter(result);
>           return result;
>
> Thanks
>
> Jamie
>
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message