lucene-general mailing list archives

From "Jenny Brown" <skyw...@gmail.com>
Subject Re: Searching using dash in words/abbreviations
Date Fri, 12 Dec 2008 19:46:53 GMT
Is it possible to configure Lucene so that it doesn't tokenize on
embedded dashes, and therefore doesn't treat the "A" as a stop word
when it isn't standing alone?  I believe the combination of dash
handling and stop words is why the query is causing problems for my
user.
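
To make the suspected interaction concrete, here is a minimal plain-Java
sketch. It is not Lucene code and does not reproduce StandardAnalyzer
exactly: the class name, both method names, and the three-word stop list
are hypothetical and chosen only for illustration. It compares splitting
on dashes before stop-word removal with splitting on whitespace only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class DashTokenDemo {
    // Hypothetical stop-word list for illustration; not Lucene's actual default set.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    // Splits on every non-alphanumeric character (so dashes break words),
    // then drops stop words.
    static List<String> splitOnDashes(String text) {
        return filter(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"));
    }

    // Splits on whitespace only, trims punctuation from the edges of each
    // token, then drops stop words. Embedded dashes survive.
    static List<String> keepDashes(String text) {
        String[] raw = text.toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < raw.length; i++) {
            raw[i] = raw[i].replaceAll("^[^a-z0-9]+|[^a-z0-9]+$", "");
        }
        return filter(raw);
    }

    // Removes empty tokens and stop words, preserving order.
    private static List<String> filter(String[] tokens) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(splitOnDashes("A-B-C Inc.")); // [b, c, inc]
        System.out.println(keepDashes("A-B-C Inc."));    // [a-b-c, inc]
    }
}
```

With dash splitting, the "A" becomes a standalone token and is dropped as
a stop word; with whitespace-only splitting, "a-b-c" stays one token. In
Lucene itself the analogous options would presumably be an analyzer that
splits on whitespace only (such as WhitespaceAnalyzer) or a
StandardAnalyzer built with an empty stop-word list; the exact
constructors depend on the version, so treat that as an assumption to
check. Either way, the same analysis has to be applied at index time and
at query time, and keeping the dashes does not by itself make "A-B-C"
match "ABC".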


On Fri, Dec 12, 2008 at 1:32 PM, Daniel Naber
<lucenelist2007@danielnaber.de> wrote:
> On Freitag, 12. Dezember 2008, Jenny Brown wrote:
>
>> I'm trying to search for company ABC Inc. in places where it may be
>> mentioned as A-B-C Inc.  Lucene is doing something with those dashes,
>> though, that prevents me from getting accurate results.
>
> "A" (even in "A-B-C" I think) is a stopword with StandardAnalyzer's default
> settings, which might cause problems. Please also check out these hints
> from the FAQ:
>
> http://wiki.apache.org/lucene-java/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71
>
> Regards
>  Daniel
>
> --
> http://www.danielnaber.de
>
