lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Inquiry on Lucene Stemming
Date Tue, 16 Dec 2008 14:14:13 GMT
Why do you want to do this? The reason I ask is that you're
making each clause very complex.....

For a single term, it's not very complex, but for something like
((A AND B) OR (C AND D)) NOT X

expanding A, B, C, D and X to, possibly many terms is...er...ugly.

You could think about ngrams, although I confess I've only seen
this on the lists, haven't worked with it myself.

If your goal is to be able to search exact match words (i.e. you
need to find "flash" when exactly "flash" was indexed, not "flashing")
there are better strategies....

So a bit more explanation of the problem could perhaps generate more
helpful responses.

Best
Erick

On Tue, Dec 16, 2008 at 7:18 AM, Jay Joel Malaluan <exst_jmalaluan@yahoo.com
> wrote:

> Hi,
>
>
> Can anyone comment if my understanding of the stemming process in Lucene is
> correct. From my testing using the SnowballAnalyzer, if I passed this word
> "flashing" it will be trimmed to a root word "flash" and this root word
> ("flash") will be the one searched not the original word "flashing".
>
> Is there an API in Lucene or third-party APIs that can do the following, I
> passed the word "flash" instead it will search for "flashing", "flashed",
> "flashes" etc.?
>
>
> Regards,
> Jay Malaluan
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message