lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Storing Stemmed and Original Tokens
Date Mon, 22 Jan 2007 13:27:57 GMT
Take a look at the book Lucene In Action, particularly the SynonymAnalyzer
example. It shows you how to store multiple tokens at the same offset in a
document, and sounds like what you need. The basic idea is to use
SetNextPositionIncrement(0) on the 2-nth tokens you want to wind up in the
same position.

At least that's my guess <G>..

Best
Erick

On 1/22/07, hannes <hannes@hcmeyer.com> wrote:
>
> Hi All,
>
> I'm using the SnowballAnalyzer to "stemm" tokens - which is working fine!
>
> Now I got the requirement to also keep the original Tokens in the index
> for search. According to this
>
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/200302.mbox/%3C187D6D956106D84E9D8B280F6458FE140F5B2A@merc12.na.sas.com%3E
> Mail, I extended the SnowballAnalyzer
> in the way Eric described it.
>
> Does anyone has experience in storing stemmed and original tokens in the
> same field and same position? Is it the "right" way to do it?
>
> I also found some Discussions about storing stemmed tokens in an extra
> field, but that would mean I would have to rewrite the query ...
>
>
> thanks
> hannes
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message