lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dan Quaroni <dquar...@OPENRATINGS.com>
Subject RE: derive tokens from single token
Date Mon, 29 Sep 2003 13:25:34 GMT
My understanding is that the best way to do this is to create an extra field
that is reversed.  That way you store foobar and raboof, and when someone
wants to do a left-wildcard, you search in the reversed field so that they
search for:

*bar

And you reverse that and perform a search on the reversed field for:

rab*

And find your foobar.

-----Original Message-----
From: Hackl, Rene [mailto:Rene.Hackl@FIZ-Karlsruhe.DE]
Sent: Monday, September 29, 2003 9:20 AM
To: 'lucene-user@jakarta.apache.org'
Subject: derive tokens from single token


Hi All,

I'm looking for a way to implement simultaneous left and right truncation. 

The goal is to enable the user to search for e.g. "*hydronaphth*" and find
"hexahydronaphthalene" as well as "heptahydronaphthalin".

To achieve that functionality, I'd like to index terms in the way that from
a token "foobar" the tokens "oobar" and "obar" ( e.g. mininum word length =
4)
would be derived and added to the index. I tried to extend TokenFilter, but 
all I get is either "oobar" or "obar", depends on when 'return' is called. 

How could I add such extra tokens to the tokenStream? Any thoughts on this
appreciated.

Best regards,

René Hackl

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Mime
View raw message