lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dan Quaroni <dquar...@OPENRATINGS.com>
Subject RE: derive tokens from single token
Date Mon, 29 Sep 2003 13:30:54 GMT
Right, so I got all confused with the chemistry -- That solution won't work.

If you're just allowing them to do wildcards by tokenizing on the syntax of
chemistry, then you could split the words up according to that..

-----Original Message-----
From: Hackl, Rene [mailto:Rene.Hackl@FIZ-Karlsruhe.DE]
Sent: Monday, September 29, 2003 9:20 AM
To: 'lucene-user@jakarta.apache.org'
Subject: derive tokens from single token


Hi All,

I'm looking for a way to implement simultaneous left and right truncation. 

The goal is to enable the user to search for e.g. "*hydronaphth*" and find
"hexahydronaphthalene" as well as "heptahydronaphthalin".

To achieve that functionality, I'd like to index terms in the way that from
a token "foobar" the tokens "oobar" and "obar" ( e.g. mininum word length =
4)
would be derived and added to the index. I tried to extend TokenFilter, but 
all I get is either "oobar" or "obar", depends on when 'return' is called. 

How could I add such extra tokens to the tokenStream? Any thoughts on this
appreciated.

Best regards,

René Hackl

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Mime
View raw message