lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pleasant, Tracy" <>
Subject RE: Tokenizing text custom way
Date Tue, 25 Nov 2003 16:04:13 GMT
Not exactly and answer to the question but I haven't yet used the Token classes/functionality
that came with Lucene. Can someone give me an idea of how and why one may use this?


-----Original Message-----
From: Dragan Jotanovic []
Sent: Tuesday, November 25, 2003 6:42 AM
To: Lucene Users List
Subject: Tokenizing text custom way

Hi. I need to tokenize text while indexing but I don't want space to be delimiter. Delimiter
should be my custom character (for example comma). I understand that I would probably need
to implement my own analyzer, but could someone help me where to start. Is there any other
way to do this without writing custom analyzer?

This is what I want to achieve.
If I have some text that will be indexed like following:

man, people, time out, sun

and if I enter 'time' as a search word, I don't want to get "time out" in results. I need
exact keyword matching. I would achieve this if I tokenize "time out" as one token while idexing.

Maybe someone had similar problem? If someone knows how to handle this, please help me.

Dragan Jotanovic

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message