lucene-java-user mailing list archives

From <>
Subject Tokenize on another character
Date Mon, 31 Mar 2008 09:42:01 GMT

I just joined the list and need some help.

I have a database of music tracks that have been added to an index.
Each track is classified with keywords, up to 20 per track. I put the
keywords into a "keyword" field which is tokenized but not stored. The
problem is this: if a user searches for a specific keyword such as
"ROCK", the search returns ROCK AND ROLL tracks as well as ROCK tracks.
I realise this is due to the tokenization of the keyword field. My
question is: how can I stop the analyser from tokenizing on the space
character and have it tokenize on a character I specify instead? That
way, if I chose to tokenize on a comma, I could add a comma at the end
of every keyword. Or have I gone about this the wrong way?
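
To illustrate the behaviour I mean, here is a rough plain-Java sketch.
It is not Lucene code: the class and method names are made up for this
example, and it only mimics exact term matching after splitting a
keyword string on a chosen delimiter.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; hypothetical names, no Lucene API involved.
class KeywordTokenizeSketch {

    // Split a keyword field value on the given delimiter regex,
    // trimming and upper-casing each resulting term.
    static Set<String> terms(String fieldValue, String delimiterRegex) {
        Set<String> terms = new LinkedHashSet<>();
        for (String part : fieldValue.split(delimiterRegex)) {
            String term = part.trim().toUpperCase(Locale.ROOT);
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // A track matches when the query equals one of its terms exactly.
    static boolean matches(Set<String> indexedTerms, String query) {
        return indexedTerms.contains(query.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String keywords = "ROCK AND ROLL, BLUES";
        // Splitting on whitespace yields ROCK as its own term.
        System.out.println(matches(terms(keywords, "\\s+"), "ROCK"));
        // Splitting on commas keeps ROCK AND ROLL as a single term.
        System.out.println(matches(terms(keywords, ","), "ROCK"));
    }
}
```

With whitespace splitting, a search for ROCK hits the ROCK AND ROLL
track; with comma splitting it does not, which is the result I am
after.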

Many thanks, any insight will be appreciated.

