lucene-java-user mailing list archives

From "Christian Schrader" <schrader.n...@evendi.de>
Subject JavaCC Tokenizer
Date Wed, 29 May 2002 09:19:19 GMT
I need to construct a Tokenizer that splits at whitespace and at every
letter/digit boundary, so that "IBM Deskstar IC35L060AVER07" would result in
the following tokens:
IBM
Deskstar
IC
35
L
060
AVER
07
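
To make the desired behaviour concrete, here is a minimal sketch in plain Java using a regular expression. It does not use Lucene, JavaCC, or the StandardTokenizer grammar; the class name AlnumBoundarySplitter and the method split are hypothetical names chosen for illustration only. It assumes a token is any maximal run of letters or any maximal run of digits, and that everything else (whitespace, punctuation) is a separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: splits text at
// letter/digit boundaries as well as at non-alphanumeric characters.
public class AlnumBoundarySplitter {

    // A token is a maximal run of letters OR a maximal run of decimal digits.
    private static final Pattern RUN = Pattern.compile("\\p{L}+|\\p{Nd}+");

    public static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = RUN.matcher(text);
        // Each successful find() yields the next letter run or digit run.
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [IBM, Deskstar, IC, 35, L, 060, AVER, 07]
        System.out.println(split("IBM Deskstar IC35L060AVER07"));
    }
}
```

The same matching logic could presumably be wrapped in a custom Tokenizer subclass, but how that fits the Lucene API is left open here.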

Has anybody solved this with the StandardTokenizer?

Christian



