lucene-java-user mailing list archives

From Peter Pimley <ppim...@semantico.com>
Subject LetterTokenizer to allow digits
Date Fri, 05 Nov 2004 17:16:26 GMT

Hi everybody,

I have just had to subclass CharTokenizer with a class that tests 
characters against Character.isLetterOrDigit.  I would use a 
LetterTokenizer, but it's important for me to allow numbers through, as 
the documents I'm indexing often contain years such as '2000' or '1945'.

Obviously it only takes a few lines to do this, but I'm sure I'm not the 
first person to have had to do it.  May I request that LetterTokenizer 
be given an 'AllowDigits' property?
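
To illustrate the behaviour in question, here is a rough JDK-only sketch 
of the character test.  It does not use Lucene at all: the class name 
LetterOrDigitSplitter and its tokenize method are hypothetical, made up 
for this example.  In a real CharTokenizer subclass the same test would 
presumably live in the overridden isTokenChar(char) method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it imitates the way a
// CharTokenizer groups adjacent "token characters" into tokens.
class LetterOrDigitSplitter {

    // The test a CharTokenizer subclass would return from isTokenChar(char).
    // LetterTokenizer uses Character.isLetter here, which drops digits.
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c);
    }

    // Collect maximal runs of token characters; everything else separates tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush a token that runs to the end of the input.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With this test, a year such as '1945' survives as a token of its own 
instead of being discarded as it would be under a letters-only test.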

Apologies if this has been discussed earlier.  I googled for the 
relevant terms and found nothing.

Thanks,
Peter Pimley,
Semantico.

