lucene-java-user mailing list archives

From Peter Pimley
Subject LetterTokenizer to allow digits
Date Fri, 05 Nov 2004 17:16:26 GMT

Hi everybody,

I have just had to subclass CharTokenizer with a class that tests each
character against Character.isLetterOrDigit.  I would use a
LetterTokenizer, but it's important for me to allow digits through, as
the documents I'm indexing often contain years such as '2000' or '1945'.
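
A minimal sketch of the idea, for illustration only.  It is standalone
Java and does not use Lucene: the class name LetterOrDigitSplit and the
tokenize helper are hypothetical, and only the isTokenChar predicate
mirrors the test a CharTokenizer subclass would apply (the exact
signature to override depends on the Lucene version).

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration (assumption: not Lucene code, names are hypothetical).
class LetterOrDigitSplit {

    // The per-character test: letters and digits both count as token characters.
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c);
    }

    // Splits text into maximal runs of characters accepted by isTokenChar.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-token character ends the current run.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Founded, in, 1945, rebuilt, in, 2000]
        System.out.println(tokenize("Founded in 1945, rebuilt in 2000."));
    }
}
```

With a letters-only test (Character.isLetter) the runs '1945' and '2000'
would be dropped, which is the behaviour I need to avoid.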

Obviously it's only a few lines to do this, but I'm sure I'm not the 
first person to have had to do it.  May I make the feature request that 
LetterTokenizer should have an 'AllowDigits' property?

Apologies if this has been discussed earlier.  I googled for the 
relevant terms and found nothing.

Peter Pimley
