lucene-java-user mailing list archives

From Erick Erickson
Subject Re: Searching behaviour with content containing decimal points
Date Fri, 26 Aug 2011 12:46:42 GMT
You have to do some normalizing here, and I don't think there's
anything available out of the box, so you'll probably have to roll
your own filter that does the normalization for this field.

Be a little cautious, though. Your example, while fine in itself, may not
generalize. Your rule for normalization might be "remove all
non-alphanumeric characters and drop trailing zeros". But if applied to,
say, part numbers, does it still make sense? Does your user base
expect a part number (or something) like 123000 to fail to match
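
To make the concern concrete, here is a minimal sketch of that rule in
plain Java, with no Lucene dependency. The class name, the method name,
the ASCII-only character class, the order of the two steps, and the
"1.50" / "1.5" inputs are all my assumptions for illustration; they are
not taken from the original question.

```java
public class NormalizeSketch {
    // Hypothetical helper: applies the quoted rule literally.
    // Step 1: remove every character that is not an ASCII letter or digit.
    // Step 2: drop any run of zeros at the end of what is left.
    public static String normalize(String term) {
        String alnumOnly = term.replaceAll("[^A-Za-z0-9]", "");
        return alnumOnly.replaceAll("0+$", "");
    }

    public static void main(String[] args) {
        // Made-up inputs: two decimal values and two part-number-like values.
        for (String t : new String[] {"1.50", "1.5", "123000", "123"}) {
            System.out.println(t + " -> " + normalize(t));
        }
    }
}
```

Under this rule "1.50" and "1.5" both come out as "15", which may be what
you want for decimals, but "123000" and "123" also both come out as "123",
so the two part numbers would become indistinguishable in that field.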

Anyway, you'll probably be making your own custom Analyzer
for this by chaining together, say, WhitespaceTokenizer with
your custom Filter.
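
The shape of that chain can be sketched in plain Java as a stand-in. This
is not the Lucene API: AnalyzerChainSketch, analyze, and the stripPunct
filter are hypothetical names used only to show the "split on whitespace,
then run each token through a filter" structure. In Lucene itself the
filter would be a TokenFilter subclass wrapped around the tokenizer, and
the details depend on your Lucene version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class AnalyzerChainSketch {
    // Stand-in for "tokenizer + filter": split on runs of whitespace,
    // pass each token through the supplied filter, and drop tokens
    // that the filter reduces to the empty string.
    public static List<String> analyze(String text, UnaryOperator<String> filter) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // blank input yields one empty piece; skip it
            }
            String filtered = filter.apply(raw);
            if (!filtered.isEmpty()) {
                tokens.add(filtered);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical filter: keep only ASCII letters and digits.
        UnaryOperator<String> stripPunct = s -> s.replaceAll("[^A-Za-z0-9]", "");
        System.out.println(analyze("rev 1.50 of part A-17", stripPunct));
        // prints [rev, 150, of, part, A17]
    }
}
```

Whatever the filter does, the same chain has to be used when indexing and
when parsing queries for that field, otherwise the normalized terms in the
index will not line up with the terms being searched for.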


On Thu, Aug 25, 2011 at 8:03 PM, Josh Rehman wrote:
> Actually I have this issue too. I've played around with various analyzers,
> and I would expect the WhitespaceAnalyzer to work (at least) but it does
> not.
> On Thu, Aug 25, 2011 at 4:58 PM, SBS wrote:
>> Can anyone help me with this?  Do you require further information?  This has
>> become a serious issue for us.
>> Thanks,
>> -sbs
