lucene-java-user mailing list archives

From Abhishek Chauhan <>
Subject Re: AlphaNumeric analyzer/tokenizer
Date Mon, 19 Aug 2019 06:23:20 GMT

Can someone please check the above mail and provide some feedback?

Thanks and Regards,

On Fri, Aug 16, 2019 at 2:52 PM Abhishek Chauhan <> wrote:

> Hi,
> We have been using SimpleAnalyzer, which keeps only letters in its tokens.
> This prevents us from searching on strings that contain both letters and
> digits, e.g. "axt1234". With SimpleAnalyzer we can search for "axt"
> successfully, but search strings like "axt1" or "axt123" return no
> results, because the digits were dropped at indexing time.
> I could use StandardAnalyzer or WhitespaceAnalyzer, but I also want to
> tokenize on underscores, which these analyzers don't do. I have also
> looked at WordDelimiterFilter, which splits "axt1234" into "axt" and
> "1234". Even with that, however, I cannot search for "axt12" etc.
> Is there something like an alphanumeric analyzer that is very similar to
> SimpleAnalyzer but keeps digits as well as letters in its tokens? I am
> willing to contribute such an analyzer if one is not available.
> Thanks and Regards,
> Abhishek
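
To make the requested behaviour concrete, here is a minimal sketch in plain
Java of the token rule described above: a token is a maximal run of letters
and digits, everything else (including underscores) is a separator, and
tokens are lowercased as SimpleAnalyzer does. The class AlphaNumericTokens
and its tokenize method are hypothetical names used only for illustration;
they are not part of Lucene. In Lucene itself the same test would presumably
go into the isTokenChar(int) method of a CharTokenizer subclass, which is how
LetterTokenizer applies Character.isLetter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Lucene class: it only illustrates the token rule.
final class AlphaNumericTokens {

    // Returns the lowercased runs of letters and digits found in the text.
    // Any other character (underscore, space, hyphen, ...) ends the current token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetterOrDigit(cp)) {
                // Keep letters AND digits, unlike a letters-only tokenizer.
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [part, axt1234, rev, 2]
        System.out.println(tokenize("Part_AXT1234 rev-2"));
    }
}
```

With this rule "axt1234" is kept as a single token, so a search for "axt12"
would still depend on a prefix or wildcard query rather than an exact term
match.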
