lucene-java-user mailing list archives

From Ian Lea
Subject Re: Strange behaviour of tokenizer with wildcard queries
Date Fri, 20 Sep 2013 12:41:06 GMT
It's reasonable that "block-major" won't find anything.
"block-major-57" should match.

The split into block and major-57 comes from ClassicTokenizer. Its
javadoc says: "Splits words at hyphens, unless there's a number in the
token, in which case the whole token is interpreted as a product number
and is not split."  So I guess it splits on the first hyphen but not
the second.
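
To see why the split matters for the wildcard search: a prefix or wildcard
query is compared against each indexed term on its own, and the wildcard term
is normally not run through the analyzer first. The plain-Java sketch below
models only that comparison, using the two tokens reported in this thread. It
does not use any Lucene classes, and TermPrefixDemo and matchesPrefix are
made-up names for illustration.

```java
import java.util.Arrays;
import java.util.List;

class TermPrefixDemo {
    // Tokens reported in this thread for "block-major-57" with the classic analyzer.
    static final List<String> INDEXED_TERMS = Arrays.asList("block", "major-57");

    // Simplified model of a prefix query: true if any single term starts with the prefix.
    static boolean matchesPrefix(List<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesPrefix(INDEXED_TERMS, "block"));       // true
        System.out.println(matchesPrefix(INDEXED_TERMS, "block-major")); // false
    }
}
```

No single term starts with "block-major", so under this model the wildcard
search has nothing to match, while "block" is a prefix of the first term.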

ClassicAnalyzer/Tokenizer is general purpose and will never meet
everyone's requirements all the time.  You could try a different
analyzer, or build your own.  That's what the javadoc recommends.
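
As a rough illustration of what a different analysis choice could change,
here is a plain-Java sketch that splits on whitespace only, so a hyphenated
word stays in one term. It is a simplified model, not Lucene code: the class
and method names and the sample sentence are invented, and a real analyzer
may also lowercase or filter tokens.

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceTermsDemo {
    // Splits on whitespace only, so "block-major-57" is kept as a single term.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }

    // Simplified model of a prefix query: true if any single term starts with the prefix.
    static boolean matchesPrefix(List<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical sample text containing the word from this thread.
        List<String> terms = tokenize("error in block-major-57 today");
        System.out.println(terms);                               // [error, in, block-major-57, today]
        System.out.println(matchesPrefix(terms, "block-major")); // true
    }
}
```

The trade-off is that the parts of the word are no longer separate terms, so
a search for major-57 alone would not match under this model.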


On Fri, Sep 20, 2013 at 1:26 PM, Ramprakash Ramamoorthy wrote:
> Sorry, hit the send button accidentally the last time. Please read below :
> Hello,
> We're using Lucene 4.1. We have the word "*block-major-57*"
> indexed. Using the classic analyzer, we get the following tokens: *block* and
> *major-57*.
> I search for *block-major**, the document doesn't match.
> However searching for *block** works perfect. Is this a bug, or am I doing
> something wrong?
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> Chennai, India.
