lucene-java-user mailing list archives

From Li Li <fancye...@gmail.com>
Subject Re: Question about chinese and WildcardQuery
Date Wed, 27 Jun 2012 12:24:00 GMT
StandardAnalyzer segments Chinese text into one token per character, so no
single indexed term starts with "专项信". For wildcard search, you should use
WhitespaceAnalyzer or your own analyzer that keeps the string as one token.
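
To illustrate why the wildcard finds nothing, here is a minimal plain-Java sketch. It does not use Lucene at all: `perCharacterTokens` is a hypothetical stand-in that only approximates how StandardAnalyzer splits CJK text, and `anyTokenMatches` is a simplified stand-in for matching a wildcard against individual indexed terms. Treat it as a model of the behaviour, not as the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Plain-Java simulation; no Lucene classes are used (all names are hypothetical).
class WildcardTokenDemo {

    // Approximation of per-character CJK tokenization: each Han ideograph
    // becomes its own token, runs of other letters/digits stay together.
    static List<String> perCharacterTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                flush(run, tokens);
                tokens.add(new String(Character.toChars(cp)));
            } else if (Character.isLetterOrDigit(cp)) {
                run.appendCodePoint(Character.toLowerCase(cp));
            } else {
                flush(run, tokens); // punctuation and spaces end the current run
            }
        }
        flush(run, tokens);
        return tokens;
    }

    private static void flush(StringBuilder run, List<String> tokens) {
        if (run.length() > 0) {
            tokens.add(run.toString());
            run.setLength(0);
        }
    }

    // A wildcard is tested against each term separately, never across terms.
    // '*' matches any sequence of characters, '?' matches exactly one.
    static boolean anyTokenMatches(List<String> tokens, String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); ) {
            int cp = wildcard.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '*') {
                regex.append(".*");
            } else if (cp == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(new String(Character.toChars(cp))));
            }
        }
        Pattern pattern = Pattern.compile(regex.toString(), Pattern.DOTALL);
        for (String token : tokens) {
            if (pattern.matcher(token).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String name = "专项信息管理.doc";
        String query = "专项信*";

        List<String> perChar = perCharacterTokens(name);
        List<String> whole = Collections.singletonList(name);

        System.out.println("per-character tokens: " + perChar);
        System.out.println("match on per-character tokens: " + anyTokenMatches(perChar, query));
        System.out.println("match on single token: " + anyTokenMatches(whole, query));
    }
}
```

With one token per character, every term is a single ideograph (plus "doc"), so nothing can begin with the three-character prefix. When the whole file name is kept as one term, the same wildcard matches.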
On 2012-6-27 at 6:20 PM, "Paco Avila" <monkiki@gmail.com> wrote:

> Hi there,
>
> I have to index Chinese content and I don't get the expected results when
> searching. It seems that WildcardQuery does not work properly with
> Chinese characters. See the attached sample code.
>
> I store the string "专项信息管理.doc" using the StandardAnalyzer and then
> search for "专项信*", but no results are returned. AFAIK, it should match the
> "专项信息管理.doc" string but it doesn't :(
>
> NOTE: I am using Lucene 3.1.0
>
> Regards.
> --
> http://www.openkm.com
> http://www.guia-ubuntu.org
>
>
