lucene-java-user mailing list archives

From Herb Roitblat <herb.roitb...@orcatec.com>
Subject Re: QueryParser
Date Mon, 24 Mar 2014 13:00:42 GMT
The default analysis for CJK languages (the analyzer behind the query
parser) breaks text into overlapping bigrams.  A word consisting of the
characters ABCDE is broken into the tokens AB, BC, CD, DE, or

"轻歌曼舞庆元旦"

into
data:轻歌 data:歌曼 data:曼舞 data:舞庆 data:庆元 data:元旦

Each pair may or may not be a real word, but if you use the same analyzer
for indexing and for searching, you should get reasonable results.  A more
powerful analyzer, typically one that includes a dictionary, is also
available and may segment the text closer to what you expect, at the cost
of being slower.
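
To make the bigram idea concrete, here is a minimal plain-Java sketch.  It
is not Lucene's implementation; the class and method names (CjkBigramSketch,
bigrams, matchesAll) are made up for illustration, and it ignores details
such as how Lucene handles a lone character or mixed scripts.  It only shows
why a query matches when the index and the query are split the same way.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: overlapping bigrams in plain Java, not Lucene code.
final class CjkBigramSketch {

    // Split text into overlapping two-character tokens: ABCDE -> AB, BC, CD, DE.
    // Works on code points so characters outside the BMP are not cut in half.
    // Text shorter than two characters yields no tokens in this sketch.
    static List<String> bigrams(String text) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < codePoints.length; i++) {
            tokens.add(new String(codePoints, i, 2));
        }
        return tokens;
    }

    // True when every bigram of the query also occurs among the bigrams of
    // the indexed text, i.e. both sides were split by the same rule.
    static boolean matchesAll(String indexedText, String queryText) {
        Set<String> indexed = new HashSet<>(bigrams(indexedText));
        return indexed.containsAll(bigrams(queryText));
    }

    public static void main(String[] args) {
        System.out.println(bigrams("ABCDE"));
        System.out.println(bigrams("轻歌曼舞庆元旦"));
        System.out.println(matchesAll("轻歌曼舞庆元旦", "庆元旦"));
    }
}
```

A query such as 庆元旦 becomes 庆元 and 元旦, both of which are among the
indexed bigrams, so it matches even though 庆元 is not a word by itself.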

Look here, for example: 
http://lucene.apache.org/core/4_0_0/analyzers-common/index.html
and here: http://lucene.apache.org/core/4_0_0/analyzers-smartcn/index.html



On 3/23/2014 11:21 PM, kalaik wrote:
> Dear Team,
>
> Any update?
>
> ---- On Fri, 21 Mar 2014 14:40:51 +0530 kalaik <kalaiselvan.k@zohocorp.com> wrote ----
>
> Dear Team,
>
> We are using Lucene in our product. It works well for high-speed searching
> and performance, but Japanese, Chinese, and Korean text is not searched
> properly. We use QueryParser.
>
> QueryParser splits a word like "轻歌曼舞庆元旦" into pieces.
>
> Example:
>
>     This word: "轻歌曼舞庆元旦"
>
>     Split words: data:轻歌 data:歌曼 data:曼舞 data:舞庆 data:庆元 data:元旦
>
> here is my code
>
>     Query query = parser.parse(searchData);
>     logger.log(Level.INFO, "Search Query is calling {0}", query);
>     TopDocs docs = is.search(query, resultRowSize);
>
>
> If you need any clarification, please get back to me. Please help as soon as possible.
>
>
> Regards,
> kalai..

