jackrabbit-users mailing list archives

From go canal <goca...@yahoo.com>
Subject full text search for CJK languages
Date Sun, 09 Aug 2009 14:20:28 GMT
Hi,
I could not find detailed information on supporting full-text search for double-byte languages such as CJK
(Chinese, Japanese and Korean).

1) Does anybody know whether such a library is available?
2) How do I configure this in Jackrabbit? Should I replace all the extractors in the current configuration:
    <SearchIndex .....
      <param name="textFilterClasses"
        value="org.apache.jackrabbit.extractor.PlainTextExtractor,
               org.apache.jackrabbit.extractor.MsWordTextExtractor,
               org.apache.jackrabbit.extractor.MsExcelTextExtractor,
               org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
               org.apache.jackrabbit.extractor.PdfTextExtractor,
               org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
               org.apache.jackrabbit.extractor.RTFTextExtractor,
               org.apache.jackrabbit.extractor.HTMLTextExtractor,
               org.apache.jackrabbit.extractor.XMLTextExtractor" />
    </SearchIndex>
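
For comparison, here is a sketch of the other place this might be configured. It assumes, without confirmation in this message, that SearchIndex accepts an "analyzer" parameter naming a Lucene Analyzer class, and that Lucene's CJKAnalyzer (from the contrib analyzers jar) is on the classpath. The extractors only pull plain text out of each file format, so under this assumption they would stay as they are:

    <SearchIndex .....
      <!-- assumption: "analyzer" param; CJKAnalyzer indexes CJK text as overlapping two-character tokens -->
      <param name="analyzer"
        value="org.apache.lucene.analysis.cjk.CJKAnalyzer" />
    </SearchIndex>

Content indexed before such a change would presumably need to be re-indexed for the new analyzer to take effect.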
rgds,
canal



      