Message-ID: <2102512876.1144815502571.JavaMail.jira@ajax>
Date: Wed, 12 Apr 2006 05:18:22 +0100 (BST)
From: "Samphan Raruenrom (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Updated: (LUCENE-503) Contrib: ThaiAnalyzer to enable Thai full-text search in Lucene
In-Reply-To: <989691685.1141209939287.JavaMail.jira@ajax.apache.org>

     [ http://issues.apache.org/jira/browse/LUCENE-503?page=all ]

Samphan Raruenrom updated LUCENE-503:
-------------------------------------

    Attachment: ThaiAnalyzer.java

ThaiAnalyzer, which simply returns a TokenFilter chain with ThaiWordFilter in the middle.

> Contrib: ThaiAnalyzer to enable Thai full-text search in Lucene
> ---------------------------------------------------------------
>
>          Key: LUCENE-503
>          URL: http://issues.apache.org/jira/browse/LUCENE-503
>      Project: Lucene - Java
>         Type: New Feature
>   Components: Analysis
>     Versions: 1.4
>     Reporter: Samphan Raruenrom
>  Attachments: ThaiAnalyzer.java
>
> Thai text doesn't have spaces between words. Usually, a dictionary-based algorithm is used to break a string into words. For Lucene to be usable for Thai, an Analyzer that knows how to break Thai words is needed.
> I've implemented such an Analyzer, ThaiAnalyzer, using ICU4J's DictionaryBasedBreakIterator for word breaking. I'll upload the code later.
> I'm normally a C++ programmer and very new to Java. Please review the code for any problems. One possible problem is that it requires ICU4J. I don't know whether this is OK.
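
For readers unfamiliar with break-iterator-based word breaking, here is a minimal sketch of the idea. It is not the attached ThaiAnalyzer.java: it uses the JDK's own java.text.BreakIterator rather than ICU4J's DictionaryBasedBreakIterator, it is not wired into Lucene's TokenFilter API, and the class and method names (ThaiWordBreakSketch, segment) are made up for illustration. How finely the Thai text is split depends on the break data available in the running JDK.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not the attached ThaiAnalyzer.java.
class ThaiWordBreakSketch {

    // Split text into word tokens using the word BreakIterator for the given locale.
    static List<String> segment(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String candidate = text.substring(start, end);
            // Keep only segments containing a letter or digit; drop spaces and punctuation.
            if (candidate.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(candidate);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Locale thai = Locale.forLanguageTag("th");
        // The Thai result depends on the break dictionary shipped with the runtime.
        System.out.println(segment("ภาษาไทย", thai));
        System.out.println(segment("hello world", Locale.ENGLISH));
    }
}
```

In a Lucene analyzer, the same loop would presumably sit inside a filter that re-splits each incoming token into the segments found by the iterator, which is the role the comment above assigns to ThaiWordFilter.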