lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (LUCENE-503) Contrib: ThaiAnalyzer to enable Thai full-text search in Lucene
Date Sat, 03 Jun 2006 01:01:30 GMT
     [ http://issues.apache.org/jira/browse/LUCENE-503?page=all ]

Hoss Man reassigned LUCENE-503:
-------------------------------

    Assign To: Hoss Man

I don't know anything about the Thai language ... but this code is clean, fairly easy to follow,
and has tests that pass.

If no one (who knows something about Thai) sees anything wrong with this implementation and
objects, I'll commit it sometime this weekend.



> Contrib: ThaiAnalyzer to enable Thai full-text search in Lucene
> ---------------------------------------------------------------
>
>          Key: LUCENE-503
>          URL: http://issues.apache.org/jira/browse/LUCENE-503
>      Project: Lucene - Java
>         Type: New Feature
>
>   Components: Analysis
>     Versions: 1.4
>     Reporter: Samphan Raruenrom
>     Assignee: Hoss Man
>  Attachments: TestThaiAnalyzer.java, ThaiAnalyzer.java, ThaiWordFilter.java
>
> Thai text doesn't have spaces between words. Usually, a dictionary-based algorithm is used
> to break a string into words. For Lucene to be usable for Thai, an Analyzer that knows how to
> break Thai words is needed.
> I've implemented such an Analyzer, ThaiAnalyzer, using ICU4j DictionaryBasedBreakIterator
> for word breaking. I'll upload the code later.
> I'm normally a C++ programmer and very new to Java. Please review the code for any problems.
> One possible problem is that it requires ICU4j. I don't know whether this is OK.
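
For readers unfamiliar with break-iterator-based word segmentation, here is a minimal sketch of the general idea. It is not the attached ThaiAnalyzer or ThaiWordFilter code. It assumes the JDK's own java.text.BreakIterator as a stand-in for the ICU4j class named above, and the class name WordBreakSketch and the method words are hypothetical. Whether Thai text is actually split with a dictionary depends on the locale data shipped with the runtime.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the code attached to LUCENE-503.
public class WordBreakSketch {

    // Split text at the word boundaries reported for the given locale and
    // keep only the pieces that contain at least one letter or digit.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            // Whitespace and punctuation are also returned as segments; skip them.
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Thai sample with no spaces between words; the segmentation printed
        // depends on the Thai break data available in the running JDK.
        System.out.println(words("ภาษาไทยไม่มีช่องว่างระหว่างคำ", Locale.forLanguageTag("th")));
    }
}
```

An analyzer built this way would typically emit each returned piece as a token. Using the JDK class here only avoids the extra dependency for the sake of the sketch; the ICU4j iterator mentioned in the issue exposes a similar first()/next() style of iteration.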

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

