lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Announce : arabic Stemmer/Analyzer for Lucene
Date Sun, 28 Sep 2003 13:05:44 GMT
We could put this in the Lucene sandbox CVS perhaps.  Could you package  
it similarly to the other contributions there with a build file and  
convert your command-line tests to JUnit tests that run from the build  
file?

I took a quick look and looks like you did a fair bit of work and have  
the ASL in the source files.  The question, though, is whether your  
basing it on GPL code is acceptable.  Did you copy code from it?  We  
can have no GPL code in Apache's CVS.

	Erik


On Sunday, September 28, 2003, at 03:49  AM, Pierrick Brihaye wrote:

> Hi all,
>
> I have written a Lucene Analyzer for arabic. You will find it here :
> http://perso.wanadoo.fr/pierrick.brihaye/ArabicAnalyzer.jar  
> (provisional
> adress, anybody interested in hosting it ?)
>
> This work is still in beta stage but it gives quite good results :-)
>
> In order to make it work, you need :
>
> 1) a 1.4+ JVM (because of the native support for regular expressions  
> which
> are heavily used in the program ; I've been too lazy to use an external
> package)
>
> 2) Apache Jakarta Commons-Collections :
> http://jakarta.apache.org/commons/collections.html
>
> 3) a recent Lucene distribution ;-)
>
> All this work is based on the amazing Tim Buckwalter's Arabic  
> Morphological
> Analyzer Version 1.0
> (http://www.ldc.upenn.edu/Catalog/ 
> CatalogEntry.jsp?catalogId=LDC2002L49)
> originaly written in Perl and released under the GPL.
>
> The jar contains :
>
> a) the compiled classes
> b) the required data files (dictionaries and compatibility tables)
> c) 2 command-line test programs
> d) 3 test documents with different encodings
> e) the source code
> f) a README file that will give you a little bit more of information  
> :-)
>
> To Lucene developers : I plan to offer this work to Lucene (see the jar
> hierarchy... and the source file headers ;-). Any objections ?
>
> Feedback is very welcome : there are quite a lot of unresolved issues,  
> with
> the analyzer itselfs as well as with Lucene.
>
> mE AlslAmap, cheers,
>
> p.b.
>
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message