lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject RE: language identifier contrib
Date Wed, 15 Jan 2003 03:47:34 GMT
I'd be interested in seeing it, sure.
It looks like this has been implemented in Perl as well:
http://search.cpan.org/author/MPIOTR/Lingua-Ident-1.4/ ...looks
trivial.
I remember the author's name from a few years back.  His thesis was
about language recognition, I believe.  I still have the printout
somewhere.

Otis


--- Neil Couture <ncouture@convera.com> wrote:
> The Snowball stemmer is not of very good quality. I think the best
> approach would be to build a lemmatizer from ispell, more precisely
> from the ispell rules syntax. As for the language identifier, the best
> overall language identifier is based on Ted Dunning's work. You can
> find the source code on the web.
> 
> It's C code, but it can easily be ported to Java. Also of interest is
> the Mozilla source code, which contains code that does encoding
> detection. In fact, I developed a Java lib starting from that source
> code. It's under the LGPL license. Would you be interested in merging
> that source code into Lucene?
> 
> 
> -Neil
> 
> 
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
> Sent: 7 January, 2003 12:06
> To: lucene-dev@jakarta.apache.org
> Subject: language identifier contrib
> 
> 
> Now that Doug put Snowball's stemmers in Lucene Sandbox, it would be
> nice to have that language recognition contribution that somebody
> mentioned a month or two ago.
> 
> Ah, here it is, the original email that mentions this language
> identifier:
> http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgNo=2695
> 
> There's also this:
> http://frank.spieleck.de/ngram/
> 
> Thanks,
> Otis
> 
> 



--
To unsubscribe, e-mail:   <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>

